[Libguestfs] [PATCH nbdkit] Minimal implementation of NBD Structured Replies.
Eric Blake
eblake at redhat.com
Fri Mar 8 15:03:05 UTC 2019
On 3/8/19 4:04 AM, Richard W.M. Jones wrote:
> This is about the simplest implementation of NBD Structured Replies
> (SR) that we can do right now.
>
> It accepts NBD_OPT_STRUCTURED_REPLIES when negotiated by the client,
> but only sends back the simplest possible SR when required to by
> NBD_CMD_READ. The rest of the time it will send back simple replies
> as before. We do not modify the plugin API so plugins are unable to
> send complex SRs.
Yep, sounds compliant, and similar to how I added it in qemu. I'll read
through this one in detail, but interoperability with qemu is already a
good sign that it's workable.
>
> Also we do not handle human-readable error messages yet because that
> would require changes in how we handle nbdkit_error().
>
> Also we do not understand NBD_CMD_FLAG_DF, but that seems to be OK
> because (a) we don't advertize the feature and (b) we only send back a
> single chunk anyway.
Or, we COULD advertise it because we always honor it (but that's a
larger diffstat, and thus at odds with "minimal implementation"). Either
way works.
> +/* Structured reply types. */
> +extern const char *name_of_nbd_reply_type (int);
> +#define NBD_REPLY_TYPE_NONE 0
> +#define NBD_REPLY_TYPE_OFFSET_DATA 1
> +#define NBD_REPLY_TYPE_OFFSET_HOLE 2
> +#define NBD_REPLY_TYPE_BLOCK_STATUS 3
> +#define NBD_REPLY_TYPE_ERROR 32769
> +#define NBD_REPLY_TYPE_ERROR_OFFSET 32770
Worth writing these later ones in hex or via a helper macro that does
((1 << 15) | value)? Or would that mess up the generated
protocol-to-lookup magic?
> +++ b/plugins/nbd/nbd.c
> @@ -345,7 +345,7 @@ nbd_request (struct handle *h, uint16_t flags, uint16_t type, uint64_t offset,
> static int
> nbd_reply_raw (struct handle *h, int *fd)
> {
> - struct reply rep;
> + struct simple_reply rep;
> struct transaction *trans;
> void *buf;
> uint32_t count;
> @@ -353,7 +353,7 @@ nbd_reply_raw (struct handle *h, int *fd)
> *fd = -1;
> if (read_full (h->fd, &rep, sizeof rep) < 0)
> return nbd_mark_dead (h);
> - if (be32toh (rep.magic) != NBD_REPLY_MAGIC)
> + if (be32toh (rep.magic) != NBD_SIMPLE_REPLY_MAGIC)
> return nbd_mark_dead (h);
> nbdkit_debug ("received reply for cookie %#" PRIx64 ", status %s",
> rep.handle, name_of_nbd_error(be32toh (rep.error)));
Teaching the nbd plugin to negotiate structured replies as the client is
obviously a separate patch, I'm fine if you leave that to me :)
> +static int
> +send_structured_reply_read (struct connection *conn,
> + uint64_t handle, uint16_t cmd,
> + const char *buf, uint32_t count, uint64_t offset)
> +{
> + /* Once we are really using structured replies and sending data back
> + * in chunks, we'll be able to grab the write lock for each chunk,
> + * allowing other threads to interleave replies. As we're not doing
> + * that yet we acquire the lock for the whole function.
> + */
Fair enough.
> + ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->write_lock);
> + struct structured_reply reply;
> + struct structured_reply_offset_data offset_data;
> + int r;
> +
> + assert (cmd == NBD_CMD_READ);
> +
> + reply.magic = htobe32 (NBD_STRUCTURED_REPLY_MAGIC);
> + reply.handle = handle;
> + reply.flags = htobe16 (NBD_REPLY_FLAG_DONE);
> + reply.type = htobe16 (NBD_REPLY_TYPE_OFFSET_DATA);
> + reply.length = htobe32 (sizeof offset_data + count);
This line is correct, but I had to remind myself of C precedence rules
on this one; writing 'count + sizeof offset_data' instead has the same
effect without worrying whether sizeof binds with higher or lower
precedence than +.
> @@ -1317,40 +1449,33 @@ recv_request_send_reply (struct connection *conn)
>
> /* Send the reply packet. */
> send_reply:
> - {
> - ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->write_lock);
> - if (get_status (conn) < 0)
> - return -1;
> - reply.magic = htobe32 (NBD_REPLY_MAGIC);
> - reply.handle = request.handle;
> - reply.error = htobe32 (nbd_errno (error));
> + if (get_status (conn) < 0)
> + return -1;
Hmm, previously get_status() was checked under lock. But since it is a
thread-local variable, we should be safe grabbing it while unlocked.
ACK.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20190308/27981e3f/attachment.sig>
More information about the Libguestfs
mailing list