[Libguestfs] [libnbd PATCH 05/13] protocol: Prepare to receive 64-bit replies

Fri Aug 12 21:20:44 UTC 2022

[reducing cc list]

On Fri, Dec 03, 2021 at 05:17:33PM -0600, Eric Blake wrote:
> Support receiving headers for 64-bit replies if extended headers were
> negotiated.  We already insist that the server not send us too much
> payload in one reply, so we can exploit that and merge the 64-bit
> length back into a normalized 32-bit field for the rest of the payload
> length calculations.  The NBD protocol specifically made extended
> simple and structured replies both occupy 32 bytes, while the handle
> field is still in the same offset between all reply types.
> 
> Note that if we negotiate extended headers, but a non-compliant server
> replies with a non-extended header, we will stall waiting for the
> server to send more bytes rather than noticing that the magic number
> is wrong.  The alternative would be to read just the first 4 bytes of
> magic, then determine how many more bytes to expect; but that would
> require more states and syscalls, and not worth it since the typical
> server will be compliant.

If it hasn't been obvious from my patch series this week, this is one
of the things I'm trying to make nicer for my v2 64-bit extension
patches when I'm eventually ready to post them.

> +++ b/generator/states-reply.c
> @@ -1,5 +1,5 @@
>  /* nbd client library in userspace: state machine
> - * Copyright (C) 2013-2019 Red Hat Inc.
> + * Copyright (C) 2013-2021 Red Hat Inc.
>   *
>   * This library is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU Lesser General Public
> @@ -68,7 +68,10 @@ STATE_MACHINE {
>    assert (h->rlen == 0);
> 
>    h->rbuf = &h->sbuf;
> -  h->rlen = sizeof h->sbuf.simple_reply;
> +  if (h->extended_headers)
> +    h->rlen = sizeof h->sbuf.simple_reply_ext;
> +  else
> +    h->rlen = sizeof h->sbuf.simple_reply;
> 
>    r = h->sock->ops->recv (h, h->sock, h->rbuf, h->rlen);

I still want the benefit of the common case doing a single read()
syscall for 32 bytes rather than two back-to-back read()s of the magic
number followed by the rest of the reply header (using the length
determined by the magic field) - reducing syscalls is going to have
performance benefits.  But I think I've come up with a plan:

When extended headers are negotiated, request a single read() for the
full 32-byte header.  But in state REPLY.RECV_REPLY, if
recv_into_rbuf() returns 1 (ie. we had a short read), instead of just
blindly returning the state machine back to %READY to wait for more
data, I will first check if we have at least read enough bytes to see
if the magic number matches expectations.  That way, even if a server
replies with a simple reply instead of an extended reply, I can detect
that our read() is now waiting for data that won't arrive until a
future server reply (basically, deadlock if we are in something like a
blocking nbd_pwrite where we can't request another server reply), and
instead move the connection over to DEAD.  It's not as pleasant as
blindly accepting the narrower header (the way my recent patches have
been working around server bugs where it was easy), but it is better
than hanging.  To actually pretend the narrower header is okay, I'd
need to implement some sort of push-back queue in order to re-read()
the tail bytes that were already taken off the wire but not consumed
as part of the narrow header.  So for now, I'm happy living with
negotiating extended headers implying that the server must use
extended replies to avoid the DEAD state.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org