[Libguestfs] [libnbd PATCH] tls: Check for pending bytes in gnutls buffers
Richard W.M. Jones
rjones at redhat.com
Fri Jun 7 14:46:47 UTC 2019
On Thu, Jun 06, 2019 at 02:36:48PM -0500, Eric Blake wrote:
> Checking for poll(POLLIN) only wakes us up when the server sends more
> bytes over the wire. But the way gnutls is implemented, it reads as
> many non-blocking bytes as possible from the wire before then parsing
> where the encoded message boundaries lie within those bytes. As a
> result, we may already have the server's next reply in memory without
> needing to block on the server sending yet more bytes (especially if
> the server was employing tricks like corking to batch things up for
> fewer packets over the network).
>
> If we are using synchronous API, this is not a problem (we never have
> more than one request in flight at once, which implies there are no
> outstanding replies from the server when we send a request, so we
> always get the needed POLLIN when the server's next reply begins); but
> if we are using asynch API and have more than one message in flight,
> it is conceivable that the server flushes its complete reply queue
> (two or more replies) contained within a single encrypted network
> packet, but where our state machine reads only the first reply and
> then waits for a POLLIN kick before reading again. Fortunately, I
> could not think of a scenario where this would turn into direct
> deadlock in our state machine alone (although I also haven't
> definitively ruled out that possibility) - in the common case, the
> only way the server can send more than one reply at once is if it
> supports parallel in-flight requests; which means we are safe that the
> next asynch message we send will be received by the server, and its
> reply to that message will give us the POLLIN kick we need to finally
> process the reply that had been stuck (for a possibly long time) in
> the local buffer.
>
> But I _did_ come up with a problematic scenario - if the server's
> final two replies before our NBD_CMD_DISC occur in the same packet, we
> could end up with the socket closed without ever handling the second
> reply, giving us data loss for that reply. It's also a potential
> indirect deadlock: the NBD spec recommends that a client not call
> NBD_CMD_DISC until it has received all responses to outstanding
> in-flight requests, so a client that has finished its work and does
> not plan to make any more aio requests other than NBD_CMD_DISC and
> wants to block until all server replies are in will hang waiting for a
> POLLIN that the server never sends because we already have the reply
> and aren't sending more requests.
>
> Testing shows that this patch has a slight performance improvement (to
> be expected, since we are eliminating an artificial latency on
> starting the REPLY sequence in the state machine) - on my machine,
> running tests/aio-parallel-load-tls saw an increase from 11522.9 IOPS
> pre-patch to 12558.1 IOPS post-patch.
> ---
>
> I'm pushing this because of the performance improvement, but wanted
> to post it because it was an interesting problem. (Sad when my writeup
> is three times longer than the patch itself...)
>
> generator/states.c | 5 +++++
> lib/crypto.c | 7 +++++++
> lib/internal.h | 1 +
> 3 files changed, 13 insertions(+)
>
> diff --git a/generator/states.c b/generator/states.c
> index bce4f85..145e8c1 100644
> --- a/generator/states.c
> +++ b/generator/states.c
> @@ -113,6 +113,11 @@ send_from_wbuf (struct nbd_handle *h)
> READY:
> if (h->cmds_to_issue)
> SET_NEXT_STATE (%ISSUE_COMMAND.START);
> + else {
> + assert (h->sock);
> + if (h->sock->ops->pending && h->sock->ops->pending (h->sock))
> + SET_NEXT_STATE (%REPLY.START);
> + }
> return 0;
>
> DEAD:
> diff --git a/lib/crypto.c b/lib/crypto.c
> index 3f7c744..e0f173f 100644
> --- a/lib/crypto.c
> +++ b/lib/crypto.c
> @@ -181,6 +181,12 @@ tls_send (struct nbd_handle *h,
> return r;
> }
>
> +static bool
> +tls_pending (struct socket *sock)
> +{
> + return gnutls_record_check_pending (sock->u.tls.session) > 0;
> +}
> +
> static int
> tls_get_fd (struct socket *sock)
> {
> @@ -209,6 +215,7 @@ tls_close (struct socket *sock)
> static struct socket_ops crypto_ops = {
> .recv = tls_recv,
> .send = tls_send,
> + .pending = tls_pending,
> .get_fd = tls_get_fd,
> .close = tls_close,
> };
> diff --git a/lib/internal.h b/lib/internal.h
> index 5b6152b..61ddbde 100644
> --- a/lib/internal.h
> +++ b/lib/internal.h
> @@ -192,6 +192,7 @@ struct socket_ops {
> struct socket *sock, void *buf, size_t len);
> ssize_t (*send) (struct nbd_handle *h,
> struct socket *sock, const void *buf, size_t len);
> + bool (*pending) (struct socket *sock);
> int (*get_fd) (struct socket *sock);
> int (*close) (struct socket *sock);
> };
> --
Sorry for the late review, I was busy with meetings this week.
Patch looks good to me, ACK.
Thanks, Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/