[Libguestfs] [libnbd PATCH] tls: Check for pending bytes in gnutls buffers
Richard W.M. Jones
rjones at redhat.com
Fri Jun 7 14:46:47 UTC 2019
On Thu, Jun 06, 2019 at 02:36:48PM -0500, Eric Blake wrote:
> Checking for poll(POLLIN) only wakes us up when the server sends more
> bytes over the wire. But the way gnutls is implemented, it reads as
> many non-blocking bytes as possible from the wire before then parsing
> where the encoded message boundaries lie within those bytes. As a
> result, we may already have the server's next reply in memory without
> needing to block on the server sending yet more bytes (especially if
> the server was employing tricks like corking to batch things up for
> fewer packets over the network).
>
> If we are using synchronous API, this is not a problem (we never have
> more than one request in flight at once, which implies there are no
> outstanding replies from the server when we send a request, so we
> always get the needed POLLIN when the server's next reply begins); but
> if we are using asynch API and have more than one message in flight,
> it is conceivable that the server flushes its complete reply queue
> (two or more replies) contained within a single encrypted network
> packet, but where our state machine reads only the first reply and
> then waits for a POLLIN kick before reading again. Fortunately, I
> could not think of a scenario where this would turn into direct
> deadlock in our state machine alone (although I also haven't
> definitively ruled out that possibility) - in the common case, the
> only way the server can send more than one reply at once is if it
> supports parallel in-flight requests; which means we are safe that the
> next asynch message we send will be received by the server, and its
> reply to that message will give us the POLLIN kick we need to finally
> process the reply that had been stuck (for a possibly long time) in
> the local buffer.
>
> But I _did_ come up with a problematic scenario - if the server's
> final two replies before our NBD_CMD_DISC occur in the same packet, we
> could end up with the socket closed without ever handling the second
> reply, giving us data loss for that reply. It's also a potential
> indirect deadlock: the NBD spec recommends that a client not call
> NBD_CMD_DISC until it has received all responses to outstanding
> in-flight requests, so a client that has finished its work and does
> not plan to make any more aio requests other than NBD_CMD_DISC and
> wants to block until all server replies are in will hang waiting for a
> POLLIN that the server never sends because we already have the reply
> and aren't sending more requests.
>
> Testing shows that this patch has a slight performance improvement (to
> be expected, since we are eliminating an artificial latency on
> starting the REPLY sequence in the state machine) - on my machine,
> running tests/aio-parallel-load-tls saw an increase from 11522.9 IOPS
> pre-patch to 12558.1 IOPS post-patch.
> ---
>
> I'm pushing this because of the performance improvement, but wanted
> to post it because it was an interesting problem. (Sad when my writeup
> is three times longer than the patch itself...)
>
> generator/states.c | 5 +++++
> lib/crypto.c | 7 +++++++
> lib/internal.h | 1 +
> 3 files changed, 13 insertions(+)
>
> diff --git a/generator/states.c b/generator/states.c
> index bce4f85..145e8c1 100644
> --- a/generator/states.c
> +++ b/generator/states.c
> @@ -113,6 +113,11 @@ send_from_wbuf (struct nbd_handle *h)
> READY:
> if (h->cmds_to_issue)
> SET_NEXT_STATE (%ISSUE_COMMAND.START);
> + else {
> + assert (h->sock);
> + if (h->sock->ops->pending && h->sock->ops->pending (h->sock))
> + SET_NEXT_STATE (%REPLY.START);
> + }
> return 0;
>
> DEAD:
> diff --git a/lib/crypto.c b/lib/crypto.c
> index 3f7c744..e0f173f 100644
> --- a/lib/crypto.c
> +++ b/lib/crypto.c
> @@ -181,6 +181,12 @@ tls_send (struct nbd_handle *h,
> return r;
> }
>
> +static bool
> +tls_pending (struct socket *sock)
> +{
> + return gnutls_record_check_pending (sock->u.tls.session) > 0;
> +}
> +
> static int
> tls_get_fd (struct socket *sock)
> {
> @@ -209,6 +215,7 @@ tls_close (struct socket *sock)
> static struct socket_ops crypto_ops = {
> .recv = tls_recv,
> .send = tls_send,
> + .pending = tls_pending,
> .get_fd = tls_get_fd,
> .close = tls_close,
> };
> diff --git a/lib/internal.h b/lib/internal.h
> index 5b6152b..61ddbde 100644
> --- a/lib/internal.h
> +++ b/lib/internal.h
> @@ -192,6 +192,7 @@ struct socket_ops {
> struct socket *sock, void *buf, size_t len);
> ssize_t (*send) (struct nbd_handle *h,
> struct socket *sock, const void *buf, size_t len);
> + bool (*pending) (struct socket *sock);
> int (*get_fd) (struct socket *sock);
> int (*close) (struct socket *sock);
> };
> --
Sorry for the late review, I was busy with meetings this week.
Patch looks good to me, ACK.
Thanks, Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/