[Libguestfs] [nbdkit PATCH 1/2] server: Add support for corking

Eric Blake eblake at redhat.com
Fri Jun 7 11:32:15 UTC 2019


On 6/6/19 5:48 PM, Eric Blake wrote:
> Any time we reply to NBD_CMD_READ or NBD_CMD_BLOCK_STATUS, we end up
> calling conn->send() more than once. Now that we've disabled Nagle's
> algorithm, this implies that we try harder to send the small header
> immediately, rather than batching it with the rest of the payload,
> which causes more overhead in the amount of actual network traffic.
> For interfaces that support corking (gnutls, or Linux TCP sockets), we
> can give a hint that the separate send() calls should be batched into
> a single network packet where practical.
> 
> This patch just wires up support; the next one will actually use it
> and provide performance measurements.
> 
> Signed-off-by: Eric Blake <eblake at redhat.com>
> ---
>  server/internal.h    |  3 +++
>  server/connections.c | 25 +++++++++++++++++++++++++
>  server/crypto.c      | 19 +++++++++++++++++++
>  3 files changed, 47 insertions(+)
> 
> diff --git a/server/internal.h b/server/internal.h
> index 2ee5e23..cb34323 100644
> --- a/server/internal.h
> +++ b/server/internal.h
> @@ -145,6 +145,8 @@ typedef int (*connection_recv_function) (struct connection *,
>  typedef int (*connection_send_function) (struct connection *,
>                                           const void *buf, size_t len)
>    __attribute__((__nonnull__ (1, 2)));
> +typedef int (*connection_cork_function) (struct connection *, bool)
> +  __attribute__((__nonnull__ (1)));
>  typedef void (*connection_close_function) (struct connection *)
>    __attribute__((__nonnull__ (1)));

After thinking more on it, I'm not sure I like this interface.

Linux also has MSG_MORE, which achieves the same effect as TCP_CORK but
with fewer syscalls:

setsockopt(TCP_CORK, 1)
send()
send()
setsockopt(TCP_CORK, 0)

vs.

send(,MSG_MORE)
send(,0)
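
Spelled out in C, the two variants might look roughly like this. This is
only an illustrative sketch for Linux, not code from the patch: the
function names reply_with_cork and reply_with_more are made up, fd is
assumed to be a connected TCP socket for the cork variant, and short
sends are not retried.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Variant 1 (hypothetical name): four syscalls.  TCP_CORK only works
 * on a TCP socket.  On error the socket may be left corked; a real
 * implementation would have to clean that up. */
int
reply_with_cork (int fd, const void *hdr, size_t hlen,
                 const void *data, size_t dlen)
{
  int on = 1, off = 0;

  if (setsockopt (fd, IPPROTO_TCP, TCP_CORK, &on, sizeof on) == -1)
    return -1;
  if (send (fd, hdr, hlen, 0) == -1 || send (fd, data, dlen, 0) == -1)
    return -1;
  /* Clearing the cork flushes whatever is still queued. */
  return setsockopt (fd, IPPROTO_TCP, TCP_CORK, &off, sizeof off);
}

/* Variant 2 (hypothetical name): two syscalls.  MSG_MORE hints that
 * more data follows, so the header need not go out on its own. */
int
reply_with_more (int fd, const void *hdr, size_t hlen,
                 const void *data, size_t dlen)
{
  if (send (fd, hdr, hlen, MSG_MORE) == -1)
    return -1;
  return send (fd, data, dlen, 0) == -1 ? -1 : 0;
}
```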

We can't use send() on a non-socket (think nbdkit -s), so we'd need two
different handlers for raw_send (one for anything that passes an initial
fstat()/S_ISSOCK and/or getsockopt() check, the other for everything
else). Still, I'm thinking that send-like semantics everywhere will be
just as easy to implement: if flag NBDKIT_MORE is set, conn->send does
either send(,MSG_MORE) (if MSG_MORE is available) or the pair cork/write;
if flag NBDKIT_MORE is not set, conn->send does either send(,0) (if
MSG_MORE is available) or the pair write/uncork.

I'll post a v2 along those lines, and see if it makes any difference in
the performance numbers (avoiding syscalls is always a good thing).  Using
sendmsg() could avoid even more syscalls by using a vectored send, but
that's more work for the callers to prepare the vector (our reply to
NBD_CMD_BLOCK_STATUS would require the most work), and we'd still have
to write fallback code for non-sockets to unvector.
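
For comparison, a vectored send with an unvectoring fallback could be
sketched as below. Again this is an assumption-laden illustration, not
proposed code: send_vectored is a made-up name, the caller is assumed to
already know whether fd is a socket, and a short sendmsg() is not
retried.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: hand header and payload to the kernel together. */
int
send_vectored (int fd, bool is_sock, struct iovec *iov, int iovcnt)
{
  if (is_sock) {
    /* One syscall for the whole reply. */
    struct msghdr msg = { .msg_iov = iov, .msg_iovlen = iovcnt };

    return sendmsg (fd, &msg, 0) == -1 ? -1 : 0;
  }

  /* Fallback for non-sockets: unvector into one write loop per element. */
  for (int i = 0; i < iovcnt; i++) {
    const char *p = iov[i].iov_base;
    size_t len = iov[i].iov_len;

    while (len > 0) {
      ssize_t r = write (fd, p, len);

      if (r == -1) {
        if (errno == EINTR)
          continue;
        return -1;
      }
      p += r;
      len -= (size_t) r;
    }
  }
  return 0;
}
```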

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
