[Libguestfs] [PATCH libnbd v2 3/4] api: Implement concurrent writer.
Eric Blake
eblake at redhat.com
Tue Jun 4 12:02:29 UTC 2019
On 6/4/19 6:16 AM, Richard W.M. Jones wrote:
> There are several races / deadlocks which I've thought about. Let's
> see if I can remember them all ...
>
> (1) This I experienced: nbd_aio_get_fd deadlocks if there are
> concurrent synchronous APIs going on. A typical case is where you set
> up the concurrent writer thread before connecting, and then call a
> synchronous connect function such as connect_tcp. The synchronous
> function grabs h->lock, then writes something, which eventually
> invokes the writer thread which calls nbd_aio_get_fd and deadlocks on
> h->lock.
>
> -> Probably the writer thread should be forbidden from using
> nbd_handle.
Or rather, forbidden from using nbd_handle in a locking manner. We
should basically document a whitelist of functions that are safe to call...
>
> (2) The writer thread calls nbd_aio_get_fd, but the fd returned might
> might be closed before we use it, resulting in either EBADF or worse
> using another fd that happens to be opened around the same time.
>
> -> I think the solution to this would be to allow the writer callback
> to signal that the socket is about to be closed (eg. add an extra flag
> parameter to the callback), which would kill the writer thread.
Yes, I think a flag argument to the writer callback can serve a couple
of additional purposes:
Right now, the writer callback is reached exactly once for most commands
(on data from h->request), and exactly twice for NBD_CMD_WRITE
(h->request, then the user's persistent buffer). A flag argument would
let us inform the user whether the buffer is transient (h->request is
volatile; the writer thread MUST memcpy it aside; but it is fixed in
length so the memcpy is not performance sensitive) or persistent (the
NBD_CMD_WRITE buffer was provided by the caller, and may be several
megabytes; memcpy() can negatively impact the cache and is pointless
since the original buffer is not going to be scribbled on); it also lets
us inform the writer thread on whether TCP_CORK would be worthwhile
(corking the outgoing socket for NBD_CMD_WRITE would let the OS buffer
things into a single wire transaction on both the request header and the
payload, rather than more overhead for two separate TCP packets,
particularly since we've disabled Nagle's algorithm).
Or, in pseudo-code:
nbd_aio_pread(,userbuf,) - >
write_callback(data, h->request, len, 0);
nbd_aio_pwrite(,userbuf,) - >
write_callback(data, h->request, len, LIBNBD_WRITE_CORK)
write_callback(data, userbuf, len, LIBNBD_WRITE_PERSISTENT_BUF)
nbd_aio_shutdown() - >
write_callback(data, NULL, 0, LIBNBD_WRITE_SHUTDOWN)
>
> (3) nbd_concurrent_writer_error could lose errors. This might happen
> if the socket is closed normally without writing anything, which would
> never check h->writer_error.
>
> (4) nbd_concurrent_writer_error possibly deadlocks too since it needs
> to grab h->lock. Basically the same as (1).
...and this function must be in the documented safe whitelist, but that
means that instead of grabbing h->lock, we have to implement error
reporting by use of atomics.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20190604/f7d178b6/attachment.sig>
More information about the Libguestfs
mailing list