[Libguestfs] [PATCH 0/3] Alternate way to avoid race conditions when nbdkit exits.

Eric Blake eblake at redhat.com
Thu Nov 16 23:09:57 UTC 2017


On 11/14/2017 11:30 AM, Richard W.M. Jones wrote:
> This fixes the race conditions for me, using the test described here:
> 
> https://www.redhat.com/archives/libguestfs/2017-September/msg00226.html

Running test-socket-activation in a loop, I've hit other races (some
provoked a bit more easily with the code I'm working on), and will be
posting some patches where I know the solution:

Right now, our use of threadlocal_set_name (plugin_name ()) makes our
thread-local storage point to a string in module memory. If we have any
nbdkit_debug() or other statement that prints after .unload, we get a
SEGV.  Solution(s): initializing a plugin should strdup() the name so
that plugin_name() is valid for the life of the program, rather than
pointing into module memory; and threadlocal_set_name() should strdup()
the name rather than relying on the lifetime of the storage of whatever
the caller passed in (either fix in isolation solves that SEGV, but
using both seems like a good idea).

We also have a race that results in a hang in
sockets.c:accept_incoming_connections(); if I understand the problem, it
is something like:

main                    signal context
--------------------------------------
first iteration, finish accept_connect()
checks !quit, starts second iteration
                        SIGTERM received
                        set quit
call poll()


which hangs in poll() waiting for a connection that never arrives,
compared to the usual:

main                    signal context
--------------------------------------
first iteration, finish accept_connect()
                        SIGTERM received
                        set quit
checks !quit, ends loop


or:

main                    signal context
--------------------------------------
first iteration, finish accept_connect()
checks !quit, starts second iteration
call poll()
                        SIGTERM received
                        set quit
poll() fails with EINTR
checks !quit, ends loop


Here, I'm less sure of what would reliably solve things. Maybe we want a
pipe-to-self setup where our signal handler (which currently sets quit)
instead (or in addition) writes into the pipe. Our event loop (the
poll() in accept_connect()) would then have an additional fd to poll in
parallel with all the socket fds; that way, the arrival of a signal will
ensure that poll() wakes up even if none of the pending socket fds
receives another connection.
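
To make the idea concrete, here is a minimal sketch of the pipe-to-self pattern, under assumptions: quit_pipe, handle_quit(), setup_quit_pipe(), wait_for_event() and MAX_SOCKS are hypothetical names for illustration, not existing nbdkit code, and only SIGTERM is wired up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

#define MAX_SOCKS 16                    /* arbitrary limit for the sketch */

static volatile sig_atomic_t quit;
static int quit_pipe[2] = { -1, -1 };   /* [0] = read end, [1] = write end */

/* Signal handler: set the flag and also make the pipe readable. */
static void
handle_quit (int sig)
{
  int saved_errno = errno;
  char c = 0;

  (void) sig;
  quit = 1;
  /* write() is async-signal-safe.  If the non-blocking pipe is full the
   * write fails with EAGAIN, which is harmless: the read end is already
   * readable, so poll() will still wake up. */
  if (write (quit_pipe[1], &c, 1) == -1) {
    /* nothing useful to do from signal context */
  }
  errno = saved_errno;
}

/* Create the non-blocking pipe and install the SIGTERM handler. */
static int
setup_quit_pipe (void)
{
  struct sigaction sa;
  int i, flags;

  if (pipe (quit_pipe) == -1)
    return -1;
  for (i = 0; i < 2; i++) {
    flags = fcntl (quit_pipe[i], F_GETFL);
    if (flags == -1 ||
        fcntl (quit_pipe[i], F_SETFL, flags | O_NONBLOCK) == -1)
      return -1;
  }
  sa.sa_handler = handle_quit;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  return sigaction (SIGTERM, &sa, NULL);
}

/* Block until a listening socket is readable or a quit was requested.
 * Returns 1 to quit, 0 if some socket is ready, -1 on error. */
static int
wait_for_event (const int *socks, size_t nr_socks)
{
  struct pollfd fds[MAX_SOCKS + 1];
  size_t i;

  assert (nr_socks <= MAX_SOCKS);
  for (i = 0; i < nr_socks; i++) {
    fds[i].fd = socks[i];
    fds[i].events = POLLIN;
    fds[i].revents = 0;
  }
  fds[nr_socks].fd = quit_pipe[0];      /* the extra fd for signals */
  fds[nr_socks].events = POLLIN;
  fds[nr_socks].revents = 0;

  for (;;) {
    if (quit)
      return 1;
    /* A signal landing between the check above and this call no longer
     * causes a hang: the handler's byte keeps quit_pipe[0] readable. */
    if (poll (fds, (nfds_t) (nr_socks + 1), -1) == -1) {
      if (errno == EINTR)
        continue;
      return -1;
    }
    if (quit || (fds[nr_socks].revents & POLLIN))
      return 1;
    return 0;
  }
}
```

The byte is never read back in this sketch, since the only reaction to it is to leave the loop. A real implementation would also have to decide which other signals share the handler and whether the pipe fds need FD_CLOEXEC.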

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
