[Libguestfs] [libnbd PATCH v2] lib/errors.c: Fix assert fail in exit path in multi-threaded code

Eric Blake eblake at redhat.com
Wed Mar 8 22:47:38 UTC 2023


On Wed, Mar 08, 2023 at 04:29:03PM -0600, Eric Blake wrote:
> 
> There aren't any good ways to fix this.  We have proven that the
> assertions are too tight (it IS possible for one thread's first use of
> libnbd API, which causes the allocation of a thread-local error
> context, to occur after another thread has already triggered the
> destructor via exit(3)), so remove those.  Meanwhile, since POSIX says
> any use of a destroyed key is undefined, add some atomic bookkeeping
> such that we track a reference count of how many threads have
> allocated an error context and subsequently had the per-thread data
> destructor called.  Only call pthread_key_delete() if all threads had
> the chance to exit; leaking small amounts of memory is preferable to
> undefined behavior.

Correction: Continue to always call pthread_key_delete() - if the
module is being unloaded, we do NOT want any other thread exiting to
call back into our (now-unloaded) data destructor code.  But diagnose
when that unload action is known to trigger a memory leak.

> 
>  /* Destroy the thread-local key when the library is unloaded. */
> @@ -59,16 +62,16 @@ static void errors_key_destroy (void) __attribute__ ((destructor));
>  static void
>  errors_key_destroy (void)
>  {
> -  struct last_error *last_error = pthread_getspecific (errors_key);
> -
>    /* "No destructor functions shall be invoked by
>     * pthread_key_delete().  Any destructor function that may have been
>     * associated with key shall no longer be called upon thread exit."
> +   * If any threads were still using the key, the memory they allocated
> +   * will be leaked; this is unusual enough to diagnose.

Based on Dan's comments about libvirt, I can see that libvirt never
called pthread_key_delete(), which DOES leave a thread liable to call
into unloaded code if dlclose() does not leave the library resident.
But since libnbd IS trying to delete the key, I'm not yet convinced
that preventing library unloading is necessary to solve this
particular issue (although the broader conclusion that preventing
unload may be wise for other reasons, and these days memory is cheap
enough that preventing unload is unlikely to cause grief, may still
warrant that change in a separate patch).

>     */
> -  if (last_error != NULL) {
> -    free (last_error->error);
> -    free (last_error);
> -  }
> +  if (key_use_count > 1)
> +    fprintf (stderr, "%s: %s: %d threads leaking thread-local data\n",
> +             "libnbd", "errors_key_destroy", key_use_count - 1);
> +  key_use_count = 0;
>    pthread_key_delete (errors_key);
>  }

In short, it is intentional that this version of the patch
unconditionally calls pthread_key_delete() on module unload.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


More information about the Libguestfs mailing list