[libvirt] [PATCH v2 0/5] Workaround mdev uevent race affecting node device driver

Erik Skultety eskultet at redhat.com
Fri Jul 28 10:06:25 UTC 2017


On Fri, Jul 28, 2017 at 09:43:57AM +0200, Erik Skultety wrote:
> v1 here https://www.redhat.com/archives/libvir-list/2017-June/msg00826.html
>
> This series addresses [1]. It builds on top of [2], series which introduces
> a few small fixes and improvements. Even though that one hasn't gotten a review
> yet, you can fetch and build this from my github branch [3].
>
> Changes since v1:
> - moved the device handling routine into a detached worker thread, so the event
> loop wouldn't block
> - adjusted virFileAccess according to the reviewer's comments
>
> Notes:
> Merging the nodedev obj's refactor storing the list of nodedev objects in a
> self-lockable object made it possible to drop the driver locks at certain
> spots either completely or at least rearrange them according to the needs. The
> worker thread therefore doesn't need to hold the driver lock while waiting on
> the device's attributes to appear (which would be wrong anyway, since it would
> still block the event loop) and only acquires the driver lock when doing sanity
> checks on the udev monitor.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1463285
> [2] https://www.redhat.com/archives/libvir-list/2017-July/msg01077.html
> [3] https://github.com/eskultety/libvirt/commits/mdev-nodedev-next
>
> Erik Skultety (5):
>   nodedev: Introduce udevCheckMonitorFD helper function
>   udev: Split udevEventHandleCallback in two functions
>   udev: Convert udevEventHandleThread to an actual thread routine
>   util: Introduce virFileWaitForAccess
>   nodedev: Work around the uevent race by hooking up
>     virFileWaitForAccess
>
>  src/libvirt_private.syms           |   1 +
>  src/node_device/node_device_udev.c | 194 +++++++++++++++++++++++++++++++------
>  src/util/virfile.c                 |  29 ++++++
>  src/util/virfile.h                 |   2 +
>  4 files changed, 194 insertions(+), 32 deletions(-)

SNACK, I was pointed out that the worker thread spams the log with an
unexpected error("monitor received NULL"), so it looks like there's a race
somewhere. I checked the event counting, that looks okay, except that there's
quite a few events coming on the fd during daemon's init phase and for most of
them, there's nothing coming out of the monitor actually. I need to investigate
more.

Erik




More information about the libvir-list mailing list