[libvirt PATCH] nodedev: wait a bit longer for new node devices

Erik Skultety eskultet at redhat.com
Wed Aug 24 16:12:57 UTC 2022


On Wed, Aug 24, 2022 at 10:56:22AM -0500, Jonathon Jongsma wrote:
> On 8/24/22 2:09 AM, Erik Skultety wrote:
> > On Tue, Aug 23, 2022 at 12:43:03PM -0500, Jonathon Jongsma wrote:
> > > OpenStack developers reported that newly-created mdevs were not
> > > recognized by libvirt until after a libvirt daemon restart. The source
> > > of the problem appears to be that when libvirt gets the udev 'add'
> > > event, the sysfs tree for that device might not be ready and so libvirt
> > > waits up to 100ms for it to appear (max 100 waits of 1ms each). But in the
> > > OpenStack environment, the sysfs tree for new mediated devices was
> > > taking closer to 250ms to appear and therefore libvirt gave up waiting
> > > and didn't add these new devices to its list of nodedevs.
> > > 
> > > By changing the wait time to 1 second (max 100 waits of 10ms each), this
> > > should provide enough time to enable these deployments to recognize
> > > newly-created mediated devices, but it shouldn't increase the delay for
> > > more traditional deployments too much.
> > > 
> > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2109450
> > > 
> > > Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com>
> > > ---
> > > 
> > > Alternatively, we could switch to triggering off of the udev 'bind' event
> > > rather than the 'add' event, but I wasn't able to convince myself that this
> > > would result in 100% compatible behavior, so this felt like the safest
> > > solution. If others can convince me that switching to 'bind' is safe, I can
> > > re-submit this patch.
> > 
> > Is there a guarantee that the filesystem tree is ready by the time the event
> > arrives? I remember back in the day when I implemented this, this was even
> > discussed on the kernel list and the outcome was that each application needs to
> > sort this out on its own, hinting that at least at that time there wasn't
> > any other way to do this reliably. Has something changed in the meantime?
> > 
> > Erik
> > 
> 
> I'm afraid I don't actually know if anything has changed in the kernel in
> this area. That's basically the reason that I proposed the approach that I
> did. But I do know that in the bug referenced, the 'bind' event comes about
> 250ms later than the 'add' event. I'm not sure if the filesystem tree is
> necessarily ready on 'bind', but the fact that it is 250ms later means that,
> at minimum, there's a significantly better chance that it is ready by that
> point than at the time of 'add'.

In that case I'd accept this solution over bind, since on a loaded system you
have no guarantee that the filesystem tree is ready by the time bind is
delivered, nor that bind cannot be delayed for a significantly longer period
(less likely).
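
For reference, the bounded wait being discussed can be sketched roughly as
below. This is only an illustration of the "max N waits of M ms each" pattern,
not the actual libvirt code: the names sleep_ms() and wait_for_path() are
hypothetical, and checking for the sysfs entry with access() is an assumption.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep for `ms` milliseconds, resuming after EINTR. */
void sleep_ms(long ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (ms % 1000) * 1000000L,
    };

    /* On EINTR nanosleep() stores the remaining time back into ts. */
    while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
        ;
}

/*
 * Hypothetical helper: check for `path` at most `max_tries` times,
 * sleeping `interval_ms` after each failed check.
 *
 * Returns 0 as soon as the path exists, or -1 once the budget of
 * roughly max_tries * interval_ms milliseconds is used up.
 */
int wait_for_path(const char *path, int max_tries, long interval_ms)
{
    for (int i = 0; i < max_tries; i++) {
        if (access(path, F_OK) == 0)
            return 0;
        sleep_ms(interval_ms);
    }
    return -1;
}
```

With max_tries = 100, an interval of 1 gives the old budget of about 100ms and
an interval of 10 gives the proposed budget of about 1 second. The loop returns
as soon as the path exists, so the longer budget is only spent in full when the
device never shows up; a device that is slow to appear is noticed with at most
one interval of extra latency.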

So, from my POV:
Reviewed-by: Erik Skultety <eskultet at redhat.com>

to the patch as is.

Regards,
Erik


