[libvirt] [PATCH v4 5/7] nodedev: Disable/re-enable polling on the udev fd

Thu Sep 28 10:00:08 UTC 2017

[...]

> >
> >          nodeDeviceLock();
> > +        priv = driver->privateData;
> >          udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
> >
> >          if (!udevEventCheckMonitorFD(udev_monitor, privateData->monitor_fd)) {
> > @@ -1725,6 +1727,9 @@ udevEventHandleThread(void *opaque)
> >          device = udev_monitor_receive_device(udev_monitor);
> >          nodeDeviceUnlock();
> >
> > +        /* Re-enable polling for new events on the @udev_monitor */
> > +        virEventUpdateHandle(priv->watch, VIR_EVENT_HANDLE_READABLE);
> > +
>
> I think this should only be done when privateData->nevents == 0?  If we
> have multiple events to read, then calling virEventPollUpdateHandle,
> (eventually) for every pass through the loop seems like a bit of
> overkill especially if udevEventHandleCallback turns right around and
> disables it again.
>
> Also fortunately there isn't more than one udev thread sending the
> events since you access the priv->watch without the driver lock...
>
> Conversely perhaps we only disable if events > 1... Then again, how does
> one get to 2 events queued if we're disabling as soon as we increment

Very good point, technically events would still get queued, we just wouldn't
check and yes, we would process 1 event at a time. Not optimal, but if you look
at the original code and compare it with this one performance-wise (and I hope
I haven't missed anything that would render everything I write next a complete
rubbish), the time complexity hasn't changed, the space complexity hasn't
changed, what changed is code complexity which makes the code a bit slower due
to the excessive locking and toggling the FD polling back and forth. So
essentially you've got the same thing as you had before..but asynchronous.
However, yes, the usage of @nevents is completely useless now (haven't realized
that immediately, thanks) and a simple signalling should suffice.

So how could we make it faster though? I thought more about the idea you shared
in one of the previous reviews, letting the thread actually pull all the data
from the monitor, to which I IIRC replied something in the sense that the event
counting mechanism wouldn't allow that and it would break. Okay, let's drop the
event counting. What if we now let both the udev handler thread and the event
loop "poll" the file descriptor, IOW let the event loop polling the monitor fd,
thus invoking udevHandleCallback which would in turn keep signalling the handler
thread that there are some data. The difference now in the handler thread would
be that it wouldn't blindly trust the callback about the data, because of the
scheduling issue, it would keep poking the monitor itself until it gets either
EAGAIN or EWOULDBLOCK from recvmsg() called in libudev,  which would then be the
signal to start waiting on the condition. The next an event appears, a signal
from udevHandleCallback would finally have a meaning and wouldn't be ignored.

This way, it's actually the handler thread who's 'the boss', as most of the
signals from udevHandleCallback would be lost/NOPs.

Would you agree to such an approach?

Erik

> nevents? Wouldn't this totally obviate the need for nevents?
>
> I would think it would be overkill to disable/enable for just 1 event.
> If multiple are queued, then yeah we're starting to get behind... Of
> course that could effect the re-enable when events == 1 (or some
> MAGIC_CONSTANT_NUMBER_THRESHOLD_TO_DISABLE_POLLING)

This magic threshold approach still wouldn't work because you don't know
anything about the event at that point, you'd have to process them to actually
find out whether that's a legitimate event or it's an already noted event that
hasn't been processed yet, so it would still mess with your counters.

>
> IIRC from the previous trip down this rabbit hole (I did briefly go back
> and read the comments) that what you're trying to avoid is the following
> message spamming the error logs because of a scheduling mishap... Thus
> perhaps the following message could only be displayed if nevents > 1 and
> nothing was found knowing that we have this "timing problem" between
> udev, this thread, and the fd polling scanner of when a device should be
> "found"...  and then reset nevents == 0 with the appropriate lock.
>
> Curious on your thoughts before an R-B/ACK though.
>
> John
>
>
> >          if (!device) {
> >              virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> >                             _("udev_monitor_receive_device returned NULL"));
> > @@ -1750,10 +1755,12 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
> >                          int events ATTRIBUTE_UNUSED,
> >                          void *opaque)
> >  {
> > +    udevPrivate *priv = NULL;
> >      struct udev_monitor *udev_monitor = NULL;
> >      udevEventThreadDataPtr threadData = opaque;
> >
> >      nodeDeviceLock();
> > +    priv = driver->privateData;
> >      udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
> >
> >      if (!udevEventCheckMonitorFD(udev_monitor, fd)) {
> > @@ -1771,6 +1778,16 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
> >      threadData->nevents++;
> >      virCondSignal(&threadData->threadCond);
> >      virMutexUnlock(&threadData->lock);
> > +
> > +    /* Due to scheduling, the eventloop might poll the udev monitor before the
> > +     * handler thread actually removes the data from the socket, thus causing it
> > +     * to spam errors about data not being ready yet, because
> > +     * udevEventHandleCallback would be invoked multiple times incrementing the
> > +     * counter of events queuing for a single event.
> > +     * Therefore we need to disable polling on the fd until the thread actually
> > +     * removes the data from the socket.
> > +     */
> > +    virEventUpdateHandle(priv->watch, 0);
> >  }
> >
> >
> >