[libvirt PATCH v4 12/25] nodedev: Refresh mdev devices when changes are detected

Erik Skultety eskultet at redhat.com
Mon Feb 22 13:12:23 UTC 2021


On Thu, Feb 18, 2021 at 12:04:56PM -0600, Jonathon Jongsma wrote:
> On Wed, 17 Feb 2021 14:35:51 +0100
> Erik Skultety <eskultet at redhat.com> wrote:
> 
> > > +static void
> > > +mdevctlEventHandleCallback(GFileMonitor *monitor G_GNUC_UNUSED,
> > > +                           GFile *file,
> > > +                           GFile *other_file G_GNUC_UNUSED,
> > > +                           GFileMonitorEvent event_type,
> > > +                           gpointer user_data)
> > > +{
> > > +    udevEventDataPtr priv = user_data;
> > > +    /* if a new directory appears, monitor that directory for
> > > changes */
> > > +    if (event_type == G_FILE_MONITOR_EVENT_CREATED &&
> > > +        g_file_query_file_type(file, G_FILE_QUERY_INFO_NONE, NULL)
> > > ==
> > > +        G_FILE_TYPE_DIRECTORY) {
> > > +        GList *newmonitors = monitorFileRecursively(priv, file);
> > > +        priv->mdevctlMonitors =
> > > g_list_concat(priv->mdevctlMonitors, newmonitors);  
> > 
> > priv->mdevctlMonitors is shared among threads, but you only acquire
> > the mutex just before exec'ing mdevctl in the mdevctlHandler thread,
> > so ^this can yield unexpected results in terms of directories
> > monitored.
> 
> A) It doesn't really affect the directories being monitored, it only
> affects our ability to free these monitors later in the destructor.
> 
> B) Aside from the very first time this variable is set in
> nodeStateInitialize(), the only place that accesses the mdevctlMonitors
> variable is this glib signal callback (to add new monitors to the list)
> and the destructor (to free the monitors). And I believe those should
> always run in the main thread. So I don't think there's really much of
> a thread-safety issue here. Perhaps there's a slight race in
> nodeStateInitialize() which appears to be called from a non-main
> thread. If a directory gets created immediately after connecting to the
> file monitor signal but before the function returns and the GList gets
> assigned to priv->mdevctlMonitors, I suppose the signal handler could
> run and it could get corrupted. Would you like me to protect all
> accesses with a Mutex?

If for some reason we need to extend the code in the future, it's so easy to
miss a single spot. So, I'd say - better safe than sorry - let's protect the
variable the way it's supposed to be even though the race might not exist right
now.

Erik

> 
> 
> > 
> > > +    }
> > > +
> > > +    /* When mdevctl creates a device, it can result in multiple
> > > notify events
> > > +     * emitted for a single logical change (e.g. several CHANGED
> > > events, or a
> > > +     * CREATED and CHANGED event followed by CHANGES_DONE_HINT).
> > > To avoid
> > > +     * spawning a mdevctl thread multiple times for a single
> > > logical
> > > +     * configuration change, try to coalesce these changes by
> > > waiting for the
> > > +     * CHANGES_DONE_HINT event. As a fallback,  add a timeout to
> > > trigger the
> > > +     * signal if that event never comes */  
> > 
> > So there's no guaranteed pattern to monitor, is that so? IOW Even the
> > CHANGES_DONE_HINT may not actually arrive? That's a very poor design.
> > I'd like to ditch the timer as much as possible.
> 
> Maybe we can simply rely on the CHANGES_DONE_HINT. I tend to be
> conservative and was trying to handle the case where the
> CHANGES_DONE_HINT might not be delivered (it is called a "hint" after
> all). Part of this was caused by my reading this comment in a test
> case within the glib repository:
> 
>   /* Sometimes the emission of 'CHANGES_DONE_HINT' may be late because
>    * it depends on the ability of file monitor implementation to report
>    * 'CHANGES_DONE_HINT' itself. If the file monitor implementation
>    * doesn't report 'CHANGES_DONE_HINT' itself, it may be emitted by
>    * GLocalFileMonitor after a few seconds, which causes the event to
>    * mix with results from different steps. Since 'CHANGES_DONE_HINT'
>    * is just a hint, we don't require it to be reliable and we simply
>    * ignore unexpected 'CHANGES_DONE_HINT' events here. */
> 
> So it is not reliably communicated if your file monitor backend doesn't
> implement it. But the above comment seems to suggest that if the file monitor
> backend doesn't support it, the event will be generated by glib itself and will
> simply be delayed by a couple of seconds rather than missing altogether. I'd
> have to read the glib code to tell whether it's possible for it to be omitted
> altogether. I guess I don't see the timer as a horrible workaround, but I can
> ditch it if you want.
> 
> Alternatively I suppose I could simply kick off a new thread for each file
> change event and not bother trying to coalesce them at all. This will result in
> perhaps a an extra thread or two being launched when mdevctl modifies
> the mdev registry behind our backs, but that's unlikely to be a common
> occurrence and so shouldn't really be a significant performance issue.

If there may be multiple events connected to a single change, which one would
trigger the actual device list update? I guess I'm starting to be okay with the
timeout in the end.

Erik




More information about the libvir-list mailing list