[dm-devel] [PATCH v2 27/37] multipathd: watch bindings file with inotify + timestamp

Martin Wilck mwilck at suse.com
Thu Sep 14 13:25:35 UTC 2023


On Wed, 2023-09-13 at 17:07 -0500, Benjamin Marzinski wrote:
> On Mon, Sep 11, 2023 at 06:38:36PM +0200, mwilck at suse.com wrote:
> > From: Martin Wilck <mwilck at suse.com>
> > 
> > Since "libmultipath: keep bindings in memory", we don't re-read the
> > bindings file after every modification. Add a notification
> > mechanism
> > that makes multipathd aware of changes to the bindings file.
> > Because
> > multipathd itself will change the bindings file, it must compare
> > timestamps in order to avoid reading the file repeatedly.
> > 
> > Because select_alias() can be called from multiple thread contexts
> > (uxlsnr,
> > uevent handler), we need to add locking for the bindings file. The
> > timestamp must also be protected by a lock, because it can't be
> > read
> > or written atomically.
> > 
> > Note: The notification mechanism expects the bindings file to be
> > atomically replaced by rename(2). Changes must be made in a
> > temporary file and
> > applied using rename(2), as in update_bindings_file(). The inotify
> > mechanism deliberately does not listen to close-after-write events
> > that would be generated by editing the bindings file directly. This
> > 
> > Note also: new bindings will only be read from add_map_with_path(),
> > i.e. either during reconfigure(), or when a new map is created
> > during
> > runtime. Existing maps will not be renamed if the binding file
> > changes,
> > unless the user runs "multipathd reconfigure". This is not a change
> > wrt the previous code, but it should be mentioned anyway.
> > 
> > Signed-off-by: Martin Wilck <mwilck at suse.com>
> > 
> > libmultipath: protect global_bindings with a mutex
> > 
> > Signed-off-by: Martin Wilck <mwilck at suse.com>
> > 
> > libmultipath: check timestamp of bindings file before reading it
> > 
> > Signed-off-by: Martin Wilck <mwilck at suse.com>
> > ---
> >  libmultipath/alias.c              | 250 +++++++++++++++++++++++++-
> > ----
> >  libmultipath/alias.h              |   3 +-
> >  libmultipath/libmultipath.version |   5 +
> >  multipathd/uxlsnr.c               |  36 ++++-
> >  tests/alias.c                     |   3 +
> >  5 files changed, 252 insertions(+), 45 deletions(-)
> > 
> > diff --git a/libmultipath/alias.c b/libmultipath/alias.c
> > index 66e34e3..76ed62d 100644
> > --- a/libmultipath/alias.c
> > +++ b/libmultipath/alias.c
> > @@ -10,6 +10,7 @@
> >  #include <stdio.h>
> >  #include <stdbool.h>
> >  #include <assert.h>
> > +#include <sys/inotify.h>
> >  
> >  #include "debug.h"
> >  #include "util.h"
> > @@ -22,6 +23,7 @@
> >  #include "config.h"
> >  #include "devmapper.h"
> >  #include "strbuf.h"
> > +#include "time-util.h"
> >  
> >  /*
> >   * significant parts of this file were taken from iscsi-bindings.c
> > of the
> > @@ -50,6 +52,12 @@
> >  "# alias wwid\n" \
> >  "#\n"
> >  
> > +/* uatomic access only */
> > +static int bindings_file_changed = 1;
> > +
> > +static pthread_mutex_t timestamp_mutex =
> > PTHREAD_MUTEX_INITIALIZER;
> > +static struct timespec bindings_last_updated;
> > +
> >  struct binding {
> >         char *alias;
> >         char *wwid;
> > @@ -60,6 +68,9 @@ struct binding {
> >   * an abstract type.
> >   */
> >  typedef struct _vector Bindings;
> 
> I'm pretty sure that the global_bindings is only ever accessed with
> the
> vecs lock held, so it doesn't really need it's own lock. On the
> otherhand, I understand the desire to not keep adding things that the
> vecs lock is protecting, so I'm fine with this.

I have to admit I didn't think about the vecs lock. I find it counter-
intuitive to think about the vecs lock as "global multipath lock", even
if it is in practice. In my mind, the vecs lock protects the pathvec
and the mpvec, and possibly their members, but no other data
structures. Implicitly relying on the vecs lock protecting unrelated
data structures seems unwise to me. Of course we have to consider the
possibility of deadlock, but that's avoided by holding the new lock
only in very short portions of the code, and not across any function
calls or the like.

> 
> > +
> > +/* Protect global_bindings */
> > +static pthread_mutex_t bindings_mutex = PTHREAD_MUTEX_INITIALIZER;
> >  static Bindings global_bindings = { .allocated = 0 };
> >  
> >  enum {
> > @@ -78,6 +89,26 @@ static void _free_binding(struct binding *bdg)
> >         free(bdg);
> >  }
> >  
> > +static void free_bindings(Bindings *bindings)
> > +{
> > +       struct binding *bdg;
> > +       int i;
> > +
> > +       vector_foreach_slot(bindings, bdg, i)
> > +               _free_binding(bdg);
> > +       vector_reset(bindings);
> > +}
> > +
> > +static void set_global_bindings(Bindings *bindings)
> > +{
> 
> However, if we are acting like the vecs lock isn't protecting
> global_bindings, then we should move this to inside the bindings
> mutex below.

> Otherwise, if it was possible for another thread to modify
> global_bindings after setting old_bindings but before grabbing the
> mutex, it could add a binding, making old_binding->allocated and
> old_binding->slot incorrect, so we'd be freeing random memory below.
> 
Right, thanks.

> > +	Bindings old_bindings = global_bindings;
> > +
> > +	pthread_mutex_lock(&bindings_mutex);
> > +	global_bindings = *bindings;
> > +	pthread_mutex_unlock(&bindings_mutex);
> > +	free_bindings(&old_bindings);
> > +}
> > +

> > +void handle_bindings_file_inotify(const struct inotify_event
> *event)
> > +{
> > +	struct config *conf;
> > +	const char *base;
> > +	bool changed = false;
> > +	struct stat st;
> > +	struct timespec ts = { 0 };
> > +	int ret;
> > +
> > +	if (!(event->mask & IN_MOVED_TO))
> > +		return;
> > +
> > +	conf = get_multipath_config();
> > +	base = strrchr(conf->bindings_file, '/');
> > +	changed = base && base > conf->bindings_file &&
> > +		!strcmp(base + 1, event->name);
> > +	ret = stat(conf->bindings_file, &st);
> > +	put_multipath_config(conf);
> > +
> > +	if (!changed)
> > +		return;
> > +
> > +	pthread_mutex_lock(&timestamp_mutex);
> I'm not sure if we should assert that the file has changed if we
> can't stat() it.

I think it's better to (try to) reread the file than pretend that the
file hadn't changed ("if in doubt, reread"). Rationale:

In general, the most likely cause for stat() to fail would be that the
file (or the directory or file system containing it) had been removed.
Actually, almost every possible error documented in stat(2) (except
ENOMEM) indicates such a condition in one way or the other. But in an
IN_MOVED_TO handler for just this file, that seems quite unlikely, so
we're really looking at a corner case situation here. A non-existing
file means no bindings; a "reconfigure" operation would cause existing
bindings to be preserved, newly probed maps would get a new alias
assigned. Looking at it that way, "rereading" a non-existing file
doesn't do much harm. Our current bindings list may contain additional
bindings that might be lost by re-reading, but still I think we have to
assume that the file was intentionally deleted, and act accordingly.

> > +	if (ret == 0) {
> > +		ts = st.st_mtim;
> > +		changed = timespeccmp(&ts, &bindings_last_updated) >
> 0;
> > +	}
> 

> > 
> > @@ -248,7 +328,7 @@ static int update_bindings_file(const Bindings
> > *bindings,
> >         }
> >         umask(old_umask);
> >         pthread_cleanup_push(cleanup_fd_ptr, &fd);
> > -       rc = write_bindings_file(bindings, fd);
> > +       rc = write_bindings_file(bindings, fd, &ts);
> >         pthread_cleanup_pop(1);
> >         if (rc == -1) {
> >                 condlog(1, "failed to write new bindings file");
> > @@ -257,8 +337,12 @@ static int update_bindings_file(const Bindings
> > *bindings,
> >         }
> 
> Isn't there a race where we rename the file and then update the
> timestamp.  If we respond to the inotify event between when we rename
> and when we lock the timestamp_mutex, we will trigger a reread based
> on
> our own changes.  Perhaps we should set the timestamp before renaming
> the file. If the rename fails, it's not clear what we want to do
> anyway,
> since we have bindings that didn't make it into the file, and if we
> reread the file, we lose them.

True, if the rename fails, we're in an awkward situation. But I think
it's important that, even then, we don't lie about the timestamp,
therefore it shouldn't be updated if the file was not.

Again, this follows the "if in doubt, re-read" mind set. There is no
completely race-free way to implement this using time stamps, in
particular if we consider a situation where multipathd's own update
races with some other process (multipath) updating the file. 
Therefore believe it's correct to set the timestamp after the rename. I
must definitely move the timestamp update before the condlog(), though.

> 
> >         if ((rc = rename(tempname, bindings_file)) == -1)
> >                 condlog(0, "%s: rename: %m", __func__);
> > -       else
> > +       else {
> >                 condlog(1, "updated bindings file %s",
> > bindings_file);
> > +               pthread_mutex_lock(&timestamp_mutex);
> > +               bindings_last_updated = ts;
> > +               pthread_mutex_unlock(&timestamp_mutex);
> > +       }
> >         return rc;
> >  }
> >  
> > ...
> >  
> >  int get_user_friendly_wwid(const char *alias, char *buff)
> >  {
> >         const struct binding *bdg;
> > +       int rc = -1;
> >  
> >         if (!alias || *alias == '\0') {
> >                 condlog(3, "Cannot find binding for empty alias");
> >                 return -1;
> >         }
> 
> Don't we want to call read_bindings_file() here as well?

Good point, thanks.

> 
> > +       pthread_mutex_lock(&bindings_mutex);
> > +       pthread_cleanup_push(cleanup_mutex, &bindings_mutex);
> >         bdg = get_binding_for_alias(&global_bindings, alias);
> > -       if (!bdg) {
> > +       if (bdg) {
> > +               strlcpy(buff, bdg->wwid, WWID_SIZE);
> > +               rc = 0;
> > +       } else
> >                 *buff = '\0';
> > -               return -1;
> > -       }
> > -       strlcpy(buff, bdg->wwid, WWID_SIZE);
> > -       return 0;
> > -}
> > -
> > -static void free_bindings(Bindings *bindings)
> > -{
> > -       struct binding *bdg;
> > -       int i;
> > -
> > -       vector_foreach_slot(bindings, bdg, i)
> > -               _free_binding(bdg);
> > -       vector_reset(bindings);
> > +       pthread_cleanup_pop(1);
> > +       return rc;
> >  }
> >  
> >  void cleanup_bindings(void)
> >  {
> > +       pthread_mutex_lock(&bindings_mutex);
> >         free_bindings(&global_bindings);
> > +       pthread_mutex_unlock(&bindings_mutex);
> >  }
> >  
> >  enum {
> > @@ -595,7 +707,21 @@ static int _check_bindings_file(const struct
> > config *conf, FILE *file,
> >         char *line = NULL;
> >         size_t line_len = 0;
> >         ssize_t n;
> > +       char header[sizeof(BINDINGS_FILE_HEADER)];
> >  
> > +       header[sizeof(BINDINGS_FILE_HEADER) - 1] = '\0';
> > +       if (fread(header, sizeof(BINDINGS_FILE_HEADER) - 1, 1,
> > file)
> 
> I'm pretty sure fread returns the number of items read, in this case
> 1.

Argh, thanks. I think this is the first time I've knowingly used this
API, and I got it wrong :-/

> 
> > +           < sizeof(BINDINGS_FILE_HEADER)  - 1) {
> > +               condlog(2, "%s: failed to read header from %s",
> > __func__,
> > +                       conf->bindings_file);
> > +               fseek(file, 0, SEEK_SET);
> > +               rc = -1;
> > +       } else if (strcmp(header, BINDINGS_FILE_HEADER)) {
> > +               condlog(2, "%s: invalid header in %s", __func__,
> > +                       conf->bindings_file);
> > +               fseek(file, 0, SEEK_SET);
> > +               rc = -1;
> > +       }
> >         pthread_cleanup_push(cleanup_free_ptr, &line);
> >         while ((n = getline(&line, &line_len, file)) >= 0) {
> >                 char *alias, *wwid;
> > @@ -643,6 +769,68 @@ static int mp_alias_compar(const void *p1,
> > const void *p2)
> >                             &((*(struct mpentry * const *)p2)-
> > >alias));
> >  }
> >  
> > +static int _read_bindings_file(const struct config *conf, Bindings
> > *bindings,
> > +                              bool force)
> > +{
> > +       int can_write;
> > +       int rc = 0, ret, fd;
> > +       FILE *file;
> > +       struct stat st;
> > +       int has_changed = uatomic_xchg(&bindings_file_changed, 0);
> > +
> > +       if (!force) {
> > +               if (!has_changed) {
> > +                       condlog(4, "%s: bindings are unchanged",
> > __func__);
> > +                       return BINDINGS_FILE_UP2DATE;
> > +               }
> > +       }
> > +
> > +       fd = open_file(conf->bindings_file, &can_write,
> > BINDINGS_FILE_HEADER);
> > +       if (fd == -1)
> > +               return BINDINGS_FILE_ERROR;
> > +
> > +       file = fdopen(fd, "r");
> > +       if (file != NULL) {
> > +               condlog(3, "%s: reading %s", __func__, conf-
> > >bindings_file);
> > +
> > +               pthread_cleanup_push(cleanup_fclose, file);
> > +               ret = _check_bindings_file(conf, file, bindings);
> > +               if (ret == 0) {
> > +                       struct timespec ts;
> > +
> > +                       rc = BINDINGS_FILE_READ;
> > +                       ret = fstat(fd, &st);
> > +                       if (ret == 0)
> > +                               ts = st.st_mtim;
> > +                       else {
> > +                               condlog(1, "%s: fstat failed (%m),
> > using current time", __func__);
> > +                               clock_gettime(CLOCK_REALTIME_COARSE
> > , &ts);
> > +                       }
> > +                       pthread_mutex_lock(&timestamp_mutex);
> > +                       bindings_last_updated = ts;
> > +                       pthread_mutex_unlock(&timestamp_mutex);
> > +               } else if (ret == -1 && can_write && !conf-
> > >bindings_read_only) {
> > +                       ret = update_bindings_file(bindings, conf-
> > >bindings_file);
> > +                       if (ret == 0)
> > +                               rc = BINDINGS_FILE_READ;
> > +                       else
> > +                               rc = BINDINGS_FILE_BAD;
> 
> I don't think _read_bindings_file() can return a value other than 0
> or
> -1, and we already know it didn't return 0 here.

You meant _check_bindings_file(). And yes, you're right.

Thanks,
Martin



More information about the dm-devel mailing list