[dm-devel] [PATCH] libmultipath: Use existing user friendly name if possible
christophe.varoqui at opensvc.com
Thu May 15 21:45:40 UTC 2014
I'd need your ack on this one.
On Thu, May 15, 2014 at 9:21 PM, Stewart, Sean <Sean.Stewart at netapp.com> wrote:
> Ping... Any additional comments or suggestions for this patch?
> Bumping in case it got lost in the backlog. :)
> On Fri, 2014-04-11 at 17:01 +0000, Stewart, Sean wrote:
> > On Fri, 2014-04-11 at 17:03 +0100, Bryn M. Reeves wrote:
> > > On Fri, Mar 28, 2014 at 09:01:14PM +0000, Stewart, Sean wrote:
> > > > When a system is booted to the SAN, a condition can occur where one
> > > > user friendly name is given to a disk during boot, but multipathd tries
> > > > to allocate a different one after boot. If the second alias is
> > > > used by another device, multipathd can't rename it. Multipathd then has
> > > > incorrect information about the alias/wwid relationships, which can
> > > > result in paths being added to the wrong map.
> > >
> > > This should only happen if the initramfs and root file system have
> > > > inconsistent multipath configurations (either multipath.conf or the
> > > > bindings / wwids file mismatched). That's not really a valid configuration for
> > > the system to be in and leads to the type of problems you describe.
> > It is true that it only happens if they are out of sync. We tried
> > remaking the initramfs to fix the problem, but it didn't help.
> > >
> > > > This patch works around this problem by first trying to use the alias
> > > > already bound to a device during boot. If the bindings file has that
> > > > alias bound to a different device, it'll auto generate a new alias to
> > > > rename it to.
> > >
> > > To be honest I'd prefer to see this cause an error. These types of
> > > configurations currently run the risk of silent data corruption - I'd
> > > much rather deal with a system that refuses to boot due to an out of
> > > date initramfs image than one that quietly remaps paths in unexpected
> > > ways.
> > The issue, though, is that the system does not refuse to boot. In the
> > case we saw, it booted anyway, our QA engineer ran a test, and it ended
> > with a data corruption. A user could perform a fresh installation, map
> > new luns, reboot, and without any way of realizing it have essentially a
> > ticking time bomb on their hands, ready to go off as soon as there's a
> > blip in the SAN.
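
For readers following the thread, the alias selection described in the quoted
patch description can be sketched roughly as below. This is only an
illustration in Python, not the libmultipath code: the function names, the
dict standing in for the bindings file, and the "mpath" prefix default are all
assumptions made for the example. It covers only the two cases the
description mentions (reuse the boot-time alias, or generate a new one when
the bindings file ties that alias to a different WWID).

```python
def suffix(n):
    """Hypothetical helper: 1 -> 'a', 26 -> 'z', 27 -> 'aa' (bijective base 26)."""
    s = ""
    while n > 0:
        n, r = divmod(n - 1, 26)
        s = chr(ord("a") + r) + s
    return s


def select_alias(wwid, boot_alias, bindings, prefix="mpath"):
    """Pick an alias for `wwid`.

    `bindings` is a dict mapping alias -> wwid, standing in for the
    bindings file. `boot_alias` is the alias the map already carries
    from early boot, or None. Both are assumptions of this sketch.
    """
    # Prefer the alias the device already has, as long as the bindings
    # do not tie that alias to some other WWID.
    if boot_alias is not None:
        owner = bindings.get(boot_alias)
        if owner is None or owner == wwid:
            bindings[boot_alias] = wwid
            return boot_alias

    # Conflict (or no boot alias): generate the first unused alias.
    n = 1
    while prefix + suffix(n) in bindings:
        n += 1
    alias = prefix + suffix(n)
    bindings[alias] = wwid
    return alias


if __name__ == "__main__":
    bindings = {"mpatha": "WWID-1"}
    # The boot-time alias "mpatha" belongs to another device in the bindings,
    # so a fresh alias is generated instead of renaming onto a used one.
    print(select_alias("WWID-2", "mpatha", bindings))
```

The sketch does not handle the case where the bindings already hold a
different alias for the same WWID, and it silently picks a new name on
conflict, which is exactly the behaviour Bryn argues should be an error
instead.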