[libvirt] [PATCH 2/2] virDomainLockManagerAddImage: Recursively lock backing chain

Daniel P. Berrange berrange at redhat.com
Wed Sep 9 13:32:27 UTC 2015


On Wed, Sep 09, 2015 at 02:37:13PM +0200, Martin Kletzander wrote:
> On Wed, Sep 09, 2015 at 09:49:16AM +0100, Daniel P. Berrange wrote:
> >On Wed, Sep 09, 2015 at 10:13:24AM +0200, Michal Privoznik wrote:
> >>On 08.09.2015 18:04, Daniel P. Berrange wrote:
> >>>>> following backing chain: D (top) -> B -> C (bottom), because in
> >>>>> both cases B and C are there just for reading. In order to
> >>>>> achieve that we must lock the rest of backing chain with
> >>>>> VIR_LOCK_MANAGER_RESOURCE_SHARED flag.
> >>>>
> >>>> See below ...
> >>>
> >>>>> +
> >>>>> +        if (!src->path)
> >>>>> +            break;
> >>>>> +
> >>>>> +        VIR_DEBUG("Add disk %s", src->path);
> >>>>> +        if (virLockManagerAddResource(lock,
> >>>>> +                                      VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK,
> >>>>> +                                      src->path,
> >>>>> +                                      0,
> >>>>> +                                      NULL,
> >>>>> +                                      diskFlags) < 0) {
> >>>>> +            VIR_DEBUG("Failed add disk %s", src->path);
> >>>>> +            goto cleanup;
> >>>>> +        }
> >>>>> +
> >>>>> +        /* Now that we've added the first disk (top layer of backing chain)
> >>>>> +         * into the lock manager, let's traverse the rest of the backing chain
> >>>>> +         * and lock each layer for RO. This should prevent accidentally
> >>>>> +         * starting a domain with RW disk from the middle of the chain. */
> >>>>> +        diskFlags = VIR_LOCK_MANAGER_RESOURCE_SHARED;
> >>>>
> >>>> The VIR_LOCK_MANAGER_RESOURCE_SHARED flag means "shared, writable" mode
> >>>> according to lock_driver.h, I think you want to lock it with
> >>>> VIR_LOCK_MANAGER_RESOURCE_READONLY.
> >>>
> >>> Yep, backing files should be marked readonly - 'shared' will allow
> >>> multiple VMs to write to the image at once. So you've only provided
> >>> mutual exclusion between an exclusive writer, vs a shared writer.
> >>
> >>Except that both of our locking drivers threat
> >>VIR_LOCK_MANAGER_RESOURCE_READONLY as no-op.
> >>
> >>But I think that since shared and exclusive writer are mutually
> >>exclusive, it doesn't matter that the lower layers are locked as shared
> >>writer. As soon as somebody wants to lock it exclusively they will fail.
> >>Nor virtlockd nor sanlock is actually enforcing the RO vs RW policy (if
> >>a file is locked for RO no writes are allowed). It's merely for tracking
> >>shared and exclusive locks.
> >
> >Right, so I remember why we ignore READONLY locks now - we were only
> >trying to protect against 2 processes /writing/ to the same image at
> >once, which would cause unrecoverable data corruption.
> >
> >We were not trying to protect against one guest having a disk readonly
> >while another guest has the disk read-write. That is bad because the
> >guest holding the image readonly will see inconsistent filesystem
> >state which may result in it kernel panicing. But the underling disk
> >is still only written by one guest at a time, so there's no disk
> >image corruption.
> >
> >This is similar to the scenario you illustrated with backing files.
> >One guest had a backing file open for write, and another guest had
> >it open readonly by virtue of having an upper layer open for write.
> >We're not actually going to suffer data corription of the backing
> >file here since we're still protected to not have concurrent writes
> >to it.
> >
> >We are going to destroy the upper layer image, by invalidating the
> >backing file, but that is true regardless of whether another guest
> >has the upper layer open right now or not.
> >
> >So in fact I think there's a case to say that this entire patch
> >is out of scope for the lock managers, as their goal is to prevent
> >concurrent writes to an image, not to protect against administrator
> >screwups with backing file writing.
> >
> 
> I'm just jumping into the discussion from the middle, so don't beat me
> for details I might've missed.
> 
> I don't quite get these last two paragraphs.  There will always be
> cases that can screw up the whole backing chain and there will always
> be users doing so.  But for the cases libvirt is able to know about,
> we should utilize the locks if we have the option to do so.  Whatever
> two domains are doing concurrently, we can make sure it's OK.  If one
> domain is invalidating a backing file of a another domain just because
> it's not running... well, there's plenty of stuff you can break
> without utilizing backing chains.  I think the idea of this patch is
> good, it just needs some polishing and after that, few follow-up
> patches that will utilize exclusive locks for all files used by block
> jobs.

The issue is that this is really special casing the lock manager operation
in the case of backing files. While it may solve the particular scenario,
outlined in the cover letter, that's only one of many scenarios related
to backing files. Furthermore, it does this by changing the semantics of
the lock manager's handling of read-only files, but only in the case of
backing files. I think this kind of special casing is something that is
liable to come back & cause pain in the future.  I realize the lock
manager is an appealing hammer to use here, but I don't think it should
be used on this particular nail.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list