[libvirt] Checking the expected behavior on dynamic image file ownership when live migrating

Christian Ehrhardt christian.ehrhardt at canonical.com
Tue Jul 2 10:32:12 UTC 2019


On Tue, Jul 2, 2019 at 10:29 AM Daniel P. Berrangé <berrange at redhat.com> wrote:
>
> On Mon, Jul 01, 2019 at 08:20:44PM +0200, Christian Ehrhardt wrote:
> > On Mon, Jul 1, 2019 at 7:56 PM Daniel P. Berrangé <berrange at redhat.com> wrote:
> > >
> > > On Mon, Jul 01, 2019 at 05:26:45PM +0200, Christian Ehrhardt wrote:
> > > > Hi,
> > > > today I was debugging an issue that I found with qemu 4.0 and ended up
> > > > puzzled about file ownership. I'm almost EOD now, but wanted to reach
> > > > out here for a sanity check before I debug further. The case
> > > > summarized is this:
> > > >
> > > > 1. start guest
> > > >   1.1 image files are changed to libvirt-qemu:kvm (which matches
> > > > Ubuntu's user/group config)
> > > > 2. live migrate the guest to a different node
> > > >   2.1 image files go back to root:root which they initially had (ok)
> > > > 3. migrate guest back to the original node
> > > >   x. image files stay root:root and are not changed back to libvirt-qemu:kvm
> > > >
> > > > That is odd/unexpected, but it seems the same applies to older
> > > > versions and there it was never a problem so far.
> > >
> > > This sounds odd - I see no reason why the first migration should
> > > behave differently from the second migration, as libvirt makes
> > > no distinction & has no knowledge of previous migrations.
> >
> > I might have been unclear, let me clarify:
> > 1. migration without copy (changes the ownership)
> > lxc exec testkvm-eoan-from -- virsh migrate --live --unsafe
> > kvmguest-eoan-normal qemu+ssh://10.222.144.19/system
> > (and the same type backwards)
> > 2. migration away from here with --copy-storage* fails due to file ownership
> >
> > > Could the two hosts be configured differently in some way? For
> > > example, is the disk image on ext4 on one host, but NFS on the
> > > other host? With shared filesystems, we'd generally expect
> > > the disk to be exposed on a shared filesystem on both hosts.
> >
> > The two hosts are in fact two LXD containers sharing the same
> > filesystem (for the first migration).
> > For the second, copy-storage-* migration they are still two containers,
> > but without a shared FS; that one eventually fails due to the bad
> > file ownership.
> >
> > But since they in fact use shared storage that might still be the
> > reason - maybe the ordering is important here.
> > Since in the setup for migration #1 they really use shared storage, it
> > might be that we have something like:
> >
> > 1. source: migrates off guest
> > 2. target: receives guest
> > 3. target: guest is complete and changes file ownership to
> > libvirt-qemu:kvm (correct)
> > 4. source: shuts down the stub that is left after migration and
> > changes file ownership to root:root
> >
> > Because both really use the same storage (not just the same FCP or SCSI
> > device, but the same FS), that could be the reason the ownership is reset
> > after a migration cycle.
> > Thanks for the hint, worth a check with some slightly altered setups ...
> > ... and confirmed - if I use copied images, but not on a shared FS the
> > ownership is handled correctly.
> >
> > So my way of sharing the FS might be odd. Libvirt does not detect
> > it as shared, and due to that the above ordering triggers the issue of
> > the ownership being set to root:root by the migration source after the
> > migration is complete.
>
> Yes, this is going to be a problem. Libvirt will look at the filesystem
> and see that it is a local-only filesystem, and so assume it can safely
> reset ownership when source VM shuts down. It cannot tell that this is
> going to negatively impact the target VM using the same filesystem.
>
> This is an example of one of the problems that makes us say that
> localhost migration is not supported.
>
> If you're going to use containers, you need to make sure that each
> container either has a separate filesystem mount, or that the container
> sees a filesystem like NFS so it knows it is shared.
>
> > Quoting the virsh man page my setup is similar to "disk images are
> > stored on coherent clustered filesystem, such as GFS2 or GPFS" but my
> > libvirt doesn't know that and therefore changes the ownership.
> > Maybe it would have the same bug on GFS2/GPFS?
> > I don't have such a setup at hand, how is the issue avoided there?
> > Is there a way to make libvirt realize that it really is on "the same"
> > FS to avoid these operations?
>
> Libvirt uses statfs to find the filesystem magic;
> virFileIsSharedFSType checks for NFS, GFS2, OCFS, AFS, SMB,
> CIFS, CEPH & GPFS.

Thanks for the pointer Daniel!
According to stat -f [1], it appears to be the same (ext2/ext3) as on the host.
I started a discussion with our container people about whether I could
detect/differentiate that somehow.
Until then, now that I'm aware, I can just let automation clean up
ownership after those actions.
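
As a rough sketch of what such automation could look like (this is not
libvirt's code): the helpers below classify the hex magic that GNU
`stat -f -c %t` prints and re-apply ownership when the filesystem looks
local. The magic values are a partial list as I understand them from
linux/magic.h and libvirt's sources, so treat them as assumptions to
verify. The function names and the image path in the usage comment are
hypothetical; libvirt-qemu:kvm is the Ubuntu ownership mentioned earlier
in this thread.

```shell
#!/bin/sh
# Sketch only. Partial, assumed list of "shared filesystem" magics
# (hex, as printed by `stat -f -c %t`, without the 0x prefix):
#   6969 NFS, 1161970 GFS2, 7461636f OCFS2, 517b SMB,
#   ff534d42 CIFS, c36400 CEPH, 47504653 GPFS

# Return 0 if the given hex magic is in the assumed shared-FS list.
is_shared_fs_magic() {
    # Normalise to lowercase so FF534D42 and ff534d42 compare equal.
    magic=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case "$magic" in
        6969|1161970|7461636f|517b|ff534d42|c36400|47504653)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Re-apply ownership to an image if its filesystem does not look shared.
# Intended to run on the node that now hosts the guest, after the
# migration source has finished its cleanup.
#   $1 = image path, $2 = owner:group
fixup_after_migration() {
    image=$1
    owner=$2
    fsmagic=$(stat -f -c %t -- "$image") || return 2
    if is_shared_fs_magic "$fsmagic"; then
        # A recognised shared FS: nothing to do in this sketch.
        return 0
    fi
    chown -- "$owner" "$image"
}

# Hypothetical usage:
#   fixup_after_migration /var/lib/libvirt/images/guest.qcow2 libvirt-qemu:kvm
```

In a setup like mine both containers should report ef53 (ext2/ext3), so
the check says "local" on both sides and the chown is what actually
repairs the ownership.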

[1]: http://paste.ubuntu.com/p/jffSrtKg7t/

>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd



