[libvirt] Checking the expected behavior on dynamic image file ownership when live migrating

Christian Ehrhardt christian.ehrhardt at canonical.com
Mon Jul 1 18:20:44 UTC 2019


On Mon, Jul 1, 2019 at 7:56 PM Daniel P. Berrangé <berrange at redhat.com> wrote:
>
> On Mon, Jul 01, 2019 at 05:26:45PM +0200, Christian Ehrhardt wrote:
> > Hi,
> > today I was debugging an issue that I found with qemu 4.0 and ended up
> > puzzled about file ownership. I'm almost EOD now, but wanted to reach
> > out here for a sanity check before I debug further. The case
> > summarized is this:
> >
> > 1. start guest
> >   1.1 image files are changed to libvirt-qemu:kvm (which matches
> > Ubuntu's user/group config)
> > 2. live migrate the guest to a different node
> >   2.1 image files go back to root:root which they initially had (ok)
> > 3. migrate guest back to the original node
> >   x. image files stay root:root and are not changed back to libvirt-qemu:kvm
> >
> > That is odd/unexpected, but it seems the same applies to older
> > versions and there it was never a problem so far.
>
> This sounds odd - I see no reason why the first migration should
> behave differently from the second migration, as libvirt makes
> no distinction & has no knowledge of previous migrations.

I might have been unclear, let me clarify. There are two kinds of
migration involved:
1. migration without storage copy (this is what changes the ownership)
   lxc exec testkvm-eoan-from -- virsh migrate --live --unsafe \
       kvmguest-eoan-normal qemu+ssh://10.222.144.19/system
   (and the same type of migration backwards)
2. migration away from here with --copy-storage-*, which fails due to
   the file ownership
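
A minimal Python sketch of how the ownership could be checked on each node
after every step. The helper name owner_of is made up for illustration, and
the image path would have to be filled in for the actual setup:

```python
import grp
import os
import pwd


def owner_of(path):
    """Return 'user:group' for path, falling back to numeric ids."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # uid has no passwd entry (can happen inside containers)
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return f"{user}:{group}"
```

Calling owner_of() on the disk image before the start, after the start and
after each migration would show where libvirt-qemu:kvm flips back to
root:root.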

> Could the two hosts be configured differently in some way. For
> example is the disk image on ext4 on one host, but nfs on the
> other host ? With shared filesystems, we'd generally expect
> the disk to be exposed on a shared filesystem on both hosts.

For the first migration the two hosts are in fact two LXD containers
sharing the same filesystem.
For the second, --copy-storage-* migration they are still two
containers, but without a shared FS; that one eventually fails due to
the bad file ownership.

But since migration #1 really does use shared storage, that might still
be the reason; maybe the ordering is important here. The sequence could
be something like:

1. source: migrates off guest
2. target: receives guest
3. target: guest is complete and changes file ownership to
libvirt-qemu:kvm (correct)
4. source: shuts down the stub that is left after migration and
changes file ownership to root:root

Since both sides really use the same storage (not just the same FCP or
SCSI device, but the same FS), that could be why the ownership is reset
after a migration cycle.
Thanks for the hint, that was worth a check with some slightly altered
setups ...
... and confirmed: if I use copied images that are not on a shared FS,
the ownership is handled correctly.
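
The suspected ordering can be replayed with a toy model. This is only a
sketch of the reasoning above, not of libvirt's code: the function name and
the dict-based "image" are invented for illustration, and the only thing
modelled is whether source and target see the same file.

```python
def run_migration(shared_fs):
    """Replay steps 3 and 4 and return the ownership the target ends up with."""
    if shared_fs:
        # Same FS: both hosts operate on one and the same file.
        image = {"owner": "root:root"}
        source_view = target_view = image
    else:
        # Separate storage: each host has its own copy of the image.
        source_view = {"owner": "libvirt-qemu:kvm"}
        target_view = {"owner": "root:root"}
    # 3. target: guest is complete, ownership is changed for qemu
    target_view["owner"] = "libvirt-qemu:kvm"
    # 4. source: stub is shut down, original ownership is restored
    source_view["owner"] = "root:root"
    return target_view["owner"]


print(run_migration(shared_fs=True))   # root:root
print(run_migration(shared_fs=False))  # libvirt-qemu:kvm
```

With a shared file the source's late restore overwrites what the target set
just before, which matches what I see after one migration cycle.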

So my way of sharing the FS might be odd. libvirt does not detect it as
shared, and because of that the ordering above triggers the issue: the
ownership is set to root:root by the migration source after the
migration is complete.
Quoting the virsh man page, my setup is similar to "disk images are
stored on coherent clustered filesystem, such as GFS2 or GPFS", but my
libvirt doesn't know that and therefore changes the ownership.
Maybe it would have the same bug on GFS2/GPFS?
I don't have such a setup at hand; how is the issue avoided there?
Is there a way to make libvirt realize that it really is on "the same"
FS to avoid these operations?
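
To illustrate why a check based on the filesystem type would miss my setup,
here is a rough sketch that looks up the type of the mount holding a path in
/proc/mounts-style text. The list of "shared" types and both function names
are assumptions made for this example; I have not checked what libvirt
actually tests for.

```python
# Assumption: illustrative list only, not taken from libvirt.
SHARED_FS_TYPES = {"nfs", "nfs4", "gfs2", "gpfs", "ocfs2", "cifs", "ceph"}


def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point containing path."""
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point win
        if (path + "/").startswith(prefix) and len(mount_point) >= best_len:
            best_len, best_type = len(mount_point), fs_type
    return best_type


def looks_shared(path, mounts_text):
    """True if the path sits on a type from SHARED_FS_TYPES."""
    return fs_type_for(path, mounts_text) in SHARED_FS_TYPES
```

A local filesystem that is passed into two containers would still report its
local type in both of them, so a check like this says "not shared" on either
side even though both operate on the same files.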


> We do take special action if we see an NFS filesystem, as we
> cannot reset file ownership on the migration source, as that
> would impact the migration dest too.
>
> > In my case I stumbled over it because the newer qemu --copy-storage-*
> > options now want to re-open the file at some point and that fails
> > with:
> > error: internal error: unable to execute QEMU command 'drive-mirror':
> > Could not reopen file: Permission denied
> >
> > If I e.g. shut down and start the guest on the source node (which
> > fixes up the ownership at runtime) then migration with copy-storage
> > works fine again.
> >
> > I'm mostly looking for hints if this is a known issue or being worked
> > on to not waste time tomorrow.
> > So hints are welcome.
> >
> > --
> > Christian Ehrhardt
> > Software Engineer, Ubuntu Server
> > Canonical Ltd
> >
> > --
> > libvir-list mailing list
> > libvir-list at redhat.com
> > https://www.redhat.com/mailman/listinfo/libvir-list
>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd



