[libvirt-users] Sometimes libvirt fails to update domain block file after blockcommit.

Matthew Schumacher matt.s at aptalaska.com
Wed Apr 29 17:49:03 UTC 2015


Posted to https://bugzilla.redhat.com/show_bug.cgi?id=1217185

I just stumbled on another bug while snapshotting. I think it's related
to 1210903 and 1197592, since it looks like some sort of race condition:
whether it happens depends on what logging is in place, and it doesn't
happen every time.

Here are the details:

I wrote this test script to snapshot and commit over and over:

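#!/bin/sh

# Snapshot the domain and commit the snapshot back, over and over.
while true; do

  echo "Starting snapshot test $(date)"
  # Create an external, disk-only snapshot.
  virsh snapshot-create-as test 20150429 20150429-backup --disk-only \
    --atomic
  virsh domblklist test
  # Commit the active layer back into its backing file and pivot to it.
  virsh blockcommit test vda --active --pivot --verbose
  virsh snapshot-delete test 20150429 --metadata
  virsh domblklist test
  # Remove the snapshot file, which should no longer be in use.
  rm /glustervol1/vm/test/test.20150429
  echo "Ending snapshot test $(date)"
  echo
  echo

  sleep 2

done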

If I run libvirtd in the foreground with debug set to 1, I can't get it
to fail; it does what it's supposed to do, snapshotting and committing
over and over.

If I run libvirtd in the foreground with debug set to 3, I always
eventually get this:

Starting snapshot test Wed Apr 29 09:34:34 AKDT 2015
Domain snapshot 20150429 created
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

Block Commit: [100 %]
Successfully pivoted
Domain snapshot 20150429 deleted

Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

Ending snapshot test Wed Apr 29 09:34:35 AKDT 2015


Starting snapshot test Wed Apr 29 09:34:37 AKDT 2015
error: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name

Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

error: internal error: qemu block name '/glustervol1/vm/test/test.qcow2' doesn't match expected '/glustervol1/vm/test/test.20150429'

error: Domain snapshot not found: no domain snapshot with matching name '20150429'

Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

rm: can't remove '/glustervol1/vm/test/test.20150429': No such file or directory
Ending snapshot test Wed Apr 29 09:34:37 AKDT 2015

At this point libvirt is confused about which file backs the disk: the
first run did pivot after blockcommit, but libvirt didn't update the
domain's block file to match.  From the logs:

2015-04-29 17:33:41.052+0000: 25192: warning : qemuDomainObjTaint:1972 : Domain id=2 name='test' uuid=4b9cc25b-68d1-4ce8-8a65-2a378e255e36 is tainted: high-privileges
2015-04-29 17:34:37.322+0000: 25191: error : virDomainSnapshotAlignDisks:609 : unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
2015-04-29 17:34:37.352+0000: 25194: error : qemuMonitorJSONDiskNameLookup:3977 : internal error: unable to find backing name for device drive-virtio-disk0
2015-04-29 17:34:37.354+0000: 25194: error : qemuMonitorJSONDiskNameLookupOne:3914 : internal error: qemu block name '/glustervol1/vm/test/test.qcow2' doesn't match expected '/glustervol1/vm/test/test.20150429'


So libvirt insists that the block file is:

root@wasvirt2:/glustervol1/vm/waspbx# virsh domblklist test
Target     Source
------------------------------------------------
vda        /glustervol1/vm/test/test.20150429
hdc        /dev/sr0

But qemu doesn't have that file open; it is using test.qcow2:

root@wasvirt2:/glustervol1/vm/waspbx# lsof | grep test
25300   /usr/bin/qemu-system-x86_64     /var/log/libvirt/qemu/test.log
25300   /usr/bin/qemu-system-x86_64     /var/log/libvirt/qemu/test.log
25300   /usr/bin/qemu-system-x86_64     /glustervol1/vm/test/test.qcow2
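The comparison above between what libvirt reports and what qemu has open
can be done mechanically. This is only a rough sketch and not part of
the original report: domblk_source and lsof_has_path are hypothetical
helper names, and both assume the column layouts shown in the output
above and paths without spaces.

```shell
#!/bin/sh

# Hypothetical helper: print the Source column for the given Target,
# reading `virsh domblklist <domain>` output on stdin.
# Assumes one "target source" row per disk, as in the output above.
domblk_source() {
  awk -v target="$1" '$1 == target { print $2 }'
}

# Hypothetical helper: succeed if the given path is the last field of
# any line of lsof output read on stdin, fail otherwise.
lsof_has_path() {
  awk -v path="$1" '$NF == path { found = 1 } END { exit !found }'
}

# Possible usage (assumes the domain "test" and disk "vda" from above):
#   src=$(virsh domblklist test | domblk_source vda)
#   lsof | lsof_has_path "$src" || echo "qemu does not have $src open"
```

Run against the state shown above, such a check would report that
/glustervol1/vm/test/test.20150429 is not open.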

The only way to straighten this out is to destroy and start the domain.
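
For reference, that destroy-and-start cycle could be wrapped as below.
This is a minimal sketch: restart_domain is a hypothetical wrapper name,
and the domain name "test" is taken from the script above. Note that
virsh destroy powers the guest off immediately rather than shutting it
down cleanly.

```shell
#!/bin/sh

# Hypothetical wrapper: force the domain off, then start it again.
# `virsh destroy` is an immediate power-off, not a guest shutdown;
# the start is attempted only if the destroy succeeded.
restart_domain() {
  virsh destroy "$1" && virsh start "$1"
}

# Possible usage: restart_domain test
```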



