[libvirt] [frankschreuder at outlook.com: Fwd: Reproducible live disk detach bug]

Daniel P. Berrange berrange at redhat.com
Mon Aug 10 12:36:32 UTC 2015


On Fri, Aug 07, 2015 at 10:18:59AM +0200, Frank Schreuder wrote:
> Hi Laine,
> 
> I'm sorry for starting 3 threads with the same topic, seems like my mail was
> heavily delayed yesterday.
> 
> [ removing personal mail ]
> 
> Op 8/7/2015 om 12:24 AM schreef Laine Stump:
> >>>Subject: Reproducible live disk detach bug
> >>>
> >>>Hey guys,
> >>>
> >>>I am currently running into a reproducible libvirt bug when I detach a
> >>>disk from a running VM using libvirt 1.2.18 and qemu 2.3.0.
> >>>
> >>>I can attach a disk to a running VM without any problems.
> >>>During the attach operation I don't create a disk alias name, which
> >>>results in libvirt generating an unique alias.
> >>>Using virsh dumpxml I get the following output(snippet):
> >>>
> >>><disk type='file' device='disk'>
> >>>       <driver name='qemu' type='raw' cache='none'/>
> >>>       <source file='/var/local/mnt/test-storage/test-storage.raw'/>
> >>>       <backingStore/>
> >>>       <target dev='vde' bus='virtio'/>
> >>>       <iotune>
> >>>         <read_bytes_sec>94371840</read_bytes_sec>
> >>>         <write_bytes_sec>31457280</write_bytes_sec>
> >>>         <read_iops_sec>1000</read_iops_sec>
> >>>         <write_iops_sec>200</write_iops_sec>
> >>>       </iotune>
> >>>       <alias name='virtio-disk4'/>
> >>>       <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
> >>></disk>
> >>>
> >>>As you can see, libvirt generates 'virtio-disk4' as an alias. But as
> >>>soon as I try to detach the disk from the running VM, I will receive the
> >>>following two errors in my libvirt log:
> >>>
> >>>2015-08-06 10:48:59.943+0000: 11657: warning :
> >>>qemuMonitorJSONHandleDeviceDeleted:939 : missing device in device
> >>>deleted event
> >>>2015-08-06 10:48:59.945+0000: 11662: error :
> >>>qemuMonitorTextDriveDel:2594 : operation failed: deleting
> >>>drive-virtio-disk4 drive failed: 2015-08-06T10:48:59.945058Z Device
> >>>'drive-virtio-disk4' not found
> >>>
> >>>Apparently libvirt tries to detach the disk with a drive- prefix, which
> >>>doesn't exist?
> >I can't tell you right off why the failure message is there, but the
> >alias with the added "drive-" is easily explained.
> >
> >Each <disk> element in the libvirt config corresponds to two entities in
> >the qemu virtual machine - a "device" component (what is seen by the
> >guest) and a "drive" component (what resources are used on the host to
> >back up that device). Each of these corresponds to a separate qemu
> >commandline argument (named "-device" and "-drive"), and each of those
> >must have a different, unique "id" option. For example, this is what is
> >added to my qemu commandline when I add a virtio disk named
> >"rawhide.qcow2" to my virtual machine:
> >
> >-drive
> >file=/var/lib/libvirt/images/rawhide.qcow2,if=none,id=drive-virtio-disk0,format=qcow2
> >-device
> >virtio-blk-pci,scsi=off,bus=pci.2,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> >
> >
> >If you look at the status XML of the domain while it is running, you'll
> >see that the <alias> for the disk element is "virtio-disk0", and that's
> >what is given to the -device half of the disk as its qemu id. The -drive
> >half uses "drive-$alias" as its qemu id though (you'll notice that the
> >-device half references that id with its "drive" option - that's how the
> >two halves are tied together by qemu).
> >
> >So, there is nothing unusual about qemu wanting to delete something
> >called drive-virtio-disk4 when you have a <disk> with
> >alias='virtio-disk4'. It obviously isn't right that an error message is
> >being logged. Something sounds familiar about the "missing device in
> >device deleted event" message, but I can't recall details. Does anyone
> >else have a better memory? Frank - would it be possible for you to try
> >this same sequence of events with the same libvirt and an older qemu
> >(and/or vice-versa)?
> >
> 
> I have some more interesting debug information.
> 
> I'm -not- able to reproduce it on libvirt 1.2.11 - qemu 2.3.0 and libvirt
> 1.2.11 - qemu 1.7.1 using a FreeBSD VM.
> I'm able to reproduce the errors with libvirt 1.2.11 - qemu 2.3.0 and
> libvirt 1.2.18 - qemu 2.3.0 using a Debian VM.
> Seems like the error triggers depending on the guest OS?
> 
> During hot-unplugging of network interfaces libvirt also triggers the same
> error on a Debian VM:
> 
> 2015-08-06 10:48:59.945+0000: 11662: error : qemuMonitorJSONHandleDeviceDeleted:939 : missing device in device deleted event
> 
> 
> I gathered some logging information with info level logging enabled while
> detaching a disk:
> http://pastebin.com/raw.php?i=vVLuavG2

The interesting this is this line:

2015-08-06 21:14:56.847+0000: 8037: info : qemuMonitorIOProcess:452 : QEMU_MONITOR_IO_PROCESS: mon=0x7f9477cc0fc0 buf={"return": "2015-08-06T21:14:56.847359Z Device 'drive-virtio-disk4' not found\r\n", "id": "libvirt-12"}

Notice how the data read from the monitor command has a timestamp
prefixed on it. This is very bad. We absolutely do not expect to
see a timestamp on HMP monitor replies, so whatever is doing that
in QEMU needs fixing.

AFAIK, QEMU is only supposed to add timestamps to stuff it prints
on stderr, when -msg timestamp=on. It should never timestamp HMP

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list