[libvirt] [PATCH 0/2] qemu: Rework CD/DVD changing

Li Wei lw at cn.fujitsu.com
Mon May 20 00:50:28 UTC 2013


On 05/19/2013 05:51 PM, Michal Privoznik wrote:
> On 18.05.2013 05:31, Li Wei wrote:
>> On 05/17, Michal Privoznik wrote:
>>> On 17.05.2013 11:37, Li Wei wrote:
>>>> On 05/17/2013 04:52 PM, Michal Privoznik wrote:
>>>>> On 17.05.2013 10:16, Li Wei wrote:
>>>>>> On 05/17/2013 04:07 PM, Michal Privoznik wrote:
>>>>>>> On 17.05.2013 07:47, Li Wei wrote:
>>>>>>>> Hi Michal, Daniel
>>>>>>>>
>>>>>>>> It seems there is something wrong with the 2/2 part of this patchset.
>>>>>>>> When I do an "change-media" command in virsh, it doesn't do any better
>>>>>>>> than before, but even worse(I must to wait 5 secs to see the error).
>>>>>>>>
>>>>>>>> I'm not family with libvirt, just add some log things in the qemu_hotplug.c
>>>>>>>> and found the tray_status never change to open, but with michal's original
>>>>>>>> patch (which do active poll on tray_status), I can do "change-media" successfully
>>>>>>>> every time.
>>>>>>>
>>>>>>> Yes & no. Thing is - originally we ignored tray status in the guest. So
>>>>>>> in case guest has locked the tray and thus ignored our eject request, we
>>>>>>> have changed the media anyway, resulting in possible I/O errors within
>>>>>>> guest.
>>>>>>>
>>>>>>> With my patch, we send eject request to the qemu, and actively poll for
>>>>>>> the tray to open. We check for up to 10 times every 200ms. So guest has
>>>>>>> 2 seconds to open the tray. If the guest doesn't eject media and open
>>>>>>> the tray we don't proceed with forcibly changing the media.
>>>>>>>
>>>>>>> These patches are actually a bug fix for
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=892289
>>>>>>
>>>>>> Hmm... I think you misunderstood my meaning(sorry for my pool english),
>>>>>> simply to say, I can get expected behavior by apply patch V1 but not V2.
>>>>>>
>>>>>> In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just
>>>>>> wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but
>>>>>> this never happened.
>>>>>>
>>>>>> Thanks,
>>>>>> Wei
>>>>>
>>>>> Aaah, now I understand. So you are saying that the event is not being
>>>>> delivered. What's the qemu version?
>>>>
>>>>
>>>> I have the following version:
>>>>
>>>> virsh # version
>>>> Compiled against library: libvirt 1.0.5
>>>> Using library: libvirt 1.0.5
>>>> Using API: QEMU 1.0.5
>>>> Running hypervisor: QEMU 1.4.1
>>>>
>>>>>
>>>>> If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them?
>>>>> http://wiki.libvirt.org/page/DebugLogs
>>>>>
>>>>> Or maybe the timeout is just short for your guest to eject the tray. So
>>>>> after a while, does 'virsh change-media' succeed?
>>>>
>>>> Yes, I can see the DEVICE_TRAY_MOVED in the log file, it goes after
>>>> the debug message "qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0".
>>>> does that mean the timeout is too short? 
>>>>
>>>> If the guest needs too much time to eject, I'm wondering why when I use
>>>> the v1 patch, I can get succeed in about 1 to 2 seconds. I doubt if there
>>>> are other problems cause tray_status not updated in time?
>>>>
>>>> I appended the log file for your reference, if you need any other information,
>>>> please tell me.
>>>>
>>>> Thanks,
>>>> Wei
>>>
>>> I am pasting some interesting knobs from the log:
>>>
>>> 2013-05-17 09:04:46.234+0000: 27004: debug : qemuMonitorSend:892 : QEMU_MONITOR_SEND_MSG: mon=0x7f7ac4000bd0 msg={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"}
>>>  fd=-1
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOWrite:458 : QEMU_MONITOR_IO_WRITE: mon=0x7f7ac4000bd0 buf={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"}
>>>  len=92 ret=92 errno=2
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"id": "libvirt-7", "error": {"class": "GenericError", "desc": "Device 'drive-ide0-1-0' is locked"}}
>>>  len=102
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:46.235+0000: 27004: error : qemuMonitorJSONCheckError:352 : internal error unable to execute QEMU command 'eject': Device 'drive-ide0-1-0' is locked
>>> 2013-05-17 09:04:46.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 9
>>> 2013-05-17 09:04:46.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 8
>>> 2013-05-17 09:04:47.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 7
>>> 2013-05-17 09:04:47.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 6
>>> 2013-05-17 09:04:48.186+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:48.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 5
>>> 2013-05-17 09:04:48.735+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 4
>>> 2013-05-17 09:04:49.235+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 3
>>> 2013-05-17 09:04:49.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 2
>>> 2013-05-17 09:04:50.236+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 1
>>> 2013-05-17 09:04:50.736+0000: 27004: debug : qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. Retries left 0
>>> 2013-05-17 09:04:51.236+0000: 27004: error : qemuDomainChangeEjectableMedia:117 : operation failed: Unable to eject media before changing it
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virObjectUnref:258 : OBJECT_UNREF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockDiskDetach:282 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 disk=0x7f7ac4003690
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockManagerNew:128 : plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 withResources=0
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerPluginGetDriver:295 : plugin=0x7f7acc007c00
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerNew:319 : driver=0x7f7ae53dc5e0 type=0 nparams=5 params=0x7f7ad7d25830 flags=0
>>> 2013-05-17 09:04:51.236+0000: 27001: debug : qemuMonitorIOProcess:350 : QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"timestamp": {"seconds": 1368781488, "microseconds": 186399}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "drive-ide0-1-0", "tray-open": true}}
>>>  len=151
>>>
>>>
>>> What we can see here is:
>>> 1) libvirt asks qemu to eject
>>> 2) qemu responds with "Device is locked" - which is correct as it emulates guest request
>>> 3) However, after a while, the guest decides it's now safe to eject the media (e.g. all reads have finished, and the device has been unmounted) which is done
>>> 4) [not in the pasted log] After a while, you've decided to try again and this time it succeeded, as the tray is already opened.
>>>
>>> What we may do here is lifting the number of retries up. You can try yourself:
>>
>> I have indeed increased retries to 20 and unfortunately that didn't work either,
>> we can see in the log file that the DEVICE_TRAY_MOVED event received immediately
>> after the final retry. After we doubled the retries, that would be more that enough
>> time for the guest to eject the tray. So, maybe there is something wrong otherwhere.
>>
>> Thanks,
>> Wei
> 
> I think I know where the bug is. When waiting on tray to open, we don't unlock the domain. So later, when an event arrives, the first thing that the event handler does is locking the domain. Hence, it seems like the event is delivered right after we've failed waiting. That's just because after we've finished waiting, we've unlocked the domain. And I think same applies for the domain monitor. So can you please try this out:

This patch works fine, thanks :)

Tested-by: Li Wei <lw at cn.fujitsu.com>

Thanks,
Wei

> 
> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
> index d037c9d..767d5a1 100644
> --- a/src/qemu/qemu_hotplug.c
> +++ b/src/qemu/qemu_hotplug.c
> @@ -95,14 +95,20 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver,
>      qemuDomainObjEnterMonitor(driver, vm);
>      ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force);
>  
> +    virObjectRef(vm);
>      /* we don't want to report errors from media tray_open polling */
>      while (retries--) {
>          if (origdisk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN)
>              break;
>  
> +        qemuDomainObjExitMonitor(driver, vm);
> +        virObjectUnlock(vm);
>          VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", retries);
>          usleep(500 * 1000); /* sleep 500ms */
> +        virObjectLock(vm);
> +        qemuDomainObjEnterMonitor(driver, vm);
>      }
> +    virObjectUnref(vm);
>  
>      if (disk->src) {
>          /* deliberately don't depend on 'ret' as 'eject' may have failed for the
> 
> 
> Michal
> 




More information about the libvir-list mailing list