[libvirt] [PATCH 5/5] qemu_hotplug: Fix a rare race condition when detaching a device twice

Michal Privoznik mprivozn at redhat.com
Wed Mar 13 10:35:50 UTC 2019


On 3/12/19 7:57 PM, Peter Krempa wrote:
> On Tue, Mar 12, 2019 at 16:41:15 +0100, Peter Krempa wrote:
>> On Tue, Mar 12, 2019 at 16:13:20 +0100, Michal Privoznik wrote:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1623389
>>>
>>> If a device is detached twice from the same domain the following
>>> race condition may happen:
>>>
>>> 1) The first DetachDevice() call will issue "device_del" on qemu
>>> monitor, but since the DEVICE_DELETED event did not arrive in
>>> time, the API ends claiming "Device detach request sent
>>> successfully".
>>>
>>> 2) The second DetachDevice() therefore still find the device in
>>> the domain and thus proceeds to detaching it again. It calls
>>> EnterMonitor() and qemuMonitorSend() trying to issue "device_del"
>>> command again. This gets both domain lock and monitor lock
>>> released.
>>>
>>> 3) At this point, qemu sends us the DEVICE_DELETED event which is
>>> going to be handled by the event loop which ends up calling
>>> qemuDomainSignalDeviceRemoval() to determine who is going to
>>> remove the device from domain definition. Whether it is the
>>> caller that marked the device for removal or whether it is going
>>> to be the event processing thread.
>>>
>>> 4) Because the device was marked for removal,
>>> qemuDomainSignalDeviceRemoval() returns true, which means the
>>> event is to be processed by the thread that has marked the device
>>> for removal (and is currently still trying to issue "device_del"
>>> command)
>>>
>>> 5) The thread finally issues the "device_del" command, which
>>> fails (obviously) and therefore it calls
>>> qemuDomainResetDeviceRemoval() to reset the device marking and
>>> quits immediately after, NOT removing any device from the domain
>>> definition.
>>>
>>> At this point, the device is still present in the domain
>>> definition but doesn't exist in qemu anymore. Worse, there is no
>>> way to remove it from the domain definition.
>>>
>>> Solution is to note down that we've seen the event and if the
>>> second "device_del" fails, not take it as a failure but carry on
>>> with the usual execution.
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>>> ---
>>>   src/qemu/qemu_domain.h  |  1 +
>>>   src/qemu/qemu_hotplug.c | 83 +++++++++++++++++++++++++++++------------
>>>   2 files changed, 60 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>>> index 9f468e5661..fb361515ba 100644
>>> --- a/src/qemu/qemu_domain.h
>>> +++ b/src/qemu/qemu_domain.h
>>> @@ -218,6 +218,7 @@ typedef qemuDomainUnpluggingDevice *qemuDomainUnpluggingDevicePtr;
>>>   struct _qemuDomainUnpluggingDevice {
>>>       const char *alias;
>>>       qemuDomainUnpluggingDeviceStatus status;
>>> +    bool eventSeen; /* True if DEVICE_DELETED event arrived. */
>>>   };
>>>   
>>>   
>>> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
>>> index 574477e916..93c0e14adf 100644
>>> --- a/src/qemu/qemu_hotplug.c
>>> +++ b/src/qemu/qemu_hotplug.c
>>> @@ -70,22 +70,47 @@ unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5;
>>>   /**
>>>    * qemuDomainDeleteDevice:
>>>    * @mon: qemu monitor
>>> + * @vm: domain object
>>>    * @alias: device to remove
>>>    *
>>>    * A simple wrapper around qemuMonitorDelDevice().
>>> - * @mon must be locked upon entry.
>>> + * @mon must be locked upon entry, @vm shan't.
>>
>> Please use either the more contemporary short form or the long form.
>>
>>>    *
>>>    * Returns: 0 on success,
>>>    *         -1 otherwise.
>>>    */
>>>   static inline int
>>>   qemuDomainDeleteDevice(qemuMonitorPtr mon,
>>> +                       virDomainObjPtr vm,
>>>                          const char *alias)
>>>   {
>>> -    if (qemuMonitorDelDevice(mon, alias) < 0)
>>> -        return -1;
>>> +    qemuDomainObjPrivatePtr priv;
>>> +    int ret = 0;
>>>   
>>> -    return 0;
>>> +    if (qemuMonitorDelDevice(mon, alias) < 0) {
>>> +        if (vm) {
>>> +            /* It is safe to lock and unlock both @mon and @vm
>>> +             * here because:
>>> +             * a) qemuDomainObjEnterMonitor() ensures @mon is
>>> +             * ref()'d
>>> +             * b) The API that is calling us ensures that @vm is
>>> +             * ref()'d
>>
>> Don't break the two lines.
>>
>>> +             */
>>> +            virObjectUnlock(mon);
>>> +            virObjectLock(vm);
>>> +            priv = vm->privateData;
>>> +            if (priv->unplug.eventSeen)
>>
>> You can't do this without also clearing priv->unplug.alias. If the event
>> was not seen at this point you need to make sure that it will
>> unconditionally be handled via the separate event handler thread prior
>> to giving up the lock on VM. By not clearing it, you still have another
>> chance at the event handler thinking the unplug will be handled here
>> without actually doing so.
>>
>>> +                virResetLastError();
>>
>> Additionally it would be better if qemuMonitorJSONDelDevice actually
>> returns the error string or object without reporting it so that we can
>> decide here not to report it at all rather than resetting it.
> 
> After some more thinking I think that:
> 
> 1) The new helper should encapsulate also the call to enter the monitor
>      - that way you can avoid the super-hacky way that re-locks the
>        monitor.
>      - you can then intergrate setting of the alias into private data
>      - resetting of it after exit of the monitor
>      - checking that the event was seen
>    This should work nicely as AFAIK all code paths using 'device_del'
>    should only ever call this one API and do everything else.
> 
> 2) Refactor the monitor APIs for device_del so that they return the
>     error message reported by the monitor as success and return the copy
>     of the error message. This will probably also require a different
>     monitor error checking function
> 
>     That way you can get the error message and report it if you decide
>     so.

Alternatively, I can introduce an argument to qemuMonitorDelDevice() 
which would supress error reporting for this specific case (we still 
want to report errors from QEMU_CHECK_MONITOR(), 
qemuMonitorJSONMakeCommand(), ...), return say -2 so that my code knows 
device_del failed because the device is already missing.

> 
> 
> If any code path requires more than one monitor call, the 1) will also
> ensure that the alias which we are waiting for is set only during
> controlled amount of time and thus we are sure that the event thread
> handles any DEVICE_DELETED events otherwise.
> 

I started writing it this way. But problem is the
qemuDomainDetachExtensionDevice() which is called in some rollback
codes. For instance in qemuDomainAttachDiskGeneric:

     qemuDomainObjEnterMonitor(driver, vm);

     if (qemuHotplugDiskSourceAttach(priv->mon, diskdata) < 0)
         goto exit_monitor;

     if (qemuDomainAttachExtensionDevice(priv->mon, &disk->info) < 0)
         goto exit_monitor;

     if (qemuMonitorAddDevice(priv->mon, devstr) < 0) {
         ignore_value(qemuDomainDetachExtensionDevice(priv->mon, 
&disk->info));
         goto exit_monitor;
     }

     if (qemuDomainObjExitMonitor(driver, vm) < 0) {

But I guess I can do what I'm doing now - if some argument is NULL then
I can assume caller has done EnterMonitor() and not call it again.

Michal




More information about the libvir-list mailing list