[libvirt] [PATCH v2 2/2] qemu: avoid rare race when undefining domain

Martin Kletzander mkletzan at redhat.com
Mon Nov 3 09:24:52 UTC 2014


On Mon, Nov 03, 2014 at 09:54:09AM +0100, Michal Privoznik wrote:
>On 31.10.2014 15:47, Martin Kletzander wrote:
>> When one domain is being undefined and at the same time started, for
>> example, there is a possibility of a rare problem occuring.
>>
>>   - Thread 1 does virDomainUndefine(), has the lock, checks that the
>>     domain is active and because it's not, calls
>>     virDomainObjListRemove().
>>
>>   - Thread 2 does virDomainCreate() and tries to lock the domain.
>>
>>   - Thread 1 needs to lock domain list in order to remove the domain from
>>     it, but must unlock domain first (proper order is to lock domain list
>>     first and the domain itself second).
>>
>>   - Thread 2 grabs the lock, starts the domain and releases the lock.
>>
>>   - Thread 1 grabs the lock and removes the domain from list.
>>
>> With this patch:
>>
>>   - qemuDomainRemoveInactive() creates a QEMU_JOB_MODIFY if that's
>>     possible, but since it must remove the domain from list either way,
>>     it continues even when starting the job failed.
>>
>>   - Because virDomainObjListRemove() removes a reference to the
>>     virDomainObj (and it must be kept that way for other drivers), we
>>     hack around that by acquiring one more reference that's kept until
>>     the job is ended.
>>
>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1150505
>>
>> Signed-off-by: Martin Kletzander <mkletzan at redhat.com>
>> ---
>>   src/qemu/qemu_domain.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index 76fccce..98c4763 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -2392,9 +2392,13 @@ void
>>   qemuDomainRemoveInactive(virQEMUDriverPtr driver,
>>                            virDomainObjPtr vm)
>>   {
>> +    bool haveJob = true;
>>       char *snapDir;
>>       virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
>>
>> +    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
>> +        haveJob = false;
>> +
>>       /* Remove any snapshot metadata prior to removing the domain */
>>       if (qemuDomainSnapshotDiscardAllMetadata(driver, vm) < 0) {
>>           VIR_WARN("unable to remove all snapshots for domain %s",
>> @@ -2409,8 +2413,13 @@ qemuDomainRemoveInactive(virQEMUDriverPtr driver,
>>               VIR_WARN("unable to remove snapshot directory %s", snapDir);
>>           VIR_FREE(snapDir);
>>       }
>> +    virObjectRef(vm);
>>       virDomainObjListRemove(driver->domains, vm);
>>       virObjectUnref(cfg);
>> +
>> +    if (haveJob)
>> +        ignore_value(qemuDomainObjEndJob(driver, vm));
>> +    virObjectUnref(vm);
>>   }
>>
>>   void
>>
>
>I don't think there's a reason to do the Ref() and Unref(). Job control
>functions do that already. ACK with that removed.
>

Oh, they do :)  I removed those two lines and the part of commit
message as well.  And I'll push it in a while.

Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20141103/9374a256/attachment-0001.sig>


More information about the libvir-list mailing list