[libvirt] [PATCH v2 2/2] qemu: avoid rare race when undefining domain
Martin Kletzander
mkletzan at redhat.com
Mon Nov 3 09:24:52 UTC 2014
On Mon, Nov 03, 2014 at 09:54:09AM +0100, Michal Privoznik wrote:
>On 31.10.2014 15:47, Martin Kletzander wrote:
>> When one domain is being undefined and at the same time started, for
>> example, there is a possibility of a rare problem occuring.
>>
>> - Thread 1 does virDomainUndefine(), has the lock, checks that the
>> domain is active and because it's not, calls
>> virDomainObjListRemove().
>>
>> - Thread 2 does virDomainCreate() and tries to lock the domain.
>>
>> - Thread 1 needs to lock domain list in order to remove the domain from
>> it, but must unlock domain first (proper order is to lock domain list
>> first and the domain itself second).
>>
>> - Thread 2 grabs the lock, starts the domain and releases the lock.
>>
>> - Thread 1 grabs the lock and removes the domain from list.
>>
>> With this patch:
>>
>> - qemuDomainRemoveInactive() creates a QEMU_JOB_MODIFY if that's
>> possible, but since it must remove the domain from list either way,
>> it continues even when starting the job failed.
>>
>> - Because virDomainObjListRemove() removes a reference to the
>> virDomainObj (and it must be kept that way for other drivers), we
>> hack around that by acquiring one more reference that's kept until
>> the job is ended.
>>
>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1150505
>>
>> Signed-off-by: Martin Kletzander <mkletzan at redhat.com>
>> ---
>> src/qemu/qemu_domain.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index 76fccce..98c4763 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -2392,9 +2392,13 @@ void
>> qemuDomainRemoveInactive(virQEMUDriverPtr driver,
>> virDomainObjPtr vm)
>> {
>> + bool haveJob = true;
>> char *snapDir;
>> virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
>>
>> + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
>> + haveJob = false;
>> +
>> /* Remove any snapshot metadata prior to removing the domain */
>> if (qemuDomainSnapshotDiscardAllMetadata(driver, vm) < 0) {
>> VIR_WARN("unable to remove all snapshots for domain %s",
>> @@ -2409,8 +2413,13 @@ qemuDomainRemoveInactive(virQEMUDriverPtr driver,
>> VIR_WARN("unable to remove snapshot directory %s", snapDir);
>> VIR_FREE(snapDir);
>> }
>> + virObjectRef(vm);
>> virDomainObjListRemove(driver->domains, vm);
>> virObjectUnref(cfg);
>> +
>> + if (haveJob)
>> + ignore_value(qemuDomainObjEndJob(driver, vm));
>> + virObjectUnref(vm);
>> }
>>
>> void
>>
>
>I don't think there's a reason to do the Ref() and Unref(). Job control
>functions do that already. ACK with that removed.
>
Oh, they do :) I removed those two lines and the part of commit
message as well. And I'll push it in a while.
Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20141103/9374a256/attachment-0001.sig>
More information about the libvir-list
mailing list