[libvirt] [PATCH 7/9] qemu: Grab modify job for changing domain XML
Michal Privoznik
mprivozn at redhat.com
Tue Jun 4 14:23:09 UTC 2019
On 6/4/19 11:07 AM, Nikolay Shirokovskiy wrote:
>
>
> On 30.05.2019 12:34, Michal Privoznik wrote:
>> Changing domain definition must be guarded with acquiring modify
>> job. The problem is if there is a thread executing say
>> qemuDomainSetMemoryStatsPeriod() which is an API that acquires
>> modify job and then possibly unlock the domain object and locks
>> it again. However, becasue virDomainObjAssignDef() does not take
>> a job (only object lock) it may have changed the domain
>> definition while the other thread unlocked the domain object in
>> order to talk on the monitor. For instance:
>>
>> Thread1:
>> 1) lookup domain object and lock it
>> 2) acquire job
>> 3) get pointers to live and persistent defs
>> 4) unlock the domain object
>> 5) talk to qemu on the monitor
>>
>> Thread2 (Execute qemuDomainDefineXMLFlags):
>> 1) lookup domain object and lock it
>> 2) virDomainObjAssignDef() which overwrites persistent def and
>> frees the old one
>> 3) unlock domain object
>>
>> Thread1:
>> 6) lock the domain object again
>> 7) try to access the persistent def via pointer stored in 3)
>>
>> Now this will crash because the pointer from step 3) points to a
>> memory that was freed.
>>
>> However, if Thread2 waited and acquired modify job (just like
>> Thread1) then these two would serialize and no harm would be
>> done.
>>
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>> src/qemu/qemu_domain.c | 3 ++-
>> src/qemu/qemu_driver.c | 29 +++++++++++++++++++++++++----
>> src/qemu/qemu_migration.c | 7 +++++++
>> 3 files changed, 34 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index c61d39b12b..fff2f1932e 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -8784,7 +8784,8 @@ qemuDomainRemoveInactive(virQEMUDriverPtr driver,
>> return;
>> }
>>
>> - qemuDomainRemoveInactiveCommon(driver, vm);
>> + if (vm->def)
>> + qemuDomainRemoveInactiveCommon(driver, vm);
>>
>> virDomainObjListRemove(driver->domains, vm);
>
> Hmm, virDomainObjListRemoveLocked uses vm->def too.
>
>> }
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index 8bbac339e0..77cb7e4b87 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -1706,9 +1706,16 @@ static virDomainPtr qemuDomainCreateXML(virConnectPtr conn,
>> VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE)))
>> goto cleanup;
>>
>> + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) {
>> + qemuDomainRemoveInactive(driver, vm);
>> + goto cleanup;
>> + }
>> +
>> virDomainObjAssignDef(vm, def, true, NULL);
>> def = NULL;
>>
>> + qemuDomainObjEndJob(driver, vm);
>> +
>
> Looks like we can begin VIR_DOMAIN_JOB_OPERATION_START from the start instead
> of acquiring job twice.
That is an async job. We shouldn't rely on async job when modifying a
domain.
>
>> if (qemuProcessBeginJob(driver, vm, VIR_DOMAIN_JOB_OPERATION_START,
>> flags) < 0) {
>> qemuDomainRemoveInactiveJob(driver, vm);
>> @@ -6976,6 +6983,11 @@ qemuDomainRestoreFlags(virConnectPtr conn,
>> VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE)))
>> goto cleanup;
>>
>> + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) {
>> + qemuDomainRemoveInactive(driver, vm);
>> + goto cleanup;
>> + }
>> +
>> virDomainObjAssignDef(vm, def, true, NULL);
>> def = NULL;
>>
>> @@ -6988,6 +7000,7 @@ qemuDomainRestoreFlags(virConnectPtr conn,
>> priv = vm->privateData;
>> priv->hookRun = true;
>> }> + qemuDomainObjEndJob(driver, vm);
>
> Same here
>
>>
>> if (qemuProcessBeginJob(driver, vm, VIR_DOMAIN_JOB_OPERATION_RESTORE,
>> flags) < 0)
>> @@ -7651,6 +7664,11 @@ qemuDomainDefineXMLFlags(virConnectPtr conn,
>> driver->xmlopt, 0)))
>> goto cleanup;
>>
>> + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) {
>> + qemuDomainRemoveInactive(driver, vm);
>> + goto cleanup;
>> + }
>> +
>> virDomainObjAssignDef(vm, def, false, &oldDef);
>> def = NULL;
>>
>> @@ -7673,7 +7691,7 @@ qemuDomainDefineXMLFlags(virConnectPtr conn,
>> vm->persistent = 0;
>> qemuDomainRemoveInactiveJob(driver, vm);
>> }
>> - goto cleanup;
>> + goto endjob;
>> }
>>
>> event = virDomainEventLifecycleNewFromObj(vm,
>> @@ -7685,6 +7703,9 @@ qemuDomainDefineXMLFlags(virConnectPtr conn,
>> VIR_INFO("Creating domain '%s'", vm->def->name);
>> dom = virGetDomain(conn, vm->def->name, vm->def->uuid, vm->def->id);
>>
>> + endjob:
>> + qemuDomainObjEndJob(driver, vm);
>> +
>> cleanup:
>> virDomainDefFree(oldDef);
>> virDomainDefFree(def);
>
> Additionally we need to replace qemuDomainRemoveInactiveJob with qemuDomainRemoveInactive
> as we have a job already.
I'm planning on wrapping what you suggested in your reply to the cover
letter into a single function. This way, the change won't be that
intrusive and a lot of these changes won't be made.
Michal
More information about the libvir-list
mailing list