[libvirt] [PATCH 7/9] qemu: Grab modify job for changing domain XML
Nikolay Shirokovskiy
nshirokovskiy at virtuozzo.com
Tue Jun 4 14:30:01 UTC 2019
On 04.06.2019 17:23, Michal Privoznik wrote:
> On 6/4/19 11:07 AM, Nikolay Shirokovskiy wrote:
>>
>>
>> On 30.05.2019 12:34, Michal Privoznik wrote:
>>> Changing domain definition must be guarded with acquiring modify
>>> job. The problem is if there is a thread executing say
>>> qemuDomainSetMemoryStatsPeriod() which is an API that acquires
>>> modify job and then possibly unlock the domain object and locks
>>> it again. However, becasue virDomainObjAssignDef() does not take
>>> a job (only object lock) it may have changed the domain
>>> definition while the other thread unlocked the domain object in
>>> order to talk on the monitor. For instance:
>>>
>>> Thread1:
>>> 1) lookup domain object and lock it
>>> 2) acquire job
>>> 3) get pointers to live and persistent defs
>>> 4) unlock the domain object
>>> 5) talk to qemu on the monitor
>>>
>>> Thread2 (Execute qemuDomainDefineXMLFlags):
>>> 1) lookup domain object and lock it
>>> 2) virDomainObjAssignDef() which overwrites persistent def and
>>> frees the old one
>>> 3) unlock domain object
>>>
>>> Thread1:
>>> 6) lock the domain object again
>>> 7) try to access the persistent def via pointer stored in 3)
>>>
>>> Now this will crash because the pointer from step 3) points to a
>>> memory that was freed.
>>>
>>> However, if Thread2 waited and acquired modify job (just like
>>> Thread1) then these two would serialize and no harm would be
>>> done.
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>>> ---
>>> src/qemu/qemu_domain.c | 3 ++-
>>> src/qemu/qemu_driver.c | 29 +++++++++++++++++++++++++----
>>> src/qemu/qemu_migration.c | 7 +++++++
>>> 3 files changed, 34 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>>> index c61d39b12b..fff2f1932e 100644
>>> --- a/src/qemu/qemu_domain.c
>>> +++ b/src/qemu/qemu_domain.c
>>> @@ -8784,7 +8784,8 @@ qemuDomainRemoveInactive(virQEMUDriverPtr driver,
>>> return;
>>> }
>>> - qemuDomainRemoveInactiveCommon(driver, vm);
>>> + if (vm->def)
>>> + qemuDomainRemoveInactiveCommon(driver, vm);
>>> virDomainObjListRemove(driver->domains, vm);
>>
>> Hmm, virDomainObjListRemoveLocked uses vm->def too.
>>
>>> }
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index 8bbac339e0..77cb7e4b87 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -1706,9 +1706,16 @@ static virDomainPtr qemuDomainCreateXML(virConnectPtr conn,
>>> VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE)))
>>> goto cleanup;
>>> + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) {
>>> + qemuDomainRemoveInactive(driver, vm);
>>> + goto cleanup;
>>> + }
>>> +
>>> virDomainObjAssignDef(vm, def, true, NULL);
>>> def = NULL;
>>> + qemuDomainObjEndJob(driver, vm);
>>> +
>>
>> Looks like we can begin VIR_DOMAIN_JOB_OPERATION_START from the start instead
>> of acquiring job twice.
>
> That is an async job. We shouldn't rely on async job when modifying a domain.
>
To me it looks ok. We don't drop vm lock between begin job and assigning definition
so other threads have no chance to see invalid state.
Nikolay
More information about the libvir-list
mailing list