[libvirt] [PATCH 7/9] qemu: Grab modify job for changing domain XML

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Tue Jun 4 14:30:01 UTC 2019



On 04.06.2019 17:23, Michal Privoznik wrote:
> On 6/4/19 11:07 AM, Nikolay Shirokovskiy wrote:
>>
>>
>> On 30.05.2019 12:34, Michal Privoznik wrote:
>>> Changing domain definition must be guarded with acquiring modify
>>> job. The problem is if there is a thread executing say
>>> qemuDomainSetMemoryStatsPeriod() which is an API that acquires
>>> modify job and then possibly unlock the domain object and locks
>>> it again. However, becasue virDomainObjAssignDef() does not take
>>> a job (only object lock) it may have changed the domain
>>> definition while the other thread unlocked the domain object in
>>> order to talk on the monitor. For instance:
>>>
>>> Thread1:
>>> 1) lookup domain object and lock it
>>> 2) acquire job
>>> 3) get pointers to live and persistent defs
>>> 4) unlock the domain object
>>> 5) talk to qemu on the monitor
>>>
>>> Thread2 (Execute qemuDomainDefineXMLFlags):
>>> 1) lookup domain object and lock it
>>> 2) virDomainObjAssignDef() which overwrites persistent def and
>>>     frees the old one
>>> 3) unlock domain object
>>>
>>> Thread1:
>>> 6) lock the domain object again
>>> 7) try to access the persistent def via pointer stored in 3)
>>>
>>> Now this will crash because the pointer from step 3) points to a
>>> memory that was freed.
>>>
>>> However, if Thread2 waited and acquired modify job (just like
>>> Thread1) then these two would serialize and no harm would be
>>> done.
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>>> ---
>>>   src/qemu/qemu_domain.c    |  3 ++-
>>>   src/qemu/qemu_driver.c    | 29 +++++++++++++++++++++++++----
>>>   src/qemu/qemu_migration.c |  7 +++++++
>>>   3 files changed, 34 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>>> index c61d39b12b..fff2f1932e 100644
>>> --- a/src/qemu/qemu_domain.c
>>> +++ b/src/qemu/qemu_domain.c
>>> @@ -8784,7 +8784,8 @@ qemuDomainRemoveInactive(virQEMUDriverPtr driver,
>>>           return;
>>>       }
>>>   -    qemuDomainRemoveInactiveCommon(driver, vm);
>>> +    if (vm->def)
>>> +        qemuDomainRemoveInactiveCommon(driver, vm);
>>>         virDomainObjListRemove(driver->domains, vm);
>>
>> Hmm, virDomainObjListRemoveLocked uses vm->def too.
>>
>>>   }
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index 8bbac339e0..77cb7e4b87 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -1706,9 +1706,16 @@ static virDomainPtr qemuDomainCreateXML(virConnectPtr conn,
>>>                                      VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE)))
>>>           goto cleanup;
>>>   +    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) {
>>> +        qemuDomainRemoveInactive(driver, vm);
>>> +        goto cleanup;
>>> +    }
>>> +
>>>       virDomainObjAssignDef(vm, def, true, NULL);
>>>       def = NULL;
>>>   +    qemuDomainObjEndJob(driver, vm);
>>> +
>>
>> Looks like we can begin VIR_DOMAIN_JOB_OPERATION_START from the start instead
>> of acquiring job twice.
> 
> That is an async job. We shouldn't rely on async job when modifying a domain.
> 

To me it looks ok. We don't drop vm lock between begin job and assigning definition
so other threads have no chance to see invalid state.

Nikolay




More information about the libvir-list mailing list