[libvirt] [PATCH 3/9] qemu: snapshot: save/restore inactive persistent config

John Ferlan jferlan at redhat.com
Fri Dec 21 14:03:19 UTC 2018



On 12/21/18 3:23 AM, Nikolay Shirokovskiy wrote:
> 
> 
> On 20.12.2018 23:31, John Ferlan wrote:
>>
>>
>> On 12/13/18 3:03 AM, Nikolay Shirokovskiy wrote:
>>> In case of active persistent domain snapshot metadata is
>>> not complete. We save only active configuration and on
>>> restore use it both for active and inactive configuration.
>>> Let's fix it and save and restore both in this case.
>>>
>>> In case of active transient or inactive domain we have
>>> only one config and thus everything is good.
>>>
>>> By the way this patch fixes config memleak on error paths.
>>
>> use @config to make it clearer...
>>
>> I see how/why you combined things and although extracting out the
>> memleak is a bit difficult, I think it should be done.
>>
>> Perhaps in the cleanup code to *SaveConfig, you just set @config = NULL
>> "temporarily" since we know/assume virDomainObjAssignDef was successful
>> and we wouldn't want the subsequently added virDomainDefFree(config) to
>> free it on those paths, but we would want it to be free'd on error
>> paths.  Then by using @newConfig in this patch, it's even more obvious
>> and of course the temporary config = NULL would be removed...  Hope this
>> makes sense.
> 
> Ok. I thought of a distinct patch too, but having in mind that adding/removing
> code in subsequent patches is not preferred I decided to just note this moment
> in the message.
> 

I'm mostly 50/50 on this - leaning towards separation mainly because I
think it is possible, but also because when you mention adding a new
feature and fixing an existing memory leak in the commit message that
generally results in someone asking for separation. Certainly makes it
easier to bisect to a commit and not have to wonder did the new
functionality or the bugfix cause some issue...

>>
>>>
>>> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
>>> ---
>>>  src/conf/snapshot_conf.c |  1 +
>>>  src/conf/snapshot_conf.h |  2 ++
>>>  src/qemu/qemu_driver.c   | 44 +++++++++++++++++++++++++++++++++++++++-----
>>>  3 files changed, 42 insertions(+), 5 deletions(-)
>>>
>>
>> Not my area of expertise, but rather than have it sit unreviewed...
>>
>>
>>> diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
>>> index 5a511b4..2e4a0b9 100644
>>> --- a/src/conf/snapshot_conf.c
>>> +++ b/src/conf/snapshot_conf.c
>>> @@ -102,6 +102,7 @@ void virDomainSnapshotDefFree(virDomainSnapshotDefPtr def)
>>>          virDomainSnapshotDiskDefClear(&def->disks[i]);
>>>      VIR_FREE(def->disks);
>>>      virDomainDefFree(def->dom);
>>> +    virDomainDefFree(def->persistDom);
>>>      virObjectUnref(def->cookie);
>>>      VIR_FREE(def);
>>>  }
>>> diff --git a/src/conf/snapshot_conf.h b/src/conf/snapshot_conf.h
>>> index 20a42bd..3da204a 100644
>>> --- a/src/conf/snapshot_conf.h
>>> +++ b/src/conf/snapshot_conf.h
>>> @@ -75,6 +75,8 @@ struct _virDomainSnapshotDef {
>>>      virDomainSnapshotDiskDef *disks;
>>>  
>>>      virDomainDefPtr dom;
>>> +    /* inactive domain config in case of active persistent domain */
>>> +    virDomainDefPtr persistDom;
>>>  
>>>      virObjectPtr cookie;
>>>  
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index f7c1d6f..7e0585d 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -15687,6 +15687,12 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
>>>                                                   VIR_DOMAIN_DEF_PARSE_SKIP_VALIDATE)))
>>>              goto endjob;
>>>  
>>> +        if (virDomainObjIsActive(vm) && vm->persistent &&
>>
>> As it's easy to get lost or forget... IIRC vm->persistent means that w> have used the Define functions to provide a "config" at some point in
> 
>> time. Then eventually when/if we modify the "active" object to make a
>> configuration specific change, then the @newDef gets populated with the
>> config change(s); otherwise, the changes are made to the "active" def
>> and eventually/possibly saved.  All the magic is hidden away in
>> accessors that I don't think about all that often any more.
> 
> If by otherwise you mean "if we make changes to active config" - then 
> such changes are saved to status files but they can't be written 
> directly to config file.
> 

Like I said - some of the magic and details are lost in accessors. Yes,
the active is essentially saved to status... The whole def/newDef fetch,
store, and manipulation gets hidden behind those *Domain.*Persistent*
type helpers. The @newDef nomenclature has always struck me as a bit
odd, but using a longer more descriptive name may not help me either.

>>
>> So, should the vm->persistent be vm->newDef instead? Either way if the
>> following call is made and vm->newDef == NULL, then eventually in
>> virDomainDefFormatInternal we would core when @def was NULL.
> 
> Well there is at least one case when vm->newDef != NULL but domain is not persistent
> - if we undefine domain vm->newDef is not released. This can be fixed but
> I'm not sure there are no other cases. So checking vm->persisent is more reliable.

So we have an active persistent domain with a newDef that eventually is
made non persistent... I'm sure there's a reason for it...

> 
> I think this is currenly invariant that if virDomainObjIsActive(vm) && vm->persistent
> then vm->newDef != NULL but we can add explicit check.
> 
> The best option I think is to fix all cases when vm->newDef != NULL for non persistent
> active domain and replace check as you suggested. What do you think?
> 

I think my main concern would be vm->persistent == true and vm->newDef
== NULL.


> 
>>
>>
>>> +            !(def->persistDom = qemuDomainDefCopy(driver, vm->newDef,
>>> +                                                  VIR_DOMAIN_XML_SECURE |
>>> +                                                  VIR_DOMAIN_XML_MIGRATABLE)))
>>
>> Do we need VIR_DOMAIN_XML_INACTIVE too since this is the "next"
>> persistent config def?  A variant for parse is used later in patch5.
> 
> This is one of the hardest questions - what combinations of flags one should
> use for one's purpuses :) I'm not kidding, there are no good definitions of 

110% agree.

> VIR_DOMAIN_XML_INACTIVE or VIR_DOMAIN_DEF_FORMAT_INACTIVE. So I use same
> combination migration uses to pass persistent config to destination -
> check qemuMigrationSrcRun.

and obviously I asked because I find the design of the flags is
confusing and what each means with respect to what post parse processing
or validation occurs - I just cannot keep it in memory.

> 
> Hmm, virDomainDefFormatInternal has next hunk at the beginning:
> 
>     if (def->id == -1)                                                                                              
>         flags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE;  
> 
> and def->id == -1 for inactive config (@id is set to -1 on parsing
> inactive configs and on domain start @def is copied to @newDef before
> allocating @id - check qemuProcessInit). So we have this flag anyway :)

<sigh>... commit f24e67d24

> 
> Also even the above fact does not have the case  AFAIU having this flag for inactive config 
> does not have much meaning - random hunks of code that I checked filter
> dumping info that gained by config on domain start. However I run across a 
> kind of exception too - @id is not dumped at all if VIR_DOMAIN_DEF_FORMAT_INACTIVE
> is set however this is not make much difference.
> 

The case you're chasing though would have an id != -1 since it's an
active guest, but we're saving the newDef/next config. Perhaps similar
to how various qemu_driver commands can manipulate both active and
config (attach, detach, change, etc).  It's a "hypervisor dependent"
action as to what happens.

>>
>>> +            goto endjob;
>>> +
>>
>> Since we are doing this as "best chance" and aren't requiring it, what
>> are your (and perhaps others) thoughts related to making this really
>> optional. As in if the qemuDomainDefCopy fails, then we just throw a
>> VIR_INFO, clear the last error, and continue on?
>>
>> I don't have a preference, but I'm just throwing it out there as an idea.
> 
> I should say I'm not looking at this step as "best chance" - if we can
> not revert exact state I'd prefer to fail here.
> 
>>
>>>          if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) {
>>>              align_location = VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL;
>>>              align_match = false;
>>> @@ -16219,6 +16225,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>      qemuDomainObjPrivatePtr priv;
>>>      int rc;
>>>      virDomainDefPtr config = NULL;
>>> +    virDomainDefPtr persistConfig = NULL;
>>>      virQEMUDriverConfigPtr cfg = NULL;
>>>      virCapsPtr caps = NULL;
>>>      bool was_stopped = false;
>>> @@ -16226,6 +16233,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>      virCPUDefPtr origCPU = NULL;
>>>      unsigned int start_flags = VIR_QEMU_PROCESS_START_GEN_VMID;
>>>      qemuDomainAsyncJob jobType = QEMU_ASYNC_JOB_START;
>>> +    virDomainDefPtr newConfig = NULL;
>>>  
>>>      virCheckFlags(VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING |
>>>                    VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED |
>>> @@ -16332,6 +16340,11 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>              goto endjob;
>>>      }
>>>  
>>> +    if (snap->def->persistDom &&
>>> +        !(persistConfig = virDomainDefCopy(snap->def->persistDom, caps,
>>> +                                           driver->xmlopt, NULL, true)))
>>> +        goto endjob;
>>> +
>>
>> Based only on my previous comment, for this part I agree especially
>> since if we were able to save a config, then we'd want to be sure to
>> restore it.
>>
>>>      cookie = (qemuDomainSaveCookiePtr) snap->def->cookie;
>>>  
>>>      switch ((virDomainState) snap->def->state) {
>>> @@ -16440,8 +16453,18 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>                   * failed loadvm attempt? */
>>>                  goto endjob;
>>>              }
>>> -            if (config) {
>>> -                virDomainObjAssignDef(vm, config, false, NULL);
>>> +
>>> +            /* Older versions do not save inactive config in metadata, instead
>>> +             * they use active config for this purpose, so keep this behaviour
>>> +             * for backward compat.
>>> +             */
>>> +            if (persistConfig)
>>> +                VIR_STEAL_PTR(newConfig, persistConfig);
>>> +            else
>>> +                VIR_STEAL_PTR(newConfig, config);
>>> +
>>> +            if (newConfig) {
>>> +                virDomainObjAssignDef(vm, newConfig, false, NULL);
>>>                  virCPUDefFree(priv->origCPU);
>>>                  VIR_STEAL_PTR(priv->origCPU, origCPU);
>>>              }
>>> @@ -16449,8 +16472,10 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>              /* Transitions 2, 3 */
>>>          load:
>>>              was_stopped = true;
>>> -            if (config)
>>> +            if (config) {
>>>                  virDomainObjAssignDef(vm, config, false, NULL);
>>> +                config = NULL;
>>
>> This "makes sense" except for the case where @persistConfig == NULL,
>> then we wouldn't virDomainSaveConfig below as we currently do. So maybe
>> this should be:
>>
>>     if (!persistConfig)
>>         VIR_STEAL_PTR(newConfig, config);
>>     else
>>         config = NULL;
>>
>> That way we won't virDomainDefFree @config below (which is what you're
>> trying to avoid), but we will save @config as we did before unless of
>> course @persistConfig is going to be used.
>>
>> Make sense?
> 
> Good catch! I think I'll change to:
> 
>             if (config) {
>                 virDomainObjAssignDef(vm, config, false, NULL);
>                 VIR_STEAL_PTR(newConfig, config);
>             }
> 
> because if we have persistConfig but fail to assing it - means
> we are on error path and we won't save newConfig anyway.
> 

OK - I was going explicit with the alternative to the code below for
persistConfig/newConfig.

>>
>>> +            }
>>>  
>>>              /* No cookie means libvirt which saved the domain was too old to
>>>               * mess up the CPU definitions.
>>> @@ -16471,6 +16496,11 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>                                               detail);
>>>              if (rc < 0)
>>>                  goto endjob;
>>> +
>>> +            if (persistConfig) {
>>
>> Perhaps we should point out:
>>
>>    /* This will save the @persistConfig in the vm->newDef where it
>>     * was originally copied from. */
>>
>> John
>>
>>> +                virDomainObjAssignDef(vm, persistConfig, false, NULL);
>>> +                VIR_STEAL_PTR(newConfig, persistConfig);
>>> +            }
>>>          }
> 
> 
> This makes me wonder if virDomainObjAssignDef is too overcomplicated
> if we need such comments.
> 

Nah, no way that's over complicated <eyeroll>. Touching it though means
you get to own making it better... job security and all that.

John


> Nikolay
> 
>>>  
>>>          /* Touch up domain state.  */
>>> @@ -16535,8 +16565,10 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>              qemuDomainRemoveInactive(driver, vm);
>>>              goto endjob;
>>>          }
>>> -        if (config)
>>> +        if (config) {
>>>              virDomainObjAssignDef(vm, config, false, NULL);
>>> +            VIR_STEAL_PTR(newConfig, config);
>>> +        }
>>>  
>>>          if (flags & (VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING |
>>>                       VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) {
>>> @@ -16600,7 +16632,7 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>      } else if (snap) {
>>>          snap->def->current = false;
>>>      }
>>> -    if (ret == 0 && config && vm->persistent &&
>>> +    if (ret == 0 && newConfig && vm->persistent &&
>>>          !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,
>>>                                      vm->newDef ? vm->newDef : vm->def))) {
>>>          detail = VIR_DOMAIN_EVENT_DEFINED_FROM_SNAPSHOT;
>>> @@ -16616,6 +16648,8 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>      virObjectUnref(cfg);
>>>      virNWFilterUnlockFilterUpdates();
>>>      virCPUDefFree(origCPU);
>>> +    virDomainDefFree(config);
>>> +    virDomainDefFree(persistConfig);
>>>  
>>>      return ret;
>>>  }
>>>




More information about the libvir-list mailing list