[libvirt] [PATCH v4 11/20] wip: backup: Parse and output checkpoint XML

Thu Feb 28 00:57:43 UTC 2019

On 2/25/19 4:06 PM, Eric Blake wrote:
> On 2/12/19 11:23 AM, John Ferlan wrote:
>>
>>
>> On 2/6/19 2:18 PM, Eric Blake wrote:
>>> Work in progress - the checkpoint code is not quite passing
>>> tests (part of that is figuring out the minimal XML that is
>>> still valid as a <domain> element, or just use --no-domain flag).
>>>
>>> Signed-off-by: Eric Blake <eblake at redhat.com>
>>> ---
>>>  src/conf/checkpoint_conf.h          |  150 ++++
>>>  src/conf/domain_conf.h              |   11 +-
>>>  po/POTFILES                         |    1 +
>>>  src/conf/Makefile.inc.am            |    2 +
>>>  src/conf/checkpoint_conf.c          | 1030 +++++++++++++++++++++++++++
>>>  src/conf/domain_conf.c              |    7 +-
>>>  src/libvirt_private.syms            |   21 +
>>>  tests/Makefile.am                   |    9 +-
>>>  tests/domaincheckpointxml2xmltest.c |  231 ++++++
>>>  9 files changed, 1458 insertions(+), 4 deletions(-)
>>>  create mode 100644 src/conf/checkpoint_conf.h
>>>  create mode 100644 src/conf/checkpoint_conf.c
>>>  create mode 100644 tests/domaincheckpointxml2xmltest.c
>>>
>>
>> Starting to lose some steam - seeing wip means I don't want to spend too
>> much time on some algorithms for fear they'll change, but then again I
>> know you're looking for feedback...
> 
> I understand. The 'wip' tags should be gone for v5.
> 
>>
>>> diff --git a/src/conf/checkpoint_conf.h b/src/conf/checkpoint_conf.h
>>> new file mode 100644
>>> index 0000000000..994a8bd083
>>> --- /dev/null
>>> +++ b/src/conf/checkpoint_conf.h
>>> @@ -0,0 +1,150 @@
>>> +/*
>>> + * checkpoint_conf.h: domain checkpoint XML processing
>>> + *
>>> + * Copyright (C) 2006-2019 Red Hat, Inc.
>>> + * Copyright (C) 2006-2008 Daniel P. Berrange
>>
>> This is new, right?  Not a copy of...
> 
> New file, but so heavily copied from snapshot_conf.h that I felt safer
> preserving all existing copyrights.
> 

See src/conf/{vir*obj}.c for some examples, e.g.:

 * virinterfaceobj.c: interface object handling
 *                    (derived from interface_conf.c)

that's what I ended up doing when splitting apart XML handling from
Object handling code.

> 
>>> +struct _virDomainCheckpointDef {
>>> +    /* Public XML.  */
>>> +    char *name;
>>> +    char *description;
>>> +    char *parent;
>>> +    long long creationTime; /* in seconds */
>>> +
>>> +    size_t ndisks; /* should not exceed dom->ndisks */
>>> +    virDomainCheckpointDiskDef *disks;
>>> +
>>> +    virDomainDefPtr dom;
>>> +
>>> +    /* Internal use.  */
>>> +    bool current; /* At most one checkpoint in the list should have this set */
>>
>> Typically these are then in the *Obj...
> 
> Copy-and-paste from virDomainSnapshotDef, but my recent work there has
> made me want to reconsider where the current object is tracked.
> 

Yeah, understandable. Like I noted earlier - when I did the work to
split out object code and make it more common, I skipped snapshots for
various reasons, but mostly because I feared that code and wasn't sure
if changing it would cause issues with any blockdev work Peter was
doing. So I want the safer route.

I do think snapshot_conf.c could be split up with virsnapshotobjlist.c
being created, but that's future work.  Maybe something you can consider
for checkpoints though.

> The problem is that the current way we store things on disk is via a
> separate <domainsnapshot> XML per snapshot, with no other obvious place
> for the internal representation of a <domain> to track which snapshot is
> current.  BUT, as I have just posted patches to add a bulk dump/redefine
> with VIR_DOMAIN_XML_SNAPSHOTS, we COULD just ditch our current
> separate-file storage of snapshots, and instead store the snapshots
> directly with the <domain> (for all new saves; when loading libvirtd,
> we'd still have to keep the code to load from older separate snapshot
> files for back-compat reasons, even if we no longer track separate files
> after the one-time conversion).  And if I make snapshots cleaner like
> that, then checkpoints can just start life with the saner approach,
> rather than being stuck with copying the older one-file-per-checkpoint
> approach that I implemented using copy-and-paste.
> 

Hey, I recognize that explanation!

>>
>>> +};
>>> +
>>> +struct _virDomainCheckpointObj {
>>> +    virDomainCheckpointDefPtr def; /* non-NULL except for metaroot */
>>> +
>>> +    virDomainCheckpointObjPtr parent; /* non-NULL except for metaroot, before
>>> +                                         virDomainCheckpointUpdateRelations, or
>>> +                                         after virDomainCheckpointDropParent */
>>> +    virDomainCheckpointObjPtr sibling; /* NULL if last child of parent */
>>> +    size_t nchildren;
>>> +    virDomainCheckpointObjPtr first_child; /* NULL if no children */
>>
>> The whole relationship thing is I think overly complex and a bit
>> difficult to follow. Having @sibling and @first_child seems to make the
>> code even more complex. I would think you have a parent and maybe a
>> child. If this is the "first" checkpoint, then there is no parent. I
>> would think that means it's the root.
>>
>> There's just much more to be see when it comes to how these
>> relationships play out as checkpoints are made and removed. I guess I
>> just have a very linear model in mind, but reading the code seems to go
>> beyond linearality.
>>
>> The use of hash tables just makes it easier to ensure no def->name is
>> reused. Not having locks in the object means some parent lock is
>> managing to make sure two checkpoint operations cannot occur at the same
>> time from different threads (e.g. a domain object lock), but that is not
>> 100% clear.
> 
> Again, heavy copy-and-paste from snapshots, which DO lend themselves to
> non-linear DAGs of related snapshots. (If you create snapshot A, then
> run the guest, then create B, then revert to A, then run the guest and
> create C, then both B and C are children of A).  I don't readily know if
> checkpoints lend themselves as easily to non-linear relations, or if my
> copy-and-paste is an instance of YAGNI.  And it also makes me wonder if
> I would be better served by factoring out the snapshot non-linear
> relationships into some more reusable helper functions rather than just
> open-coding things in both files.

The KISS principle comes to mind ;-).

> 
> The short answer is that there is a hash table (for O(1) name lookup)
> AND a linked-list relationship listing (for tracking DAG relationships,
> where one child is common, but more than one child is possible); and
> that the domain tracks a metaroot (a static instance with name==NULL) so
> that all other snapshots/checkpoints have nice properties of always
> having a parent (either another real snapshot as the parent, or the
> metaroot as the parent when the presentation to the user is that the
> snapshot is the root of a DAG with no user-created parent).  If I
> improve anything here with better comments, I should also apply similar
> improvements to the snapshot code.
> 

If the complexity is in revert, then is that required for what backup
needs? If checkpoints are a "pseudo-child" of a snapshot or a backup,
then snapshots can handle that awful complexity and backups can do
something slightly different. Of course that goes back to a previous
point about where the <domaincheckpoint...> should live <sigh>.

My KISS sees checkpoints as points in time saving what's there allowing
other parts of the system to build upon being able to handle differences
in those points in time. Trying to merge/squeeze in snapshot logic or
functionality into checkpoints is/was overwhelming... There's so much
about snapshot processing that I avoided because unless you work with it
often enough, it really don't necessarily make sense. A lot to do with
terminology and thinking about the corresponding QEMU functionality
you're trying to work with. Still checkpoints (and snapshots for that
matter) could be theoretically implemented by a different hypervisor
too, but all I have only partially understood is the QEMU logistics.

>>
>>
>>> +};
>>
>> When objectifying module to to avoid leaking *obj entries into other
>> modules, this struct would go inside .c file w/ accessors to whatever is
>> needed externally. I purposefully avoided domain & domain snapshot as
>> there was just too much rework.  Whether or not that can be done here -
>> not sure yet, still reading, but figured I'd point it out at least ;-)
> 
> Indeed, since there is such heavy copy-and-paste between the two, first
> refactoring virDomainSnapshot to be Object-ified with better helper
> routines, and then making virDomainCheckpoint share a common parent
> class with virDomainSnapshot to reuse those helper routines, rather than
> open-coded copy-and-paste is probably worth the effort.  (I didn't do it
> initially in order to get a working demo out the door, but I also admit
> that adds technical debt, and I should really be trying to reduce it
> rather than adding to it).
> 

True - there's a lot going into many decisions made and to be made that
is hard to quantify in the review. I know you've been working at this
for a while and I wonder if there's a point where it is it better to get
something in that perhaps someone else can refactor while you complete
the rest of the work. Or know that you can come back to it. Since there
is overlap from copypaste I'm sure someone can or will come up ways to
converge logic.

Still yet as you point out later in your response - it could be worth
generating the common module/code for both checkpoints and snapshots to
use. I think it would be too hard to unwind everything this late in the
game!

> 
>>> +int virDomainCheckpointUpdateRelations(virDomainCheckpointObjListPtr checkpoints);
>>> +void virDomainCheckpointDropParent(virDomainCheckpointObjPtr checkpoint);
>>
>> More recently in the .h prototypes, been trying to follow the .c entry
>> as well... where it's
>>
>> type
>> function(args...);
>>
>> makes copy-pasta easier...
> 
> Yeah, I'll have to double-check what improvements have been made to
> snapshot in the meantime, and make sure I'm not regressing back to
> things the way they were before cleanups (since my copy-and-paste point
> is mostly from sometime around Sep-Oct 2018).
> 

Luckily a module that isn't in the often changed category... It's in the
don't be the last to change it because you don't want to be the one to
break it and thus own it ;-).

> 
>>
>> Add a:
>>
>> VIR_DEFINE_AUTOPTR_FUNC(virDomainCheckpointDef, virDomainCheckpointDefFree);
>>
>> details coming...
> 
> Yeah, I know it's coming.  From what I've seen, I'll like it, too.
> 
> 
>>> @@ -2631,6 +2637,9 @@ struct _virDomainObj {
>>>
>>>      bool hasManagedSave;
>>>
>>> +    virDomainCheckpointObjListPtr checkpoints;
>>> +    virDomainCheckpointObjPtr current_checkpoint;
>>> +
>>
>> Note to self or you since I could forget by the time I find this
>> usage... wouldn't update to current_checkpoint need a lock?
> 
> Same question for snapshots. So far, all modification of snapshots vs.
> current_snapshot have been under an appropriate API lock, but you do
> raise the point that I should be careful about it. In fact, I'm
> half-tempted to first fix virDomainSnapshotObjListPtr to subsume the
> current_snapshot designation rather than letting virDomainPtr track it
> independently, in which case this would need the same treatment.
> 

I think I knew the answer was related to Snapshots and that snapshots
may lack locking... Perhaps something else that caused me to avoid it
when working through the common object changes. I do remember making
some changes at some point in time in private branches, but the details
escape me.

>>
>> What's really not clear is whether the domain object lock is in place or
>> not when we get here.  I think the domainobj's are just plain write
>> locks and not the domainobjlist fancy rwlocks.
> 
> All the more reason to put the current object as part of the ObjListPtr,
> so that locking for adding/removing members from the list is guaranteed
> to be the same as locking for updating which member is current.
> 

If we come through the domain connect API's then we'd have domain lock -
which perhaps would be the right thing here... Especially if there's
this "relationship" between snapshots and checkpoints that we'd need to
be really, really, careful to manage.

> 
>>> +#include <config.h>
>>> +
>>> +#include <fcntl.h>
>>> +#include <sys/stat.h>
>>> +#include <sys/time.h>
>>> +#include <unistd.h>
>>> +
>>> +#include "internal.h"
>>> +#include "virbitmap.h"
>>> +#include "virbuffer.h"
>>> +#include "count-one-bits.h"
>>> +#include "datatypes.h"
>>> +#include "domain_conf.h"
>>> +#include "virlog.h"
>>> +#include "viralloc.h"
>>> +#include "netdev_bandwidth_conf.h"
>>> +#include "netdev_vport_profile_conf.h"
>>> +#include "nwfilter_conf.h"
>>> +#include "secret_conf.h"
>>> +#include "checkpoint_conf.h"
>>> +#include "virstoragefile.h"
>>> +#include "viruuid.h"
>>> +#include "virfile.h"
>>> +#include "virerror.h"
>>> +#include "virxml.h"
>>> +#include "virstring.h"
>>
>> Are all of these really needed?
> 
> Possibly not. I'll play the trimming game.
> 
>>
>>> +
>>> +#define VIR_FROM_THIS VIR_FROM_DOMAIN_CHECKPOINT
>>> +
>>> +VIR_LOG_INIT("conf.checkpoint_conf");
>>> +
>>> +VIR_ENUM_IMPL(virDomainCheckpoint, VIR_DOMAIN_CHECKPOINT_TYPE_LAST,
>>> +              "default", "no", "bitmap");
>>> +
>>> +struct _virDomainCheckpointObjList {
>>> +    /* name string -> virDomainCheckpointObj mapping
>>> +     * for O(1), lockless lookup-by-name */
>>> +    virHashTable *objs;
>>> +
>>> +    virDomainCheckpointObj metaroot; /* Special parent of all root checkpoints */'
>>
>> So not a pointer, but copied object from somewhere?
> 
> No, statically allocated right here.
> 
> It's a trick that let's me convert a user's possible multi-root forest:
> 
> A
>  - B
>  - C
> D
>  - E
> 
> into a single-root DAG:
> 
> metarooot
>  - A
>     - B
>     - C
>  - D
>     - E
> 
> and various operations that need to quickly find all the user-visible
> roots can just traverse the children of the metaroot.
> 
> Again, heavily copy-pasted from snapshots, and probably worth first
> refactoring it into a common parent-class object with better comments,
> so that both snapshots and checkpoints can reuse the same parent, rather
> than open-coding things twice.
> 

Again back to the question of how much Checkpoints are going to be
independent or children of snapshot/backup. Is it really desired to take
all the snapshot logic?

>>
>>> +};
>>> +
>>> +/* Checkpoint Def functions */
>>> +static void
>>> +virDomainCheckpointDiskDefClear(virDomainCheckpointDiskDefPtr disk)
>>> +{
>>> +    VIR_FREE(disk->name);
>>> +    VIR_FREE(disk->bitmap);
>>
>> Should we clear idx, size, and type too? Depends on consumer usage which
>> could clear and reuse, thus using something old.
> 
> I'll have to double-check if memset(,0) as a final step in clear would
> make any difference.
> 
> 
>>> +    bitmap = virXMLPropString(node, "bitmap");
>>> +    if (bitmap) {
>>> +        if (def->type != VIR_DOMAIN_CHECKPOINT_TYPE_BITMAP) {
>>> +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>>> +                           _("disk checkpoint bitmap '%s' requires "
>>> +                             "type='bitmap'"),
>>> +                           bitmap);
>>> +            goto cleanup;
>>> +        }
>>> +        VIR_STEAL_PTR(def->bitmap, bitmap);
>>> +    }
>>
>> Size is not parsed.  Restore from restart will lose it.
> 
> Intentional. Size is not persistent, but dynamically computed every time
> you pass the VIR_DOMAIN_CHECKPOINT_XML_SIZE flag (and potentially
> constantly changing, for a running domain).  In other words, it is an
> output-only flag, and not tied to the checkpoint itself, so much as the
> checkpoint XML being the easiest way to query that particular piece of
> per-disk runtime information in bulk.
> 

It's a calculated value - I was just covering all the bases.

> 
>>> +static virDomainCheckpointDefPtr
>>> +virDomainCheckpointDefParse(xmlXPathContextPtr ctxt,
>>> +                            virCapsPtr caps,
>>> +                            virDomainXMLOptionPtr xmlopt,
>>> +                            unsigned int flags)
>>> +{
>>> +    virDomainCheckpointDefPtr def = NULL;
>>> +    virDomainCheckpointDefPtr ret = NULL;
>>> +    xmlNodePtr *nodes = NULL;
>>> +    size_t i;
>>> +    int n;
>>> +    char *creation = NULL;
>>
>> Never used.
>>
>>> +    struct timeval tv;
>>> +    int active;
>>> +    char *tmp;
>>
>>     VIR_AUTOPTR(virDomainCheckpointDef) def = NULL;
>>     VIR_AUTOFREE(xmlNodePtr *) nodes = NULL;
>>     VIR_AUTOFREE(char *) tmp = NULL;
>>
>> Erik likes 'em at the bottom too.
>>
>> With @ret autofree'd, the goto cleanup is unnecessary and replaced by
>> return NULL
>>
>> Use of the AUTOFREE stuff really cleans up a lot.
> 
> Yeah, snapshot should get it first, and then I should make sure my code
> here gets the same improvements.
> 

There's been a run on VIR_AUTO* patches lately.  The original thought
was that someone could use them to be a first time contributor. But no
one has done anything since GSOC finished. As I started working through
them and got varying opinions and feedback, I think it's less a first
time contributor type activity than originally thought. There's quite a
bit involved with not only making the physical change but dealing with
refactorings and logic changes since cleanup/error labels become
unnecessary.  OK, so different soapbox topic, sorry ;-).

John

[...]

I think the remainder of this was largely the copypaste from snapshots
discussion. I'm not sure I have a "best option" answer, but I do like
the concept of common code. There is a lot more information which I
probably have to read a few more times to understand completely!