[libvirt] [RFC v2] external (pull) backup API

Vladimir Sementsov-Ogievskiy vsementsov at virtuozzo.com
Mon Apr 23 09:31:32 UTC 2018


21.04.2018 00:26, Eric Blake wrote:
> On 04/20/2018 01:24 PM, John Snow wrote:
>
>>>> Why is option 3 unworkable, exactly?:
>>>>
>>>> (3) Checkpoints exist as structures only with libvirt. They are saved
>>>> and remembered in the XML entirely.
>>>>
>>>> Or put another way:
>>>>
>>>> Can you explain to me why it's important for libvirt to be able to
>>>> reconstruct checkpoint information from a qcow2 file?
>>>>
>>> In short, it takes extra effort to keep the metadata consistent when a
>>> libvirtd crash occurs. See the more detailed explanation in [1],
>>> starting from the words "Yes it is possible".
>>>
>>> [1] https://www.redhat.com/archives/libvir-list/2018-April/msg01001.html
> I'd argue the converse. Libvirt already knows how to do atomic updates
> of XML files that it tracks.  If libvirtd crashes/restarts in the middle
> of an API call, you already have indeterminate results of whether the
> API worked or failed; once libvirtd is restarted, you'll have to
> probably retry the command.  For all other cases, the API call
> completes, and either no XML changes were made (the command failed and
> reports the failure properly), or all XML changes were made (the command
> created the appropriate changes to track the new checkpoint, including
> whatever bitmap names have to be recorded to map the relation between
> checkpoints and bitmaps).
>
> Consider the case of internal snapshots.  Already, we have the case
> where qemu itself does not track enough useful metadata about internal
> snapshots (right now, just a name and timestamp of creation); so libvirt
> additionally tracks further information in <domainsnapshot>: the name,
> timestamp, relationship to any previous snapshot (libvirt can then
> reconstruct a tree relationship between all snapshots; where a parent
> can have more than one child if you roll back to a snapshot and then
> execute the guest differently), the set of disks participating in the
> snapshot, and the <domain> description at the time of the snapshot (if
> you hotplug devices, or even the fact that creating external snapshots
> changes which file is the active qcow2 in a backing chain, you'll need
> to know how to roll back to the prior domain state as part of
> reverting).  This is approximately the same set of information that a
> <domaincheckpoint> will need to track.
>
> I'm slightly tempted to just overload <domainsnapshot> to track three
> modes instead of two (internal, external, and now checkpoint); but think
> that will probably be a bit too confusing, so more likely I will create
> <domaincheckpoint> as a new object, but copy a lot of coding paradigms
> from <domainsnapshot>.
>
> So, from that point of view, libvirt tracking the relationship between
> qcow2 bitmaps in order to form checkpoint information can be done ALL
> with libvirt, and without NEEDING the qcow2 file to track any relations
> between bitmaps.  BUT, libvirt's job can probably be made easier if
> qcow2 would, at the least, allow bitmaps to track their parent, and/or
> provide APIs to easily merge a parent..intermediate..child chain of
> related bitmaps into a single bitmap, for easy runtime
> creation of the temporary bitmap used to express the delta between two
> checkpoints.

I don't think this is a good idea:
https://www.redhat.com/archives/libvir-list/2018-April/msg01306.html

In short, I think that if we are going to do work in qemu to support
checkpoints (an updated BdrvDirtyBitmap, qapi, qcow2 and the migration
stream, a new NBD meta context), we would be better off implementing
checkpoints directly rather than a .parent relationship between bitmaps.
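
To make the comparison concrete: under the bitmap-chain model quoted
above, serving the delta between two checkpoints means merging every
bitmap on the path into one temporary bitmap before exporting it over
NBD. A minimal sketch of that bookkeeping in Python follows; the QMP
command names are placeholders for an interface that is still under
discussion, not an existing qapi API.

# Hypothetical sketch only: the command names below stand in for a
# qemu dirty-bitmap merge interface that does not exist yet.

def build_delta_bitmap(qmp_send, node, bitmap_chain,
                       tmp_name="libvirt-tmp-delta"):
    """Merge the chain of bitmaps between two checkpoints into one
    temporary bitmap that can back an NBD dirty-bitmap export.

    qmp_send     -- callable issuing a QMP command: qmp_send(cmd, args)
    node         -- block node holding the bitmaps
    bitmap_chain -- bitmap names on the path, oldest checkpoint first
    """
    # Temporary, non-persistent bitmap to hold the union of the chain.
    qmp_send("block-dirty-bitmap-add",
             {"node": node, "name": tmp_name, "persistent": False})
    # Fold each bitmap of the chain into the temporary one.
    for name in bitmap_chain:
        qmp_send("block-dirty-bitmap-merge",
                 {"node": node, "target": tmp_name, "bitmaps": [name]})
    return tmp_name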

>
>> OK; I can't speak to the XML design (I'll leave that to Eric and other
>> libvirt engineers) but the data consistency issues make sense.
> And I'm still trying to figure out exactly what is needed, to capture
> everything needed to create checkpoints and take backups (both push and
> pull model).  Reverting to data from an external backup may be a bit
> more manual, at least at first (after all, we STILL don't have decent
> libvirt support for rolling back to external snapshots, several years
> later).  In other words, my focus right now is "how can we safely track
> checkpoints for capturing of point-in-time incremental backups with
> minimal guest downtime", rather than "given an incremental backup
> captured previously, how do we roll a guest back to that point in time".
>
>> ATM I am concerned that by shifting the snapshots into bitmap names that
>> you still leave yourself open for data corruption if these bitmaps are
>> modified outside of libvirt -- these third party tools can't possibly
>> understand the schema that they were created under.
>>
>> (Though I suppose very simply that if a bitmap is missing you'd be able
>> to detect that in libvirt and signal an error, but it's not very nice.)
> Well, we also have to realize that third-party tools shouldn't really be
> mucking around with bitmaps they don't understand.  If you are going to
> manipulate a qcow2 file that contains persistent bitmaps, you should not
> delete a bitmap you did not create; and if the bitmap is autoloaded, you
> must obey the rules and amend the bitmap for any guest-visible changes
> you make during your data edits.  Just like a third-party tool shouldn't
> really be deleting internal snapshots it didn't create.  I don't think
> we have to worry as much about being robust to what a third party tool
> would do behind our backs (after all, the point of the pull model
> backups is so that third-party tools can track the backup in the format
> THEY choose, after reading the dirty bitmap and data over NBD, rather
> than having to learn qcow2).
>
>> I'll pick up discussion with Eric and Vladimir in the other portion of
>> this thread where we're discussing a checkpoints API and we'll pick this
>> up on QEMU list if need be.
> Yes, between this thread, and some IRC chats I've had with John in the
> meantime, it looks like we DO want some improvements on the qcow2 side
> of things on the qemu list.
>
> Other things that I need to capture from IRC:
>
> Right now, it sounds like the incremental backup model (whether push or
> pull) is heavily dependent on qcow2 files for persistent bitmaps.  While
> libvirt can perform external snapshots by creating a qcow2 wrapper
> around any file type, and live commit can then merge that qcow2 file
> back into the original file, libvirt is already insistent that internal
> snapshots can only be taken if all disks are qcow2.  So the same logic
> will apply to taking backups (whether the backup is incremental by
> starting from a checkpoint, or full over the complete disk contents).
>
> Also, how should checkpoints interact with external snapshots?  Suppose
> I have:
>
> base <- snap1
>
> and create a checkpoint at time T1 (which really means I create a bitmap
> titled B1 to track all changes that occur _after_ T1).  Then later I
> create an external snapshot, so that now I have:
>
> base <- snap1 <- snap2
>
> at that point, the bitmap B1 in snap1 is no longer being modified,
> because snap1 is read-only.  But we STILL want to track changes since
> T1, which means we NEED a way in qemu to not only add snap2 as a new
> snapshot, but ALSO to create a new bitmap B2 in snap2, that tracks all
> changes (until the next checkpoint, of course).  Whether B2 starts life
> empty (and libvirt just has to remember that it must merge snap1.B1 and
> snap2.B2 when constructing the delta), or whether B2 starts life as a
> clone of the final contents of snap1.B1, is something that we need to
> consider in qemu.

I'm sure the latter is the right way: it keeps snapshots genuinely
unrelated to checkpoints, because we simply have a "snapshot" of the
bitmap in the snapshot file.
An additional interesting point: this works for internal snapshots too,
as the bitmaps will travel into the saved state through the migration
channel (if we enable the corresponding capability, of course).
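
A minimal sketch of that semantic, as I understand it (this only models
the bookkeeping in plain Python; the layer and bitmap names are made up,
and the actual qemu mechanism, whether a bitmap copy at snapshot time or
carrying the bitmap state through migration, is still to be decided):

def take_external_snapshot(chain, bitmaps, new_layer):
    """Toy model of "a disk snapshot is also a snapshot of its bitmaps".

    chain     -- layer names, base first, e.g. ["base", "snap1"]
    bitmaps   -- dict: layer name -> {bitmap name: set of dirty clusters}
    new_layer -- name of the new active layer, e.g. "snap2"
    """
    old_top = chain[-1]
    # The new top starts with a copy of every bitmap of the old top,
    # preserving current contents; the copies keep recording new writes.
    bitmaps[new_layer] = {name: set(bits)
                          for name, bits in bitmaps.get(old_top, {}).items()}
    # The old top becomes read-only, so its bitmaps stay frozen as-is.
    chain.append(new_layer)
    return chain, bitmaps

With this, the bitmaps in the active layer alone already express "all
changes since T1", so no cross-layer merging is needed to build a delta.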

> And if there is more than one bitmap on snap1, do we
> need to bring all of those bitmaps forward into snap2, or just the one
> that was currently active?

Again, to keep snapshots unrelated to checkpoints, I think it's better to
keep them all: let a disk snapshot also be a snapshot of its dirty bitmaps.

> Similarly, if we later decide to live commit
> snap2 back into snap1, we'll want to merge the changes in snap2.B2 back
> into snap1.B1 (now that snap1 is once again active, it needs to track
> all changes that were merged in, and all future changes until the next
> snapshot).

And here we would simply drop the older versions of the bitmaps.
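
Continuing the toy model above (again just bookkeeping with made-up
names, not a real qemu interface): a live commit of the top layer would
move its up-to-date bitmaps down and discard the stale copies that the
lower layer kept.

def commit_top_layer(chain, bitmaps):
    """Toy model of a live commit under the "bitmap snapshot" semantics.

    The top layer's bitmaps already contain everything recorded since
    they were cloned at snapshot time, so the older copies kept in the
    lower layer can simply be dropped and replaced.
    """
    top = chain.pop()
    new_top = chain[-1]
    # qemu's live commit merges the data of `top` down; for the bitmaps
    # we only move the current copies down and drop the outdated ones.
    bitmaps[new_top] = bitmaps.pop(top, {})
    return chain, bitmaps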

>   Which means we need to at least be thinking about cross-node
> snapshot merges,

Hmm, what is that, exactly?

>   even if, from the libvirt perspective, checkpoints are
> more of a per-drive attribute rather than a per-node attribute.
>


-- 
Best regards,
Vladimir



