[libvirt-users] Snapshot system: really confusing.

Fri Apr 27 14:56:40 UTC 2012

On 04/25/2012 03:25 AM, NoxDaFox wrote:
> Hello,
> 
> thank you for your fast reply!
> To make things clearer, I'll explain a bit about what I am trying to
> build.
> Basically it's a debugging platform: what I need is access to the
> memory dump image and to the filesystem images; I don't need any
> reverting support.
> The common lifecycle of a domain is:
> 
> a) Given a backing store disk (qcow2) I create a new disk image: my_image.qcow2
> b) I start this image and play around based on a persistent domain.
> c) I take several pictures (snapshots, dumps) of the VE state: I need
> at least readable pictures of the FileSystem and the RAM.
> d) I shutdown the guest.
> e) I extract valuable information from the pictures. This is the
> critical phase, and the source of all my doubts about libvirt.
> f) I store that information.
> g) I wipe out everything else.
> h) Ready for a new test, return to point a).

Thanks; that helps in knowing what you are trying to accomplish - it
sounds like you are intentionally forking machine state from a known
point in time (the offline backing store disk), and then want to compare
the running state of those forks by snooping both the RAM and disk state
at the time of the snapshot series you took along each branch of the fork.
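
A minimal sketch of the forking step (a), assuming qemu-img is installed and
that its `create` options `-f`, `-b` and `-F` behave as on your qemu version.
The helper names and the paths in the usage note are hypothetical, not part
of libvirt or qemu.

```python
import subprocess


def overlay_create_cmd(backing, overlay, fmt="qcow2"):
    """Build the qemu-img command that creates `overlay` on top of `backing`.

    Hypothetical helper: the overlay records only the blocks that differ
    from the backing store, so every test run forks from the same disk state.
    """
    return ["qemu-img", "create", "-f", fmt, "-b", backing, "-F", fmt, overlay]


def create_overlay(backing, overlay):
    # Runs qemu-img; raises CalledProcessError if the command fails.
    subprocess.check_call(overlay_create_cmd(backing, overlay))
```

For example, `create_overlay("/images/base.qcow2", "/images/my_image.qcow2")`
would give each test its own throwaway disk while leaving the backing store
untouched.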

> 
> libvirt is a great platform! And the documentation is not bad at all. The
> only nebulous part is the snapshot part (I guessed it was under heavy
> development anyway).

Always nice to hear from happy customers, and yes, we are trying to
improve things.  And as always with open source projects, patches are
welcome, even for things as mundane as documentation.

> 
> Now I'll answer to your points.
> 
>>
>> Hopefully libvirt can supply what you are interested in; and be aware
>> that it is a work in progress (that is, unreleased qemu 1.1 and libvirt
>> 0.9.12 will be adding yet more features to the picture, in the form of
>> live block copy).
> 
> I knew the situation, I saw a RFC of yours that is really interesting
> as it introduces snapshots of single storage volumes:
> virStorageVolSnapshotPtr.
> Really interesting under my point of view.

Yep, but unfortunately still several months down the road.  I basically
envision a way to map between virDomainPtr and virStorageVolPtr (given a
domain, return a list of the storage volumes it is using; and given a
storage volume, return a list of any domains referring to that volume).
Once that mapping is in place, then we can expose more power to the
various virStorageVol APIs, such as inspecting backing chains or
internal snapshots.
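
Until such a mapping exists in the API, the domain-to-files direction can be
approximated by reading the domain XML. This is only a sketch: `disk_sources`
is a hypothetical helper, and it looks only at the `<source file=...>` and
`<source dev=...>` attributes of each `<disk>`, ignoring backing chains and
network disks.

```python
import xml.etree.ElementTree as ET


def disk_sources(domain_xml):
    """Return {target_dev: source_path} for the disks in a domain XML string."""
    root = ET.fromstring(domain_xml)
    result = {}
    for disk in root.findall("./devices/disk"):
        source = disk.find("source")
        target = disk.find("target")
        if source is None or target is None:
            continue  # e.g. an empty CD-ROM drive has no <source>
        path = source.get("file") or source.get("dev")
        if path:
            result[target.get("dev")] = path
    return result
```

With the Python bindings, the XML would typically come from
`dom.XMLDesc(0)` on a `virDomain` object.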

>>>
>>> Is it possible to store a single snapshot providing both the memory and
>>> disks state in a file (maybe a .qcow2 file)?
>>
>> Single file: no.  Single snapshot: yes.  The virDomainSnapshotCreateXML
>> API (exposed by virsh snapshot-create[-as]) is able to create a single
>> XML representation that libvirt uses to track the multiple files that
>> make up a snapshot including both memory and disk state.
> 
> This would be enough, as long as I'm able to read that information.

The 'system checkpoint' created by virDomainSnapshotCreateXML saves VM
state and disk state, but saves the VM state into an internal section of
a qcow2 file; it is done by the 'savevm' monitor command of qemu.  I'm
not sure if that is directly accessible (that is, I have no idea if
qemu-img exposes any way to read the VM state, which may mean that the
only way to read that state is to load the VM back into a fresh qemu
instance).

On the other hand, the VM-only save done by 'virsh save' uses the
migration to file, which is the same format used by 'virsh dump', and
that external file can be loaded up in crash analysis tools in order to
inspect RAM state at the time of the save, without having to worry about
extracting the VM state from an internal segment of a qcow2 disk file.
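
As a rough sketch of taking such an external memory picture through the
Python bindings: `dump_memory` and its file-naming scheme are hypothetical,
and `dom` is assumed to be a `libvirt.virDomain` (anything offering `name()`
and `coreDump(path, flags)` will do).

```python
import os


def dump_memory(dom, directory, label, flags=0):
    """Dump the guest's memory to <directory>/<domain>-<label>.dump.

    Wraps virDomainCoreDump (what 'virsh dump' uses); the resulting file
    lives on the host running the guest. Returns the path written.
    """
    path = os.path.join(directory, "%s-%s.dump" % (dom.name(), label))
    dom.coreDump(path, flags)  # the bindings raise libvirtError on failure
    return path
```

A domain handle can be obtained with
`libvirt.open("qemu:///system").lookupByName("guest")`, where the URI and the
guest name are placeholders.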

> 
>>
>>> Is there any way to get a unique interface which handles my snapshots?
>>>
>>> I used to call virDomainSnapshotCreateXML(), defining the destination
>>> file in the XML with <disk> elements.
>>
>> Right now, you have a choice:
>>
>> 1. Don't use the DISK_ONLY flag.  That means that you can't use <disk>
>> in the snapshot XML.  The snapshot then requires qcow2 files, the VM
>> state is saved (it happens to be in the first qcow2 disk), and the disk
>> state is saved via qcow2 internal snapshots.
> 
> That's good to know; only yesterday, reading your mails on the dev
> mailing list, did I work out where snapshots are stored. I had always
> looked for a new file, unsuccessfully.

There are multiple files involved in a snapshot: libvirt maintains an XML
file (this currently happens to live under /var/lib/libvirt/qemu/snapshot/,
but you shouldn't directly be probing for this file, so much as using the
libvirt API to get at the XML); the disk state is done with an internal snapshot of
each of the qcow2 disks, and the VM state is done with an internal
section of the first qcow2 disk.  'qemu-img info $first_disk' will show
you the size of the internal state, although like I said, I don't know
how to probe for the contents of that state short of starting a new qemu
instance.
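
A small sketch of getting at that XML through the API rather than the file
system, assuming `dom` is a `libvirt.virDomain` from the Python bindings;
`snapshot_descriptions` is a hypothetical helper name.

```python
def snapshot_descriptions(dom):
    """Return {snapshot_name: xml_description} for every snapshot of `dom`.

    Uses snapshotListNames() and snapshotLookupByName() on the domain and
    getXMLDesc() on each snapshot, instead of reading libvirt's own files.
    """
    return {
        name: dom.snapshotLookupByName(name, 0).getXMLDesc(0)
        for name in dom.snapshotListNames(0)
    }
```

The same information is available from the command line via
'virsh snapshot-list' and 'virsh snapshot-dumpxml'.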

> 
>>
>> 2. Use the DISK_ONLY flag.  Then you can set up external snapshot file
>> names (basically, another layer of qcow2 files, where your original file
>> name is now the snapshot backing the new layer), and you get no VM state
>> directly.  Here is where the <disk> element of the snapshot comes into
>> play.  If you _also_ want to save VM state, you can use 'virsh save'
>> (virDomainSaveFlags) to save just VM state; but then you have to
>> coordinate things so that the disk snapshot and the VM state correspond
>> to the same point in guest time by pausing the guest before taking the
>> disk snapshot.
>>
>>> After updating libvirt it was not working anymore, I thought was a bug but
>>> then I realized it was intentional.
>>> The function complains about the fact that the <disk> parameter is not
>>> accepted anymore.
>>
>> What versions of libvirt are you playing with?  <disk> was unrecognized
>> (and ignored) prior to 0.9.5; after that point, it can only be used with
>> the DISK_ONLY flag, but then you have to take the VM state separately.
>>
> 
> I guessed that the <disk> tag was just ignored; I inherited the code
> from a previous project and I spent the first weeks struggling with
> what was not working.
> I was using libvirt 0.8.3-5 from Debian squeeze; I migrated to wheezy
> to get access to libguestfs features, and now I'm running libvirt
> 0.9.11.

Then that explains the change in behavior :)  Libvirt is a fast-moving
target, and while I normally like the slow and cautious approach of
Debian stable, it sure makes using new virtualization features difficult.

> 
>> Someday, I'd like to submit more patches to allow <disk> to be mixed
>> with live VM state, but I'm not there yet.
>>
>>> So I started guessing how to solve it by reading the API documentation,
>>> and I fell into a completely nebulous world.
>>>
>>> For what I got:
>>> - virDomainSnapshotCreateXML():
>>> According to flags can take system checkpoints (really useful) and disks
>>> snapshots.
>>> System checkpoints: What I need but I didn't find any way to retrieve the
>>> storage file; I'm only able to get the snapshot pointer, quite useless as
>>> from its pointer I can only print the XML description.
>>
>> The storage is in internal snapshots of your qcow2 file.  Try 'qemu-img
>> info /path/to/file' to see those internal snapshots.  You can also use
>>
>> qemu-img convert -s snapname /path/to/image /path/to/output
> 
> Great! This is the information I needed: how to access those damned
> snapshots in a readable way.

That accesses just the disk information, not the VM state, but it's
better than nothing.
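
A hedged sketch of scripting that extraction for an offline guest. The helper
names are hypothetical; `-s` is the option quoted above, and the explicit
`-O` output format is an addition here (qemu-img writes raw by default).

```python
import subprocess


def extract_snapshot_cmd(snapname, image, output, out_fmt="raw"):
    """Build the qemu-img command that copies the disk contents of internal
    snapshot `snapname` out of `image` into the standalone file `output`."""
    return ["qemu-img", "convert", "-s", snapname, "-O", out_fmt, image, output]


def extract_snapshot(snapname, image, output, out_fmt="raw"):
    # Only safe while the guest using `image` is shut down.
    subprocess.check_call(extract_snapshot_cmd(snapname, image, output, out_fmt))
```

The extracted file holds disk state only, so it can be handed to a
filesystem inspection tool, but it says nothing about RAM.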

> I still believe anyway that an interface provided by libvirt would be
> really valuable.

So do I, which is why it is on my to-do list.

> I am trying to stay with libvirt as much as I can, to work at a higher
> level of abstraction; this is really important for my architecture.
> Using qemu directly is quite annoying, but if at the moment it is the
> only solution I'll move to that.
> 
>>
>> as a way to extract that snapshot into a file that you can use
>> independently (but only while your guest is not running); alas, qemu
>> doesn't yet provide a way to extract this information from a running
>> domain, nor have I yet had time to map this functionality into libvirt
>> API.  But I have requested qemu enhancements to allow it (perhaps by
>> qemu 1.2), as well as have a vision of where I plan to take the libvirt
>> API in the next year or so to make this more useful.
> 
> ATM I don't need "on the fly" extraction, everything is done after
> turning off the guest, what I need is a way to get its state whenever
> I want, reading it is another business.

Good to know, as qemu-img is a bit more powerful.  Also, knowing which
qemu-img features would be useful when wrapped by libvirt API will help
me in prioritizing which libvirt API I should be working on.

>>
>> virDomainCoreDump() and virDomainSaveFlags() both do a migration to
>> file, it's just that the core dump version isn't designed for reverting
>> like domain save.  And there have been proposals on list for making core
>> dump management better (such as allowing you to get at a core dump file
>> from a remote machine; right now, the core dump can only be stored on
>> the same machine as the guest being dumped).
>>
> 
> I think this is bad; why not merge these functions? From an API
> point of view (I'm aware of the architectural difficulties in the
> background) this is useless and confusing.
> Better a single virDomainSaveFlags() with several configuration options
> than two different functions.

There is a proposal on list to make virDomainCoreDump() dump _just_
memory, in ELF format; whereas virDomainSaveFlags() will always dump all
VM state (RAM but also device state) needed to restart a VM from the
same state.  Just because the current implementation of the two
functions uses migrate-to-file as the underlying glue does not mean that
this will always be the case, as the two functions really are for
different purposes (saving a VM in order to resume it, vs. dumping just
the memory state of a VM for analysis).

> 
> This is a complex system-management API, so anyone who'll use it will
> be conscious of that; I don't think that many flags and XML files will
> be confusing as long as they're clearly documented.
> But having multiple interfaces (alias functions/methods) that
> basically do the same in a different way with slightly different
> results is really confusing.

Unfortunately true, but we're stuck with historical naming - libvirt
will not ever remove a function that has been previously exported in the
API.  It is made worse by the fact that some of the older functions are
not easily extensible (such as lacking a flags argument), so we have to
add new API to fill in the gaps.  But when I do add new API, I do try to
cross-reference the similar functions; any documentation that points in
both directions between the older and simpler name and the newer and
more powerful name is worthwhile.

> 
> Better have a single method configurable with XML files and flags than
> two without.
> (This is a point of view of mine as a developer).
> 
>>> (other functions really similar)
>>>
>>> The question is: why all this confusion?
>>
>> Different qemu capabilities ('savevm', 'migrate',
>> 'blockdev-snapshot-sync'), different needs at the time each feature was
>> first added, etc.  Ultimately, virDomainSnapshotCreateXML is the most
>> powerful interface, and I have plans to make it be a superset of
>> virDomainSaveFlags(), but I'm not there yet.
> 
> Indeed it is a powerful interface!
> But again: why do those functions do something similar?
> Wouldn't it be better (again from an API point of view) to have:
> - Two different interfaces that handle separately disk and state; if I
> want to revert/migrate I need to give both the results of those
> interfaces.

Not necessarily, since having two APIs instead of one means you then have to
worry about atomicity (you don't want the guest to be running between
your two API calls, otherwise, you can't grab a combined picture of the
machine at one point in time).
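
To illustrate the atomicity concern with today's two-call approach, here is a
sketch that pauses the guest first, so that the disk-only snapshot and the
saved VM state describe the same instant. It assumes `dom` is a
`libvirt.virDomain`; the helper names are hypothetical, and the constant
mirrors `libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY`, which should be
preferred when the bindings are imported.

```python
import xml.etree.ElementTree as ET

# Mirrors libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY (assumed value 1 << 4).
VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY = 16


def disk_only_snapshot_xml(name, disks):
    """Build snapshot XML with one external <disk> per {target_dev: new_file}."""
    root = ET.Element("domainsnapshot")
    ET.SubElement(root, "name").text = name
    disks_el = ET.SubElement(root, "disks")
    for dev, path in sorted(disks.items()):
        disk = ET.SubElement(disks_el, "disk", name=dev, snapshot="external")
        ET.SubElement(disk, "source", file=path)
    return ET.tostring(root, encoding="unicode")


def consistent_picture(dom, name, disks, state_path):
    """Pause, snapshot the disks, then save VM state to `state_path`."""
    dom.suspend()  # the guest must not run between the two calls
    try:
        dom.snapshotCreateXML(disk_only_snapshot_xml(name, disks),
                              VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
        dom.save(state_path)  # like 'virsh save': the guest stops afterwards
    except Exception:
        dom.resume()  # leave the guest running if either step failed
        raise
```

Because the save stops the guest, continuing the test afterwards would mean
restoring from `state_path`.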

> - One interface that gives revert/migrate capability and one interface
> that, via flags or XML, gives the separate components.
> - Other combinations: may still be done as long as they're clear.
> 
> Here we have:
> virDomainSnapshotCreateXML():
> Takes complete snapshots, disk snapshots, no single-memory snapshots.

No single-memory snapshots, yet.  It's coming.

> It does it keeping the domain alive (paused). It stores information
> internally if flags=A, externally if flags=B.

It will someday also be possible to keep the domain alive and running
(as is the case with live migration between machines).

> virDomainSaveFlags():
> No complete snapshots, no disk snapshots, takes memory snapshots.
> It does it stopping the domain. It stores information externally.
> 
> From my point of view this can be done in a clearer way.
> Separation of duties: many small functions that do a single thing and
> a global one that wraps everything giving a complete package is a good
> example.
> 
>>
>>> I absolutely understand the problems that implementing multiplatform
>>> snapshot management raises; but I think that, for an API, what is
>>> implemented here is completely confusing for the developer.
>>
>> Is there anything I can do to help improve the documentation?  I know
>> that's an area where a picture can speak a thousand words; and as more
>> features get added into the snapshot picture, it probably becomes more
>> important to accurately display the various APIs and the advantages for
>> using each.
> 
> What's basically missing is data flow and representation; you don't
> need to write it from scratch (the data format is strongly bound to
> qemu), but give an idea and provide references.
> What makes documentation relevant is how easily the reader can get
> at the information: it must be easy!
> If snapshots rely on qemu ones, just link qemu's documentation as well,
> so if I don't find enough clues I can dig as deep as needed.
> 
> If help is needed I can contribute, I just need to know where to look
> at when I need something. The platforms I'm working on are really
> time-consuming but this technology is really important under my point
> of view as it will reduce the time to maintain them.
> 
> Thanks again for answering so fast!
> 
> NoxDaFox
> 
> PS: I forgot to CC the libvirt-users mailing list, sorry for the spam.
> 

-- 
Eric Blake   eblake at redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
