[libvirt-users] Snapshot system: really confusing.

Sat Apr 28 13:04:03 UTC 2012

On 27/04/12 17:56, Eric Blake wrote:
> On 04/25/2012 03:25 AM, NoxDaFox wrote:
>> Hello,
>>
>> thank you for your fast reply!
>> To make things easier to understand, I'll explain a bit what I am trying
>> to achieve, so the picture will be clearer.
>> Basically it's a platform for debugging; what I need is access to the
>> memory dump image and to the FS ones; I don't need any reverting
>> support.
>> The common lifecycle of a domain is:
>>
>> a) Given a backing store disk (qcow2) I create a new disk image: my_image.qcow2
>> b) I start this image and play around based on a persistent domain.
>> c) I take several pictures (snapshots, dumps) of the VE state: I need
>> at least readable pictures of the FileSystem and the RAM.
>> d) I shutdown the guest.
>> e) I extract valuable information from the pictures. This is the
>> critical phase where all my doubts on libvirt platform come from.
>> f) I store those information.
>> g) I wipe out everything else.
>> h) Ready for a new test, return to point a).
> Thanks; that helps in knowing what you are trying to accomplish - it
> sounds like you are intentionally forking machine state from a known
> point in time (the offline backing store disk), and then want to compare
> the running state of those forks by snooping both the RAM and disk state
> at the time of the snapshot series you took along each branch of the fork.
>
That's it! Really simple. From a known state I want to run a scenario
(an arbitrary piece of code) and track the differences in disk and
memory state.
I need a way to capture those states and to access the deltas
conveniently.

I think a valuable feature that virtualization technology could
offer is the ability to study code behavior for debugging or reverse
engineering purposes. The problem is that there are no interfaces
capable of doing so yet.
All the available solutions provide snapshots only to revert a system to
a certain state, not to use them for analysis.

This could be something that differentiates libvirt from other products.

To be clearer, imagine this situation:
I have a piece of software (MySoft) and, since it works on the Windows
registry, I need to check that it won't break the OS. Once run, MySoft
cleans up the Windows registry, writes a log and stops.
The scenario I want to represent is the installation of MySoft and one
execution of it.
The information I need is: which files are created/modified once MySoft
is installed, the memory changes while MySoft is running, which files
are modified/removed once MySoft has run, and the final log file.

Here's the test's lifecycle:
- I set up a disk: Windows Vista Home Edition SP1 fully updated.
- I put MySoft installer in the disk's FS.
- I start the guest.
- Once Windows is up and running I take a disk state snapshot (DISK#1).
- I run the installer; once MySoft is installed I take a second disk
snapshot (DISK#2).
- Windows is idle, I take a memory state snapshot (MEM#1).
- I start MySoft, I take 3 more memory snapshots (MEM#2, MEM#3, MEM#4).
- MySoft closes gracefully, I take both memory and disk snapshots
(DISK#3, MEM#5).
- I shut down the guest.
- I release the resources.

At this point I have several pictures of MySoft's behavior:
I can see which files are created/modified after installation by
mounting the DISK#2 snapshot and comparing it with DISK#1.
I can see which files are modified/removed after MySoft's execution by
comparing DISK#3 and DISK#2.
I can check whether all the memory resources are given back to the OS by
comparing MEM#5 and MEM#1.
I can look at the memory evolution through MEM#2, MEM#3, MEM#4.
I can extract the log file from DISK#3.
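
The disk comparisons above could be sketched roughly as follows. This is
only an illustration: it assumes the two disk states have already been
mounted read-only somewhere on the host (for example with libguestfs),
and the function names are made up for this example.

```python
import hashlib
import os


def _file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _index_tree(root):
    """Map each regular file (path relative to root) to its digest."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only file contents are compared.
            if os.path.isfile(full) and not os.path.islink(full):
                index[os.path.relpath(full, root)] = _file_digest(full)
    return index


def diff_trees(before_root, after_root):
    """Compare two mounted disk states, e.g. DISK#1 against DISK#2."""
    before = _index_tree(before_root)
    after = _index_tree(after_root)
    common = set(before) & set(after)
    return {
        "created": sorted(set(after) - set(before)),
        "modified": sorted(p for p in common if before[p] != after[p]),
        "removed": sorted(set(before) - set(after)),
    }
```

Calling diff_trees() on the DISK#1 and DISK#2 mount points would give the
files touched by the installer; DISK#2 against DISK#3 gives the files
touched by one run of MySoft.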

I hope this example helps to explain what I want to achieve.
I think that using virtualization in this way could be really useful
for developers!
>> libvirt is a great platform! And the documentation is not bad at all.
>> The only nebulous part is the snapshot part (I guessed it was under
>> heavy development anyway).
> Always nice to hear from happy customers, and yes, we are trying to
> improve things.  And as always with open source projects, patches are
> welcome, even for things as mundane as documentation.
>
>> Now I'll answer to your points.
>>
>>> Hopefully libvirt can supply what you are interested in; and be aware
>>> that it is a work in progress (that is, unreleased qemu 1.1 and libvirt
>>> 0.9.12 will be adding yet more features to the picture, in the form of
>>> live block copy).
>> I knew the situation, I saw a RFC of yours that is really interesting
>> as it introduces snapshots of single storage volumes:
>> virStorageVolSnapshotPtr.
>> Really interesting under my point of view.
> Yep, but unfortunately still several months down the road.  I basically
> envision a way to map between virDomainPtr and virStorageVolPtr (given a
> domain, return a list of the storage volumes it is using; and given a
> storage volume, return a list of any domains referring to that volume).
> Once that mapping is in place, then we can expose more power to the
> various virStorageVol APIs, such as inspecting backing chains or
> internal snapshots.
The easiest way to track state differences at the moment is, given
a snapshot and the image from which it was generated, to create a disk
image, mount it and compare it with the original backing one.
Consistency problems may be solved by flushing all the buffers to disk
and pausing the guest before snapshotting.
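
A minimal sketch of the "pause, snapshot, resume" idea with the Python
bindings, assuming `dom` is a virDomain object obtained elsewhere. The
flag value is copied from libvirt's public header (with the bindings
installed, libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY is preferable),
and the helper name is made up. Pausing does not flush the guest's own
caches, so treating this as "consistent enough" is an assumption.

```python
from xml.sax.saxutils import escape

# Assumption: numeric value of VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY.
VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY = 16


def paused_disk_snapshot(dom, name):
    """Take a disk-only snapshot while the guest is paused.

    `dom` is expected to behave like a libvirt virDomain object.
    The guest is resumed even if the snapshot call raises.
    """
    xml = "<domainsnapshot><name>%s</name></domainsnapshot>" % escape(name)
    dom.suspend()
    try:
        return dom.snapshotCreateXML(xml, VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
    finally:
        dom.resume()
```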

What I'm missing now is a libvirt interface that lets me go from a
snapshot to a volume.
It would be fantastic to have a mapping such that, from a disk snapshot
pointer, I can generate through the volume pool a new volume that I can
track with the pool itself.
This would make a system really easy to maintain: I have a
snapshot list, I choose the desired ones, I generate new volumes, I
compare those volumes using libguestfs (or any other tool), I
collect the desired information, I clean up the pool and I destroy the
snapshot list.
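
Until such an interface exists, the "snapshot to standalone image" step
might be approximated outside libvirt. The sketch below assumes internal
qcow2 snapshots, a guest that is already shut down, and a qemu-img whose
convert subcommand accepts `-s <snapshot name>` (newer releases may
prefer `-l`). The function names and the `run` parameter are invented
for illustration.

```python
import os
import subprocess


def export_snapshot_cmd(image, snapshot, out_path):
    """Build a qemu-img command that copies one internal snapshot
    of `image` into a new standalone qcow2 file."""
    return ["qemu-img", "convert", "-f", "qcow2", "-O", "qcow2",
            "-s", snapshot, image, out_path]


def export_snapshots(image, snapshots, out_dir, run=subprocess.check_call):
    """Export each chosen snapshot; return {snapshot name: new image path}.

    `run` executes one command (a list of arguments); it is a parameter
    so the commands can be inspected without actually calling qemu-img.
    """
    exported = {}
    for name in snapshots:
        out_path = os.path.join(out_dir, name + ".qcow2")
        run(export_snapshot_cmd(image, name, out_path))
        exported[name] = out_path
    return exported
```

The resulting images could then be mounted and compared, and the output
directory removed once the desired information has been collected.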
>>>> Is it possible to store a single snapshot providing both the memory and
>>>> disks state in a file (maybe a .qcow2 file)?
>>> Single file: no.  Single snapshot: yes.  The virDomainSnapshotCreateXML
>>> API (exposed by virsh snapshot-create[-as]) is able to create a single
>>> XML representation that libvirt uses to track the multiple files that
>>> make up a snapshot including both memory and disk state.
>> This would be enough, as long as I'm able to read those information.
> The 'system checkpoint' created by virDomainSnapshotCreateXML saves VM
> state and disk state, but saves the VM state into an internal section of
> a qcow2 file; it is done by the 'savevm' monitor command of qemu.  I'm
> not sure if that is directly accessible (that is, I have no idea if
> qemu-img exposes any way to read the VM state, which may mean that the
> only way to read that state is to load the VM back into a fresh qemu
> instance).
This is really bad, as I need to inspect this memory.
As I said: whether disk and memory states are taken separately or
together is not relevant at the moment; I just need an elegant way to
implement it. Something easy to use programmatically (I won't be the
final user of the platform).
>
> On the other hand, the VM-only save done by 'virsh save' uses the
> migration to file, which is the same format used by 'virsh dump', and
> that external file can be loaded up in crash analysis tools in order to
> inspect RAM state at the time of the save, without having to worry about
> extracting the VM state from an internal segment of a qcow2 disk file.
So I guess virDomainSave doesn't use the qemu savevm command. But
anyway, this function is useless from my point of view because it stops
the guest.
>>>> Is there any way to get a unique interface which handles my snapshots?
>>>>
>>>> I used to use virDomainSnapshotCreateXML(), defining the destination
>>>> file in the XML with <disk> fields.
>>> Right now, you have a choice:
>>>
>>> 1. Don't use the DISK_ONLY flag.  That means that you can't use <disk>
>>> in the snapshot XML.  The snapshot then requires qcow2 files, the VM
>>> state is saved (it happens to be in the first qcow2 disk), and the disk
>>> state is saved via qcow2 internal snapshots.
>> That's good to know, I just got yesterday, reading your mails in the
>> dev mailing list, where snapshots were stored; I always tried to look
>> for a new file, unsuccessfully.
> There's multiple files involved in a snapshot: libvirt maintains an XML
> file (this currently happens to live in /etc/libvirt/, but you shouldn't
> directly be probing for this file, so much as using the libvirt API to
> get at the XML); the disk state is done with an internal snapshot of
> each of the qcow2 disks, and the VM state is done with an internal
> section of the first qcow2 disk.  'qemu-img info $first_disk' will show
> you the size of the internal state, although like I said, I don't know
> how to probe for the contents of that state short of starting a new qemu
> instance.
The point is that this XML doesn't track the files involved, so I have
no clue where to get them.
And I still just need to analyze the memory state, nothing else.
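
For reference, a rough sketch of how the snapshot XML can be fetched
through the API rather than from the file system, and how any external
disk files it names can be pulled out. It assumes `dom` behaves like a
virDomain object from the Python bindings; the helper names are made up.
For internal snapshots the XML lists no file, which matches the problem
described above.

```python
import xml.etree.ElementTree as ET


def snapshot_descriptions(dom):
    """Return {snapshot name: snapshot XML} for every snapshot of `dom`."""
    descriptions = {}
    for name in dom.snapshotListNames(0):
        snap = dom.snapshotLookupByName(name, 0)
        descriptions[name] = snap.getXMLDesc(0)
    return descriptions


def external_files(snapshot_xml):
    """List (disk name, file path) pairs for disks that carry a
    <source file='...'/> element in the snapshot XML."""
    root = ET.fromstring(snapshot_xml)
    files = []
    for disk in root.findall("./disks/disk"):
        source = disk.find("source")
        if source is not None and source.get("file"):
            files.append((disk.get("name"), source.get("file")))
    return files
```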
> ... <cut> ...
>>> virDomainCoreDump() and virDomainSaveFlags() both do a migration to
>>> file, it's just that the core dump version isn't designed for reverting
>>> like domain save.  And there have been proposals on list for making core
>>> dump management better (such as allowing you to get at a core dump file
>>> from a remote machine; right now, the core dump can only be stored on
>>> the same machine as the guest being dumped).
>>>
>> I think this is bad, why not merge these functions? From an API
>> point of view (I'm aware of the architectural difficulties in the
>> background) this is useless and confusing.
>> Better a virDomainSaveFlags() with several options than two
>> different functions.
> There is a proposal on list to make virDomainCoreDump() dump _just_
> memory, in ELF format; whereas virDomainSaveFlags() will always dump all
> VM state (RAM but also device state) needed to restart a VM from the
> same state.  Just because the current implementation of the two
> functions uses migrate-to-file as the underlying glue does not mean that
> this will always be the case, as the two functions really are for
> different purposes (saving a VM in order to resume it, vs. dumping just
> the memory state of a VM for analysis).
By devices, do you mean disks or other devices (network cards, USB
sticks)?
For the moment my first priorities are the disk (my guests run on a
single virtual disk, I don't need to represent multiple disks or
partitions in my scenario) and the memory state.
Then we can also think about buffers, devices and so on; but this is
more complex and less important.

The important thing is having something clear to use for analysis: with
function A I get a readable disk state, with B I get a memory one.
From what I understand now, virDomainCoreDump() is what I need, as it
gives an ELF image with the memory state, right?
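
A minimal sketch of taking the memory pictures (MEM#1 ... MEM#5) this
way, assuming `dom` behaves like a virDomain object from the Python
bindings. The flag value is copied from libvirt's public header (with
the bindings installed, libvirt.VIR_DUMP_LIVE is preferable) and the
helper name is invented. Note that, per the discussion above, the ELF
output is still a proposal; the current dump uses the migration-to-file
format.

```python
import os

# Assumption: numeric value of VIR_DUMP_LIVE, which asks libvirt to
# keep the guest running while the dump is written.
VIR_DUMP_LIVE = 2


def dump_memory(dom, out_dir, label, live=True):
    """Write a core dump of `dom` to <out_dir>/<label>.dump.

    Returns the path of the dump file. The file is created on the host
    that runs the guest, so out_dir must be a path on that host.
    """
    path = os.path.join(out_dir, label + ".dump")
    flags = VIR_DUMP_LIVE if live else 0
    dom.coreDump(path, flags)
    return path
```

Calling dump_memory(dom, out_dir, "MEM#2") while MySoft is running would
then leave the guest up for the following pictures.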
>
>> This is a complex system-management API, so anyone who'll use it will
>> be conscious of that; I don't think that many flags and XML files will
>> be confusing as long as they're clearly documented.
>> But having multiple interfaces (alias functions/methods) that
>> basically do the same in a different way with slightly different
>> results is really confusing.
> Unfortunately true, but we're stuck with historical naming - libvirt
> will not ever remove a function that has been previously exported in the
> API.  It is made worse by the fact that some of the older functions are
> not easily extensible (such as lacking a flags argument), so we have to
> add new API to fill in the gaps.  But when I do add new API, I do try to
> cross-reference the similar functions; any documentation that points in
> both directions between the older and simpler name and the newer and
> more powerful name is worthwhile.
Fully understandable, you cannot brutally remove a deprecated function,
but marking it in the documentation as "deprecated, use XXX instead"
would be clearer.

Basically my idea is to use libvirt as an analysis tool, but for the
moment the project is focusing only on the migration and reverting point
of view. This is fully understandable, as all the virtualization tools
base their business on those features, but I think that
adding analysis support as well could be really valuable for this
product (my 2 cents of course).

NoxDaFox.