[libvirt] [RFC v2] external (pull) backup API
John Snow
jsnow at redhat.com
Thu Apr 12 23:53:17 UTC 2018
On 04/12/2018 08:57 AM, Nikolay Shirokovskiy wrote:
>
>
> On 12.04.2018 07:14, John Snow wrote:
>>
>>
>> On 04/11/2018 12:32 PM, Eric Blake wrote:
>>> On 04/03/2018 07:01 AM, Nikolay Shirokovskiy wrote:
>>>> Hi, all.
>>>>
>>>> This is another RFC on the pull backup API. This API provides means to read domain
>>>> disks in a snapshotted state so that a client can back them up, as well as means
>>>> to write domain disks to revert them to the backed-up state. The previous version
>>>> of the RFC is [1]. I'll also describe the API implementation details to shed light
>>>> on the usage of misc qemu dirty bitmap commands.
>>>
>
>
>
> [snip]
>
>
>>>>
>>>> Qemu can track which disk blocks have changed from the snapshotted state, so on the next
>>>> backup the client can back up only the changed blocks. virDomainBlockSnapshotCreateXML
>>>> accepts a VIR_DOMAIN_BLOCK_SNAPSHOT_CREATE_CHECKPOINT flag to turn this option on
>>>> for a snapshot, which means tracking changes from this particular snapshot. I used
>>>> the term checkpoint and not [dirty] bitmap because in the current implementation many
>>>> qemu dirty bitmaps are used to provide the changed blocks from a given checkpoint to
>>>> the current snapshot (see the *Implementation* section for more details). Also, a
>>>> bitmap keeps block changes and thus itself changes over time, while a checkpoint is
>>>> a more static term: it means you can query changes from that moment in time.
>>>>
>>>> Checkpoints are visible in active domain xml:
>>>>
>>>> <disk type='file' device='disk'>
>>>>   ..
>>>>   <target dev='sda' bus='scsi'/>
>>>>   <alias name='scsi0-0-0-0'/>
>>>>   <checkpoint name="93a5c045-6457-2c09-e56c-927cdf34e178"/>
>>>>   <checkpoint name="5768a388-c1c4-414c-ac4e-eab216ba7c0c"/>
>>>>   ..
>>>> </disk>
>>>>
>>
>> It makes sense to avoid the bitmap name in libvirt, but do these indeed
>> correlate 1:1 with bitmaps?
>>
>> I assume each bitmap will have name=%%UUID%% ?
>
> There is a 1:1 correlation but the names are different. Check out the checkpoints subsection
> of the *implementation details* section below for the naming scheme.
>
Yeah, I saw later. You have both "checkpoints" (associated with bitmaps)
and then the bitmaps themselves.
>>
>>>> Every checkpoint requires a qemu dirty bitmap, which eats 16MiB of RAM for a 1TiB disk
>>>> with the default dirty block size of 64KiB, and the same amount of disk space is used.
>>>> So the client needs to manage checkpoints and delete unused ones. Thus the next API function:
>>>>
>>>>
>
>
>
> [snip]
>
>
>
>>>> First a few facts about qemu dirty bitmaps.
>>>>
>>>> A bitmap can be either in the active or the disabled state. In the disabled state it does not
>>>> get changed on guest writes, while in the active state it tracks guest
>>>> writes. This implementation uses an approach with only one active bitmap at
>>>> a time. This should reduce guest write penalties in the presence of
>>>> checkpoints. So on the first snapshot we create bitmap B1. Now it tracks changes
>>>> from snapshot 1. On the second snapshot we create bitmap B2 and disable bitmap
>>>> B1, and so on. Now bitmap B1 keeps changes from snapshot 1 to snapshot 2, B2
>>>> keeps changes from snapshot 2 to snapshot 3, and so on. The last bitmap is active and
>>>> collects disk changes after the latest snapshot.
>>
>> So you are trying to optimize away write penalties if you have, say, ten
>> bitmaps representing checkpoints so we don't have to record all new
>> writes to all ten.
>>
>> This makes sense, and I would have liked to formalize the concept in
>> QEMU, but response to that idea was very poor at the time.
>>
>> Also my design was bad :)
>>
>>>>
>>>> Getting the changed-blocks bitmap from some checkpoint in the past up to the current snapshot
>>>> is quite simple in this scheme. For example, if the last snapshot is 7, then
>>>> to get changes from snapshot 3 to the latest snapshot we need to merge bitmaps B3,
>>>> B4, B5 and B6. Merge is just a logical OR on the bitmap bits.
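A minimal Python sketch of this merge, modeling each bitmap as a set of dirty block indexes (the block numbers and snapshot indexes are illustrative only; the real implementation merges qemu bitmaps via QMP):

    # Toy model: bitmap B_i holds the blocks dirtied between snapshot i and i+1.
    bitmaps = {
        3: {10, 11},    # hypothetical dirty blocks between snapshots 3 and 4
        4: {11, 200},   # between snapshots 4 and 5
        5: set(),       # between snapshots 5 and 6
        6: {7},         # between snapshots 6 and 7 (the latest)
    }

    def changed_blocks(since, latest, bitmaps):
        """Blocks changed from snapshot 'since' up to snapshot 'latest'."""
        result = set()
        for i in range(since, latest):
            result |= bitmaps[i]    # merge is just a logical OR of the bitmap bits
        return result

    print(sorted(changed_blocks(3, 7, bitmaps)))    # -> [7, 10, 11, 200]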
>>>>
>>>> Deleting a checkpoint somewhere in the middle of the checkpoint sequence requires
>>>> merging the corresponding bitmap into the previous bitmap in this scheme.
>>>>
>>
>> Previous, or next?
>
> In short previous.
>
>>
>> Say we've got bitmaps (in chronological order from oldest to newest)
>>
>> A B C D E F G H
>>
>> and we want to delete bitmap (or "checkpoint") 'C':
>>
>> A B D E F G H
>>
>> the bitmap representing checkpoint 'D' should now contain the bits that
>> used to be in 'C', right? That way all the checkpoints still represent
>> their appropriate points in time.
>
> I merge to previous due to the definition above. "A" contains changes from
> point in time A to point in time B, and so on. So if you delete C, then in
> order for B to keep changes from point in time B to point in time D
> (the next in the checkpoint chain) you need to merge C into B.
>
I'm not sure the way it's explained here makes sense to me, but
Vladimir's explanation does.
>>
>>
>> The only problem comes when you delete a checkpoint on the end and the
>> bits have nowhere to go:
>>
>> A B C
>>
>> A B _
>>
>> In this case you really do lose a checkpoint -- but depending on how we
>> annotate this, it may or may not be possible to delete the most recent
>> checkpoint. Let's assume that the currently active bitmap that doesn't
>> represent *any* point in time yet (because it's still active and
>> recording new writes) is noted as 'X':
>>
>> A B C X
>>
>> If we delete C now, then, that bitmap can get re-merged into the *active
>> bitmap* X:
>>
>> A B _ X
>
> You can delete any bitmap (and accordingly any checkpoint). If the checkpoint
> is the last one we just merge the last bitmap into the previous one and additionally
> make the previous bitmap active.
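In the same toy model (set-based bitmaps, hypothetical names), deleting the newest checkpoint could look roughly like this; the real code would issue the x-vz-block-dirty-bitmap-merge/-enable QMP commands instead:

    def delete_last_checkpoint(bitmaps, order):
        # 'order' is the checkpoint chain from oldest to newest; 'bitmaps' maps
        # checkpoint name -> set of dirty block indexes (toy model only).
        last = order.pop()                    # bitmap of the newest checkpoint
        prev = order[-1]                      # bitmap of the previous checkpoint
        bitmaps[prev] |= bitmaps.pop(last)    # merge last into previous (logical OR)
        return prev                           # caller re-activates this bitmap in qemu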
>
>>
>>>> We use persistent bitmaps in the implementation. This means that upon qemu process
>>>> termination bitmaps are saved in the disk images' metadata and restored back on
>>>> qemu process start. This makes checkpoints a persistent property, that is, we
>>>> keep them across domain starts/stops. Qemu does not try hard to keep bitmaps.
>>>> If something goes wrong upon save, the bitmap is dropped. The same applies to the
>>>> migration process too. For the backup process this is not critical. If we don't
>>>> discover a checkpoint we can always make a full backup. Also qemu provides no
>>>> special means to track the order of bitmaps. These facts are critical for an
>>>> implementation with one active bitmap at a time. We need the right order of bitmaps upon
>>>> merge - for snapshot N and block changes from snapshot K, K < N, to N we need
>>>> to merge bitmaps B_{K}, ..., B_{N-1}. Also if one of the bitmaps to be merged
>>>> is missing we can't calculate the desired block changes either.
>>>>
>>
>> Right. A missing bitmap anywhere in the sequence invalidates the entire
>> sequence.
>>
>>>> So the implementation encodes the bitmap order in their names. For snapshot A1, the bitmap
>>>> name will be A1; for snapshot A2 the bitmap name will be A2^A1, and so on. Using this naming
>>>> encoding, upon domain start we can find out the bitmap order and check for missing
>>>> ones. This complicates bitmap removal a bit, though. For example removing
>>>> a bitmap somewhere in the middle looks like this:
>>>>
>>>> - removing bitmap K (whose name is NAME_{K}^NAME_{K-1}):
>>>>   - create a new bitmap named NAME_{K+1}^NAME_{K-1}
>>>>   - disable the new bitmap
>>>>   - merge bitmap NAME_{K+1}^NAME_{K} into the new bitmap
>>>>   - remove bitmap NAME_{K+1}^NAME_{K}
>>>>     (the four steps above effectively rename bitmap K+1 to comply with the naming scheme)
>>>>   - merge bitmap NAME_{K}^NAME_{K-1} into NAME_{K-1}^NAME_{K-2}
>>>>   - remove bitmap NAME_{K}^NAME_{K-1}
>>>>
>>>> As you can see we need to change the name of bitmap K+1 to keep our bitmap
>>>> naming scheme. This is done by creating a new K+1 bitmap with the appropriate name
>>>> and copying the old K+1 bitmap into the new one.
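A rough Python sketch of that middle-of-chain deletion, building the QMP commands implied by the proposed naming scheme (the helper and checkpoint names are illustrative; the x-vz-* commands are the experimental ones listed later in the mail):

    def bitmap_name(ckpt, parent=None):
        # Naming scheme from the proposal: "libvirt-<checkpoint>^<previous checkpoint>"
        return f"libvirt-{ckpt}^{parent}" if parent else f"libvirt-{ckpt}"

    def delete_middle_checkpoint(node, prev2, prev, victim, nxt):
        """QMP commands (as plain dicts) to delete checkpoint 'victim' sitting between
        'prev' and 'nxt' in the chain; 'prev2' is the checkpoint before 'prev'."""
        new_next  = bitmap_name(nxt, prev)     # bitmap K+1 under its new name
        old_next  = bitmap_name(nxt, victim)   # bitmap K+1 under its old name
        victim_bm = bitmap_name(victim, prev)  # bitmap K
        prev_bm   = bitmap_name(prev, prev2)   # bitmap K-1
        return [
            {"execute": "transaction", "arguments": {"actions": [
                {"type": "block-dirty-bitmap-add",
                 "data": {"node": node, "name": new_next, "persistent": True}},
                {"type": "x-vz-block-dirty-bitmap-disable",
                 "data": {"node": node, "name": new_next}},
                {"type": "x-vz-block-dirty-bitmap-merge",
                 "data": {"node": node, "src_name": old_next, "dst_name": new_next}},
                {"type": "x-vz-block-dirty-bitmap-merge",
                 "data": {"node": node, "src_name": victim_bm, "dst_name": prev_bm}},
            ]}},
            {"execute": "x-vz-block-dirty-bitmap-remove",
             "arguments": {"node": node, "name": old_next}},
            {"execute": "x-vz-block-dirty-bitmap-remove",
             "arguments": {"node": node, "name": victim_bm}},
        ]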
>>>>
>>
>> That seems... unfortunate. A record could be kept in libvirt instead,
>> couldn't it?
>>
>> A : Bitmap A, Time 12:34:56, Child of (None), Parent of B
>> B : Bitmap B, Time 23:15:46, Child of A, Parent of (None)
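If libvirt tracked this itself, a per-checkpoint record might look like the following (purely illustrative Python, not an existing libvirt structure):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Checkpoint:
        """Hypothetical libvirt-side record tying a checkpoint to its qemu bitmap."""
        name: str                      # checkpoint UUID exposed to the client
        bitmap: str                    # bitmap name stored in the qcow2 file
        created: str                   # e.g. "2018-04-12T23:53:17Z"
        parent: Optional[str] = None   # previous checkpoint in the chain, if any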
>
> Yes it is possible. I was reluctant to implement this way for a couple of reasons:
>
> - if bitmap metadata is in libvirt we need to carefully design it for
>   things like libvirtd crashes. If the metadata is out of sync with qemu then we can get
>   broken incremental backups. One possible design (sketched after this list) is:
>
>   - on bitmap deletion, save metadata after deleting the bitmap in qemu; in case
>     of a libvirtd crash in between, upon libvirtd restart we can drop bitmaps
>     that are in the metadata but not in qemu as already deleted
>
>   - on bitmap add (creating a new snapshot with a checkpoint), save metadata with the bitmap
>     before creating the bitmap in qemu; then again we have a way to handle libvirtd crashes
>     in between
>
> So this approach has tricky parts too. The suggested approach uses qemu
> transactions to keep bitmaps consistent.
>
> - I don't like another piece of metadata which looks like it belongs to the disks and not
>   the domain. It is like keeping disk size in the domain xml.
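A minimal sketch of the ordering rules from the first point above, using hypothetical qemu/store helpers (illustration only, not libvirt code):

    # 'qemu' issues QMP commands, 'store' persists libvirt-side checkpoint metadata.

    def delete_checkpoint(qemu, store, ckpt):
        qemu.remove_bitmap(ckpt.bitmap)   # 1. remove the bitmap in qemu first
        store.remove(ckpt)                # 2. then drop the metadata record
        # A crash between 1 and 2 leaves a record whose bitmap is gone;
        # on restart such records can be treated as already deleted.

    def create_checkpoint(qemu, store, ckpt):
        store.add(ckpt)                   # 1. record the checkpoint first
        qemu.add_bitmap(ckpt.bitmap)      # 2. then create the bitmap in qemu
        # A crash between 1 and 2 leaves a record whose bitmap never appeared;
        # on restart such records can simply be dropped.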
>
Yeah, I see... Having to rename bitmaps in the middle of the chain seems
unfortunate, though...
and I'm still a little wary of using the names as important metadata to
be really honest. It feels like a misuse of the field.
>>
>> I suppose in this case you can't *reconstruct* this information from the
>> bitmap stored in the qcow2, which necessitates your naming scheme...
>>
>> ...Still, if you forego this requirement, deleting bitmaps in the middle
>> becomes fairly easy.
>>
>>>> So while it is possible to have only one active bitmap at a time, it costs
>>>> some exercises at the management layer. To me it looks like qemu itself is a better
>>>> place to track the bitmap chain's order and consistency.
>>>
>>
>> If this is a hard requirement, it's certainly *easier* to track the
>> relationship in QEMU ...
>>
>>> Libvirt is already tracking a tree relationship between internal
>>> snapshots (the virDomainSnapshotCreateXML), because qemu does NOT track
>>> that (true, internal snapshots don't get as much attention as external
>>> snapshots) - but the fact remains that qemu is probably not the best
>>> place to track relationship between multiple persistent bitmaps, any
>>> more than it tracks relationships between internal snapshots. So having
>>> libvirt track relations between persistent bitmaps is just fine. Do we
>>> really have to rename bitmaps in the qcow2 file, or can libvirt track it
>>> all on its own?
>>>
>>
>> This is a way, really, of storing extra metadata by using the bitmap
>> name as arbitrary data storage.
>>
>> I'd say either we promote QEMU to understanding checkpoints, or enhance
>> libvirt to track what it needs independent of QEMU -- but having to
>> rename bitmaps smells fishy to me.
>>
>>> Earlier, you said that the new virDomainBlockSnapshotPtr are
>>> independent, with no relations between them. But here, you are wanting
>>> to keep incremental backups related to one another.
>>>
>>
>> I think the *snapshots*, as temporary objects, are independent and don't
>> carry a relation to each other.
>>
>> The *checkpoints* here, however, are persistent and interrelated.
>>
>>>>
>>>> Now, here is what exporting bitmaps looks like.
>>>>
>>>> - add disk snapshot N to export, with changes from checkpoint K
>>>>   - add fleece blockdev to NBD exports
>>>>   - create new bitmap T
>>>>   - disable bitmap T
>>>>   - merge bitmaps K, K+1, .. N-1 into T
>>
>> I see; so we compute a new slice based on previous bitmaps and back up
>> from that arbitrary slice.
>>
>> So "T" is a temporary bitmap meant to be discarded at the conclusion of
>> the operation, making it much more like a consumable object.
>>
>>>>   - add bitmap T to the NBD export
>>>>
>>>> - remove disk snapshot from export
>>>>   - remove fleece blockdev from NBD exports
>>>>   - remove bitmap T
>>>>
>>
>> Aha.
>>
>>>> Here are example qemu commands for operations with checkpoints. I'll make
>>>> several snapshots with checkpoints for the purpose of better illustration.
>>>>
>>>> - create snapshot d068765e-8b50-4d74-9b72-1e55c663cbf8 with checkpoint
>>>>   - same as without a checkpoint, but additionally add a bitmap on fleece blockjob start
>>>>
>>>> ...
>>>> {
>>>>   "execute": "transaction",
>>>>   "arguments": {
>>>>     "actions": [
>>>>       {
>>>>         "type": "blockdev-backup",
>>>>         "data": {
>>>>           "device": "drive-scsi0-0-0-0",
>>>>           "sync": "none",
>>>>           "target": "snapshot-scsi0-0-0-0"
>>>>         }
>>>>       },
>>>>       {
>>>>         "type": "block-dirty-bitmap-add",
>>>>         "data": {
>>>>           "name": "libvirt-d068765e-8b50-4d74-9b72-1e55c663cbf8",
>>>>           "node": "drive-scsi0-0-0-0",
>>>>           "persistent": true
>>>>         }
>>>>       }
>>>
>>
>> So a checkpoint creates a reference point, but NOT a backup. You are
>> manually creating checkpoint instances.
>>
>> In this case, though, you haven't disabled the previous checkpoint's
>> bitmap (if any?) atomically with the creation of this one...
>
> In the example this is the first snapshot, so there is no previous checkpoint
> and thus nothing to disable.
>
OK, got it!
>>
>>
>>> Here, the transaction makes sense; you have to create the persistent
>>> dirty bitmap to track from the same point in time. The dirty bitmap is
>>> tied to the active image, not the backup, so that when you create the
>>> NEXT incremental backup, you have an accurate record of which sectors
>>> were touched in snapshot-scsi0-0-0-0 between this transaction and the next.
>>>
>>>>     ]
>>>>   }
>>>> }
>>>>
>>>> - delete snapshot d068765e-8b50-4d74-9b72-1e55c663cbf8
>>>>   - same as without checkpoints
>>>>
>>>> - create snapshot 0044757e-1a2d-4c2c-b92f-bb403309bb17 with checkpoint
>>>>   - same actions as for the first snapshot, but additionally disable the first bitmap
>>>
>>> Again, you're showing the QMP commands that libvirt is issuing; which
>>> libvirt API calls are driving these actions?
>>>
>>>>
>>>> ...
>>>> {
>>>>   "execute": "transaction",
>>>>   "arguments": {
>>>>     "actions": [
>>>>       {
>>>>         "type": "blockdev-backup",
>>>>         "data": {
>>>>           "device": "drive-scsi0-0-0-0",
>>>>           "sync": "none",
>>>>           "target": "snapshot-scsi0-0-0-0"
>>>>         }
>>>>       },
>>>>       {
>>>>         "type": "x-vz-block-dirty-bitmap-disable",
>>>>         "data": {
>>>
>>> Do you have measurements on whether having multiple active bitmaps hurts
>>> performance? I'm not yet sure that managing a chain of disabled bitmaps
>>> (and merging them as needed for restores) is more or less efficient than
>>> managing multiple bitmaps all the time. On the other hand, you do have
>>> a point that restore is a less frequent operation than backup, so making
>>> backup as lean as possible and putting more work on restore is a
>>> reasonable tradeoff, even if it adds complexity to the management for
>>> doing restores.
>>>
>>
>> Depending on the number of checkpoints intended to be kept... we
>> certainly make no real promises on the efficiency of marking so many.
>> It's at *least* a linear increase with each checkpoint...
>>
>>>> "name": "libvirt-d068765e-8b50-4d74-9b72-1e55c663cbf8",
>>>> "node": "drive-scsi0-0-0-0"
>>>> },
>>>> },
>>>> {
>>>> "type": "block-dirty-bitmap-add"
>>>> "data": {
>>>> "name": "libvirt-0044757e-1a2d-4c2c-b92f-bb403309bb17^d068765e-8b50-4d74-9b72-1e55c663cbf8",
>>>> "node": "drive-scsi0-0-0-0",
>>>> "persistent": true
>>>> },
>>>> }
>>>> ]
>>>> },
>>>> }
>>>>
>>
>> Oh, I see, you handle the "disable old" case here.
>>
>>>> - delete snapshot 0044757e-1a2d-4c2c-b92f-bb403309bb17
>>>> - create snapshot 8fc02db3-166f-4de7-b7aa-1f7303e6162b with checkpoint
>>>>
>>>> - add disk snapshot 8fc02db3-166f-4de7-b7aa-1f7303e6162b to export, and a bitmap with
>>>>   changes from checkpoint d068765e-8b50-4d74-9b72-1e55c663cbf8
>>>>   - same as adding an export without a checkpoint, but additionally:
>>>>     - form the result bitmap
>>>>     - add the bitmap to the NBD export
>>>>
>>>> ...
>>>> {
>>>>   "execute": "transaction",
>>>>   "arguments": {
>>>>     "actions": [
>>>>       {
>>>>         "type": "block-dirty-bitmap-add",
>>>>         "data": {
>>>>           "node": "drive-scsi0-0-0-0",
>>>>           "name": "libvirt-__export_temporary__",
>>>>           "persistent": false
>>>>         }
>>>>       },
>>>>       {
>>>>         "type": "x-vz-block-dirty-bitmap-disable",
>>>>         "data": {
>>>>           "node": "drive-scsi0-0-0-0",
>>>>           "name": "libvirt-__export_temporary__"
>>>>         }
>>>>       },
>>>>       {
>>>>         "type": "x-vz-block-dirty-bitmap-merge",
>>>>         "data": {
>>>>           "node": "drive-scsi0-0-0-0",
>>>>           "src_name": "libvirt-d068765e-8b50-4d74-9b72-1e55c663cbf8",
>>>>           "dst_name": "libvirt-__export_temporary__"
>>>>         }
>>>>       },
>>>>       {
>>>>         "type": "x-vz-block-dirty-bitmap-merge",
>>>>         "data": {
>>>>           "node": "drive-scsi0-0-0-0",
>>>>           "src_name": "libvirt-0044757e-1a2d-4c2c-b92f-bb403309bb17^d068765e-8b50-4d74-9b72-1e55c663cbf8",
>>>>           "dst_name": "libvirt-__export_temporary__"
>>>>         }
>>>>       }
>>>>     ]
>>>>   }
>>>> }
>>
>> OK, so in this transaction you add a new temporary bitmap for export,
>> and merge the contents of two bitmaps into it.
>>
>> However, it doesn't look like you created a new checkpoint and managed
>> that handoff here, did you?
>
> We don't need to create checkpoints for the purpose of exporting. Only a temporary
> bitmap to merge the appropriate bitmap chain into.
>
See reply below
>>
>>>> {
>>>>   "execute": "x-vz-nbd-server-add-bitmap",
>>>>   "arguments": {
>>>>     "name": "sda-8fc02db3-166f-4de7-b7aa-1f7303e6162b",
>>>>     "bitmap": "libvirt-__export_temporary__",
>>>>     "bitmap-export-name": "d068765e-8b50-4d74-9b72-1e55c663cbf8"
>>>>   }
>>
>> And then here, once the bitmap and the data is already frozen, it's
>> actually alright if we add the export at a later point in time.
>>
>>>
>>> Adding a bitmap to a server would advertise to the NBD client
>>> that it can query the
>>> "qemu-dirty-bitmap:d068765e-8b50-4d74-9b72-1e55c663cbf8" namespace
>>> during NBD_CMD_BLOCK_STATUS, rather than just "base:allocation"?
>>>
>>
>> Don't know much about this, I stopped paying attention to the BLOCK
>> STATUS patches. Is the NBD spec the best way to find out the current
>> state right now?
>>
>> (Is there a less technical, briefer overview somewhere, perhaps from a
>> commit message or a cover letter?)
>>
>>>> }
>>>>
>>>> - remove snapshot 8fc02db3-166f-4de7-b7aa-1f7303e6162b from export
>>>>   - same as without a checkpoint, but additionally remove the temporary bitmap
>>>>
>>>> ...
>>>> {
>>>>   "arguments": {
>>>>     "name": "libvirt-__export_temporary__",
>>>>     "node": "drive-scsi0-0-0-0"
>>>>   },
>>>>   "execute": "block-dirty-bitmap-remove"
>>>> }
>>>>
>>
>> OK, this just deletes the checkpoint. I guess we delete the node and
> I would not call it a checkpoint. A checkpoint is something visible to the client:
> an ability to get CBT from that point in time.
>
> Here we create a temporary bitmap to calculate the desired CBT.
>
Aha, right. I misspoke; but it's because in my mind I feel like creating
an export will *generally* be accompanied by a new checkpoint, so I was
surprised to see that missing from the example.
But, yes, there's no reason you *have* to create a new checkpoint when
you do an export -- but I suspect that when you DO create a new
checkpoint it's generally going to be accompanied by an export like
this, right?
>> stop the NBD server too, right?
>
> yeah, just like in case without checkpoint (mentioned in this case description)
>
>>
>>>> - delete checkpoint 0044757e-1a2d-4c2c-b92f-bb403309bb17
>>>>   (a similar operation is described in the section about the naming scheme for bitmaps,
>>>>   with the difference that K+1 is N here and thus the new bitmap should not be disabled)
>>>
>>> A suggestion on the examples - while UUIDs are nice and handy for
>>> management tools, they are a pain to type and for humans to quickly
>>> read. Is there any way we can document a sample transaction stream with
>>> all the actors involved (someone issues a libvirt API call XYZ, libvirt
>>> in turn issues QMP command ABC), and using shorter names that are easier
>>> to read as humans?
>>>
>>
>> Yeah, A-B-C-D terminology would be nice for the examples. It's fine if
>> the actual implementation uses UUIDs.
>>
>>>>
>>>> {
>>>>   "arguments": {
>>>>     "actions": [
>>>>       {
>>>>         "type": "block-dirty-bitmap-add",
>>>>         "data": {
>>>>           "node": "drive-scsi0-0-0-0",
>>>>           "name": "libvirt-8fc02db3-166f-4de7-b7aa-1f7303e6162b^d068765e-8b50-4d74-9b72-1e55c663cbf8",
>>>>           "persistent": true
>>>>         }
>>>>       },
>>>>       {
>>>>         "type": "x-vz-block-dirty-bitmap-merge",
>>>>         "data": {
>>>>           "node": "drive-scsi0-0-0-0",
>>>>           "src_name": "libvirt-0044757e-1a2d-4c2c-b92f-bb403309bb17^d068765e-8b50-4d74-9b72-1e55c663cbf8",
>>>>           "dst_name": "libvirt-d068765e-8b50-4d74-9b72-1e55c663cbf8"
>>>>         }
>>>>       },
>>>>       {
>>>>         "type": "x-vz-block-dirty-bitmap-merge",
>>>>         "data": {
>>>>           "node": "drive-scsi0-0-0-0",
>>>>           "src_name": "libvirt-8fc02db3-166f-4de7-b7aa-1f7303e6162b^0044757e-1a2d-4c2c-b92f-bb403309bb17",
>>>>           "dst_name": "libvirt-8fc02db3-166f-4de7-b7aa-1f7303e6162b^d068765e-8b50-4d74-9b72-1e55c663cbf8"
>>>>         }
>>>>       }
>>>>     ]
>>>>   },
>>>>   "execute": "transaction"
>>>> }
>>>> {
>>>>   "execute": "x-vz-block-dirty-bitmap-remove",
>>>>   "arguments": {
>>>>     "node": "drive-scsi0-0-0-0",
>>>>     "name": "libvirt-8fc02db3-166f-4de7-b7aa-1f7303e6162b^0044757e-1a2d-4c2c-b92f-bb403309bb17"
>>>>   }
>>>> }
>>>> {
>>>>   "execute": "x-vz-block-dirty-bitmap-remove",
>>>>   "arguments": {
>>>>     "node": "drive-scsi0-0-0-0",
>>>>     "name": "libvirt-0044757e-1a2d-4c2c-b92f-bb403309bb17^d068765e-8b50-4d74-9b72-1e55c663cbf8"
>>>>   }
>>>> }
>>>>
>>>> Here is a list of the bitmap commands used in the implementation but not yet upstream (AFAIK).
>>>>
>>>> x-vz-block-dirty-bitmap-remove
>>
>> We already have this, right? It doesn't even need to be transactionable.
>>
>>>> x-vz-block-dirty-bitmap-merge
>>
>> You need this...
>>
>>>> x-vz-block-dirty-bitmap-disable
>>
>> And this we had originally but since removed, but can be re-added trivially.
>>
>>>> x-vz-block-dirty-bitmap-enable (not in the examples; used when removing most recent checkpoint)
>>>> x-vz-nbd-server-add-bitmap
>>>>
>>
>> Do my comments make sense? Am I understanding you right so far? I'll try
>> to offer a competing writeup to make sure we're on the same page with
>> your proposed design before I waste any time trying to critique it -- in
>> case I'm misunderstanding you.
>
> Yes, looks like we are in tune.
>
More or less. Thank you for taking the time to explain it all out to me.
I think I understand the general shape of your proposal, more or less.
>>
>> Thank you for leading the charge and proposing new APIs for this
>> feature. It will be very nice to expose the incremental backup
>> functionality we've been working on in QEMU to users of libvirt.
>>
>> --js
>
> There are also patches ( if the API design survives the review phase at least partially :) )
>
I can only really help (or hinder?) where QEMU primitives are concerned
-- the actual libvirt API is going to be what Eric cares about.
I think this looks good so far, though -- at least, it makes sense to me.
>>
>>>> *Restore operation nuances*
>>>>
>>>> As was written above, to restore a domain one needs to start it in the paused
>>>> state, export the domain's disks and write them from the backup. However qemu currently does
>>>> not let us export disks for write, even for a domain that never starts guest CPUs.
>>>> We have an experimental qemu command-line option -x-vz-nbd-restore (passed together
>>>> with the -incoming option) to fix it.
>>>
>>> Why can't restore be done while the guest is offline? (Oh right, we
>>> still haven't added decent qemu-img support for bitmap manipulation, so
>>> we need a qemu process around for any bitmap changes).
>>>
>>
>> I'm working on this right now, actually!
>>
>> I'm working on JSON format output for bitmap querying, and simple
>> clear/delete commands. I hope to send this out very soon.
>>
>>> As I understand it, the point of bitmaps and snapshots is to create an
>>> NBD server that a third-party can use to read just the dirty portions of
>>> a disk in relation to a known checkpoint, to save off data in whatever
>>> form it wants; so you are right that the third party then needs a way to
>>> rewrite data from whatever internal form it stored it in back to the
>>> view that qemu can consume when rolling back to a given backup, prior to
>>> starting the guest on the restored data. Do you need additional libvirt
>>> APIs exposed for this, or do the proposed APIs for adding snapshots
>>> cover everything already with just an additional flag parameter that
>>> says whether the <domainblocksnapshot> is readonly (the third-party is
>>> using it for collecting the incremental backup data) or writable (the
>>> third-party is actively writing its backup into the file, and when it is
>>> done, then perform a block-commit to merge that data back onto the main
>>> qcow2 file)?
>>>
Thank you!