[libvirt] [PATCH v4 0/8] bulk snapshot list/redefine (incremental backup saga)

Eric Blake eblake at redhat.com
Tue Mar 12 21:02:29 UTC 2019


On 3/12/19 3:29 PM, Nir Soffer wrote:

>>>   snapshots =
>> srcdom.ListAllSnapshots(libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL)
>>>   for snapshot in snapshts:
>>>       xml = snapshot.XMLDesc()
>>>       dstdom.SnapshotCreateXML(xml)
>>>
>>> I don't really see a reason why this is so hard that it justifies
>>> adding these bulk snapshot list/define  APIs.
>>
>> Nir, do you want to chime in here?
>>
> 
> We don't have a need to list or define snapshots since we managed snapshots
> on oVirt side.
> We want an API to list and redefine checkpoints.

But the proposed <domaincheckpoint> also has a <domain> subelement, so
it has the same problem (the XML for a bulk operation can become
prohibitively large).

> 
> Our use case is:
> 
> 1. Start VM on some host
> 2. When oVirt starts a backup, it adds a checkpoint in oVirt database

And does this database preserve topological ordering? If not, why not?

> 3. When libvirt starts a backup, libvirt add the checkpoint oVirt created
> 4. Steps 2 and 3 are repeated while the VM is running...
> 5. Stop VM - at this point we undefine the VM, so libivrt lost all info
> about checkopints
> 6. Start VM on some host
> 7. oVirt use the bulk checkpoint API to redefine all checkpoints in oVirt
> database
> 8. If there is unknown bitmap on a disk, or a bitmap is missing, it means
> that someone
>    modified the disk behind oVirt back, and oVirt will delete the affected
> checkpoints
>    and use libvirt checkpoints API to delete the checkpoints and bitmaps.
> This forces
>   full backup for the next backup of the affected disks.
> 
> We would like that libvirt will fail to redefine the checkpoints if the
> info in the checkpoints
> does not match the the bitmap on the disks mentioned in the checkpoints.

That's a different feature request - you're asking for
virDomainCheckpointCreateXML(REDEFINE) to have some additional flag
where the metadata re-creation fails if the underlying bitmap is
corrupt/missing. Which may still be a good flag to add, but is something
you can do with one-at-a-time checkpoint redefinitions, and doesn't, in
itself, require bulk redefinition.

> To
> do this,
> libvirt must have the complete checkpoint state. It cannot if we define
> checkpoints one
> by one.

As long as you redefine checkpoints in topological order, I don't see
how defining one at a time or bulk changes whether libvirt can validate
that the checkpoints are correct.  And even then, redefining in
topological order is only necessary so that libvirt can ensure that any
<parent/> checkpoint exists before attempting to define the metadata for
a child checkpoint.

> 
> If libivirt cannot fail, we need to a way to list all checkpoints and
> bitmaps, so we can
> sync oVirt database with reality.
> 
> If we don't validate bitmaps with the expected bitmaps, the backup may fail
> when starting
> a backup (good), or succeed, including wrong data in the backup.
> 
> It also does not make sense to redefine checkpoint state in incremental
> way, this like
> building a VM XML with many devices with one call per device.

The argument here is that it will be the same amount of XML whether it
is one call per checkpoint or one call for all checkpoints (note - there
are multiple devices in one checkpoint, so it is NOT one call per
'device * checkpoint')

> 
> The same checkpoint management will have to be duplicated in any libvirt
> client that
> manage more then one host (e.g. KubeVirt, OpenStack), since libvirt does
> not handle
> more than one host.

But again, if you are moving checkpoint metadata from one host to
another, doing so one-at-a-time in topological order shouldn't be much
harder than doing so in one single call, but without the drawbacks of
exceeding XML size over RPC call limits.

> 
> For more info see
> -
> https://ovirt.org/develop/release-management/features/storage/incremental-backup.html
> - https://github.com/oVirt/vdsm/blob/master/lib/vdsm/api/vdsm-api.yml#L11525
> -
> https://github.com/oVirt/vdsm/blob/0874b89a19af2d7726ed2f0b9d674cf0bb48a67e/lib/vdsm/api/vdsm-api.yml#L8129
> 

>> Unless Nir speaks up with strong reasons in favor of new bulk APIs
>> (rather than living with the new topological sort flags), I'm okay
>> dropping this series.
> 
> 
> I'm ok with dropping bulk snapshots, but I think libvirt should provide
> bulk APIs for checkpoints.
> 
> Nir
>

Given that the same amount of XML is involved across RPC, I'm
hard-pressed to include bulk operations on one object without providing
it on both.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20190312/83e648ff/attachment-0001.sig>


More information about the libvir-list mailing list