[libvirt] [RFC] external (pull) backup API
Nikolay Shirokovskiy
nshirokovskiy at virtuozzo.com
Wed Nov 22 07:12:09 UTC 2017
ping
On 14.11.2017 18:38, Nikolay Shirokovskiy wrote:
> Table of contents.
>
> I Preface
>
> 1. Fleece API
> 2. Export API
> 3. Incremental backups
> 4. Other hypervisors
>
> II Links
>
>
>
>
> I Preface
>
> This is an RFC for an external (or pull) backup API in libvirt. There was a series [1]
> with a more limited API scope and functionality for this kind of backup API.
> Besides other issues, the series was abandoned because the qemu blockdev-del command
> still had experimental status at the time. There is also a long-pending RFC series for
> an internal (or push) backup API [2], which however does not have much in common with
> this RFC. Finally, there is an RFC with overall agreement on having a backup API in
> libvirt [3].
>
> The aim of the external backup API is to provide means for a third-party application
> to read/write domain disks as block devices for the purpose of backup. A disk is
> read during a backup operation and, in the case of an active domain, is presented at
> some point in time (preferably in some guest-consistent state). A disk is written
> during a restore operation.
>
> As to providing the disk state at some point in time, one could use existing disk
> snapshots for this purpose. However, this RFC introduces an API to leverage image
> fleecing (the blockdev-backup command) instead. Image fleecing is somewhat the inverse
> of a snapshot. In the case of a snapshot, writes go to the top image so the backing
> image stays constant; in the case of fleecing, writes go to the same image as before,
> but the old data is first popped out to a fleece image which has the original image as
> backing. As a result, the fleece image becomes a snapshot of the disk.
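> To make the copy-before-write behaviour concrete, here is a toy Python model
> (mine, not part of the proposal and not qemu code): on every guest write the
> old block is first copied into the fleece overlay, so reading the fleece
> image (with the live image as backing) always yields the point-in-time state.

```python
# Toy model of image fleecing (copy-before-write); not qemu's implementation.
disk = {0: b'AAA', 1: b'BBB', 2: b'CCC'}   # live image: block number -> data
fleece = {}                                 # fleece image (overlay), initially empty

def guest_write(block, data):
    """Writes go to the same live image as before, but the old data is
    popped out to the fleece image first (copy-before-write)."""
    if block not in fleece:
        fleece[block] = disk[block]
    disk[block] = data

def snapshot_read(block):
    """The fleece image has the live image as backing: blocks not yet
    copied out are read from the (still unmodified) live image."""
    return fleece.get(block, disk[block])

guest_write(1, b'XXX')           # guest overwrites block 1 after fleecing started

print(disk[1])                   # live disk sees the new data
print(snapshot_read(1))          # backup client still sees point-in-time data
print(snapshot_read(0))          # untouched block is read from the backing image
```

> The same two-image structure is what the fleece file in the XML below
> provides: the backup client reads the fleece image while the guest keeps
> writing to its disk.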
>
> Another task of this API is to provide access to the disks for read/write
> operations. One could try to leverage the libvirt stream API for this purpose,
> but AFAIK clients want random access to the disk data, which is not what the
> stream API is suitable for. I'm not sure what the cost of adding a block API to
> libvirt would be, particularly the cost of an efficient implementation at the RPC
> level, so this RFC instead adds means to export disk data through existing block
> interfaces. For qemu this is NBD.
>
>
>
> 1. Fleece API
>
> So the API below provides means to start/stop/query disk image fleecing.
> I use the name BlockSnapshot for this operation. Other options are Fleecing,
> BlockFleecing, TempBlockSnapshot etc.
>
> /* Start fleecing */
> virDomainBlockSnapshotPtr
> virDomainBlockSnapshotCreateXML(virDomainPtr domain,
>                                 const char *xmlDesc,
>                                 unsigned int flags);
>
> /* Stop fleecing */
> int
> virDomainBlockSnapshotDelete(virDomainBlockSnapshotPtr snapshot,
>                              unsigned int flags);
>
> /* List active fleecings */
> int
> virDomainBlockSnapshotList(virDomainPtr domain,
>                            virDomainBlockSnapshotPtr **snaps,
>                            unsigned int flags);
>
> /* Get fleecing description */
> char *
> virDomainBlockSnapshotGetXMLDesc(virDomainBlockSnapshotPtr snapshot,
>                                  unsigned int flags);
>
> /* Get fleecing by name */
> virDomainBlockSnapshotPtr
> virDomainBlockSnapshotLookupByName(virDomainPtr domain,
>                                    const char *name);
>
>
> Here is a minimal block snapshot XML description to feed to the creation function:
>
> <domainblocksnapshot>
>   <snapshot disk='sda'>
>     <fleece file="/path/to/fleece-image-sda"/>
>   </snapshot>
>   <snapshot disk='sdb'>
>     <fleece file="/path/to/fleece-image-sdb"/>
>   </snapshot>
> </domainblocksnapshot>
>
> Below is an example of what the description function should return upon
> successful block snapshot creation. The difference from the above XML is that
> the name element (which can also be specified on creation) and the aliases are
> generated. The aliases will be useful later to identify block devices when
> exporting through NBD.
>
> <domainblocksnapshot>
>   <name>5768a388-c1c4-414c-ac4e-eab216ba7c0c</name>
>   <snapshot disk='sda'>
>     <fleece file="/path/to/fleece-image-sda"/>
>     <alias name="scsi0-0-0-0-backup"/>
>   </snapshot>
>   <snapshot disk='sdb'>
>     <fleece file="/path/to/fleece-image-sdb"/>
>     <alias name="scsi0-0-0-1-backup"/>
>   </snapshot>
> </domainblocksnapshot>
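> For illustration, such a document can be assembled with a few lines of Python
> (a sketch of mine, not code from the proposal); the element and attribute
> names come from the examples above, while the disk targets and fleece paths
> are placeholders.

```python
# Build the <domainblocksnapshot> input document described in the RFC
# using only the standard library.
import xml.etree.ElementTree as ET

def build_block_snapshot_xml(disks):
    """disks: iterable of (disk_target, fleece_path) pairs."""
    root = ET.Element('domainblocksnapshot')
    for target, path in disks:
        snap = ET.SubElement(root, 'snapshot', disk=target)
        ET.SubElement(snap, 'fleece', file=path)
    return ET.tostring(root, encoding='unicode')

xml_desc = build_block_snapshot_xml([('sda', '/path/to/fleece-image-sda'),
                                     ('sdb', '/path/to/fleece-image-sdb')])
print(xml_desc)
```

> The resulting string is what would be fed to
> virDomainBlockSnapshotCreateXML as xmlDesc.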
>
>
>
> 2. Export API
>
> During a backup operation we need to provide read access to the fleecing
> image. This is done through the qemu process's NBD server. We just need to
> specify the disks to export.
>
> /* Start block export */
> int
> virDomainBlockExportStart(virDomainPtr domain,
>                           const char *xmlDesc,
>                           unsigned int flags);
>
> /* Stop block export */
> int
> virDomainBlockExportStop(virDomainPtr domain,
>                          const char *diskName,
>                          unsigned int flags);
>
> Here is an example of the XML for the start function:
>
> <blockexport type="nbd" port="8001">
>   <listen type="address" address="10.0.2.10"/>
>   <disk name="scsi0-0-0-1-backup"/>
> </blockexport>
>
> The qemu NBD server is started upon the first disk export start and shut down
> upon the last disk export stop. Another option is to control the NBD server
> explicitly. One way to do this is to consider the NBD server a new device, so
> that to start/stop/update the NBD server we can use the attach/detach/update
> device functions. Then in block export start we need to refer to this device
> somehow. This could be a generated name/uuid or a type/address pair. Actually,
> this approach to exposing the NBD server looks more natural to me even though it
> involves more management on the client side. I am not suggesting it in the first
> place mostly due to hesitation about how to refer to the NBD server on block export.
>
> In any case I'd like to provide the export info in the active domain config:
>
> <devices>
>   <blockexport type="nbd" port="8001">
>     <listen type="address" address="10.0.2.10"/>
>     <disk name="scsi0-0-0-1-backup"/>
>     <disk name="scsi0-0-0-2-backup"/>
>   </blockexport>
> </devices>
>
> This API is used in the restore operation too. The domain is started in a
> paused state, the disks to be restored are exported, and the backup client
> fills them with the backup data.
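> As a sketch of the client side, the endpoints to connect to can be derived
> from the <blockexport> XML above. Note that the nbd:// URL form and the use
> of the disk alias as the NBD export name are my assumptions for illustration;
> the RFC does not specify the export naming.

```python
# Derive per-disk NBD endpoints from the <blockexport> description.
# The nbd://host:port/exportname form is an assumption, not from the RFC.
import xml.etree.ElementTree as ET

EXPORT_XML = '''
<blockexport type="nbd" port="8001">
  <listen type="address" address="10.0.2.10"/>
  <disk name="scsi0-0-0-1-backup"/>
</blockexport>
'''

def nbd_endpoints(xml_desc):
    root = ET.fromstring(xml_desc)
    assert root.get('type') == 'nbd'
    host = root.find('listen').get('address')
    port = root.get('port')
    # One export per <disk> element, identified by the generated alias.
    return ['nbd://%s:%s/%s' % (host, port, d.get('name'))
            for d in root.findall('disk')]

print(nbd_endpoints(EXPORT_XML))
```

> A backup client would then open each endpoint with any NBD client
> implementation and read (or, on restore, write) the disk at random offsets.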
>
>
>
> 3. Incremental backups
>
> Qemu can track which parts of a disk have changed since fleecing start. This is
> what is typically called CBT (a dirty bitmap in qemu terminology, I guess). There
> is also experimental NBD support for this [4] and a bunch of merged/agreed/proposed
> bitmap operations that help to organize incremental backups.
>
> Different hypervisors have different bitmap implementations with different
> costs, so it is up to the hypervisor whether to start CBT by default upon block
> snapshot creation. The qemu implementation has memory and disk costs for every
> bitmap, so I suggest starting fleecing without a bitmap by default and adding a
> flag VIR_DOMAIN_BLOCK_SNAPSHOT_CREATE_CHECKPOINT to request that a bitmap be started.
>
> Disk bitmaps are visible in the active domain definition under the name
> of the block snapshot for which the bitmap was started.
>
> <disk type='file' device='disk'>
>   ..
>   <target dev='sda' bus='scsi'/>
>   <alias name='scsi0-0-0-0'/>
>   <checkpoint name="93a5c045-6457-2c09-e56c-927cdf34e178"/>
>   <checkpoint name="5768a388-c1c4-414c-ac4e-eab216ba7c0c"/>
>   ..
> </disk>
>
> The bitmap can be specified upon disk export as below (I guess there is no
> need to provide more than one bitmap per disk). The active domain config
> section for the block export is expanded similarly.
>
> <blockexport type="nbd" port="8001">
>   <listen type="address" address="10.0.2.10"/>
>   <disk name="scsi0-0-0-1-backup" checkpoint="5768a388-c1c4-414c-ac4e-eab216ba7c0c"/>
> </blockexport>
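> To illustrate what a checkpoint buys, here is a toy Python model of a dirty
> bitmap (again mine, not qemu's implementation): the bitmap records which
> blocks were written since the checkpoint, so an incremental backup only needs
> to fetch those blocks over the export instead of the whole disk.

```python
# Toy dirty-bitmap (CBT) model: a set of block numbers written since
# the checkpoint was created. Not qemu's implementation.
disk = {n: b'\x00' for n in range(8)}   # 8-block disk, block -> data
dirty = set()                           # bitmap started at checkpoint creation

def guest_write(block, data):
    disk[block] = data
    dirty.add(block)                    # mark block as changed since checkpoint

def incremental_backup():
    """A client that is given the bitmap over the export reads only
    the changed blocks."""
    return {block: disk[block] for block in sorted(dirty)}

guest_write(2, b'\x02')
guest_write(5, b'\x05')

print(incremental_backup())             # only blocks 2 and 5, not all 8
```

> Dropping the checkpoint corresponds to clearing this set and freeing the
> resources it consumes, which is what the removal API below is for.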
>
> If a bitmap was created on backup start but the client failed to make a backup
> for some reason, then it makes no sense to keep this checkpoint any more. As
> keeping a bitmap takes resources, it is convenient to drop the bitmap in this
> case. One may also want to drop a bitmap purely for resource management
> reasons. So we need an API to remove a bitmap:
>
> int
> virDomainBlockCheckpointRemove(virDomainPtr domain,
>                                const char *name,
>                                unsigned int flags);
>
>
>
> 4. Other hypervisors
>
> I took a somewhat close look only at the vmware backup interface at [5] etc.
> It looks like they don't have fleecing like qemu does, so for vmware snapshots
> one can use the usual disk snapshot API. Also, there is expectedly no NBD
> interface for snapshots, so to deal with vmware snapshot disks one will
> eventually have to add a block API to libvirt. So the only point of this RFC
> that applies to vmware backups is exporting checkpoints in the disk XML. The
> vmware documentation does not say much about bitmap limitations, but I guess
> they can still provide only a limited number of them, which can be exposed as
> suggested for active domain disks.
>
>
>
> II Links:
>
> [1] https://www.redhat.com/archives/libvir-list/2016-September/msg00192.html
> [2] https://www.redhat.com/archives/libvir-list/2017-May/msg00379.html
> [3] https://www.redhat.com/archives/libvir-list/2016-March/msg00937.html
> [4] https://github.com/NetworkBlockDevice/nbd/commit/cfa8ebfc354b2adbdf73b6e6c2520d1b48e43f7a
> [5] https://code.vmware.com/doc/preview?id=4076#/doc/vddkBkupVadp.9.3.html#1014717
>
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
>