[libvirt] [RFC v3] external (pull) backup API

Eric Blake eblake at redhat.com
Fri Jun 8 21:40:09 UTC 2018


On 05/17/2018 05:43 PM, Eric Blake wrote:
> Here's my updated counterproposal for a backup API.
> 

> /**
>   * virDomainBackupBegin:

>   *
>   * There are two fundamental backup approaches.  The first, called a
>   * push model, instructs the hypervisor to copy the state of the guest
>   * disk to the designated storage destination (which may be on the
>   * local file system or a network device); in this mode, the
>   * hypervisor writes the content of the guest disk to the destination,
>   * then emits VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 when the backup is
>   * either complete or failed (the backup image is invalid if the job
>   * is ended prior to the event being emitted).

VIR_DOMAIN_EVENT_ID_JOB_COMPLETED is the better choice here: BLOCK_JOB
can only report status for one disk, while a backup job is intended to
report on multiple disks handled in a single transaction.  I'm a bit
depressed at our technical debt in this area: virDomainGetJobStats() and
virDomainAbortJob() don't take a job id, and operate only on the most
recently started job.  But I did mention elsewhere in my plans:

> 
> I think that it should be possible to run multiple backup operations
> in parallel in the long run.  But in the interest of getting a proof
> of concept implementation out quickly, it's easier to state that for
> the initial implementation, libvirt supports at most one backup
> operation at a time (to do another backup, you have to wait for the
> current one to complete, or else abort and abandon the current
> one). As there is only one backup job running at a time, the existing
> virDomainGetJobInfo()/virDomainGetJobStats() will be able to report
> statistics about the job (insofar as such statistics are available).
> But in preparation for the future, when libvirt does add parallel job
> support, starting a backup job will return a job id; and presumably
> we'd add a new virDomainGetJobStatsByID() for grabbing statistics of
> an arbitrary (rather than the most-recently-started) job.
> 
> Since live migration also acts as a job visible through
> virDomainGetJobStats(), I'm going to treat an active backup job and
> live migration as mutually exclusive.  This is particularly true when
> we have a pull model backup ongoing: if qemu on the source is acting
> as an NBD server, you can't migrate away from that qemu and tell the
> NBD client to reconnect to the NBD server on the migration
> destination.  So, to perform a migration, you have to cancel any
> pending backup operations.  Conversely, if a migration job is
> underway, it will not be possible to start a new backup job until
> migration completes.  However, we DO need to modify migration to
> ensure that any persistent bitmaps are migrated. 

Yes, this means that virDomainBackupEnd() (which takes a job id) and
virDomainAbortJob() (which does not, although that does not matter
until we support parallel backup jobs or a mix of backup and migration
at once) can initially both do the work of aborting a backup job.
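
To make the intended semantics concrete, here is a toy model of the
single-job rule described above.  It is only a sketch and not libvirt
code: domain_model, backup_begin(), migrate_begin(), migrate_finish(),
backup_end() and abort_job() are names invented for this illustration,
standing in for virDomainBackupBegin(), a migration start/finish,
virDomainBackupEnd() and virDomainAbortJob() respectively.  It assumes
job ids are positive integers handed out in increasing order, which the
proposal does not specify.

```c
#include <assert.h>

/* Hypothetical model, not libvirt code: one job slot per domain. */
typedef enum { JOB_NONE, JOB_BACKUP, JOB_MIGRATION } job_type;

typedef struct {
    job_type active;  /* the single job currently running, if any */
    int active_id;    /* id of the running backup job, 0 if none */
    int next_id;      /* next id to hand out (assumed to start at 1) */
} domain_model;

static void domain_init(domain_model *d)
{
    d->active = JOB_NONE;
    d->active_id = 0;
    d->next_id = 1;
}

/* Stand-in for virDomainBackupBegin(): returns a job id, or -1 if a
 * backup or migration is already active on this domain. */
static int backup_begin(domain_model *d)
{
    if (d->active != JOB_NONE)
        return -1;
    d->active = JOB_BACKUP;
    d->active_id = d->next_id++;
    assert(d->active_id > 0);
    return d->active_id;
}

/* Stand-in for starting a migration: refused while any job is active,
 * so a pending backup has to be cancelled first. */
static int migrate_begin(domain_model *d)
{
    if (d->active != JOB_NONE)
        return -1;
    d->active = JOB_MIGRATION;
    return 0;
}

/* Stand-in for the migration running to completion. */
static void migrate_finish(domain_model *d)
{
    if (d->active == JOB_MIGRATION)
        d->active = JOB_NONE;
}

/* Stand-in for virDomainBackupEnd(): takes the job id and only ends
 * the backup that id refers to. */
static int backup_end(domain_model *d, int id)
{
    if (d->active != JOB_BACKUP || d->active_id != id)
        return -1;
    d->active = JOB_NONE;
    d->active_id = 0;
    return 0;
}

/* Stand-in for virDomainAbortJob(): takes no id and acts on whatever
 * single job is active, which is unambiguous while only one can run. */
static int abort_job(domain_model *d)
{
    if (d->active == JOB_NONE)
        return -1;
    d->active = JOB_NONE;
    d->active_id = 0;
    return 0;
}
```

In this model, either backup_end(&d, id) or abort_job(&d) clears a
running backup, while migrate_begin() fails until one of them has been
called.  Once parallel jobs exist, the id-less abort_job() becomes
ambiguous, which is why the id-taking variant is the one to keep.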

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



