online blockdev-backup, a clarification (was: Summary on new backup interfaces in QEMU)

Tue Feb 21 06:31:30 UTC 2023

On 20.02.23 18:18, John Maline wrote:
> As a qemu newcomer I had a related question and confusion from reading existing docs. Searching qemu-block, this seemed related to my question so I’ll ask…
> 
> 
>> On Mar 15, 2022, at 12:57 PM, Vladimir Sementsov-Ogievskiy <v.sementsov-og at ya.ru> wrote:
>>
>> Hi all!
>>
>> Here I want to summarize new interfaces and use cases for backup in QEMU.
>>
>> TODO for me: convert this into good rst documentation in docs/.
> 
> The existing docs I found at https://qemu.readthedocs.io/en/latest/interop/live-block-operations.html#live-disk-backup-blockdev-backup-and-the-deprecated-drive-backup are confusing me. This summary, if I’m understanding it correctly, seems clearer.
> 
> 
>>
>> OK, let's begin.
>>
>> First, note that the drive-backup QMP command is deprecated.
>>
>> Next, some terminology:
>>
>> push backup: the whole process is inside QEMU process, also may be called "internal backup"
>>
>> pull backup: QEMU only exports a kind of snapshot (for example by NBD), and third party software reads this export and stores it somehow, also called "external backup"
>>
>> copy-before-write operations: We usually back up an active disk: the guest is running and may write to the disk during the backup. When the guest wants to overwrite a data region that is not backed up yet, we must hold this guest write and copy the original data somewhere before letting the guest write continue. That's a copy-before-write operation.
>>
>> image-fleecing: a technique that allows exporting a "snapshotted" state of the active disk with the help of copy-before-write operations. We create a temporary image as the target for copy-before-write operations, and provide an interface for the user to read the "snapshotted" state. On read, data that has already changed in the original active disk is read from the temporary image, and unchanged data is read directly from the active disk. The temporary image itself is also called "reverse delta" or "reversed delta".
>>
>>
>>
>> == Simple push backup ==
>>
>> Just use blockdev-backup, nothing new here. I just note some technical details, that are relatively new:
>>
>> 1. First, the backup job inserts a copy-before-write filter above the source disk to perform copy-before-write operations.
>> 2. The created copy-before-write filter shares its internal block-copy state with the backup job, so they work together and do not copy the same data twice.
> 
> The simple case I’m aiming for matches a push backup. I’m OK w/ a snapshot.
> 
> Environment - macos 12.6 on arm processor, guest is aarch64 centos linux using hvf accelerator. Qemu 7.2.
> 
> I assume what you describe w/ copy-before-write is behavior in qemu 7.2. I’m fine if the Linux client needs to do a bit of log replay if I revert to a backup.
> 
> In the docs I link above it talks as if a VM shutdown is recommended after the job completes. Seems to ruin the whole point of an online backup. I tried instead finishing w/ a blockdev-del and I see the backup file closed by qemu. I’m guessing that’s an appropriate way to flush/complete the backup. In an experiment, it seemed the generated backup worked as expected.

Yes, shutdown is unrelated. Also, block jobs flush the target on finish, so it is really synced once the block-job completion event arrives. Still, blockdev-del(target) is the right thing to do.

> 
> I’m hoping for confirmation or correction on my approach.
> 
> Specifically I’m doing the following QMP commands.
> 
> {"execute": "qmp_capabilities"}
> 
> {"execute":"blockdev-add",
>   "arguments":{"node-name":"backup-node", "driver":"qcow2", "file":{"driver":"file", "filename":"backups/backup1.img"}}
> }
> 
> {"execute":"blockdev-backup",
>   "arguments":{"device":"drive0", "job-id":"job0", "target":"backup-node", "sync":"full"}
> }
> 
> ... watch many job state change events ...

The last one should be BLOCK_JOB_COMPLETED. Wait for it and check the "error" field: if it exists, the job has failed.

You can also poll with the query-block-jobs command.
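
As an illustration of the completion check, here is a minimal Python sketch. It assumes you have already collected the QMP output as one JSON object per line; the helper name backup_result is hypothetical, and the socket handling is left out. It relies on the BLOCK_JOB_COMPLETED event carrying the job id in its "device" field.

```python
import json


def backup_result(lines, job_id):
    """Scan QMP messages (one JSON object per line) for a job's completion.

    Returns None if no BLOCK_JOB_COMPLETED event for job_id was seen,
    (True, None) if the job completed without error,
    (False, error_text) if the event carries an "error" field.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        # Skip command replies and unrelated events.
        if msg.get("event") != "BLOCK_JOB_COMPLETED":
            continue
        data = msg.get("data", {})
        # The job identifier is reported in the "device" field.
        if data.get("device") != job_id:
            continue
        if "error" in data:
            return (False, data["error"])
        return (True, None)
    return None
```

Only send blockdev-del for the target node once this reports a result for "job0"; a None result means the job is still running or its event has not been read yet.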

> 
> {"execute":"blockdev-del",
>   "arguments": {"node-name":"backup-node"}
> }
> 

Yes, your approach is correct. Note that ideally you should also do fs-freeze / fs-thaw in the guest around the blockdev-backup command call, to be sure that the moment in time when the backup starts (the final target image will correspond to this moment) is consistent and that we'll be able to boot from the backup image later.
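
A rough Python sketch of that ordering, under some assumptions: the guest runs the QEMU guest agent, so freeze and thaw are issued as guest-fsfreeze-freeze and guest-fsfreeze-thaw on the agent channel, and qmp_send / agent_send are hypothetical callables that deliver one command to the QMP monitor and to the agent respectively. Since the image corresponds to the moment the job starts, the guest can be thawed as soon as blockdev-backup has returned; there is no need to keep it frozen until the job completes.

```python
def start_consistent_backup(qmp_send, agent_send, device="drive0",
                            target="backup-node", job_id="job0"):
    """Freeze guest filesystems, start the backup job, then thaw.

    qmp_send and agent_send are caller-supplied (hypothetical) functions
    that each send one command dict and return the decoded reply.
    """
    agent_send({"execute": "guest-fsfreeze-freeze"})
    try:
        # The backup captures the disk state as of this call.
        return qmp_send({
            "execute": "blockdev-backup",
            "arguments": {"device": device, "job-id": job_id,
                          "target": target, "sync": "full"},
        })
    finally:
        # Always thaw, even if starting the job failed.
        agent_send({"execute": "guest-fsfreeze-thaw"})
```

After this returns, wait for BLOCK_JOB_COMPLETED as before and then do blockdev-del on the target node.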

-- 
Best regards,
Vladimir