[PATCH v2 09/19] qemu: blockjob: Store list of bitmaps disabled prior to commit

Wed Mar 11 20:15:03 UTC 2020

On 3/11/20 7:55 AM, Peter Krempa wrote:
> Starting a commit job will require disabling bitmaps in the base image
> so that they are not dirtied by the commit job. We need to store a list
> of the bitmaps so that we can later re-enable them.

I think I get why you don't want them further dirtied, but does it 
really matter?

Visually, consider what happens if we create checkpoint1 after writing 
data A, then checkpoint2 after writing data B (rendering the first 
checkpoing read-only) before writing data C, then create an external 
snapshot (possibly simultaneous with checkpoint3) to create a new active 
image (the new image also has a distinct bitmap tracking all changes 
since the snapshot, while the snapshot in the now-backing image remains 
read-write but unchanged because we no longer write to the backing) 
before writing data D:

backing:   AABBCC--
bitmap1:   --11---- (disabled)
bitmap2:   ----11-- (enabled)
active:    ---D-DD-
bitmap3:   ---1-11- (enabled)

In the above, I used different bitmap names between backing and active 
(as that is less confusing), but I don't know if that's what your code 
for bitmaps + external snapshots currently does, or if we end up with 
two bitmaps both named bitmap2 in both images; but that shouldn't really 
matter (as qemu needs both the node name and a bitmap name).  While in 
this configuration, a differential backup from checkpoint1 uses 
backing/bitmap1 + backing/bitmap2 + active/bitmap3 == --11111-; a 
differential from checkpoint2 uses backing/bitmap2 + active/bitmap3 == 
---1111-; a differential from the time of the external snapshot creation 
uses active/bitmap3 == ---1-11-.

So, the question is what happens if we now want to commit active back 
into backing.  If we leave bitmap2 enabled while the commit happens, we 
end up with:

backing:   AABDCDD-
bitmap1:   --11---- (disabled)
bitmap2:   ---1111- (enabled)

You can no longer recreate the point in time where the external snapshot 
was created (the commit lost checkpoint3), but bitmap2 is still viable 
for all changes needed for an incremental backup of changes made after 
bitmap1 was disabled and bitmap2 created.

Conversely, if we disable bitmap2 prior to the commit, then you have the 
option of creating a new bitmap3 prior to the commit to end up with:

backing:   AABDCDD-
bitmap1:   --11---- (disabled)
bitmap2:   ----11-- (disabled)
bitmap3:   ---1-11- (enabled)

such that you now have a point-in-time where you can then do a 
differential backup from all changes after the external snapshot was 
created (even though it has now been committed). But your resulting 
backups from any of the checkpoints see the same cumulative bitmaps as 
before.

And what happens if you accidentally leave bitmap2 enabled during the 
commit?

backing:   AABDCDD-
bitmap1:   --11---- (disabled)
bitmap2:   ---1111- (disabled)
bitmap3:   ---1-11- (enabled)

You have recorded more bits than strictly necessary in bitmap2, but as 
bitmap2 is unusable on its own (a differential bitmap HAS to combine 
bitmap2 with bitmap3, whether you are doing the differential from 
checkpoint1 or checkoint2), and all of those same bits are recorded in 
bitamp3, the resulting cumulative bitmap is unchanged.

Thus, my question is whether this complexity is necessary because it 
buys us some safety (is there some failure path I'm overlooking, where 
keeping bitmap2 unchanged even though backing is partially modified 
before the failure of a half-completed commit is essential?), but I can 
still go ahead and review the patch on the assumption that you have a 
reason for not allowing a stranded-enabled bitmap after a snapshot 
operation to not be modified during commit.

> 
> Add a field and status XML handling code as well as a test.
> 
> Signed-off-by: Peter Krempa <pkrempa at redhat.com>
> ---
>   src/qemu/qemu_blockjob.h                      |  2 ++
>   src/qemu/qemu_domain.c                        | 26 +++++++++++++++++++
>   .../blockjob-blockdev-in.xml                  |  4 +++
>   3 files changed, 32 insertions(+)
> 

At any rate, I'm not seeing any functional problems with the added code, 
even if I still have a question on the technical need.

Reviewed-by: Eric Blake <eblake at redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org