RFC: Qemu backup interface plans

Fri May 21 22:05:40 UTC 2021

17.05.2021 15:07, Vladimir Sementsov-Ogievskiy wrote:
> 3.1 and make it possible to modify the bitmap externally, so that consumer of fleecing can say to backup-top filter: I've already copied these blocks, don't bother with copying them to temp image". This is to solve [2].
> 
> Still, how consumer of fleecing will reset shared bitmap after copying blocks? I have the following idea: we make a "discard-bitmap-filter" filter driver, that own some bitmap and on discard request unset corresponding bits. Also, on read, if read from the region with unset bits the EINVAL returned immediately. This way both consumers (backup job and NBD client) are able to use this interface:
> 
> Backup job can simply call discard on source, we can add an option for this.
> External backup tool will send TRIM request after reading some region. This way disk space will be freed and no extra copy-before-write operations will be done. I also have a side idea that we can implement READ_ONCE flag, so that READ and TRIM can be done in one NBD command. But this works only for clients that don't want to implement any kind of retrying.
> 
> 
> 
> So, finally, how will it look (here I call backup-top with a new name, and "file" child is used instead of "backing", as this change I propose in "[PATCH 00/21] block: publish backup-top filter"):
> 
> Pull backup:
> 
> ┌────────────────────────────────────┐
> │             NBD export             │
> └────────────────────────────────────┘
>    │
>    │
> ┌────────────────────────────────────┐  file   ┌───────────────────────────────────────┐  backing   ┌─────────────┐
> │ discard-bitmap filter (bitmap=bm0) │ ──────▶ │            temp qcow2 img             │ ─────────▶ │ active disk │
> └────────────────────────────────────┘         └───────────────────────────────────────┘            └─────────────┘
>                                                   ▲                                                    ▲
>                                                   │ target                                             │
>                                                   │                                                    │
> ┌────────────────────────────────────┐         ┌───────────────────────────────────────┐  file        │
> │             guest blk              │ ──────▶ │ copy-before-write filter (bitmap=bm0) │ ─────────────┘
> └────────────────────────────────────┘         └───────────────────────────────────────┘

Now I think that discard-bitmap is not really good idea:

1. If it is separate of internal BlockCopyState::copy_bitmap, than we'll have to consider both bitmaps inside block-copy code, it's not a small complication.

2. Using BlockCopyState::copy_bitmap directly seems possible:

User creates copy_bitmap by hand, and pass to copy-before-write filter as an option, block-copy will use this bitmap directly

Same bitmap is passed to discard-bitmap filter, so that it can clear bits on discards.

Pros:
  - we don't make block-copy code more complicated

Note then, that BlockCopyState::copy_bitmap is used in block-copy process as follows:

  - copy tasks are created to copy dirty areas
  - when task is created, corresponding dirty area is cleared (so that creating tasks on same area can't happen)
  - if task failed, corresponding area becomes dirty again (so that it can be retried)

So, we have
Cons:
  - if discard comes when task is in-flight, bits are already clean. Then if task failed bits become dirty again. That's wrong. Not very bad, and not worth doing coplications of [1]. But still, keep it in mind [*]
  - we have to use separate bitmap in discard-filter to fail reads from non-dirty areas (because BlockCopyState::copy_bitmap is cleared by block-copy process, not only by discards). That is worse.. It just means that discard-filter is strange thing and we don't have good understanding of what should it do. Good documentation will help of course, but...

...this all leads me to new idea:

Let's go without discard-bitmap filter with the simple logic:

If discard is done on block-copy target, let's inform block-copy about it. So, block-copy can:

  - clear corresponding area in its internal bitmap
  - [not really improtant, but it addresses [*]]: cancel in-flight tasks in discarded area.

Pros: avoid extra filter

-- 
Best regards,
Vladimir