[RFC PATCH 0/7] To make <transient/> disk sharable
pkrempa at redhat.com
Sat Jan 23 10:54:51 UTC 2021
On Fri, Jan 22, 2021 at 20:10:59 -0500, Masayoshi Mizuma wrote:
> From: Masayoshi Mizuma <m.mizuma at jp.fujitsu.com>
> This patch series is RFC to implement to make <transient/> disk sharable.
> Currently, <transient/> disk option for qemu uses blockdev-snapshot QMP
> command with overlay.
> In that case, qemu holds the write lock to the original disk, so we cannot
> share the original disks with the other qemu guests.
> This patch series tries to implement to make the disks, which have
> <transient/> disk option, sharable by using disk hot-plug.
> First, create the overlay disk with the original disk is set as the backingStore.
> Then, blockdev-del the StorageProgs and FormatProgs of the disk. That's
> because to fix the bootindex of the disk.
> Lastly, device_add the disks and CPUs start.
So I had a look at the proposed implementation and its too ugly and
Firstly one of the overcomplicated bits is the part which adds the disk
backend, then removes it and then adds it back. That's too much.
Secondly you are hotplugging ALL virtio disks even those which are not
transient, that won't work either. This must stay limited to <transient>
disk to minimize the impact.
Thirdly the code is also overcomplicated by the fact that you workaround
the fact that qemuDomainAttachDeviceDiskLiveInternal doesn't actually
support hotplug of transient disks. You then prepare everything
somewhere behind it's back and try to hotplug the disk as not-transient.
It would be way better if you support hotplug of transient disk in the
first place reusing the hotplug of the backend and adding image creating
for the frontend.
I also don't understand the issue with bootindex. If the problem is that
bootindex is calculated in qemuBuildDisksCommandLine and used directly
then you'll need to refactor the code in first place, pre-calculate the
indices and store them in the disk private data and then use them from
there, but you must not modify disk->info.bootIndex as that gets
represented back to the user when dumping XML.
The new code also the name of the VM as suffix to the temporary image,
but doesn't update the previous code which is used for disks which don't
support hotplug. That needs to be fixed too.
The fact that we are doing hotplug of the disk should also stay hidden,
so the code doing the hotplug should not actually emit the device-added
event and the whole thing should not need to update the device list at
all, but should be placed before the regular update of the device list.
In the end that means you shouldn't need 'TransientDiskSharable' private
data variable and shouldn do any modification of disk->src->readonly and
Please note that this is a high level review. I've spotted some coding
style inconsistencies and such, but since this will need significant
rework I'll not point them out.
Also don't forget to add test cases, where it will be visible that the
disk (neither frontend nor backend) is added on the commandline.
Also it should be fairly trivial to support it for SCSI disks since they
support hotplug too.
More information about the libvir-list