[RFC PATCH 0/7] To make <transient/> disk sharable

Masayoshi Mizuma msys.mizuma at gmail.com
Mon Jan 25 22:33:16 UTC 2021


On Sat, Jan 23, 2021 at 11:54:51AM +0100, Peter Krempa wrote:
> On Fri, Jan 22, 2021 at 20:10:59 -0500, Masayoshi Mizuma wrote:
> > From: Masayoshi Mizuma <m.mizuma at jp.fujitsu.com>
> > 
> > This patch series is RFC to implement to make <transient/> disk sharable.
> > 
> > Currently, <transient/> disk option for qemu uses blockdev-snapshot QMP
> > command with overlay.
> > In that case, qemu holds the write lock to the original disk, so we cannot
> > share the original disks with the other qemu guests.
> > 
> > This patch series tries to implement to make the disks, which have
> > <transient/> disk option, sharable by using disk hot-plug.
> > 
> > First, create the overlay disk with the original disk is set as the backingStore.
> > Then, blockdev-del the StorageProgs and FormatProgs of the disk. That's
> > because to fix the bootindex of the disk.
> > Lastly, device_add the disks and CPUs start.

Thank you for your review!

> 
> So I had a look at the proposed implementation and its too ugly and
> complicated.
> 
> Firstly one of the overcomplicated bits is the part which adds the disk
> backend, then removes it and then adds it back. That's too much.
> 
> Secondly you are hotplugging ALL virtio disks even those which are not
> transient, that won't work either. This must stay limited to <transient>
> disk to minimize the impact.

I agree that it's overkill to hotplugging all disks...
I'm hotplugging all virtio disks because I have experienced a boot order issue.
Case: the domain xml has two virtio disks, the one has <transient/> and the boot order
is 1, the other doesn't have <transient/> and the boot order is 2. In that case,
qemu tried to boot from the latter disk after the former disk is hot-added even though
the order is 2. The boot issue doesn't happen if the latter disk is also hot-added.

> 
> Thirdly the code is also overcomplicated by the fact that you workaround
> the fact that qemuDomainAttachDeviceDiskLiveInternal doesn't actually
> support hotplug of transient disks. You then prepare everything
> somewhere behind it's back and try to hotplug the disk as not-transient.
> 
> It would be way better if you support hotplug of transient disk in the
> first place reusing the hotplug of the backend and adding image creating
> for the frontend.

Make sense, thanks.

> 
> I also don't understand the issue with bootindex. If the problem is that
> bootindex is calculated in qemuBuildDisksCommandLine and used directly
> then you'll need to refactor the code in first place, pre-calculate the
> indices and store them in the disk private data and then use them from
> there, but you must not modify disk->info.bootIndex as that gets
> represented back to the user when dumping XML.
> 
> The new code also the name of the VM as suffix to the temporary image,
> but doesn't update the previous code which is used for disks which don't
> support hotplug. That needs to be fixed too.

Got it.

> 
> The fact that we are doing hotplug of the disk should also stay hidden,
> so the code doing the hotplug should not actually emit the device-added
> event and the whole thing should not need to update the device list at
> all, but should be placed before the regular update of the device list.
> 
> In the end that means you shouldn't need 'TransientDiskSharable' private
> data variable and shouldn do any modification of disk->src->readonly and
> such.

OK.

> 
> Please note that this is a high level review. I've spotted some coding
> style inconsistencies and such, but since this will need significant
> rework I'll not point them out.
> 
> Also don't forget to add test cases, where it will be visible that the
> disk (neither frontend nor backend) is added on the commandline.
> 
> Also it should be fairly trivial to support it for SCSI disks since they
> support hotplug too.

I suppose the patches will be following parts...

  1. Hot-add support for <transient/> disks
  2. Some changes to be sharable <transient/> disks 
  3. virtio and scsi specific changes to support sharable <transient/>
  4. Testing

Thanks!
Masa




More information about the libvir-list mailing list