[RFC PATCH 0/7] To make <transient/> disk sharable
pkrempa at redhat.com
Mon Jan 25 10:11:43 UTC 2021
On Mon, Jan 25, 2021 at 10:04:41 +0000, Daniel Berrange wrote:
> On Mon, Jan 25, 2021 at 10:51:09AM +0100, Peter Krempa wrote:
> > On Mon, Jan 25, 2021 at 09:33:31 +0000, Daniel Berrange wrote:
> > > On Sat, Jan 23, 2021 at 11:54:51AM +0100, Peter Krempa wrote:
> > > > On Fri, Jan 22, 2021 at 20:10:59 -0500, Masayoshi Mizuma wrote:
> > > > > From: Masayoshi Mizuma <m.mizuma at jp.fujitsu.com>
> > [...]
> > > > Please note that this is a high level review. I've spotted some coding
> > > > style inconsistencies and such, but since this will need significant
> > > > rework I'll not point them out.
> > > >
> > > > Also don't forget to add test cases, where it will be visible that the
> > > > disk (neither frontend nor backend) is added on the commandline.
> > > >
> > > > Also it should be fairly trivial to support it for SCSI disks since they
> > > > support hotplug too.
> > >
> > > Is there a reason we can't use the existing <readonly/> element as a way
> > > to make the underlying base image be treated as read only and thus become
> > > sharable ?
> > The problem is that the qemu device frontend code infers the readonly
> > state from the backend image specification when it's instantiated.
> > Thus if you start qemu with the topmost layer marked as read-only so
> > that the write lock doesn't get asserted, this fact gets stored in the
> > frontend. If you then later insert a writable layer into the chain on
> > top of it, the frontend stays read-only.
> > IDE/SATA disks are outhright refused to be read-only by qemu, virtio
> > disks remember that they are read-only and keep it into guest execution
> > and SCSI disks caused an abort() (which was recently fixed in qemu
> > upstream.)
> > Thus the only two options are:
> > 1) hotplug the frontend after we do the shenanigans with the image
> > 2) implement explicit control of the read-only state of the frontend in
> > qemu in the first place, adapt to it in libvirt and then keep using
> > the current approach in libvirt
> I must be missing something here, because the topmost layer with
> <transient> isn't read-only - it would be writable as that's the
> whole point of the throwaway transient overlay file. It just feels
> to me that the situation desired is already one that is supported
> and working when explicitly top+base are specified in the XML.
> should be treated as identical to the first. During domain
> startup preparation, we would effectvely transform the latter
> into the former, with a random tempfile for top.qcow. This
Your assumptions are right, but the problem is how it's done:
The temporary file is created with 'blockdev-create' during the
initialization of the VM. That means that the chain starts with all the
existing layers, and we then add the throwaway layer before starting
execution of the guest once qemu is already started.
Unfortunately that's after the frontend is initialized (and thus
remembers the readonly state) and after the storage is already open
(thus it already asserted-or wanted to assert the write lock).
'blockdev-create' is used to prevent another invocation of qemu-img
More information about the libvir-list