[RFC PATCH 0/7] To make <transient/> disk sharable

Mon Jan 25 10:13:51 UTC 2021

On Mon, Jan 25, 2021 at 11:11:43AM +0100, Peter Krempa wrote:
> On Mon, Jan 25, 2021 at 10:04:41 +0000, Daniel Berrange wrote:
> > On Mon, Jan 25, 2021 at 10:51:09AM +0100, Peter Krempa wrote:
> > > On Mon, Jan 25, 2021 at 09:33:31 +0000, Daniel Berrange wrote:
> > > > On Sat, Jan 23, 2021 at 11:54:51AM +0100, Peter Krempa wrote:
> > > > > On Fri, Jan 22, 2021 at 20:10:59 -0500, Masayoshi Mizuma wrote:
> > > > > > From: Masayoshi Mizuma <m.mizuma at jp.fujitsu.com>
> > > 
> > > [...]
> > > 
> > > > > Please note that this is a high level review. I've spotted some coding
> > > > > style inconsistencies and such, but since this will need significant
> > > > > rework I'll not point them out.
> > > > > 
> > > > > Also don't forget to add test cases, where it will be visible that the
> > > > > disk (neither frontend nor backend) is added on the commandline.
> > > > > 
> > > > > Also it should be fairly trivial to support it for SCSI disks since they
> > > > > support hotplug too.
> > > > 
> > > > Is there a reason we can't use the existing  <readonly/> element as a way
> > > > to make the underlying base image be treated as read only and thus become
> > > > sharable ?
> > > 
> > > The problem is that the qemu device frontend code infers the readonly
> > > state from the backend image specification when it's instantiated.
> > > 
> > > Thus if you start qemu with the topmost layer marked as read-only so
> > > that the write lock doesn't get asserted, this fact gets stored in the
> > > frontend. If you then later insert a writable layer into the chain on
> > > top of it, the frontend stays read-only.
> > > 
> > > IDE/SATA disks are outhright refused to be read-only by qemu, virtio
> > > disks remember that they are read-only and keep it into guest execution
> > > and SCSI disks caused an abort() (which was recently fixed in qemu
> > > upstream.)
> > > 
> > > Thus the only two options are:
> > > 
> > > 1) hotplug the frontend after we do the shenanigans with the image
> > > 2) implement explicit control of the read-only state of the frontend in 
> > >    qemu in the first place, adapt to it in libvirt and then keep using
> > >    the current approach in libvirt
> > 
> > I must be missing something here, because the topmost layer with
> > <transient> isn't read-only - it would be writable as that's the
> > whole point of the throwaway transient overlay file. It just feels
> > to me that the situation desired is already one that is supported
> > and working when explicitly top+base are specified in the XML.
> 
> [...]
> 
> > should be treated as identical to the first. During domain
> > startup preparation, we would effectvely transform the latter 
> > into the former, with a random tempfile for top.qcow. This
> 
> Your assumptions are right, but the problem is how it's done:
> 
> The temporary file is created with 'blockdev-create' during the
> initialization of the VM. That means that the chain starts with all the
> existing layers, and we then add the throwaway layer before starting
> execution of the guest once qemu is already started.
> 
> Unfortunately that's after the frontend is initialized (and thus
> remembers the readonly state) and after the storage is already open
> (thus it already asserted-or wanted to assert the write lock).
> 
> 'blockdev-create' is used to prevent another invocation of qemu-img

Wouldn't it make life simpler to just use qemu-img and avoid these
problems in the first place ? 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|