[et-mgmt-tools] VM images

David Lutterkort dlutter at redhat.com
Tue Jun 12 18:19:14 UTC 2007

On Tue, 2007-06-12 at 16:21 +0100, Daniel P. Berrange wrote:
> On Mon, Jun 11, 2007 at 12:06:15PM -0700, David Lutterkort wrote:
> > Look at it from the user's point of view: Are you interested in a 'app
> > foo appliance' or in a 'app foo for paravirt Xen w/ PAE support'
> > appliance ? If you want to keep users from appreciating the finer points
> > of this, you'll need to capture it in metadata. And you'll either make
> > of this, you'll need to capture it in metadata. And you'll either make
> > the distinction about what to run when you download (one image per host
> > 'type') or when you are about to run the image (one image for all host
> > types). Should users really d/l a few 100 MB of stuff to discover that
> > they should have clicked on the 'Xen pv w/ PAE' link rather than the
> > 'Xen pv w/o PAE' link ?
> So this really raises the question of exactly what problem we're trying
> to address here with the image tools. Are we 
>   a) Trying to come up with a way to build & describe a single image
>      which can be deployed to Xen PV, Xen HVM, KVM, VMWare + VirtualBox
>   b) Merely trying to come up with a consistent packaging & metadata
>      scheme, allowing images to be built to target specific virt
>      platforms
> a) is really the superset of b), but I'm not sure how far we'll be able
> to get with a) given the differences in the way all these different 
> virt platforms work & the types of hardware they expose. 

I would rephrase (a) as "How can we hide as many of the virt
technology-specific details of running a VM image from the user as
possible?" ... ideally, users would only need to know that they are
running on a libvirt host, and get the one image for it.

> I can't help thinking it's gonna take more than just a different boot kernel file
> and initrd to be able to provide an image that works seamlessly on
> all different virt platforms.

Switching out one of the disks for pv vs. hvm lets you do more than
that, and calling that disk /boot is probably misleading. 

It's a little bit like stateless, where almost all files come from one
image, but a few select files are bind mounted into that image (for
stateless, to make them writable; for virt images, to gloss over the
differences between pv and fv, or over other subtle differences in the
future, like whether pv drivers are available or not).

> > And it seems a tad less yucky than having the appliance boot hvm, ask an
> > XML-RPC server if pv is ok, and then booting pv.
> I wouldn't suggest that as an approach either. 

Nobody should ;)

> First of all we need to
> realize that this whole problem space is basically just caused by the
> Xen paravirt boot process

What about things like PAE vs. non-PAE ? Outside of pv Xen, can you
always assume that your host can support both kinds of guests ?

>  - I don't like the idea of creating an overly
> complex metadata format to deal with the dumbassness of a particular
> HV, particularly if there is active work to address this weakness of
> Xen.

To be clear: the complexity that would go away is (a) having multiple
<boot> elements instead of just one, and possibly (b) separating the
description of the disks themselves (the <storage> section) from the
description of how they are mapped into the guest. I actually like the
latter split, since some things are attributes of the disks, and others
are attributes of how those disks are mapped into the guest.
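For illustration, the split under discussion might look roughly like
this in the image metadata (element and attribute names are a sketch
based on this thread, not a final format):

```xml
<image>
  <domain>
    <!-- one <boot> element per virt "type" the image supports -->
    <boot type="xen">
      <os><loader>pygrub</loader></os>
      <!-- how a disk is mapped into this kind of guest -->
      <drive disk="root.img" target="xvda"/>
    </boot>
    <boot type="hvm">
      <drive disk="root.img" target="hda"/>
    </boot>
  </domain>
  <!-- attributes of the disks themselves, independent of any mapping -->
  <storage>
    <disk file="root.img" format="raw" use="system"/>
  </storage>
</image>
```

Collapsing to a single <boot> element would remove complexity (a);
folding the <drive> mapping into <storage> would remove (b).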

The code that checks whether an image can be run on a given host is
valuable in itself, as it allows much better UIs than 'try to run it
and see what happens' - unless we can always assume that any image can
be run on any virtual host.

> Two alternatives I can think of. Basically assume a single filesystem
> image, which contains /boot as part of the main FS. 
>   - When building the guest install both kernel & kernel-xen inside
>     the image. The /etc/grub.conf should contain the entries only
>     for the baremetal kernel. The paravirt kernel lines should be 
>     in a separate /etc/pygrub.conf. Now adapt pygrub so that it
>     first looks for an /etc/pygrub.conf in the guest image. That way
>     if you boot under HVM, it automatically gets a baremetal kernel
>     and if you boot PV, it automatically gets a Xen kernel. No separate
>     /boot required.
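Concretely, that proposal would amount to something like the following
(kernel versions and paths are only illustrative, and /etc/pygrub.conf
is the proposed new file, not something pygrub looks at today):

```
# /etc/grub.conf - only the baremetal kernel; used for HVM boot
default=0
timeout=5
title Fedora (2.6.20-2925.fc7)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.20-2925.fc7 ro root=LABEL=/
        initrd /boot/initrd-2.6.20-2925.fc7.img

# /etc/pygrub.conf - the paravirt kernel; a modified pygrub would
# look for this file first when booting PV
default=0
timeout=5
title Fedora Xen (2.6.20-2925.fc7xen)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.20-2925.fc7xen ro root=LABEL=/
        initrd /boot/initrd-2.6.20-2925.fc7xen.img
```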

How do you deal with xvc0 ? You want an entry for it in /etc/inittab for
pv Xen, but don't want it for hvm. Similarly, you mentioned that there
may be driver/config differences ... if they can all be autodetected,
that would of course be ideal.
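For reference, the pv console entry in question looks roughly like this
in /etc/inittab (the exact agetty arguments vary by distro):

```
# wanted under pv Xen, where the guest console is xvc0 ...
co:2345:respawn:/sbin/agetty xvc0 9600 vt100-nav
# ... but under hvm there is no xvc0, so init keeps respawning
# agetty against a nonexistent device
```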

>   - When building the guest install only provide a HVM kernel. In the
>     bundle containing the image & metadata, provide a separate Xen
>     kernel & initrd outside the context of the main filesystem. Boot
>     these directly instead of using pygrub.

Although the image metadata allows it right now, I am not so fond of
having the kernel/initrd outside of the image, since you won't pick up
updates that are done inside the image.

> > > Is this even possible under Linux in the general case ?
> > 
> > The first thing you run into when you have this setup is that inittab
> > has an entry for xvc0 for pv, but that that entry gets in the way for
> > fv. You can get around that by symlinking/bindmounting /etc/inittab into
> > the /boot disk (which should probably not be called /boot disk anymore)
> Or running kudzu on every boot, so it detects & re-writes inittab
> according to what console/serial lines are available. Actually I
> think it should do this already.

kudzu runs only after init has already read inittab.

> > >  eg different Xorg configs - Cirrus vs fbdev, 
> > 
> > No idea; though I think not having to maintain n root images for n host
> > types would be pretty attractive.
> > 
> > > different filesystem names in /etc/fstab ? 
> > 
> > Mount-by-label or symlink/bindmount
> > 
> > > Different kernel modules listed in /etc/sysconfig/
> > >  3. Yuk !
> > >  4. How do you generate the filesystems? If I'm using the Fedora live CD
> > >     creation tools to do a chroot based install and image generation, I
> > >     can choose to install kernel, or kernel-xen, or both. So I'll end up
> > >     with a /boot containing a HVM suitable kernel, or a paravirt kernel.
> > 
> > I used a full virt-install.
> That wasn't quite what I was getting at - virt-install will build you an
> image which does HVM or PV. It won't spit out a separate 'boot' image
> for each option. The best you could do was have your kickstart script do
> an install of both kernel & kernel-xen inside the guest. That would give
> you an image with both types of kernel available in /boot, but that's still
> a single filesystem image you now have to take a chainsaw to, to create 
> your boot-hvm.img and boot-pv.img. 

Pretty much what I did.

> Now depending on which you boot, your guest will fail an RPM verify on
> either kernel, or kernel-xen because you had to rip it apart.

No; I copied the one boot disk, mounted it and edited /etc/grub.conf.
rpm won't trip over it. To reduce the overall image size, you could
remove the unused kernel/initrd from the boot disk, but, as you said, at
the price of surprising rpm.

> > >     Most image generation tools won't even generate a separate image file
> > >     for /boot, just having one main rootfs.
> > 
> > We'd need that ability anyway, e.g. to separate system from data disks
> > for image-based updates.
> Data disks are a little easier to deal with - you simply splice in a data
> disk at the appropriate part of the FS tree. The /boot thing is nasty because
> you're taking a single directory & trying to munge it into 2 different 
> disks.

I was talking about image generation tools creating more than one image.

> We definitely need to figure out how to do this, because a single unified kernel
> is how Xen will be merged into LKML. Likewise for LGuest, or VMWare paravirtops.
> Separate kernels are going the way of the dodo.

When will they get there ?

> > > If people really want the ability to provide images in short term which do 
> > > both HVM and paravirt, then effort is better focused on making it easy for
> > > the image creation tools to spit out two separate sets of image.
> > 
> > Will unified kernels really do away with all the differences between hv
> > and pv from the guest's POV ?
> Not immediately, but over time the core differences in the base kernel will
> be irrelevant. So you'll just be left with the issue of dynamically determining
> what hardware you're running on - which is a problem which already exists in
> HVM when you consider  Xen vs KVM vs VMware vs VirtualBox.

BTW, nobody forces an image builder to do this splitting of boot; think
of it as compression: we put one root fs into the image that the user
downloads instead of two nearly identical ones.

What I really care about is reducing the number of permutations for an
appliance that the user needs to understand; and I'd rather have
something that works, even if it's kludgy, today. If those kludges go
away, even better.

