[libvirt] [PATCH] vz: set mount point for container image-based disks

Mon Sep 21 09:23:37 UTC 2015

On Mon, Sep 21, 2015 at 12:14:57PM +0300, Maxim Nestratov wrote:
> 21.09.2015 11:44, Daniel P. Berrange wrote:
> >On Sun, Sep 20, 2015 at 10:17:51PM +0300, Maxim Nestratov wrote:
> >>From: Maxim Nestratov <mnestratov at virtuozzo.com>
> >>
> >>In order to support not only root disks with type=file for containers,
> >>we need to specify mount points for them.
> >>For instance, if a secondary disk is added by the following record in
> >>xml:
> >>
> >>     <disk type='file' device='disk'>
> >>       <driver type='ploop' cache='writeback'/>
> >>       <source file='/vz/some_path_to_image_dir'/>
> >>       <target bus='sata' dev='sdb'/>
> >>     </disk>
> >>
> >>we are going to add it to the container, mounted at the '/mnt/sdb' path.
> >That's not what the <disk> element is for.  <disk> is about exposing
> >block device nodes to the container. It shouldn't try to do anything
> >with these device nodes. They might be used as raw data storage by
> >an application, so we can't assume they should be mounted.
> >
> >If you want to mount things then you should be using <filesystem>
> >instead.
> >
> Hm. It actually means that any disks with type file shouldn't work in
> containers. Right?

No, the disk source on the host is not correlated to how it is
exposed to the guest.

With <disk type="file"> you have to set up some kind of loop
device in the host OS, and then expose that block device to the
guest. The same applies to <disk type="network">: you have to
use the host kernel or qemu-nbd, for example, to set up a block
device that you can then expose to the guest.
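As a rough sketch of the end result, the guest ends up seeing the
same thing it would with a block-backed disk such as the one below.
The loop device name /dev/loop0 is purely an assumption for
illustration; how a driver actually attaches the image is its own
business.

     <disk type='block' device='disk'>
       <!-- /dev/loop0 is a hypothetical host loop device -->
       <source dev='/dev/loop0'/>
       <target bus='sata' dev='sdb'/>
     </disk>

Either way the guest only gets a raw block device node, never a
mount.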

> And working root disks like this is a mistake? But why?

Yes, that would be a mistake too. The root filesystem should
be exposed using <filesystem> with a <target> of /.

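A minimal sketch of what that could look like, reusing the image
path from your example. The path and the choice of type='file' are
assumptions here, and any <driver> sub-element the vz driver might
need is left out:

     <filesystem type='file'>
       <!-- image path borrowed from the patch description -->
       <source file='/vz/some_path_to_image_dir'/>
       <!-- mount point inside the container: the root -->
       <target dir='/'/>
     </filesystem>
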
> In vz, any images plugged into containers are also treated as disks. The
> only difference between 'filesystem' and 'disk' is whether we should mount
> it or not. That's all. While from the point of view of a container user it
> is just another storage. Why not just mount it automatically?

The <disk> element is intended to expose raw block devices to the guest.

The <filesystem> is intended to expose mounted volumes to the guest.

A <disk> can support both type=block and type=file as sources on the
host side, as well as others like type=network.

A <filesystem> can support both type=block and type=file as sources
on the host, as well as a few others.

If you make <disk> automatically mount volumes, then you've just
discarded the only semantic difference between <filesystem> and
<disk>, which is not only wrong but also pointless. If you want
to mount it, you should just use <filesystem>, and not try to
make <disk> do the same as <filesystem>.
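
For the secondary image in your commit message, a sketch of the
<filesystem> form might look like this. It assumes type='file' is
acceptable to the driver and simply moves the '/mnt/sdb' mount point
from an implicit convention into the XML:

     <filesystem type='file'>
       <source file='/vz/some_path_to_image_dir'/>
       <!-- the mount point is stated explicitly, not derived from 'sdb' -->
       <target dir='/mnt/sdb'/>
     </filesystem>

That way the mount point is chosen by whoever writes the XML rather
than being inferred from the target device name.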

It is key that <disk> does *not* try to interpret what to do with
the storage - it is entirely up to the guest to decide what to do
with it. For example, an Oracle database might decide to use the
block device directly as data storage. Or you might be running an
application that wants to manipulate the filesystem (e.g. a fsck
tool) inside the block dev, so again it would be inappropriate to
mount it.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|