[libvirt] overlayfs support to lxc driver

Wed Mar 4 11:20:31 UTC 2015

On Mon, Mar 02, 2015 at 07:45:18PM +0400, Vasiliy Tolstov wrote:
> 2015-03-02 18:22 GMT+03:00 Vasiliy Tolstov <v.tolstov at selfip.ru>:
> > So as i understand i need to add overlayfs like
> > virStorageBackendFileSystem for example virStorageBackendOvlFileSystem
> > but i don't understand how can this pool be used in case of many containers.
> > May be i misunderstand something?
> 
> 
> Or you mean to add .conf/storage_conf.c overlay type fs and add logic
> to processing this. And for such pools i can create xml like:
> <pool type="fs">
>   <name>overlay</name>
>   <source>
>     <dir path="/var/lib/libvirt/filesystem"/>
>     <format type='overlay'/>
>   </source>
>   <target>
>     <path>/var/lib/libvirt/overlay</path>
>   </target>
> </pool>

Ok, I've read a little bit more about overlayfs and see that my
suggestion was incorrect, so ignore everything I said originally.

In particular, I was mistakenly coupling the action of mounting/unmounting
the overlay with the action of creating/deleting the directory containing
the overlay. These are two completely independent actions and so don't
need to be coupled in any way. Only the creating/deleting directory bit
makes sense in the context of storage pools, and that code would not in
fact be at all specific to overlayfs. So let's ignore that and just
consider the mount/unmount part of the problem. This we can do just in
the context of the LXC guest XML config.

Currently LXC supports the following interesting filesystem types:

 1. External bind mounts - mount dir in host to dir in container

    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/path/on/host'/>
      <target dir='/path/in/container'/>
    </filesystem>

 2. Internal bind mounts - mount dir in container to other dir in container

    <filesystem type='bind' accessmode='passthrough'>
      <source dir='/path/in/container'/>
      <target dir='/other/path/in/container'/>
    </filesystem>

 3. Block mounts - mount block device in host to dir in container

    <filesystem type='block' accessmode='passthrough'>
      <source dev='/dev/sdb1'/>
      <target dir='/path/in/container'/>
    </filesystem>

 4. File mounts - mount image file in host to dir in container

    <filesystem type='file' accessmode='passthrough'>
      <driver name='nbd' type='qcow2'/>
      <source file='/file/on/host/guest.img'/>
      <target dir='/path/in/container'/>
    </filesystem>

According to the kernel docs at:

  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/overlayfs.txt

for overlayfs to be used we have to represent 4 pieces of
info in total:

 1. one or more lower directory paths
 2. one upper directory path
 3. one work directory path (which must be empty)
 4. the target mount directory

  mount -t overlay overlay -olowerdir=/lower1:/lower2:/lower3,\
        upperdir=/upper,workdir=/work /merged

The fourth item obviously maps to the <target> element in the XML,
which leaves us 3 extra directory parameters to be added to the
XML in some manner.
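
To make the mapping concrete, here is a minimal Python sketch of how the
four pieces combine into the mount arguments. The helper name
overlay_mount_argv is hypothetical (it is not anything in libvirt), and it
only builds the argument list rather than running mount. It also assumes
the paths contain no ':' or ',' characters, since it does no escaping.

```python
def overlay_mount_argv(lowers, upper, work, target):
    """Build the argv for an overlay mount (hypothetical helper)."""
    if not lowers:
        raise ValueError("at least one lower directory is required")
    # Lower dirs are colon separated; the leftmost entry is the topmost layer.
    opts = "lowerdir={},upperdir={},workdir={}".format(
        ":".join(lowers), upper, work)
    return ["mount", "-t", "overlay", "overlay", "-o", opts, target]


argv = overlay_mount_argv(["/lower1", "/lower2", "/lower3"],
                          "/upper", "/work", "/merged")
print(" ".join(argv))
```

Only the lower dirs, upper dir and work dir end up in the option string;
the target stays a separate positional argument, which is why it maps so
naturally onto the existing <target> element.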

For added fun^H^H^Hpain, any of those lower dirs can themselves
be overlayfs merged mount locations. IOW we have to consider the
possibility of arbitrarily deep nesting too.

We could try to fit in extra parameters in each of the existing
filesystem types to support setting up of an overlay. Or we could
just define a new overlayfs type, or we could leverage the <driver>
type field somehow.

For disks, we have introduced the idea of nested backing stores.
So a qcow2 file, pointing to a qcow2 file pointing to an LVM
device would end up looking like:

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/domain.qcow'/>
      <backingStore type='file'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/snapshot.qcow'/>
        <backingStore type='block'>
          <format type='raw'/>
          <source dev='/dev/mapper/base'/>
          <backingStore/>
        </backingStore>
      </backingStore>
      <target dev='vdd' bus='virtio'/>
    </disk>

I think it could be appropriate to follow that kind of approach
here too, which would let us in fact deal with the various different
types of mounts we have at each level.

On balance I think we probably want to introduce a new top level
filesystem type for overlayfs. This would have 2 source dirs:
the first would be the upper dir, the second would be the work
dir. There would be backing stores for the lower dirs.

So for the simplest operation:

 # mount -t overlay overlay -olowerdir=/lower1:/lower2:/lower3,\
        upperdir=/upper,workdir=/work /merged

    <filesystem type='overlay' accessmode='passthrough'>
      <source dir='/upper'/>
      <source dir='/work'/>
      <backingStore type='dir'>
        <target dir='/lower1'/>
      </backingStore>
      <backingStore type='dir'>
        <target dir='/lower2'/>
      </backingStore>
      <backingStore type='dir'>
        <target dir='/lower3'/>
      </backingStore>
      <target dir='/merged'/>
    </filesystem>

That would be a flexible starting point for a first implementation.
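
As a rough illustration of how a driver might consume this XML, here is a
Python sketch that turns the proposed element into the mount option
string. It is only a sketch under the assumptions of this proposal: the
first <source> is taken as the upper dir and the second as the work dir,
and the function name parse_overlay is made up for the example.

```python
import xml.etree.ElementTree as ET

XML = """
<filesystem type='overlay' accessmode='passthrough'>
  <source dir='/upper'/>
  <source dir='/work'/>
  <backingStore type='dir'>
    <target dir='/lower1'/>
  </backingStore>
  <backingStore type='dir'>
    <target dir='/lower2'/>
  </backingStore>
  <backingStore type='dir'>
    <target dir='/lower3'/>
  </backingStore>
  <target dir='/merged'/>
</filesystem>
"""


def parse_overlay(fs):
    """Return (lowers, upper, work, target) for a proposed overlay element."""
    # findall()/find() with a bare tag name only look at direct children,
    # so the <target> inside each <backingStore> is not confused with the
    # top level <target>.
    sources = [s.get("dir") for s in fs.findall("source")]
    if len(sources) != 2:
        raise ValueError("expected exactly two <source> elements")
    upper, work = sources  # assumed order: upper dir first, work dir second
    lowers = [b.find("target").get("dir") for b in fs.findall("backingStore")]
    target = fs.find("target").get("dir")
    return lowers, upper, work, target


lowers, upper, work, target = parse_overlay(ET.fromstring(XML))
opts = "lowerdir={},upperdir={},workdir={}".format(":".join(lowers), upper, work)
print("mount -t overlay overlay -o{} {}".format(opts, target))
```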

If we later want to get more adventurous, we could consider adding
further kinds of backing store. For example, if we want nested
overlays:

 # mount -t overlay overlay -olowerdir=/backing/lower1:/backing/lower2,\
        upperdir=/backing/upper,workdir=/backing/work /backing/merged
 # mount -t overlay overlay -olowerdir=/backing/merged:/lower2:/lower3,\
        upperdir=/upper,workdir=/work /merged

    <filesystem type='overlay' accessmode='passthrough'>
      <source dir='/upper'/>
      <source dir='/work'/>
      <backingStore type='overlay'>
        <source dir='/backing/upper'/>
        <source dir='/backing/work'/>
        <backingStore type='dir'>
          <target dir='/backing/lower1'/>
        </backingStore>
        <backingStore type='dir'>
          <target dir='/backing/lower2'/>
        </backingStore>
        <target dir='/backing/merged'/>
      </backingStore>
      <backingStore type='dir'>
        <target dir='/lower2'/>
      </backingStore>
      <backingStore type='dir'>
        <target dir='/lower3'/>
      </backingStore>
      <target dir='/merged'/>
    </filesystem>
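
The nesting implies an ordering constraint: an inner overlay has to be
mounted before the overlay that uses it as a lower dir. A depth-first walk
gives that order for free. The Python sketch below is purely illustrative;
plan_mounts is a hypothetical name, and it only builds the list of
commands without running anything.

```python
import xml.etree.ElementTree as ET

XML = """
<filesystem type='overlay' accessmode='passthrough'>
  <source dir='/upper'/>
  <source dir='/work'/>
  <backingStore type='overlay'>
    <source dir='/backing/upper'/>
    <source dir='/backing/work'/>
    <backingStore type='dir'>
      <target dir='/backing/lower1'/>
    </backingStore>
    <backingStore type='dir'>
      <target dir='/backing/lower2'/>
    </backingStore>
    <target dir='/backing/merged'/>
  </backingStore>
  <backingStore type='dir'>
    <target dir='/lower2'/>
  </backingStore>
  <backingStore type='dir'>
    <target dir='/lower3'/>
  </backingStore>
  <target dir='/merged'/>
</filesystem>
"""


def plan_mounts(elem, plan):
    """Append the mounts needed for elem to plan, innermost first.

    Returns the mount point of elem so the caller can use it as a lower dir.
    """
    lowers = []
    for store in elem.findall("backingStore"):
        if store.get("type") == "overlay":
            # Recurse first, so the nested overlay is planned before its parent.
            lowers.append(plan_mounts(store, plan))
        else:
            lowers.append(store.find("target").get("dir"))
    upper, work = [s.get("dir") for s in elem.findall("source")]
    target = elem.find("target").get("dir")
    opts = "lowerdir={},upperdir={},workdir={}".format(
        ":".join(lowers), upper, work)
    plan.append(["mount", "-t", "overlay", "overlay", "-o", opts, target])
    return target


plan = []
plan_mounts(ET.fromstring(XML), plan)
for argv in plan:
    print(" ".join(argv))
```

Unmounting at container shutdown would presumably walk the same plan in
reverse.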

And if we want to get really adventurous and use a qcow2 image as a
backing store:

 # qemu-nbd -c /dev/nbd0 /guest.img
 # mount /dev/nbd0 /lower1
 # mount -t overlay overlay -olowerdir=/lower1,\
        upperdir=/upper,workdir=/work /merged

    <filesystem type='overlay' accessmode='passthrough'>
      <source dir='/upper'/>
      <source dir='/work'/>
      <backingStore type='file'>
        <driver name='nbd' type='qcow2'/>
        <source file='/guest.img'/>
        <target dir='/lower1'/>
      </backingStore>
      <target dir='/merged'/>
    </filesystem>
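
For the image-backed case the extra work is attaching the image and
mounting it on the lower dir before the overlay itself is mounted. A rough
Python sketch of that sequencing follows. The function name and the
nbd_devices parameter are assumptions for illustration only: a real
implementation would need to find free NBD devices itself rather than be
handed a fixed list.

```python
import xml.etree.ElementTree as ET

XML = """
<filesystem type='overlay' accessmode='passthrough'>
  <source dir='/upper'/>
  <source dir='/work'/>
  <backingStore type='file'>
    <driver name='nbd' type='qcow2'/>
    <source file='/guest.img'/>
    <target dir='/lower1'/>
  </backingStore>
  <target dir='/merged'/>
</filesystem>
"""


def plan_file_backed_overlay(fs, nbd_devices=("/dev/nbd0", "/dev/nbd1")):
    """Return the ordered commands for an overlay with image-backed lowers.

    nbd_devices is a hypothetical pool of free devices, one per image.
    """
    devices = iter(nbd_devices)
    plan, lowers = [], []
    for store in fs.findall("backingStore"):
        lower = store.find("target").get("dir")
        if store.get("type") == "file":
            image = store.find("source").get("file")
            fmt = store.find("driver").get("type")
            device = next(devices)
            # Attach the image to an NBD device, then mount it on the lower dir.
            plan.append(["qemu-nbd", "-f", fmt, "-c", device, image])
            plan.append(["mount", device, lower])
        lowers.append(lower)
    upper, work = [s.get("dir") for s in fs.findall("source")]
    target = fs.find("target").get("dir")
    opts = "lowerdir={},upperdir={},workdir={}".format(
        ":".join(lowers), upper, work)
    plan.append(["mount", "-t", "overlay", "overlay", "-o", opts, target])
    return plan


plan = plan_file_backed_overlay(ET.fromstring(XML))
for argv in plan:
    print(" ".join(argv))
```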

It is perfectly fine if you just implement the first simple case, with
just a single overlayfs and no nesting. The key is just that we get
this XML design right to allow the future extensions I describe.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|