[libvirt] rbd storage pool support for libvirt

Fri Nov 19 12:55:07 UTC 2010

On Fri, Nov 19, 2010 at 9:50 AM, Daniel P. Berrange <berrange at redhat.com> wrote:
> On Fri, Nov 19, 2010 at 09:27:40AM +0000, Stefan Hajnoczi wrote:
>> On Thu, Nov 18, 2010 at 5:13 PM, Sage Weil <sage at newdream.net> wrote:
>> > On Thu, 18 Nov 2010, Daniel P. Berrange wrote:
>> >> On Wed, Nov 17, 2010 at 04:33:07PM -0800, Josh Durgin wrote:
>> >> > Hi Daniel,
>> >> >
>> >> > On 11/08/2010 05:16 AM, Daniel P. Berrange wrote:
>> >> > >>>>In any case, before someone goes off and implements something, does this
>> >> > >>>>look like the right general approach to adding rbd support to libvirt?
>> >> > >>>
>> >> > >>>I think this looks reasonable. I'd be inclined to get the storage pool
>> >> > >>>stuff working with the kernel RBD driver & UDEV rules for stable path
>> >> > >>>names, since that avoids needing to make any changes to guest XML
>> >> > >>>format. Support for QEMU with the native librados CEPH driver could
>> >> > >>>be added as a second patch.
>> >> > >>
>> >> > >>Okay, that sounds reasonable.  Supporting the QEMU librados driver is
>> >> > >>definitely something we want to target, though, and seems to be the route that
>> >> > >>more users are interested in.  Is defining the XML syntax for a guest VM
>> >> > >>something we can discuss now as well?
>> >> > >>
>> >> > >>(BTW this is biting NBD users too.  Presumably the guest VM XML should
>> >> > >>look similar?)
>> >> > >
>> >> > >And also Sheepdog storage volumes. To define a syntax for all these we need
>> >> > >to determine what configuration metadata is required at a per-VM level for
>> >> > >each of them. Then try and decide how to represent that in the guest XML.
>> >> > >It looks like at a VM level we'd need a hostname, port number and a volume
>> >> > >name (or path).
>> >> >
>> >> > It looks like that's what Sheepdog needs from the patch that was
>> >> > submitted earlier today. For RBD, we would want to allow multiple hosts,
>> >> > and specify the pool and image name when the QEMU librados driver is
>> >> > used, e.g.:
>> >> >
>> >> >     <disk type="rbd" device="disk">
>> >> >       <driver name="qemu" type="raw" />
>> >> >       <source vdi="image_name" pool="pool_name">
>> >> >         <host name="mon1.example.org" port="6000" />
>> >> >         <host name="mon2.example.org" port="6000" />
>> >> >         <host name="mon3.example.org" port="6000" />
>> >> >       </source>
>> >> >       <target dev="vda" bus="virtio" />
>> >> >     </disk>
>> >> >
>> >> > Does this seem like a reasonable format for the VM XML? Any suggestions?
>> >>
>> >> I'm basically wondering whether we should be going for separate types for
>> >> each of NBD, RBD & Sheepdog, as per your proposal & the sheepdog one earlier
>> >> today. Or try to merge them into one type 'network' which covers any kind of
>> >> network block device, and list a protocol on the source element, e.g.:
>> >>
>> >>      <disk type="network" device="disk">
>> >>        <driver name="qemu" type="raw" />
>> >>        <source protocol='rbd|sheepdog|nbd' name="...some image identifier...">
>> >>          <host name="mon1.example.org" port="6000" />
>> >>          <host name="mon2.example.org" port="6000" />
>> >>          <host name="mon3.example.org" port="6000" />
>> >>        </source>
>> >>        <target dev="vda" bus="virtio" />
>> >>      </disk>
>> >
>> > That would work...
>> >
>> > One thing that I think should be considered, though, is that both RBD and
>> > NBD can be used for non-qemu instances by mapping a regular block device
>> > via the host's kernel.  And in that case, there's some sysfs-fu (at least
>> > in the rbd case; I'm not familiar with how the nbd client works) required
>> > to set up/tear down the block device.
>>
>> An nbd block device is attached using the nbd-client(1) userspace tool:
>> $ nbd-client my-server 1234 /dev/nbd0 # <host> <port> <nbd-device>
>>
>> That program will open the socket, grab /dev/nbd0, and poke it with a
>> few ioctls so the kernel has the socket and can take it from there.
>
> We don't need to worry about this for libvirt/QEMU. Since QEMU has native
> NBD client support there's no need to do anything with nbd client tools
> to set up the device for use with a VM.

I agree it's easier to use the built-in NBD support.  Just wanted to
provide the background on how the NBD client works when using the kernel
implementation.
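
For comparison, here is a rough sketch of how the NBD server from the
nbd-client example might be described if the unified 'network' disk
type proposed earlier in the thread were adopted.  This is only an
illustration under that assumption: the syntax is still under
discussion, the name attribute is the placeholder from the proposal,
and I am assuming a single <host> entry is enough for NBD.

    <disk type="network" device="disk">
      <driver name="qemu" type="raw" />
      <!-- name is the placeholder from the proposal, not a real value -->
      <source protocol="nbd" name="...some image identifier...">
        <!-- same server and port as the nbd-client command line -->
        <host name="my-server" port="1234" />
      </source>
      <target dev="vda" bus="virtio" />
    </disk>

With a description like this QEMU would connect to the server itself,
so no /dev/nbd0 device would need to be set up on the host.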

Stefan