[libvirt] changes to domain XML for SCSI support

Dave Allan dallan at redhat.com
Wed Oct 26 16:15:12 UTC 2011


On Wed, Oct 26, 2011 at 05:51:55PM +0200, Paolo Bonzini wrote:
> Hi all,
> 
> let's kick off the discussion on what changes are needed in domain
> XML for more complete SCSI support.
> 
> There are three relevant topics:
> 
> 1) providing channel/target/lun addresses for SCSI disks;
> 
> 2) supporting LUN passthrough;
> 
> 3) supporting SCSI host passthrough.
> 
> 
> A fourth topic is supporting NPIV.  It is a special case of SCSI
> host passthrough, and it is important that any extension to the
> domain XML can cover it.
> 
> 
> Enhanced addressing for SCSI devices
> ====================================
> 
> This is the simplest part.  The proposal is to add a new address type
> 
>    <address type='scsi' host='...'
>                         bus='...' target='...' lun='...'/>
> 
> where host selects the qdev parent device, while bus, target and lun
> are passed as qdev properties (the QEMU names are, respectively,
> channel, scsi-id and lun).
> 
> Libvirt should check for QEMU 1.0 and, for older versions, only
> allow channel=lun=0 and 0<=target<=7.
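
To make the addressing concrete, here is a small sketch using the attribute names proposed above. The numeric values are made up for illustration, and host='0' assumes the first SCSI controller is numbered 0.

```xml
<!-- Acceptable for QEMU older than 1.0: bus (channel) and lun are 0, target is within 0..7 -->
<address type='scsi' host='0' bus='0' target='3' lun='0'/>

<!-- Needs QEMU 1.0 or newer: non-zero lun and a target above 7 -->
<address type='scsi' host='0' bus='0' target='12' lun='5'/>
```
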
> 
> 
> LUN passthrough
> ===============
> 
> A SCSI block device from the host can be attached to a domain in two
> ways: as an emulated LUN with SCSI commands implemented within QEMU,
> or by passing SCSI commands down to the block device.  The former is
> handled by the existing <disk device='disk'> and <disk device='cdrom'>
> XML syntax.  The latter is not yet supported.
> 
> On the QEMU side, LUN passthrough is implemented by either the
> scsi-generic or the scsi-block device.  scsi-generic requires a
> /dev/sg device name, and can be applied to any device.  scsi-block
> is only available in QEMU 1.0 or newer, can be applied only to
> block devices (sd/sr), and has better performance.  The choice
> between the two should be as transparent as possible.
> 
> Currently, using a block device as the backend for a virtio disk
> implements a kind of LUN passthrough, since the guest can execute
> 
> There are two possible choices here:
> 
> 1) add a new <hostdev> tag.
> 
>   <hostdev mode='subsystem' type='scsi'>
>     <source>
>       <address type='scsi' host='...' bus='...' target='...' lun='...'/>
>     </source>
>     <address type='scsi' host='...' bus='...' target='...' lun='...'/>
>   </hostdev>
> 
> Advantages:
> 
> - allows using the same XML for all SCSI devices (i.e. scsi-generic
> vs. scsi-block is an internal detail of libvirt);
> 
> Disadvantages:
> 
> - does not make it clear which device is being passed through;
> 
> - completely different from the syntax that virtio is using for the
> same purpose; perhaps virtio could be covered by
> 
>   <hostdev mode='subsystem' type='scsi'>
>     <source>
>       <address type='scsi' host='...' bus='...' target='...' lun='...'/>
>     </source>
>     <target dev='vda' bus='virtio'/>
>     <address type='pci' domain='...' bus='...' slot='...' function='...'/>
>   </hostdev>
> 
> - <address> specifies the address of a <capability type='scsi'>
> device, but the device to be passed to scsi-block is its
> block_sdXX_* child (aside: it would be nice if the /dev/sgNN name
> were placed somewhere in the nodedev XML for <capability type='scsi'>
> devices);
> 
> - emulated and passthrough LUNs have a completely different XML;
> 
> - host numbers are not stable when hotplugging.
> 
> 2) add a new device='lun' value for the <disk> element.
> 
>   <disk type='block' device='lun'>
>     <driver name='qemu' type='raw'/>
>     <source dev='/dev/sda'/>
>     <target dev='sda' bus='scsi'/>
>     <address type='scsi' host='...' bus='...' target='...' lun='...'/>
>   </disk>
> 
> Advantages:
> 
> - allows using the same syntax for virtio and SCSI; virtio could be
> changed to accept device='lun' too;
> 
> - the passed-through device is immediately visible;
> 
> - stable addressing is available via /dev/disk/by-id and /dev/disk/by-path;
> 
> - can easily switch a disk between emulated and passthrough modes;
> 
> Disadvantages:
> 
> - does not extend to scsi-generic and to host passthrough;
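
To make option (2) concrete, here is a minimal sketch written with libvirt's existing <disk> element and a stable /dev/disk/by-id path as the source. device='lun' is the value being proposed here, not something libvirt accepts today, and the by-id name is a hypothetical placeholder.

```xml
<!-- Sketch only: device='lun' is the proposed value; the by-id name is hypothetical -->
<disk type='block' device='lun'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/disk/by-id/scsi-EXAMPLE_SERIAL'/>
  <target dev='sda' bus='scsi'/>
</disk>
```

Switching the same disk back to an emulated LUN would then presumably only mean changing device='lun' to device='disk'.
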
> 
> 
> 3) something between (1) and (2).  If I understand correctly
> http://www.redhat.com/archives/libvir-list/2008-July/msg00429.html
> this would use <hostdev mode='capability'>.  More on this below.
> 
> 
> SCSI target/host passthrough: rethinking <hostdev mode='capability'>
> ====================================================================
> 
> SCSI target/host passthrough passes the entire set of LUNs attached
> to a SCSI target or host.  On the QEMU side, this is done manually
> by adding a scsi-block or scsi-generic device for each LUN.
> 
> This can be realized using something like:
> 
>   <hostdev mode='subsystem' type='scsi_host'>
>     <source>
>       <address type='scsi' host='...'/>
>     </source>
>     <address type='scsi' host='...'/>
>   </hostdev>
> 
>   <hostdev mode='subsystem' type='scsi_target'>
>     <source>
>       <address type='scsi' host='...' bus='...' target='...'/>
>     </source>
>     <address type='scsi' host='...' bus='...' target='...'/>
>   </hostdev>
> 
> However, as with LUN passthrough, the main problem is that Linux host
> indices are not stable.  Thus, in this case using <hostdev
> mode='capability'> seems like the only reasonable possibility.
> 
> That said, <hostdev mode='capability'> has never been documented and
> never even implemented.  For this reason, I'm proposing to redo its
> functionality in a different way.  The two examples given in
> http://www.redhat.com/archives/libvir-list/2008-July/msg00429.html
> were the following:
> 
> >A network card by name (ie for OpenVZ)
> >
> >  <hostdev mode='capability'>
> >    <source name='eth0'/>
> >  </hostdev>
> >
> >A SCSI device by name (eg, SCSI PV passthrough), also specifying
> >the target address
> >
> >  <hostdev mode='capability' type='scsi'>
> >    <source name='sg3'/>
> >    <target address='0:0:0:0'/>
> >  </hostdev>
> 
> In my proposal:
> 
> 1) the "mode" attribute is dropped (more precisely, only "subsystem"
> is allowed and never printed; everything else is rejected);
> 
> 2) the "type" attribute can in principle get any value that is valid
> for a nodedev capability---more or less: for example the usb type
> maps to the usb_device capability; :(
> 
> 3) the "source" element can get a "name" attribute pointing to a
> nodedev name, and a "rel" attribute that is "child" or "parent".
> "child" instructs libvirt to search for a device possessing the
> given capability, and that is a child of the named device; "parent"
> instructs libvirt to pick the parent of the indicated device.  When
> the "name" attribute is included, the element must be empty.
> 
> Given this, here is how the two examples above would look:
> 
> A network card for OpenVZ:
> 
> - by name (has adding aliases for nodedevs ever been considered,
> such as simply "eth0" in this case?):
> 
>   <hostdev type='net'>
>     <source name='net_eth0_00_22_68_0b_dc_ac'/>
>   </hostdev>
> 
> - by position:
> 
>   <hostdev type='net'>
>     <source rel='child' name='pci_0000_00_19_0'/>
>   </hostdev>
> 
> 
> A SCSI device:
> 
> - by name:
> 
>   <hostdev type='scsi'>
>     <source name='scsi_0_0_0_0'/>
>     <address type='scsi' host='...' bus='...' target='...' lun='...'/>
>   </hostdev>
> 
> - by position (aliases would also make it easy to specify /dev/sda):
> 
>   <hostdev type='scsi'>
>     <source rel='parent' name='block_sda_ST9160411AS_5TG11QWL'/>
>     <address type='scsi' host='...' bus='...' target='...' lun='...'/>
>   </hostdev>
> 
> 
> A SCSI host:
> 
> - by name:
> 
>   <hostdev type='scsi_host'>
>     <source name='scsi_host0'/>
>     <address type='scsi' host='...'/>
>   </hostdev>
> 
> - by position:
> 
>   <hostdev type='scsi_host'>
>     <source rel='child' name='pci_0000_00_1f_2'/>
>     <address type='scsi' host='...'/>
>   </hostdev>
> 
> 
> NPIV support: generalizing hostdev source addresses
> ===================================================
> 
> In NPIV, a virtual HBA is created using "virsh nodedev-create" and
> passed to the guest.  Such a virtual adapter does have a stable
> address, namely its WWN.  As such, it can be addressed simply by
> generalizing the kind of source address that can be passed to
> <hostdev type='scsi_host'/>:
> 
>   <hostdev type='scsi_host'>
>     <source>
>       <address type='wwn' wwpn='...' wwnn='...'/>
>     </source>
>   </hostdev>
> 
> (Note that this doesn't use <source name='...'/> and, as such, it
> does not rely on the ideas above).
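
For context, a sketch of how the two pieces would fit together. The first fragment is the kind of nodedev XML that "virsh nodedev-create" takes to create a vHBA; the second is the proposed <hostdev> referring to that vHBA by the same WWNs. The parent name scsi_host5 and both WWN values are made-up placeholders, and the <address type='wwn'> syntax is only the proposal quoted above, not an existing libvirt feature.

```xml
<!-- Input for "virsh nodedev-create": vHBA on an NPIV-capable parent (placeholder values) -->
<device>
  <parent>scsi_host5</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
      <wwnn>2001001b32a9da5b</wwnn>
      <wwpn>2101001b32a9da5b</wwpn>
    </capability>
  </capability>
</device>

<!-- Proposed domain XML: pass that vHBA to the guest, addressed by the same WWNs -->
<hostdev type='scsi_host'>
  <source>
    <address type='wwn' wwpn='2101001b32a9da5b' wwnn='2001001b32a9da5b'/>
  </source>
</hostdev>
```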

How do you envision migration working with NPIV?

> Ideas and opinions are welcome!
> 
> Paolo
> 



