[libvirt] virtio-scsi support proposal, v2

Daniel P. Berrange berrange at redhat.com
Tue Jan 3 17:00:22 UTC 2012


On Fri, Dec 23, 2011 at 09:36:23AM +0100, Paolo Bonzini wrote:
> Here is a revised version of the virtio-scsi proposal.  There's actually
> not too much left intact from v1. :)

The devil is in the details, but I think I broadly agree with everything
suggested in the proposal below.

> The main simplification is in how SCSI hosts can be addressed in a stable
> manner.
> 
> 
> SCSI controller models
> ======================
> 
> Existing controller models are "auto", "buslogic", "lsilogic", "lsias1068",
> or "vmpvscsi".  The new controller model "virtio-scsi" is added.  The model
> "lsilogic" is mapped to the existing "lsi" device in QEMU.
> 
> When PPC64 support is added, another controller model, "spapr-vscsi",
> will be added.
> 
> 
> Stable addressing for SCSI devices
> ==================================
> 
> The existing <address type='drive' ...> element will be extended as follows:
> 
>    <address type='drive' controller='...'
>                         bus='...' target='...' unit='...'/>
> 
> where controller selects the qdev parent device, while bus/target/unit
> are passed as qdev properties (the QEMU names are respectively channel,
> scsi-id, lun).
> 
> Libvirt should check for the QEMU "scsi-disk.channel" property.  If it
> is unavailable, QEMU will only support channel=lun=0 and 0<=target<=7.
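
For illustration only, the mapping and the fallback restriction described
above could be sketched as follows. This is Python rather than libvirt's C;
the function name and the has_channel_property flag are hypothetical, and
passing only scsi-id on older QEMU is an assumption, not something the
proposal specifies.

```python
def drive_address_to_qdev_props(bus, target, unit, has_channel_property):
    """Map libvirt bus/target/unit to the QEMU names channel/scsi-id/lun.

    has_channel_property is a hypothetical flag standing in for the result
    of probing QEMU for the "scsi-disk.channel" property.
    """
    if not has_channel_property:
        # Without the property, only channel=lun=0 and 0<=target<=7 work.
        if bus != 0 or unit != 0 or not 0 <= target <= 7:
            raise ValueError(
                "this QEMU only supports bus=0, unit=0 and 0<=target<=7"
            )
        # Assumption: older QEMU is given only the scsi-id property.
        return {"scsi-id": target}
    return {"channel": bus, "scsi-id": target, "lun": unit}
```
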
> 
> 
> LUN passthrough: block devices
> ==============================
> 
> A SCSI block device from the host can be attached to a domain in two
> ways: as an emulated LUN with SCSI commands implemented within QEMU,
> or by passing SCSI commands down to the block device.  The former is
> handled by the existing <disk type='file'>, <disk type='block'> and
> <disk type='network'> XML syntax.  The latter is not yet supported.
> 
> On the QEMU side, LUN passthrough is implemented by one of the
> scsi-generic and scsi-block devices.  Scsi-generic requires a /dev/sg
> device name, and can be applied to any device.  scsi-block is only
> available in QEMU 1.0 or newer, requires a block device, can be applied
> only to block devices (sd/sr) and has better performance.
> 
> To implement LUN passthrough for block devices, libvirt will add a new
> <disk device='lun'> attribute.  When device='lun' is passed, the device
> attribute is ignored.
> 
> Example:
> 
>   <disk type='block' device='lun'>
>     <driver name='qemu' type='raw'/>
>     <source dev='/dev/sda'/>
>     <target dev='sda' bus='scsi'/>
>     <address type='drive' controller='...'
>                         bus='...' target='...' unit='...'/>
>   </disk>
> 
> Also, virtio-blk handling will be enhanced to disable SG_IO passthrough
> when <disk device='disk'>, and only enable it when <disk device='lun'>.
> 
> (I am not sure whether the 'lun' value should be for the type or device
> attribute.  Laine has a patch to implement it for virtio disks which
> uses "type").


Consider what we have today:

 type=block
 type=file
 type=network

Any of these 3 can be fronted by a virtio-blk device in the
guest which allows SG_IO.  With type='file' SG_IO is trivially
blocked. We have initially focused on type=block, presuming
that type=network doesn't support SG_IO either. Thus we were
free to suggest a new type=lun to replace type=block without
getting into an ambiguity.

In retrospect I don't think this presumption is valid. I think
it is conceivable that type=network could support SG_IO when
pointed to QEMU's userspace iSCSI block driver.

Thus I think we should use device=lun instead, as suggested in
this proposal.

> 
> This syntax makes it clear what is the passed-through device, and at
> the same time it makes it very easy to switch a disk between emulated
> and passthrough modes.  Also, a stable addressing for the source device
> is provided by /dev/disk/by-id and /dev/disk/by-path.
> 
> 
> Stable SCSI host addressing
> ===========================
> 
> SCSI host number in Linux is not stable.  An alternative stable
> addressing is required to pass a whole host or target to a guest.
> 
> One place in which this could be supported is the SCSI volume pool
> syntax:
> 
>       <pool type='scsi'>
>         <name>virtimages</name>
>         <source>
>           <adapter name='host0'/>
>         </source>
>         <target>
>           <path>/dev/disk/by-id</path>
>         </target>
>       </pool>
> 
> libvirt will deprecate the above form for the adapter element and
> provide the following forms:
> 
>           <adapter name='scsi_host0'/>
> 
>           <adapter parent='pci_0000_00_1f_2' unique_id='1'/>
> 
> The existing form changes from host0 to scsi_host0, for
> consistency with the naming that is used in nodedev.  The new
> parent/unique_id addressing uses a parent PCI device and a unique
> id that Linux provides in sysfs.  In order to determine the SCSI
> host number, libvirt would scan all files matched by the glob pattern
> /sys/bus/pci/devices/0000:00:1f.2/*/scsi_host/*/unique_id, looking for
> the one that contains "1".
> 
> The unique_id can be omitted.  In this case, the pool will refer
> to the host with the smallest unique_id under the given device.
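
For illustration only, the sysfs scan described above could be sketched as
follows. This is Python rather than libvirt's C; the function name and the
sysfs_root parameter are hypothetical, and the PCI address is taken in its
sysfs form (0000:00:1f.2) rather than the nodedev name.

```python
import glob
import os


def find_scsi_host(pci_address, unique_id=None, sysfs_root="/sys"):
    """Return the SCSI host name (e.g. "host0") under a PCI device.

    If unique_id is None, the host with the smallest unique_id is chosen.
    Returns None when nothing matches.
    """
    pattern = os.path.join(
        sysfs_root, "bus/pci/devices", pci_address,
        "*", "scsi_host", "*", "unique_id",
    )
    candidates = []
    for path in glob.glob(pattern):
        with open(path) as f:
            try:
                uid = int(f.read().strip())
            except ValueError:
                continue  # skip files that do not hold a number
        # .../scsi_host/hostN/unique_id -> "hostN"
        host = os.path.basename(os.path.dirname(path))
        candidates.append((uid, host))

    if not candidates:
        return None
    if unique_id is None:
        return min(candidates)[1]
    for uid, host in candidates:
        if uid == unique_id:
            return host
    return None
```

The digits after "host" in the returned name are the unstable Linux host
number that the parent/unique_id pair is meant to resolve.
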
> 
> Furthermore, a SCSI pool can be restricted to one target using an
> additional element:
> 
>         <source>
>           <adapter name='scsi_host0'/>
>           <address type='scsi' bus='0' target='0'/>
>         </source>
> 
> (bus defaults to 0, target is mandatory).
> 
> 
> Generic passthrough
> ===================
> 
> Generic device passthrough at the LUN, target or host level builds
> on the extensions to SCSI addressing from the previous section.
> 
> Passing a single LUN extends the <hostdev> tag as follows:
> 
>   <hostdev type='scsi'>
>     <source>
>       <adapter name='scsi_host0'/>
>       <address type='scsi' bus='0' target='0' unit='0'/>
>     </source>
>     <target>
>       <address type='scsi' controller='...'
>                         bus='...' target='...' unit='...'/>
>     </target>
>   </hostdev>
> 
> This will map to a -drive QEMU option referring to a scsi-generic
> device, and a "-device scsi-generic" option referring to the drive.
> libvirt can determine the /dev/sg file to use by reading the directory
> /sys/bus/scsi/devices/target*/*/scsi_generic.  These devices might also
> be shown in the nodedev tree, similar to block devices.
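
For illustration only, the /dev/sg lookup described above could be sketched
as follows. This is Python rather than libvirt's C; the function name and
the sysfs_root parameter are hypothetical, and the sketch assumes the sysfs
entries are named host:bus:target:unit under a matching target directory.

```python
import os


def find_sg_device(host, bus, target, unit, sysfs_root="/sys"):
    """Return the /dev/sg path for one LUN, or None if it has none."""
    # Assumption: directories are named targetH:B:T and H:B:T:L.
    sg_dir = os.path.join(
        sysfs_root, "bus/scsi/devices",
        f"target{host}:{bus}:{target}",
        f"{host}:{bus}:{target}:{unit}",
        "scsi_generic",
    )
    try:
        names = sorted(os.listdir(sg_dir))
    except OSError:
        return None  # no such LUN, or no scsi_generic directory
    # The directory is expected to hold a single sgN entry.
    return "/dev/" + names[0] if names else None
```
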
> 
> Whenever a domain should receive all devices belonging to a SCSI host,
> a similar <source> item should be included within the <controller
> type='scsi'> element:
> 
>         <controller type='scsi' model='virtio-scsi'>
>           <source>
>             <adapter name='scsi_host0'/>
>           </source>
>         </controller>
> 
> In this case, libvirt should use scsi-block rather than scsi-generic
> for block devices.
> 
> 
> NPIV-based SCSI host passthrough
> ================================
> 
> In NPIV, a virtual HBA is created using "virsh nodedev-create" and passed
> to the guest.  Passing through a whole SCSI host is quite common when
> using NPIV.  As a result, it is desirable to easily address virtual HBAs
> both in SCSI storage pools and in <controller type='scsi'> elements.
> 
> Here are two proposals for how to refer to NPIV adapters:
> 
> 1) add persistent nodedevs via commands nodedev-define, nodedev-undefine,
> nodedev-start.  The persistent nodedevs have a name, and this can be
> used simply with <adapter name='NAME'>.
> 
> 2) Virtual adapters do have a stable address, namely their WWN.  This
> can be used in a third <adapter> syntax:
> 
>     <source>
>       <adapter type='fc_host' wwpn='...' wwnn='...'/>
>     </source>


Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



