[libvirt] changes to domain XML for SCSI support

Osier Yang jyang at redhat.com
Thu Oct 27 08:09:28 UTC 2011


On 2011-10-26 23:51, Paolo Bonzini wrote:
> Hi all,
>
> let's kick off the discussion on what changes are needed in domain XML 
> for more complete SCSI support.
>
> There are three relevant topics:
>
> 1) providing channel/target/lun addresses for SCSI disks;
>
> 2) supporting LUN passthrough;
>
> 3) supporting SCSI host passthrough.
>
>
> A fourth topic is supporting NPIV. It is a special case of SCSI host 
> passthrough, and it is important that any extension to the domain XML 
> can cover it.
>
>
> Enhanced addressing for SCSI devices
> ====================================
>
> This is the simplest part. The proposal is to add a new address type
>
> <address type='scsi' host='...'
> bus='...' target='...' lun='...'/>
>
> where host selects the qdev parent device, while bus (the SCSI channel), 
> target and lun are passed as qdev properties (the QEMU names are 
> channel, scsi-id and lun, respectively).
>
> Libvirt should check for QEMU 1.0 and, for older versions, only allow 
> channel=lun=0 and 0<=target<=7.
>
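
As a rough illustration of the version check described above, here is a minimal Python sketch. The function name and the (major, minor) version tuple are assumptions made for this example, not libvirt code. Limits for QEMU 1.0 and newer are not given in the proposal, so the sketch does not model any.

```python
def scsi_address_allowed(qemu_version, channel, target, lun):
    """Return True if the channel/target/lun triple is acceptable.

    qemu_version is a hypothetical (major, minor) tuple.
    """
    if min(channel, target, lun) < 0:
        return False
    if qemu_version >= (1, 0):
        # Newer QEMU: no further restriction is modelled in this sketch.
        return True
    # Older QEMU: only channel=lun=0 and 0<=target<=7 are allowed.
    return channel == 0 and lun == 0 and 0 <= target <= 7
```

For example, scsi_address_allowed((0, 15), 0, 8, 0) is rejected, while the same address passes for (1, 0).
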
>
> LUN passthrough
> ===============
>
> A SCSI block device from the host can be attached to a domain in two 
> ways: as an emulated LUN with SCSI commands implemented within QEMU, 
> or by passing SCSI commands down to the block device. The former is 
> handled by the existing <disk type='disk'> and <disk type='cdrom'> XML 
> syntax. The latter is not yet supported.
>
> On the QEMU side, LUN passthrough is implemented by one of the 
> scsi-generic and scsi-block devices. scsi-generic requires a /dev/sg 
> device name and can be applied to any device. scsi-block is only 
> available in QEMU 1.0 or newer, can be applied only to block devices 
> (sd/sr), and has better performance. The choice between one and the 
> other should be as transparent as possible.
>
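
To make the "transparent choice" concrete, here is a hedged Python sketch of how the device model might be picked from the host path and the QEMU version. The function name and the version tuple are hypothetical, and the sketch does not look up the /dev/sg node that belongs to a block device.

```python
import re


def pick_passthrough_device(host_path, qemu_version):
    """Pick a QEMU device model for LUN passthrough of host_path.

    qemu_version is a hypothetical (major, minor) tuple.
    """
    if re.fullmatch(r"/dev/sg\d+", host_path):
        # A /dev/sg node can only be driven by scsi-generic.
        return "scsi-generic"
    if qemu_version >= (1, 0):
        # Block devices (sd/sr) get the faster scsi-block on QEMU 1.0+.
        return "scsi-block"
    # Older QEMU: scsi-generic is the only option, but it needs the
    # matching /dev/sg node, which this sketch does not look up.
    raise ValueError("need the /dev/sg node for %s on QEMU < 1.0" % host_path)
```
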
> Currently, using a block device as the backend for a virtio disk 
> implements a kind of LUN passthrough, since the guest can execute
>
> There are two possible choices here:
>
> 1) add a new <hostdev> tag.
>
> <hostdev mode='subsystem' type='scsi'>
> <source>
> <address type='scsi' host='...' bus='...' target='...' lun='...'/>
> </source>
> <address type='scsi' host='...' bus='...' target='...' lun='...'/>
> </hostdev>
>
> Advantages:
>
> - allows using the same XML for all SCSI devices (i.e. scsi-generic 
> vs. scsi-block is an internal detail of libvirt);
>
> Disadvantages:
>
> - does not make it clear which device is being passed through;
>
> - completely different from the syntax that virtio is using for the 
> same purpose; perhaps virtio could be covered by
>
> <hostdev mode='subsystem' type='scsi'>
> <source>
> <address type='scsi' host='...' bus='...' target='...' lun='...'/>
> </source>
> <target dev='vda' bus='virtio'/>
> <address type='pci' domain='...' bus='...' slot='...' function='...'/>
> </hostdev>
>
> - <address> specifies the address to a <capability type='scsi'> 
> device, but the device to be passed to scsi-block is its block_sdXX_* 
> child (aside: it would be nice if the /dev/sgNN name was placed 
> somewhere in the nodedev XML for <capability type='scsi'> devices);
>
> - emulated and passthrough LUNs have a completely different XML;
>
> - host numbers are not stable when hotplugging.
>
> 2) add a new <drive device='lun'> attribute.
>
> <drive type='block' device='lun'>
> <driver name='qemu' type='raw'/>
> <source dev='/dev/sda'/>
> <target dev='sda' bus='scsi'/>
> <address type='scsi' host='...' bus='...' target='...' lun='...'/>
> </drive>
>

s/drive/disk/

> Advantages:
>
> - allows using the same syntax for virtio and SCSI. virtio could be 
> changed to accept device='lun' too.
>
> - the passed-through device is immediately visible
>
> - a stable addressing is available via /dev/disk/by-id and 
> /dev/disk/by-path
>
> - can easily switch a disk between emulated and passthrough modes;
>
> Disadvantages:
>
> - does not extend to scsi-generic and to host passthrough;
>
>
> 3) something between (1) and (2). If I understand correctly 
> http://www.redhat.com/archives/libvir-list/2008-July/msg00429.html 
> this would use <hostdev mode='capability'>. More on this below.
>
>
> SCSI target/host passthrough: rethinking <hostdev mode='capability'>
> ====================================================================
>
> SCSI target/host passthrough passes the entire set of LUNs attached to 
> a SCSI target or host. On the QEMU side, this is done manually by 
> adding a scsi-block or scsi-generic device for each LUN.
>
> This can be realized using something like:
>
> <hostdev mode='subsystem' type='scsi_host'>
> <source>
> <address type='scsi' host='...'/>
> </source>
> <address type='scsi' host='...'/>
> </hostdev>
>
> <hostdev mode='subsystem' type='scsi_target'>
> <source>
> <address type='scsi' host='...' bus='...' target='...'/>
> </source>
> <address type='scsi' host='...' bus='...' target='...'/>
> </hostdev>
>
> However, as for LUN passthrough, the main problem is that Linux host 
> indices are not stable. Thus, in this case using <hostdev 
> mode='capability'> seems like the only reasonable possibility.
>
> That said, <hostdev mode='capability'> has never been documented and 
> never even implemented. For this reason, I'm proposing to redo its 
> functionality in a different way. The two examples given in 
> http://www.redhat.com/archives/libvir-list/2008-July/msg00429.html 
> were the following:
>
>> A network card by name (ie for OpenVZ)
>>
>> <hostdev mode='capability'>
>> <source name='eth0'/>
>> </hostdev>
>>
>> A SCSI device by name (eg, SCSI PV passthrough), also specifying
>> the target address
>>
>> <hostdev mode='capability' type='scsi'>
>> <source name='sg3'/>
>> <target address='0:0:0:0'/>
>> </hostdev>
>
> In my proposal:
>
> 1) the "mode" attribute is dropped (more precisely, only "subsystem" 
> is allowed and never printed; everything else is rejected);
>
> 2) the "type" attribute can in principle get any value that is valid 
> for a nodedev capability---more or less: for example the usb type maps 
> to the usb_device capability; :(
>
> 3) the "source" element can get a "name" attribute pointing to a 
> nodedev name, and a "rel" attribute that is "child" or "parent". 
> "child" instructs libvirt to search for a device possessing the given 
> capability, and that is a child of the named device; "parent" 
> instructs libvirt to pick the parent of the indicated device. When the 
> "name" attribute is included, the element must be empty.
>
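
The "rel" lookup in point 3) can be sketched in Python over a toy nodedev tree. The device names are taken from the examples in this thread, but the parent links, the capability names in the table, and the function itself are assumptions for illustration. In particular, "child" is read here as a direct child only.

```python
# Toy nodedev tree: name -> (parent name, capability). The parent links
# and capabilities are assumed for illustration, not read from a host.
NODEDEVS = {
    "pci_0000_00_1f_2": (None, "pci"),
    "scsi_host0": ("pci_0000_00_1f_2", "scsi_host"),
    "scsi_0_0_0_0": ("scsi_host0", "scsi"),
    "block_sda_ST9160411AS_5TG11QWL": ("scsi_0_0_0_0", "storage"),
}


def resolve_source(name, cap_type, rel=None, nodedevs=NODEDEVS):
    """Resolve <source name=... rel=.../> to a single nodedev name."""
    if name not in nodedevs:
        raise LookupError("unknown nodedev: %s" % name)
    if rel is None:
        candidates = [name]
    elif rel == "parent":
        # Pick the parent of the indicated device.
        candidates = [nodedevs[name][0]]
    elif rel == "child":
        # Search the direct children of the named device.
        candidates = [n for n, (p, _) in nodedevs.items() if p == name]
    else:
        raise ValueError("rel must be 'child' or 'parent'")
    # Keep only devices that possess the requested capability.
    matches = [n for n in candidates
               if n in nodedevs and nodedevs[n][1] == cap_type]
    if len(matches) != 1:
        raise LookupError("expected exactly one %s device" % cap_type)
    return matches[0]
```

With this table, resolve_source("pci_0000_00_1f_2", "scsi_host", rel="child") picks scsi_host0, and resolve_source("block_sda_ST9160411AS_5TG11QWL", "scsi", rel="parent") picks scsi_0_0_0_0.
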
> Given this, here is how the two examples above would look:
>
> A network card for OpenVZ:
>
> - by name (has adding aliases for nodedevs ever been considered, such 
> as simply "eth0" in this case?):
>
> <hostdev type='net'>
> <source name='net_eth0_00_22_68_0b_dc_ac'/>
> </hostdev>
>
> - by position:
>
> <hostdev type='net'>
> <source rel='child' name='pci_0000_00_19_0'/>
> </hostdev>
>
>
> A SCSI device:
>
> - by name:
>
> <hostdev type='scsi'>
> <source name='scsi_0_0_0_0'/>
> <address type='scsi' host='...' bus='...' target='...' lun='...'/>
> </hostdev>
>
> - by position (aliases would also make it easy to specify /dev/sda):
>
> <hostdev type='scsi'>
> <source rel='parent' name='block_sda_ST9160411AS_5TG11QWL'/>
> <address type='scsi' host='...' bus='...' target='...' lun='...'/>
> </hostdev>
>
>
> A SCSI host:
>
> - by name:
>
> <hostdev type='scsi_host'>
> <source name='scsi_host0'/>
> <address type='scsi' host='...'/>
> </hostdev>
>
> - by position:
>
> <hostdev type='scsi_host'>
> <source rel='child' name='pci_0000_00_1f_2'/>
> <address type='scsi' host='...'/>
> </hostdev>
>
>
> NPIV support: generalizing hostdev source addresses
> ===================================================
>
> In NPIV, a virtual HBA is created using "virsh nodedev-create" and 
> passed to the guest. Such a virtual adapter does have a stable address, 
> namely its WWN. As such, it can be addressed simply by generalizing 
> the kind of source address that can be passed to <hostdev 
> type='scsi_host'/>:
>
> <hostdev type='scsi_host'>
> <source>
> <address type='wwn' wwpn='...' wwnn='...'/>
> </source>
> </hostdev>
>
> (Note that this doesn't use <source name='...'/> and, as such, it does 
> not rely on the ideas above).
>
>
> Ideas and opinions are welcome!
>
> Paolo

It looks to me like we need both 2) and 3): 2) to emulate the SCSI LUN, and 3)
for SCSI host passthrough, just like for PCI devices.

Osier



