[libvirt] RFC: Migration with NPIV

Daniel P. Berrange berrange at redhat.com
Tue Nov 20 16:29:44 UTC 2012


On Tue, Nov 20, 2012 at 11:26:53AM -0500, Dave Allan wrote:
> On Tue, Nov 20, 2012 at 10:17:11AM +0000, Daniel P. Berrange wrote:
> > On Mon, Nov 19, 2012 at 05:30:11PM +0800, Osier Yang wrote:
> > > Hi,
> > > 
> > > This proposal tries to work out a solution for migrating a
> > > domain that uses a LUN behind a vHBA as a disk device (QEMU
> > > emulated disks only at this stage), plus some other NPIV
> > > improvements that are unrelated to migration. I haven't been
> > > lucky enough to get an environment to test whether these ideas
> > > are workable, but I'd like to hear any ideas or suggestions early.
> > > 
> > > 1) Persistent vHBA support
> > > 
> > >   This is useful functionality that has been missing for a long
> > > time. Suppose one creates a vHBA and does the masking/zoning, and
> > > everything works as expected. After a system reboot, however,
> > > everything is lost. To get things back, the user has to find
> > > the previous WWNN & WWPN and create the vHBA again.
> > > 
> > >   On the other hand, persistent vHBA support is actually required
> > > for a domain that uses a LUN behind a vHBA. Otherwise the domain
> > > could fail to start after a system reboot.
> > > 
> > >   To support persistent vHBAs, new APIs like virNodeDeviceDefineXML
> > > and virNodeDeviceUndefine are required. It would also be useful to
> > > introduce "autostart" for vHBAs, so that a vHBA can be started
> > > automatically after a system reboot.
> > > 
> > >   Proposed APIs:
> > > 
> > >   virNodeDevicePtr
> > >   virNodeDeviceDefineXML(virConnectPtr conn,
> > >                          const char *xml,
> > >                          unsigned int flags);
> > > 
> > >   int
> > >   virNodeDeviceUndefine(virConnectPtr conn,
> > >                         virNodeDevicePtr dev,
> > >                         unsigned int flags);
> > > 
> > >   int
> > >   virNodeDeviceSetAutostart(virNodeDevicePtr dev,
> > >                             int autostart,
> > >                             unsigned int flags);
> > > 
> > >   int
> > >   virNodeDeviceGetAutostart(virNodeDevicePtr dev,
> > >                             int *autostart,
> > >                             unsigned int flags);
> > 
> > I don't much like this approach. IMHO, this should
> > all be done via the virStoragePool APIs instead. Adding
> > define/undefine/autostart to virNodeDevice is really just
> > duplicating the storage pool functionality.
> 
> I like the idea of making vHBAs persist as part of pools; how do you
> envision it should work?  Extend the scsi pools to take a vHBA
> descriptor and then instantiate the vHBA as part of starting the
> pool, or something else?

Yes, pretty much that.  Create when you start the pool, delete
when you destroy the pool.
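
As a rough sketch of what "a scsi pool that takes a vHBA descriptor"
could look like, the helper below formats a pool definition that carries
the vHBA's parent HBA and WWNN/WWPN. Nothing here is an agreed schema:
the <adapter> attributes (type='fc_host', parent, wwnn, wwpn), the
function name build_vhba_pool_xml and the target path are assumptions
made for illustration. The resulting string is the kind of document one
would hand to the existing virStoragePoolDefineXML or
virStoragePoolCreateXML calls.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format a SCSI storage pool definition that also describes the vHBA
 * backing it. The <adapter> attributes used here are an assumed,
 * hypothetical schema; the values are assumed to be XML-safe already
 * (no escaping is done).
 *
 * Returns 0 on success, -1 on invalid arguments or a too-small buffer.
 */
int
build_vhba_pool_xml(char *buf, size_t len,
                    const char *name, const char *uuid,
                    const char *parent,
                    const char *wwnn, const char *wwpn)
{
    int n;

    if (!buf || len == 0 || !name || !uuid || !parent || !wwnn || !wwpn)
        return -1;

    n = snprintf(buf, len,
                 "<pool type='scsi'>\n"
                 "  <name>%s</name>\n"
                 "  <uuid>%s</uuid>\n"
                 "  <source>\n"
                 "    <adapter type='fc_host' parent='%s'"
                 " wwnn='%s' wwpn='%s'/>\n"
                 "  </source>\n"
                 "  <target>\n"
                 "    <path>/dev/disk/by-path</path>\n"
                 "  </target>\n"
                 "</pool>\n",
                 name, uuid, parent, wwnn, wwpn);

    /* snprintf returns the length it wanted; >= len means truncation. */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

With something like this, starting the pool would be the point where the
vHBA gets created on the named parent, and destroying the pool would
remove it again.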

> > If we do the mapping of HBAs to guest domains using storage
> > pools, then at a guest level, migration requires zero work.
> > 
> > It is simply up to the management app to create the storage
> > pool on the destination host with the same Name + UUID, but
> > with the secondary WWNN/WWPN. The nice thing about this is
> > that you don't need to hardcode details of a secondary
> > WWNN/WWPN up-front. The management app can just decide on
> > those at the time it performs the migration, so 99% of the
> > time there will only need to be a single vHBA set up on the
> > SAN. During migration the mgmt app can set up a second
> > vHBA for the target host, and once complete, delete the
> > original vHBA entirely.
> 
> Agreed, although there will of course need to be some degree of
> up-front coordination between the management app and the SAN
> administrators to avoid having to involve them to migrate a VM.

Yep, this is in fact why I like to push more of this
detail off to the mgmt app. Libvirt is unable to talk to the
SAN, so it's better if the mgmt app has more direct control
of the vHBA setup/teardown via the storage APIs than to
do it automagically in virDomainMigrate, where the mgmt app
cannot synchronize so easily.
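
The ordering the mgmt app would have to drive can be sketched as a small
state machine: start the pool (and so the second vHBA) on the target,
migrate, then tear down whichever side is no longer needed. The enum and
function names are hypothetical, and the libvirt calls named in the
comments are only the likely candidates for each step; any SAN-side
zoning/masking of the second WWNN/WWPN is assumed to have been arranged
beforehand.

```c
/* Steps a management application would drive around one migration. */
enum vhba_migration_step {
    VHBA_STEP_START_DEST_POOL,   /* e.g. virStoragePoolCreateXML() on target */
    VHBA_STEP_MIGRATE_DOMAIN,    /* e.g. virDomainMigrate() */
    VHBA_STEP_DESTROY_SRC_POOL,  /* e.g. virStoragePoolDestroy() on source */
    VHBA_STEP_DESTROY_DEST_POOL, /* rollback: drop the second vHBA again */
    VHBA_STEP_DONE,              /* guest runs on the target, one vHBA left */
    VHBA_STEP_ABORTED            /* guest stays on the source */
};

/*
 * Given the step that just ran and whether it succeeded, return the
 * step to run next. Terminal states map to themselves.
 */
enum vhba_migration_step
vhba_migration_next(enum vhba_migration_step current, int succeeded)
{
    switch (current) {
    case VHBA_STEP_START_DEST_POOL:
        /* No target vHBA means there is nothing to migrate onto. */
        return succeeded ? VHBA_STEP_MIGRATE_DOMAIN : VHBA_STEP_ABORTED;
    case VHBA_STEP_MIGRATE_DOMAIN:
        /* On failure, remove the target pool instead of the source one. */
        return succeeded ? VHBA_STEP_DESTROY_SRC_POOL
                         : VHBA_STEP_DESTROY_DEST_POOL;
    case VHBA_STEP_DESTROY_SRC_POOL:
        /* The guest has already moved; a cleanup error does not undo that. */
        return VHBA_STEP_DONE;
    case VHBA_STEP_DESTROY_DEST_POOL:
        return VHBA_STEP_ABORTED;
    case VHBA_STEP_DONE:
    case VHBA_STEP_ABORTED:
        break;
    }
    return current;
}
```

Keeping this sequencing in the mgmt app rather than inside
virDomainMigrate is what lets it synchronize each step with whatever
has to happen on the SAN.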

> 
> > > 4) Enrich HBA's XML
> > > 
> > >   It's hard to know which vHBAs were created from an HBA with the
> > > current implementation. One has to dump the XML of each (v)HBA and
> > > find the clue in the "parent" element of the vHBAs. It would be good
> > > to introduce a new element for HBAs, like "vports", so that one can
> > > easily tell which (and how many) vHBAs were created from the HBA.
> > > 
> > >   It would also be good to expose the maximum number of vports the
> > > HBA supports.
> > > 
> > >   Besides these, other useful information should be exposed too,
> > > such as the vendor name, the HBA state, PCI address, etc.
> > > 
> > >   The new XML should look like:
> > > 
> > >   <vports num='2' max='64'>
> > >     <vport name="scsi_host40" wwpn="2101001b32a90004"/>
> > >     <vport name="scsi_host40" wwpn="2101001b32a90005"/>
> > >   </vports>
> > >   <online/>
> > >   <vendor>QLogic</vendor>
> > >   <address type="pci" domain="0" bus="0" slot="5" function="0"/>
> > > 
> > >   "online", "vendor" and "address" make sense for vHBAs too.
> > 
> > I'm trying to remember how we modelled the parent/child relationship
> > for SR-IOV PCI cards. NPIV is a very similar concept, so we should
> > ideally seek to model the parent/child relationship in the same
> > manner.
> 
> Physical function:
> 
> <device>
>   <name>pci_0000_01_00_0</name>
>   <parent>pci_0000_00_01_0</parent>
>   <driver>
>     <name>igb</name>
>   </driver>
>   <capability type='pci'>
>     <domain>0</domain>
>     <bus>1</bus>
>     <slot>0</slot>
>     <function>0</function>
>     <product id='0x10c9'>82576 Gigabit Network Connection</product>
>     <vendor id='0x8086'>Intel Corporation</vendor>
>     <capability type='virt_functions'>
>       <address domain='0x0000' bus='0x01' slot='0x10' function='0x0'/>
>       <address domain='0x0000' bus='0x01' slot='0x10' function='0x2'/>
>       <address domain='0x0000' bus='0x01' slot='0x10' function='0x4'/>
>       <address domain='0x0000' bus='0x01' slot='0x10' function='0x6'/>
>       <address domain='0x0000' bus='0x01' slot='0x11' function='0x0'/>
>       <address domain='0x0000' bus='0x01' slot='0x11' function='0x2'/>
>       <address domain='0x0000' bus='0x01' slot='0x11' function='0x4'/>
>     </capability>
>   </capability>
> </device> 
> 
> Virtual function:
> 
> <device>
>   <name>pci_0000_01_10_0</name>
>   <parent>pci_0000_00_01_0</parent>
>   <driver>
>     <name>igbvf</name>
>   </driver>
>   <capability type='pci'>
>     <domain>0</domain>
>     <bus>1</bus>
>     <slot>16</slot>
>     <function>0</function>
>     <product id='0x10ca'>82576 Virtual Function</product>
>     <vendor id='0x8086'>Intel Corporation</vendor>
>     <capability type='phys_function'>
>       <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
>     </capability>
>     <capability type='virt_functions'>
>     </capability>
>   </capability>
> </device>
> 
> Interestingly, I think there's a bug there; the VF should not be
> showing <capability type='virt_functions'> but that's unrelated to the
> present discussion.


Ok, so we should model vHBA relationships via some kind of
<capability> then.
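
By analogy with the 'virt_functions' capability above, a minimal sketch
of how an HBA's child vHBAs might be rendered as a nested <capability>
follows. The capability type 'vports', the <max_vports> and <vport>
element names, struct vport and format_vports_capability are all
hypothetical and only illustrate the shape; the node device names and
WWPNs used with it would come from the host.

```c
#include <stdio.h>
#include <stddef.h>

/* One vHBA created on top of a physical HBA. */
struct vport {
    const char *name;   /* node device name of the vHBA */
    const char *wwpn;   /* its world wide port name */
};

/*
 * Render the vHBAs of an HBA as a nested <capability> element, in the
 * same spirit as 'virt_functions' for SR-IOV. Element and attribute
 * names are assumptions; values are assumed to be XML-safe.
 *
 * Returns 0 on success, -1 on invalid arguments or a too-small buffer.
 */
int
format_vports_capability(char *buf, size_t len,
                         const struct vport *vports, size_t nvports,
                         unsigned int max_vports)
{
    size_t used;
    size_t i;
    int n;

    if (!buf || len == 0 || (!vports && nvports > 0))
        return -1;

    n = snprintf(buf, len,
                 "<capability type='vports'>\n"
                 "  <max_vports>%u</max_vports>\n", max_vports);
    if (n < 0 || (size_t)n >= len)
        return -1;
    used = (size_t)n;

    /* Append one <vport> line per child, checking for truncation each time. */
    for (i = 0; i < nvports; i++) {
        n = snprintf(buf + used, len - used,
                     "  <vport name='%s' wwpn='%s'/>\n",
                     vports[i].name, vports[i].wwpn);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, len - used, "</capability>\n");
    if (n < 0 || (size_t)n >= len - used)
        return -1;
    return 0;
}
```

The vHBA side would then carry the mirror image, pointing back at its
parent HBA the way a VF points at its 'phys_function'.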


Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



