[libvirt-users] NPIV setup?

mdecheser at comcast.net
Fri Jun 1 02:52:40 UTC 2012


> I'm missing something.
> 
> The purpose of NPIV (as I understand it) is to give a guest OS an HBA that
> it can scan, play with new luns, etc all without making changes to the
> physical server(s) the guest is living in currently.

Technically, the purpose of NPIV is to provide discrete paths to storage.  Who uses it and how it's used are entirely open to debate.  I could make a case for using NPIVs on a physical system to isolate the I/O of an application mounted at /mount/point/A so that it has no I/O impact on other NPIV storage mounted at /mount/point/B.  Using NPIVs with VMs is a logical extension of that idea.

> However, I can't find a way to either have the guest's XML config create
> the HBA or for the physical server to successfully GIVE the HBA to the
> guest.  I can give disks all day long, but I can't give the HBA to the
> guest.

I've already been down this path, no pun intended.  You do not present a virtual HBA to your guest.  You create virtual HBAs on the host, zone/mask your storage to those virtual HBAs, and present block-level devices to your VM, using storage pools and virtio devices to provide passthrough.  Your host doesn't use that storage itself beyond knowing it's zoned/masked to the vHBAs you created for it.
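To make the flow concrete, here's a rough sketch of the host-side steps as virsh commands.  The scsi_host names, XML file names and pool name are placeholders for illustration; example XML for the vHBA and the pool is further down.

    # Find an NPIV-capable HBA on the host (look for the vport_ops capability)
    virsh nodedev-list --cap scsi_host
    virsh nodedev-dumpxml scsi_host5

    # Create the vHBA from an XML description
    virsh nodedev-create /path/to/your/virtualhbacfg.xml

    # Zone/mask LUNs to the vHBA's WWPN on the array, then build a
    # storage pool on top of the new virtual scsi_host and list its LUNs
    virsh pool-define /tmp/npiv-pool.xml
    virsh pool-start npiv-pool
    virsh vol-list npiv-pool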

I learned this both by opening a case with Red Hat and by pressing the topic laterally through my organization's Sales Engineer, who contacted Red Hat's virtualization team directly.

> It seems like libvirt and virsh can only create the vHBA on the physical
> box, which defeats the purpose of working with NPIV then... I can just
> present the same luns to the REAL WWPNs of the physical boxes that need
> access, setup multipathd to give /dev/mapper names in a consistent manner,
> and give the raw disks to the guests.

Use virsh nodedev-create /path/to/your/virtualhbacfg.xml.  It will create a virtual device listed in /sys/class/scsi_host.  You then zone/mask storage to the WWPNs specified in your XML file.  I then created a storage pool using virt-manager.  I can see the storage as either mpath devices or iSCSI devices in virt-manager, but I believe the solution is to use iSCSI.  This method allows you to specify the device (choose the virtual devices), and therefore specify the paths.  Using mpath devices is not what you want, because it will look at all scsi_hosts, find all storage down all paths, and provide none of the path segregation that I believe you're seeking.
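For reference, the XML you feed to nodedev-create looks roughly like this.  The parent scsi_host and the WWNN/WWPN values below are made-up placeholders; point parent at an NPIV-capable HBA on your host and use WWPNs that match your zoning:

    <device>
      <parent>scsi_host5</parent>
      <capability type='scsi_host'>
        <capability type='fc_host'>
          <wwnn>2001001b32a9da5e</wwnn>
          <wwpn>2101001b32a9da5e</wwpn>
        </capability>
      </capability>
    </device>

virsh nodedev-create prints the name of the new device (something like scsi_host6), and a matching entry shows up under /sys/class/scsi_host and /sys/class/fc_host.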

> I really want to get the physical KVM servers out of the 'storage
> management' game aside from the basic OSes of the guests.  How can I do
> that?
> 
> (or am I missing something?)

I believe it's better to manage your storage in one place than in many.  You're virtualizing kernels here.  Might as well virtualize your storage devices in the same place.  Let the host do the hard work and let your VMs relax :)

Believe me, I understand your frustration.  I'm working on a massive project to move all my Linux VMs away from VMware (for a number of reasons).  In parallel, I'm working on RHEL on IBM Power (pSeries) and developing a KVM/RHEV infrastructure.  Of all the virtualization solutions under my organization's roof, the best engineered to date is the IBM solution, because it's the only one that fully leverages NPIV, virtual HBAs (which, yes, are passed down to the VM directly) and nuke-proof redundancy.  RHEL on POWER is as resilient as AIX on POWER, with one drawback: it's a PPC chip, which drastically limits what software I can run on it.  So I have RHEL/PPC with NPIV and then KVM with NPIV.  I am using the AIX/POWER + RHEL/POWER design as my model as I develop my KVM offering.

There are similarities and differences, but the end result is essentially the same (overlooking certain fault-tolerance aspects that put the far more expensive IBM solution ahead of the rest).

IBM: you create VMs whose sole lot in life is to handle I/O.  Those specialized systems are called VIO servers.  They create, distribute and manage NICs and vHBAs.  They handle the overhead for all disk I/O and network I/O.  They enable you to dynamically add/destroy virtual devices and grant/revoke them to the rest of the VMs on the hardware.  The VIO server does not know about the storage presented to it, though it does know about the NPIV WWPNs associated with its vHBAs as well as the VLANs associated with its network devices.

KVM/libvirt does all of this as well, but instead of managing it from within a VM, you manage it on the physical hardware.  Creating your vHBAs on the hardware doesn't really tax the physical host much, because you're leaving all the disk traffic to the HBAs themselves.  That's what they're designed to do.
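As a sketch of what that looks like in practice, here is a SCSI-type storage pool bound to a vHBA that I'm assuming came up as host6; the pool name and adapter name are placeholders:

    <pool type='scsi'>
      <name>npiv-pool</name>
      <source>
        <adapter name='host6'/>
      </source>
      <target>
        <path>/dev/disk/by-path</path>
      </target>
    </pool>

The volumes in that pool are only the LUNs visible down that one vHBA, which is exactly the path segregation discussed above.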

Both flavors of VM are entirely SAN-based.  IBM's RHEL solution knows it's installed on SAN storage devices and boots off of them directly.  KVM presents virtual SCSI devices which the VM thinks are local disks.  From the guest level, I'd personally rather troubleshoot and work with what appear to be local disks.  From top to bottom, it's just easier to build, manage, kickstart, etc.
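On the guest side, handing one of those zoned LUNs to a VM is just an ordinary virtio disk in the domain XML.  A minimal sketch, with the source path as a placeholder for whatever stable device name your setup exposes:

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/disk/by-id/wwn-0xEXAMPLE'/>
      <target dev='vdb' bus='virtio'/>
    </disk>

You can drop that into the guest's XML or feed it to virsh attach-device; either way the guest just sees /dev/vdb, i.e. what looks like a local disk.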

I'm in the final phases of my designs with KVM.  So far, I'm very impressed with it and I've even shown my snobby UNIX / IBM colleagues a thing or two to give them a run for their money :)

Lastly, don't beat yourself up.  All of this is relatively new stuff, and Red Hat is still perfecting the methodology behind it.  RHEL 6.3 is supposed to provide auto-instantiation of vHBAs on host boot, so any VMs set to auto-start won't fail while waiting around for you to manually recreate the vHBAs.
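Until then, one crude stopgap is a small init script that recreates the vHBAs from saved XML before libvirtd autostarts the guests; a rough sketch, with the directory name made up:

    #!/bin/sh
    # Recreate NPIV vHBAs from saved definitions (one XML file per vHBA)
    for cfg in /etc/libvirt/vhba/*.xml; do
        virsh nodedev-create "$cfg"
    done

Just note that a recreated vHBA may come back as a different scsi_host number, so double-check any storage pool definition that references it by adapter name.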

Hope this is helpful.

MD

> 
> --Jason



