[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH v2 1/3] storage: Fix existing parent check for vHBA creation

On 07/19/2017 10:21 AM, Erik Skultety wrote:
> On Wed, Jul 19, 2017 at 09:59:06AM -0400, John Ferlan wrote:
>> On 07/19/2017 07:38 AM, Erik Skultety wrote:
>>> On Tue, Jul 18, 2017 at 11:10:38AM -0400, John Ferlan wrote:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1472277
>>>> Commit id '106930aaa' altered the order of checking for an existing
>>>> vHBA (e.g something created via nodedev-create functionality outside
>>>> of the storage pool logic) which inadvertantly broke the code to
>>>> decide whether to alter/force the fchost->managed field to be 'yes'
>>>> because the storage pool will be managing the created vHBA in order
>>>> to ensure when the storage pool is destroyed that the vHBA is also
>>>> destroyed.
>>> Right, I agree with this - unless the user explicitly states they want the
>>> pre-created vHBA to be managed, we don't force any setting. I was wondering
>>> though, what about a use case when the user wants the vHBA to be auto-created,
>>> but non-managed at the same time...(yeah, I know I'm stretching it a bit with
>>> these unlikely scenarios, but then, you'd surely agree, you've already seen
>>> some of similar sort and one can never expect what creative ways of defining
>>> the XML are there to be found :))
>> I think you're becoming the new vHBA expert ;-)
>> In this case, I tell them to go see figure one or go read the docs. From
>> the storage pool page, managed description: "For configurations that do
>> not provide an already created vHBA from a 'virsh nodedev-create',
>> libvirt will set this property to "yes"."
>>> I was also about to point out in the previous version, that I didn't like how
>>> complex virNodeDeviceCreateVport was getting - do wwnn and wwpn relate to a
>>> vHBA at all or is it a regular HBA, does the vHBA exist already or should we
>>> actually create and manage it.
>> The wwnn/wwpn are the wwnn/wwpn used to create the vHBA. Typically, a
>> SAN admin will "assign" the pair (there's some specific rules about
>> them). If not provided for a storage pool, then it's an XML error. For a
>> nodedev it's possible to have libvirt generate a wwnn/wwpn, which yes,
>> is quite confusing. In doing so libvirt uses a specific base and adds to
>> it (see virRandomGenerateWWN and cover at least one eye).
>> I agree in general about the "complexity" thing, but as time has gone on
>> requests for new ways to determine which parent to use has caused code
>> bloat. Complexity is a time factored calculation. When originally
>> implemented using "host#" for parent was perfectly fine, but then
>> someone realized that it should be "scsi_host#". Then someone said, that
>> "scsi_host#" wasn't good enough because it can change between reboots.
>> So parent by wwnn/wwpn was added. During the discussions for that
>> someone else said what about using the fabric_wwn in order to find a
>> parent. Oh and the "future" holds being able to define multiple vHBA's
>> because it's felt migration of domains using vHBA pools is going to need
>> more than one vHBA because on the target host using the same wwnn/wwpn
>> won't work (although I have doubts here, but haven't had the cycles to
>> investigate).
>> If you're interested, go read the tail end of the wiki
>> (http://wiki.libvirt.org/page/NPIV_in_libvirt) - it'll show a bit of the
>> history of how things looked at one time.
>> TBH: Sometimes I think QE reads the code and figures out a way to create
>> bugs based on assumptions the code makes rather than making sure the
>> intended use cases "work properly". Hence this whole need to know
>> whether the parent is the HBA or the vHBA. Why would *anyone* provide
>> the HBA parent wwnn/wwpn when it's perfectly fine to create a storage
>> pool of the HBA without it via:
>>     <adapter type='scsi_host' name='scsi_host3'/>
>> but no, someone wants to be inventive and think/believe:
>>     <adapter type='fc_host' [parent='scsi_host3'] wwnn='HBA_wwnn'
>> wwpn='HBA_wwpn'/>
>> should work as well (where HBA_wwnn/wwpn are the wwnn/wwpn of scsi_host3
>> in this example).
>> The second XML isn't illegal, but because scsi_host3 has vHBA
> Well, I'd call it a misconfiguration, how can a device be a parent to itself?
> That's not what the attribute's supposed to serve and using it this way is a
> misuse - we should either ignore it in that case or return error. The storage
> pool won't start but it should never have in the first place and the XML def
> won't disappear in this case, so IMHO we could and probably should forbid it.

Because it hasn't really been characterized as a misconfiguration
previously. I doubt anyone outside of QE has ever done something like
this as there's no reason to do so. If they want to use the HBA they'd
use the 'scsi_host' format.

IMO: something with 'fc_host' what uses the HBA wwnn/wwpn is
misconfigured because it's not a vHBA then, but there's been no attempt
to prohibit that configuration, hence the current mess.

I'd be perfectly fine with turning this particular bz/check into - don't
configure things this way because it's not how it's supposed to work. If
you want a storage pool backed to an HBA, then use the scsi_host syntax.
If you want a vHBA/NPIV then use the fc_host syntax.

>> characteristics (in this case the ability to create vports), thus it
>> passes certain tests and "could" be a different, but legal way to
>> essentially define the HBA as the storage pool.  Wouldn't anyone use it
>> this way - probably not as there's no reason since the vHBA LUN's won't
>> be available.  It's a digression + tirade ;-)...
>>>> This patch moves the check (and checkParent helper) for an existing
>>>> vHBA back into the createVport in storage_backend_scsi. It also
>>>> restores the logic back to what commit id '79ab0935' had with
>>> As I'm writing below, if you want to go for a partial revert, this patch needs
>>> to be split in 2, it's never a good idea to move code segments and doing
>>> adjustments to it at the same time.
>> While I understand your point, I disagree. Purely moving the code
>> doesn't work as the code in node_device_conf returns @name and code in
>> storage_backend_scsi returns -1/0, so some amount of massaging the code
>> is going to be required anyway.  May as well get it right in one shot
>> rather than confuse a future git bisector even more.
> Huh, you're right, let's go with your proposal then, with squashing in the tiny
> optimization snippet I've suggested in my previous response.
> Erik

Something I've done in my branch


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]