[libvirt] [PATCH 2/2] storage: Polling the sysfs for pool with "fc_host" type adapter

Wed Jan 22 14:16:51 UTC 2014

On 22/01/14 22:02, John Ferlan wrote:
>
> On 01/07/2014 05:37 AM, Osier Yang wrote:
>> On 07/01/14 02:30, John Ferlan wrote:
>>> On 01/06/2014 05:19 AM, Osier Yang wrote:
>>>> The SCSI device corresponding to the vHBA might not show up in
>>>> sysfs yet when we trying to scan the LUNs. As a result, we will
>>>> end up with an empty volume set for the pool after pool-start,
>>>> even if there are LUNs.
>>> So what causes the "delay" to get the LUN's into sysfs?
>> It's basicly from the delay of udev.
>>
>>>    Is there
>>> something that can be done at creation time (or wherever) to sync that
>>> sooner?
>> I thought like that, let's say at the point of "createVport". But the
>> "createVport" is just to create the vHBA, and nothing else to do,
>> the left work for device showing up in the sysfs tree is on the
>> udev then.
>>
>> Polling right after "createVport" for the SCSI device in sysfs tree
>> will take more time.
>>
>>> Is there a way to determine that the SCSI device hasn't shown
>>> up yet other than the readdir()?  You're adding a delay/loop for some
>>> other subsystem's inability? to sync or provide the resources.
>> It's the only way AFAIK.
>>
>>>> Though the time of the device showing up is rather depended,
>>>> better than doing nothing, this patch introduces the polling
>>>> with 5 * 1 seconds in maximum (the time works fine on my
>>>> testing machine at least). Note that for the pool which doesn't
>>>> have any LUN, it will still take 5 seconds to poll, but it's
>>>> not a bad trade, 5 seconds is not much, and in most cases,
>>>> one won't use an empty pool in practice.
>>> Since the paths that call into this only seem to be via refresh and
>>> perhaps a subsequent refresh would resolve things.
>> Exactly, in most cases it will work, since the time window between
>> pool-start and pool-refresh should be enough for the SCSI device
>> showing up in the sysfs tree, *generally*.  but it's not necessary
>> to be that.
>>
>>> Could this be better
>>> served by documenting that it's possible that depending on circumstance
>>> "X" (answer to my first question) that in order to see elements in the
>>> pool, one may have to reload again.  Assuming of course that the next
>>> reload would find them...
>> I thought like this too, but the problem is it's not guaranteed that
>> the volume could be loaded after execute "pool-refresh" one time,
>> may be 2 times, 3 times, ... N times.  Also the time of each
>> "pool-refresh" is not fixed, it depends on how long  the
>> "udevadm settle" (see virFileWaitForDevices) will take.
>>
>>> I guess I'm a bit cautious about adding randomly picked timeout values
>>> based on some test because while it may work for you, perhaps it's 10
>>> seconds for someone else. While you may consider a 5 second pause "OK"
>>> and "reasonable" a customer may not consider that to be reasonable.
>>> People (and testers) do strange and random things.
>> Exactly, this is not the only problem we faced regarding to the
>> storage stuffs, and the users keeps asking why, why, and why.
>>
>>> Furthermore, could it be possible that you "catch" things perfectly and
>>> only say 10 of 100 devices are found... But if you waited another 5
>>> seconds the other 90 devices would show up..  I think by adding this
>>> code you end up down a slippery slope of handing fc_host devices
>>> specially...
>> We are exactly on the same page, but the question is what the
>> best solution we should provide? It looks ugly if we add documentation
>> saying one should use pool-refresh after the pool is started, if the
>> volumes are not loaded, but how many times to use the pool-refresh
>> is depended?  This patch was basicly a proposal for discussion. I
>> didn't expect it could be committed smoothly.
>>
> I'm still not convinced that the retry loop is the right way to go.  We
> could conceivably still have a failure even after 5 retries at once per
> second.
>
> Also is sleep() the right call or should it be usleep()?  I have this
> alarm (sic) going off in the back of my head about using sleep() in a
> threaded context.  Although I do see node_device_driver.c uses it...
>
> Regardless of whether we (u)sleep() or not, if we still have a failure,
> then we're still left documenting the fact that for fc_host type pools
> there are "conditions" that need to be met which are out of the control
> of libvirt's scope of influence that would cause the pool to not be
> populated. The "user" is still left trying to refresh multiple times.
> And we still don't know exactly whey we're not seeing the devices. The
> reality is this is also true for other pools as well since it's not
> guaranteed that processLU() is ever called and 'retval' is always 0 (and
> probably could be removed since it's pointless).

I gave up hacking in the code, instead, I'm making a patch to add
document in virsh manual as a "Note" to prompt the fact.  Something
like this:

<...>
+B<Note>: For pool which relies on remote resources, e.g. a "iscsi" type
+pool, or a "scsi" type pool based on a (v)HBA, since the corresponding
+devices for the volumes may not show up on the host's sysfs yet during
+the pool starting process, it may need to refresh the pool (see
+B<pool-refresh>) to get the volumes correctly loaded. How many times needed
+for the refreshing is depended on the network connection and the time
+the host takes to export the corresponding devices to sysfs.
</...>

Osier