[Linux-cluster] Behavior of "statfs_fast" settune

Wed Jan 16 18:58:35 UTC 2008

Mathieu Avila wrote:
>
>>>>
>>>> I am in the process of evaluating the performance gain of
>>>> the "statfs_fast" patch.
>>>> Once the FS is mounted, I perform "gfs_tool settune ...." and then
>>>> I measure the time to perform "df" on a partially filled FS. The
>>>> time is almost the same, "df" returns almost instantly, with a
>>>> value really near the truth, and progressively reaching the true
>>>> one.
>>>>
>>>> But I have noticed that when the FS size increases, the time to
>>>> perform "gfs_tool settune ...." increases dramatically. In fact,
>>>> after a few measurements, it appears that the time to perform "df"
>>>> without fuzzy statfs is the same as the time to activate fuzzy
>>>> statfs. 
>>>>         
>>> In theory, this shouldn't happen. Are you on RHEL 4 or RHEL 5 ? And 
>>> what is the FS size that causes this problem ?
>>>
>>>       
>> I just did a quick try. It doesn't happen to me. By reading your
>> note, were you *repeatedly* issuing "gfs_tool settune .." then
>> followed by "df" ? Remember the "settune" is expected to be run
>> *once* right after the particular GFS filesystem is mounted. You
>> certainly *can* run it multiple times. It won't hurt anything.
>> However, each time the "settune" is invoked, the code has to perform
>> a regular "df" (i.e. that's the way it initializes itself). I suspect
>> this is the cause of your issue. Let me know either way.
>>
>>     
>
> I am using "cluster-1.03" with the statfs_fast patch from:
> http://www.redhat.com/archives/cluster-devel/2007-March/msg00124.html
> (has this been changed after ?)
> All this on CentOS 5.
>
> My use case is :
>  * mkfs of a volume
>  * mount on all 6 nodes 
>  * timing of "settune statfs_fast 1", on all 6 nodes. 
>  * timing of "df" on one node.
> All commands are executed immediately one after the other.
>
> So I issued only one "settune", on all nodes, and was expecting it to
> return immediately. From what you've just said (settune performing a
> real "df"), I guess this behaviour is normal.
>   

yes ...
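
A rough way to see this is to time the two steps separately right after
mount. The helper below is only a sketch: the function names and the
mount point are made up for illustration, and it assumes the usual
"gfs_tool settune <mountpoint> <parameter> <value>" form on a node with
the statfs_fast patch applied.

```python
import subprocess
import time


def time_command(argv):
    """Run argv and return its wall-clock duration in seconds.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def measure(mount_point):
    """Time 'settune statfs_fast 1' and then 'df' on one mount point.

    mount_point is a placeholder such as "/mnt/gfs"; this must run as
    root on a node where the GFS file system is already mounted.
    """
    settune_secs = time_command(
        ["gfs_tool", "settune", mount_point, "statfs_fast", "1"]
    )
    df_secs = time_command(["df", mount_point])
    return settune_secs, df_secs
```

If settune really performs a full "df" to initialize itself, the first
number should grow with the FS size while the second stays small.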

> I don't understand why it's necessary to perform a real "df" in
> "settune". Isn't the license inode used to store the previous
> values of "df" so that it can give an immediate answer to "df", and then
> perform a real regular "df" in background to update the "cached df" to
> the real value ?
>   

For GFS1, we can't change the disk layout, so we borrow the "license"
file, which happens to be an unused on-disk GFS1 file. There is only one
per file system, compared to GFS2, which uses N+1 files (N is the number
of nodes in the cluster) to handle the "df" statistics. Every node keeps
its changes in a memory buffer and syncs its local changes to the master
(license) file every 30 seconds. Upon unclean shutdown (or crash), the
local changes in the memory buffer are lost. To re-sync the statistics,
we need to run a real "df" (which scans the on-disk RGRP structures) to
correct them. For details, check out one of my old write-ups at:

http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_fast_statfs.R4
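
As a toy illustration of the scheme described above (not the actual GFS
code), the model below keeps one master free-block count, lets each node
accumulate an in-memory delta, and shows why a crash before the periodic
sync leaves the master wrong until a full scan is done. All class and
method names are hypothetical, and returning only the master value from
the fast path is a simplifying assumption.

```python
class Filesystem:
    """Toy model: per-RGRP free counts are the on-disk truth."""

    def __init__(self, rgrp_free):
        self.rgrp_free = list(rgrp_free)
        self.master_free = 0  # stands in for the "license" file contents

    def slow_statfs(self):
        """Real 'df': scan every resource group (cost grows with FS size)."""
        return sum(self.rgrp_free)

    def enable_fast_statfs(self):
        """What settune does in this model: initialize from a full scan."""
        self.master_free = self.slow_statfs()

    def fast_statfs(self):
        """Fast 'df': read the master value only (assumed simplification)."""
        return self.master_free


class Node:
    """One cluster node with an unsynced, in-memory delta."""

    def __init__(self, fs):
        self.fs = fs
        self.pending = 0  # change in free blocks not yet written to master

    def allocate(self, rgrp_index, blocks):
        self.fs.rgrp_free[rgrp_index] -= blocks
        self.pending -= blocks

    def sync(self):
        """Periodic flush (every 30 seconds in the description above)."""
        self.fs.master_free += self.pending
        self.pending = 0

    def crash(self):
        """Unclean shutdown: the in-memory delta is simply lost."""
        self.pending = 0


if __name__ == "__main__":
    fs = Filesystem([100, 100, 100])
    fs.enable_fast_statfs()
    a, b = Node(fs), Node(fs)
    a.allocate(0, 10)
    a.sync()
    b.allocate(1, 20)
    b.crash()  # master never hears about these 20 blocks
    print("fast:", fs.fast_statfs(), "slow:", fs.slow_statfs())
    fs.enable_fast_statfs()  # a full scan repairs the master value
    print("fast after rescan:", fs.fast_statfs())
```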

-- Wendy