[Linux-cluster] strange slowness of ls with 1 newly created file on gfs 1 or 2

Wendy Cheng wcheng at redhat.com
Wed Jul 11 17:01:55 UTC 2007


Christopher Barry wrote:
> On Tue, 2007-07-10 at 22:23 -0400, Wendy Cheng wrote:
>   
>> Pavel Stano wrote:
>>
>>     
>>> and then run touch on node 1:
>>> serpico# touch /d/0/test
>>>
>>> and ls on node 2:
>>> dinorscio:~# time ls /d/0/
>>> test
>>>
>> What did you expect from a cluster filesystem? When you touch a file 
>> on node 1, it is a "create" that requires at least 2 exclusive locks 
>> (the directory lock and the file lock itself, among many other things). 
>> On a local filesystem such as ext3, disk activity is deferred by the 
>> filesystem cache: "touch" writes the data into the cache and "ls" reads 
>> it back from the cache on the very same node - all memory operations. 
>> On a cluster filesystem, when you do an "ls" on node 2, node 2 has to 
>> ask node 1 to release the locks (a few ping-pong messages between the 
>> two nodes and the lock managers over the network), and the contents of 
>> node 1's cache must be synced to the shared storage. Only after node 2 
>> obtains the locks can it read the contents from disk.
>>
>> I hope the above explanation is clear.
>>
>>> and one last thing: I tried gfs2, but got the same result
>>>
>> -- Wendy
>
> This seems a little odd to me. I've been running a RH 7.3 cluster
> with pre-Red Hat Sistina GFS, lock_gulm, and 1Gb FC shared disk
> since ~2002.
>
> Here's the timing I get for the same basic test between two nodes:
>
> [root@sbc1 root]# cd /mnt/gfs/workspace/cbarry/
> [root@sbc1 cbarry]# mkdir tst
> [root@sbc1 cbarry]# cd tst
> [root@sbc1 tst]# time touch testfile
>
> real    0m0.094s
> user    0m0.000s
> sys     0m0.000s
> [root@sbc1 tst]# time ls -la testfile
> -rw-r--r--    1 root     root            0 Jul 11 12:20 testfile
>
> real    0m0.122s
> user    0m0.010s
> sys     0m0.000s
> [root@sbc1 tst]#
>
> Then immediately from the other node:
>
> [root@sbc2 root]# cd /mnt/gfs/workspace/cbarry/
> [root@sbc2 cbarry]# time ls -la tst
> total 12
> drwxr-xr-x    2 root     root         3864 Jul 11 12:20 .
> drwxr-xr-x    4 cbarry   cbarry       3864 Jul 11 12:20 ..
> -rw-r--r--    1 root     root            0 Jul 11 12:20 testfile
>
> real    0m0.088s
> user    0m0.010s
> sys     0m0.000s
> [root@sbc2 cbarry]#
>
>
> Now, you cannot tell me 10 seconds is 'normal' for a clustered fs. That
> just does not fly. My guess is DLM is causing problems.
>
>   
From the previous post we really can't tell, since the network and disk 
speeds are unknown variables. However, look at your data:

local "ls" is 0.122s
remote "ls" is 0.088s

I suspect the disk flush happened during the first "ls" (different base 
kernels handle dirty-data flushing and I/O scheduling differently). I 
can't be convinced that DLM is the issue - unless the experiment 
collects enough samples to be statistically significant.
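
If someone wants to test this properly, here is a rough sketch (my 
assumptions: passwordless ssh from node 2 to node 1, and the /d/0 
mount point and hostnames from the original post - adjust the names 
to your setup). It creates a fresh file on node 1 each round and 
records many "ls" timings on node 2, so the flush cost can be 
separated from the steady-state lock traffic:

#!/bin/sh
# Run on node 2 (dinorscio). A hypothetical sampling script, not a
# supported tool - just enough to get a timing distribution.
DIR=/d/0
N=20
i=1
while [ $i -le $N ]; do
    # create a new file on node 1 to force the cross-node lock exchange
    ssh serpico "touch $DIR/test.$i"
    # uncomment the next line to flush node 1's dirty data first; this
    # isolates the lock ping-pong cost from the cache-sync cost:
    # ssh serpico sync
    /usr/bin/time -p ls $DIR > /dev/null 2>> ls-times.txt
    i=`expr $i + 1`
done
# ls-times.txt now holds N "real/user/sys" samples to average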

-- Wendy




