[Linux-cluster] du takes very long to complete on GFS partition

Wed Jun 22 10:38:29 UTC 2005

Hi,

I have built a two-node cluster with GFS running on a shared SCSI disk 
array with an integrated RAID controller. The array contains 2 partitions: 
one RAID0 partition of about 2 TB and one RAID5 partition of about 1.5 TB. 
The stripe size is 512k.

I'm using kernel 2.4.21-27 for x86_64 and GFS version 6.0.2-17.

The problem is that a du -s takes about 6 minutes on either partition 
every time the command is run. I've mounted the partitions with noatime. 
Is this a normal time for GFS to do a du run on a 2TB partition?

The SCSI bus is running fine. I get about 190 MB/s bandwidth to the array 
from both nodes, which matches the specs of the internal RAID 
controller.

While the command was running I noticed a lot of traffic on interface lo. 
This seems logical, as this node is also running the lock manager. What 
bothers me is that the traffic does not exceed about 1.5 MB/s on average. 
The loopback interface should be able to handle much more, so it looks 
like there is some sort of bottleneck, but I don't see it. Does anybody 
have a clue?

This is a partial capture of vmstat when the du command is running:
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0 179524 192808  41552 316656    0    0  1542    16  326  1473  0  4 95  2
 0  0 179524 189832  41552 319640    0    0  1492     0  305  1692  0  2 96  1
 1  1 179524 187980  41556 321488    0    0   932     6  309  1536  0  4 96  0
 0  0 179524 185372  41556 324088    0    0  1292     0  318  1516  0  2 97  1
 0  0 179524 182744  41556 326680    0    0  1296     0  324  1896  0  1 97  2
 0  0 179524 174532  41556 334872    0    0  4096     0  415  3358  1  4 94  1
 0  0 179524 173524  41556 335880    0    0   504     0  295  1622  0  0 99  0
 0  1 179524 171420  41560 337980    0    0  1050     6  374  1750  1  4 92  4
 3  0 179524 164780  41560 344620    0    0  3320     0  794  3211  1  2 91  6
 5  0 179524 161080  41560 348320    0    0  1850     0  417  2305  0  1 94  5
 0  0 179524 157660  41564 351736    0    0  1708     6  336  2366  0  4 94  2
 0  1 179524 155660  41564 353736    0    0  1008     0  306  1795  0  1 99  0
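
To put the bi column in perspective, here is a minimal Python sketch that 
averages the twelve bi samples from the capture and compares the result 
with the 190 MB/s measured to the array. It assumes vmstat reports 1 KiB 
blocks, which may not hold on every kernel; the function and constant 
names are made up for illustration.

```python
# bi column (blocks read in per second), copied from the vmstat capture.
BI_SAMPLES = [1542, 1492, 932, 1292, 1296, 4096,
              504, 1050, 3320, 1850, 1708, 1008]

BLOCK_BYTES = 1024        # assumption: vmstat counts 1 KiB blocks
ARRAY_MB_PER_S = 190.0    # sequential bandwidth measured to the array


def mean_read_rate_mb_per_s(samples, block_bytes=BLOCK_BYTES):
    """Return the average read rate in MB/s (1 MB = 10**6 bytes)."""
    mean_blocks = sum(samples) / len(samples)
    return mean_blocks * block_bytes / 1e6


def share_of_bandwidth(rate_mb_per_s, limit_mb_per_s=ARRAY_MB_PER_S):
    """Return the read rate as a percentage of the measured array bandwidth."""
    return 100.0 * rate_mb_per_s / limit_mb_per_s


if __name__ == "__main__":
    rate = mean_read_rate_mb_per_s(BI_SAMPLES)
    print(f"mean read rate: {rate:.2f} MB/s")
    print(f"share of array bandwidth: {share_of_bandwidth(rate):.1f} %")
```

Under that block-size assumption the average comes out at roughly 
1.7 MB/s, under 1% of what the array can deliver, so raw disk bandwidth 
does not look like the limit here.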

[root@hera raid0]# pwd
/mnt/raid0
[root@hera raid0]# time du -s .
144530864       .

real    5m27.894s
user    0m0.170s
sys     0m4.110s
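
As a rough cross-check of the timing above, the following Python sketch 
converts the three time stamps to seconds and works out how much of the 
wall-clock time was spent on the CPU. The helper names are hypothetical 
and only serve the illustration.

```python
def to_seconds(stamp):
    """Convert a shell `time` stamp such as '5m27.894s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def cpu_share(real, user, sys_time):
    """Return the fraction of wall-clock time spent in user + system CPU."""
    return (to_seconds(user) + to_seconds(sys_time)) / to_seconds(real)


if __name__ == "__main__":
    share = cpu_share("5m27.894s", "0m0.170s", "0m4.110s")
    print(f"CPU share of wall-clock time: {100 * share:.1f} %")
```

The CPU share is about 1.3%, so du spends nearly all of the 5.5 minutes 
waiting rather than computing. The numbers alone do not show whether it 
waits on disk reads or on lock requests.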

kind regards,
Martijn Brizee

Linvision, The Netherlands