[Linux-cluster] How does caching work in GFS1?
Peter Schobel
pschobel at 1iopen.net
Wed Aug 11 18:03:31 UTC 2010
Hi,
I am having an issue with a GFS1 cluster in some use cases. Mainly,
running du on a directory takes an unusually long time. I have the
filesystem mounted with noatime and nodiratime, and statfs_fast is
turned on. Running du on a 2G directory takes about 2 minutes, and
each subsequent run takes about the same amount of time. Following a
tip that I got, I switched the kernel I/O scheduler to noop (echo noop >
/sys/block/sdc/queue/scheduler). After I did so, I discovered that
the initial run of du took the same amount of time, but subsequent runs
were very fast, presumably due to some glock caching benefit (see
results below).
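For anyone who wants to reproduce the scheduler change, here is roughly the helper I'd use. The set_scheduler function name and the SYSFS_ROOT override are just for illustration (the override lets you point it at a fake tree for testing); on a real node the path is /sys:

```shell
# Sketch of the scheduler switch above. SYSFS_ROOT is overridable
# purely for illustration/testing; on a real node it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# set_scheduler DEV SCHED -- select scheduler SCHED (e.g. noop)
# for block device DEV
set_scheduler() {
    dev=$1; sched=$2
    f="$SYSFS_ROOT/block/$dev/queue/scheduler"
    if [ -w "$f" ]; then
        echo "$sched" > "$f"   # kernel switches elevators on write
    else
        echo "cannot write $f (run as root?)" >&2
        return 1
    fi
}

# e.g. set_scheduler sdc noop
```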
[testuser@buildmgmt-000 testdir]$ for ((i=0;i<=3;i++)); do time du >/dev/null; done
real 2m10.133s
user 0m0.193s
sys 0m14.579s

real 0m1.948s
user 0m0.043s
sys 0m1.048s

real 0m0.277s
user 0m0.034s
sys 0m0.240s

real 0m0.274s
user 0m0.033s
sys 0m0.239s
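To make the cold-run/warm-run difference easier to see, the loop above can be wrapped so caches are dropped once before the first run. The time_du helper below is only an illustration of what I ran, using millisecond timestamps instead of bash's time builtin; writing /proc/sys/vm/drop_caches needs root, so the sketch skips that step when it can't:

```shell
# time_du DIR N -- time N consecutive du runs over DIR, dropping kernel
# caches first so run 1 is cold and later runs show any caching benefit.
time_du() {
    dir=$1; n=${2:-4}
    sync
    # drop_caches (pagecache, dentries, inodes) needs root; skip otherwise
    [ -w /proc/sys/vm/drop_caches ] && echo 3 > /proc/sys/vm/drop_caches
    i=1
    while [ "$i" -le "$n" ]; do
        start=$(date +%s%N)
        du -s "$dir" >/dev/null
        end=$(date +%s%N)
        echo "run $i: $(( (end - start) / 1000000 )) ms"
        i=$((i + 1))
    done
}

# e.g. time_du /mnt/gfs/testdir 4
```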
This looked very promising, but the same speedup was not realized
when traversing our full directory tree. Below are the results for a
30G directory tree on the same filesystem.
[testuser@buildmgmt-000 main0]$ for ((i=0;i<=3;i++)); do time du >/dev/null; done
real 5m41.908s
user 0m0.596s
sys 0m36.141s

real 3m45.757s
user 0m0.574s
sys 0m43.868s

real 3m17.756s
user 0m0.484s
sys 0m44.666s

real 3m15.267s
user 0m0.535s
sys 0m45.981s
I have been trying to tweak tunables such as glock_purge and
reclaim_limit, but to no avail. I assume I am running up against some
kind of cache size limit, but I'm not sure how to work around it. No
other cluster nodes are accessing the same test data, so there should
not be any lock contention issues. If I could get the same speedup on
the 30G tree as I'm getting on the 2G directory, I would be very
happy, and so would the users on the cluster. Any help would be
appreciated.
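For the record, the tunables I've been poking at are set per-mountpoint with gfs_tool settune. The wrapper below just shows the shape of the commands; the values and the /mnt/gfs path are examples from my testing, not recommendations, and with DRY_RUN=1 (the default here) it only prints each command instead of running gfs_tool:

```shell
# Shape of the tuning commands I've been experimenting with. Values and
# mountpoint are examples only; DRY_RUN=1 (default) prints each command
# instead of invoking gfs_tool.
tune_glocks() {
    mnt=$1
    for t in "glock_purge 50" "reclaim_limit 5000"; do
        cmd="gfs_tool settune $mnt $t"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

# e.g. tune_glocks /mnt/gfs
```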
Regards,
--
Peter Schobel