[Linux-cluster] How does caching work in GFS1?
Peter Schobel
pschobel at 1iopen.net
Wed Aug 11 18:03:31 UTC 2010
Hi,
I am having an issue with a GFS1 cluster in some use cases. Mainly,
running du on a directory takes an unusually long time. I have the
filesystem mounted with noatime and nodiratime, and statfs_fast is
turned on. Running du on a 2G directory takes about 2 minutes, and
each subsequent run takes about the same amount of time. Following a
tip that I got, I switched the kernel I/O scheduler to noop (echo noop >
/sys/block/sdc/queue/scheduler). After I did so, I discovered that
the initial run of du took the same amount of time, but subsequent runs
were very fast, presumably due to some glock caching benefit (see
results below).
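For anyone who wants to reproduce the scheduler change, here is roughly the helper I'd use. The set_scheduler function name and the SYSFS_ROOT override are just for illustration (the override lets you point it at a fake tree for testing); on a real node the path is /sys:

```shell
# Sketch of the scheduler switch above. SYSFS_ROOT is overridable
# purely for illustration/testing; on a real node it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# set_scheduler DEV SCHED -- select scheduler SCHED (e.g. noop)
# for block device DEV
set_scheduler() {
    dev=$1; sched=$2
    f="$SYSFS_ROOT/block/$dev/queue/scheduler"
    if [ -w "$f" ]; then
        echo "$sched" > "$f"   # kernel switches elevators on write
    else
        echo "cannot write $f (run as root?)" >&2
        return 1
    fi
}

# e.g. set_scheduler sdc noop
```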
[testuser@buildmgmt-000 testdir]$ for ((i=0;i<=3;i++)); do time du >/dev/null; done
real 2m10.133s
user 0m0.193s
sys 0m14.579s

real 0m1.948s
user 0m0.043s
sys 0m1.048s

real 0m0.277s
user 0m0.034s
sys 0m0.240s

real 0m0.274s
user 0m0.033s
sys 0m0.239s
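To make the cold-run/warm-run difference easier to see, the loop above can be wrapped so caches are dropped once before the first run. The time_du helper below is only an illustration of what I ran, using millisecond timestamps instead of bash's time builtin; writing /proc/sys/vm/drop_caches needs root, so the sketch skips that step when it can't:

```shell
# time_du DIR N -- time N consecutive du runs over DIR, dropping kernel
# caches first so run 1 is cold and later runs show any caching benefit.
time_du() {
    dir=$1; n=${2:-4}
    sync
    # drop_caches (pagecache, dentries, inodes) needs root; skip otherwise
    [ -w /proc/sys/vm/drop_caches ] && echo 3 > /proc/sys/vm/drop_caches
    i=1
    while [ "$i" -le "$n" ]; do
        start=$(date +%s%N)
        du -s "$dir" >/dev/null
        end=$(date +%s%N)
        echo "run $i: $(( (end - start) / 1000000 )) ms"
        i=$((i + 1))
    done
}

# e.g. time_du /mnt/gfs/testdir 4
```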
This looked very promising, but the same speedup was not realized
when traversing our full directory tree. Below are the results for a
30G directory tree on the same filesystem.
[testuser@buildmgmt-000 main0]$ for ((i=0;i<=3;i++)); do time du >/dev/null; done
real 5m41.908s
user 0m0.596s
sys 0m36.141s

real 3m45.757s
user 0m0.574s
sys 0m43.868s

real 3m17.756s
user 0m0.484s
sys 0m44.666s

real 3m15.267s
user 0m0.535s
sys 0m45.981s
I have been trying to tweak tunables such as glock_purge and
reclaim_limit, but to no avail. I assume I am running up against some
kind of cache size limit, but I'm not sure how to work around it. No
other cluster nodes are accessing the same test data, so there should
not be any lock contention issues. If I could get the same speedup on
the 30G tree as I'm getting on the 2G directory, I would be very
happy, and so would the users on the cluster. Any help would be
appreciated.
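For the record, the tunables I've been poking at are set per-mountpoint with gfs_tool settune. The wrapper below just shows the shape of the commands; the values and the /mnt/gfs path are examples from my testing, not recommendations, and with DRY_RUN=1 (the default here) it only prints each command instead of running gfs_tool:

```shell
# Shape of the tuning commands I've been experimenting with. Values and
# mountpoint are examples only; DRY_RUN=1 (default) prints each command
# instead of invoking gfs_tool.
tune_glocks() {
    mnt=$1
    for t in "glock_purge 50" "reclaim_limit 5000"; do
        cmd="gfs_tool settune $mnt $t"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

# e.g. tune_glocks /mnt/gfs
```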
Regards,
--
Peter Schobel