[Linux-cluster] GFS Tunables

Brandon Young bkyoung at gmail.com
Thu Oct 16 14:52:50 UTC 2008


Hi all,

I currently have a GFS deployment consisting of eight servers and several
GFS volumes.  One of my GFS servers is a dedicated backup server with a
second replica SAN attached to it through a second HBA.  My approach to
backups has been with tools such as rsync and rdiff-backup, run on a nightly
basis.  I am having a particular problem with one or two of my filesystems
taking a *very* long time to back up.  For example, I have /home living on
GFS.  Day-to-day performance is acceptable, but backups are hideously slow.
Every night, I kick off an rdiff-backup of /home from my backup server,
which dumps the backup onto an XFS filesystem on the replica SAN.  This
backup can take days in some cases.

We have done some investigating and found that getdents(2) calls (which
return the list of filenames present in a directory) appear to be
spectacularly slow on GFS, irrespective of the size of the directory in
question.  In particular, with 'strace -r', I'm seeing a rate below 100
filenames per second.  The filesystem /home has at least 10 million files in
it, which, doing the math, means 29.5 hours just for the getdents calls to
scan them.  That is more than a third of the backup's wall-clock time, and
it's before we even start stat'ing.
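
For anyone who wants to check the arithmetic or time a bare directory
listing without strace, here is a rough Python sketch.  It assumes that
os.scandir() on Linux ends up in getdents via readdir(3), and that iterating
without touching the entries avoids any stat calls.  The function names are
only illustrative; nothing here is GFS-specific.

```python
import os
import time


def count_entries(path):
    """List one directory without stat'ing; return (entry_count, seconds)."""
    start = time.monotonic()
    with os.scandir(path) as entries:
        # Only the names are read here; no per-entry stat is requested.
        count = sum(1 for _ in entries)
    return count, time.monotonic() - start


def estimated_hours(total_files, names_per_second):
    """Hours needed to enumerate total_files at a steady listing rate."""
    return total_files / names_per_second / 3600.0


def implied_rate(total_files, hours):
    """Names per second implied by enumerating total_files in `hours`."""
    return total_files / (hours * 3600.0)


# 10 million names at exactly 100 names/s is a lower bound on scan time.
print(f"{estimated_hours(10_000_000, 100):.1f} hours")    # 27.8 hours
# The 29.5 hour figure corresponds to a rate just under 100 names/s.
print(f"{implied_rate(10_000_000, 29.5):.1f} names/s")    # 94.2 names/s
```

Calling count_entries() on a large directory on the GFS mount and on a
local filesystem gives a quick side-by-side rate, though caching on either
side can skew a single run.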

I googled around a bit and can't find any discussion of slow getdents calls
under GFS.  Is there any chance we have some sort of tunable turned on or
off that might be causing this?  This seems awfully slow, even with
sub-optimal locking, and I'm not even sure which tunables to consider
tweaking.  Any insights would be much appreciated.

--
Brandon