[Linux-cluster] optimising DLM speed?
ajb2 at mssl.ucl.ac.uk
Wed Feb 16 21:12:58 UTC 2011
> Yes, ls -l will always take longer because it is not just accessing
> the directory, but also every inode in the directory. As a result the
> I/O pattern will generally be poor.
I know and accept that. It's common to most filesystems but the access
time is particularly pronounced with GFS2 (presumably because of the
The problem is that users don't see things from the same point of view,
so there's a constant flow of complaints about "slow servers".
They think that holding down the number of files/directory is an
unreasonable restriction - and in some cases (NASA/ESA archives) I can't
even explain the reasons why as the people involved are unreachable.
This is despite quite documentable performance gains from breaking up
large directories even on non-cluster filesystems - we saw an ls -lR
speedup of around 700x when moving one directory structure from flat
(130k files) to nested.
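
As a rough illustration of a flat-to-nested conversion: the layout actually used for the directory above is not described, so the sketch below assumes a simple hash-based fan-out. The function names, the two-level depth and the two-character bucket width are all hypothetical choices, not anything taken from the setup discussed here.

```python
import hashlib
import os
import shutil


def shard_path(root, name, levels=2, width=2):
    """Return a nested path for `name` under `root`, e.g. root/ab/cd/name.

    The bucket names come from a SHA-1 of the file name; `levels` and
    `width` are illustrative defaults, not a recommendation.
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, name)


def nest_directory(flat_dir, nested_root):
    """Move every regular file in flat_dir into the hashed subdirectory tree."""
    with os.scandir(flat_dir) as entries:
        # Snapshot the names first so the directory is not modified mid-scan.
        names = [e.name for e in entries if e.is_file(follow_symlinks=False)]
    moved = 0
    for name in names:
        target = shard_path(nested_root, name)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.move(os.path.join(flat_dir, name), target)
        moved += 1
    return moved
```

The point of the fan-out is only that no single directory ends up holding a large share of the files; any scheme that bounds the per-directory count (date-based, prefix-based) would serve the same purpose.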
The same poor I/O pattern has a direct bearing on incremental backup
speeds - backup software has to stat() a file (at minimum - SHA hash
comparisons are even more overhead) to see if anything's changed, which
means in large directories a backup may drop down to scan rates of 10
files/second or lower and seldom exceeds 100 files/second at best.
(Bacula is pretty good about caching and issues a
posix_fadvise(POSIX_FADV_DONTNEED) after each file is checked. I just
wish other filesystem-crawling processes did the same.)
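
A minimal sketch of the pattern described above, for a crawler that wants to behave the same way: one stat per file to detect changes, and a POSIX_FADV_DONTNEED hint after reading a file so its pages do not crowd out the cache. This is not Bacula's code; the function names and the `catalog` dict of (size, mtime) pairs from a previous run are hypothetical, and `os.posix_fadvise` is only available on some Unix platforms.

```python
import hashlib
import os


def file_digest_dropping_cache(path, chunk_size=1 << 20):
    """Hash a file, then hint that its cached pages are no longer needed."""
    h = hashlib.sha256()
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            h.update(chunk)
        if hasattr(os, "posix_fadvise"):
            try:
                # offset=0, len=0 covers the whole file.
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # the hint is advisory; ignore filesystems that reject it
    finally:
        os.close(fd)
    return h.hexdigest()


def changed_files(root, catalog):
    """Yield paths whose (size, mtime_ns) differ from the previous run.

    `catalog` is a hypothetical dict {path: (size, mtime_ns)}.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # one stat per file: the cost discussed above
            if catalog.get(path) != (st.st_size, st.st_mtime_ns):
                yield path
```

Note that the hint only affects file data pages; it does nothing to reduce the per-file stat cost, which is what dominates in very large directories.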
> I assume that once the directory has been read in once,
> accesses will be much faster on subsequent occasions,
Correct - but after 5-10 idle minutes the cached information is lost and
the pattern repeats.
> It is a historical issue that we have inherited from GFS and I've
> spent some time trying to come up with a solution in kernel space, but
> in the end, a userland solution may be a better way to solve it.
In the case of NFS clients, I'm seriously looking at trying to move to
RHEL6 and use fscache - this should help reduce load a little but won't
help for uncached directories.
If you have any suggestions on the [nfs export|client mount] side that
might help things, I'm open to them.