[Linux-cluster] optimising DLM speed?

Steven Whitehouse swhiteho at redhat.com
Thu Feb 24 12:29:48 UTC 2011


On Wed, 2011-02-23 at 21:48 +0000, Alan Brown wrote:
> After running several days with the larger table sizes I don't think 
> it's made any difference to individual thread performance or overall 
> throughput.
> Likewise, the following changes have had no effect on access time for 
> large directories (but they have improved caching and improved high load 
> overall performance):
> Increasing dentry and inode caches to the maximum size allowed by the 
> kernel (about 128 million entries; this is limited to about 10% of 
> memory).
> This helped caching under load, but until I added the following change:
> (sysctl)
> vm.max_reclaims_in_progress=1
> vm.zone_reclaim_mode=0
> the cached dentry data would evaporate after a while.
> (Switching to zone_reclaim_mode=0 is recommended for fileservers to 
> enhance dentry/inode caching)
> At the end of all that, the effect is only minor and the biggest bugbear 
> - access to directories with more than ~150 files onboard is unusably 
> slow - hasn't been addressed.
That doesn't sound like it is related to a DLM issue. 150 entries is not
a lot. What do you mean by "access" in this case? Just looking up a
single file in the directory, creating or deleting files, an ls -l
(implying a stat of each file), or what exactly?
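
To separate those cases, something along these lines could time a plain
listing against a per-entry stat pass. This is only a rough sketch: the
function name time_directory_access is made up for illustration, and the
demo runs against a throwaway local directory rather than a GFS2 mount.

```python
import os
import tempfile
import time


def time_directory_access(path):
    """Time a plain listing and a per-entry stat pass (roughly ls vs ls -l)."""
    t0 = time.perf_counter()
    names = os.listdir(path)                 # readdir only, like a plain ls
    t1 = time.perf_counter()
    for name in names:
        os.lstat(os.path.join(path, name))   # one stat per entry, like ls -l
    t2 = time.perf_counter()
    return {"entries": len(names), "list_s": t1 - t0, "stat_s": t2 - t1}


if __name__ == "__main__":
    # Hypothetical demo: 200 empty files in a temporary local directory.
    # Point the function at a directory on the GFS2 mount for real numbers.
    with tempfile.TemporaryDirectory() as d:
        for i in range(200):
            open(os.path.join(d, "f%d" % i), "w").close()
        print(time_directory_access(d))
```

If the stat pass dominates while the listing stays quick, that points at
per-inode work rather than the directory read itself.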

> The change which had the largest effect on this problem - switching to 
> lock_nolock - isn't practical in a production cluster environment (and 
> defeats the purpose of using GFS2 anyway)
> iostat shows that under heavy I/O load (1000-3000 requests/second 
> but only 2-3Mb/s actual data), the kernel on one machine can sit on 
> read/write requests for up to 3000ms before passing them to the storage 
> devices - which usually respond within 2-5ms. It's sitting at 300ms most 
> of the time and the machine concerned only has 5 FSes mounted.
> The other 2 machines in the cluster not facing this kind of 
> treatment (100-300 requests/second) have 30 mounts each, can easily read 
> at 10-20Mb/s and have read delays of 2-10ms (mostly 3-4).
> Users report that these 2 machines are _fast_ when not accessing 
> directories with large numbers of files onboard...
Again, figuring out the exact workload should help us get to the bottom
of what is going on here. How are you measuring the delays reported
above? Is it the syscall service time, for example?
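
For comparison with syscall-level timings, here is a rough sketch of how
an iostat-style average wait can be derived from two samples of
/proc/diskstats. The helper names and the sample lines for "sdb" are
invented for illustration, and the field positions assume the standard
diskstats layout. The figure covers time in the block layer queue plus
device service time, so it says nothing about time spent waiting on
locks higher up.

```python
def parse_diskstats_line(line):
    """Pick the I/O counters out of one /proc/diskstats line."""
    f = line.split()
    return {
        "name": f[2],
        "reads": int(f[3]),      # reads completed
        "read_ms": int(f[6]),    # milliseconds spent reading
        "writes": int(f[7]),     # writes completed
        "write_ms": int(f[10]),  # milliseconds spent writing
    }


def await_ms(before, after):
    """Average milliseconds per completed request between two samples."""
    ios = (after["reads"] - before["reads"]) + (after["writes"] - before["writes"])
    if ios == 0:
        return 0.0
    ms = (after["read_ms"] - before["read_ms"]) + (after["write_ms"] - before["write_ms"])
    return ms / ios


if __name__ == "__main__":
    # Made-up sample lines for a hypothetical device sdb; on a live system
    # read /proc/diskstats twice, a few seconds apart.
    s1 = parse_diskstats_line("8 16 sdb 100 0 800 200 50 0 400 300 0 400 500")
    s2 = parse_diskstats_line("8 16 sdb 200 0 1600 500 100 0 800 600 0 900 1100")
    print(s2["name"], await_ms(s1, s2))
```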


