[Linux-cluster] optimising DLM speed?

Wed Feb 23 21:48:03 UTC 2011

After running several days with the larger table sizes I don't think 
it's made any difference to individual thread performance or overall 
throughput.

Likewise, the following changes have had no effect on access time for 
large directories (but they have improved caching and improved high load 
overall performance):

Increasing the dentry and inode caches to the maximum size allowed by 
the kernel (about 128 million entries; this is limited to about 10% of 
memory).

This helped caching under load, but the cached dentry data would 
evaporate after a while until I added the following change:

(sysctl)
vm.max_reclaims_in_progress=1
vm.zone_reclaim_mode=0

(Switching to zone_reclaim_mode=0 is recommended for fileservers to 
enhance dentry/inode caching)
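
For anyone wanting to check what a node is actually running with, here 
is a rough Python sketch that reads the two sysctls above straight from 
/proc/sys, plus fs.dentry-state to watch whether the dentry count is 
shrinking over time. The helper names (read_sysctl, parse_dentry_state) 
are made up for illustration, and some of these keys may be absent on a 
given kernel, in which case the helper just returns None.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the value of a sysctl such as 'vm.zone_reclaim_mode'.

    Returns None if the key does not exist or cannot be read.
    """
    path = Path(root).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


def parse_dentry_state(text):
    """Return (nr_dentry, nr_unused) from the contents of fs.dentry-state."""
    fields = text.split()
    return int(fields[0]), int(fields[1])


if __name__ == "__main__":
    for key in ("vm.max_reclaims_in_progress", "vm.zone_reclaim_mode"):
        print(key, "=", read_sysctl(key))

    state = read_sysctl("fs.dentry-state")
    if state is not None:
        total, unused = parse_dentry_state(state)
        print("dentries:", total, "unused:", unused)
```

Running it periodically (from cron, say) and comparing the dentry 
totals should show whether the cache is still being thrown away.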

At the end of all that, the effect is only minor and the biggest bugbear 
- access to directories with more than ~150 files onboard is unusably 
slow - hasn't been addressed.
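
To put numbers on "unusably slow", something like the following sketch 
can time a listing that stats every entry (roughly what ls -l does). 
This is only an illustration, not the method used for the observations 
above; time_listing is a hypothetical helper and the directories to 
test are whatever is passed on the command line.

```python
import os
import sys
import time


def time_listing(path):
    """List a directory and lstat each entry; return (entries, seconds)."""
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # One stat call per entry, without following symlinks.
            entry.stat(follow_symlinks=False)
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    for directory in sys.argv[1:]:
        n, seconds = time_listing(directory)
        print(f"{directory}: {n} entries in {seconds:.3f}s")
```

The first run after a mount or cache drop is the interesting one; a 
second run on the same node mostly measures the dentry/inode cache.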

The change which had the largest effect on this problem - switching to 
lock_nolock - isn't practical in a production cluster environment (and 
defeats the purpose of using GFS2 anyway)

iostat shows that under heavy i/o load (1000-3000 requests/second 
but only 2-3Mb/s actual data), the kernel on one machine can sit on 
read/write requests for up to 3000ms before passing them to the storage 
devices - which usually respond within 2-5ms. It's sitting at 300ms most 
of the time and the machine concerned only has 5 FSes mounted.

The other 2 machines in the cluster not facing this kind of 
treatment (100-300 requests/second) have 30 mounts each, can easily read 
at 10-20Mb/s and have read delays of 2-10ms (mostly 3-4).
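
For comparing nodes without eyeballing iostat output, the average wait 
per request can be derived from two samples of /proc/diskstats: the 
change in milliseconds spent on reads and writes divided by the change 
in completed requests. The sketch below assumes the standard 
/proc/diskstats field layout; the function names and the 5 second 
interval are arbitrary choices for illustration.

```python
import time


def parse_diskstats(text):
    """Return {device: (ios_completed, ms_spent)} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            # Skip short lines (e.g. partition entries on older kernels).
            continue
        reads, ms_reading = int(fields[3]), int(fields[6])
        writes, ms_writing = int(fields[7]), int(fields[10])
        stats[fields[2]] = (reads + writes, ms_reading + ms_writing)
    return stats


def average_wait_ms(before, after):
    """Return {device: mean ms per completed request} between two samples."""
    result = {}
    for dev, (ios_after, ms_after) in after.items():
        if dev not in before:
            continue
        ios_before, ms_before = before[dev]
        delta_ios = ios_after - ios_before
        if delta_ios > 0:
            result[dev] = (ms_after - ms_before) / delta_ios
    return result


if __name__ == "__main__":
    with open("/proc/diskstats") as fh:
        first = parse_diskstats(fh.read())
    time.sleep(5)  # sampling interval, chosen arbitrarily
    with open("/proc/diskstats") as fh:
        second = parse_diskstats(fh.read())
    for dev, wait in sorted(average_wait_ms(first, second).items()):
        print(f"{dev}: {wait:.1f} ms average wait")
```

Devices with no completed requests in the interval are left out rather 
than reported as zero.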

Users report that these 2 machines are _fast_ when not accessing 
directories with large numbers of files onboard...