[Linux-cluster] optimising DLM speed?
scooter at cgl.ucsf.edu
Thu Feb 24 22:40:25 UTC 2011
On 02/17/2011 01:29 PM, David Teigland wrote:
> On Thu, Feb 17, 2011 at 09:24:41PM +0000, Alan Brown wrote:
>> David Teigland wrote:
>>> Don't change the buffer size, but I'd increase all the hash table sizes to
>>> 4096 and see if anything changes.
>>> echo "4096" > /sys/kernel/config/dlm/cluster/rsbtbl_size
>>> echo "4096" > /sys/kernel/config/dlm/cluster/lkbtbl_size
>>> echo "4096" > /sys/kernel/config/dlm/cluster/dirtbl_size
>> Increasing rsbtbl_size to 4096 or higher results in FSes refusing to
>> mount and clvm refusing to start - both with "cannot allocate" errors.
>> At 2048 it works, but gfs_controld and dlm_controld exited when I
>> tried to mount all FSes on one node as a test.
>> At 1024 it seems stable.
>> The other settings seemed to have applied OK. So far, reports are
>> positive (but it's quiet at the moment)
>> I've got a strace of clvmd trying to start with rsbtbl_size set to
>> 4096. Should I post it here or would you prefer it mailed direct?
> Thanks for testing, you can post here.
Hi all. After two tries, we've modified our cluster so that all nodes
have their DLM hash table sizes increased to 1024. Initially I put the
echoes in /etc/init.d/gfs2, but it turns out that /etc/init.d/gfs2 is
effectively a no-op here: /etc/init.d/netfs mounts the GFS2 filesystems
before /etc/init.d/gfs2 is ever called, so the echoes need to run before
netfs.
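For anyone doing the same, a minimal sketch of an init-script fragment that applies the tunables before netfs runs (the configfs path matches a stock RHEL/CentOS 5 layout; the script name, chkconfig priority, and the set_dlm_tables helper are my own, not from this thread):

```shell
#!/bin/sh
# dlm-tuning: set DLM hash table sizes before netfs mounts GFS2 filesystems.
# chkconfig: 345 24 76   (hypothetical: start just before netfs at S25)

# Write the requested size into each DLM hash table knob that exists
# under the given configfs directory. Missing knobs are skipped quietly.
set_dlm_tables() {
    cfg=$1
    size=$2
    for knob in rsbtbl_size lkbtbl_size dirtbl_size; do
        if [ -w "$cfg/$knob" ]; then
            echo "$size" > "$cfg/$knob"
        fi
    done
}

set_dlm_tables "${DLM_CFG:-/sys/kernel/config/dlm/cluster}" "${SIZE:-1024}"
```

Note these knobs only exist after the dlm module is loaded and configfs is mounted, so the script has to run after those but before any GFS2 mount.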
At any rate, we have noticed a significant perceived improvement in
overall system performance. Where before it was common to see imap
processes in D wait (uninterruptible sleep) -- sometimes hanging for
long periods of time -- we have not seen that at all since increasing
the hash table sizes. So far, so good!
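For anyone watching for the same symptom, a quick one-liner sketch (not from this thread) to list processes currently stuck in D wait:

```shell
# Print processes in D (uninterruptible sleep) state: state, pid, command.
ps -eo state,pid,comm | awk '$1 ~ /^D/ { print }'
```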