[Cluster-devel] [Upstream patch] DLM: Convert rsb data from linked list to rb_tree

Sat Oct 8 10:13:52 UTC 2011

----- Original Message -----
| On Wed, Oct 05, 2011 at 03:25:39PM -0400, Bob Peterson wrote:
| > Hi,
| > 
| > This upstream patch changes the way DLM keeps track of RSBs.
| > Before, they were in a linked list off a hash table.  Now,
| > they're an rb_tree off the same hash table.  This speeds up
| > DLM lookups greatly.
| > 
| > Today's DLM is faster than older DLMs for many file systems,
| > (e.g. in RHEL5) due to the larger hash table size.  However,
| > this rb_tree implementation scales much better.  For my
| > 1000-directories-with-1000-files test, the patch doesn't
| > show much of an improvement.  But when I scale the file system
| > to 4000 directories with 4000 files (16 million files), it
| > helps greatly. The time to do rm -fR /mnt/gfs2/* drops from
| > 42.01 hours to 23.68 hours.
| 
| How many hash table buckets were you using in that test?
| If it was the default (1024), I'd be interested to know how
| 16k compares.

Hi,

Interestingly, on the stock 2.6.32-206.el6.x86_64 kernel
with 16K hash buckets, the time was virtually the same as
with my patch: 1405m46.519s (23.43 hours). So perhaps we
should re-evaluate whether to use the rb_tree implementation
or just increase the number of hash buckets as needed.
I guess the question now is mainly about scaling, and about
memory usage for all those hash tables.
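
For a rough feel of the trade-off, here is a back-of-the-envelope
sketch (not measured data, and not DLM code). It assumes about one
RSB per file, an even spread across buckets, and a perfectly
balanced tree; the function names are made up for illustration.

```python
import math

# Assumption: roughly one RSB per file in the 4000 x 4000 test.
RESOURCES = 4000 * 4000


def avg_chain_length(num_resources, num_buckets):
    """Average number of entries per bucket, assuming an even hash spread."""
    return num_resources / num_buckets


def list_compares(chain_len):
    """Average comparisons for a successful search of an unsorted list."""
    return (chain_len + 1) / 2


def tree_compares(chain_len):
    """Approximate depth of a perfectly balanced binary tree of that size.

    A real rb_tree can be up to about twice this deep.
    """
    return math.log2(chain_len + 1)


if __name__ == "__main__":
    for buckets in (1024, 16 * 1024):
        n = avg_chain_length(RESOURCES, buckets)
        print(f"{buckets:6d} buckets: {n:10.1f} entries/bucket, "
              f"list ~{list_compares(n):8.1f} compares, "
              f"tree ~{tree_compares(n):5.1f} compares")
```

Under those assumptions, going from 1024 to 16K buckets shortens
each list by a factor of 16, while the tree depth changes by only
a few levels. The estimate says nothing about where the remaining
23 hours go.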

Regards,

Bob Peterson
Red Hat File Systems