[Cluster-devel] [Upstream patch] DLM: Convert rsb data from linked list to rb_tree

David Teigland teigland at redhat.com
Mon Oct 10 14:43:20 UTC 2011


On Sat, Oct 08, 2011 at 06:13:52AM -0400, Bob Peterson wrote:
> ----- Original Message -----
> | On Wed, Oct 05, 2011 at 03:25:39PM -0400, Bob Peterson wrote:
> | > Hi,
> | > 
> | > This upstream patch changes the way DLM keeps track of RSBs.
> | > Before, they were in a linked list off a hash table.  Now,
> | > they're an rb_tree off the same hash table.  This speeds up
> | > DLM lookups greatly.
> | > 
> | > Today's DLM is faster than older DLMs (e.g. the one in
> | > RHEL5) for many file systems, due to the larger hash
> | > table size.  However,
> | > this rb_tree implementation scales much better.  For my
> | > 1000-directories-with-1000-files test, the patch doesn't
> | > show much of an improvement.  But when I scale the file system
> | > to 4000 directories with 4000 files (16 million files), it
> | > helps greatly. The time to do rm -fR /mnt/gfs2/* drops from
> | > 42.01 hours to 23.68 hours.
> | 
> | How many hash table buckets were you using in that test?
> | If it was the default (1024), I'd be interested to know how
> | 16k compares.
> 
> Hi,
> 
> Interestingly, on the stock 2.6.32-206.el6.x86_64 kernel
> and 16K hash buckets, the time was virtually the same as
> with my patch: 1405m46.519s (23.43 hours). So perhaps we
> should re-evaluate whether to use the rb_tree
> implementation or just increase the number of hash buckets
> as needed. I guess the question now mainly comes down to
> scaling and memory usage for all those hash buckets.

I'm still interested in possibly using an rbtree with fewer hash buckets.
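
As a rough sketch (not the exact code from the patch), a per-bucket
rb_tree keyed on the resource name could look something like the
following, using the kernel rbtree API. The dlm_rsb fields here
(res_hashnode, res_name, res_length) are simplified stand-ins for
illustration:

#include <linux/rbtree.h>
#include <linux/string.h>

/* Simplified rsb: just the rb_node and the name used as the sort key. */
struct dlm_rsb {
	struct rb_node res_hashnode;	/* links the rsb into its bucket's tree */
	int res_length;			/* length of res_name */
	char res_name[64];		/* resource name, the comparison key */
};

/* Compare a search key (name, len) against an rsb already in the tree. */
static int rsb_cmp(const char *name, int len, struct dlm_rsb *r)
{
	int minlen = (len < r->res_length) ? len : r->res_length;
	int rv = memcmp(name, r->res_name, minlen);

	if (rv)
		return rv;
	return len - r->res_length;
}

/* O(log n) search of one hash bucket's tree instead of a list walk. */
static struct dlm_rsb *search_rsb_tree(struct rb_root *tree,
				       const char *name, int len)
{
	struct rb_node *node = tree->rb_node;

	while (node) {
		struct dlm_rsb *r = rb_entry(node, struct dlm_rsb, res_hashnode);
		int rv = rsb_cmp(name, len, r);

		if (rv < 0)
			node = node->rb_left;
		else if (rv > 0)
			node = node->rb_right;
		else
			return r;
	}
	return NULL;
}

/* Insert keeps the tree ordered; rb_insert_color rebalances it. */
static void insert_rsb_tree(struct rb_root *tree, struct dlm_rsb *rsb)
{
	struct rb_node **p = &tree->rb_node, *parent = NULL;

	while (*p) {
		struct dlm_rsb *r = rb_entry(*p, struct dlm_rsb, res_hashnode);

		parent = *p;
		if (rsb_cmp(rsb->res_name, rsb->res_length, r) < 0)
			p = &(*p)->rb_left;
		else
			p = &(*p)->rb_right;
	}
	rb_link_node(&rsb->res_hashnode, parent, p);
	rb_insert_color(&rsb->res_hashnode, tree);
}

The point is that each lookup becomes O(log n) within a bucket instead
of a linear scan, which is why fewer, larger buckets could still
perform well.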

At the same time, I think the bigger problem may be why gfs2 is caching so
many locks in the first place, especially for millions of unlinked files
whose locks will never benefit you again.



