[Cluster-devel] GFS2: glock statistics gathering (RFC)

David Teigland teigland at redhat.com
Fri Nov 4 16:31:52 UTC 2011


On Fri, Nov 04, 2011 at 03:19:49PM +0000, Steven Whitehouse wrote:
> The three pairs of mean/variance measure the following
> things:
> 
>  1. DLM lock time (non-blocking requests)

You don't need to track and save this value, because all results will be
one of three values, which you can measure once:

short: the dir node and master node are local: 0 network round trips
medium: one is local, one is remote: 1 network round trip
long: both are remote: 2 network round trips

Once you've measured values for short/med/long, then you're done.
The distribution will depend on the usage pattern.
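
A rough sketch of that classification, in plain userspace C.  The names
(lock_latency_class, classify_lock_latency, latency_histogram) are made
up for illustration and are not dlm or gfs2 interfaces; the sketch also
assumes the caller already knows whether the dir node and master node
are local.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical latency classes; the value is the number of network round trips. */
enum lock_latency_class {
	LAT_SHORT  = 0,	/* dir node and master node both local */
	LAT_MEDIUM = 1,	/* one local, one remote */
	LAT_LONG   = 2,	/* both remote */
	LAT_NR_CLASSES
};

/* Each remote node (dir or master) costs one network round trip. */
static enum lock_latency_class classify_lock_latency(bool dir_local, bool master_local)
{
	int round_trips = (dir_local ? 0 : 1) + (master_local ? 0 : 1);

	return (enum lock_latency_class)round_trips;
}

/* Per-class request counters: the distribution that depends on usage pattern. */
struct latency_histogram {
	uint64_t count[LAT_NR_CLASSES];
};

static void latency_histogram_record(struct latency_histogram *h,
				     bool dir_local, bool master_local)
{
	h->count[classify_lock_latency(dir_local, master_local)]++;
}
```

With the three latencies measured once, only the per-class counts need
to be kept per workload.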

>  2. DLM lock time (blocking requests)

I think what you want to quantify is how much contention a given lock is
under.  A time measurement is probably not a great way to get that since
it's a combination of: the value above, how long gfs2 takes to release the
lock (itself a combination of things, including the tunable itself),
and how many nodes are competing for the lock (depends on workload).

>  3. Inter-request time (again to the DLM)

Time between gfs2 requesting the same lock?  That sounds like it might
work ok for measuring contention.
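
For illustration, a minimal sketch of keeping a running mean/variance of
the inter-request time for one lock, using Welford's online algorithm.
This is userspace C with floating point, so it is not how it would be
done in the kernel, and the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-lock state for inter-request time statistics. */
struct irt_stats {
	bool have_last;		/* true once a first request has been seen */
	uint64_t last_ns;	/* timestamp of the previous request */
	uint64_t count;		/* number of intervals recorded */
	double mean;		/* running mean of the intervals, in ns */
	double m2;		/* running sum of squared deviations */
};

/* Record a request at time now_ns and update the interval statistics. */
static void irt_record(struct irt_stats *s, uint64_t now_ns)
{
	if (s->have_last) {
		double x = (double)(now_ns - s->last_ns);
		double delta = x - s->mean;

		s->count++;
		s->mean += delta / (double)s->count;
		s->m2 += delta * (x - s->mean);
	}
	s->last_ns = now_ns;
	s->have_last = true;
}

/* Sample variance of the intervals; 0 until two intervals have been seen. */
static double irt_variance(const struct irt_stats *s)
{
	return s->count > 1 ? s->m2 / (double)(s->count - 1) : 0.0;
}
```

A small mean here would suggest the lock is being requested often, which
is the contention signal in question.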

> 1. To be able to better set the glock "min hold time"

Less for a lock with high contention?

> 2. To spot performance issues more easily

Apart from contention, I'm not sure there are many perf issues that dlm
measurements would help with.

> 3. To improve the algorithm for selecting resource groups for
> allocation (to base it on lock wait time, rather than blindly
> using a "try lock")

Don't you grab an rg lock and keep it cached?  How would lock times help?

Also, ocfs2 keeps quite a lot of locking stats you might look at.

Dave



