[Linux-cluster] Lock Resources

Fri May 2 12:23:16 UTC 2008

Hi, Christine:

I really appreciate your prompt and kind reply.

I have some further questions.

> > 
> > 1. Whether the kernel on each server/node is going to
> > initialize a number of empty lock resources after
> > completely rebooting the cluster?
> >
> > 2. If so, what is the default value of the number of
> > empty lock resources? Is it configurable?
> 
> There is no such thing as an "empty" lock resource. Lock resources are
> allocated from kernel memory as required. That does mean that the number
> of resources that can be held on a node is limited by the amount of
> physical memory in the system.

Does that mean the cache allocated for disk I/O will be
reduced to make room for more lock resources?

If so, consider an extremely busy node. Reducing the
cache increases physical disk I/O, which increases the
processing time (disk I/O is much slower than cache
access). That lengthens the time lock resources are
held, so the kernel takes still more memory away from
the cache to create lock resources for new requests,
and so on, until eventually no cache is left at all.
Could this ever happen?

> I think this addresses 3 & 4.

Yes, your answer does address them. Thank you.
However, what happens if an extremely busy
application needs to write more new files, so the
kernel needs to allocate more lock resources, but the
physical memory limit has been reached and none of the
existing lock resources can be released? I guess
the kernel will simply force the application into
an uninterruptible sleep until some lock resources are
released or some memory is freed. Am I right?

> > 3. Whether the number of lock resources is fixed
> > regardless the load of the server?
> >
> > 4. If not, how the number of lock resources will be
> > expended under a heavy load?
> >
> > 5. The lock manager maintains a cluster-wide directory
> > of the locations of the master copy of all the lock
> > resources within the cluster and evenly divides the
> > content of the directory across all nodes. How can I
> > check the content held by a node (what command or
> > API)?
> 
> On RHEL4 (cluster 1) systems the lock directory is viewable in
> /proc/cluster/dlm_dir. I don't think there is currently any equivalent
> in RHEL5 (cluster 2)

Thanks, that is very helpful. The first several lines
of dlm_dir from the busiest node, A, are below. How
should I interpret them?

DLM lockspace 'data'
       5         2f06768 1
       5          114d15 1
       5          120b13 1
       5         5bd1f04 1
       3          6a02f8 2
       5          cb7604 1
       5          ca187b 1

Also, there are many files under /proc/cluster. Could
you please point me to a place where I can find out
how these files are used and what their contents
mean?

> > 6. If only one node A is busy while other nodes are
> > idle all the time,  does it mean that the node A holds
> > a very big master copy of lock resources and other
> > nodes have nothing?
> 
> That's correct. There is no point in mastering locks on a remote node as
> it will just slow access down for the only node using those locks.
> 
> > 7. For the above case, what would be the content of
> > the cluster-wide directory? Only one entry as only the
> > node A is really doing IO, or many entries and the
> > number of entries is the same as the number of used
> > lock resources on the node A? If the latter case is
> > true, will the lock manager still divide the content
> > evenly to other nodes? If so, would it costs the node
> > A extra time on finding the location of the lock
> > resources, which is just on itself,  by messaging
> > other nodes?
> 
> You're correct that the lock directory will still be distributed around
> the cluster in this case and that it causes network traffic. It isn't a
> lot of network traffic (and there needs to be some way of determining
> where a resource is mastered; a node does not know, initially, if it is
> the only node that is using a resource).

> That lookup only happens the first time a resource is used by a node,
> once the node knows where the master is, it does not need to look it up
> again, unless it releases all locks on the resource.
> 

Oh, I see. Just to clarify further: does this mean that
if an application on node A needs the same lock
resource again, node A will go straight to the node it
already knows (i.e. node B) that previously held the
master, but will need to look it up again if node B
has already released the lock resource?
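
To make the question concrete, here is a toy Python model of the
behaviour as I read it from your description above: a node asks
the directory only the first time it uses a resource, and forgets
the master once it has released all of its own locks on that
resource. It is only a sketch of my understanding, not the DLM's
actual code. The Directory and Node classes and all their methods
are hypothetical names, and the rule that the first node to ask
becomes the master is an assumption of the model.

```python
class Directory:
    """Toy stand-in for the cluster-wide lock directory (hypothetical)."""

    def __init__(self):
        self.master_of = {}  # resource name -> name of the mastering node
        self.lookups = 0     # number of directory lookups performed

    def lookup(self, resource, requester):
        self.lookups += 1
        # Assumption of this model: the first node to ask becomes the master.
        return self.master_of.setdefault(resource, requester)


class Node:
    """Toy node that remembers a master while it holds locks on a resource."""

    def __init__(self, name, directory):
        self.name = name
        self.directory = directory
        self.known_master = {}  # resource name -> master learned from a lookup
        self.held = {}          # resource name -> locks this node holds

    def lock(self, resource):
        # Ask the directory only if this node does not already know the master.
        if resource not in self.known_master:
            self.known_master[resource] = self.directory.lookup(
                resource, self.name)
        self.held[resource] = self.held.get(resource, 0) + 1
        return self.known_master[resource]

    def unlock(self, resource):
        self.held[resource] -= 1
        if self.held[resource] == 0:
            # All locks on the resource released: forget where the master is.
            del self.held[resource]
            del self.known_master[resource]


directory = Directory()
node_a = Node("A", directory)

node_a.lock("r1")    # first use: one directory lookup
node_a.lock("r1")    # master already known: no lookup
node_a.unlock("r1")  # node A still holds one lock on r1
node_a.lock("r1")    # still no lookup
lookups_while_held = directory.lookups

node_a.unlock("r1")
node_a.unlock("r1")  # last lock released: master forgotten
node_a.lock("r1")    # has to ask the directory again
lookups_after_release = directory.lookups
```

In this model the second lookup is triggered by node A releasing
all of its own locks, not by anything node B does. Is that the
right reading?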

> 
> 
> I hope this helps,
> 
Yes, yes, very helpful. Thank you very much indeed.

I look forward to your reply.

Jas
