[Linux-cluster] GFS 6.0 lt_high_locks value in cluster.ccs

Jonathan Woytek woytek+ at cmu.edu
Tue Jan 10 03:37:35 UTC 2006


Chris Feist wrote:

> Yes, issue #2 could definitely be the cause of your first issue. 
> Unfortunately you'll need to bring down your cluster to change the value 
> of lt_high_locks.  What is its value currently?  And how much memory do 
> you have on your gulm lock servers?  You'll need about 256M of RAM for 
> gulm for every 1 Million locks (plus enough for any other process and 
> kernel).
> 
> On each of the gulm clients you can also cat /proc/gulm/lockspace to see 
> which client is using most of the locks.

Thanks for the response!  I figured I would probably have to bring down 
the cluster to change the highwater setting, but I was hoping a bit that 
it could be changed dynamically.  Oh well.
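
For anyone who finds this thread later: the highwater setting lives in 
cluster.ccs under the lock_gulm section.  If I remember the syntax 
right, the stanza looks roughly like this (cluster and server names are 
placeholders, and the value shown is just my recollection of the 
default):

cluster {
    name = "examplecluster"
    lock_gulm {
        servers = ["lock1", "lock2", "lock3"]
        lt_high_locks = 1048576
    }
}

As Chris said, the whole cluster has to come down for a change there to 
take effect.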

The value is currently at the default, which I want to say is something 
like 1.04M.  These machines are both lock servers and samba/NFS servers, 
and each has 4GB of RAM (I have three lock servers in the cluster, and 
all three have 4GB).

A previous RedHat service call has me running the hugemem kernel on all 
three (the issue there was that, under even light activity load, lowmem 
would be exhausted and the machines would enter an OOM spiral of death). 
Now that I have turned off hyperthreading, though, memory usage seems 
dramatically lower than it was before that change.  For instance, the 
machine running samba services has been up since I turned off 
hyperthreading on Friday night, and today it was under some pretty heavy 
load.  On a normal day, prior to the hyperthreading change, I'd be down 
to maybe 500MB of lowmem free by now (out of 3GB), and the only way to 
completely reclaim that memory was to reboot.  Instead, I'm sitting here 
looking at this machine, and it has 3.02GB of 3.31GB free.  I'll have to 
let this run for a while to make sure it isn't a red herring, but it 
looks much better than it ever has in the past.
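
(Running Chris's rule of thumb: at roughly 256MB per million locks, the 
current ~1M highwater mark implies about 256MB for gulm, and even 
doubling it to 2M locks should only be around 512MB, which these 4GB 
boxes ought to be able to absorb as long as lowmem holds up.)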

Here's the interesting output from the /proc/gulm gadgets (note that, at 
the time I grabbed these, I was seeing the "more than the max" message 
logged to syslog between once and twice per minute, but not at the 
10-second rate that I read about previously):

[root at xxxxx root]# cat /proc/gulm/filesystems/data0
Filesystem: data0
JID: 0
handler_queue_cur: 0
handler_queue_max: 26584
[root at xxxxx root]# cat /proc/gulm/filesystems/data1
Filesystem: data1
JID: 0
handler_queue_cur: 0
handler_queue_max: 4583
[root at xxxxx root]# cat /proc/gulm/filesystems/data2
Filesystem: data2
JID: 0
handler_queue_cur: 0
handler_queue_max: 11738
[root at xxxxx root]# cat /proc/gulm/lockspace

lock counts:
   total: 41351
     unl: 29215
     exl: 3
     shd: 12055
     dfr: 0
pending: 0
    lvbs: 16758
    lops: 12597867

[root at xxxxx root]#
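
If it helps anybody else chasing the same thing, something like this 
quick loop should pull the same counters from every gulm client in one 
shot (hostnames are placeholders, and it assumes ssh access between the 
nodes):

for h in node1 node2 node3; do
    echo "== $h =="
    ssh $h 'cat /proc/gulm/lockspace'
done

That makes it easy to see which client is holding the bulk of the locks, 
per Chris's suggestion.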



