[Linux-cluster] GFS 6.0u5 not freeing locks?
Kovacs, Corey J.
cjk at techma.com
Tue Dec 13 20:07:16 UTC 2005
It's been a while since I've worked on the following problem but here I am
at it again.
I have a three node system running RHEL3 update 5 (kernel 2.4.21-32) with
GFS-18.104.22.168-1. All three nodes are running as both lock managers and
filesystem clients. When sending thousands of files to the cluster
(on the order of 1/2 terabyte of 50k files) target nodes will run
out of memory and refuse to fork. Interestingly enough this condition
does not cause the cluster to fence the node, rather it thinks everything
is "OK". The effect of course is that the fs is not accessible because the
cluster is waiting to hear back from the node in question.
I set the high water mark to 10000 (I know that's low, but I wanted to see
what would happen) and the system seemed to be trying to free locks every
ten seconds as it simply could not keep up with the file xfer going in.
By the time a node finally locks up there are over 300K locks in use. There
is only a small % difference between the locks reported and the inodes in
the filesystem. If I interpret this correctly, it simply means that for
almost all the files I was able to xfer, there is an existing lock being
used. Also, mem usage for lock_gulmd was at 85M+.
When we started logging things it was at 30M+, rising about 300-400k per min.
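
As a rough sanity check on those lock_gulmd numbers, here is a small sketch
of the arithmetic. It assumes the growth is linear and treats "30M+", "85M+"
and the per-minute rate as exactly 30 MB, 85 MB and 300 to 400 KB/min; the
function name is just for illustration.

```python
# Back-of-the-envelope check on the lock_gulmd memory figures.
# Assumptions (not measured): linear growth, 1 MB = 1024 KB, and the
# rounded figures 30 MB -> 85 MB at 300-400 KB per minute.

def minutes_to_grow(start_mb, end_mb, rate_kb_per_min):
    """Minutes needed to grow from start_mb to end_mb at a constant rate."""
    return (end_mb - start_mb) * 1024.0 / rate_kb_per_min

slow = minutes_to_grow(30, 85, 300)  # lower rate, so the longer estimate
fast = minutes_to_grow(30, 85, 400)  # higher rate, so the shorter estimate

# prints "about 2.3 to 3.1 hours"
print("about %.1f to %.1f hours" % (fast / 60, slow / 60))
```

So under those assumptions the daemon would need roughly two to three hours
of steady growth to go from 30M to 85M.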
I remember seeing some traffic about another cluster user having a similar
problem, but I am not sure if it was resolved.
This looks like a leak to me. Does anyone have any ideas?