[Linux-cluster] OOM failures with GFS, NFS and Samba on a cluster with RHEL3-AS

Jonathan Woytek woytek+ at cmu.edu
Mon Jan 24 18:36:47 UTC 2005


Michael Conrad Tadpol Tilstra wrote:
> On Sun, Jan 23, 2005 at 01:45:28PM -0500, Jonathan Woytek wrote:
> 
>>Additional information:
>>
>>I enabled full output on lock_gulmd, since my dead top sessions would 
>>often show that process near the top of the list around the time of 
>>crashes.  The machine was rebooted around 10:50AM, and was down again at 
> 
> 
> Not surprising that lock_gulmd is working hard when gfs is under heavy
> use.  It is busy processing all those lock requests.  What would be
> more useful from gulm for this than the logging messages is to query
> the locktable every so often for its stats.
> `gulm_tool getstats <master>:lt000`
> The 'locks = ###' line is how many lock structures are currently held.
> gulm is very greedy about memory, and you are running the lock servers
> on the same nodes you're mounting from.

Here are the stats from the master lock_gulmd lt000:

I_am = Master
run time = 9436
pid = 2205
verbosity = Default
id = 0
partitions = 1
out_queue = 0
drpb_queue = 0
locks = 20356
unlocked = 17651
exclusive = 15
shared = 2690
deferred = 0
lvbs = 17661
expired = 0
lock ops = 107354
conflicts = 0
incomming_queue = 0
conflict_queue = 0
reply_queue = 0
free_locks = 69644
free_lkrqs = 60
used_lkrqs = 0
free_holders = 109634
used_holders = 20366
highwater = 1048576
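
For comparing successive snapshots, here is a minimal sketch that parses
the "key = value" lines above and checks how the per-state counters relate
to the total.  The function names are hypothetical, and the idea that the
four state counters add up to "locks" is only an observation from this
one snapshot (17651 + 15 + 2690 + 0 = 20356), not documented gulm
behaviour.

```python
def parse_stats(text):
    """Parse 'key = value' lines into a dict; all-digit values become int."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines and anything without '='
        key, value = key.strip(), value.strip()
        stats[key] = int(value) if value.isdigit() else value
    return stats


def lock_state_total(stats):
    """Sum the per-state counters (assumed to partition 'locks')."""
    return sum(stats[k] for k in ("unlocked", "exclusive", "shared", "deferred"))


# Values copied from the lt000 snapshot in this message.
SAMPLE = """\
I_am = Master
locks = 20356
unlocked = 17651
exclusive = 15
shared = 2690
deferred = 0
"""

stats = parse_stats(SAMPLE)
print("locks:", stats["locks"], "sum of states:", lock_state_total(stats))
print("fraction unlocked:", stats["unlocked"] / stats["locks"])
```

Fed with the saved output of repeated getstats runs, the same parser would
show whether "locks" keeps growing over the run time or levels off.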


Something keeps eating away at lowmem, though, and I still can't figure 
out what exactly it is.
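
One way to watch this over time is to log the low-memory counters
alongside the lock stats.  The sketch below assumes a 32-bit highmem
kernel whose /proc/meminfo carries "LowTotal:" and "LowFree:" lines in
kB; the function names are hypothetical, and on a kernel without those
fields low_free_kb() simply returns None.

```python
def parse_meminfo(text):
    """Return {field: first numeric value} for 'Name: 12345 kB' style lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        # Keep only lines whose first token after the colon is a number.
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def low_free_kb(path="/proc/meminfo"):
    """Read LowFree (kB) from the given file, or None if the field is absent."""
    with open(path) as f:
        return parse_meminfo(f.read()).get("LowFree")
```

Calling low_free_kb() from a loop or a cron job and appending the result
with a timestamp to a file would give a record of how quickly lowmem
drains before a crash.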


> also, just to see if I read the first post right, you have
> samba->nfs->gfs?

If I understand your arrows correctly, I had a filesystem mounted with 
GFS that I was sharing via NFS to another machine, which in turn shared 
it via Samba.  I've since removed that NFS link to rule it out as the 
problem, so now I'm serving the GFS filesystem directly through Samba.

jonathan

-- 
Jonathan Woytek                 w: 412-681-3463         woytek+ at cmu.edu
NREC Computing Manager          c: 412-401-1627         KB3HOZ
PGP Key available upon request



