[Linux-cluster] Question :)

Kovacs, Corey J. cjk at techma.com
Tue May 31 18:38:18 UTC 2005

Jerry, is this problem with the "current" supported version of GFS? If so,
which version are you running? I am having a similar problem with a 5 node
cluster with 3 nodes serving as lock managers. If I rsync large amounts of
data to a node serving as a lock manager and mounting the FS, things croak
pretty quickly. If I rsync to a node that is NOT a lock manager, it takes
longer but eventually locks up there as well, although at times it will
come back.
When we do our rsync, gfs_scand and lock_gulmd go crazy. In the instances
where the fs comes back, they continue to show high cpu utilization.
I don't think this is "a fact of life" that anyone needs to live with, by
the way; there has to be a reason for this. I can't believe for a minute
that you and I are the only ones experiencing this.

From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Gerald G. Gilyeat
Sent: Tuesday, May 31, 2005 2:06 PM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Question :)

First - thanks for the help the last time I poked my pointy little head in.
Things have been -much- more stable since we bumped the lock limit to 2097152.

However, we're still running into the occasional "glitch" where it seems like
a single process is locking up -all- disk access on us, until it completes
its operation.
Specifically, we see this when folks are doing rsyncs of large amounts of
data (one of my faculty has been trying to copy over a couple thousand 16MB
files). Even piping tar through ssh (from the target machine: ssh user@host
"cd /data/dir/path; tar -cpsf -" | tar -xpsf -) results in similar behaviour.
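For anyone unfamiliar with that pattern, here is a minimal local sketch of the tar pipe, with the ssh hop left out so it runs anywhere. The /tmp/tarpipe-* paths and the sample file are hypothetical, and I've used plain -cpf/-xpf rather than the -s flag from the command above:

```shell
# Hypothetical source/destination directories for the demo.
mkdir -p /tmp/tarpipe-src /tmp/tarpipe-dst
echo "sample data" > /tmp/tarpipe-src/file.txt

# The producer tar writes an archive to stdout ("-f -"); the consumer tar
# reads it from stdin and unpacks with permissions preserved (-p). Over the
# network, the producer side would instead be wrapped in:
#   ssh user@host "cd /data/dir/path && tar -cpf -"
(cd /tmp/tarpipe-src && tar -cpf - .) | (cd /tmp/tarpipe-dst && tar -xpf -)
```

The point of the pipe is that the data streams through ssh without rsync's per-file checksum and delta bookkeeping, yet (per the message above) the filesystem stalls the same way either route.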
Is this tunable, or simply a fact of life that we're going to have to live
with? It only occurs with big, or long, writes. Reads aren't a problem
(it just takes 14 hours to dump 1.5TB to tape...)


Jerry Gilyeat, RHCE
Systems Administrator
Molecular Microbiology and Immunology
Johns Hopkins Bloomberg School of Public Health

