[Linux-cluster] GFS tuning for combined batch / interactive use

Bob Peterson rpeterso at redhat.com
Thu Dec 16 15:09:39 UTC 2010


----- "Kevin Maguire" <kmaguire at eso.org> wrote:
| Hi
| 
| We are running a 20 node cluster, using Scientific Linux 5.3, with a
| GFS shared filesystem hosted on our SAN. Cluster nodes are dual core
| units with 4 GB of RAM, and a standard Qlogic FC HBA.
| 
| Most of the 20 nodes form a batch-processing cluster, and our users
| are happy enough with the performance they get, but some nodes are
| used interactively. When the filesystem is under stress due to large
| batch processing jobs running on other nodes, interactive use becomes
| very slow and painful.
| 
| Is there any tuning I (the sysadmin) can do that might help in this
| situation?  Would a migration to gfs2 make a difference? Are all nodes
| treated identically, or can hosts mounting the filesystem have any
| kind of priority/QoS? Which tools could I use to track down any
| bottlenecks?
| 
| In theory we could update kernel+gfs bits to a later release, though
| we saw the same issues when using the same cluster with a SL4.x stack,
| but for now it's
| 
| kernel-2.6.18-128.1.1.el5.i686
| kmod-gfs-0.1.31-3.el5.i686
| gfs-utils-0.1.20-7.el5.i386
| gfs2-utils-0.1.53-1.el5_3.1.i386
| 
| Thanks for any help/suggestions,
| Kevin

Hi Kevin,

We recently identified a slowdown in RHEL5.x that involves DLM traffic.
There is a patch to speed up the DLM, and it's being tested now.  The
patch is built into RHEL5 kernels from 2.6.18-232 onward, which means
it is currently scheduled to be released in RHEL5.6.

It's also being z-streamed back to 5.5.z, but I don't know when that
is scheduled to go out.  Unfortunately, since the bug was opened by
a customer, the bugzilla record is private to protect the customer's
confidential information.  The patch itself is public, though.
If you are a Red Hat customer, you can probably call Red Hat Support,
ask to be put on the list for bugzilla bug 604139, and maybe find out
when the fix will be available.

There is no guarantee that this is what your problem is, and there is
no guarantee that the patch will speed things up for you.  But it might.
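
As a rough way to check a node against the 2.6.18-232 threshold
mentioned above, something like the following Python sketch could be
used.  The function names are made up for illustration, and it only
compares the build number after "2.6.18-", so it will not recognise a
5.5.z kernel that carries the backported patch under a lower build
number.

```python
import platform
import re

# First RHEL5 kernel build said (in this message) to include the DLM patch.
PATCHED_BUILD = 232


def rhel5_build(release):
    """Return the build number from a string such as '2.6.18-128.1.1.el5'.

    Returns None if the string does not look like a 2.6.18-based kernel.
    """
    match = re.search(r"2\.6\.18-(\d+)", release)
    return int(match.group(1)) if match else None


def has_dlm_patch(release):
    """True/False for 2.6.18 kernels, None when the release is not recognised."""
    build = rhel5_build(release)
    if build is None:
        return None
    return build >= PATCHED_BUILD


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    running = platform.release()
    print(running, has_dlm_patch(running))
```

Calling has_dlm_patch("kernel-2.6.18-128.1.1.el5.i686") with the kernel
listed in the original message returns False, i.e. that kernel predates
the build that carries the patch.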

Regards,

Bob Peterson
Red Hat File Systems



