[rhn-users] Batch control across multiple machines

Jesse Becker jbecker at northwestern.edu
Mon May 2 15:58:46 UTC 2005


On Mon, May 02, 2005 at 09:32:01AM -0600, Craig Aumann wrote:
> I need a "kick in the right direction" for this one.  

*punt*

> I have a bunch of redhat boxes that I use to run batch jobs.  To run
> these jobs, I simply created separate batch files on each machine. 
> However, this is quickly becoming a pain, given that all the simulations
> are part of a larger simulation project.  

Sounds like you need a queuing system.

> What I want is a single point of control through which all batch jobs
> are sent off to separate machines.  As these machines finish their
> jobs,  new jobs are automatically sent to them.  I don't really need
> anything more than this.  

Yep, you need a queuing system. ;-)
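
To make the idea concrete, here is a rough Python sketch of the core of what
a queuing system does: one worker thread per machine pulls the next job from a
shared queue as soon as its machine is free.  This is only an illustration,
not how SGE or PBS work internally.  It assumes passwordless ssh to each host,
and the names dispatch and run_over_ssh are made up for the example.

```python
import queue
import subprocess
import threading


def run_over_ssh(host, command):
    """Run one shell command on `host` via ssh and return its exit status.

    Assumes passwordless ssh access to `host` is already set up.
    """
    return subprocess.run(["ssh", host, command]).returncode


def dispatch(hosts, commands, run=run_over_ssh):
    """Feed `commands` to `hosts`, one job per host at a time.

    Each host takes a new job as soon as it finishes the previous one.
    Returns a list of (host, command, exit_status) tuples.
    """
    jobs = queue.Queue()
    for command in commands:
        jobs.put(command)

    results = []
    lock = threading.Lock()

    def worker(host):
        while True:
            try:
                command = jobs.get_nowait()
            except queue.Empty:
                return  # no jobs left, so this host is done
            status = run(host, command)
            with lock:
                results.append((host, command, status))

    threads = [threading.Thread(target=worker, args=(h,)) for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With placeholder host names and job commands, a call would look something
like dispatch(["node1", "node2"], ["./sim 1", "./sim 2", "./sim 3"]).  A real
queuing system adds what this leaves out: retries, logging, priorities, and
surviving a restart of the control machine.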

> Are there any open-source tools for doing this?  

Several.

Two common ones are the Sun Grid Engine (http://gridengine.sunsource.net/)
and OpenPBS (http://www.openpbs.org/).  Both are full-blown, mature, and very
flexible queuing systems that should do exactly what you want.  In fact, they
will probably do a lot *more* than you need, since they also support various
access controls and accounting features (you don't have to use those if you
don't want to).

I've used SGE several times now, and like it quite a bit.

There is also a command called "batch", which is part of the atd package; this
is probably already installed on your systems.  It's fairly limited, but
doesn't require any additional software installs.  The website
http://freshmeat.net/ also has quite a few programs that might work; search
for "queue".
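
For what it's worth, batch reads the job's commands from standard input and
runs them once the system load drops low enough.  A minimal Python sketch of
handing it a job follows; submit_to_batch is just a name made up for this
example, and it queues the job on the local machine only.

```python
import subprocess


def submit_to_batch(command, run=subprocess.run):
    """Queue `command` with batch(1), which reads the job from stdin.

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without batch installed.
    """
    return run(["batch"], input=command + "\n", text=True, check=True)
```

To use it across machines you would still have to get the job to each box
yourself (for example over ssh), which is part of why it is limited.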

Finally, if you find yourself spending a lot of time managing clusters, then
consider taking a look at the ROCKS (http://rocksclusters.org) or OSCAR
(http://oscar.sf.net/) projects.  ROCKS is an RHEL3 recompile, but with lots
of extra queuing software.  OSCAR is a set of programs that sit on top of an
existing distribution.  Both include queuing software.


> Thanks!

You're welcome.

-- 
Jesse Becker
GPG-fingerprint: BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2