[Linux-cluster] RHEL3 Cluster network hangup

Gunther Schlegel schlegel at riege.com
Wed Jul 6 06:30:51 UTC 2005


Hi,

I am running RHEL3 ES with the Red Hat Cluster Suite (no GFS, just a 
failover cluster).

The clustered application does a lot of printing (lprng), faxing 
(hylafax) and mailing (sendmail). It uses shell scripts to pass the 
jobs to the operating system's daemons.

The client programs of these daemons, which pass jobs to the daemons 
over network connections to localhost, start to behave erratically once 
the cluster has been up for about two weeks.

Examples:
- hylafax: faxstat stops listing the transmitted faxes in the middle of 
the list (but always at the same job).
- sendmail: the client opens a connection to the local daemon but does 
not transfer the message. Both processes sit there and wait; after some 
time the server closes the connection because no input arrives from the 
client's side.
- lpr: same behaviour as with sendmail.

I assume that something locks up in the IP stack. Not all services are 
affected at the same time.
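
One way to narrow this down might be to probe each local daemon 
periodically and record which of them stops answering first. The 
following is only a minimal sketch in Python, not part of the setup 
described above. The port numbers (25 for SMTP, 4559 for the HylaFAX 
server, 515 for lpd) are the usual defaults and are assumptions here, 
and the names probe, report and SERVICES are made up for illustration. 
lpd does not send a greeting, so for it only the TCP connect is checked.

```python
import socket

# Assumed default ports; adjust to the local configuration.
# The third field says whether the daemon sends a greeting on connect.
SERVICES = {
    "sendmail": (25, True),
    "hylafax": (4559, True),
    "lpd": (515, False),
}


def probe(host, port, timeout=5.0, expect_banner=True):
    """Return 'ok', 'stalled', 'closed' or 'refused' for one TCP service."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return "stalled"   # the TCP handshake did not finish in time
    except OSError:
        return "refused"   # nothing listening, or another connect error
    with sock:
        if not expect_banner:
            return "ok"    # connect succeeded; no greeting expected
        try:
            data = sock.recv(256)
        except socket.timeout:
            return "stalled"   # connected, but no greeting arrived
        except OSError:
            return "closed"    # connection reset while waiting
        return "ok" if data else "closed"


def report(host="127.0.0.1"):
    """Probe every entry in SERVICES and print one status line each."""
    for name, (port, banner) in SERVICES.items():
        print(name, probe(host, port, expect_banner=banner))
```

Calling report() from cron every few minutes and keeping the output 
would show whether the services fail one after another or all at once, 
and whether the hang occurs during connect or only after the connection 
is established.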

I guess this is related to the cluster software, as we run the same 
application on a lot of other servers, none of which are clustered and 
none of which show this behaviour.

Any hints?

regards, Gunther