[Linux-cluster] RHEL3 Cluster Broken Pipe error and Heartbeat configuration

Lon Hohberger lhh at redhat.com
Fri Nov 14 21:54:45 UTC 2008


On Wed, 2008-11-12 at 19:14 +0530, lingu wrote:
> cluquorumd[1921]: <warning> Disk-TB: Detected I/O Hang!

Eep.

This means that I/O to shared storage has gotten slow.  Strange.  I
heard reports of this on another cluster (after going from U3->U8), but
I don't know what the cause is.  On that other cluster, we straced the
cluquorumd process and found that it was slowing down *a lot* in the
write() call when writing to shared storage.
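
If you want to check for the same thing on your cluster, here is a
rough sketch of how the strace output could be filtered.  It assumes you
ran strace with -T (which appends the time spent in each syscall as
<seconds> at the end of the line) and -e trace=write against the
cluquorumd pid.  The script name, the slow_writes() helper and the
1-second threshold are made up for illustration; they are not part of
clumanager.

```python
import re
import sys

# Matches a write() line from "strace -T", optionally prefixed by a pid
# (as strace -f prints), and captures the fd and the trailing <seconds>.
_WRITE_LINE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?write\((\d+),.*<(\d+\.\d+)>\s*$"
)


def slow_writes(lines, threshold=1.0):
    """Return (fd, seconds) for each write() taking >= threshold seconds.

    threshold is an arbitrary illustrative value, not a clumanager limit.
    """
    hits = []
    for line in lines:
        match = _WRITE_LINE.match(line)
        if match is None:
            continue  # not a completed write() line with a timing
        fd, seconds = int(match.group(1)), float(match.group(2))
        if seconds >= threshold:
            hits.append((fd, seconds))
    return hits


if __name__ == "__main__":
    # Read strace output from stdin and report the slow calls.
    for fd, seconds in slow_writes(sys.stdin):
        print(f"write() on fd {fd} took {seconds:.6f}s")
```

strace writes to stderr, so something like
"strace -T -e trace=write -p <cluquorumd pid> 2>&1 | python slow_writes.py"
should feed it.  Which fd belongs to the shared partition can be checked
under /proc/<pid>/fd.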

You can try the current U9+erratum clumanager or the test release if you
want to (it makes unlock more robust when I/O performance is slow for
some reason).

However, someone really needs to profile the kernel if you're seeing
slow write times while stracing cluquorumd...

-- Lon



