[Linux-cluster] fenced spinning?

Jeff Sturm jeff.sturm at eprize.com
Thu Nov 26 00:44:28 UTC 2009


CentOS 5.2, 26-node cluster.

Today I restarted one node.  It left the cluster, rebooted, and rejoined
without incident.  Everything looks fine, except that fenced now has the
CPU pegged.

There are no useful log messages.  strace shows it spinning in a
poll/recvfrom loop:

poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN, revents=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLNVAL}], 4, -1) = 2
recvfrom(5, 0x7fffb074ab40, 20, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN, revents=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLNVAL}], 4, -1) = 2
recvfrom(5, 0x7fffb074ab40, 20, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
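
A note on reading the trace: poll() reports POLLNVAL for fd 8, which
means that descriptor is not open.  POLLNVAL is reported whether or not
it was requested, so a poll() with an infinite timeout (-1) returns
immediately for as long as that fd stays in the set.  That alone would be
enough to produce a loop like this one.  The Python sketch below
reproduces only that generic Linux behaviour.  It is not fenced's code,
and poll_closed_fd is a made-up helper name.

```python
import os
import select


def poll_closed_fd(iterations=3, timeout_ms=1000):
    """Poll a descriptor that has already been closed and collect the results."""
    r, w = os.pipe()
    poller = select.poll()
    poller.register(r, select.POLLIN)
    os.close(r)  # the fd number stays registered, but the descriptor is gone
    results = []
    for _ in range(iterations):
        # The finite timeout is only a safety net: poll() returns at once,
        # because POLLNVAL is reported for the closed descriptor.
        results.append(poller.poll(timeout_ms))
    os.close(w)
    return results
```

Each call should come back with a single (fd, POLLNVAL) event without
waiting, which is the same pattern a caller sees if it never drops the
stale descriptor from its poll set.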

 

Is there anything else useful I can do to diagnose this?  What are the
chances I can recover this node cleanly without making things worse?
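
One read-only check that should not disturb the daemon is to look at
what the descriptors in the trace (4 through 8) refer to under
/proc/<pid>/fd.  The sketch below is an assumption about how one might
do that from Python.  list_open_fds is a hypothetical helper, and
reading another user's process normally requires root.

```python
import os


def list_open_fds(pid):
    """Return {fd number: link target} for a process, read from /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    mapping = {}
    for name in os.listdir(fd_dir):
        try:
            mapping[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return mapping
```

Called with the PID of the spinning fenced process, a missing entry for
fd 8 would be consistent with the POLLNVAL in the trace, and the link
targets for the other descriptors show which sockets are involved.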

 

Any help/ideas appreciated,

Jeff

 
