[Linux-cluster] qdisk questions
denis
denisb+gmane at gmail.com
Thu Oct 2 08:16:08 UTC 2008
Hi,
I have recently had a couple of situations with my cluster where both
nodes were restarted simultaneously. The reasons for this are a bit
beyond me, so I was wondering if anyone could clarify or point me to
relevant documentation.
The following are excerpts from both nodes' logs:
Oct 2 08:32:22 node1 qdiskd[3758]: <info> Heuristic: 'ping 10.X.X.X -c1 -t2' DOWN (3/3)
Oct 2 08:32:39 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1 -t2' DOWN (6/6)
Oct 2 08:32:55 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1 -t2' DOWN (6/6)
Oct 2 08:32:58 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1 -t1' DOWN (6/6)
Oct 2 08:33:01 node1 qdiskd[3758]: <notice> Score insufficient for master operation (0/4; required=1); downgrading
Oct 2 08:33:01 node1 kernel: md: stopping all md devices.

Oct 2 08:32:23 node2 qdiskd[3599]: <info> Heuristic: 'ping 10.X.X.X -c1 -t2' DOWN (3/3)
Oct 2 08:32:49 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1 -t2' DOWN (6/6)
Oct 2 08:32:56 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1 -t1' DOWN (6/6)
Oct 2 08:32:56 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1 -t2' DOWN (6/6)
Oct 2 08:33:03 node2 qdiskd[3599]: <notice> Score insufficient for master operation (0/4; required=1); downgrading
Oct 2 08:33:03 node2 kernel: md: stopping all md devices.
Does qdiskd reboot a node when these heuristic tests fail?
The upstream routers these nodes are connected to were unavailable for
at most 2 minutes, and all four ping tests require connectivity through
the router (I probably need to change that).
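To make my reading of the "(0/4; required=1)" score line concrete, here
is a minimal Python sketch of the arithmetic as I understand it. It
assumes each of the four ping heuristics is worth one point, which the
logs suggest but do not confirm. The function names are made up for
illustration and are not part of qdiskd.

```python
# Hypothetical model of the "Score insufficient ... (0/4; required=1)" log line.
# Assumption: four heuristics, each worth 1 point; the real per-heuristic
# scores are not shown in the logs.

def total_score(heuristics):
    """Sum the scores of the heuristics that are currently UP."""
    return sum(score for score, up in heuristics if up)

def score_sufficient(heuristics, min_score):
    """True when the node's score meets the required minimum."""
    return total_score(heuristics) >= min_score

# During the router outage: all four ping tests are DOWN.
outage = [(1, False)] * 4

# Hypothetical alternative: one test that does not depend on the router stays UP.
mixed = [(1, False), (1, False), (1, False), (1, True)]

print(total_score(outage), score_sufficient(outage, 1))
print(total_score(mixed), score_sufficient(mixed, 1))
```

If that reading is right, a single heuristic that does not depend on the
router would have kept the score at the required minimum during the
outage.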
What kind of tests can I use for qdiskd that will prevent router outages
from killing my cluster completely?
Regards
--
Denis