[Linux-cluster] Things that i don't understand about cluster suite
gforte at leopard.us.udel.edu
Thu Sep 21 12:11:53 UTC 2006
> Lon, I use original's postfix script and returns this if postfix is up:
> "master (pid 957) is running..." when postfix isn't up, script returns:
> "master is stopped". Do I need to change this message to "0" for status
> check works ok??
That's just a message that's printed. The return status is the value given
in a statement of the form 'exit X' (or 'return X' inside a shell
function); if no such statement is reached, it is the status of the last
command that ran. All executables return a status value to the shell,
where 0 is taken to mean "OK", and non-zero means "something bad happened".
The postfix script appears to return the correct values in each case.
My guess would be that it's a cluster configuration problem, but I didn't
see anything about postfix in the conf that you pasted ...
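
To make the difference between the printed message and the return status
concrete, here is a minimal sketch. The function fake_status is a made-up
stand-in for an init script's "status" action, not the real postfix script,
and the non-zero value 3 is chosen only for illustration:

```shell
#!/bin/sh
# Hypothetical stand-in for an init script's "status" action.
# It prints a human-readable message AND returns a status code;
# only the status code matters to whoever calls it.
fake_status() {
    if [ "$1" = "up" ]; then
        echo "master (pid 957) is running..."
        return 0    # 0 means "OK"
    else
        echo "master is stopped"
        return 3    # non-zero means "not OK" (value picked for illustration)
    fi
}

# Throw the messages away and keep only the status codes.
fake_status up >/dev/null
rc_up=$?
fake_status down >/dev/null
rc_down=$?

echo "up: $rc_up, down: $rc_down"
```

The same check should work against the real script by running its status
action and then printing $? immediately afterwards, e.g.
"/etc/init.d/postfix status; echo $?" (assuming that is where the script
lives on your system).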
> Then, I can't startup only one node, when both are stopped, right??
No, you definitely can do this, if the cluster is configured correctly.
The problem may be in your fencing method - the first thing the booted
node will do when cman starts is to try to contact the other node. When
it times out, it'll try to fence the other node and won't continue until
it does. If the fence process fails, it'll hang there, which I'm
guessing is what you're seeing. So the problem is most likely that
fencing is failing, either due to misconfiguration or because the other
node is powered off and so its iLO agent isn't responding. Since iLO is
supposed to be able to power up a switched-off server, my guess is
there's a problem with your fencing configuration - did you fix it so
that you have a separate fencedevice entry for each node?
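
As a rough sketch of what "a separate fencedevice entry for each node"
could look like in cluster.conf: the node names, device names, hostnames
and credentials below are all placeholders, and the attribute names are
assumed from the fence_ilo agent's usual configuration, so check them
against the documentation for your release before relying on them:

```xml
<!-- Sketch only: every name, hostname and credential is a placeholder. -->
<clusternodes>
  <clusternode name="node1" votes="1">
    <fence>
      <method name="1">
        <!-- node1 is fenced through its own iLO device -->
        <device name="ilo-node1"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node2" votes="1">
    <fence>
      <method name="1">
        <!-- node2 is fenced through its own iLO device -->
        <device name="ilo-node2"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <!-- One fencedevice entry per node, each pointing at that node's iLO -->
  <fencedevice agent="fence_ilo" name="ilo-node1"
               hostname="ilo-node1.example.com" login="LOGIN" passwd="PASSWORD"/>
  <fencedevice agent="fence_ilo" name="ilo-node2"
               hostname="ilo-node2.example.com" login="LOGIN" passwd="PASSWORD"/>
</fencedevices>
```

The point of the layout is that each clusternode's device name refers to a
fencedevice whose hostname is that node's own iLO interface, so fencing
node2 never ends up talking to node1's iLO.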