[Linux-cluster] Problem starting cluster

David Brieck Jr. dbrieck at gmail.com
Thu Oct 19 18:05:25 UTC 2006


On 10/19/06, isplist at logicore.net <isplist at logicore.net> wrote:
> Here's a new log after a fresh reboot of the node with matching cluster.conf,
> same as the others;
>
> What's the evil warning about?
>
> Oct 19 11:58:42 cweb92 rgmanager: clurgmgrd startup succeeded
> Oct 19 11:58:42 cweb92 ccsd[2318]: Cluster is not quorate.  Refusing
> connection.
> Oct 19 11:58:42 cweb92 ccsd[2318]: Error while processing connect: Connection
> refused
> Oct 19 11:58:42 cweb92 ccsd[2318]: Invalid descriptor specified (-111).
> Oct 19 11:58:42 cweb92 ccsd[2318]: Someone may be attempting something evil.
> Oct 19 11:58:42 cweb92 ccsd[2318]: Error while processing get: Invalid request
> descriptor
> Oct 19 11:58:42 cweb92 ccsd[2318]: Invalid descriptor specified (-111).
> Oct 19 11:58:42 cweb92 ccsd[2318]: Someone may be attempting something evil.
> Oct 19 11:58:42 cweb92 ccsd[2318]: Error while processing get: Invalid request
> descriptor
> Oct 19 11:58:42 cweb92 ccsd[2318]: Invalid descriptor specified (-21).
> Oct 19 11:58:42 cweb92 ccsd[2318]: Someone may be attempting something evil.
> Oct 19 11:58:42 cweb92 ccsd[2318]: Error while processing disconnect: Invalid
> request descriptor
> Oct 19 11:58:42 cweb92 clurgmgrd[2527]: <notice> Resource Group Manager
> Starting
> Oct 19 11:58:43 cweb92 clurgmgrd[2527]: <info> Loading Service Data
> Oct 19 11:58:45 cweb92 ccsd[2318]: Cluster is not quorate.  Refusing
> connection.
> Oct 19 11:58:45 cweb92 ccsd[2318]: Error while processing connect: Connection
> refused
> Oct 19 11:58:45 cweb92 clurgmgrd[2527]: <crit> #5: Couldn't connect to ccsd!
> Oct 19 11:58:45 cweb92 clurgmgrd[2527]: <crit> #8: Couldn't initialize
> services
> Oct 19 11:58:47 cweb92 rc: Starting webmin:  succeeded
> Oct 19 11:59:11 cweb92 ccsd[2318]: Unable to connect to cluster infrastructure
> after 60 seconds.
> Oct 19 11:59:41 cweb92 ccsd[2318]: Unable to connect to cluster infrastructure
> after 90 seconds.
>
>

Something that happened to me might apply: if you are using GFS and
DLM, you can't start your fence domain while one of the nodes is trying
to fence another node and failing. This happened to me when I hadn't
specified a fence device for one of my nodes. If the fencing is a
mistake, you can try increasing the post join delay so that new nodes
entering the cluster aren't fenced before they can join.
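
For reference, the post join delay is normally set on the fence_daemon
element in cluster.conf. The fragment below is only a sketch: the value
of 20 seconds is an illustrative guess rather than a recommendation, and
the rest of the file (including the config_version bump you need when
editing it) is left out.

```xml
<!-- Goes inside the top-level <cluster> element of /etc/cluster/cluster.conf. -->
<!-- post_join_delay: seconds fenced waits after a node joins the fence domain -->
<!-- before it fences nodes that have not joined yet. 20 is an assumed value.  -->
<fence_daemon post_fail_delay="0" post_join_delay="20"/>
```

The same file has to be present on every node, as with any other
cluster.conf change.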

I would suggest that, instead of using the init scripts, you run the
commands by hand with increased verbosity. That will probably shed much
more light on things than the logs.



