[Linux-cluster] "dlm_controld[nnnn]: cluster is down, exiting" on node1 when starting node2

David Teigland teigland at redhat.com
Mon Jun 8 15:06:52 UTC 2009


On Fri, Jun 05, 2009 at 10:10:38AM -0700, Steven Dake wrote:
> 99.9% of the time there would be a core file in /var/lib/openais/core*
> if aisexec faults.  We have not seen faults during normal operations for
> years in a released version under typical gfs2 usage scenarios.  If
> there is no core, it means some other component failed, exited, and
> caused that node to be fenced, or the core file could not be written by
> the OS because of some other OS specific failure.  

That's why it would be so valuable for a failing component to leave a
simple "I'm failing" message.  People also don't naturally know to go
looking for a /var/lib/openais/core file when everything falls apart.
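
For anyone unsure how to check for such a core file, here is a minimal
sketch in Python. It only globs the /var/lib/openais/core* path mentioned
above and lists matches, newest first. The function names find_core_files
and report are hypothetical helpers made up for this illustration, not part
of openais or any cluster tool, and the script assumes it is run with
enough privilege to read that directory.

```python
import glob
import os
import time


def find_core_files(pattern="/var/lib/openais/core*"):
    """Return (path, mtime) pairs for regular files matching pattern, newest first.

    The default pattern is the location quoted in this thread; pass another
    pattern if your installation writes cores elsewhere (an assumption).
    """
    hits = []
    for path in glob.glob(pattern):
        if os.path.isfile(path):
            hits.append((path, os.path.getmtime(path)))
    # Newest first, so the core from the most recent failure is at the top.
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits


def report(pattern="/var/lib/openais/core*"):
    """Build a short human-readable summary of the matching core files."""
    hits = find_core_files(pattern)
    if not hits:
        return "no core files matching %s" % pattern
    lines = ["%s  (written %s)" % (path, time.ctime(mtime)) for path, mtime in hits]
    return "\n".join(lines)


if __name__ == "__main__":
    print(report())
```

An empty result does not prove that nothing faulted: as noted in the quoted
text, the core may not have been written, or some other component may have
failed instead.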

Dave



