[Linux-cluster] cluster failed after 53 hours
Daniel McNeil
daniel at osdl.org
Wed Jan 19 18:47:57 UTC 2005
On Wed, 2005-01-19 at 00:50, Patrick Caulfield wrote:
> On Tue, Jan 18, 2005 at 03:10:20PM -0800, Daniel McNeil wrote:
> >
> > There is a DLM ASSERT farther down in the log that shows error = -105,
> > which is ENOBUFS. Is this happening after the node has decided
> > to leave the cluster? I just want to make sure an out-of-memory
> > condition isn't causing the problem.
> >
>
> Unfortunately it could be, or it may not be. :(
> lowcomms_get_buffer() can return NULL if either a) there is no memory to
> allocate a page, or b) the DLM has been shut down. If that happens, -ENOBUFS is
> the result. On balance I would suspect that b) is more likely in this situation.
>
> One oddity in that log is that the DLM took 10 minutes to shut down after CMAN
> decided it had to leave the cluster - or did those 34980 lines have to go down a
> serial console?
Yup. Serial console.
Daniel