[Linux-cluster] umount hung single node

Daniel McNeil daniel at osdl.org
Fri Mar 18 23:00:13 UTC 2005


On Thu, 2005-03-17 at 23:39, David Teigland wrote:

> 
> Were you using clvm (or specifically, was this node running clvmd)?  If
> not, then the unmount would mean stopping all the dlm threads.  That's
> something we seldom do in our testing because clvmd is always still using
> the dlm.  Starting clvmd on your nodes, even if you don't use it, would
> keep unmount from stopping dlm_astd, which may avert the problem.
> 
> I just ran across a possibly related problem where kthread_stop() couldn't
> stop dlm_astd.  dlm_astd was in wait_event_interruptible() instead of
> spinning, though.  The fix was to simply get rid of the unnecessary
> wait_queue and the wait_event.  I'm hoping that might fix the problem
> you're seeing, too.  I've attached the patch.


I was not using clvmd.  I grabbed a bunch of info off the node
that was hung in umount with dlm_astd spinning.  The data is
here: http://developer.osdl.org/daniel/GFS/test.14mar2005/

I'll apply your patch and try again.

Daniel
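
The kthread_stop() problem David describes above appears consistent with
a general hazard: a thread sleeping on a wait condition that does not
include the stop request can be woken by the stopper, find its condition
still false, and go back to sleep.  What follows is a minimal userspace
sketch of the usual remedy using POSIX threads, where the sleep
condition checks the stop flag as well as the work flag.  It is only an
analogy: it is not the DLM code and not the attached patch, and the
names struct worker, worker_start, worker_queue and worker_stop are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kernel thread such as dlm_astd. */
struct worker {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t wake;
    bool work_pending;  /* set when there is something to process */
    bool should_stop;   /* loosely analogous to kthread_should_stop() */
    int handled;        /* number of times pending work was processed */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    pthread_mutex_lock(&w->lock);
    for (;;) {
        /* The sleep condition includes the stop flag, so a stop
         * request always ends the wait instead of being ignored. */
        while (!w->work_pending && !w->should_stop)
            pthread_cond_wait(&w->wake, &w->lock);

        if (w->work_pending) {
            /* Drain pending work first, even if a stop was requested. */
            w->work_pending = false;
            w->handled++;
            continue;
        }
        break;  /* should_stop is set and nothing is pending */
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Initialise the state and start the thread; returns 0 on success. */
int worker_start(struct worker *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->wake, NULL);
    w->work_pending = false;
    w->should_stop = false;
    w->handled = 0;
    return pthread_create(&w->thread, NULL, worker_main, w);
}

/* Flag pending work and wake the worker. */
void worker_queue(struct worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->work_pending = true;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
}

/* Request a stop, wake the worker, wait for it to exit, and return
 * how many times it processed pending work. */
int worker_stop(struct worker *w)
{
    int rc;

    pthread_mutex_lock(&w->lock);
    w->should_stop = true;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);

    rc = pthread_join(w->thread, NULL);
    assert(rc == 0);
    (void)rc;  /* keep builds with NDEBUG free of unused warnings */

    pthread_cond_destroy(&w->wake);
    pthread_mutex_destroy(&w->lock);
    return w->handled;
}
```

Calling worker_start(), then worker_queue() once, then worker_stop()
returns 1; stopping a worker that never received work returns 0 instead
of hanging.  If the inner wait tested only work_pending, the second case
would block in pthread_join() forever.  Build with -pthread.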



