[linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace
Andreas Pflug
pgadmin at pse-consulting.de
Wed Jun 5 13:23:32 UTC 2013
Hi David,
I've had quite a bit of trouble with clvmd on corosync 2.3.0/dlm;
apparently a nonfunctional clvmd on one node can block all the others
(kern.log reports clvmd stuck for >120s in some dlm call). I tried to
clean things up with kill -9 on clvmd, but the process remains in state
D or Z. Unfortunately, it seems that those zombies still keep some dlm
resources locked. When I restart corosync on a node and run
dlm_controld -D on it, I see "found uncontrolled lockspace, tell
corosync to remove nodeid from cluster".
Well, that's fine as a first step, but how do I clean up the dlm
lockspace? dlm_tool leave <lockspace> hangs as well (sometimes it just
fails with error 49). The comment in dlm_controld/action.c isn't very
satisfying: it just says a reboot is needed, which is no fun if a whole
cluster is affected. I'd really appreciate a way to clean up old
lockspaces manually. I'd presume that an uncontrolled lockspace on an
isolated node should be easily removable...
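
To make the state I'm describing easier to check on each node, here is a
minimal Python sketch. It only inspects; it does not remove anything. It
assumes that the kernel lists its lockspaces as directories under
/sys/kernel/dlm and that the stuck daemon's process name is "clvmd". The
helper names (parse_stat, stuck_processes, kernel_lockspaces) are made up
for illustration and are not part of any dlm tool.

```python
import os

# Assumption: the kernel dlm exposes one directory per lockspace here.
SYS_DLM_DIR = "/sys/kernel/dlm"


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may contain spaces or ')',
    # so split on the last ')' instead of on whitespace.
    left, right = stat_line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def stuck_processes(name="clvmd", proc="/proc"):
    """List (pid, state) for processes called `name` in state D or Z."""
    try:
        entries = os.listdir(proc)
    except OSError:
        return []  # no procfs available
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or stat was unreadable
        if comm == name and state in ("D", "Z"):
            found.append((pid, state))
    return found


def kernel_lockspaces(sys_dlm=SYS_DLM_DIR):
    """List lockspace names still present in the kernel (empty if none)."""
    try:
        return sorted(e for e in os.listdir(sys_dlm)
                      if os.path.isdir(os.path.join(sys_dlm, e)))
    except OSError:
        return []  # dlm module not loaded, or path differs on this system


if __name__ == "__main__":
    print("stuck clvmd:", stuck_processes())
    print("kernel lockspaces:", kernel_lockspaces())
```

If this reports a lockspace while clvmd is in D or Z and dlm_controld is
not managing it, that should be the "uncontrolled lockspace" case above.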
Regards
Andreas