[Linux-cluster] weird rgmanager
frederic at ovsg.univ-ag.fr
Mon Jan 25 11:26:42 UTC 2010
I have a four-node cluster running RHEL 5.4, connected to a FC SAN.
Nodes 1 and 3 are online and running rgmanager.
Nodes 2 and 4 are offline.
The cluster remains quorate because of the quorum disk (qdiskd is configured on each node).
However, node 4, which is offline according to both clustat and cman_tool nodes, is
still reported by clustat as running services (those services are
actually dead).
On the two live nodes (node 1 and node 3) I see:
type   level  name       id        state
fence  0      default    00010004  FAIL_ALL_STOPPED
[1 2 3 4]
dlm    1      clvmd      00020004  LEAVE_STOP_WAIT
[1 2 3 4]
dlm    1      rgmanager  00030004  FAIL_ALL_STOPPED
[1 3 4]
Both live nodes are running their services (Xen VMs, NFS and DNS) without problems.
The two dead nodes (they are running neither ccs nor cman nor anything
else) can still access the SAN, as shown by multipath -ll.
I know I can restart the whole cluster, but I would like to know why this happens.
Can someone please help?
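
To make the mismatch in the group listing above easier to see, here is a minimal Python sketch that parses the listing and reports, per group, the members that are not online. It is only an illustration: the function names parse_groups and stale_members are made up for this example, the listing is pasted in by hand rather than read from any tool, and the set of online nodes (1 and 3) is taken from the description in this message.

```python
# Group listing copied from the message above (whitespace is not significant).
LISTING = """\
type level name id state
fence 0 default 00010004 FAIL_ALL_STOPPED
[1 2 3 4]
dlm 1 clvmd 00020004 LEAVE_STOP_WAIT
[1 2 3 4]
dlm 1 rgmanager 00030004 FAIL_ALL_STOPPED
[1 3 4]
"""

# Assumption taken from the message: only nodes 1 and 3 are online.
ONLINE_NODES = {1, 3}


def parse_groups(text):
    """Parse the listing into a list of dicts, one per group."""
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("type"):
            continue  # skip blank lines and the header row
        if line.startswith("["):
            # A member line belongs to the group on the preceding line.
            groups[-1]["members"] = {int(n) for n in line.strip("[]").split()}
        else:
            gtype, level, name, gid, state = line.split()
            groups.append({
                "type": gtype,
                "level": int(level),
                "name": name,
                "id": gid,
                "state": state,
                "members": set(),
            })
    return groups


def stale_members(groups, online):
    """Map group name -> sorted member ids that are not in the online set."""
    result = {}
    for group in groups:
        stale = sorted(group["members"] - online)
        if stale:
            result[group["name"]] = stale
    return result


if __name__ == "__main__":
    for name, nodes in stale_members(parse_groups(LISTING), ONLINE_NODES).items():
        print(f"{name}: still lists offline node(s) {nodes}")
```

Run against the listing above, this shows that every group still lists at least one offline node, and that the rgmanager group in particular still counts node 4 as a member, which matches what clustat reports.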