[Linux-cluster] GFS problem

Jorge Palma jpalmae at gmail.com
Mon Jan 25 21:43:58 UTC 2010


Please send your fence configuration and cluster.conf.

Regards


2010/1/25, Alex Urbanowicz <alex.urbanowicz at gmail.com>:
> Hello
>
> I have a problem with a shared GFS resource on a 12-node Cluster Manager
> cluster.
>
> The cluster starts up properly if all nodes are booted at once. Any major
> interaction with one of the nodes (reboot, cman restart) causes the GFS to
> lock up and the cluster to fall into an unstable split state.
>
> In this state, the logs, clustat and "cman_tool status" report the cluster as
> fully connected and working, while "cman_tool services" reports only the
> fence resource, in the JOIN_START_WAIT (or JOIN_STOP_WAIT, depending on what
> was done to the cluster in the meantime) state, with overlapping but different
> node sets depending on the node where I run the "cman_tool services" command.
>
> So far, the only working way to get the cluster out of this state is
> to manually reboot all the nodes at once, but this is infeasible given our
> uptime expectations and the high load carried by the cluster.
>
> We're completely in the dark about the possible cause of the problem. Any
> help is appreciated.
>
> TIA
>
> Alex
>


-- 
Jorge Palma Escobar
Ingeniero de Sistemas
Red Hat Linux Certified Engineer
Certificate Nº 804005089418233



