[Linux-cluster] GFS problem
jpalmae at gmail.com
Mon Jan 25 21:43:58 UTC 2010
Please send your fence configuration and cluster.conf.
2010/1/25, Alex Urbanowicz <alex.urbanowicz at gmail.com>:
> I have a problem with a shared GFS resource on a 12-node Cluster Manager.
> The cluster starts up properly if all nodes are booted at once. Any major
> interaction with one of the nodes (reboot, cman restart) causes the GFS to
> lock up, and the cluster to fall into an unstable split state.
> In this state, logs, clustat and "cman_tool status" report the cluster as
> fully connected and working, while "cman_tool resources" reports only the
> fence resource in JOIN_START_WAIT (or JOIN_STOP_WAIT, depending on what was
> done to the cluster in the meantime) state, with overlapping but different
> node sets depending on which node I run the "cman_tool resources" command on.
> So far, the only functioning method to get the cluster out of the state is
> to manually reboot all the nodes at once, but this is unfeasible due to
> uptime expectations and high load carried by the cluster.
> We're completely in the dark about the possible cause of the problem; any
> help is appreciated.
Jorge Palma Escobar
Ingeniero de Sistemas
Red Hat Linux Certified Engineer
Certificate Nº 804005089418233