[Linux-cluster] $OCF_ERR_CONFIGURED - recovers service on another cluster node
Lon Hohberger
lhh at redhat.com
Wed Feb 8 14:36:26 UTC 2012
On 01/27/2012 04:03 AM, Parvez Shaikh wrote:
> Hi guys,
>
> I am using Red Hat Cluster Suite which comes with RHEL 5.5 -
>
> cman_tool version
> >>6.2.0 config xxx
>
> Now I have a script resource in which I return $OCF_ERR_CONFIGURED in
> case of a fatal, irrecoverable error, hoping that my service would not
> start on another cluster node.
>
> But I see that the cluster relocates it to another cluster node and
> attempts to start it.
>
> I referred to the error code documentation at
> http://www.linux-ha.org/doc/dev-guides/_return_codes.html
>
> Is there any return code which makes RHCS give up on recovering the service?
>
The resource must fail during the 'stop' phase if you want rgmanager to
not try to recover it. There is no 'start' phase error condition that
tells rgmanager to give up.

The history: If you don't have a program installed or configured on
host1 but try to enable a service there, it will obviously fail to start
(rightfully so). However, host2 may have the configuration. So,
rgmanager will then stop the service and try to start it on host2. In
fact, it will systematically try every host in the cluster until:
 - the service starts successfully
 - no more hosts are available (e.g. restricted failover domain,
   exclusive services, or simply all hosts were tried). At this
   point, the service is placed in the 'stopped' state in the
   hope that the next host to come online will be able to
   start the service
 - a failure during 'stop' occurs. Most errors during the stop
   phase will abort the enable request (except
   'OCF_NOT_INSTALLED' when a <script> is missing)
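
A minimal sketch of the "fail during stop" approach described above, for a
script resource written as a shell script. It assumes rgmanager invokes the
script with start, stop, or status as its first argument and treats a
non-zero exit status as failure. The marker file path, the SIMULATE_FATAL
switch, and the helper functions are hypothetical placeholders and are not
part of rgmanager.

```shell
#!/bin/sh
# Sketch only. Assumes rgmanager calls this script with start|stop|status
# as $1 and treats any non-zero exit status as a failure.

# Hypothetical marker recording that start hit an unrecoverable error.
FATAL_MARKER="${FATAL_MARKER:-/var/run/myservice.fatal}"

# Hypothetical placeholders: replace with the real checks and commands.
fatal_condition() { [ -n "${SIMULATE_FATAL:-}" ]; }
start_daemon()    { return 0; }
stop_daemon()     { return 0; }
check_daemon()    { return 0; }

svc_start() {
    if fatal_condition; then
        : > "$FATAL_MARKER"   # remember the error for the stop phase
        return 1              # start fails; a stop is expected to follow
    fi
    start_daemon
}

svc_stop() {
    stop_daemon || return 1
    # Fail the stop on purpose so the service is not tried on another host.
    if [ -e "$FATAL_MARKER" ]; then
        return 1
    fi
    return 0
}

# No explicit exit: the script's status is that of the function that ran.
case "${1:-}" in
    start)  svc_start ;;
    stop)   svc_stop ;;
    status) check_daemon ;;
esac
```

With this layout, a fatal error on start leaves the marker behind, so the
stop that follows fails and the enable request should be aborted instead of
being retried on the next host. The marker has to be removed by hand once
the problem is fixed; otherwise every later stop keeps failing.
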
-- Lon