[Linux-cluster] manually stopping system-service (part of a cluster service) and rgmanager doesn't start it again

Paul Morgan jumanjiman at gmail.com
Tue Oct 18 22:04:26 UTC 2011


On Tue, Oct 18, 2011 at 06:47:29PM +0200, Masopust, Christian wrote:
-snip-
> Today one of our (not cluster-aware :) ) colleagues manually
> stopped flexlm by "service flexlm stop".
> 
> What I've expected to happen is, that rgmanager detects the
> stopped flexlm-service and either restarts it
> or relocates the complete cluster-service to my other node,
> but nothing happened.  In rgmanager.log I can
> see lots of calls of "/etc/init.d/flexlm status", but no
> action of restart or relocate.
>
> What's going wrong here?   Or is this behaviour ok?

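My guess is that `service flexlm status' returns exit code 0
even when the service is not running. Stop the service, run the
status command by hand, and check $? immediately afterwards.
rgmanager treats an exit code of 0 from the status check as
"service is healthy", so it never attempts a restart or relocation.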
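
For illustration, here is a minimal sketch of an LSB-style status
check. It is not the code from the gist: the function name lsb_status
and the PIDFILE path are made up for this example. The return codes
follow the LSB convention for the status action (0 = running,
1 = dead but pid file exists, 3 = not running).

```shell
#!/bin/sh
# Sketch only. PIDFILE is a hypothetical default, not the path
# used by any real flexlm init script.
PIDFILE="${PIDFILE:-/var/run/flexlm.pid}"

lsb_status() {
    pidfile="$1"

    # No pid file at all: LSB code 3, "program is not running".
    [ -f "$pidfile" ] || return 3

    pid=$(cat "$pidfile" 2>/dev/null)

    # Pid file exists but is empty or unreadable: LSB code 1.
    [ -n "$pid" ] || return 1

    # kill -0 delivers no signal; it only tests that the pid exists.
    # Assumes the caller may signal the process (init scripts run as root).
    if kill -0 "$pid" 2>/dev/null; then
        return 0    # running
    fi

    return 1        # dead, but pid file exists
}
```

With a status function along these lines, running
`service flexlm status; echo $?' after a manual stop should print a
non-zero code, which is what rgmanager needs to see before it acts.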

I hacked together an init script for our flexlm server,
and I've just posted it at:

  https://gist.github.com/1296847

In the script's comments, I point out the importance of
exit codes and link to notes on LSB compliance.
The comments also explain how to override the script's variables
(in /etc/sysconfig/flexlm).
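
The override mechanism usually looks something like the sketch
below: the init script sets defaults, then sources the sysconfig file
if it exists. The variable names LICENSE_FILE and LOG_FILE and the
helper load_overrides are hypothetical, chosen for this example; the
gist may use different names.

```shell
#!/bin/sh
# Defaults. These names and paths are hypothetical placeholders.
LICENSE_FILE="/opt/flexlm/license.dat"
LOG_FILE="/var/log/flexlm.log"

# Source an override file if it is readable. The path is a parameter
# so the function can be exercised against a scratch file.
load_overrides() {
    conf="${1:-/etc/sysconfig/flexlm}"
    if [ -r "$conf" ]; then
        . "$conf"
    fi
}
```

An /etc/sysconfig/flexlm containing a single line such as
LOG_FILE=/srv/flexlm/lmgrd.log would then replace the default
without anyone having to edit the init script itself.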

Feel free to critique the init script.

hth,
-paul

-- 
Paul Morgan <jumanjiman at gmail.com>
RHCE, RHCDS, RHCVA, RHCSS, RHCA
http://github.com/jumanjiman
GPG Public Key ID: 0xf59e77c2
Fingerprint = 3248 D0C8 4B42 2F7C D92A  AEA0 7D20 6D66 F59E 77C2



