[Linux-cluster] What does "rgmanager status 139" mean?

C. L. Martinez carlopmart at gmail.com
Tue Feb 28 10:48:53 UTC 2012


On Tue, Feb 28, 2012 at 11:46 AM, C. L. Martinez <carlopmart at gmail.com> wrote:
> On Tue, Feb 28, 2012 at 11:36 AM, emmanuel segura <emi2fast at gmail.com> wrote:
>> I think your problem is in the service definition
>>
>> On 28 February 2012 11:19, C. L. Martinez <carlopmart at gmail.com>
>> wrote:
>>>
>>> On Tue, Feb 28, 2012 at 11:14 AM, Digimer <linux at alteeve.com> wrote:
>>> > On 02/28/2012 05:06 AM, C. L. Martinez wrote:
>>> >> On Tue, Feb 28, 2012 at 11:01 AM, Digimer <linux at alteeve.com> wrote:
>>> >>> On 02/28/2012 04:20 AM, C. L. Martinez wrote:
>>> >>>> Hi all,
>>> >>>>
>>> >>>>  What does it mean? I guess it is related to the status check
>>> >>>> done by rgmanager, but when I run the status option from the
>>> >>>> shell, the result is 0 ... so why does rgmanager return this error?
>>> >>>>
>>> >>>> Thanks.
>>> >>>
>>> >>> What version of the cluster? What is the cluster's configuration? What
>>> >>> service is returning 139?
>>> >>>
>>> >>> You need to provide much more information than this for anyone to be
>>> >>> able to help.
>>> >>>
>>> >>
>>> >> My RHCS versions:
>>> >> cman-3.0.12.1-23.el6.x86_64
>>> >> rgmanager-3.0.12.1-5.el6.x86_64
>>> >>
>>> >> cluster.conf relative to failed service:
>>> >> <service autostart="0" domain="FirstCluster" exclusive="0"
>>> >>          name="splunksrv-svc" recovery="relocate">
>>> >>     <fs ref="splunksrvdata">
>>> >>         <ip ref="192.168.44.4">
>>> >>             <script ref="splunksrv-cluster"/>
>>> >>         </ip>
>>> >>     </fs>
>>> >> </service>
>>> >
>>> > That is the service, but it doesn't show the 'ref' entries.
>>>
>>> <fs device="/dev/cludata/splunksrvdata" force_fsck="0"
>>>     force_unmount="1" fstype="ext4"
>>>     mountpoint="/data/splunk/instance" name="splunksrvdata"
>>>     self_fence="1"/>
>>> <ip address="192.168.44.4" monitor_link="yes" sleeptime="10"/>
>>> <script file="/data/config/etc/init.d/splunksrv-cluster"
>>>         name="splunksrv-cluster"/>
>>>
>>>
>>> >
>>> >> My service's script:
>>> >> #!/bin/sh -x
>>> >
>>> > <snip>
>>> >
>>> >> Executing from command line:
>>> >>
>>> >> [root@clunode01 init.d]# ./splunksrv-cluster mystatus
>>> >
>>> > <snip>
>>> >
>>> > rgmanager calls "status", so your script must work exactly like an
>>> > init.d script for rgmanager to work (that is, it must respond as
>>> > expected to 'start', 'stop' and 'status'). "mystatus" is not
>>> > something rgmanager will ever call.
>>> >
>>>
>>> Yes, I know ... I renamed "status" to "mystatus" to do some checks
>>> ... but when I change "mystatus" back to "status" (the correct way),
>>> rgmanager fails with error code 139 and relocates the service to the
>>> other node ...
>>>
>
> Yes, it seems so. But then, why do other services that are defined
> the same way work OK?
>
> For example:
> <service autostart="0" domain="SecondNode" name="ossecs-svc"
> recovery="restart-disabled">
>           <fs ref="ossecsdata">
>                   <script ref="ossecs-cluster"/>
>           </fs>
>  </service>
>
>  and it works perfectly ...

But no, it doesn't work:

Feb 28 10:47:28 rgmanager [script] script:splunksrv-cluster: status of
/data/config/etc/init.d/splunksrv-cluster failed (returned 139)
Feb 28 10:47:28 rgmanager status on script "splunksrv-cluster"
returned 1 (generic error)
Feb 28 10:47:28 rgmanager Stopping service service:splunksrv-svc
Feb 28 10:47:45 rgmanager Service service:splunksrv-svc is recovering
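
Side note: an exit status above 128 from a shell means the process was
killed by a signal (128 plus the signal number), and 139 - 128 = 11 is
SIGSEGV. So rgmanager is most likely seeing the script, or something it
calls, crash during the status check, even though it exits 0 from an
interactive shell. A quick demonstration of the convention, plus a way
to reproduce the call with a clean environment, closer to how rgmanager
invokes it:

# A process killed by SIGSEGV (signal 11) exits with 128 + 11 = 139:
sh -c 'kill -SEGV $$'; echo $?     # prints 139

# Run the status check with a stripped environment; environment
# differences (PATH, locale, etc.) are a common reason a script behaves
# differently under the cluster than from a login shell:
env -i /bin/sh /data/config/etc/init.d/splunksrv-cluster status; echo $?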

I have changed it to:

<service autostart="0" domain="FirstCluster" exclusive="0"
name="splunksrv-svc" recovery="relocate">
        <fs ref="splunksrvdata"/>
        <ip ref="192.168.44.4"/>
        <script ref="splunksrv-cluster"/>
</service>
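
For reference, since rgmanager drives the script with 'start', 'stop'
and 'status' and acts on the exit code, the script has to behave like a
regular init.d script. A minimal sketch of such a wrapper (the
SPLUNK_HOME path and the splunk subcommands below are assumptions, not
the real script):

#!/bin/sh
# Minimal init.d-style wrapper for rgmanager (sketch only; the
# SPLUNK_HOME path and splunk subcommands are assumptions).
SPLUNK_HOME=/data/splunk/instance

case "$1" in
    start)
        "$SPLUNK_HOME/bin/splunk" start
        ;;
    stop)
        "$SPLUNK_HOME/bin/splunk" stop
        ;;
    status)
        # rgmanager expects 0 when the service is healthy and non-zero
        # when it is not; a crash in a helper (exit 139) is treated as
        # a failed check and triggers recovery.
        "$SPLUNK_HOME/bin/splunk" status >/dev/null 2>&1
        ;;
    *)
        echo "Usage: $0 {start|stop|status}"
        exit 2
        ;;
esac
exit $?

If memory serves, rg_test can also exercise the resource tree the same
way rgmanager does (e.g. rg_test test /etc/cluster/cluster.conf status
service splunksrv-svc), which often surfaces problems that running the
script from an interactive shell hides.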



