[Linux-cluster] Problems with a script when it is launched via rgmanager

carlopmart carlopmart at gmail.com
Mon Jan 10 21:16:51 UTC 2011


Hi all,

  I am trying to set up a Splunk cluster service on two RHEL 5.5 hosts (fully 
updated). The problem appears when I run this service under rgmanager: the 
script always fails. If I launch the script manually, everything works as 
expected. If I test the service with the rg_test command, it also works as expected.

  This is the error when rgmanager tries to launch the service:

Jan 10 17:50:55 lorien clurgmgrd[25394]: <notice> Starting disabled service 
service:siemmgmt-svc
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <debug> Link for eth0: Detected
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <info> Adding IPv4 address 
172.25.70.22/28 to eth0
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <debug> Pinging addr 172.25.70.22 from 
dev eth0
Jan 10 17:50:57 lorien clurgmgrd: [25394]: <debug> Sending gratuitous ARP: 
172.25.70.22 00:50:56:14:5a:1e brd ff:ff:ff:ff:ff:ff
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <warning> Unknown file system type 'ext4' 
for device /dev/inasvol/splunkvol.  Assuming fsck is required.
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <debug> Running fsck on 
/dev/inasvol/splunkvol
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> mounting /dev/inasvol/splunkvol on 
/data/services/siem/splunk
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <debug> mount -t ext4 -o rw 
/dev/inasvol/splunkvol /data/services/siem/splunk
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> Executing 
/data/config/etc/init.d/splunk-cluster start
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <err> script:splunk-cluster: start of 
/data/config/etc/init.d/splunk-cluster failed (returned 1)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> start on script "splunk-cluster" 
returned 1 (generic error)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <warning> #68: Failed to start 
service:siemmgmt-svc; return value: 1
Jan 10 17:50:58 lorien clurgmgrd[25394]: <debug> Stopping failed service 
service:siemmgmt-svc
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> Stopping service service:siemmgmt-svc
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> Executing 
/data/config/etc/init.d/splunk-cluster stop
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <err> script:splunk-cluster: stop of 
/data/config/etc/init.d/splunk-cluster failed (returned 1)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> stop on script "splunk-cluster" 
returned 1 (generic error)
Jan 10 17:50:59 lorien clurgmgrd: [25394]: <info> unmounting /data/services/siem/splunk
Jan 10 17:50:59 lorien clurgmgrd: [25394]: <info> Removing IPv4 address 
172.25.70.22/28 from eth0
Jan 10 17:51:09 lorien clurgmgrd[25394]: <crit> #12: RG service:siemmgmt-svc failed 
to stop; intervention required
Jan 10 17:51:09 lorien clurgmgrd[25394]: <notice> Service service:siemmgmt-svc is failed
Jan 10 17:51:09 lorien clurgmgrd[25394]: <crit> #13: Service service:siemmgmt-svc 
failed to stop cleanly
Jan 10 17:51:09 lorien clurgmgrd[25394]: <debug> Handling failure request for RG 
service:siemmgmt-svc
Jan 10 17:51:19 lorien clurgmgrd[25394]: <debug> 2 events processed


And this is the output of the rg_test command:

[root@lorien ~]# rg_test test /etc/cluster/cluster.conf start service siemmgmt-svc
Running in test mode.
Starting siemmgmt-svc...
<warn>   Unknown file system type 'ext4' for device /dev/inasvol/splunkvol. 
Assuming fsck is required.
<debug>  Running fsck on /dev/inasvol/splunkvol
<info>   mounting /dev/inasvol/splunkvol on /data/services/siem/splunk
<debug>  mount -t ext4 -o rw /dev/inasvol/splunkvol /data/services/siem/splunk
<debug>  Link for eth0: Detected
<info>   Adding IPv4 address 172.25.70.22/28 to eth0
<debug>  Pinging addr 172.25.70.22 from dev eth0
<debug>  Sending gratuitous ARP: 172.25.70.22 00:50:56:14:5a:1e brd ff:ff:ff:ff:ff:ff
<info>   Executing /data/config/etc/init.d/splunk-cluster start
+ . /etc/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
++ export PATH
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -f /etc/sysconfig/i18n -a -z '' ']'
++ . /etc/profile.d/lang.sh
+++ sourced=0
+++ for langfile in /etc/sysconfig/i18n '$HOME/.i18n'
+++ '[' -f /etc/sysconfig/i18n ']'
+++ . /etc/sysconfig/i18n
++++ LANG=en_US.UTF-8
++++ SYSFONT=latarcyrheb-sun16
+++ sourced=1
+++ for langfile in /etc/sysconfig/i18n '$HOME/.i18n'
+++ '[' -f /.i18n ']'
+++ '[' -n '' ']'
+++ '[' 1 = 1 ']'
+++ '[' -n en_US.UTF-8 ']'
+++ export LANG
+++ '[' -n '' ']'
+++ unset LC_ADDRESS
+++ '[' -n '' ']'
+++ unset LC_CTYPE
+++ '[' -n '' ']'
+++ unset LC_COLLATE
+++ '[' -n '' ']'
+++ unset LC_IDENTIFICATION
+++ '[' -n '' ']'
+++ unset LC_MEASUREMENT
+++ '[' -n '' ']'
+++ unset LC_MESSAGES
+++ '[' -n '' ']'
+++ unset LC_MONETARY
+++ '[' -n '' ']'
+++ unset LC_NAME
+++ '[' -n '' ']'
+++ unset LC_NUMERIC
+++ '[' -n '' ']'
+++ unset LC_PAPER
+++ '[' -n '' ']'
+++ unset LC_TELEPHONE
+++ '[' -n '' ']'
+++ unset LC_TIME
+++ '[' -n C ']'
+++ '[' C '!=' en_US.UTF-8 ']'
+++ export LC_ALL
+++ '[' -n '' ']'
+++ unset LANGUAGE
+++ '[' -n '' ']'
+++ unset LINGUAS
+++ '[' -n '' ']'
+++ unset _XKB_CHARSET
+++ consoletype=pty
+++ '[' -z pty ']'
+++ '[' -n '' ']'
+++ '[' -n '' ']'
+++ '[' -n en_US.UTF-8 ']'
+++ case $LANG in
+++ '[' dumb = linux ']'
+++ unset SYSFONTACM SYSFONT
+++ unset sourced
+++ unset langfile
++ '[' -z '' ']'
++ '[' -f /etc/sysconfig/init ']'
++ . /etc/sysconfig/init
+++ BOOTUP=color
+++ GRAPHICAL=yes
+++ RES_COL=60
+++ MOVE_TO_COL='echo -en \033[60G'
+++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
+++ SETCOLOR_FAILURE='echo -en \033[0;31m'
+++ SETCOLOR_WARNING='echo -en \033[0;33m'
+++ SETCOLOR_NORMAL='echo -en \033[0;39m'
+++ LOGLEVEL=3
+++ PROMPT=yes
+++ AUTOSWAP=no
++ '[' pty = serial ']'
++ '[' color '!=' verbose ']'
++ INITLOG_ARGS=-q
++ __sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'
+ '[' '!' -d /data/services/siem/splunk/etc ']'
+ HOME=/data/services/siem/splunk
+ DIRECTORY=/data/services/siem/splunk
+ export HOME
+ case "$1" in
+ start
+ echo -n 'Starting Splunk: '
Starting Splunk:
+ sudo -H -u splunk /data/services/siem/splunk/bin/splunk start

Splunk> The IT Search Engine.

Checking prerequisites...
	Checking http port [172.25.70.22:9000]: open
	Checking mgmt port [172.25.70.22:9089]: open
	Checking configuration...  Done.
	Checking index directory...  Done.
	Checking databases...
	Validated databases: _audit, _blocksignature, _internal, _thefishbucket, history, 
main, sample, summary
	Checking for SELinux.
All preliminary checks passed.

                                                            [  OK  ]
                                                            [  OK  ]
Starting splunk server daemon (splunkd)... Done.Starting splunkweb... Done.
If you get stuck, we're here to help.
Look for answers here: http://www.splunk.com/base/Documentation

The Splunk web interface is at https://172.25.70.22:9000

+ RETVAL=0
+ '[' 0 -eq 0 ']'
+ success
+ '[' color '!=' verbose -a -z '' ']'
+ echo_success
+ '[' color = color ']'
+ echo -en '\033[60G'
                                                            + echo -n '['
[+ '[' color = color ']'
+ echo -en '\033[0;32m'
+ echo -n '  OK  '
   OK  + '[' color = color ']'
+ echo -en '\033[0;39m'
+ echo -n ']'
]+ echo -ne '\r'
+ return 0
+ return 0
+ echo

+ return 0
+ exit 0
Start of siemmgmt-svc complete

  As you can see, everything works when the service is started with rg_test.
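
  One difference I can think of: rg_test runs from my interactive root shell 
(the trace shows CONSOLETYPE=pty), while clurgmgrd is a daemon. To imitate that 
by hand, here is a minimal sketch, assuming the relevant differences are the 
environment and the terminal on stdin/stdout/stderr. run_like_daemon and the log 
path are names I made up; they are not part of rgmanager.

```shell
# Hypothetical helper (not part of rgmanager): run a command with an
# almost empty environment and with stdin/stdout/stderr detached from
# the terminal, appending everything it prints to the log file "$1".
# Note: this only redirects the file descriptors; it does not drop the
# controlling terminal of the calling shell.
run_like_daemon() {
    log="$1"
    shift
    # env -i clears the inherited environment; only PATH is passed on
    env -i PATH=/sbin:/usr/sbin:/bin:/usr/bin "$@" </dev/null >>"$log" 2>&1
    rc=$?
    echo "exit status: $rc" >>"$log"
    return "$rc"
}
```

With the file system mounted and the IP address up, it could be called as:

    run_like_daemon /tmp/splunk-manual.log /data/config/etc/init.d/splunk-cluster start

If the script also fails this way, the log should show which command returns 
the error.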

Service configuration in cluster.conf:

<service autostart="0" domain="PriCluster2" name="siemmgmt-svc" recovery="relocate">
              <fs ref="siemdata">
                 <ip ref="172.25.70.22">
                        <script ref="splunk-cluster"/>
                 </ip>
              </fs>
</service>


  Script:

#!/bin/sh -x
# Splunk:       Controls Splunk on Redhat-based systems
#
# chkconfig: 2345 99 15
# description: Starts and stops Splunk
#
# This will work on Redhat systems (maybe others too)

# Source function library.
. /etc/init.d/functions

if [ ! -d /data/services/siem/splunk/etc ]; then
         exit 1
fi

HOME="/data/services/siem/splunk"
DIRECTORY="/data/services/siem/splunk"

export HOME



start() {
         echo -n "Starting Splunk: "
         sudo -H -u splunk ${DIRECTORY}/bin/splunk start > /dev/null
         RETVAL=$?
         if [ $RETVAL -eq 0 ]; then
                 success
         else
                 failure
         fi
         echo
         return $RETVAL
}

stop() {
         echo -n "Stopping Splunk: "
         sudo -H -u splunk ${DIRECTORY}/bin/splunk stop > /dev/null
         RETVAL=$?
         if [ $RETVAL -eq 0 ]; then
                 success
         else
                 failure
         fi
         echo
         return $RETVAL
}

status() {
         exit 0
}

case "$1" in
   start)
         start
         ;;
   stop)
         stop
         ;;
   restart)
         stop
         start
         ;;
   status)
         status
         ;;
   *)
         echo $"Usage: $0 {start|stop|restart|status}"
         exit 1
esac

exit $?
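
  The script runs with "sh -x", but the rgmanager log above only shows the 
return value, so I assume the trace on stderr is not being logged. One way to 
see it may be to send the script's own output to a file. A minimal sketch, 
untested under rgmanager; debug_to_file and the log path are hypothetical names:

```shell
# Hypothetical helper: after this call, everything the current shell and
# its children write to stdout or stderr is appended to the file "$1".
debug_to_file() {
    # exec with only redirections changes the descriptors of this shell
    exec >>"$1" 2>&1
    echo "=== $(date) pid $$ ==="
    echo "user: $(id -un)"
    echo "tty: $(tty 2>&1)"
    echo "PATH=$PATH"
}
```

Calling "debug_to_file /tmp/splunk-cluster.debug.log" right after the 
"Source function library" line should then record the -x trace and any message 
from sudo or splunk, for both the start and the stop call.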


  How can I debug this error? I don't know why it fails when it is launched via rgmanager.

-- 
CL Martinez
carlopmart {at} gmail {d0t} com



