[Linux-cluster] Unexpected service restart

Thu Jan 11 20:38:02 UTC 2007

Hi,
I have a problem with a service (oracle): clurgmgrd restarts it
without logging any error message.
The "messages" log shows:

=============================================================
Jan  8 04:50:50 eir-db1 clurgmgrd: [2922]: <info> Executing /opt/oracle/OraHome10g/bin/oracle_mgr.sh status
...
Jan  8 04:51:20 eir-db1 clurgmgrd: [2922]: <info> Executing /opt/oracle/OraHome10g/bin/oracle_mgr.sh status
...
Jan  8 04:51:40 eir-db1 clurgmgrd[2922]: <notice> Stopping service Oracle
Jan  8 04:51:40 eir-db1 clurgmgrd: [2922]: <info> Removing IPv4 address xxx.xxx.xxx.xxx from bond1
...
Jan  8 04:51:50 eir-db1 clurgmgrd: [2922]: <info> Executing /opt/oracle/OraHome10g/bin/oracle_mgr.sh stop
...
Jan  8 04:52:27 eir-db1 clurgmgrd: [2922]: <info> Removing IPv4 address yyy.yyy.yyy.yyy from bond0
Jan  8 04:52:37 eir-db1 clurgmgrd: [2922]: <info> unmounting /data
Jan  8 04:52:38 eir-db1 clurgmgrd[2922]: <notice> Service Oracle is recovering
Jan  8 04:52:38 eir-db1 clurgmgrd[2922]: <notice> Recovering failed service Oracle
Jan  8 04:52:38 eir-db1 clurgmgrd: [2922]: <info> mounting /dev/cciss/c1d0p1 on /data
Jan  8 04:52:38 eir-db1 kernel: kjournald starting.  Commit interval 5 seconds
Jan  8 04:52:38 eir-db1 kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Jan  8 04:52:38 eir-db1 kernel: EXT3 FS on cciss/c1d0p1, internal journal
Jan  8 04:52:38 eir-db1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jan  8 04:52:38 eir-db1 clurgmgrd: [2922]: <info> Adding IPv4 address xxx.xxx.xxx.xxx to bond0
Jan  8 04:52:39 eir-db1 clurgmgrd: [2922]: <info> Executing /opt/oracle/OraHome10g/bin/oracle_mgr.sh start
...
Jan  8 04:52:47 eir-db1 su(pam_unix)[27069]: session closed for user oracle
Jan  8 04:52:47 eir-db1 clurgmgrd: [2922]: <info> Adding IPv4 address yyy.yyy.yyy.yyy to bond1
Jan  8 04:52:48 eir-db1 clurgmgrd[2922]: <notice> Service Oracle started
Jan  8 04:52:50 eir-db1 clurgmgrd: [2922]: <info> Executing /opt/oracle/OraHome10g/bin/oracle_mgr.sh status
...
Jan  8 04:53:20 eir-db1 clurgmgrd: [2922]: <info> Executing /opt/oracle/OraHome10g/bin/oracle_mgr.sh status
============================================================================
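
To make the timing in the log above easier to see, here is a minimal Python sketch that measures how long after the last status check each "Stopping service" notice appears. The function name gaps_before_stop, the regular expression and the default year are my own assumptions for illustration, not part of any cluster tool. It expects each syslog entry on a single line, as in the raw log file.

```python
import re
from datetime import datetime

# Matches clurgmgrd entries in both forms seen in the log:
#   "... clurgmgrd: [2922]: <info> Executing ..."
#   "... clurgmgrd[2922]: <notice> Stopping service Oracle"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+clurgmgrd\S*\s+"
    r"(?:\[\d+\]:\s+)?<(\w+)>\s+(.*)$"
)


def gaps_before_stop(lines, year=2007):
    """Return (service, seconds since last status check) for each stop notice.

    syslog timestamps carry no year, so one is supplied (assumption: 2007).
    """
    last_status = None
    results = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # kernel, su and other non-clurgmgrd lines are skipped
        stamp, _level, message = match.groups()
        normalized = " ".join(stamp.split())  # collapse "Jan  8" to "Jan 8"
        when = datetime.strptime(str(year) + " " + normalized, "%Y %b %d %H:%M:%S")
        if message.startswith("Executing") and message.endswith(" status"):
            last_status = when
        elif message.startswith("Stopping service "):
            service = message[len("Stopping service "):]
            gap = (when - last_status).total_seconds() if last_status else None
            results.append((service, gap))
    return results


sample = [
    "Jan  8 04:51:20 eir-db1 clurgmgrd: [2922]: <info> Executing /opt/oracle/OraHome10g/bin/oracle_mgr.sh status",
    "Jan  8 04:51:40 eir-db1 clurgmgrd[2922]: <notice> Stopping service Oracle",
]
print(gaps_before_stop(sample))  # [('Oracle', 20.0)]
```

In the excerpt the stop begins 20 seconds after the 04:51:20 status check. That would be consistent with that check taking a while and then returning a failure, but the log alone does not prove it.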

Running "/opt/oracle/OraHome10g/bin/oracle_mgr.sh status" by hand does NOT
fail (it is only a basic test), so I don't understand why the service is restarted.
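
As far as I understand, clurgmgrd decides from the exit code of the script's status action, and a non-zero code counts as a failure that starts recovery. A minimal Python sketch to capture that exit code and the script output could look like this. The helper name run_status is hypothetical, and the command used here is a harmless stand-in that exits with code 3, not the real script.

```python
import subprocess
import sys


def run_status(command):
    """Run a status command; return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


# Stand-in for the real status call (assumption for illustration only):
# a child Python process that prints a message and exits with code 3.
stand_in = [sys.executable, "-c", "import sys; print('instance down'); sys.exit(3)"]

code, output = run_status(stand_in)
print("exit code:", code)
print("output:", output.strip())
if code != 0:
    print("a non-zero exit code would be reported as a failed status check")
```

On the cluster node the command would be ["/opt/oracle/OraHome10g/bin/oracle_mgr.sh", "status"]. A run from an interactive shell may behave differently from a run started by the cluster daemon, for example because of a different environment, so a clean manual run does not rule out an occasional failure.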
The service configuration is:

==============================================================================================
                <service autostart="1" name="Oracle">
                        <fs device="/dev/cciss/c1d0p1" force_unmount="0" fstype="ext3" mountpoint="/data" name="oracle_fs" options="">
                                <ip address="xxx.xxx.xxx.xxx" monitor_link="0">
                                        <script file="/opt/oracle/OraHome10g/bin/oracle_mgr.sh" name="Oracle_script"/>
                                </ip>
                        </fs>
                        <ip address="yyy.yyy.yyy.yyy" monitor_link="0"/>
                </service>
===============================================================================================
I need to find out the reason for this restart.
Any ideas?
Do you need any other log files?
How can I change the log level in the cluster?

At the moment I'm checking the hardware (SCSI controller, disks and
drivers).
The cluster is configured on Red Hat E4 U2.
We currently have 3 services running in the cluster, and we have
problems only with Oracle.
The hardware is: HP ProLiant DL385 Packaged Cluster with MSA500 G2

Thanks in advance
Luis G.