[Linux-cluster] error clusvcadm
Delphine Ramalingom
delphine.ramalingom at univ-reunion.fr
Mon May 13 07:32:27 UTC 2013
Hi,
This is the cluster.conf:
[root at titan0 11:29:14 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0" ?>
<cluster config_version="7" name="HA_MGMT">
    <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="60"/>
    <clusternodes>
        <clusternode name="titan0" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="titan0fence" option="reboot"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="titan1" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="titan1fence" option="reboot"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman cluster_id="0" expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_ipmilan" ipaddr="172.17.0.101" login="administrator" name="titan0fence" passwd="administrator"/>
        <fencedevice agent="fence_ipmilan" ipaddr="172.17.0.102" login="administrator" name="titan1fence" passwd="administrator"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="titan0_heuristic" ordered="0" restricted="1">
                <failoverdomainnode name="titan0" priority="1"/>
            </failoverdomain>
            <failoverdomain name="titan1_heuristic" ordered="0" restricted="1">
                <failoverdomainnode name="titan1" priority="1"/>
            </failoverdomain>
            <failoverdomain name="MgmtNodes" ordered="0" restricted="0">
                <failoverdomainnode name="titan0" priority="1"/>
                <failoverdomainnode name="titan1" priority="2"/>
            </failoverdomain>
            <failoverdomain name="NFSHA" ordered="0" restricted="0">
                <failoverdomainnode name="titan0" priority="2"/>
                <failoverdomainnode name="titan1" priority="1"/>
            </failoverdomain>
        </failoverdomains>
        <service domain="titan0_heuristic" name="ha_titan0_check" autostart="1" checkinterval="10">
            <script file="/usr/sbin/ha_titan0_check" name="ha_titan0_check"/>
        </service>
        <service domain="titan1_heuristic" name="ha_titan1_check" autostart="1" checkinterval="10">
            <script file="/usr/sbin/ha_titan1_check" name="ha_titan1_check"/>
        </service>
        <service domain="MgmtNodes" name="HA_MGMT" autostart="0" recovery="relocate">
            <!-- ip addresses lines mgmt -->
            <ip address="172.17.0.99/16" monitor_link="1"/>
            <ip address="10.90.0.99/24" monitor_link="1"/>
            <!-- devices lines mgmt -->
            <fs device="LABEL=postfix" mountpoint="/var/spool/postfix" force_unmount="1" fstype="ext3" name="mgmtha5" options=""/>
            <fs device="LABEL=bigimage" mountpoint="/var/lib/systemimager" force_unmount="1" fstype="ext3" name="mgmtha4" options=""/>
            <clusterfs device="LABEL=HA_MGMT:conman" mountpoint="/var/log/conman" force_unmount="0" fstype="gfs2" name="mgmtha3" options=""/>
            <clusterfs device="LABEL=HA_MGMT:ganglia" mountpoint="/var/lib/ganglia/rrds" force_unmount="0" fstype="gfs2" name="mgmtha2" options=""/>
            <clusterfs device="LABEL=HA_MGMT:syslog" mountpoint="/var/log/HOSTS" force_unmount="0" fstype="gfs2" name="mgmtha1" options=""/>
            <clusterfs device="LABEL=HA_MGMT:cdb" mountpoint="/var/lib/pgsql/data" force_unmount="0" fstype="gfs2" name="mgmtha0" options=""/>
            <script file="/usr/sbin/haservices" name="haservices"/>
        </service>
        <service domain="NFSHA" name="HA_NFS" autostart="0" checkinterval="60">
            <!-- ip addresses lines nfs -->
            <ip address="10.31.0.99/16" monitor_link="1"/>
            <ip address="10.90.0.88/24" monitor_link="1"/>
            <ip address="172.17.0.88/16" monitor_link="1"/>
            <!-- devices lines nfs -->
            <fs device="LABEL=PROGS" mountpoint="/programs" force_unmount="1" fstype="ext3" name="nfsha4" options=""/>
            <fs device="LABEL=WRKTMP" mountpoint="/worktmp" force_unmount="1" fstype="ext3" name="nfsha3" options=""/>
            <fs device="LABEL=LABOS" mountpoint="/labos" force_unmount="1" fstype="xfs" name="nfsha2" options="ikeep"/>
            <fs device="LABEL=OPTINTEL" mountpoint="/opt/intel" force_unmount="1" fstype="ext3" name="nfsha1" options=""/>
            <fs device="LABEL=HOMENFS" mountpoint="/home_nfs" force_unmount="1" fstype="ext3" name="nfsha0" options=""/>
            <script file="/etc/init.d/nfs" name="nfs_service"/>
        </service>
    </rm>
    <totem token="21000" />
</cluster>
<!-- !!!!! DON'T REMOVE OR CHANGE ANYTHING IN PARAMETERS SECTION BELOW
node_name=titan0
node_ipmi_ipaddr=172.17.0.101
node_hwmanager_login=administrator
node_hwmanager_passwd=administrator
ipaddr1_for_heuristics=172.17.0.200
node_ha_name=titan1
node_ha_ipmi_ipaddr=172.17.0.102
node_ha_hwmanager_login=administrator
node_ha_hwmanager_passwd=administrator
ipaddr2_for_heuristics=172.17.0.200
mngt_virt_ipaddr_for_heuristics=not used on this type of node
END OF SECTION !!!!! -->
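Since the error is "Could not match LABEL=postfix with a real device", the first thing to verify is that every LABEL referenced in cluster.conf still resolves on the node. A minimal sketch (the function name is mine; findfs(8) does the same lookup the fs resource agent performs and exits non-zero for an unknown label):

```shell
# Sketch: list every LABEL= referenced in a cluster.conf and report
# whether the system can currently resolve it to a block device.
check_labels() {
    # Extract unique LABEL= strings from the config, then try each one.
    grep -o 'LABEL=[^"]*' "$1" | sort -u | while read -r lbl; do
        if dev=$(findfs "$lbl" 2>/dev/null); then
            echo "OK      $lbl -> $dev"
        else
            echo "MISSING $lbl"
        fi
    done
}

# Usage: check_labels /etc/cluster/cluster.conf
```

Any label reported MISSING would explain the startFilesystem failure; in that case the label was likely lost or the device did not come back after the power-down. Once every label resolves, rgmanager typically needs the failed service cleared with `clusvcadm -d HA_MGMT` before `clusvcadm -e HA_MGMT` will succeed.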
The /var/log/messages file is very long and has some messages repeated:
May 13 11:30:33 s_sys at titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:39198
[... the same message repeated seven times ...]
May 13 11:30:34 s_sys at titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:53030
May 13 11:30:34 s_sys at titan0 snmpd[4584]: Received SNMP packet(s) from UDP: [10.40.20.30]:53030
May 13 11:30:34 s_sys at titan0 snmpd[4584]: Connection from UDP: [10.40.20.30]:41083
May 13 11:30:34 s_sys at titan0 snmpd[4584]: Received SNMP packet(s) from UDP: [10.40.20.30]:41083
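Most of that log is snmpd polling noise. To find the cluster-relevant lines, it helps to filter for the cluster-stack daemons; a small sketch (the function name is mine, and the daemon names are the usual RHEL cluster-suite syslog tags — adjust to what actually appears in your log):

```shell
# Sketch: keep only cluster-stack messages from a syslog file, dropping
# the snmpd polling noise. Pattern lists the usual RHEL cluster-suite
# daemon names plus the fs agent's error keyword.
filter_cluster_msgs() {
    grep -E 'rgmanager|clurgmgrd|fenced|cman|openais|startFilesystem' "$1"
}

# Usage: filter_cluster_msgs /var/log/messages
```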
Regards
Delphine
On 13/05/13 at 10:37, Rajveer Singh wrote:
> Hi Delphine,
> It seems there is some filesystem crash. Please share your
> /var/log/messages and /etc/cluster/cluster.conf files so we can help you further.
>
> Regards,
> Rajveer Singh
>
>
> On Mon, May 13, 2013 at 11:58 AM, Delphine Ramalingom
> <delphine.ramalingom at univ-reunion.fr
> <mailto:delphine.ramalingom at univ-reunion.fr>> wrote:
>
> Hello,
>
> I have a problem and I need some help.
>
> Our Linux cluster was stopped for maintenance in the server
> room, but an error occurred during the shutdown procedure:
> Local machine disabling service:HA_MGMT...Failure
>
> The cluster was then powered off. Since the restart, I have not
> succeeded in restarting the services with the clusvcadm command.
> I get this message:
>
> clusvcadm -e HA_MGMT
> Local machine trying to enable service:HA_MGMT...Aborted; service
> failed
> and
> <err> startFilesystem: Could not match LABEL=postfix with a
> real device
>
> Do you have a solution for me?
>
> Thanks a lot in advance.
>
> Regards
> Delphine
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com <mailto:Linux-cluster at redhat.com>
> https://www.redhat.com/mailman/listinfo/linux-cluster