Here is my cluster.conf:

<?xml version="1.0"?>
<cluster config_version="33" name="GFSpfsCluster">
    <logging debug="on"/>
    <clusternodes>
        <clusternode name="pfs03.ns.gfs2.us" nodeid="1" votes="1">
            <fence>
                <method name="single">
                    <device name="pfs03.ns.us.ctidata.net_vmware"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="pfs04.ns.gfs2.us" nodeid="2" votes="1">
            <fence>
                <method name="single">
                    <device name="pfs04.ns.us.ctidata.net_vmware"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="pfs05.ns.gfs2.us" nodeid="3" votes="1">
            <fence>
                <method name="single">
                    <device name="pfs05.ns.us.ctidata.net_vmware"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <fencedevices>
        <fencedevice agent="fence_vmware" ipaddr="10.50.6.20" login="administrator" name="pfs03.ns.us.ctidata.net_vmware" passwd="secret" port="pfs03.ns.us.ctidata.net"/>
        <fencedevice agent="fence_vmware" ipaddr="10.50.6.20" login="administrator" name="pfs04.ns.us.ctidata.net_vmware" passwd="secret" port="pfs04.ns.us.ctidata.net"/>
        <fencedevice agent="fence_vmware" ipaddr="10.50.6.20" login="administrator" name="pfs05.ns.us.ctidata.net_vmware" passwd="secret" port="pfs05.ns.us.ctidata.net"/>
    </fencedevices>
    <rm>
        <resources>
            <script file="/etc/init.d/httpd" name="httpd"/>
        </resources>
        <failoverdomains>
            <failoverdomain name="pfs03_only" nofailback="0" ordered="0" restricted="1">
                <failoverdomainnode name="pfs03.ns.gfs2.us" priority="1"/>
            </failoverdomain>
            <failoverdomain name="pfs04_only" nofailback="0" ordered="0" restricted="1">
                <failoverdomainnode name="pfs04.ns.gfs2.us" priority="1"/>
            </failoverdomain>
            <failoverdomain name="pfs05_only" nofailback="0" ordered="0" restricted="1">
                <failoverdomainnode name="pfs05.ns.gfs2.us" priority="1"/>
            </failoverdomain>
        </failoverdomains>
        <service autostart="1" domain="pfs03_only" exclusive="0" name="pfs03_apache" recovery="restart">
            <script ref="httpd"/>
        </service>
        <service autostart="1" domain="pfs04_only" exclusive="0" name="pfs04_apache" recovery="restart">
            <script ref="httpd"/>
        </service>
        <service autostart="1" domain="pfs05_only" exclusive="0" name="pfs05_apache" recovery="restart">
        </service>
    </rm>
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <cman/>
</cluster>
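
For reference, the file can also be sanity-checked against the cluster schema
on each node; the stock validation tool should read /etc/cluster/cluster.conf
by default:

    # ccs_config_validate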

uname -n = pfs05.ns.us.ctidata.net

As I am sure you will notice, cluster.conf has the node name set to
pfs05.ns.gfs2.us while the hostname is pfs05.ns.us.ctidata.net. This was
working previously, it is still working on the other two nodes, and it is
configured this way so that the cluster uses a private VLAN set up
specifically for cluster communications.
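
Since cman/corosync should bind to whichever address the cluster node name
resolves to, a quick way to confirm that pfs05.ns.gfs2.us really lands on the
private VLAN interface on this node is something like:

    # getent hosts pfs05.ns.gfs2.us
    # ip addr show eth4

Both should show the same 10.50.3.x address.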

The network is set up as follows:

eth0 = 10.50.10.32/24 - the production traffic interface
eth1 = 10.50.20.32/24 - the interface used for iSCSI connections to our SAN
eth2 = 10.50.6.32/24 - the interface set up for FreeIPA-authenticated SSH access in from our mgmt VLAN
eth3 = 10.50.1.32/24 - a legacy interface used during the transition from the old environment to this new one
eth4 = 10.50.3.70/27 - the interface that pfs05.ns.gfs2.us resolves to, used for cluster communications (sketched as /etc/hosts entries below)
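
To spell out the intent of that last entry, the gfs2.us names are meant to
resolve to addresses on the 10.50.3.64/27 cluster VLAN, roughly like the
following /etc/hosts sketch (only the .70 entry for pfs05 is the real address
from above; the pfs03/pfs04 addresses are placeholders):

    10.50.3.68   pfs03.ns.gfs2.us   # placeholder address
    10.50.3.69   pfs04.ns.gfs2.us   # placeholder address
    10.50.3.70   pfs05.ns.gfs2.us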

David

On 08/01/2011 08:56 PM, Digimer wrote:

On 08/01/2011 09:50 PM, David wrote:

I have the RHCS installed on CentOS6 x86_64.

One of the nodes in a 3 node cluster won't start after I moved the nodes
to a new vlan.

When I start cman this is what I get:

Starting cluster:
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... Aug 02 01:45:17 corosync [MAIN  ] Corosync Cluster
Engine ('1.2.3'): started and ready to provide service.
Aug 02 01:45:17 corosync [MAIN  ] Corosync built-in features: nss rdma
Aug 02 01:45:17 corosync [MAIN  ] Successfully read config from
/etc/cluster/cluster.conf
Aug 02 01:45:17 corosync [MAIN  ] Successfully parsed cman config
Aug 02 01:45:17 corosync [TOTEM ] Token Timeout (10000 ms) retransmit
timeout (2380 ms)
Aug 02 01:45:17 corosync [TOTEM ] token hold (1894 ms) retransmits
before loss (4 retrans)
Aug 02 01:45:17 corosync [TOTEM ] join (60 ms) send_join (0 ms)
consensus (12000 ms) merge (200 ms)
Aug 02 01:45:17 corosync [TOTEM ] downcheck (1000 ms) fail to recv const
(2500 msgs)
Aug 02 01:45:17 corosync [TOTEM ] seqno unchanged const (30 rotations)
Maximum network MTU 1402
Aug 02 01:45:17 corosync [TOTEM ] window size per rotation (50 messages)
maximum messages per rotation (17 messages)
Aug 02 01:45:17 corosync [TOTEM ] missed count const (5 messages)
Aug 02 01:45:17 corosync [TOTEM ] send threads (0 threads)
Aug 02 01:45:17 corosync [TOTEM ] RRP token expired timeout (2380 ms)
Aug 02 01:45:17 corosync [TOTEM ] RRP token problem counter (2000 ms)
Aug 02 01:45:17 corosync [TOTEM ] RRP threshold (10 problem count)
Aug 02 01:45:17 corosync [TOTEM ] RRP mode set to none.
Aug 02 01:45:17 corosync [TOTEM ] heartbeat_failures_allowed (0)
Aug 02 01:45:17 corosync [TOTEM ] max_network_delay (50 ms)
Aug 02 01:45:17 corosync [TOTEM ] HeartBeat is Disabled. To enable set
heartbeat_failures_allowed > 0
Aug 02 01:45:17 corosync [TOTEM ] Initializing transport (UDP/IP).
Aug 02 01:45:17 corosync [TOTEM ] Initializing transmit/receive
security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Aug 02 01:45:17 corosync [IPC   ] you are using ipc api v2
Aug 02 01:45:18 corosync [TOTEM ] Receive multicast socket recv buffer
size (262142 bytes).
Aug 02 01:45:18 corosync [TOTEM ] Transmit multicast socket send buffer
size (262142 bytes).
corosync: totemsrp.c:3091: memb_ring_id_create_or_load: Assertion `res
== sizeof (unsigned long long)' failed.
Aug 02 01:45:18 corosync [TOTEM ] The network interface [10.50.3.70] is
now up.
corosync died with signal: 6 Check cluster logs for details


Any idea what the issue could be?

Thanks
David

What is your cluster.conf file (please obscure passwords only), what
does `uname -n` return and what is your network configuration (interface
names and IPs)?