[Linux-cluster] DRBD+GFS - Link is down, Link is up

Giuseppe Fuggiano giuseppe.fuggiano at gmail.com
Thu Jun 18 19:22:34 UTC 2009


Hi all,

I configured GFS on top of DRBD (active-active) with RHCS, using IPMI as the fence device.

When I try to mount my GFS resource, the interconnect interface goes
down and one node gets fenced.  This happens every time.
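
For reference, the filesystem was created (once, on one node) and is
mounted (on both nodes) along the lines of the commands below; the
mount point is just an example, while the lock table name and journal
count match the GFS/DLM log lines further down:

# gfs_mkfs -p lock_dlm -t webclima:web -j 2 /dev/drbd0
# mount -t gfs /dev/drbd0 /mnt/web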


DRBD connects and both nodes become Primary:

Jun 18 19:04:30 alice kernel: drbd0: Handshake successful: Agreed
network protocol version 89
Jun 18 19:04:30 alice kernel: drbd0: Peer authenticated using 20 bytes
of 'sha1' HMAC
Jun 18 19:04:30 alice kernel: drbd0: conn( WFConnection -> WFReportParams )
Jun 18 19:04:30 alice kernel: drbd0: Starting asender thread (from
drbd0_receiver [3315])
Jun 18 19:04:30 alice kernel: drbd0: data-integrity-alg: <not-used>
Jun 18 19:04:30 alice kernel: drbd0: drbd_sync_handshake:
Jun 18 19:04:30 alice kernel: drbd0: self
2BA45318C0A122D1:CBAA0E591815072F:3F39591B4EF90EDD:2E40DDEB552666B9
Jun 18 19:04:30 alice kernel: drbd0: peer
CBAA0E591815072E:0000000000000000:3F39591B4EF90EDD:2E40DDEB552666B9
Jun 18 19:04:30 alice kernel: drbd0: uuid_compare()=1 by rule 7
Jun 18 19:04:30 alice kernel: drbd0: peer( Unknown -> Secondary )
conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
Jun 18 19:04:30 alice kernel: drbd0: peer( Secondary -> Primary )
Jun 18 19:04:31 alice kernel: drbd0: conn( WFBitMapS -> SyncSource )
pdsk( UpToDate -> Inconsistent )
Jun 18 19:04:31 alice kernel: drbd0: Began resync as SyncSource (will
sync 16384 KB [4096 bits set]).
Jun 18 19:04:33 alice kernel: drbd0: Resync done (total 1 sec; paused
0 sec; 16384 K/sec)
Jun 18 19:04:33 alice kernel: drbd0: conn( SyncSource -> Connected )
pdsk( Inconsistent -> UpToDate )

Then the fence domain is OK:

Jun 18 19:04:35 alice openais[3475]: [TOTEM] entering GATHER state from 11.
Jun 18 19:04:35 alice openais[3475]: [TOTEM] Creating commit token
because I am the rep.
Jun 18 19:04:35 alice openais[3475]: [TOTEM] Saving state aru 1b high
seq received 1b
Jun 18 19:04:35 alice openais[3475]: [TOTEM] Storing new sequence id for ring 34
Jun 18 19:04:35 alice openais[3475]: [TOTEM] entering COMMIT state.
Jun 18 19:04:35 alice openais[3475]: [TOTEM] entering RECOVERY state.
Jun 18 19:04:35 alice openais[3475]: [TOTEM] position [0] member 10.17.44.116:
Jun 18 19:04:35 alice openais[3475]: [TOTEM] previous ring seq 48 rep
10.17.44.116
Jun 18 19:04:35 alice openais[3475]: [TOTEM] aru 1b high delivered 1b
received flag 1
Jun 18 19:04:35 alice openais[3475]: [TOTEM] position [1] member 10.17.44.117:
Jun 18 19:04:35 alice openais[3475]: [TOTEM] previous ring seq 48 rep
10.17.44.117
Jun 18 19:04:35 alice openais[3475]: [TOTEM] aru a high delivered a
received flag 1
Jun 18 19:04:35 alice openais[3475]: [TOTEM] Did not need to originate
any messages in recovery.
Jun 18 19:04:35 alice openais[3475]: [TOTEM] Sending initial ORF token
Jun 18 19:04:35 alice openais[3475]: [CLM  ] CLM CONFIGURATION CHANGE
Jun 18 19:04:36 alice openais[3475]: [CLM  ] New Configuration:
Jun 18 19:04:36 alice openais[3475]: [CLM  ]  r(0) ip(10.17.44.116)
Jun 18 19:04:36 alice openais[3475]: [CLM  ] Members Left:
Jun 18 19:04:36 alice openais[3475]: [CLM  ] Members Joined:
Jun 18 19:04:36 alice openais[3475]: [CLM  ] CLM CONFIGURATION CHANGE
Jun 18 19:04:36 alice openais[3475]: [CLM  ] New Configuration:
Jun 18 19:04:36 alice openais[3475]: [CLM  ]  r(0) ip(10.17.44.116)
Jun 18 19:04:36 alice openais[3475]: [CLM  ]  r(0) ip(10.17.44.117)
Jun 18 19:04:36 alice openais[3475]: [CLM  ] Members Left:
Jun 18 19:04:36 alice openais[3475]: [CLM  ] Members Joined:
Jun 18 19:04:36 alice openais[3475]: [CLM  ]  r(0) ip(10.17.44.117)
Jun 18 19:04:36 alice openais[3475]: [SYNC ] This node is within the
primary component and will provide service.
Jun 18 19:04:36 alice openais[3475]: [TOTEM] entering OPERATIONAL state.
Jun 18 19:04:36 alice openais[3475]: [CLM  ] got nodejoin message 10.17.44.116
Jun 18 19:04:36 alice openais[3475]: [CLM  ] got nodejoin message 10.17.44.117
Jun 18 19:04:36 alice openais[3475]: [CPG  ] got joinlist message from node 1
Jun 18 19:04:40 alice kernel: dlm: connecting to 2
Jun 18 19:04:40 alice kernel: dlm: got connection from 2

Why does eth2 go down here?

Jun 18 19:04:53 alice kernel: eth2: Link is Down
Jun 18 19:04:53 alice openais[3475]: [TOTEM] The token was lost in the
OPERATIONAL state.
Jun 18 19:04:53 alice openais[3475]: [TOTEM] Receive multicast socket
recv buffer size (288000 bytes).
Jun 18 19:04:53 alice openais[3475]: [TOTEM] Transmit multicast socket
send buffer size (262142 bytes).
Jun 18 19:04:53 alice openais[3475]: [TOTEM] entering GATHER state from 2.
Jun 18 19:04:57 alice kernel: eth2: Link is Up 100 Mbps Full Duplex,
Flow Control: None
Jun 18 19:04:57 alice kernel: eth2: 10/100 speed: disabling TSO

Something goes wrong with DRBD:

Jun 18 19:04:58 alice kernel: drbd0: PingAck did not arrive in time.
Jun 18 19:04:58 alice kernel: drbd0: peer( Primary -> Unknown ) conn(
Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Jun 18 19:04:58 alice kernel: drbd0: asender terminated
Jun 18 19:04:58 alice kernel: drbd0: Terminating asender thread
Jun 18 19:04:58 alice kernel: drbd0: short read expecting header on sock: r=-512
Jun 18 19:04:58 alice kernel: drbd0: Creating new current UUID
Jun 18 19:04:58 alice kernel: drbd0: Connection closed
Jun 18 19:04:58 alice kernel: drbd0: conn( NetworkFailure -> Unconnected )
Jun 18 19:04:58 alice kernel: drbd0: receiver terminated
Jun 18 19:04:58 alice kernel: drbd0: Restarting receiver thread
Jun 18 19:04:58 alice kernel: drbd0: receiver (re)started
Jun 18 19:04:58 alice kernel: drbd0: conn( Unconnected -> WFConnection )

Something goes wrong in the cluster:

Jun 18 19:04:58 alice openais[3475]: [TOTEM] entering GATHER state from 0.
Jun 18 19:04:58 alice openais[3475]: [TOTEM] Creating commit token
because I am the rep.
Jun 18 19:04:58 alice openais[3475]: [TOTEM] Saving state aru 3c high
seq received 3c
Jun 18 19:04:58 alice openais[3475]: [TOTEM] Storing new sequence id for ring 38
Jun 18 19:04:58 alice openais[3475]: [TOTEM] entering COMMIT state.
Jun 18 19:04:58 alice openais[3475]: [TOTEM] entering RECOVERY state.
Jun 18 19:04:58 alice openais[3475]: [TOTEM] position [0] member 10.17.44.116:
Jun 18 19:04:58 alice openais[3475]: [TOTEM] previous ring seq 52 rep
10.17.44.116
Jun 18 19:04:58 alice openais[3475]: [TOTEM] aru 3c high delivered 3c
received flag 1
Jun 18 19:04:58 alice openais[3475]: [TOTEM] Did not need to originate
any messages in recovery.
Jun 18 19:04:58 alice openais[3475]: [TOTEM] Sending initial ORF token
Jun 18 19:04:58 alice openais[3475]: [CLM  ] CLM CONFIGURATION CHANGE
Jun 18 19:04:58 alice openais[3475]: [CLM  ] New Configuration:
Jun 18 19:04:58 alice kernel: dlm: closing connection to node 2
Jun 18 19:04:58 alice fenced[3494]: bob not a cluster member after 0
sec post_fail_delay
Jun 18 19:04:58 alice openais[3475]: [CLM  ]  r(0) ip(10.17.44.116)

"bob" node is fenced (it just joined!)

Jun 18 19:04:58 alice fenced[3494]: fencing node "bob"
Jun 18 19:04:58 alice openais[3475]: [CLM  ] Members Left:
Jun 18 19:04:58 alice openais[3475]: [CLM  ]  r(0) ip(10.17.44.117)
Jun 18 19:04:58 alice openais[3475]: [CLM  ] Members Joined:
Jun 18 19:04:58 alice openais[3475]: [CLM  ] CLM CONFIGURATION CHANGE
Jun 18 19:04:58 alice openais[3475]: [CLM  ] New Configuration:
Jun 18 19:04:58 alice openais[3475]: [CLM  ]  r(0) ip(10.17.44.116)
Jun 18 19:04:58 alice openais[3475]: [CLM  ] Members Left:
Jun 18 19:04:58 alice openais[3475]: [CLM  ] Members Joined:
Jun 18 19:04:58 alice openais[3475]: [SYNC ] This node is within the
primary component and will provide service.
Jun 18 19:04:58 alice openais[3475]: [TOTEM] entering OPERATIONAL state.
Jun 18 19:04:58 alice openais[3475]: [CLM  ] got nodejoin message 10.17.44.116
Jun 18 19:04:58 alice openais[3475]: [CPG  ] got joinlist message from node 1
Jun 18 19:05:03 alice kernel: eth2: Link is Down
Jun 18 19:05:08 alice kernel: eth2: Link is Up 100 Mbps Full Duplex,
Flow Control: None
Jun 18 19:05:08 alice kernel: eth2: 10/100 speed: disabling TSO
Jun 18 19:05:12 alice kernel: eth2: Link is Down
Jun 18 19:05:13 alice fenced[3494]: fence "bob" success
Jun 18 19:05:13 alice kernel: GFS: fsid=webclima:web.0: jid=1: Trying
to acquire journal lock...
Jun 18 19:05:13 alice kernel: GFS: fsid=webclima:web.0: jid=1: Looking
at journal...
Jun 18 19:05:13 alice kernel: GFS: fsid=webclima:web.0: jid=1: Done

eth2 keeps going up and down:

Jun 18 19:05:15 alice kernel: eth2: Link is Up 100 Mbps Full Duplex,
Flow Control: None
Jun 18 19:05:15 alice kernel: eth2: 10/100 speed: disabling TSO
Jun 18 19:05:21 alice kernel: eth2: Link is Down
Jun 18 19:05:24 alice kernel: eth2: Link is Up 100 Mbps Full Duplex,
Flow Control: None
Jun 18 19:05:24 alice kernel: eth2: 10/100 speed: disabling TSO
Jun 18 19:05:29 alice kernel: eth2: Link is Down
Jun 18 19:05:33 alice kernel: eth2: Link is Up 100 Mbps Full Duplex,
Flow Control: None
Jun 18 19:05:33 alice kernel: eth2: 10/100 speed: disabling TSO
Jun 18 19:07:26 alice kernel: eth2: Link is Down
Jun 18 19:07:29 alice kernel: eth2: Link is Up 100 Mbps Full Duplex,
Flow Control: None
Jun 18 19:07:29 alice kernel: eth2: 10/100 speed: disabling TSO
Jun 18 19:07:36 alice kernel: eth2: Link is Down
Jun 18 19:07:38 alice kernel: eth2: Link is Up 100 Mbps Full Duplex,
Flow Control: None
Jun 18 19:07:38 alice kernel: eth2: 10/100 speed: disabling TSO


Note that if I don't mount GFS, no node gets fenced and the failover
domains become active.
So I guess the problem is in GFS... and not, for example, in the NIC.
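
To double-check the NIC side independently of the cluster, the eth2
link state can be watched with something as simple as:

# while true; do date; ethtool eth2 | grep "Link detected"; sleep 1; done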

Here is my configuration:


# cat /etc/drbd.conf
global {
  usage-count no;
}

resource r1 {
  protocol C;

  syncer {
    rate 10M;
    verify-alg sha1;
  }

  startup {
    become-primary-on both;
    wfc-timeout 150;
  }

  disk {
    on-io-error detach;
  }

  net {
    allow-two-primaries;
    cram-hmac-alg "sha1";
    shared-secret "123456";
    after-sb-0pri discard-least-changes;
    after-sb-1pri violently-as0p;
    after-sb-2pri violently-as0p;
    rr-conflict violently;
    ping-timeout 50;
  }

  on alice {
    device      /dev/drbd0;
    disk        /dev/sda2;
    address     10.17.44.116:7789;
    meta-disk   internal;
  }

  on bob {
    device      /dev/drbd0;
    disk        /dev/sda2;
    address     10.17.44.117:7789;
    meta-disk   internal;
  }
}
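
In case it is useful: before mounting, the DRBD state can be verified
on both nodes with the usual commands (output omitted here):

# cat /proc/drbd
# drbdadm role r1
# drbdadm cstate r1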


# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="web" config_version="20" name="web">
        <fence_daemon post_fail_delay="0" post_join_delay="6"/>
        <clusternodes>
                <clusternode name="alice" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device lanplus="" name="alice-ipmi"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="bob" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device lanplus="" name="bob-ipmi"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_ipmilan" auth="password" ipaddr="10.17.44.134" login="cnmca" name="alice-ipmi" passwd="xxxxxx"/>
                <fencedevice agent="fence_ipmilan" auth="password" ipaddr="10.17.44.135" login="cnmca" name="bob-ipmi" passwd="xxxxxx"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="alice-domain"
ordered="1" restricted="1">
                                <failoverdomainnode name="alice" priority="1"/>
                                <failoverdomainnode name="bob" priority="2"/>
                        </failoverdomain>
                        <failoverdomain name="bob-domain" ordered="1"
restricted="1">
                                <failoverdomainnode name="bob" priority="1"/>
                                <failoverdomainnode name="alice" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <ip address="10.17.44.16" monitor_link="1"/>
                        <ip address="10.17.44.17" monitor_link="1"/>
                </resources>
                <service autostart="1" domain="alice-domain"
name="alice-alias" recovery="relocate">
                        <ip ref="10.17.44.16"/>
                </service>
                <service autostart="1" domain="bob-domain"
name="bob-alias" recovery="relocate">
                        <ip ref="10.17.44.17"/>
                </service>
        </rm>
</cluster>
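
Fencing itself clearly works (see the fence "bob" success line above);
for reference, the IPMI agents can also be tested by hand with
something like:

# fence_ipmilan -a 10.17.44.135 -l cnmca -p xxxxxx -P -o status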


# cat /etc/hosts
127.0.0.1       localhost.localdomain           localhost
172.17.44.116    alice
172.17.44.117    bob


# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:15:17:51:70:38
          inet addr:10.17.44.116  Bcast:10.17.44.255  Mask:255.255.255.0
          inet6 addr: fe80::215:17ff:fe51:7038/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:49984 errors:0 dropped:0 overruns:0 frame:0
          TX packets:83669 errors:0 dropped:0 overruns:0 carrier:0
          collisions:11221 txqueuelen:0
          RX bytes:16151284 (15.4 MiB)  TX bytes:102618030 (97.8 MiB)

eth0      Link encap:Ethernet  HWaddr 00:15:17:51:70:38
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:49984 errors:0 dropped:0 overruns:0 frame:0
          TX packets:83669 errors:0 dropped:0 overruns:0 carrier:0
          collisions:11221 txqueuelen:100
          RX bytes:16151284 (15.4 MiB)  TX bytes:102618030 (97.8 MiB)
          Memory:f9140000-f9160000

eth1      Link encap:Ethernet  HWaddr 00:15:17:51:70:38
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Memory:f91a0000-f91c0000

eth2      Link encap:Ethernet  HWaddr 00:19:99:29:08:8B
          inet addr:172.17.44.116  Bcast:172.17.44.255  Mask:255.255.255.0
          inet6 addr: fe80::219:99ff:fe29:88b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:45 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:1200 (1.1 KiB)  TX bytes:7902 (7.7 KiB)
          Memory:f9200000-f9220000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3541 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3541 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:464552 (453.6 KiB)  TX bytes:464552 (453.6 KiB)

I hope someone out there has already run into this issue.

Thanks in advance.

-- 
Giuseppe



