[Linux-cluster] GFS2 volumes hanging on 1 of 3 cluster nodes

Heiko Nardmann heiko.nardmann at itechnical.de
Thu Aug 16 06:30:09 UTC 2012


The symptoms were:

- loss of the cluster token, which led to one machine being fenced
- during reboot, multiple failures while mounting the storage (GFS2 as 
well), so it took a long time until the machine was finally up
- after reboot, access to the iSCSI SAN (PowerVault MD3200i) was slow or 
not possible at all (as you described)
- slow file transfers from one of the cluster nodes to a third-party 
machine, not always but sometimes
- a tcpdump capture sometimes showed strange things as well (many 
packets highlighted in red by Wireshark)

I initially thought the first issue was a problem with RHCF. Since we 
purchased RHEL 6.1 together with the machines, I contacted Dell, who 
suggested first running a DSET report for both machines (two-node 
cluster). Dell support then recommended upgrading the NIC firmware.
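
To compare NIC firmware levels across the nodes before and after such an 
upgrade, something like the following might help. It is only a minimal 
sketch, assuming Python 3.7+ and that `ethtool -i <iface>` prints 
`key: value` lines including `driver` and `firmware-version`; the function 
names are made up for illustration, and the interface names passed on the 
command line (e.g. eth0) are placeholders for whatever your nodes use.

```python
import subprocess
import sys


def parse_ethtool_info(text):
    """Parse the 'key: value' lines printed by `ethtool -i <iface>` into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as a PCI
        # bus address (which itself contains colons) stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def nic_info(iface):
    """Run `ethtool -i` for one interface (hypothetical helper) and parse the result."""
    result = subprocess.run(
        ["ethtool", "-i", iface],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ethtool_info(result.stdout)


if __name__ == "__main__":
    # Usage: python nicfw.py eth0 eth1 ...
    for iface in sys.argv[1:]:
        info = nic_info(iface)
        print(iface, info.get("driver"), info.get("firmware-version"))
```

Running it on every node and diffing the output shows quickly whether one 
machine is still on an older firmware than the others.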

Regards,

     Heiko

On 15.08.2012 22:16, lists at verwilst.be wrote:
> Hi Heiko,
>
> I'm using Dell R310s for this setup, which do indeed have Broadcom 
> NICs. I've emailed Dell ProSupport for the location of the latest 
> firmware (that site is a mess :( ). In the meantime, could you tell 
> me how you figured out that the NICs were to blame? What symptoms did 
> you see? Is there something else I can check to see if it's indeed the 
> same issue?
>
> I reformatted the gfs2 filesystems with default 128M journals, just in 
> case that helped. I can now see all gfs2 mounts on all 3 servers. I 
> went on to start clvmd on every node: 2 nodes went fine, the 3rd node 
> gave a timeout. All LVM-based commands on that server now hang, while 
> on the others those commands work fine. *sigh*
>
> Kind regards,
>
> Bart
>
> Heiko Nardmann wrote on 15.08.2012 20:04:
>> Which hardware are you using for your setup? I am just asking because
>> I have experienced similar problems, which were finally solved by
>> updating the NIC firmware of the systems involved (Dell R610 with
>> Broadcom NICs).
>>
>> Regards,
>>
>>     Heiko
>>
>> On 15.08.2012 17:32, lists at verwilst.be wrote:
>>> Hello,
>>>
>>> I've set up a 3-node cluster, where I seem to be having problems 
>>> with some of my GFS2 mounts. All servers have 2 gfs2 mounts on iSCSI 
>>> LUNs, /var/lib/libvirt/sanlock and /etc/libvirt/qemu.
>>>
>>> /dev/mapper/iscsi_cluster_qemu on /etc/libvirt/qemu type gfs2 
>>> (rw,relatime,hostdata=jid=0)
>>> /dev/mapper/iscsi_cluster_sanlock on /var/lib/libvirt/sanlock type 
>>> gfs2 (rw,relatime,hostdata=jid=0)
>>>
>>> Currently, on vm01-test, I cannot access /var/lib/libvirt/sanlock:
>>>
>>> root@vm01-test:~# ls /var/lib/libvirt/sanlock
>>> ^C^C^C
>>> ^C
>>> ^C
>>



