[Linux-cluster] GFS2 + multipath iscsi problems

Michael O'Sullivan michael.osullivan at auckland.ac.nz
Thu Aug 13 04:17:55 UTC 2009

Hi everyone,

I hope someone can help me. I have created a DRBD device with 2 
servers and presented this device to 2 other servers in a Red Hat cluster 
using 2 iSCSI paths from each DRBD server to each cluster node (i.e., 4 
paths per cluster node, 8 paths in total). I then use multipath so that 
each cluster node identifies the paths as belonging to the same device. 
Finally, I create a GFS2 filesystem on the device. This was all going 
very well and I was experimenting with different settings for the round 
robin behaviour of multipath until I decided to carve the DRBD device 
into smaller chunks. After some playing around I managed this, but now I 
can only get the GFS2 system to mount properly on both cluster nodes if 
the round robin switching parameter (rr_min_io) is set to 1000. I had 
previously been able to use values of 100, 50, 2, 1 and many others, but 
these settings now cause GFS2 to hang or refuse to mount. By looking 
through the various mailing lists I have been able to update to kernel 
2.6.18-162.el5 which has stopped the hanging, but the GFS2 system still 
refuses to mount at times (multiple gfs2_fsck calls seem to help 
sometimes here) and will withdraw after a few IOs (at least that's what 
dmesg tells me). This is pure speculation, but I am wondering if there 
are some timers I need to set to allow GFS2 to coordinate better with 
lower rr_min_io values. I'm happy to provide output, error messages, etc., but 
I'm not sure at this stage what would be useful.

Thanks in advance for any help. Kind regards, Mike O'S

