[Linux-cluster] gfs2_jadd borked my cluster?
rhurst at bidmc.harvard.edu
Wed Oct 20 16:41:07 UTC 2010
Latest RHEL 5u5 with a four-node cluster:
cman-2.0.115-34.el5_5.3
gfs2-utils-0.1.62-20.el5
kernel-2.6.18-194.17.1.el5
Three nodes are blades; the fourth is a KVM guest.
I executed `gfs2_jadd -j1 /home` to add a fourth journal; it completed successfully with an old=3, new=4 message. I checked on all three nodes with `gfs2_tool journals /home` and they all reported four journals of size 128MB.
I joined the KVM guest to the cluster. I attempted to mount /home and it complained there were only three journals. EH??? So, I unmounted /home on a blade and mounted /home on the KVM guest -- it allowed it to mount.
When I checked the journals on all hosts again, they now reported only 3.
I unmounted /home on the KVM guest and re-mounted it on the blade. It, too, only reports 3 journals now.
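For anyone wanting to repeat the per-node check, here is a minimal sketch of how the journal count could be compared across hosts. It assumes that `gfs2_tool journals` lists each journal on its own line beginning with `journalN`; the helper name `count_journals` is made up for illustration and is not part of gfs2-utils.

```shell
#!/bin/sh
# Hypothetical helper: count journal entries in `gfs2_tool journals` output
# read from stdin. Assumption: one line per journal, starting with "journalN".
count_journals() {
    grep -c '^journal[0-9]'
}

# Example on a node with the filesystem mounted (not run here):
#   gfs2_tool journals /home | count_journals
```

Running the same pipeline on every node that has /home mounted makes it easy to spot a host that disagrees with the others.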
I repeated the process, but the second time around I got a GFS2 filesystem withdrawal dump on the guest. And now the DLM has that channel locked on all nodes with a LEAVE_STOP_WAIT status. I tried fence_node against the guest; it rebooted the node fine, but now the DLM fence is locked with a FAIL_ALL_STOPPED status.
1) Can I clear this issue (obviously without re-booting)?
2) What could possibly have gone wrong with gfs2_jadd?
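To show which groups are stuck, here is a rough sketch that filters `group_tool ls` output down to groups whose state is anything other than `none`. It assumes the usual five-column layout (type, level, name, id, state) with member lists on bracketed lines; the function name `stuck_groups` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "type name state" for every group whose state
# column is not "none". Assumptions: five columns per group line
# (type level name id state); the header line and bracketed member
# lists such as "[1 2 3]" are skipped.
stuck_groups() {
    awk 'NF == 5 && $1 != "type" && $1 !~ /^\[/ && $5 != "none" { print $1, $3, $5 }'
}

# Example on a cluster node (not run here):
#   group_tool ls | stuck_groups
```

Empty output would suggest that every group has settled; any line printed names a group still waiting on a state change.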