[Linux-cluster] GFS LogVol00cluster.1: withdrawn / rejecting I/O to dead device
tom-fedora at kofler.eu.org
Tue Sep 27 21:16:57 UTC 2005
Hi,
we are building an HA cluster with GFS 6.1 on Fedora Core 4.
Our SAN box had an outage and was then reconnected.
Now we are unable to mount the GFS cluster filesystem.
Sep 27 20:05:19 www5 kernel: scsi2 (0:0): rejecting I/O to dead device
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: fatal: I/O error
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: block = 9498835
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: function = gfs_logbh_wait
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: file = /usr/src/build/607778-i686/BUILD/smp/src/gfs/dio.c, line = 923
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: time = 1127844319
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: about to withdraw from the cluster
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: waiting for outstanding I/O
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: telling LM to withdraw
Sep 27 20:05:19 www5 kernel: lock_dlm: withdraw abandoned memory
Sep 27 20:05:19 www5 kernel: GFS: fsid=xxxcluster:LogVol00cluster.1: withdrawn
Sep 27 20:05:43 www5 kernel: scsi2 (0:0): rejecting I/O to dead device
Sep 27 20:05:43 www5 kernel: Buffer I/O error on device dm-3, logical block 20971504
Sep 27 20:05:43 www5 kernel: scsi2 (0:0): rejecting I/O to dead device
Sep 27 20:05:43 www5 kernel: Buffer I/O error on device dm-3, logical block 20971504
Sep 27 20:52:17 www3 kernel: scsi2 (0:0): rejecting I/O to dead device
Sep 27 20:52:17 www3 kernel: Buffer I/O error on device dm-1, logical block 20971504
Sep 27 20:52:17 www3 kernel: scsi2 (0:0): rejecting I/O to dead device
Sep 27 20:52:17 www3 kernel: Buffer I/O error on device dm-1, logical block 20971504
Sep 27 20:52:17 www3 kernel: scsi2 (0:0): rejecting I/O to dead device
Sep 27 20:52:17 www3 kernel: Buffer I/O error on device dm-1, logical block 0
The "rejecting I/O" and LM withdraw messages did not appear on the third node
(www4), and the LM withdraw did not appear on www3 either.
[root at www4 ~]# mount /mnt/ /dev/VolGroupDaten01/LogVol00cluster -t gfs
mount: /mnt/ is not a block device
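A likely cause of that error: the arguments are reversed. mount expects the device first and the mount point second, so it tries to treat /mnt/ as the block device. Assuming the same device and mount point:

```shell
# Correct argument order: device first, then mount point
mount -t gfs /dev/VolGroupDaten01/LogVol00cluster /mnt
```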
We need to avoid restarting the server nodes - the volume groups are still
visible and access with e.g. fdisk is possible.
On another, standalone server that uses only a non-clustered LVM2 volume,
remounting worked without a reboot.
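One possible recovery path without a full reboot - a hedged sketch, not a verified procedure. The device name /dev/sde and host scsi2 are taken from the pvdisplay and log output in this mail; verify them on each node before running anything:

```shell
# On 2.6 kernels, a SCSI device the kernel marked offline
# ("rejecting I/O to dead device") can sometimes be revived:
cat /sys/block/sde/device/state             # expect "offline" while dead
echo running > /sys/block/sde/device/state  # ask the kernel to revive it
# If that fails, remove and re-add the device (args: host channel id lun):
# echo "scsi remove-single-device 2 0 0 0" > /proc/scsi/scsi
# echo "scsi add-single-device 2 0 0 0" > /proc/scsi/scsi

# A withdrawn GFS mount cannot simply be reused. Unmount it everywhere,
# check the filesystem from exactly one node while it is unmounted
# cluster-wide, then remount:
umount /mnt
gfs_fsck /dev/VolGroupDaten01/LogVol00cluster
mount -t gfs /dev/VolGroupDaten01/LogVol00cluster /mnt
```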
Any help would be really welcome,
Thanks
Thomas
[root at www3 ~]# vgscan
Reading all physical volumes. This may take a while...
Found volume group "VolGroupDaten02" using metadata type lvm2
Found volume group "VolGroupDaten01" using metadata type lvm2
[root at www3 ~]# lvdisplay VolGroupDaten01
--- Logical volume ---
LV Name /dev/VolGroupDaten01/LogVol00cluster
VG Name VolGroupDaten01
LV UUID o38bnG-sLSi-WhUJ-47Bs-3u6g-qSUm-5yBkNr
LV Write Access read/write
LV Status available
# open 0
LV Size 80.00 GB
Current LE 20480
Segments 1
Allocation inherit
Read ahead sectors 0
Block device 253:1
[root at www3 ~]# pvdisplay
...
...
...
--- Physical volume ---
PV Name /dev/sde
VG Name VolGroupDaten01
PV Size 540.00 GB / not usable 0
Allocatable yes
PE Size (KByte) 4096
Total PE 138239
Free PE 117759
Allocated PE 20480
PV UUID oVeByo-8IoA-qFlt-fsN9-ULAR-xUju-niLTEO
[root at www3 ~]# cman_tool status
Protocol version: 5.0.1
Config version: 2
Cluster name: xxxcluster
Cluster ID: 57396
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 3
Expected_votes: 3
Total_votes: 3
Quorum: 2
Active subsystems: 3
Node name: www3.xxx.cc
Node addresses: 192.168.2.23
[root at www3 ~]# cman_tool nodes
Node Votes Exp Sts Name
1 1 3 M www5.xxx.cc
2 1 3 M www4.xxx.cc
3 1 3 M www3.xxx.cc
[root at www3 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster name="xxxcluster" config_version="3">
<clusternodes>
<clusternode name="www5.xxx.cc" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="192.168.2.25"/>
</method>
</fence>
</clusternode>
<clusternode name="www3.xxx.cc" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="192.168.2.23"/>
</method>
</fence>
</clusternode>
<clusternode name="www4.xxx.cc" votes="1">
<fence>
<method name="single">
<device name="human" ipaddr="192.168.2.24"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fence_devices>
<fence_device name="human" agent="fence_manual"/>
</fence_devices>
</cluster>