[Linux-cluster] Re: gfs withdrawed in function xmote_bh with ret = 0x00000002
David Teigland
teigland at redhat.com
Fri Jun 16 16:37:53 UTC 2006
On Fri, Jun 16, 2006 at 10:38:58PM +0800, ?????? wrote:
> Hi, all
>
> I am running the latest STABLE cluster code on 3 nodes.
> After about 38 hours I got the following messages on one node:
> <--
> Jun 16 06:01:44 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: fatal: assertion "FALSE" failed
> Jun 16 06:01:44 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: function = xmote_bh
> Jun 16 06:01:44 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: file = /home/sunjw/projects/cluster.STABLE/gfs-kernel/src/gfs/glock.c, line = 1093
> Jun 16 06:01:44 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: time = 1150408904
> Jun 16 06:01:44 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: about to withdraw from the cluster
> Jun 16 06:01:44 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: waiting for outstanding I/O
> Jun 16 06:01:44 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: telling LM to withdraw
> Jun 16 06:01:48 nd04 kernel: lock_dlm: withdraw abandoned memory
> Jun 16 06:01:48 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: withdrawn
> Jun 16 06:01:48 nd04 kernel: GFS: fsid=IPTV:gfs-dm2.1: ret = 0x00000002
> -->
> My test program runs 'df', 'write', 'ls' and 'read',
> and each node is connected directly to the RAID controller's host port over FC.
Hi, I've attached a small patch to print more information and call BUG
instead of withdrawing. It may also be helpful to see a dlm lock dump and
a gfs_tool lockdump on the machine after you hit the BUG.
Thanks,
Dave
-------------- next part --------------
--- ./glock.c.orig 2006-06-16 11:17:48.313980418 -0500
+++ ./glock.c 2006-06-16 11:31:20.617855661 -0500
@@ -30,6 +30,9 @@
#include "quota.h"
#include "recovery.h"
+int dump_glock(struct gfs_glock *gl, char *buf, unsigned int size,
+ unsigned int *count);
+
/* Must be kept in sync with the beginning of struct gfs_glock */
struct glock_plug {
struct list_head gl_list;
@@ -1090,9 +1093,15 @@
spin_unlock(&gl->gl_spin);
} else {
- if (gfs_assert_withdraw(sdp, FALSE) == -1)
- printk("GFS: fsid=%s: ret = 0x%.8X\n",
- sdp->sd_fsname, ret);
+ char *buf;
+ unsigned int junk;
+ printk("GFS: fsid=%s: ret = 0x%.8X prev_state = %d\n",
+ sdp->sd_fsname, ret, prev_state);
+ buf = kmalloc(4096, GFP_KERNEL);
+ memset(buf, 0, 4096);
+ dump_glock(gl, buf, 4096, &junk);
+ printk("%s\n", buf);
+ BUG();
}
if (glops->go_xmote_bh)
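
The debugging pattern used in the patch (allocate a buffer, zero it, let a dump routine fill it, then print it) can be sketched in user space as follows. This is only an illustration under stated assumptions: dump_state() and format_debug_dump() are hypothetical names, with dump_state() standing in for a routine shaped like dump_glock(), and calloc()/free() take the place of the kernel allocator. It is not the GFS code itself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DUMP_BUF_SIZE 4096

/*
 * Hypothetical stand-in for a dump routine shaped like dump_glock():
 * writes a description into buf (at most size bytes, including the
 * terminating NUL) and reports the bytes written through *count.
 */
static int dump_state(int state, char *buf, unsigned int size,
                      unsigned int *count)
{
    int n;

    assert(size > 0);
    n = snprintf(buf, size, "state = %d\n", state);
    if (n < 0)
        return -1;
    /* snprintf returns the untruncated length, so clamp it. */
    *count = ((unsigned int)n < size) ? (unsigned int)n : size - 1;
    return 0;
}

/*
 * Allocate a zeroed buffer and fill it with the dump.
 * The size is passed explicitly everywhere: sizeof(buf) on a char *
 * would give the size of the pointer, not of the allocation.
 * Returns NULL on failure; the caller frees the result.
 */
static char *format_debug_dump(int state, unsigned int *count)
{
    char *buf = calloc(1, DUMP_BUF_SIZE);

    if (buf == NULL)
        return NULL;
    if (dump_state(state, buf, DUMP_BUF_SIZE, count) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

A caller would print the returned string (for example with printf("%s", s)) and then free() it. In kernel code the same shape would use the kernel allocator with an explicit GFP flag and printk.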