Mark Hlawatschek hlawatschek at atix.de
Wed Jul 16 18:46:54 UTC 2008


During some stress tests with NFS over GFS, I observed a strange problem.

The test setup consists of two GFS cluster nodes (node1, node2) running 
RHEL 4.6, both serving the same NFS export (/mnt/gfstest).
The export is mounted by two NFS clients (client1, client2): client1 mounts 
it from node1 and client2 mounts it from node2.

During the stress test, client1 creates files in dir1 on the GFS file system 
and client2 creates files in dir2 on the same file system. Node1 continuously 
reads the files created by both clients. After about 10 minutes, the 
following error occurs on node1:

GFS: fsid=axqa01:gfstest.0: fatal: assertion "!bd->bd_pinned && !buffer_busy(bh)" failed
GFS: fsid=axqa01:gfstest.0:   function = ail_empty_gl
GFS: fsid=axqa01:gfstest.0:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/smp/src/gfs/dio.c, line = 383
GFS: fsid=axqa01:gfstest.0:   time = 1216216523
GFS: fsid=axqa01:gfstest.0: about to withdraw from the cluster
GFS: fsid=axqa01:gfstest.0: waiting for outstanding I/O
GFS: fsid=axqa01:gfstest.0: telling LM to withdraw
lock_dlm: withdraw abandoned memory
GFS: fsid=axqa01:gfstest.0: withdrawn
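
To correlate the withdraw with other logs, the fields of the assertion 
message can be pulled out mechanically. The following is only a sketch: the 
function name parse_gfs_assert and the regular expressions are my own 
assumptions about the layout shown above, not part of any GFS tool, and it 
expects each log entry on a single unwrapped line.

```python
import re
from datetime import datetime, timezone

# Assumed layout: "GFS: fsid=<cluster>:<fs>.<journal>: <message>"
LINE_RE = re.compile(r"^GFS: fsid=(\S+?):\s+(.*)$")

def parse_gfs_assert(log_text):
    """Collect fsid, assertion text and 'key = value' fields from GFS lines."""
    info = {}
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. the lock_dlm line
        info["fsid"] = m.group(1)
        msg = m.group(2)
        a = re.search(r'assertion "(.*)" failed', msg)
        if a:
            info["assertion"] = a.group(1)
        # Fields such as "function = ...", "file = ..., line = ...", "time = ..."
        for key, value in re.findall(r"(\w+) = ([^,]+)", msg):
            info[key] = value.strip()
    if "time" in info:
        # The time field is treated as seconds since the Unix epoch.
        ts = datetime.fromtimestamp(int(info["time"]), tz=timezone.utc)
        info["time_utc"] = ts.isoformat()
    return info
```

For the log above, time = 1216216523 decodes to 2008-07-16 13:55:23 UTC.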

Is there a workaround for this problem? Is this a bug?
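
For anyone who wants to try to reproduce this, the workload can be 
approximated with the sketch below. It is not the actual test harness: the 
helper names create_files and read_pass, the file size, and the file naming 
are all assumptions made for illustration. The writer part would run on each 
NFS client against its own directory, and the reader part would run in a 
loop on node1 against both directories on the GFS mount.

```python
import os

def create_files(directory, count, size=4096, prefix="f"):
    """Writer role (client1 or client2): create `count` files in `directory`."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"{prefix}{i:06d}")
        with open(path, "wb") as fh:
            fh.write(os.urandom(size))
            fh.flush()
            os.fsync(fh.fileno())  # push the data out to the server
        paths.append(path)
    return paths

def read_pass(directories):
    """Reader role (node1): read every regular file once.

    Returns (number of files read, total bytes read).
    """
    files = 0
    total = 0
    for d in directories:
        for name in sorted(os.listdir(d)):
            path = os.path.join(d, name)
            if not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    total += len(fh.read())
                files += 1
            except FileNotFoundError:
                # File disappeared between listdir() and open(); skip it.
                continue
    return files, total
```

On a client, something like create_files("<nfs mount>/dir1", 100000) would 
generate the write load, while node1 would call 
read_pass(["/mnt/gfstest/dir1", "/mnt/gfstest/dir2"]) repeatedly. The exact 
paths and file counts are placeholders.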



