[Linux-cluster] NFS over GFS problem

Sat Jan 29 16:14:48 UTC 2005

Hello all,

I got the GFS code from CVS on 2004-12-12 (the version for kernel 2.6.9).
I started an NFS server on one GFS node and mounted the exported filesystem
on several machines that are not in the cluster. The NFS server runs FC3;
the NFS clients run RH9. I ran into two problems:

1. When I stopped the NFS server and unmounted GFS while the NFS clients still
had the export mounted, no error occurred. But after I remounted GFS, restarted
the NFS server, remounted NFS on the clients, then unmounted NFS and stopped
the NFS server, the umount of GFS hung, and rebooting the system failed too.
The dmesg output is:

 GFS: fsid=xxx:xxx.3: Unmount seems to be stalled. Dumping lock state...
 Glock (2, 26)
   gl_flags =
   gl_count = 2
   gl_state = 0
   lvb_count = 0
   object = yes
   aspace = 0
   reclaim = no
   Inode:
     num = 26/26
     type = 2
     i_count = 1
     i_flags =
     vnode = yes
 Glock (5, 26)
   gl_flags =
   gl_count = 2
   gl_state = 3
   lvb_count = 0
   object = yes
   aspace = no
   reclaim = no
   Holder
     owner = -1
     gh_state = 3
     gh_flags = 5 7
     error = 0
     gh_iflags = 1 6 7

What do these messages mean? It seems as if the mount/umount sequence is critical.
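
To make a longer dump of this kind easier to read, the following is a minimal Python sketch that splits the text into one record per "Glock (a, b)" entry. It assumes only the layout visible in the dump above ("name = value" lines, plus sub-sections such as "Inode:" and "Holder" that come after the top-level fields). The function name parse_glock_dump and the record keys are hypothetical, and reading the two numbers as lock type and lock number is an assumption.

```python
import re

# Shortened copy of the dump above, used as sample input.
SAMPLE = """\
 GFS: fsid=xxx:xxx.3: Unmount seems to be stalled. Dumping lock state...
 Glock (2, 26)
   gl_flags =
   gl_count = 2
   gl_state = 0
   Inode:
     num = 26/26
     i_count = 1
 Glock (5, 26)
   gl_flags =
   gl_count = 2
   gl_state = 3
   Holder
     owner = -1
     gh_state = 3
"""

GLOCK_RE = re.compile(r"^\s*Glock \((\d+), (\d+)\)")
FIELD_RE = re.compile(r"^\s*(\w+) =\s*(.*)$")


def parse_glock_dump(text):
    """Split a lock-state dump into one dict per Glock entry."""
    glocks = []
    current = None
    for line in text.splitlines():
        m = GLOCK_RE.match(line)
        if m:
            # Assumption: the two numbers are the lock type and lock number.
            current = {
                "type": int(m.group(1)),
                "number": int(m.group(2)),
                "fields": {},    # top-level "name = value" lines
                "sections": [],  # (section name, fields) pairs, in order
            }
            glocks.append(current)
            continue
        if current is None:
            continue  # text before the first Glock entry
        m = FIELD_RE.match(line)
        if m:
            # Fields seen after a section header belong to that section.
            if current["sections"]:
                target = current["sections"][-1][1]
            else:
                target = current["fields"]
            target[m.group(1)] = m.group(2).strip()
        elif line.strip():
            # A non-empty line without " = " starts a sub-section.
            current["sections"].append((line.strip().rstrip(":"), {}))
    return glocks


for g in parse_glock_dump(SAMPLE):
    names = [name for name, _ in g["sections"]]
    print(g["type"], g["number"], g["fields"].get("gl_state"), names)
```

Listing the entries that still carry a "Holder" section is one way to see which locks are still held when the unmount stalls.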

2. When two GFS nodes read the same files (about 1 to 50 files) from the storage
at the same time, the total disk I/O performance is worse than that of a single
node. When one GFS node and one NFS client read the same files, the NFS client's
performance is nearly zero. What's worse, in both cases the "ls" command appears
to block on the GFS directory, on both the GFS nodes and the NFS clients. How can
I speed up "ls", or perhaps the underlying "stat" system call? The GFS filesystem
uses the "lock_dlm" lock protocol. Would the "lock_gulm" protocol, or any other,
improve the situation? Are there any GFS tuning options or mount options that
would resolve the problem?
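
To put numbers on the slow "ls", the following is a minimal Python sketch that times one lstat() call per directory entry, which is roughly what "ls -l" has to do. The function name time_stat_calls is hypothetical, and the sketch only measures latency as seen from one node; it does not show which lock is causing the wait.

```python
import os
import time


def time_stat_calls(directory):
    """Return (name, seconds) for an lstat() of each entry, slowest first."""
    timings = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        start = time.perf_counter()
        try:
            os.lstat(path)
        except OSError:
            continue  # entry disappeared or is not accessible
        timings.append((name, time.perf_counter() - start))
    timings.sort(key=lambda item: item[1], reverse=True)
    return timings
```

Running time_stat_calls() on the same directory from a GFS node and from an NFS client, with and without the other node reading, would show whether the delay is spread over all entries or concentrated on the files being read.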

Thanks for any reply!
Best regards! 
Luckey