[Linux-cluster] (no subject)

Bas van der Vlies basv at sara.nl
Mon Mar 6 19:59:26 UTC 2006


Our setup is:
  * We are using GFS from the CVS stable branch on our 2.6.14.7
    cluster. We updated today to the newest CVS version and only had
    to change the mutex() calls.
  * The 4 nodes are running Debian sarge.
  * The 4 nodes act as NFS servers for +/- 640 client nodes.
  * Brocade switch with SGI TP9300, 4 controllers (15 TB).

We did a lot of testing with bonnie, iozone and other tools/jobs, and we
could not crash the cluster. Now that the cluster is in production, we
get a lot of nfsd crashes with "EIP is at gfs_create". We had them with
our previous kernel 2.16.4.4 and with this one and the "latest"
CVS stable version. The server keeps running, but the load is high and
it does not respond any more. If we are lucky, only one NFS
thread is gone and the rest are still up. The rest of the nodes still work.

Have other users experienced this kind of problem, and does anyone
have a solution for it?


Regards,


Here is an oops message:
Unable to handle kernel NULL pointer dereference at virtual address 00000038
printing eip:
f89bf999
*pde = 37bff001
*pte = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg ide_floppy ide_cd cdrom qla2300 qla2xxx_conf qla2xxx firmware_class siimage piix e1000 gfs lock_harness dm_mod
CPU:    0
EIP:    0060:[<f89bf999>]    Tainted: GF     VLI
EFLAGS: 00010246   (2.6.14.7-sara1)
EIP is at gfs_create+0xa9/0x1e0 [gfs]
eax: ffffffef   ebx: ffffffef   ecx: 00000001   edx: 00000000
esi: f296e24c   edi: ebf01e18   ebp: ebf01e84   esp: ebf01df8
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 16924, threadinfo=ebf00000 task=ebe84540)
Stack: ebf01e48 f296e24c 00000001 00008180 ebf01e18 00000001 f8cb9000 dd042254
        ebf01e18 ebf01e18 00000000 ebe84540 00000001 00000120 00000000 000000c2
        00000000 00000001 ebf01e40 ebf01e40 ebf01e48 ebf01e48 df0bd858 ebe84540
Call Trace:
[<c0103e5f>] show_stack+0x7f/0xa0
[<c0104012>] show_registers+0x162/0x1d0
[<c0104224>] die+0xf4/0x180
[<c035f697>] do_page_fault+0x2e7/0x6b2
[<c0103b03>] error_code+0x4f/0x54
[<c016b663>] vfs_create+0x83/0xf0
[<c01b81ce>] nfsd_create_v3+0x40e/0x550
[<c01bed2d>] nfsd3_proc_create+0x11d/0x180
[<c01b2f87>] nfsd_dispatch+0xd7/0x200
[<c0353a96>] svc_process+0x536/0x670
[<c01b2d1d>] nfsd+0x1bd/0x350
[<c010127d>] kernel_thread_helper+0x5/0x18
Code: 24 08 8d 45 c4 89 54 24 0c 89 74 24 04 89 04 24 e8 1d c3 fe ff 85 c0 89 c3 0f 84 2e 01 00 00 83 f8 ef 0f 85 13 01 00 00 8b 55 14 <80> 7a 38 00 0f 88 06 01 00 00 89 7c 24 0c 31 c0 8d 55 c4 89 44
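
For anyone reading the dump, here is a minimal Python sketch that cross-checks three values from the oops: the signed value of eax, the faulting instruction marked with <80> in the Code: line, and the reported fault address. The helper names (to_signed32, decode_cmpb_disp8) are hypothetical and exist only for this illustration. The decoder handles just the single instruction form seen here and is not a general x86 disassembler. It only confirms what the dump already implies; it does not identify which pointer in gfs_create was NULL.

```python
import errno

def to_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def decode_cmpb_disp8(insn: bytes):
    """Decode only 'cmpb $imm8, disp8(%reg)': opcode 80 /7 ib, mod=01, no SIB."""
    opcode, modrm, disp8, imm8 = insn
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x80 or mod != 1 or reg != 7 or rm == 4:
        raise ValueError("not a cmpb $imm8, disp8(%reg) instruction")
    base = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"][rm]
    if disp8 >= 0x80:          # disp8 is a signed byte
        disp8 -= 0x100
    return base, disp8, imm8

# Values copied from the oops above.
registers = {"eax": 0xFFFFFFEF, "edx": 0x00000000}
fault_address = 0x00000038
faulting_insn = bytes.fromhex("807a3800")   # bytes starting at the <80> marker

# eax as a signed value, and the matching errno name on this platform.
eax_signed = to_signed32(registers["eax"])
errno_name = errno.errorcode.get(-eax_signed, "unknown")

# Base register plus displacement should reproduce the fault address.
base, disp, imm = decode_cmpb_disp8(faulting_insn)
computed_address = (registers[base] + disp) & 0xFFFFFFFF

print(f"eax = {eax_signed} ({errno_name})")
print(f"cmpb ${imm:#x}, {disp:#x}(%{base}) -> address {computed_address:#010x}")
print("matches fault address:", computed_address == fault_address)
```

The instruction reads one byte at offset 0x38 from edx, and edx is 0, which is consistent with the "NULL pointer dereference at virtual address 00000038" line.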






--
Bas van der Vlies
basv at sara.nl





