[Linux-cluster] segfault if dlm is loaded while cman is still joining the cluster

Jeff jeff at intersystems.com
Mon Aug 2 20:20:45 UTC 2004


Is there a bug tracker somewhere, or should we just post
bug reports to this list?
--------------------------------------------------------------
This is on a dual-CPU box (FC2) with hyperthreading enabled
(i.e. a total of 4 logical CPUs).

If I issue the following commands, typing each one as soon as
the previous one completes, I get a segfault when loading
the dlm. The code is the latest from CVS.

[root@lx4 cluster_orig]# ccsd
[root@lx4 cluster_orig]# cman_tool join
[root@lx4 cluster_orig]# modprobe dlm
Segmentation fault
[root@lx4 cluster_orig]# modprobe dlm
[root@lx4 cluster_orig]# dmesg
<snip>
CMAN: Waiting to join or form a Linux-cluster
CMAN <CVS> (built Aug  2 2004 15:04:09) installed
kmem_cache_create: duplicate cache cluster_sock
------------[ cut here ]------------
kernel BUG at mm/slab.c:1392!
invalid operand: 0000 [#1]
SMP 
Modules linked in: cman parport_pc lp parport autofs4 nfs lockd sunrpc e1000 3c59x floppy sg microcode dm_mod uhci_hcd button battery asus_acpi ac ipv6 ext3 jbd aic7xxx sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c01474f6>]    Not tainted
EFLAGS: 00010202   (2.6.7-clu-smp) 
EIP is at kmem_cache_create+0x4c6/0x660
eax: 00000030   ebx: c22f4770   ecx: c0487c98   edx: 00004ce1
esi: c033a366   edi: f8aa662d   ebp: f51d7b80   esp: f3fb0f5c
ds: 007b   es: 007b   ss: 0068
Process modprobe (pid: 5476, threadinfo=f3fb0000 task=f43ce230)
Stack: c031b3c8 f8aa6620 f51d7c38 0000000a c0000000 ffffff80 00000080 f8aa6620 
       00000080 c0356fe0 f8aae200 c0356fc4 c0356fc4 f88a804e 00002000 00000000 
       00000000 f8aa6605 c013a5c7 f6b7daa0 00000000 40018008 0807a1a0 00ccaffc 
Call Trace:
 [<f88a804e>] cluster_init+0x4e/0x3f9 [cman]
 [<c013a5c7>] sys_init_module+0x107/0x220
 [<c0106e3d>] sysenter_past_esp+0x52/0x71

Code: 0f 0b 70 05 2d ad 31 c0 8b 0b e9 5b ff ff ff 8b 87 b0 00 00 
 DLM <CVS> (built Aug  2 2004 15:04:29) installed
CMAN: sending membership request
CMAN: got node lx3
CMAN: quorum regained, resuming activity
[root@lx4 cluster_orig]#
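
The "duplicate cache cluster_sock" line, together with cluster_init [cman]
in the call trace, suggests that the cman init code tried to create a slab
cache that was already registered. A minimal sketch for checking whether
that cache exists before running modprobe, assuming /proc/slabinfo is
readable and lists one cache name per line in its first column. The helper
name slab_cache_exists is hypothetical and is not part of cman or dlm:

```shell
#!/bin/sh
# Hypothetical helper, not part of the cluster tools.
# Returns 0 if a slab cache named $1 is listed in the slabinfo-format
# file $2 (default: /proc/slabinfo), and 1 otherwise.
slab_cache_exists() {
    name=$1
    file=${2:-/proc/slabinfo}
    # The cache name is the first whitespace-separated field on each line.
    awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$file"
}

# Example use (commented out; reading /proc/slabinfo may require root):
#   if slab_cache_exists cluster_sock; then
#       echo "cluster_sock cache is already registered"
#   fi
```

This only reports whether the cache is present at the moment of the check.
It does not explain why the init code ran a second time.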


The cpuinfo for the 4 CPUs is pretty much the same. Here's
one of them:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) XEON(TM) CPU 1.80GHz
stepping        : 4
cpu MHz         : 1779.842
cache size      : 512 KB
physical id     : 0
siblings        : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 3514.36




