[Linux-cluster] cman_tool do a kernel panic

Tue Mar 29 14:44:16 UTC 2005

The BUG you are hitting is a safety check that is compiled in when 
CONFIG_DEBUG_SPINLOCK is set.  It's saying, "Hey, the programmer 
calling this has not initialized the rw lock."

Looking at cluster/cman-kernel/src/membership.c, it looks like 
members_idr_lock was not being initialized, but this was just fixed.
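
To illustrate what that check does, here is a minimal userspace sketch. 
It is not the actual cman or kernel code: demo_rwlock, DEMO_RWLOCK_MAGIC 
and the helper names are made up for illustration, and the magic value 
is arbitrary. In the kernel, the usual remedy is to initialize the lock 
(for example with rwlock_init()) before anything takes it.

```c
#include <assert.h>

/* Illustrative value only; not taken from the kernel headers. */
#define DEMO_RWLOCK_MAGIC 0x5ca1ab1eu

/* Hypothetical stand-in for a debug-enabled rwlock_t. */
struct demo_rwlock {
    unsigned int lock;   /* 0 = free, 1 = held by a writer */
    unsigned int magic;  /* equals DEMO_RWLOCK_MAGIC only after init */
};

/* Plays the role of the lock initializer that was missing. */
static inline void demo_rwlock_init(struct demo_rwlock *rw)
{
    rw->lock = 0;
    rw->magic = DEMO_RWLOCK_MAGIC;
}

/* Returns 0 on success, -1 where the kernel's BUG_ON() would fire. */
static inline int demo_write_lock(struct demo_rwlock *rw)
{
    if (rw->magic != DEMO_RWLOCK_MAGIC)
        return -1;       /* lock was never initialized */
    rw->lock = 1;
    return 0;
}

/* Walks through the failing case and the fixed case. */
void demo_selfcheck(void)
{
    static struct demo_rwlock members_lock;  /* zero-filled, no init yet */

    /* Using the lock before initializing it trips the check. */
    assert(demo_write_lock(&members_lock) == -1);

    /* After initialization the same call succeeds. */
    demo_rwlock_init(&members_lock);
    assert(demo_write_lock(&members_lock) == 0);
}
```

A zero-filled global never carries the magic value, so the first lock 
attempt fails the check. The sketch returns an error where the real 
debug check calls BUG_ON() and stops the kernel.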

  brassow

On Mar 29, 2005, at 6:57 AM, Alban Crequy wrote:

> Hello,
>
> I am testing GFS (CVS version) with a 2.6.11.5-vanilla kernel. I just
> followed usage.txt [1] step by step. I created a small two-node cluster
> (see my /etc/cluster/cluster.conf file [2]).
>
> My problem is a kernel panic [3] when I run "cman_tool join". The
> kernel panic occurs 16 seconds after "cman_tool join" exits
> successfully. I read the source and the panic() is done by
> ./include/asm-x86_64/spinlock.h:180:
>
> static inline void _raw_write_lock(rwlock_t *rw)
> {
> #ifdef CONFIG_DEBUG_SPINLOCK
>         BUG_ON(rw->magic != RWLOCK_MAGIC);
> #endif
>         __build_write_lock(rw, "__write_lock_failed");
> }
>
> Any hints?
>
> -- 
> Alban
>
> [1] http://sources.redhat.com/cluster/doc/usage.txt
>
> [2] /etc/cluster/cluster.conf:
> -------->8-------->8--------
> <?xml version="1.0"?>
> <cluster name="alpha" config_version="3">
>
> <cman>
> </cman>
>
> <clusternodes>
> <clusternode name="sam21.toulouse">
> </clusternode>
> <clusternode name="sam22.toulouse">
> </clusternode>
> </clusternodes>
>
> <fencedevices>
>         <fencedevice name="human" agent="fence_manual"/>
> </fencedevices>
>
> <fence_daemon post_join_delay="12">
> </fence_daemon>
>
> </cluster>
> -------->8-------->8--------
>
> [3] Kernel panic:
> ----------- [cut here ] --------- [please bite here ] ---------
> Kernel BUG at spinlock:179
> invalid operand: 0000 [1] SMP
> CPU 0
> Modules linked in: lock_dlm dlm cman gfs lock_harness md5 ipv6 
> parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc dm_mod video 
> button battery ac uhci_hcd ehci_hcd hw_random e1000 ext3 jbd ata_piix 
> libata sd_mod scsi_mod
> Pid: 3583, comm: cman_comms Tainted: GF     2.6.11.5-alban01
> RIP: 0010:[<ffffffff80337b4c>] <ffffffff80337b4c>{_read_lock+12}
> RSP: 0018:ffff8100364cbd90  EFLAGS: 00010213
> RAX: ffffffff881f3960 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff8100364cbfd8 RSI: ffff81003ad7c680 RDI: ffffffff881f3960
> RBP: 0000000000090000 R08: afe2180f00000000 R09: 0000000000000000
> R10: ffffffff80434d60 R11: 0000000000000004 R12: 0000000000000019
> R13: ffff81003efb1800 R14: ffff8100364cbe78 R15: 0000000000000010
> FS:  00002aaaaaac6b00(0000) GS:ffffffff804e8c00(0000) 
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00002aaaad18e000 CR3: 000000003a0f1000 CR4: 00000000000006e0
> Process cman_comms (pid: 3583, threadinfo ffff8100364ca000, task 
> ffff81003af48920)
> Stack: ffffffff881d875f 0000000000000005 ffffffff881d3cce 
> 000000250000000a
>        000000000001babe ffff810039224b80 ffffffff881f28d0 
> 000000000001e121
>        000000000000001f 000000250000000a
> Call Trace:<ffffffff881d875f>{:cman:find_node_by_nodeid+15} 
> <ffffffff881d3cce>{:cman:cluster_kthread+654}
>        <ffffffff80132630>{default_wake_function+0} 
> <ffffffff8010f1c7>{child_rip+8}
>        <ffffffff881d3a40>{:cman:cluster_kthread+0} 
> <ffffffff8010f1bf>{child_rip+0}
>
> Code: 0f 0b aa 6e 35 80 ff ff ff ff b3 00 f0 83 28 01 0f 88 77 03
> RIP <ffffffff80337b4c>{_read_lock+12} RSP <ffff8100364cbd90>
> <0>Kernel panic - not syncing: Oops
> -------->8-------->8--------