[Linux-cluster] cman_serviced crashed while rebooting two nodes

Bastian Blank bastian at waldi.eu.org
Fri Mar 4 09:34:02 UTC 2005


Hi folks

While rebooting two nodes, both nodes crashed. Both run Linux 2.6.10 with
cman from 2005-02-06.

One (name: gfs1) with:
| CMAN: nmembers in HELLO message from 4 does not match our view (got 7, exp 8)
| CMAN: node gfs1 has been removed from the cluster : No response to messages
| CMAN: killed by NODEDOWN message
| CMAN: we are leaving the cluster.
| SM: 01000003 sm_stop: SG still joined

The other with the more serious failure:
| Unable to handle kernel NULL pointer dereference at virtual address 0000000c
|  printing eip:
| c025f7fd
| *pde = ma 00000000 pa 55555000
|  [<c025f9b1>] cancel_uevents+0xd1/0x1d0
|  [<c02eae88>] schedule+0x2c8/0x4b0
|  [<c0260689>] process_lstart_done+0x9/0x30
|  [<c0260755>] process_message+0xa5/0xc0
|  [<c025d780>] serviced+0x0/0x180
|  [<c0261fee>] process_nodechange+0x2e/0x60
|  [<c025d8e3>] serviced+0x163/0x180
|  [<c0127a04>] kthread+0x94/0xa0
|  [<c0127970>] kthread+0x0/0xa0
|  [<c0107241>] kernel_thread_helper+0x5/0x14
| Oops: 0000 [#1]
| CPU:    0
| EIP:    0061:[<c025f7fd>]    Not tainted VLI
| EFLAGS: 00010246   (2.6.10-xen-gfs-1) 
| EIP is at cancel_one_uevent+0x48d/0x570
| eax: 00000000   ebx: c1211cac   ecx: 00000001   edx: 00000000
| esi: 00000000   edi: c1211c60   ebp: c0cd1fc4   esp: c0cd1f5c
| ds: 007b   es: 007b   ss: 0069
| Process cman_serviced (pid: 627, threadinfo=c0cd0000 task=c0cac600)
| Stack: c1211c60 c030f600 0000000b ffffffff 80764db0 00000001 c0390ee8 c1211c60 
|        c0390e3c c0390ee8 c025f9b1 c02eae88 c0cd1fa0 c0260689 c0260755 00000000 
|        c0390f00 c0390f00 c0390ee8 00000001 c0cd1fc4 c0cd0000 c13bfebc 00000000 
| Call Trace:
|  [<c025f9b1>] cancel_uevents+0xd1/0x1d0
|  [<c02eae88>] schedule+0x2c8/0x4b0
|  [<c0260689>] process_lstart_done+0x9/0x30
|  [<c0260755>] process_message+0xa5/0xc0
|  [<c025d780>] serviced+0x0/0x180
|  [<c0261fee>] process_nodechange+0x2e/0x60
|  [<c025d8e3>] serviced+0x163/0x180
|  [<c0127a04>] kthread+0x94/0xa0
|  [<c0127970>] kthread+0x0/0xa0
|  [<c0107241>] kernel_thread_helper+0x5/0x14
| Code: 89 f8 8d 54 24 14 c7 44 24 14 00 00 00 00 e8 0b fa ff ff 85 c0 89 c6 0f 85 cb 00 00 00 8b 43 0c e8 89 14 00 00 89 c2 8b 4c 24 14 <8b> 40 0c a8 01 0f 45 f2 85 c9 75 53 85 f6 74 2d 8b 47 14 a8 08 
|  dlm: test: dlm_dir_rebuild_local failed -1

They no longer respond to Xen shutdown requests.
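
A rough way to read the oops on the second node, offered as a sketch rather than a diagnosis: the byte in <..> on the "Code:" line marks the faulting instruction, and `8b 40 0c` decodes to `mov 0xc(%eax),%eax`. With eax = 00000000 in the register dump, that load touches exactly the reported address 0000000c. The helper names below are hypothetical, the Code: line is shortened to the bytes around the marker, and the decoder handles only this one instruction form. It says nothing about which pointer in cman was NULL.

```python
# Shortened copy of the "Code:" line from the oops (bytes around the <..> marker).
CODE_LINE = "Code: 89 c2 8b 4c 24 14 <8b> 40 0c a8 01 0f 45 f2"

# 32-bit x86 register numbering as used in the ModRM byte.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code_line):
    """Return the instruction bytes starting at the one marked with <..>."""
    tokens = code_line.split()[1:]  # drop the "Code:" label
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            tail = [tok[1:-1]] + tokens[i + 1:]
            return [int(t, 16) for t in tail]
    raise ValueError("no <..> marker found")


def decode_mov_load_disp8(insn):
    """Decode only 'mov disp8(%base),%dst': opcode 0x8b, ModRM mod=01, no SIB."""
    opcode, modrm, disp = insn[0], insn[1], insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("instruction form not handled by this sketch")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REGS[reg], REGS[rm], disp


def fault_address(insn, registers):
    """Effective address of the load, given the register dump from the oops."""
    _dst, base, disp = decode_mov_load_disp8(insn)
    return (registers[base] + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    regs = {"eax": 0x00000000}  # from the "eax: 00000000" line of the oops
    insn = faulting_bytes(CODE_LINE)
    print("%08x" % fault_address(insn, regs))
```

So the crash looks like a field read at offset 0xc through a NULL pointer held in eax.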

Bastian

-- 
Lots of people drink from the wrong bottle sometimes.
		-- Edith Keeler, "The City on the Edge of Forever",
		   stardate unknown