[Linux-cluster] RHCS & snmpd[30622]: Received SNMP packet(s) from UDP

Hofmeister, James (WTEC Linux) james.hofmeister at hp.com
Wed Dec 15 22:41:52 UTC 2010


Hello Lon, all,

|Linux-cluster doesn't generate traps/notifications at this point, so I'd
|guess the HP agent :)
|-- Lon

Yep, we found that the HP Health agent (cmahostd) quit sending SNMP messages during the cluster hang. It was blocked in the kernel:

Dec 10 07:22:24 dm73sr02 kernel: INFO: task cmahostd:31542 blocked for more than 120 seconds.
Dec 10 07:22:24 dm73sr02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 10 07:22:24 dm73sr02 kernel: cmahostd      D ffffffff801508e3     0 31542      1         31576 31540 (NOTLB)
Dec 10 07:22:24 dm73sr02 kernel:  ffff810c3b889cf8 0000000000000086 0000000000000018 ffffffff884414f8
Dec 10 07:22:24 dm73sr02 kernel:  0000000000000292 000000000000000a ffff810c3f54a820 ffff810c4e1b6040
Dec 10 07:22:24 dm73sr02 kernel:  00007122b167f658 0000000000bb9ecb ffff810c3f54aa08 0000000888442e5f
Call Trace:
[<ffffffff884414f8>] :dlm:request_lock+0x93/0xa0
[<ffffffff8846cee3>] :gfs2:just_schedule+0x0/0xe
[<ffffffff8846ceec>] :gfs2:just_schedule+0x9/0xe
[<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
[<ffffffff8846cee3>] :gfs2:just_schedule+0x0/0xe
[<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
[<ffffffff800a0b44>] wake_bit_function+0x0/0x23
[<ffffffff8846cede>] :gfs2:gfs2_glock_wait+0x2b/0x30
[<ffffffff8847b2ba>] :gfs2:gfs2_getattr+0x85/0xc4
[<ffffffff8847b2b2>] :gfs2:gfs2_getattr+0x7d/0xc4
[<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
[<ffffffff800288ec>] vfs_stat_fd+0x32/0x4a
[<ffffffff8000e4db>] free_pages_and_swap_cache+0x67/0x7e
[<ffffffff80083f43>] sys32_stat64+0x11/0x29
[<ffffffff8006153d>] sysenter_tracesys+0x48/0x83
[<ffffffff8006149d>] sysenter_do_call+0x1e/0x76
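
Read bottom-up, the trace suggests cmahostd issued a stat() (sys32_stat64 -> vfs_getattr -> gfs2_getattr) on a GFS2 file and then sat in gfs2_glock_wait behind a DLM lock request, which would explain why it went quiet while the cluster was hung. For anyone who wants to check their own logs for the same symptom, here is a minimal sketch that pulls the "blocked for more than N seconds" reports out of syslog-style lines. It assumes only the line format shown above; the function name find_hung_tasks and the regular expression are illustrative and not part of RHCS or the HP agents.

```python
import re

# Matches the hung-task report format seen in the log above, e.g.
# "Dec 10 07:22:24 dm73sr02 kernel: INFO: task cmahostd:31542 blocked for more than 120 seconds."
# The pattern is an assumption based on that one sample line.
HUNG_TASK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return (timestamp, host, command, pid, seconds) for each hung-task report."""
    hits = []
    for line in lines:
        m = HUNG_TASK_RE.match(line)
        if m:
            hits.append(
                (
                    m.group("stamp"),
                    m.group("host"),
                    m.group("comm"),
                    int(m.group("pid")),
                    int(m.group("secs")),
                )
            )
    return hits


# Two lines copied from the log excerpt above; only the first is a report.
sample = [
    "Dec 10 07:22:24 dm73sr02 kernel: INFO: task cmahostd:31542 "
    "blocked for more than 120 seconds.",
    'Dec 10 07:22:24 dm73sr02 kernel: "echo 0 > '
    '/proc/sys/kernel/hung_task_timeout_secs" disables this message.',
]

for stamp, host, comm, pid, secs in find_hung_tasks(sample):
    print(stamp, host, comm, pid, secs)
```

To run it against a real log, pass an open file object instead of the sample list. The log path depends on the syslog configuration of the node in question.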

Regards, 
      James Hofmeister  
      Hewlett Packard Linux Solutions Engineer 





