kernel 2.6.18-194.3.1 x86_64 and hung tasks

Georgios Magklaras georgios at biotek.uio.no
Tue May 11 18:10:53 UTC 2010


I have an ordinary box (PowerEdge 2950 plus PERC 6/E controller) and I 
have moved to the latest RHEL 5.5 kernel (2.6.18-194.3.1). Since the 
move, certain tasks appear to hang on the box. In one case the system 
died and required a power cycle.

The hangs do not appear when I revert the box to the previous kernel 
(2.6.18-194.el5). Before I look at the storage or other firmware 
issues, can I ask if anyone has seen these symptoms?

Cheers,
GM

From my /var/log/messages:

May 11 11:40:17 cnkeeper kernel: nfsd          D 0000000000000040     0  
5694      1          5695  5693 (L-TLB)
May 11 11:40:17 cnkeeper kernel:  ffff810fe0297970 0000000000000046 
0000000000000286 ffffffff8003da8d
May 11 11:40:17 cnkeeper kernel:  ffff810137ddc000 000000000000000a 
ffff810fe02817a0 ffff810fdaecd0c0
May 11 11:40:17 cnkeeper kernel:  00002ec3045c45bc 0000000000000b01 
ffff810fe0281988 00000003d68e8410
May 11 11:40:17 cnkeeper kernel: Call Trace:
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8003da8d>] 
lock_timer_base+0x1b/0x3c
May 11 11:40:17 cnkeeper kernel:  [<ffffffff887e3e71>] 
:ip_conntrack:__ip_ct_refresh_acct+0x10f/0x152
May 11 11:40:17 cnkeeper kernel:  [<ffffffff80033598>] lock_sock+0x79/0xb2
May 11 11:40:17 cnkeeper kernel:  [<ffffffff800a0abe>] 
autoremove_wake_function+0x0/0x2e
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8001d0e8>] 
tcp_recvmsg+0x29/0xb2a
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8006b011>] 
__switch_to+0xfe/0x22f
May 11 11:40:17 cnkeeper kernel:  [<ffffffff80031c38>] 
sock_common_recvmsg+0x2d/0x43
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8003055b>] 
sock_recvmsg+0x107/0x15f
May 11 11:40:17 cnkeeper kernel:  [<ffffffff800a0abe>] 
autoremove_wake_function+0x0/0x2e
May 11 11:40:17 cnkeeper kernel:  [<ffffffff80048166>] 
__pagevec_release+0x19/0x22
May 11 11:40:17 cnkeeper kernel:  [<ffffffff800cb2c3>] 
shrink_inactive_list+0x88b/0x8d8
May 11 11:40:17 cnkeeper kernel:  [<ffffffff800d00b0>] 
page_referenced_one+0x6a/0xfb
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8003bfea>] 
page_referenced+0x6e/0xe4
May 11 11:40:17 cnkeeper kernel:  [<ffffffff80226c1d>] 
kernel_recvmsg+0x3b/0x4d
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8858dea3>] 
:sunrpc:svc_recvfrom+0x6e/0xe0
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8858f939>] 
:sunrpc:svc_tcp_recvfrom+0x4b0/0x79e
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8008c871>] 
dequeue_task+0x18/0x37
May 11 11:40:17 cnkeeper kernel:  [<ffffffff80062ff8>] 
thread_return+0x62/0xfe
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8003da8d>] 
lock_timer_base+0x1b/0x3c
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8004b36f>] 
try_to_del_timer_sync+0x7f/0x88
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8005bbbb>] 
del_timer_sync+0xc/0x16
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8858ffd9>] 
:sunrpc:svc_recv+0x3b2/0x495
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8008d087>] 
default_wake_function+0x0/0xe
May 11 11:40:17 cnkeeper kernel:  [<ffffffff80064644>] __down_read+0x12/0x92
May 11 11:40:17 cnkeeper kernel:  [<ffffffff886cc5a1>] :nfsd:nfsd+0x0/0x2cb
May 11 11:40:17 cnkeeper kernel:  [<ffffffff886cc694>] :nfsd:nfsd+0xf3/0x2cb
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
May 11 11:40:17 cnkeeper kernel:  [<ffffffff886cc5a1>] :nfsd:nfsd+0x0/0x2cb
May 11 11:40:17 cnkeeper kernel:  [<ffffffff886cc5a1>] :nfsd:nfsd+0x0/0x2cb
May 11 11:40:17 cnkeeper kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11


May 11 11:55:58 cnkeeper kernel: INFO: task sshd:23317 blocked for more 
than 120 seconds.
May 11 11:55:58 cnkeeper kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 11 11:55:58 cnkeeper kernel: sshd          D ffff810fdb2b2de8     0 
23317   8256         23382 23214 (NOTLB)
May 11 11:55:58 cnkeeper kernel:  ffff81036e9bbc58 0000000000000086 
0000000300000000 ffff810137df08d8
May 11 11:55:58 cnkeeper kernel:  0000000000000000 0000000000000001 
ffff810a79f320c0 ffff810fd9f04040
May 11 11:55:58 cnkeeper kernel:  00002fa5c52967a1 0000000004e4e1f5 
ffff810a79f322a8 000000008000d3ba
May 11 11:55:58 cnkeeper kernel: Call Trace:
May 11 11:55:58 cnkeeper kernel:  [<ffffffff80019afe>] 
__follow_mount+0x26/0x7f
May 11 11:55:58 cnkeeper kernel:  [<ffffffff8000ce97>] do_lookup+0x65/0x1e6
May 11 11:55:58 cnkeeper kernel:  [<ffffffff80063c6f>] 
__mutex_lock_slowpath+0x60/0x9b
May 11 11:55:58 cnkeeper kernel:  [<ffffffff80063cb9>] 
.text.lock.mutex+0xf/0x14
May 11 11:55:58 cnkeeper kernel:  [<ffffffff8000cec2>] do_lookup+0x90/0x1e6
May 11 11:55:58 cnkeeper kernel:  [<ffffffff8000a20d>] 
__link_path_walk+0xa01/0xf42
May 11 12:04:23 cnkeeper nrpe[23458]: Connection has taken too long to 
establish. Exiting...
May 11 12:05:23 cnkeeper nrpe[23438]: Connection has taken too long to 
establish. Exiting...
May 11 12:06:18 cnkeeper kernel:  [<ffffffff8000e9e2>] 
link_path_walk+0x42/0xb2
May 11 12:06:52 cnkeeper kernel:  [<ffffffff8000ccb2>] 
do_path_lookup+0x275/0x2f1
May 11 12:06:52 cnkeeper kernel:  [<ffffffff8002368a>] 
__path_lookup_intent_open+0x56/0x97
May 11 12:06:52 cnkeeper kernel:  [<ffffffff8001af23>] open_namei+0x73/0x6d5
May 11 12:06:52 cnkeeper kernel:  [<ffffffff80066b88>] 
do_page_fault+0x4fe/0x874
May 11 12:06:52 cnkeeper kernel:  [<ffffffff800274a4>] 
do_filp_open+0x1c/0x38
May 11 12:06:52 cnkeeper kernel:  [<ffffffff80019dd1>] do_sys_open+0x44/0xbe
May 11 12:06:52 cnkeeper kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
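
For anyone who wants to pull the "blocked for more than N seconds" 
warnings out of a similar log, here is a minimal Python sketch. It 
assumes the syslog file holds one message per line (unlike the 
mail-wrapped excerpt above) and the usual "Mon DD HH:MM:SS host kernel:" 
prefix. The name find_hung_tasks is only illustrative and not part of 
any existing tool.

```python
import re

# Matches the kernel's hung-task warning, for example:
# "May 11 11:55:58 cnkeeper kernel: INFO: task sshd:23317 blocked for more than 120 seconds."
# Assumption: classic syslog timestamp and one message per line.
HUNG_TASK = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"INFO: task (?P<task>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return one dict per hung-task warning found in the given log lines."""
    reports = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match is None:
            continue  # not a hung-task warning (stack frames, nrpe messages, ...)
        reports.append({
            "when": match.group("when"),
            "host": match.group("host"),
            "task": match.group("task"),
            "pid": int(match.group("pid")),
            "secs": int(match.group("secs")),
        })
    return reports
```

It could be run over the whole file with something like 
find_hung_tasks(open("/var/log/messages", errors="replace")), which 
gives a quick list of which tasks were reported and when, so that the 
two kernels can be compared over the same period.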


-- 
Best regards,
--

George Magklaras BSc (Hons) MPhil RHCE
IT Systems Manager/Senior Systems Engineer
The Biotechnology Center of Oslo
University of Oslo

http://www.biotek.uio.no
http://www.no.embnet.org
http://folk.uio.no/georgios




