[Linux-cluster] Processes in D state

InterNetworX | Hostmaster hostmaster at inwx.de
Mon Jan 3 23:54:45 UTC 2011


Hello,

We are using GFS2, but from time to time processes hang in D state (uninterruptible sleep):

# ps axl | grep D
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
0     0 14220 14219  20   0  19624  1916 -      Ds   ?          0:00 /usr/lib/postfix/master -t
0     0 14555 14498  20   0  16608  1716 -      D+   /mnt/storage/openvz/root/129/dev/pts/0   0:00 apt-get install less
0     0 15068 15067  19  -1  36844  2156 -      D<s  ?          0:00 /usr/lib/postfix/master -t
0     0 16603 16602  19  -1  36844  2156 -      D<s  ?          0:00 /usr/lib/postfix/master -t
4   101 19534 13238  19  -1  33132  2984 -      D<   ?          0:00 smtpd -n smtp -t inet -u -c
4   101 19542 13238  19  -1  33116  2976 -      D<   ?          0:00 smtpd -n smtp -t inet -u -c
0     0 19735 13068  20   0   7548   880 -      S+   pts/0      0:00 grep D
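
A plain `grep D` also matches the header line and the grep process itself, because it looks for a capital D anywhere in the line. A narrower filter could test only the STAT column. This is a minimal sketch: the function name `filter_d_state` is made up for illustration, and it assumes STAT is the third column of the input.

```shell
# Keep the header line plus every row whose STAT column (assumed to be
# column 3) starts with "D", i.e. uninterruptible sleep.
# The function name is hypothetical, chosen only for this example.
filter_d_state() {
    awk 'NR == 1 || $3 ~ /^D/'
}

# Intended use (not run here):
#   ps -eo pid,ppid,stat,wchan:32,args | filter_d_state
```

With procps `ps`, the `-eo pid,ppid,stat,wchan:32,args` format puts STAT in the third column, so the filter no longer reports itself or unrelated commands that happen to contain a "D".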

dmesg shows this message many times:

[11142.334229] INFO: task master:14220 blocked for more than 120 seconds.
[11142.334266] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11142.334310] master        D ffff88032b644800     0 14220  14219 0x00000000
[11142.334315]  ffff88062dd40000 0000000000000086 0000000000000000 ffffffffa02628d9
[11142.334318]  ffff88017a517ef8 000000000000fa40 ffff88017a517fd8 0000000000016940
[11142.334322]  0000000000016940 ffff88032b644800 ffff88032b644af8 0000000b7a517cd8
[11142.334325] Call Trace:
[11142.334340]  [<ffffffffa02628d9>] ? gfs2_glock_put+0xf9/0x118 [gfs2]
[11142.334347]  [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2]
[11142.334353]  [<ffffffffa0261db9>] ? gfs2_glock_holder_wait+0x9/0xd [gfs2]
[11142.334358]  [<ffffffff812e9897>] ? __wait_on_bit+0x41/0x70
[11142.334363]  [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2]
[11142.334367]  [<ffffffff812e9931>] ? out_of_line_wait_on_bit+0x6b/0x77
[11142.334370]  [<ffffffff81066808>] ? wake_bit_function+0x0/0x23
[11142.334376]  [<ffffffffa0261d9e>] ? gfs2_glock_wait+0x23/0x28 [gfs2]
[11142.334383]  [<ffffffffa026b2b0>] ? gfs2_flock+0x17c/0x1f9 [gfs2]
[11142.334386]  [<ffffffff810e735d>] ? virt_to_head_page+0x9/0x2a
[11142.334389]  [<ffffffff810e743e>] ? ub_slab_ptr+0x22/0x65
[11142.334393]  [<ffffffff8112221b>] ? sys_flock+0xff/0x12a
[11142.334396]  [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
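
Because the message repeats many times, it may help to reduce the kernel log to the distinct tasks it reports. The following is a rough sketch, assuming the "INFO: task NAME:PID blocked for more than N seconds." wording shown above; the function name `hung_tasks` is made up for illustration.

```shell
# Read kernel log text on stdin and print one "PID NAME" line for each
# distinct task reported by the hung-task detector.
# The function name is hypothetical, chosen only for this example.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than [0-9][0-9]* seconds.*/\2 \1/p' |
        sort -u
}

# Intended use (not run here):
#   dmesg | hung_tasks
```

For the excerpt above this would print `14220 master`.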

Any idea what is going wrong? Do you need any more information?

Mario



