[rhelv6-list] Host hung, hung_task_timeout_secs mentioned

Eugene Vilensky evilensky at gmail.com
Tue Aug 16 21:06:50 UTC 2011


On Wed, Jun 15, 2011 at 10:53 AM, Tom Sightler <ttsig at tuxyturvy.com> wrote:
> On Wed, 2011-06-15 at 11:13 -0400, Brian Long wrote:
>> I ran into a server hang last night running 2.6.32-131.2.1.el6.x86_64.
>> I just installed the latest updates (RHEL 6.1) yesterday morning and I
>> experienced the hang during my Amanda backups.  I found a RHEL 5 bug
>> which mentions similar problems but no fix:
>> https://bugzilla.redhat.com/show_bug.cgi?id=605444
>>
>> I had Opsware monitoring the host and it went offline completely for
>> about 1 minute.  Has anyone else experienced this?  I'm running an LSI
>> 8708EM2 RAID controller with battery-backed cache.
>>
>
> We battled a very similar issue on one of our older systems that was
> recently upgraded.  Specifically the system was a Dell 2950 that serves
> as a central backup server.  This server runs NFS and Samba to take
> Oracle RMAN backups, runs BackupPC to back up a number of Linux systems,
> and is a backup target for our VMware backup solution.
>
> During the heavy I/O (somewhat common for a backup server) we would get
> messages similar to what you're seeing.  Interestingly, we have another
> older server that performs virtually the same function but so far it
> hasn't experienced this issue.

We ran into this on a RHEL 5.7 server, using ext4 on an LVM-based
volume backed by an EqualLogic iSCSI device.  These tasks were hung:

dmesg | grep INFO:
INFO: task pdflush:30278 blocked for more than 120 seconds.
INFO: task jbd2/dm-10-8:2788 blocked for more than 120 seconds.
INFO: task httpd:4058 blocked for more than 120 seconds.
INFO: task httpd:4278 blocked for more than 120 seconds.
INFO: task httpd:12351 blocked for more than 120 seconds.
INFO: task sftp-server:19531 blocked for more than 120 seconds.
INFO: task sftp-server:19561 blocked for more than 120 seconds.
INFO: task sftp-server:19562 blocked for more than 120 seconds.
INFO: task sftp-server:19613 blocked for more than 120 seconds.
INFO: task sftp-server:19637 blocked for more than 120 seconds.
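
For anyone wanting to summarize output like the above rather than
eyeball it, here is a minimal Python sketch that groups the hung-task
lines by task name.  The function names (parse_hung_tasks,
count_by_name) and the SAMPLE constant are made up for illustration;
the only thing taken from the log is the "INFO: task NAME:PID blocked
for more than N seconds." line format shown above.

```python
import re
from collections import Counter

# Matches lines like:
#   INFO: task jbd2/dm-10-8:2788 blocked for more than 120 seconds.
# The task name may itself contain '/' and '-', so it is matched greedily
# up to the last ':' that is followed by a numeric PID.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def parse_hung_tasks(log_text):
    """Return a list of (task_name, pid, seconds) tuples found in log_text."""
    tasks = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            tasks.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return tasks


def count_by_name(tasks):
    """Count how many hung-task reports each task name produced."""
    return Counter(name for name, _pid, _secs in tasks)


# Hypothetical sample: the dmesg excerpt from this message, pasted verbatim.
SAMPLE = """\
INFO: task pdflush:30278 blocked for more than 120 seconds.
INFO: task jbd2/dm-10-8:2788 blocked for more than 120 seconds.
INFO: task httpd:4058 blocked for more than 120 seconds.
INFO: task httpd:4278 blocked for more than 120 seconds.
INFO: task httpd:12351 blocked for more than 120 seconds.
INFO: task sftp-server:19531 blocked for more than 120 seconds.
INFO: task sftp-server:19561 blocked for more than 120 seconds.
INFO: task sftp-server:19562 blocked for more than 120 seconds.
INFO: task sftp-server:19613 blocked for more than 120 seconds.
INFO: task sftp-server:19637 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    for name, reports in count_by_name(parse_hung_tasks(SAMPLE)).most_common():
        print(f"{reports:3d}  {name}")
```

In practice the text would come from saved dmesg output or a syslog
file instead of SAMPLE.  Seeing pdflush and the jbd2 journal thread in
the same list as the userspace writers at least suggests they were all
waiting on the same device.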


Not sure if it's related to I/O load, but ext4 sure throws up some warnings:

EXT4-fs (dm-10): delayed block allocation failed for inode 15122461 at
logical offset 0 with max blocks 1 with error -122
This should not happen!!  Data will be lost
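
The number at the end of that warning is a negated Linux errno, and on
Linux errno 122 is EDQUOT ("Disk quota exceeded").  Below is a rough
Python sketch that pulls the device, inode and error code out of such a
message and names the code.  The helper name decode_ext4_error and the
small LINUX_ERRNO_NAMES table are my own for illustration; the table is
hard-coded from the generic Linux errno values so the result does not
depend on the platform the script runs on.  It only decodes the number
and does not tell you why the kernel returned it.

```python
import re

# A few generic Linux errno values that are plausible in filesystem
# warnings. Hard-coded on purpose; other platforms number these differently.
LINUX_ERRNO_NAMES = {
    5: "EIO",        # I/O error
    28: "ENOSPC",    # No space left on device
    30: "EROFS",     # Read-only file system
    122: "EDQUOT",   # Disk quota exceeded
}

# The warning wraps across two lines in the log, so let '.' match newlines.
EXT4_ERR_RE = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\):.*?for inode (?P<inode>\d+)"
    r".*?with error (?P<err>-?\d+)",
    re.DOTALL,
)


def decode_ext4_error(message):
    """Extract device, inode and errno name from an EXT4-fs warning.

    Returns None when the message does not look like the warning above.
    """
    match = EXT4_ERR_RE.search(message)
    if match is None:
        return None
    code = abs(int(match.group("err")))  # the kernel prints errno negated
    return {
        "device": match.group("dev"),
        "inode": int(match.group("inode")),
        "errno": code,
        "name": LINUX_ERRNO_NAMES.get(code, "unknown"),
    }


# The warning from this message, pasted verbatim including the line wrap.
WARNING = (
    "EXT4-fs (dm-10): delayed block allocation failed for inode 15122461 at\n"
    "logical offset 0 with max blocks 1 with error -122\n"
)

if __name__ == "__main__":
    print(decode_ext4_error(WARNING))
```

If that reading is right, it would be worth checking whether quotas are
enabled on the filesystem behind dm-10 before blaming I/O load alone.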


[etc etc]



