System reboot under high load/interrupt requests.

Brown, Rodrick R rodrick.r.brown at bofasecurities.com
Fri Mar 2 16:53:42 UTC 2007


The system is running a database, and under heavy load I start getting
the following messages in /var/log/kern:

Feb 15 03:54:55 scsspllqap01 kernel: Losing some ticks... checking if
CPU frequency changed.
Feb 15 03:55:02 scsspllqap01 kernel: warning: many lost ticks.
Feb 15 03:55:02 scsspllqap01 kernel: Your time source seems to be
instable or some driver is hogging interupts
Feb 15 03:55:02 scsspllqap01 kernel: rip __do_softirq+0x4d/0xd0
.
.  <--- Large time gaps, possibly due to the lost-ticks messages
shown above?
.
Mar  1 11:15:42 scsspllqap01 kernel: SCSI error : <0 0 0 0> return code
= 0x20000
Mar  1 11:15:42 scsspllqap01 kernel: end_request: I/O error, dev sda,
sector 745898632
Mar  1 11:15:42 scsspllqap01 kernel: SCSI error : <0 0 0 0> return code
= 0x20000
Mar  1 11:15:42 scsspllqap01 kernel: end_request: I/O error, dev sda,
sector 745898648
Mar  1 11:15:42 scsspllqap01 kernel: SCSI error : <0 0 0 0> return code
= 0x20000
Mar  1 11:15:42 scsspllqap01 kernel: end_request: I/O error, dev sda,
sector 745898664
Mar  1 11:15:42 scsspllqap01 kernel: SCSI error : <0 0 0 0> return code
= 0x20000
Mar  1 11:15:42 scsspllqap01 kernel: end_request: I/O error, dev sda,
sector 745898680
....
.... Pages of messages like these.
....
Mar  1 11:15:42 scsspllqap01 kernel: end_request: I/O error, dev sda,
sector 745899136
Mar  1 13:13:02 scsspllqap01 kernel: klogd 1.4.1, log source =
/proc/kmsg started.
Mar  1 13:13:02 scsspllqap01 kernel: PCI4 _HPP fail=0x5
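
For reference, the "return code = 0x20000" value in the SCSI errors above can be split into its component bytes. The following is a minimal sketch, assuming the usual Linux SCSI result layout (status byte in bits 0-7, message byte in bits 8-15, host byte in bits 16-23, driver byte in bits 24-31). The function names are hypothetical, and the host byte table is reproduced from the kernel's include/scsi/scsi.h as best I recall it, so it should be checked against the source for the running kernel.

```python
# Host byte names as listed in the Linux kernel's include/scsi/scsi.h.
# Assumption: verify this table against the source of the running kernel.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four component bytes."""
    return {
        "status_byte": result & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "host_byte": (result >> 16) & 0xFF,
        "driver_byte": (result >> 24) & 0xFF,
    }


def host_byte_name(result):
    """Return the symbolic name of the host byte, or a hex fallback."""
    host = (result >> 16) & 0xFF
    return HOST_BYTE_NAMES.get(host, "unknown (0x%02x)" % host)


if __name__ == "__main__":
    code = 0x20000  # value reported in the log above
    print(decode_scsi_result(code))
    print(host_byte_name(code))
```

Under that assumption the host byte is 0x02 and the other three bytes are zero, which would mean the error was reported by the host adapter/driver layer rather than as a SCSI status from the disk itself.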

The system seems to have rebooted, possibly because of something
directly related to this problem. I have checked out the storage
subsystem, and there do not appear to be any errors or issues on the
subsystem side.
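
To line up the I/O errors with the logging gaps and the reboot, something like the following could be run over the kernel log. This is only a sketch: the function names, the one-hour gap threshold, and the assumed year (syslog timestamps do not carry one) are my own choices, and it expects the unwrapped log lines as they appear in the file, not as wrapped by the mail client.

```python
import re
from datetime import datetime, timedelta

# Matches classic syslog kernel lines such as:
#   "Mar  1 11:15:42 scsspllqap01 kernel: end_request: I/O error, ..."
LINE_RE = re.compile(
    r"^([A-Z][a-z]{2})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2}) (\S+) kernel: (.*)$"
)


def parse_line(line, year=2007):
    """Return (timestamp, message) for a kernel log line, or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    mon, day, clock, _host, msg = m.groups()
    # The year is an assumption; syslog lines do not include it.
    stamp = "%d %s %d %s" % (year, mon, int(day), clock)
    return datetime.strptime(stamp, "%Y %b %d %H:%M:%S"), msg


def summarize(lines, gap=timedelta(hours=1), year=2007):
    """Count I/O error lines and collect gaps of at least `gap` between entries."""
    io_errors = 0
    gaps = []
    prev = None
    for line in lines:
        parsed = parse_line(line.rstrip("\n"), year)
        if parsed is None:
            continue
        ts, msg = parsed
        if "I/O error" in msg:
            io_errors += 1
        if prev is not None and ts - prev >= gap:
            gaps.append((prev, ts))
        prev = ts
    return io_errors, gaps


if __name__ == "__main__":
    with open("/var/log/kern") as fh:  # path as given in this message
        count, gaps = summarize(fh)
    print("I/O error lines:", count)
    for start, end in gaps:
        print("gap:", start, "->", end, "(%s)" % (end - start))
```

A gap that ends at a "klogd ... started" line marks a restart of logging, so the last entries before it show what the kernel reported before the system went down.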

reboot   system boot  2.6.9-34.ELsmp   Thu Mar  1 13:12          (22:33)


If anyone can provide some insight or help in finding the root cause of
this problem, it would be appreciated.
Thanks.

Linux scsspllqap01 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006
x86_64 x86_64 x86_64 GNU/Linux

 ___
{o,o} Rodrick R. Brown
|)__)	UNIX Platform Operations (SME)
-"-"-	Banc of America Securities LLC.
ORLY? Global Trading Infrastructure (GCIB)
100 West 33rd ST. 3rd Flr. 
New York, NY 10001
Mail Code: NY1-509-03-18
Office: 646 733 4473
Cell: 646-261-5286
 



