aborted journal and kernel bug on RHEL AP 5.1 on SUN AMD 64bit (X4200M2)

Rossoni Fabio Fabio.Rossoni at urmet.it
Wed Jul 16 12:12:31 UTC 2008


 

 

 Hi,

I've run into a strange situation on my SUN X4200M2 servers running
Red Hat Enterprise Linux Advanced Platform 5.1 (Linux fea.localdomain
2.6.18-53.el5 #1 SMP Wed Oct 10 16:34:19 EDT 2007 x86_64 x86_64 x86_64
GNU/Linux). This happens on both internal and external disks (Hitachi
AMS 200 storage, Emulex HBA, and Hitachi HDLM software for multipath).

 

After the problem occurs I am not able to use the server, because files
on the root file system are corrupted:

-rwxr-xr-x 1 root root   14096 Sep  5  2007 rmmod
-rwxr-xr-x 1 root root  521552 Aug  7  2006 rmt
-rwxr-xr-x 1 root root   14648 Jul 13  2006 rngd
-rwxr-xr-x 1 root root   57920 Aug  7  2006 route
-rwxr-xr-x 1 root root    5904 Sep 25  2007 rpc.lockd
-rwxr-xr-x 1 root root   49352 Sep 25  2007 rpc.statd
?--------- ? ?    ?          ?            ? rrestore
?--------- ? ?    ?          ?            ? rrestore.static
-rwxr-xr-x 1 root root   29976 Jan  9  2007 rtmon
-rwxr-xr-x 1 root root    7736 Oct 13  2006 runlevel
-rwxr-xr-x 1 root root   30840 Nov 27  2006 runuser
-rwxr-xr-x 1 root root   10376 Aug 17  2007 salsa
[root@fea sbin]#

 

The file systems have also been remounted read-only.
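
To see exactly which file systems ended up read-only, the mount table
can be scanned. The following is only a minimal sketch: it assumes the
standard /proc/mounts layout (device, mount point, type, options), and
the function name and the sample text are made up for illustration.

```python
def read_only_mounts(mounts_text, fstypes=("ext3",)):
    """Return (device, mountpoint) pairs that are mounted read-only.

    mounts_text is expected to look like the contents of /proc/mounts:
    device, mount point, file system type, comma-separated options, ...
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        # "ro" must be a whole option, not a substring of another one
        if fstype in fstypes and "ro" in options.split(","):
            result.append((device, mountpoint))
    return result


# Hypothetical sample, not taken from the affected server
SAMPLE = (
    "/dev/dm-0 / ext3 ro,data=ordered 0 0\n"
    "/dev/sda1 /boot ext3 rw,data=ordered 0 0\n"
    "proc /proc proc rw 0 0\n"
)

for device, mountpoint in read_only_mounts(SAMPLE):
    print(device, mountpoint)
```

On a live system the same function could be fed the text read from
/proc/mounts instead of the sample string.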

 

The following is an excerpt from the messages file:

Jul 11 16:11:15 fea clurgmgrd[4739]: <notice> Service service:appl-dfdd is disabled
Jul 11 16:29:56 fea clurgmgrd[4739]: <notice> Stopping service service:db-dfdd
Jul 11 16:30:00 fea avahi-daemon[4622]: Withdrawing address record for 10.40.3.40 on eth1.
Jul 11 16:30:11 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:30:11 fea clurgmgrd[4739]: <notice> Service service:db-dfdd is disabled
Jul 11 16:31:44 fea clurgmgrd[4739]: <notice> Starting disabled service service:db-dfdd
Jul 11 16:31:44 fea kernel: kjournald starting.  Commit interval 5 seconds
Jul 11 16:31:44 fea kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Jul 11 16:31:44 fea kernel: EXT3 FS on sddlmab, internal journal
Jul 11 16:31:44 fea kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 11 16:31:44 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:31:44 fea avahi-daemon[4622]: Registering new address record for 10.40.3.40 on eth1.
Jul 11 16:31:48 fea clurgmgrd[4739]: <notice> Service service:db-dfdd started
Jul 11 16:40:23 fea clurgmgrd[4739]: <notice> Stopping service service:db-dfdd
Jul 11 16:40:25 fea avahi-daemon[4622]: Withdrawing address record for 10.40.3.40 on eth1.
Jul 11 16:40:35 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:40:35 fea clurgmgrd[4739]: <notice> Service service:db-dfdd is disabled
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382976
Jul 11 17:13:01 fea kernel: Aborting journal on device dm-0.
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382977
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382978
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382979
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382980
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_orphan_del: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_truncate: Journal has aborted
Jul 11 17:13:02 fea kernel: ext3_abort called.
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
Jul 11 17:13:02 fea kernel: Remounting filesystem read-only
Jul 11 17:27:30 fea clurgmgrd[4739]: <info> State change: feb.iride DOWN
Jul 11 17:27:30 fea clurgmgrd[4739]: <info> State change: /dev/sddlmac UP
Jul 11 17:27:30 fea clurgmgrd[4739]: <info> Waiting for node #2 to be fenced
Jul 11 17:28:50 fea qdiskd[4191]: <info> Node 2 shutdown
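
The EXT3-fs errors in an excerpt like this could be tallied per device
and per kernel function, to make it easier to see where the first
failure was reported. This is just a rough sketch: the function name is
made up, and the pattern only covers the two message shapes that appear
above ("(device X): func:" and "(device X) in func:"). It tolerates the
line wrapping introduced by mail clients.

```python
import re
from collections import Counter

# Matches kernel messages of the two shapes seen in the excerpt:
#   EXT3-fs error (device dm-0): ext3_free_blocks_sb: ...
#   EXT3-fs error (device dm-0) in ext3_truncate: Journal has aborted
# \s+ lets the function name sit on the next line if the log was wrapped.
EXT3_ERROR = re.compile(r"EXT3-fs error \(device ([^)]+)\)(?::| in)\s+(\w+)")


def summarize_ext3_errors(log_text):
    """Count EXT3-fs error messages per (device, kernel function)."""
    counts = Counter()
    for device, function in EXT3_ERROR.findall(log_text):
        counts[(device, function)] += 1
    return counts


def print_summary(log_text):
    """Print one line per (device, function), most frequent first."""
    for (device, function), n in summarize_ext3_errors(log_text).most_common():
        print(f"{device:10s} {function:30s} {n}")
```

Calling print_summary() on the text of /var/log/messages would list
each device and function with its count; EXT3-fs warnings are
deliberately ignored by this pattern.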

 

There is also a kernel bug (soft lockup), logged as follows:

Jul  9 16:57:13 fea syslogd 1.4.1: restart.
/trace
Jul 10 17:41:09 fea kernel: EXT3-fs warning (device sddlmaa): ext3_unlink: Deleting nonexistent file (13353077), 0
Jul 10 18:20:04 fea dlm_controld[4260]: uevent message has 3 args
Jul 10 18:20:04 fea kernel: sb orphan head is 13353077
Jul 10 18:20:04 fea kernel: sb_info orphan list:
Jul 10 18:20:04 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 59479 times
Jul 10 18:20:13 fea kernel: BUG: soft lockup detected on CPU#1!
Jul 10 18:20:13 fea kernel:
Jul 10 18:20:13 fea kernel: Call Trace:
Jul 10 18:20:13 fea kernel:  <IRQ>  [<ffffffff800b50fa>] softlockup_tick+0xd5/0xe7
Jul 10 18:20:13 fea kernel:  [<ffffffff800930e2>] update_process_times+0x42/0x68
Jul 10 18:20:13 fea kernel:  [<ffffffff800746e3>] smp_local_timer_interrupt+0x23/0x47
Jul 10 18:20:13 fea kernel:  [<ffffffff80074da5>] smp_apic_timer_interrupt+0x41/0x47
Jul 10 18:20:13 fea kernel:  [<ffffffff8005bc8e>] apic_timer_interrupt+0x66/0x6c
Jul 10 18:20:13 fea kernel:  <EOI>  [<ffffffff8008d4b6>] vprintk+0x29e/0x2ea
Jul 10 18:20:13 fea kernel:  [<ffffffff8008d554>] printk+0x52/0xbd
Jul 10 18:20:13 fea kernel:  [<ffffffff80061a3f>] out_of_line_wait_on_bit+0x6c/0x78
Jul 10 18:20:13 fea kernel:  [<ffffffff880564f4>] :ext3:ext3_put_super+0x13e/0x1e0
Jul 10 18:20:13 fea kernel:  [<ffffffff800d8e1e>] generic_shutdown_super+0x79/0xfb
Jul 10 18:20:13 fea kernel:  [<ffffffff800d8ec6>] kill_block_super+0x26/0x3a
Jul 10 18:20:13 fea kernel:  [<ffffffff800d8f94>] deactivate_super+0x6a/0x82
Jul 10 18:20:13 fea kernel:  [<ffffffff800e1d13>] sys_umount+0x245/0x27b
Jul 10 18:20:13 fea kernel:  [<ffffffff800b27ae>] audit_syscall_entry+0x14d/0x180
Jul 10 18:20:13 fea kernel:  [<ffffffff8005b28d>] tracesys+0xd5/0xe0
Jul 10 18:20:13 fea kernel:
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 50 times
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink , nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel:   in, nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel:   in, nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel:   in, nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel:   in, nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times

 

I'm planning to reinstall the server ...

 

Can somebody help me?

Thanks a lot

Fabio



--------------------------------------------
INFORMATIVA SULLA PRIVACY
Ai sensi del D.Lgs. 196/2003 si precisa che le informazioni contenute
in questo messaggio e nei suoi eventuali allegati sono riservate e per
uso esclusivo del destinatario. Nessuno, all'infuori dello stesso,
può copiare o distribuire il messaggio, o parte di esso, a terzi.
Chiunque riceva questo messaggio per errore è pregato di distruggerlo
e di informare il mittente.

PRIVACY NOTICE
According to the D.Lgs. 196/2003 this document and its attachments are
confidential and intended for the named addressee(s) only. If you are
not the intended recipient of this message, any use or dissemination
of this message is prohibited. If you have received this document by
mistake, please notify the sender and destroy all physical and/or
electronic copies.

