IO lockups and ext3 readonly filecorruption on RHEL4 (pre a U4)
tweeks
tweeks at rackspace.com
Wed Sep 6 14:23:25 UTC 2006
On Tuesday 05 September 2006 07:34 pm, Christian wrote:
> ok, so ext3 will remount the fs to RO. this would happen if a panic()
> occurs?
These boxes are not panicking. IO (or O actually) seems to come to a complete
stop, the system can't sync.. the journal becomes out of sync.. ext3 freaks
and re-mounts RO, and eventually the system becomes mostly unresponsive (as
no new processes can be properly started). Graceful rebooting becomes a
problem, and eventual reboots find the unsync'd disk very hard to fsck
successfully.
> is there anything related in the logs?
No.. they're read only.
> (if /var is RO too, try
> to setup a loghost).
We may try that as we already have a shared NetDump server set up.
Can I do syslog to BOTH the local machine AND a network syslog server? If the
local logs are locked, will my writing to a remote host still work?
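To make the question concrete: stock syslogd treats each syslog.conf line as
its own action, so a "*.* @loghost" forwarding line can sit next to the usual
local file lines. Whether forwarding survives a hung (rather than merely
read-only) local disk is the open part. Below is a minimal Python sketch of the
same fan-out idea at the application level, not of syslogd itself. The function
name, logger name, path, host and port are all hypothetical.

```python
import logging
import logging.handlers
import socket


def build_logger(local_path, remote_host, remote_port):
    """Return a logger that forwards over UDP syslog and, if possible,
    also appends to a local file. All names here are illustrative."""
    logger = logging.getLogger("dual-log-sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # start clean if called more than once
    logger.propagate = False  # keep records out of the root logger

    # Remote handler is added first, so each record is sent over the
    # network before any local write is attempted.
    remote = logging.handlers.SysLogHandler(
        address=(remote_host, remote_port),
        socktype=socket.SOCK_DGRAM,
    )
    logger.addHandler(remote)

    try:
        # FileHandler opens the file immediately; on a read-only or
        # missing filesystem this raises OSError.
        logger.addHandler(logging.FileHandler(local_path))
    except OSError:
        # Assumption: losing the local copy is acceptable as long as
        # the remote copy still goes out.
        pass

    return logger
```

Usage would be something like build_logger("/var/log/app.log", "loghost", 514)
followed by logger.info(...). A local write that fails does not stop the remote
send, but a local write that blocks forever would still stall every record
after it, which matches the worry above about IO coming to a complete stop.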
> could you be more specific? what does fsck.ext3 say?
It shows thousands of de-linked files being found. But I have not witnessed
this first hand, as I am not in front of the console on these machines. But
I'll ask.
> is there something
> in lost+found?
I'm assuming yes.
> remember to use latest version of e2fsprogs. have you
> tried a vanilla kernel yet?
Well, yes. But since it is thus far not able to be reliably reproduced, it's
hard to tell what works and what doesn't. If anyone who understands the
nature of this problem has any suggestions for reliably triggering it, then
please speak up.
Tim:
You mentioned some type of forced buffer flush patch last month... any ETA on
this?
Tweeks
--
Thomas Weeks, Lead Sys. Engineer The Managed Hosting Specialist(TM)
Rackspace Managed Hosting http://www.rackspace.com/
Managed Service Innovation Team Email:<tweeks_at!rackspace.c0m>
"We Fanatically Support Fanatical Support!" (w)210.447.4451 (f)210.447.4041