Re: ext3 badness in 2.6.0-test2

Neil Brown <neilb cse unsw edu au> wrote:
> > Could have been an IO error, or the block/MD/device layer returned
> > incorrect data.  ext3 used to go BUG a lot in the latter case, but nowadays
> > we try to abort the journal and go read-only.
> > 
> > Without the initial message we do not know.
> Can I add a "me too".....

No.  Go away.

> First, I'm using data=journal - is that supposed to work in 2.6 yet?

I think so.  It's much less tested than ordered mode, but some people have
beat upon it.

> I have a raid5 array across a bunch of SCSI drives and a separate scsi
> drive with boot, swap, and a journal partition.
> I have an ext3 filesystem on the raid5 array with an external journal
> on the journal partition.

Oh.  Good to hear that external journals still work.

> The raid5 was rebuilding a spare and I was pounding the filesystem
> over NFS using the SPEC SFS benchmark program (of course the raid5
> rebuild killed the performance reported by SFS, but I expected that).
> Shortly after the rebuild finished, I got an ext3 error (see log
> below) and the journal aborted, and then nfsd Oopsed inside ext3.

> ...
> Aug  6 15:22:05 adams kernel: EXT3-fs error (device md1): ext3_add_entry: bad entry in directory #41009295: rec_len is smaller than minimal - offset=0, inode=3265411686, rec_len=0, name_len=0

It looks like we had a block full of zeroes come back from the device
driver.  I find it distinctly fishy how this happens so much with
ext3-on-md, and so little with ext3-on-just-a-disk.
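
For anyone wondering why a record with rec_len=0 produces that particular message, here is a rough userspace sketch of the kind of sanity check involved. The struct, macro and function names are hypothetical stand-ins, not the kernel's own code; the only assumption is that a directory entry has an 8-byte header followed by the name, padded to a multiple of 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fixed header of an on-disk directory entry. */
struct dirent_hdr {
	uint32_t inode;
	uint16_t rec_len;   /* total length of this record in bytes */
	uint8_t  name_len;
	uint8_t  file_type;
};

/* 8-byte header plus the name, rounded up to a multiple of 4. */
#define DIR_REC_LEN(name_len) (((name_len) + 8 + 3) & ~3)

/* Returns NULL if the entry looks sane, else a description of the problem. */
static const char *check_entry(const struct dirent_hdr *de)
{
	if (de->rec_len < DIR_REC_LEN(1))
		return "rec_len is smaller than minimal";
	if (de->rec_len % 4 != 0)
		return "rec_len % 4 != 0";
	if (de->rec_len < DIR_REC_LEN(de->name_len))
		return "rec_len is too small for name_len";
	return NULL;
}
```

With rec_len=0 the first test fires whatever the other fields contain, since the smallest possible record (a one-character name) is 12 bytes.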

> Aug  6 15:22:05 adams kernel: Remounting filesystem read-only
> Aug  6 15:22:05 adams kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000

Now that's an ext3 bug. Something like this...

 fs/jbd/transaction.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff -puN fs/jbd/transaction.c~ext3-aborted-journal-fix fs/jbd/transaction.c
--- 25/fs/jbd/transaction.c~ext3-aborted-journal-fix	2003-08-05 23:53:16.000000000 -0700
+++ 25-akpm/fs/jbd/transaction.c	2003-08-05 23:56:47.000000000 -0700
@@ -525,12 +525,18 @@ do_get_write_access(handle_t *handle, st
 			int force_copy, int *credits) 
 {
 	struct buffer_head *bh;
-	transaction_t *transaction = handle->h_transaction;
-	journal_t *journal = transaction->t_journal;
+	transaction_t *transaction;
+	journal_t *journal;
 	int error;
 	char *frozen_buffer = NULL;
 	int need_copy = 0;
 
+	if (is_handle_aborted(handle))
+		return -EROFS;
+
+	transaction = handle->h_transaction;
+	journal = transaction->t_journal;
+
 	jbd_debug(5, "buffer_head %p, force_copy %d\n", jh, force_copy);
 
 	JBUFFER_TRACE(jh, "entry");
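
The shape of the change, pulled out of the kernel into a standalone sketch: test for the aborted state first, and only then chase pointers through the handle. Everything below is a hypothetical model (the struct layouts, handle_aborted() and get_write_access() are illustrative names, and it assumes an aborted handle may carry a NULL transaction pointer); it is not the jbd code itself.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, much-reduced models of the journalling structures. */
struct journal { int id; };
struct transaction { struct journal *t_journal; };
struct handle {
	struct transaction *h_transaction; /* assumed possibly NULL once aborted */
	int h_aborted;
};

static int handle_aborted(const struct handle *h)
{
	return h->h_aborted;
}

/* Returns 0 and stores the journal in *out, or -EROFS for an aborted handle. */
static int get_write_access(struct handle *h, struct journal **out)
{
	struct transaction *transaction;

	/* Bail out before touching h_transaction at all. */
	if (handle_aborted(h))
		return -EROFS;

	transaction = h->h_transaction;
	*out = transaction->t_journal;
	return 0;
}
```

Initialising the locals at their declaration, as the old code did, performs the dereference before any check can run; splitting declaration from assignment is what makes room for the early return.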

