From lists at nerdbynature.de Wed Jan 3 19:47:39 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Wed, 3 Jan 2007 19:47:39 +0000 (GMT) Subject: Problem with ext3 filesystem In-Reply-To: <459BA37D.5050809@netropol.de> References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> Message-ID: [please reply on-list, so that everybody can comment] On Wed, 3 Jan 2007, Jan wrote: > thanks, I already moved all data from this device. I did run this test > and got no errors yet. I also did dd directly on the device. No error in > syslog. ...so, the device is fine then? > Now I tried jfs on one device and did my test with 200 mb files > - no errros. ...and JFS is fine too. > I installed a 400GB sata disk with a sata card and tried > this with ext3 and my test script and got also no errors. ...and now ext3 on this device is alsofine? So, that means the problem you had is not reproducible, no? > there are problems with the disks or the cable the controller should > notice this ? Perhaps I should ask at areca ? your 2nd post[0] indeed looked a lot like hardware errors. So yes, if these errors persist/are reproducible, you're probably better off asking the maintainer or the sata folks for known issues... Christian. [0]https://www.redhat.com/archives/ext3-users/2006-December/msg00025.html -- BOFH excuse #125: we just switched to Sprint. From jan at netropol.de Wed Jan 3 20:53:45 2007 From: jan at netropol.de (Jan) Date: Wed, 03 Jan 2007 20:53:45 +0000 Subject: Problem with ext3 filesystem In-Reply-To: References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> Message-ID: <459C17D9.70305@netropol.de> Hi, > > [please reply on-list, so that everybody can comment] sorry, I don't know why I didn't send it to the list ... >> thanks, I already moved all data from this device. I did run this test >> and got no errors yet. I also did dd directly on the device. No error in >> syslog. > > ...so, the device is fine then? I think yes. > >> Now I tried jfs on one device and did my test with 200 mb files >> - no errros. > > ...and JFS is fine too. JFS is now fine. >> I installed a 400GB sata disk with a sata card and tried >> this with ext3 and my test script and got also no errors. > > ...and now ext3 on this device is alsofine? So, that means the problem > you had is not reproducible, no? the problem is only on the 1.6 tb areca sata raid. and there it is reproducible. > >> there are problems with the disks or the cable the controller should >> notice this ? Perhaps I should ask at areca ? > > your 2nd post[0] indeed looked a lot like hardware errors. So yes, if > these errors persist/are reproducible, you're probably better off asking > the maintainer or the sata folks for known issues... o.k. but the problems seems to come only with areca and ext3, not with jfs. strange ... From lists at nerdbynature.de Wed Jan 3 21:07:39 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Wed, 3 Jan 2007 21:07:39 +0000 (GMT) Subject: Problem with ext3 filesystem In-Reply-To: <459C17D9.70305@netropol.de> References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> <459C17D9.70305@netropol.de> Message-ID: On Wed, 3 Jan 2007, Jan wrote: > o.k. but the problems seems to come only with areca and ext3, not with > jfs. strange ... 
I've seen this *many* times on the reiserfs mailing list: ppl are complaining that reiserfs was faulty while other filesystems went OK. And it turned out to be some hardware issue after all. Now, I can't say that I'm 100% sure that the device is to blame, but it seems that some hardware bugs are triggered (not caused) by certain fs operations[0], so if changing the fs "fixes" it - why not. But it's not a very satisfying solution, IMHO.

Christian.

[0] will some fs guru please hit me if this is total gibberish...
--
BOFH excuse #423: It's not RFC-822 compliant.

From tytso at mit.edu Wed Jan 3 21:07:31 2007
From: tytso at mit.edu (Theodore Tso)
Date: Wed, 3 Jan 2007 16:07:31 -0500
Subject: Problem with ext3 filesystem
In-Reply-To: <459C17D9.70305@netropol.de>
References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> <459C17D9.70305@netropol.de>
Message-ID: <20070103210731.GD5491@thunk.org>

On Wed, Jan 03, 2007 at 08:53:45PM +0000, Jan wrote:
> the problem is only on the 1.6 tb areca sata raid. and there it is
> reproducible.

Interesting. I'm using multiple 600+ gigabyte (0.6 TB) ext3 filesystems on the SATA RAID in my home fileserver, and I'm not seeing any problems. (I have 3 TB of space, but for management reasons I elected to divvy up the space into smaller volumes.) I'm using a 2.6.18-rc2 kernel with the Areca ARC-1160 controller, with Areca firmware version 1.41. (I haven't upgraded to 1.43 yet, even though it became available a month or two ago.) It's been working just fine, and I haven't had any issues with it. What Areca firmware version are you running with?

> >
> >> there are problems with the disks or the cable the controller should
> >> notice this ? Perhaps I should ask at areca ?
> >
> > your 2nd post[0] indeed looked a lot like hardware errors. So yes, if
> > these errors persist/are reproducible, you're probably better off asking
> > the maintainer or the sata folks for known issues...
>
> o.k. but the problems seems to come only with areca and ext3, not with
> jfs. strange ...

When you say it's reproducible, has it been reproducible after using mke2fs to reformat the filesystem, perhaps with a manually specified filesystem size? E2fsck should have complained if the filesystem size was larger than the apparent size of the physical volume, but if the Areca firmware somehow screwed up and reported a larger size than what was actually there, then both mke2fs and e2fsck will blindly believe what the BLKGETSIZE64 ioctl returns (they won't use the binary search method of determining the disk size unless the BLKGETSIZE/BLKGETSIZE64 ioctls fail for one reason or another), and if there was some wraparound bug, that would explain what you're seeing.

So the only other thing I can suggest is to double check the filesystem size as reported by dumpe2fs or df, and compare it with the raw volume size as reported by the Areca management interface; do the numbers look sane?

- Ted

From jan at netropol.de Thu Jan 4 09:55:47 2007
From: jan at netropol.de (Jan)
Date: Thu, 04 Jan 2007 09:55:47 +0000
Subject: Problem with ext3 filesystem
In-Reply-To:
References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> <459C17D9.70305@netropol.de>
Message-ID: <459CCF23.4010509@netropol.de>

Overnight I also got JFS errors... I'll try an older kernel.

> On Wed, 3 Jan 2007, Jan wrote:
>> o.k. but the problems seems to come only with areca and ext3, not with
>> jfs. strange ...
>
> I've seen this *many* times on the reiserfs mailing list: ppl are
> complaining that reiserfs was faulty while other filesystems went OK.
> And it turned out to be some hardware issue after all. Now, I can't say
> that I'm 100% sure that the device is to blame, but it seems that some
> hardware bugs are triggered (not caused) by certain fs operations[0], so
> if changing the fs "fixes" it - why not. but it's not a very satisfying
> solution, IMHO.
>
> Christian.
>
> [0] will some fs guru please hit me if this is total gibberish...

From Erik.Andersen at intecbilling.com Fri Jan 5 15:31:09 2007
From: Erik.Andersen at intecbilling.com (Erik Andersen)
Date: Fri, 5 Jan 2007 16:31:09 +0100
Subject: Problem in e2fsck ? read error in journal inode
References: <1B5A4F55CD61554E9443E4A6395EC35C274796@ibrosex01.intecbilling.com>
Message-ID: <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com>

Hi,

I'm experiencing some problems on a hard disk (it crashed for no known reason), and in pursuit of getting some of the data off the disk I'm learning to use the e2fsprogs package. Originally I used version 1.38, but after experiencing segfaults in e2fsck - which are now solved - I upgraded to 1.39. Now I have hit another problem (on another partition):

The partition was formatted as ext3, and debugfs / dumpe2fs showed that the feature 'has_journal' was present (as expected). The e2fsck command gave the following:

# e2fsck -B4096 -b32768 /dev/hda11
e2fsck 1.39 (29-May-2006)
e2fsck: Attempt to read block from filesystem resulted in short read while checking ext3 journal for /var

So it seemed like a problem in the journal. As I could not find any option to tell e2fsck to skip applying (and reading) the journal, I turned the filesystem feature 'has_journal' off using debugfs:

# debugfs -b4096 -s32768 -w /dev/hda11
debugfs 1.39 (29-May-2006)
debugfs:  feature -has_journal
Filesystem features: filetype sparse_super
debugfs:  show_super_stats -h
Filesystem volume name:   /var
Last mounted on:
Filesystem UUID:          2e8920a2-0460-4a87-b729-af812327fce7
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      filetype sparse_super
Default mount options:    (none)
Filesystem state:         not clean
Errors behavior:          Continue
:

Now trying e2fsck again did not make any difference:

# e2fsck -B4096 -b32768 -y /dev/hda11
e2fsck 1.39 (29-May-2006)
e2fsck: Attempt to read block from filesystem resulted in short read while checking ext3 journal for /var

I also tried to get tune2fs to turn off the journalling, but got the response below. First I turned 'has_journal' back on using debugfs:

# debugfs -b4096 -s32768 -w /dev/hda11
debugfs 1.39 (29-May-2006)
debugfs:  feature has_journal
Filesystem features: has_journal filetype sparse_super
debugfs:  quit

Then used tune2fs:

# tune2fs -O ^has_journal /dev/hda11
tune2fs 1.39 (29-May-2006)
tune2fs: Attempt to read block from filesystem resulted in short read while reading journal inode

So my questions are:
1) How can I make e2fsck skip reading a faulty journal (in my case there might be a HW error on the block)?
2) What makes e2fsck act on a journal (is it because the journal inode is set)?
3) Shouldn't e2fsck act on whether the filesystem feature is set (and in case of no 'has_journal', just ignore any journal information - of course it still needs to make sure the inode used for the journal isn't used by anybody else)?
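For completeness, the kind of last-resort workaround I have in mind - completely untested on my side, and something I would only ever run against an image copy, never the real partition. The /spare path below is just an example location with enough free space, and <8> is the reserved ext3 journal inode:

# dd if=/dev/hda11 of=/spare/hda11.img bs=4096 conv=noerror,sync
# cp /spare/hda11.img /spare/hda11.img.bak
# debugfs -w /spare/hda11.img
debugfs:  feature -has_journal
debugfs:  clri <8>
debugfs:  quit
# e2fsck -f -B4096 -b32768 /spare/hda11.img

If e2fsck then runs through, a journal could presumably be recreated on the repaired image with 'tune2fs -j', and the data copied off a read-only loop mount. But I have no idea whether clearing inode 8 like this is actually safe, hence question 3 above.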
It was a bit long, if you need any more info - please let me know One problem is that I have problems reading the raw partition 'dev/hda11' - I tried to 'dd' it but it failed... Regards Erik Haukj?r Andersen -- This e-mail and any attachments are confidential and may also be legally privileged and/or copyright material of Intec Telecom Systems PLC (or its affiliated companies). If you are not an intended or authorised recipient of this e-mail or have received it in error, please delete it immediately and notify the sender by e-mail. In such a case, reading, reproducing, printing or further dissemination of this e-mail or its contents is strictly prohibited and may be unlawful. Intec Telecom Systems PLC does not represent or warrant that an attachment hereto is free from computer viruses or other defects. The opinions expressed in this e-mail and any attachments may be those of the author and are not necessarily those of Intec Telecom Systems PLC. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruno at wolff.to Sat Jan 6 04:06:18 2007 From: bruno at wolff.to (Bruno Wolff III) Date: Fri, 5 Jan 2007 22:06:18 -0600 Subject: Problem in e2fsck ? read error in journal inode In-Reply-To: <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> References: <1B5A4F55CD61554E9443E4A6395EC35C274796@ibrosex01.intecbilling.com> <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> Message-ID: <20070106040618.GA22262@wolff.to> On Fri, Jan 05, 2007 at 16:31:09 +0100, Erik Andersen wrote: > > So my questios are: > 1) How can I make e2fsck skip reading a faulty journal (in my case there might be a HW error on the block) ? > 2) What makes e2fsck act on a journal (is it because journal inode is set) ? > 3) Shouldn't e2fsck act on wether the filesystem features (and in case of no 'has_journal' just ignore > any journal information - of course it still need to make sure the inode used for the journal isn't > used by anybody else) ? This is a safety feature to make sure you don't shoot yourself in the foot. If you are willing to throw away the changes in the journal that haven't been committed to the normal locations yet, then you should be able to make some changes to the journal to make it look like it is empty. You might even be able to get away with just writing over the bad block. However, you really should make an image of this partition before doing any writes to it. I don't know what changes to make to the journal to make it appear empty. From Erik.Andersen at intecbilling.com Sat Jan 6 10:47:52 2007 From: Erik.Andersen at intecbilling.com (Erik Andersen) Date: Sat, 6 Jan 2007 11:47:52 +0100 Subject: Problem in e2fsck ? read error in journal inode References: <1B5A4F55CD61554E9443E4A6395EC35C274796@ibrosex01.intecbilling.com> <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> <20070106040618.GA22262@wolff.to> Message-ID: <1B5A4F55CD61554E9443E4A6395EC35C27479A@ibrosex01.intecbilling.com> I understand the danger of not applying the journal, but I understand that what I will loose is 'only' the most recent changes in the filesystem. Also I agree that the default behaviour of e2fsck should be to apply the journal if it exists, No doubt about that. But as e2fsck is ment as a tool for restoration of a damaged filesystem I expected it to be able to bypass (or ignore) problems which prevents the action of the following parts. 
My disk/partition has the problem (which seems like a hardware read-error) located in the inode where the journal is, so I cannot apply the journal, Because of this I would like to skip applying the journal and checking the inode used for the journal. One way is, using debugfs, to set the appropriate attributes of the superblock so it looks like there is no journal (I thought it was the Filesystem Feature 'has_journal' which should not be set, but it seems that there are more attribute that needs fiddling..,). Another way, was if there was an option to 'e2fsck' which made it ignore the journal (say '-ij'), it would let e2fsck read the superblock, but not attempt to do anything with the journal (including reading the journal inode), e2fsck could then restore what it can. /Erik Haukjaer Andersen -----Original Message----- From: Bruno Wolff III [mailto:bruno at wolff.to] Sent: Sat 06-01-2007 05:06 To: Erik Andersen Cc: ext3-users at redhat.com; tytso at mit.edu Subject: Re: Problem in e2fsck ? read error in journal inode On Fri, Jan 05, 2007 at 16:31:09 +0100, Erik Andersen wrote: > > So my questios are: > 1) How can I make e2fsck skip reading a faulty journal (in my case there might be a HW error on the block) ? > 2) What makes e2fsck act on a journal (is it because journal inode is set) ? > 3) Shouldn't e2fsck act on wether the filesystem features (and in case of no 'has_journal' just ignore > any journal information - of course it still need to make sure the inode used for the journal isn't > used by anybody else) ? This is a safety feature to make sure you don't shoot yourself in the foot. If you are willing to throw away the changes in the journal that haven't been committed to the normal locations yet, then you should be able to make some changes to the journal to make it look like it is empty. You might even be able to get away with just writing over the bad block. However, you really should make an image of this partition before doing any writes to it. I don't know what changes to make to the journal to make it appear empty. -- This e-mail and any attachments are confidential and may also be legally privileged and/or copyright material of Intec Telecom Systems PLC (or its affiliated companies). If you are not an intended or authorised recipient of this e-mail or have received it in error, please delete it immediately and notify the sender by e-mail. In such a case, reading, reproducing, printing or further dissemination of this e-mail or its contents is strictly prohibited and may be unlawful. Intec Telecom Systems PLC does not represent or warrant that an attachment hereto is free from computer viruses or other defects. The opinions expressed in this e-mail and any attachments may be those of the author and are not necessarily those of Intec Telecom Systems PLC. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicdnicd at gmail.com Wed Jan 10 09:43:48 2007 From: nicdnicd at gmail.com (Nickel Cadmium) Date: Wed, 10 Jan 2007 10:43:48 +0100 Subject: Can't mount /home anymore Message-ID: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> Hi! I'm new to the list. I have a problem with mounting my home directory since my PC crashed. I hope that I can get some help on this list as I don't know much of ext3 myself. 
The mount command for my /home gives me the following output: # mount /dev/sda6 /mnt/tmp/ mount: wrong fs type, bad option, bad superblock on /dev/sda6, missing codepage or other error In some cases useful info is found in syslog - try dmesg | tail or so The partition /dev/sda6 is an ext3 file system. It also tried to specify the file system type with -text3 and to use a backup superblock with the sb option but it does not help. I tried to run fsck but it does not fix the problem either. Here is the output of fsck: # fsck.ext3 /dev/sda6 e2fsck 1.39 (29-May-2006) Group descriptors look bad... trying backup blocks... Inode bitmap for group 522 is not in group. (block 3271884801) Relocate? yes fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home I searched the Internet and I found a small Windows application that could see my /home partition and files. So I can't beleive there is nothing under Linux to recover my files. I'll appreciate any help on this topic because I tried anything I could think of by myself and I still can't mount my home. Any suggestions? Best wishes! Cd -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvolaski at aecom.yu.edu Sat Jan 13 02:31:38 2007 From: mvolaski at aecom.yu.edu (Maurice Volaski) Date: Fri, 12 Jan 2007 21:31:38 -0500 Subject: [Q] How can the directory location to dd output affect performance? In-Reply-To: <20070110170011.5FC9973674@hormel.redhat.com> References: <20070110170011.5FC9973674@hormel.redhat.com> Message-ID: I have two Opteron-based Tyan systems being supported by PCI-e Areca cards. There is definitely an issue going on in the two systems that is causing significantly degraded performance of these cards. It appeared, initially, that the SATA backplane on the Tyan chassis was wholly to blame. But then I made an odd discovery. I'm running from the Ubuntu LiveCD for 64-bit. It uses kernel 2.6.19-7 and the RAID drives are formatted as ext3. The benchmark command is dd if=/dev/zero of=output oflag=sync bs=100M count=1 My root is organized has a /maurice directory and a /maurice/drbd directory and initially I had changed to that directory to run the benchmark. In here, the speeds were slow, averaging about 40 MB/second. When I happened to run it from /, I suddenly began getting about 70 MB/second. So in some bizarre fashion, the location to where the output of dd is directed to dramatically impacts the performance. I have run from other directories and the performance varies depending on which directory I'm in. Can anyone explain this? -- Maurice Volaski, mvolaski at aecom.yu.edu Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University From lists at nerdbynature.de Sun Jan 14 02:54:27 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Sun, 14 Jan 2007 02:54:27 +0000 (GMT) Subject: Problem in e2fsck ? read error in journal inode In-Reply-To: <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> References: <1B5A4F55CD61554E9443E4A6395EC35C274796@ibrosex01.intecbilling.com> <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> Message-ID: On Fri, 5 Jan 2007, Erik Andersen wrote: > One problem is that I have problems reading the raw partition 'dev/hda11' - I tried > to 'dd' it but it failed... Probably too late, but just for the record: if dd(1) fails, $FILESYSTEM can't do much about it: try dd_rescue, then use fsck on this image. If you have even more space: make a backup copy of the dd_rescue'd data before using fsck.... 
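Something along these lines (untested here, and the paths are only examples - check dd_rescue's help first, since its options differ between versions):

# dd_rescue /dev/hda11 /space/hda11.img
# cp /space/hda11.img /space/hda11.img.bak
# e2fsck -f /space/hda11.img
# mount -o loop,ro /space/hda11.img /mnt/rescue

e2fsck will probably warn that the image is not a block device; proceeding is fine, since it's only a copy.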
-- BOFH excuse #99: SIMM crosstalk. From lists at nerdbynature.de Sun Jan 14 03:00:13 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Sun, 14 Jan 2007 03:00:13 +0000 (GMT) Subject: Can't mount /home anymore In-Reply-To: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> References: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> Message-ID: On Wed, 10 Jan 2007, Nickel Cadmium wrote: > # fsck.ext3 /dev/sda6 > e2fsck 1.39 (29-May-2006) > Group descriptors look bad... trying backup blocks... > Inode bitmap for group 522 is not in group. (block 3271884801) > Relocate? yes > > fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home ...and after this message, fsck.ext3 just stops? What's the exit code of fsck.ext3? (e.g. 'fsck.ext3 /dev/sda6; echo $?'). Try "fsck.ext3 -v" for more details. Is there anything related in your syslog? Can you dd(1) the device (read! not write! :)) without errors? Which kernel/arch are you running? Christian. -- BOFH excuse #99: SIMM crosstalk. From lists at nerdbynature.de Sun Jan 14 03:12:43 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Sun, 14 Jan 2007 03:12:43 +0000 (GMT) Subject: [Q] How can the directory location to dd output affect performance? In-Reply-To: References: <20070110170011.5FC9973674@hormel.redhat.com> Message-ID: On Fri, 12 Jan 2007, Maurice Volaski wrote: > the RAID drives are formatted as ext3. The benchmark command is dd > if=/dev/zero of=output oflag=sync bs=100M count=1 ------------------^ > My root is organized has a /maurice directory and a /maurice/drbd directory > and initially I had changed to that directory to run the benchmark. In here, > the speeds were slow, averaging about 40 MB/second. > When I happened to run it from /, I suddenly began getting about 70 > MB/second. So in some bizarre fashion, the location to where the output of dd > is directed to dramatically impacts the performance. I have run from other > directories and the performance varies depending on which directory I'm in. Strange indeed. Only thing that comes to mind is: you're specifying the output file not as an absolute path, but relative: the directories (and its contents) are distributed all over the disk: some may "live" in the inner part of the plattern, some in the outer part - and different areas have different speeds. I've never encountered this and I could be dead wrong, but I'd suggest to specify the same 'of=/path/to/output' - I could imagine that it's more likely that for the next benchmark the filesystem uses the same on-disk location...no? Christian. -- BOFH excuse #12: dry joints on cable plug From nicdnicd at gmail.com Sat Jan 20 11:01:14 2007 From: nicdnicd at gmail.com (Nickel Cadmium) Date: Sat, 20 Jan 2007 12:01:14 +0100 Subject: Can't mount /home anymore In-Reply-To: References: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> Message-ID: <9ec348a90701200301t11d1c133m926b38a8a1326b31@mail.gmail.com> Hi Christian (& all)! Thanks for the reply. I was away for some time but here is the extra information you requested. Yes, after the message "fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home", fsck just stops. The command 'fsck.ext3 /dev/sda6; echo $?' returns the value 8. Looking at the man page for fsck, I found that this is an "Operational error". I have totally no clue what this means. With fsck, nothing is reported in the syslog file. 
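(For anyone following along: the fsck man page describes the exit status as a bit mask, so the value is the sum of whatever applies - roughly 0 = no errors, 1 = errors corrected, 2 = reboot needed, 4 = errors left uncorrected, 8 = operational error. So

# fsck.ext3 /dev/sda6; echo $?
8

seems to mean that e2fsck gave up before it could check or repair anything, rather than that it found and left specific errors.)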
If I try mounting the partition, I get the following errors reported: Jan 20 11:43:57 localhost kernel: EXT3-fs error (device sda6): ext3_check_descriptors: Inode bitmap for group 522 not in group (block 3271884801)! Jan 20 11:43:57 localhost kernel: EXT3-fs: group descriptors corrupted ! I could dd the partition without errors. I did copy the partition two times already, I order to be able to try some recovery on it. With converting a copy to ext2 and running "fsck.ext2 -v -y" on it (in something like two days), I was able to get some files (all?) in the lost+found. However, the file names are lost and the directory structure as well. It's hard to tell which file is what. I'm really wondering if there is a way to mount that partition again. I run Mandriva on a Pentium PC. My kernel is 2.6.17-5mdv. However, I first thought than my /home problem was some kind of booting problem. Thus I upgraded from Mandriva 2006 to Mandriva 2007. This means that I don't know what my kernel was when the problem occurred. It should be 2.6.12 as this was a straight out-of-the-box installation. My fsck version is "e2fsck 1.39". Best wishes, Cd On 1/14/07, Christian Kujau wrote: > > On Wed, 10 Jan 2007, Nickel Cadmium wrote: > > # fsck.ext3 /dev/sda6 > > e2fsck 1.39 (29-May-2006) > > Group descriptors look bad... trying backup blocks... > > Inode bitmap for group 522 is not in group. (block 3271884801) > > Relocate? yes > > > > fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home > > ...and after this message, fsck.ext3 just stops? What's the exit code of > fsck.ext3? (e.g. 'fsck.ext3 /dev/sda6; echo $?'). Try "fsck.ext3 -v" for > more details. Is there anything related in your syslog? Can you dd(1) > the device (read! not write! :)) without errors? > > Which kernel/arch are you running? > > Christian. > -- > BOFH excuse #99: > > SIMM crosstalk. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sumanvg at cdactvm.in Wed Jan 24 04:57:05 2007 From: sumanvg at cdactvm.in (Suman V G) Date: Wed, 24 Jan 2007 10:27:05 +0530 Subject: ext3 journal from windows Message-ID: <001801c73f74$185972e0$0c1d10ac@rccf012> hai is there any way to view the contents of ext3 journal from windows? regards suman ______________________________________ Scanned and protected by Email scanner -------------- next part -------------- An HTML attachment was scrubbed... URL: From adilger at clusterfs.com Wed Jan 24 22:51:37 2007 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 24 Jan 2007 15:51:37 -0700 Subject: ext3 journal from windows In-Reply-To: <001801c73f74$185972e0$0c1d10ac@rccf012> References: <001801c73f74$185972e0$0c1d10ac@rccf012> Message-ID: <20070124225137.GT5236@schatzie.adilger.int> On Jan 24, 2007 10:27 +0530, Suman V G wrote: > is there any way to view the contents of ext3 journal from windows? > regards Debugfs allows dumping the journal to a file, and I _think_ e2fsprogs can be compiled under windows. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From caphrim007 at gmail.com Sun Jan 28 22:38:12 2007 From: caphrim007 at gmail.com (Tim Rupp) Date: Sun, 28 Jan 2007 16:38:12 -0600 Subject: filesystem becomming read only Message-ID: <45BD25D4.4030209@gmail.com> Hi list, I'm looking for advice/help in tracking down a problem with a new system I've purchased. I have a beige box server with a Gigabyte GA-M51GM-S2G motherboard. It has the nVidia MCP51 SATA controller with 3 250 gig Western Digital hard drives attached to it. 
It seems that when doing a considerable amount of file writing, the filesystem will become read-only. See attached dmesg output. I started looking for help on the nvnews forums, and found a suggestion to set the pci=nommconf kernel parameter. This did not help. Aside from that, there have only been suggestions such as "its likely faulty hardware". kernel 2.6.17-10-generic #2 SMP running on Ubuntu Edgy Eft, amd64 version; but the same problem showed up on Fedora Core 6, both x86_64 and i386. I checked to see if it was perhaps bad memory by running memtest86+, but after 14 hours no errors were found. I've run badblocks on the disk that contains the ext3 partition and no errors were found. Aside from badblocks, I'm not aware of any disk tools I could use to test further. smartmon tools report that all 3 of the disks are OK. The bulk of the data being sent to the machine is via the network using the application Unison, version 2.13.16 if that makes any difference. I haven't tried another suggestion to set the kernel paramter idle=poll, but since nothing else has worked so far, I don't see that making much difference. Also I haven't tried installing Windows to isolate the "faulty hardware" suggestion. Bad hardware would suggest that Windows would see problems too right? Any help would be greatly appreciated. I'm at the end of my rope on this one. Thanks in advance, Tim -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: dmesg.output URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: lspci.output URL: From tytso at mit.edu Sun Jan 28 23:10:13 2007 From: tytso at mit.edu (Theodore Tso) Date: Sun, 28 Jan 2007 18:10:13 -0500 Subject: filesystem becomming read only In-Reply-To: <45BD25D4.4030209@gmail.com> References: <45BD25D4.4030209@gmail.com> Message-ID: <20070128231013.GB2442@thunk.org> On Sun, Jan 28, 2007 at 04:38:12PM -0600, Tim Rupp wrote: > > I'm looking for advice/help in tracking down a problem with a new system > I've purchased. > > I have a beige box server with a Gigabyte GA-M51GM-S2G motherboard. It > has the nVidia MCP51 SATA controller with 3 250 gig Western Digital hard > drives attached to it. > > It seems that when doing a considerable amount of file writing, the > filesystem will become read-only. See attached dmesg output. According to the dmesg output, the filesystem is getting remounted read-only because the kernel detected an inconsistency in the block allocation bitmaps. Basically, a block that was in use and getting freed (due to a file getting deleted) was found to be already marked as not in use in the block bitmap. This is very dangerous, since a corrupted block allocation bitmap can result in data loss when a block gets used by two different files, and the contents of part of the first file gets overwritten by the second. Hence, ext3 remounted the filesystem read-only in order to protect your data from getting (more) corrupted. The question then is why is this happening. If you run e2fsck and it finds nothing wrong, then that means it was the in-core memory that was corrupted --- so the data was correct on disk, but when it was read from disk to memory, it had gotten corrupted somehow (another good reason for ext3 to mark the filesystem read-only; to prevent the corrupted data from getting written back to disk). 
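(As an aside, what ext3 does when it hits this kind of error is a per-filesystem setting, so it's worth knowing how to inspect it - a quick sketch, with a placeholder device name:

# dumpe2fs -h /dev/sdb1 | grep -i 'errors behavior'
# tune2fs -e remount-ro /dev/sdb1

Valid values for -e are continue, remount-ro and panic, and the same thing can be forced at mount time with -o errors=remount-ro. None of that fixes the underlying corruption, of course; it only controls how the filesystem reacts to it.)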
In any case, given that you've checked the memory, it does rather seem to narrow it down to either SATA cables, the disk drives, or the SATA controller, roughly in that order of probability. The SATA cables are probably the cheapest to try replacing first. I suppose there is a chance that there it's a hardware device driver or kernel issue. You might want to ask on LKML or on the Ubuntu support forums if there are any known issues wit the nVidia SATA controller driver. Good luck, - Ted From caphrim007 at gmail.com Mon Jan 29 00:05:33 2007 From: caphrim007 at gmail.com (Tim Rupp) Date: Sun, 28 Jan 2007 18:05:33 -0600 Subject: filesystem becomming read only In-Reply-To: <20070128231013.GB2442@thunk.org> References: <45BD25D4.4030209@gmail.com> <20070128231013.GB2442@thunk.org> Message-ID: <45BD3A4D.3080302@gmail.com> Thanks Ted, I'll go through that list and try swapping the original parts with spares that I have around home. I've run fsck since the problem started occurring and it _has_ found problems with the filesystem. I don't have the output on hand, but I can definitely make the filesystem go read-only again. When I do, I can send another mail with the attached output from the fsck. Maybe it will help to find the problem. I'll also try the LKML and Ubuntu forums. Thanks a lot! -Tim Theodore Tso wrote: > On Sun, Jan 28, 2007 at 04:38:12PM -0600, Tim Rupp wrote: >> I'm looking for advice/help in tracking down a problem with a new system >> I've purchased. >> >> I have a beige box server with a Gigabyte GA-M51GM-S2G motherboard. It >> has the nVidia MCP51 SATA controller with 3 250 gig Western Digital hard >> drives attached to it. >> >> It seems that when doing a considerable amount of file writing, the >> filesystem will become read-only. See attached dmesg output. > > According to the dmesg output, the filesystem is getting remounted > read-only because the kernel detected an inconsistency in the block > allocation bitmaps. Basically, a block that was in use and getting > freed (due to a file getting deleted) was found to be already marked > as not in use in the block bitmap. This is very dangerous, since a > corrupted block allocation bitmap can result in data loss when a block > gets used by two different files, and the contents of part of the > first file gets overwritten by the second. Hence, ext3 remounted the > filesystem read-only in order to protect your data from getting (more) > corrupted. > > The question then is why is this happening. If you run e2fsck and it > finds nothing wrong, then that means it was the in-core memory that > was corrupted --- so the data was correct on disk, but when it was > read from disk to memory, it had gotten corrupted somehow (another > good reason for ext3 to mark the filesystem read-only; to prevent the > corrupted data from getting written back to disk). > > In any case, given that you've checked the memory, it does rather seem > to narrow it down to either SATA cables, the disk drives, or the SATA > controller, roughly in that order of probability. The SATA cables are > probably the cheapest to try replacing first. I suppose there is a > chance that there it's a hardware device driver or kernel issue. You > might want to ask on LKML or on the Ubuntu support forums if there are > any known issues wit the nVidia SATA controller driver. 
> > Good luck, > > - Ted > From tytso at mit.edu Mon Jan 29 01:24:28 2007 From: tytso at mit.edu (Theodore Tso) Date: Sun, 28 Jan 2007 20:24:28 -0500 Subject: filesystem becomming read only In-Reply-To: <45BD3A4D.3080302@gmail.com> References: <45BD25D4.4030209@gmail.com> <20070128231013.GB2442@thunk.org> <45BD3A4D.3080302@gmail.com> Message-ID: <20070129012428.GC24828@thunk.org> On Sun, Jan 28, 2007 at 06:05:33PM -0600, Tim Rupp wrote: > Thanks Ted, I'll go through that list and try swapping the original > parts with spares that I have around home. > > I've run fsck since the problem started occurring and it _has_ found > problems with the filesystem. I don't have the output on hand, but I can > definitely make the filesystem go read-only again. When I do, I can send > another mail with the attached output from the fsck. Maybe it will help > to find the problem. Well, the most important thing about the fsck error is to see whether it looks like a single bit error, or an entire block being corrupted, or a block getting written to the wrong location on disk. (The last two can be hard to differentiate, but you see ASCII text in an inode table block, or an block/inode bitmap, that's usually a good clue that it was the latter.) But at the end of the day, it looks like a hardware problem, and this won't necessarily tell you exactly what is to blame, so it's not a high priority thing to do. You could try using badblocks -w (warning, this is a distructive read/write test) or badblocks -n to see if you catch the disk doing something wrong, but it may be that creating a filesystem and then running your workload will be the best stress test. Unfortunately we don't have a good disk drive exerciser that exercises the disk with a lot of random access read/write and seek patterns in Linux, at least not as far as I know, anyway. Good luck, - Ted From adilger at clusterfs.com Mon Jan 29 04:17:26 2007 From: adilger at clusterfs.com (Andreas Dilger) Date: Sun, 28 Jan 2007 21:17:26 -0700 Subject: filesystem becomming read only In-Reply-To: <45BD25D4.4030209@gmail.com> References: <45BD25D4.4030209@gmail.com> Message-ID: <20070129041726.GW5236@schatzie.adilger.int> On Jan 28, 2007 16:38 -0600, Tim Rupp wrote: > I checked to see if it was perhaps bad memory by running memtest86+, but > after 14 hours no errors were found. I've heard in the past that you need to run memtest86 for at least a day or two to be sure about that. Another option (if you have multiple sticks of RAM) is to take half of it out, see if the problem still happens (when running with your reproducer), repeat until you've isolated it to one or more sticks of RAM. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From laurentsebag at free.fr Mon Jan 29 22:16:10 2007 From: laurentsebag at free.fr (Laurent Sebag) Date: Mon, 29 Jan 2007 23:16:10 +0100 Subject: seeking a developper documentation for jbd and ext3 Message-ID: <45BE722A.4090902@free.fr> Hi, I am a student in computer science and I develop a program that tries to explain other students the mecanisms of the ext3 filesystem : we show the content of each structure and explain what it means. But I was unable to find a developper documentation for the jounalizing functionality (jbd). Could you please tell me where can I find one ? ( in english or in french ) Also a documentation for the ext3 filesystem would be great. 
Thanks, Laurent Sebag From kernel at crazytrain.com Tue Jan 30 02:49:26 2007 From: kernel at crazytrain.com (farmerdude) Date: Mon, 29 Jan 2007 21:49:26 -0500 Subject: seeking a developper documentation for jbd and ext3 In-Reply-To: <45BE722A.4090902@free.fr> References: <45BE722A.4090902@free.fr> Message-ID: <1170125366.8990.4.camel@oliver> Laurent, GOOGLE is your friend, or any search engine. You'll find; http://www.oreilly.com/catalog/linuxkernel2/chapter/ch17.pdf http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html regards, farmerdude On Mon, 2007-01-29 at 23:16 +0100, Laurent Sebag wrote: > Hi, > > I am a student in computer science and I develop a program that tries to > explain other students the mecanisms of the ext3 filesystem : we show > the content of each structure and explain what it means. > But I was unable to find a developper documentation for the jounalizing > functionality (jbd). > Could you please tell me where can I find one ? ( in english or in french ) > Also a documentation for the ext3 filesystem would be great. > > Thanks, > > Laurent Sebag > > _______________________________________________ > Ext3-users mailing list > Ext3-users at redhat.com > https://www.redhat.com/mailman/listinfo/ext3-users > From tushu1232 at gmail.com Wed Jan 31 06:30:36 2007 From: tushu1232 at gmail.com (tushar) Date: Wed, 31 Jan 2007 12:00:36 +0530 Subject: CHANGE IN THE struct ext3_dir_entry_2 IS SUGGESTED Message-ID: <200976900701302230y49924d22o5bfcac5fdc2aa53d@mail.gmail.com> well a change in the struct ext3_dir_entry_2 like ++ change in the structure struct ext33_dir_entry_2 { ++ union { __le32 inode; ++ struct ext33_inode *emb_i; ++ } u_emb_i; __le16 rec_len; /* Directory entry length */ __u8 name_len; /* Name length */ __u8 file_type; char name[EXT3_NAME_LEN]; /* File name */ }*de; initially the default access was through the *de which referenced only de->inode but the change is as follows well we have reflected the changes in the ext3 filesystem source code using the above structure (but only used the u_emb_i.inode) we just wanted to know is ther any change to be done in EXT3_DIR_REC_LEN macro #define EXT3_DIR_PAD 4 #define EXT3_DIR_ROUND (EXT3_DIR_PAD - 1) #define EXT3_DIR_REC_LEN(name_len) (((name_len) + 8 + EXT3_DIR_ROUND) & \ ~EXT3_DIR_ROUND) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sumanvg at cdactvm.in Wed Jan 24 04:34:59 2007 From: sumanvg at cdactvm.in (Suman V G) Date: Wed, 24 Jan 2007 10:04:59 +0530 Subject: ext3 journal from windows Message-ID: <000801c73f71$02a26540$0c1d10ac@rccf012> hai is there any way to view the contents of ext3 journal form windows? regards ______________________________________ Scanned and protected by Email scanner -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeni at scientist.com Mon Jan 29 15:43:54 2007 From: evgeni at scientist.com (Evgeni) Date: Mon, 29 Jan 2007 07:43:54 -0800 (PST) Subject: Can't mount /home anymore In-Reply-To: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> References: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> Message-ID: <8691524.post@talk.nabble.com> fsck can't help you because bitmaps are damaged, but there is a way to recover your files. 1. Prepair enough space on another partition and create directory where to put recovered files. 2. Boot linux. (for example use Rescue CD or Knoppix Live CD) 3. 
Run debugfs in catastrophic mode (-c option):

   debugfs -c /dev/hdaX

   Catastrophic mode does not read the inode and group bitmaps. If your
   superblock is damaged, consider using the -s (superblock) and -b (block
   size) options to specify a backup superblock (the block size and backup
   superblock locations can be found with dumpe2fs).

4. Inside the debugfs shell run:

   rdump directory_to_recover directory_for_recovered_files

   directory_to_recover is on the damaged partition;
   directory_for_recovered_files is on your active partition (from step 1
   above). For example:

   rdump /home /tmp/recovery

   This will copy the /home directory and all its content, including
   subdirectories and files, to /tmp/recovery.

--
View this message in context: http://www.nabble.com/Can%27t-mount--home-anymore-tf2951542.html#a8691524
Sent from the Ext3 - User mailing list archive at Nabble.com.
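To tie this back to the /dev/sda6 case earlier in the thread, the whole sequence might look roughly like this - an untested sketch only: 4096 and 32768 are just the common defaults (substitute whatever dumpe2fs actually reports for that filesystem), and /mnt/recovery is an example destination on a healthy partition:

# dumpe2fs /dev/sda6 | grep -i superblock
# mkdir /mnt/recovery
# debugfs -c -b 4096 -s 32768 /dev/sda6
debugfs:  rdump / /mnt/recovery
debugfs:  quit

If rdump of the root directory does not work, it can be run per top-level directory instead (rdump /somedir /mnt/recovery, one at a time).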