From lists at nerdbynature.de Wed Jan 3 19:47:39 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Wed, 3 Jan 2007 19:47:39 +0000 (GMT) Subject: Problem with ext3 filesystem In-Reply-To: <459BA37D.5050809@netropol.de> References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> Message-ID: [please reply on-list, so that everybody can comment] On Wed, 3 Jan 2007, Jan wrote: > thanks, I already moved all data from this device. I did run this test > and got no errors yet. I also did dd directly on the device. No error in > syslog. ...so, the device is fine then? > Now I tried jfs on one device and did my test with 200 mb files > - no errros. ...and JFS is fine too. > I installed a 400GB sata disk with a sata card and tried > this with ext3 and my test script and got also no errors. ...and now ext3 on this device is alsofine? So, that means the problem you had is not reproducible, no? > there are problems with the disks or the cable the controller should > notice this ? Perhaps I should ask at areca ? your 2nd post[0] indeed looked a lot like hardware errors. So yes, if these errors persist/are reproducible, you're probably better off asking the maintainer or the sata folks for known issues... Christian. [0]https://www.redhat.com/archives/ext3-users/2006-December/msg00025.html -- BOFH excuse #125: we just switched to Sprint. From jan at netropol.de Wed Jan 3 20:53:45 2007 From: jan at netropol.de (Jan) Date: Wed, 03 Jan 2007 20:53:45 +0000 Subject: Problem with ext3 filesystem In-Reply-To: References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> Message-ID: <459C17D9.70305@netropol.de> Hi, > > [please reply on-list, so that everybody can comment] sorry, I don't know why I didn't send it to the list ... >> thanks, I already moved all data from this device. I did run this test >> and got no errors yet. I also did dd directly on the device. No error in >> syslog. > > ...so, the device is fine then? I think yes. > >> Now I tried jfs on one device and did my test with 200 mb files >> - no errros. > > ...and JFS is fine too. JFS is now fine. >> I installed a 400GB sata disk with a sata card and tried >> this with ext3 and my test script and got also no errors. > > ...and now ext3 on this device is alsofine? So, that means the problem > you had is not reproducible, no? the problem is only on the 1.6 tb areca sata raid. and there it is reproducible. > >> there are problems with the disks or the cable the controller should >> notice this ? Perhaps I should ask at areca ? > > your 2nd post[0] indeed looked a lot like hardware errors. So yes, if > these errors persist/are reproducible, you're probably better off asking > the maintainer or the sata folks for known issues... o.k. but the problems seems to come only with areca and ext3, not with jfs. strange ... From lists at nerdbynature.de Wed Jan 3 21:07:39 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Wed, 3 Jan 2007 21:07:39 +0000 (GMT) Subject: Problem with ext3 filesystem In-Reply-To: <459C17D9.70305@netropol.de> References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> <459C17D9.70305@netropol.de> Message-ID: On Wed, 3 Jan 2007, Jan wrote: > o.k. but the problems seems to come only with areca and ext3, not with > jfs. strange ... 
I've seen this *many* times on the reiserfs mailing list: ppl are complaining that reiserfs was faulty while other filesystems went OK. And it turned out to be some hardware issue after all. Now, I can't say that I'm 100% sure that the device is to blame, but it seems that some hardware bugs are triggered (not caused) by certain fs operations[0], so if changing the fs "fixes" it - why not. But it's not a very satisfying solution, IMHO.

Christian.

[0] will some fs guru please hit me if this is total gibberish...
--
BOFH excuse #423: It's not RFC-822 compliant.

From tytso at mit.edu Wed Jan 3 21:07:31 2007
From: tytso at mit.edu (Theodore Tso)
Date: Wed, 3 Jan 2007 16:07:31 -0500
Subject: Problem with ext3 filesystem
In-Reply-To: <459C17D9.70305@netropol.de>
References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> <459C17D9.70305@netropol.de>
Message-ID: <20070103210731.GD5491@thunk.org>

On Wed, Jan 03, 2007 at 08:53:45PM +0000, Jan wrote:
> the problem is only on the 1.6 tb areca sata raid. and there it is
> reproducible.

Interesting. I'm using multiple 600+ gigabyte (0.6 TB) ext3 filesystems on the SATA RAID in my home fileserver, and I'm not seeing any problems. (I have 3 TB of space, but for management reasons I elected to divvy up the space into smaller volumes.) I'm using a 2.6.18-rc2 kernel with the Areca ARC-1160 controller, with Areca firmware version 1.41. (I haven't upgraded to 1.43 yet, even though it became available a month or two ago.) It's been working just fine, and I haven't had any issues with it. What Areca firmware version are you running with?

> >
> >> there are problems with the disks or the cable the controller should
> >> notice this ? Perhaps I should ask at areca ?
> >
> > your 2nd post[0] indeed looked a lot like hardware errors. So yes, if
> > these errors persist/are reproducible, you're probably better off asking
> > the maintainer or the sata folks for known issues...
>
> o.k. but the problems seems to come only with areca and ext3, not with
> jfs. strange ...

When you say it's reproducible, has it been reproducible after using mke2fs to reformat the filesystem, perhaps with a manually specified filesystem size? E2fsck should have complained if the filesystem size was larger than the apparent size of the physical volume, but if the Areca firmware somehow screwed up and reported a larger size than what was actually there, then both mke2fs and e2fsck will blindly believe what the BLKGETSIZE64 ioctl returns (they won't use the binary search method of determining the disk size unless the BLKGETSIZE/BLKGETSIZE64 ioctls fail for one reason or another), and if there was some wraparound bug, that would explain what you're seeing.

So the only other thing I can suggest is to double check the filesystem size as reported by dumpe2fs or df, and compare it with the raw volume size as reported by the Areca management interface; do the numbers look sane?

- Ted

From jan at netropol.de Thu Jan 4 09:55:47 2007
From: jan at netropol.de (Jan)
Date: Thu, 04 Jan 2007 09:55:47 +0000
Subject: Problem with ext3 filesystem
In-Reply-To:
References: <4592FC5D.70101@netropol.de> <4595845B.7010403@netropol.de> <20061230040816.GA27654@thunk.org> <459BA37D.5050809@netropol.de> <459C17D9.70305@netropol.de>
Message-ID: <459CCF23.4010509@netropol.de>

Overnight I also got JFS errors... I'll try an older kernel.

> On Wed, 3 Jan 2007, Jan wrote:
>> o.k. but the problems seems to come only with areca and ext3, not with
>> jfs. strange ...
>
> I've seen this *many* times on the reiserfs mailing list: ppl are
> complaining that reiserfs was faulty while other filesystems went OK.
> And it turned out to be some hardware issue after all. Now, I can't say
> that I'm 100% sure that the device is to blame, but it seems that some
> hardware bugs are triggered (not caused) by certain fs operations[0], so
> if changing the fs "fixes" it - why not. but it's not a very satisfying
> solution, IMHO.
>
> Christian.
>
> [0] will some fs guru please hit me if this is total gibberish...

From Erik.Andersen at intecbilling.com Fri Jan 5 15:31:09 2007
From: Erik.Andersen at intecbilling.com (Erik Andersen)
Date: Fri, 5 Jan 2007 16:31:09 +0100
Subject: Problem in e2fsck ? read error in journal inode
References: <1B5A4F55CD61554E9443E4A6395EC35C274796@ibrosex01.intecbilling.com>
Message-ID: <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com>

Hi,

I'm experiencing some problems on a hard disk (it crashed for no known reason), and in pursuit of getting some of the data off the disk I'm learning to use the e2fsprogs package. Originally I used version 1.38, but after experiencing segfaults in e2fsck - which are now solved - I upgraded to 1.39. Now I have hit another problem (on another partition):

The partition was formatted as ext3, and debugfs / dumpe2fs showed that the feature 'has_journal' was present (as expected). The e2fsck command gave the following:

# e2fsck -B4096 -b32768 /dev/hda11
e2fsck 1.39 (29-May-2006)
e2fsck: Attempt to read block from filesystem resulted in short read while checking ext3 journal for /var

So it seemed like a problem in the journal. As I could not find any option to tell e2fsck to skip applying (and reading) the journal, I turned the filesystem feature 'has_journal' off using debugfs:

# debugfs -b4096 -s32768 -w /dev/hda11
debugfs 1.39 (29-May-2006)
debugfs:  feature -has_journal
Filesystem features: filetype sparse_super
debugfs:  show_super_stats -h
Filesystem volume name:   /var
Last mounted on:
Filesystem UUID:          2e8920a2-0460-4a87-b729-af812327fce7
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      filetype sparse_super
Default mount options:    (none)
Filesystem state:         not clean
Errors behavior:          Continue
:

Now trying e2fsck again did not make any difference:

# e2fsck -B4096 -b32768 -y /dev/hda11
e2fsck 1.39 (29-May-2006)
e2fsck: Attempt to read block from filesystem resulted in short read while checking ext3 journal for /var

I also tried to get tune2fs to turn off the journalling, but got the response below. First I turned 'has_journal' back on using debugfs:

# debugfs -b4096 -s32768 -w /dev/hda11
debugfs 1.39 (29-May-2006)
debugfs:  feature has_journal
Filesystem features: has_journal filetype sparse_super
debugfs:  quit

Then used tune2fs:

# tune2fs -O ^has_journal /dev/hda11
tune2fs 1.39 (29-May-2006)
tune2fs: Attempt to read block from filesystem resulted in short read while reading journal inode

So my questions are:
1) How can I make e2fsck skip reading a faulty journal (in my case there might be a HW error on the block)?
2) What makes e2fsck act on a journal (is it because the journal inode is set)?
3) Shouldn't e2fsck act on whether the filesystem feature is set (and in case of no 'has_journal', just ignore any journal information - of course it still needs to make sure the inode used for the journal isn't used by anybody else)?
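For completeness, the kind of last-resort workaround I have in mind - completely untested on my side, and something I would only ever run against an image copy, never the real partition. The /spare path below is just an example location with enough free space, and <8> is the reserved ext3 journal inode:

# dd if=/dev/hda11 of=/spare/hda11.img bs=4096 conv=noerror,sync
# cp /spare/hda11.img /spare/hda11.img.bak
# debugfs -w /spare/hda11.img
debugfs:  feature -has_journal
debugfs:  clri <8>
debugfs:  quit
# e2fsck -f -B4096 -b32768 /spare/hda11.img

If e2fsck then runs through, a journal could presumably be recreated on the repaired image with 'tune2fs -j', and the data copied off a read-only loop mount. But I have no idea whether clearing inode 8 like this is actually safe, hence question 3 above.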
It was a bit long, if you need any more info - please let me know One problem is that I have problems reading the raw partition 'dev/hda11' - I tried to 'dd' it but it failed... Regards Erik Haukj?r Andersen -- This e-mail and any attachments are confidential and may also be legally privileged and/or copyright material of Intec Telecom Systems PLC (or its affiliated companies). If you are not an intended or authorised recipient of this e-mail or have received it in error, please delete it immediately and notify the sender by e-mail. In such a case, reading, reproducing, printing or further dissemination of this e-mail or its contents is strictly prohibited and may be unlawful. Intec Telecom Systems PLC does not represent or warrant that an attachment hereto is free from computer viruses or other defects. The opinions expressed in this e-mail and any attachments may be those of the author and are not necessarily those of Intec Telecom Systems PLC. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruno at wolff.to Sat Jan 6 04:06:18 2007 From: bruno at wolff.to (Bruno Wolff III) Date: Fri, 5 Jan 2007 22:06:18 -0600 Subject: Problem in e2fsck ? read error in journal inode In-Reply-To: <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> References: <1B5A4F55CD61554E9443E4A6395EC35C274796@ibrosex01.intecbilling.com> <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> Message-ID: <20070106040618.GA22262@wolff.to> On Fri, Jan 05, 2007 at 16:31:09 +0100, Erik Andersen wrote: > > So my questios are: > 1) How can I make e2fsck skip reading a faulty journal (in my case there might be a HW error on the block) ? > 2) What makes e2fsck act on a journal (is it because journal inode is set) ? > 3) Shouldn't e2fsck act on wether the filesystem features (and in case of no 'has_journal' just ignore > any journal information - of course it still need to make sure the inode used for the journal isn't > used by anybody else) ? This is a safety feature to make sure you don't shoot yourself in the foot. If you are willing to throw away the changes in the journal that haven't been committed to the normal locations yet, then you should be able to make some changes to the journal to make it look like it is empty. You might even be able to get away with just writing over the bad block. However, you really should make an image of this partition before doing any writes to it. I don't know what changes to make to the journal to make it appear empty. From Erik.Andersen at intecbilling.com Sat Jan 6 10:47:52 2007 From: Erik.Andersen at intecbilling.com (Erik Andersen) Date: Sat, 6 Jan 2007 11:47:52 +0100 Subject: Problem in e2fsck ? read error in journal inode References: <1B5A4F55CD61554E9443E4A6395EC35C274796@ibrosex01.intecbilling.com> <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> <20070106040618.GA22262@wolff.to> Message-ID: <1B5A4F55CD61554E9443E4A6395EC35C27479A@ibrosex01.intecbilling.com> I understand the danger of not applying the journal, but I understand that what I will loose is 'only' the most recent changes in the filesystem. Also I agree that the default behaviour of e2fsck should be to apply the journal if it exists, No doubt about that. But as e2fsck is ment as a tool for restoration of a damaged filesystem I expected it to be able to bypass (or ignore) problems which prevents the action of the following parts. 
My disk/partition has the problem (which seems like a hardware read-error) located in the inode where the journal is, so I cannot apply the journal, Because of this I would like to skip applying the journal and checking the inode used for the journal. One way is, using debugfs, to set the appropriate attributes of the superblock so it looks like there is no journal (I thought it was the Filesystem Feature 'has_journal' which should not be set, but it seems that there are more attribute that needs fiddling..,). Another way, was if there was an option to 'e2fsck' which made it ignore the journal (say '-ij'), it would let e2fsck read the superblock, but not attempt to do anything with the journal (including reading the journal inode), e2fsck could then restore what it can. /Erik Haukjaer Andersen -----Original Message----- From: Bruno Wolff III [mailto:bruno at wolff.to] Sent: Sat 06-01-2007 05:06 To: Erik Andersen Cc: ext3-users at redhat.com; tytso at mit.edu Subject: Re: Problem in e2fsck ? read error in journal inode On Fri, Jan 05, 2007 at 16:31:09 +0100, Erik Andersen wrote: > > So my questios are: > 1) How can I make e2fsck skip reading a faulty journal (in my case there might be a HW error on the block) ? > 2) What makes e2fsck act on a journal (is it because journal inode is set) ? > 3) Shouldn't e2fsck act on wether the filesystem features (and in case of no 'has_journal' just ignore > any journal information - of course it still need to make sure the inode used for the journal isn't > used by anybody else) ? This is a safety feature to make sure you don't shoot yourself in the foot. If you are willing to throw away the changes in the journal that haven't been committed to the normal locations yet, then you should be able to make some changes to the journal to make it look like it is empty. You might even be able to get away with just writing over the bad block. However, you really should make an image of this partition before doing any writes to it. I don't know what changes to make to the journal to make it appear empty. -- This e-mail and any attachments are confidential and may also be legally privileged and/or copyright material of Intec Telecom Systems PLC (or its affiliated companies). If you are not an intended or authorised recipient of this e-mail or have received it in error, please delete it immediately and notify the sender by e-mail. In such a case, reading, reproducing, printing or further dissemination of this e-mail or its contents is strictly prohibited and may be unlawful. Intec Telecom Systems PLC does not represent or warrant that an attachment hereto is free from computer viruses or other defects. The opinions expressed in this e-mail and any attachments may be those of the author and are not necessarily those of Intec Telecom Systems PLC. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicdnicd at gmail.com Wed Jan 10 09:43:48 2007 From: nicdnicd at gmail.com (Nickel Cadmium) Date: Wed, 10 Jan 2007 10:43:48 +0100 Subject: Can't mount /home anymore Message-ID: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> Hi! I'm new to the list. I have a problem with mounting my home directory since my PC crashed. I hope that I can get some help on this list as I don't know much of ext3 myself. 
The mount command for my /home gives me the following output: # mount /dev/sda6 /mnt/tmp/ mount: wrong fs type, bad option, bad superblock on /dev/sda6, missing codepage or other error In some cases useful info is found in syslog - try dmesg | tail or so The partition /dev/sda6 is an ext3 file system. It also tried to specify the file system type with -text3 and to use a backup superblock with the sb option but it does not help. I tried to run fsck but it does not fix the problem either. Here is the output of fsck: # fsck.ext3 /dev/sda6 e2fsck 1.39 (29-May-2006) Group descriptors look bad... trying backup blocks... Inode bitmap for group 522 is not in group. (block 3271884801) Relocate? yes fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home I searched the Internet and I found a small Windows application that could see my /home partition and files. So I can't beleive there is nothing under Linux to recover my files. I'll appreciate any help on this topic because I tried anything I could think of by myself and I still can't mount my home. Any suggestions? Best wishes! Cd -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvolaski at aecom.yu.edu Sat Jan 13 02:31:38 2007 From: mvolaski at aecom.yu.edu (Maurice Volaski) Date: Fri, 12 Jan 2007 21:31:38 -0500 Subject: [Q] How can the directory location to dd output affect performance? In-Reply-To: <20070110170011.5FC9973674@hormel.redhat.com> References: <20070110170011.5FC9973674@hormel.redhat.com> Message-ID: I have two Opteron-based Tyan systems being supported by PCI-e Areca cards. There is definitely an issue going on in the two systems that is causing significantly degraded performance of these cards. It appeared, initially, that the SATA backplane on the Tyan chassis was wholly to blame. But then I made an odd discovery. I'm running from the Ubuntu LiveCD for 64-bit. It uses kernel 2.6.19-7 and the RAID drives are formatted as ext3. The benchmark command is dd if=/dev/zero of=output oflag=sync bs=100M count=1 My root is organized has a /maurice directory and a /maurice/drbd directory and initially I had changed to that directory to run the benchmark. In here, the speeds were slow, averaging about 40 MB/second. When I happened to run it from /, I suddenly began getting about 70 MB/second. So in some bizarre fashion, the location to where the output of dd is directed to dramatically impacts the performance. I have run from other directories and the performance varies depending on which directory I'm in. Can anyone explain this? -- Maurice Volaski, mvolaski at aecom.yu.edu Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University From lists at nerdbynature.de Sun Jan 14 02:54:27 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Sun, 14 Jan 2007 02:54:27 +0000 (GMT) Subject: Problem in e2fsck ? read error in journal inode In-Reply-To: <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> References: <1B5A4F55CD61554E9443E4A6395EC35C274796@ibrosex01.intecbilling.com> <1B5A4F55CD61554E9443E4A6395EC35C274799@ibrosex01.intecbilling.com> Message-ID: On Fri, 5 Jan 2007, Erik Andersen wrote: > One problem is that I have problems reading the raw partition 'dev/hda11' - I tried > to 'dd' it but it failed... Probably too late, but just for the record: if dd(1) fails, $FILESYSTEM can't do much about it: try dd_rescue, then use fsck on this image. If you have even more space: make a backup copy of the dd_rescue'd data before using fsck.... 
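Something along these lines (untested here, and the paths are only examples - check dd_rescue's help first, since its options differ between versions):

# dd_rescue /dev/hda11 /space/hda11.img
# cp /space/hda11.img /space/hda11.img.bak
# e2fsck -f /space/hda11.img
# mount -o loop,ro /space/hda11.img /mnt/rescue

e2fsck will probably warn that the image is not a block device; proceeding is fine, since it's only a copy.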
-- BOFH excuse #99: SIMM crosstalk. From lists at nerdbynature.de Sun Jan 14 03:00:13 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Sun, 14 Jan 2007 03:00:13 +0000 (GMT) Subject: Can't mount /home anymore In-Reply-To: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> References: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> Message-ID: On Wed, 10 Jan 2007, Nickel Cadmium wrote: > # fsck.ext3 /dev/sda6 > e2fsck 1.39 (29-May-2006) > Group descriptors look bad... trying backup blocks... > Inode bitmap for group 522 is not in group. (block 3271884801) > Relocate? yes > > fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home ...and after this message, fsck.ext3 just stops? What's the exit code of fsck.ext3? (e.g. 'fsck.ext3 /dev/sda6; echo $?'). Try "fsck.ext3 -v" for more details. Is there anything related in your syslog? Can you dd(1) the device (read! not write! :)) without errors? Which kernel/arch are you running? Christian. -- BOFH excuse #99: SIMM crosstalk. From lists at nerdbynature.de Sun Jan 14 03:12:43 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Sun, 14 Jan 2007 03:12:43 +0000 (GMT) Subject: [Q] How can the directory location to dd output affect performance? In-Reply-To: References: <20070110170011.5FC9973674@hormel.redhat.com> Message-ID: On Fri, 12 Jan 2007, Maurice Volaski wrote: > the RAID drives are formatted as ext3. The benchmark command is dd > if=/dev/zero of=output oflag=sync bs=100M count=1 ------------------^ > My root is organized has a /maurice directory and a /maurice/drbd directory > and initially I had changed to that directory to run the benchmark. In here, > the speeds were slow, averaging about 40 MB/second. > When I happened to run it from /, I suddenly began getting about 70 > MB/second. So in some bizarre fashion, the location to where the output of dd > is directed to dramatically impacts the performance. I have run from other > directories and the performance varies depending on which directory I'm in. Strange indeed. Only thing that comes to mind is: you're specifying the output file not as an absolute path, but relative: the directories (and its contents) are distributed all over the disk: some may "live" in the inner part of the plattern, some in the outer part - and different areas have different speeds. I've never encountered this and I could be dead wrong, but I'd suggest to specify the same 'of=/path/to/output' - I could imagine that it's more likely that for the next benchmark the filesystem uses the same on-disk location...no? Christian. -- BOFH excuse #12: dry joints on cable plug From nicdnicd at gmail.com Sat Jan 20 11:01:14 2007 From: nicdnicd at gmail.com (Nickel Cadmium) Date: Sat, 20 Jan 2007 12:01:14 +0100 Subject: Can't mount /home anymore In-Reply-To: References: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> Message-ID: <9ec348a90701200301t11d1c133m926b38a8a1326b31@mail.gmail.com> Hi Christian (& all)! Thanks for the reply. I was away for some time but here is the extra information you requested. Yes, after the message "fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home", fsck just stops. The command 'fsck.ext3 /dev/sda6; echo $?' returns the value 8. Looking at the man page for fsck, I found that this is an "Operational error". I have totally no clue what this means. With fsck, nothing is reported in the syslog file. 
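(For anyone following along: the fsck man page describes the exit status as a bit mask, so the value is the sum of whatever applies - roughly 0 = no errors, 1 = errors corrected, 2 = reboot needed, 4 = errors left uncorrected, 8 = operational error. So

# fsck.ext3 /dev/sda6; echo $?
8

seems to mean that e2fsck gave up before it could check or repair anything, rather than that it found and left specific errors.)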
If I try mounting the partition, I get the following errors reported: Jan 20 11:43:57 localhost kernel: EXT3-fs error (device sda6): ext3_check_descriptors: Inode bitmap for group 522 not in group (block 3271884801)! Jan 20 11:43:57 localhost kernel: EXT3-fs: group descriptors corrupted ! I could dd the partition without errors. I did copy the partition two times already, I order to be able to try some recovery on it. With converting a copy to ext2 and running "fsck.ext2 -v -y" on it (in something like two days), I was able to get some files (all?) in the lost+found. However, the file names are lost and the directory structure as well. It's hard to tell which file is what. I'm really wondering if there is a way to mount that partition again. I run Mandriva on a Pentium PC. My kernel is 2.6.17-5mdv. However, I first thought than my /home problem was some kind of booting problem. Thus I upgraded from Mandriva 2006 to Mandriva 2007. This means that I don't know what my kernel was when the problem occurred. It should be 2.6.12 as this was a straight out-of-the-box installation. My fsck version is "e2fsck 1.39". Best wishes, Cd On 1/14/07, Christian Kujau wrote: > > On Wed, 10 Jan 2007, Nickel Cadmium wrote: > > # fsck.ext3 /dev/sda6 > > e2fsck 1.39 (29-May-2006) > > Group descriptors look bad... trying backup blocks... > > Inode bitmap for group 522 is not in group. (block 3271884801) > > Relocate? yes > > > > fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home > > ...and after this message, fsck.ext3 just stops? What's the exit code of > fsck.ext3? (e.g. 'fsck.ext3 /dev/sda6; echo $?'). Try "fsck.ext3 -v" for > more details. Is there anything related in your syslog? Can you dd(1) > the device (read! not write! :)) without errors? > > Which kernel/arch are you running? > > Christian. > -- > BOFH excuse #99: > > SIMM crosstalk. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sumanvg at cdactvm.in Wed Jan 24 04:57:05 2007 From: sumanvg at cdactvm.in (Suman V G) Date: Wed, 24 Jan 2007 10:27:05 +0530 Subject: ext3 journal from windows Message-ID: <001801c73f74$185972e0$0c1d10ac@rccf012> hai is there any way to view the contents of ext3 journal from windows? regards suman ______________________________________ Scanned and protected by Email scanner -------------- next part -------------- An HTML attachment was scrubbed... URL: From adilger at clusterfs.com Wed Jan 24 22:51:37 2007 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 24 Jan 2007 15:51:37 -0700 Subject: ext3 journal from windows In-Reply-To: <001801c73f74$185972e0$0c1d10ac@rccf012> References: <001801c73f74$185972e0$0c1d10ac@rccf012> Message-ID: <20070124225137.GT5236@schatzie.adilger.int> On Jan 24, 2007 10:27 +0530, Suman V G wrote: > is there any way to view the contents of ext3 journal from windows? > regards Debugfs allows dumping the journal to a file, and I _think_ e2fsprogs can be compiled under windows. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From caphrim007 at gmail.com Sun Jan 28 22:38:12 2007 From: caphrim007 at gmail.com (Tim Rupp) Date: Sun, 28 Jan 2007 16:38:12 -0600 Subject: filesystem becomming read only Message-ID: <45BD25D4.4030209@gmail.com> Hi list, I'm looking for advice/help in tracking down a problem with a new system I've purchased. I have a beige box server with a Gigabyte GA-M51GM-S2G motherboard. It has the nVidia MCP51 SATA controller with 3 250 gig Western Digital hard drives attached to it. 
It seems that when doing a considerable amount of file writing, the filesystem will become read-only. See attached dmesg output. I started looking for help on the nvnews forums, and found a suggestion to set the pci=nommconf kernel parameter. This did not help. Aside from that, there have only been suggestions such as "its likely faulty hardware". kernel 2.6.17-10-generic #2 SMP running on Ubuntu Edgy Eft, amd64 version; but the same problem showed up on Fedora Core 6, both x86_64 and i386. I checked to see if it was perhaps bad memory by running memtest86+, but after 14 hours no errors were found. I've run badblocks on the disk that contains the ext3 partition and no errors were found. Aside from badblocks, I'm not aware of any disk tools I could use to test further. smartmon tools report that all 3 of the disks are OK. The bulk of the data being sent to the machine is via the network using the application Unison, version 2.13.16 if that makes any difference. I haven't tried another suggestion to set the kernel paramter idle=poll, but since nothing else has worked so far, I don't see that making much difference. Also I haven't tried installing Windows to isolate the "faulty hardware" suggestion. Bad hardware would suggest that Windows would see problems too right? Any help would be greatly appreciated. I'm at the end of my rope on this one. Thanks in advance, Tim -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: dmesg.output URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: lspci.output URL: From tytso at mit.edu Sun Jan 28 23:10:13 2007 From: tytso at mit.edu (Theodore Tso) Date: Sun, 28 Jan 2007 18:10:13 -0500 Subject: filesystem becomming read only In-Reply-To: <45BD25D4.4030209@gmail.com> References: <45BD25D4.4030209@gmail.com> Message-ID: <20070128231013.GB2442@thunk.org> On Sun, Jan 28, 2007 at 04:38:12PM -0600, Tim Rupp wrote: > > I'm looking for advice/help in tracking down a problem with a new system > I've purchased. > > I have a beige box server with a Gigabyte GA-M51GM-S2G motherboard. It > has the nVidia MCP51 SATA controller with 3 250 gig Western Digital hard > drives attached to it. > > It seems that when doing a considerable amount of file writing, the > filesystem will become read-only. See attached dmesg output. According to the dmesg output, the filesystem is getting remounted read-only because the kernel detected an inconsistency in the block allocation bitmaps. Basically, a block that was in use and getting freed (due to a file getting deleted) was found to be already marked as not in use in the block bitmap. This is very dangerous, since a corrupted block allocation bitmap can result in data loss when a block gets used by two different files, and the contents of part of the first file gets overwritten by the second. Hence, ext3 remounted the filesystem read-only in order to protect your data from getting (more) corrupted. The question then is why is this happening. If you run e2fsck and it finds nothing wrong, then that means it was the in-core memory that was corrupted --- so the data was correct on disk, but when it was read from disk to memory, it had gotten corrupted somehow (another good reason for ext3 to mark the filesystem read-only; to prevent the corrupted data from getting written back to disk). 
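(As an aside, what ext3 does when it hits this kind of error is a per-filesystem setting, so it's worth knowing how to inspect it - a quick sketch, with a placeholder device name:

# dumpe2fs -h /dev/sdb1 | grep -i 'errors behavior'
# tune2fs -e remount-ro /dev/sdb1

Valid values for -e are continue, remount-ro and panic, and the same thing can be forced at mount time with -o errors=remount-ro. None of that fixes the underlying corruption, of course; it only controls how the filesystem reacts to it.)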
In any case, given that you've checked the memory, it does rather seem to narrow it down to either SATA cables, the disk drives, or the SATA controller, roughly in that order of probability. The SATA cables are probably the cheapest to try replacing first. I suppose there is a chance that there it's a hardware device driver or kernel issue. You might want to ask on LKML or on the Ubuntu support forums if there are any known issues wit the nVidia SATA controller driver. Good luck, - Ted From caphrim007 at gmail.com Mon Jan 29 00:05:33 2007 From: caphrim007 at gmail.com (Tim Rupp) Date: Sun, 28 Jan 2007 18:05:33 -0600 Subject: filesystem becomming read only In-Reply-To: <20070128231013.GB2442@thunk.org> References: <45BD25D4.4030209@gmail.com> <20070128231013.GB2442@thunk.org> Message-ID: <45BD3A4D.3080302@gmail.com> Thanks Ted, I'll go through that list and try swapping the original parts with spares that I have around home. I've run fsck since the problem started occurring and it _has_ found problems with the filesystem. I don't have the output on hand, but I can definitely make the filesystem go read-only again. When I do, I can send another mail with the attached output from the fsck. Maybe it will help to find the problem. I'll also try the LKML and Ubuntu forums. Thanks a lot! -Tim Theodore Tso wrote: > On Sun, Jan 28, 2007 at 04:38:12PM -0600, Tim Rupp wrote: >> I'm looking for advice/help in tracking down a problem with a new system >> I've purchased. >> >> I have a beige box server with a Gigabyte GA-M51GM-S2G motherboard. It >> has the nVidia MCP51 SATA controller with 3 250 gig Western Digital hard >> drives attached to it. >> >> It seems that when doing a considerable amount of file writing, the >> filesystem will become read-only. See attached dmesg output. > > According to the dmesg output, the filesystem is getting remounted > read-only because the kernel detected an inconsistency in the block > allocation bitmaps. Basically, a block that was in use and getting > freed (due to a file getting deleted) was found to be already marked > as not in use in the block bitmap. This is very dangerous, since a > corrupted block allocation bitmap can result in data loss when a block > gets used by two different files, and the contents of part of the > first file gets overwritten by the second. Hence, ext3 remounted the > filesystem read-only in order to protect your data from getting (more) > corrupted. > > The question then is why is this happening. If you run e2fsck and it > finds nothing wrong, then that means it was the in-core memory that > was corrupted --- so the data was correct on disk, but when it was > read from disk to memory, it had gotten corrupted somehow (another > good reason for ext3 to mark the filesystem read-only; to prevent the > corrupted data from getting written back to disk). > > In any case, given that you've checked the memory, it does rather seem > to narrow it down to either SATA cables, the disk drives, or the SATA > controller, roughly in that order of probability. The SATA cables are > probably the cheapest to try replacing first. I suppose there is a > chance that there it's a hardware device driver or kernel issue. You > might want to ask on LKML or on the Ubuntu support forums if there are > any known issues wit the nVidia SATA controller driver. 
> > Good luck, > > - Ted > From tytso at mit.edu Mon Jan 29 01:24:28 2007 From: tytso at mit.edu (Theodore Tso) Date: Sun, 28 Jan 2007 20:24:28 -0500 Subject: filesystem becomming read only In-Reply-To: <45BD3A4D.3080302@gmail.com> References: <45BD25D4.4030209@gmail.com> <20070128231013.GB2442@thunk.org> <45BD3A4D.3080302@gmail.com> Message-ID: <20070129012428.GC24828@thunk.org> On Sun, Jan 28, 2007 at 06:05:33PM -0600, Tim Rupp wrote: > Thanks Ted, I'll go through that list and try swapping the original > parts with spares that I have around home. > > I've run fsck since the problem started occurring and it _has_ found > problems with the filesystem. I don't have the output on hand, but I can > definitely make the filesystem go read-only again. When I do, I can send > another mail with the attached output from the fsck. Maybe it will help > to find the problem. Well, the most important thing about the fsck error is to see whether it looks like a single bit error, or an entire block being corrupted, or a block getting written to the wrong location on disk. (The last two can be hard to differentiate, but you see ASCII text in an inode table block, or an block/inode bitmap, that's usually a good clue that it was the latter.) But at the end of the day, it looks like a hardware problem, and this won't necessarily tell you exactly what is to blame, so it's not a high priority thing to do. You could try using badblocks -w (warning, this is a distructive read/write test) or badblocks -n to see if you catch the disk doing something wrong, but it may be that creating a filesystem and then running your workload will be the best stress test. Unfortunately we don't have a good disk drive exerciser that exercises the disk with a lot of random access read/write and seek patterns in Linux, at least not as far as I know, anyway. Good luck, - Ted From adilger at clusterfs.com Mon Jan 29 04:17:26 2007 From: adilger at clusterfs.com (Andreas Dilger) Date: Sun, 28 Jan 2007 21:17:26 -0700 Subject: filesystem becomming read only In-Reply-To: <45BD25D4.4030209@gmail.com> References: <45BD25D4.4030209@gmail.com> Message-ID: <20070129041726.GW5236@schatzie.adilger.int> On Jan 28, 2007 16:38 -0600, Tim Rupp wrote: > I checked to see if it was perhaps bad memory by running memtest86+, but > after 14 hours no errors were found. I've heard in the past that you need to run memtest86 for at least a day or two to be sure about that. Another option (if you have multiple sticks of RAM) is to take half of it out, see if the problem still happens (when running with your reproducer), repeat until you've isolated it to one or more sticks of RAM. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From laurentsebag at free.fr Mon Jan 29 22:16:10 2007 From: laurentsebag at free.fr (Laurent Sebag) Date: Mon, 29 Jan 2007 23:16:10 +0100 Subject: seeking a developper documentation for jbd and ext3 Message-ID: <45BE722A.4090902@free.fr> Hi, I am a student in computer science and I develop a program that tries to explain other students the mecanisms of the ext3 filesystem : we show the content of each structure and explain what it means. But I was unable to find a developper documentation for the jounalizing functionality (jbd). Could you please tell me where can I find one ? ( in english or in french ) Also a documentation for the ext3 filesystem would be great. 
Thanks, Laurent Sebag From kernel at crazytrain.com Tue Jan 30 02:49:26 2007 From: kernel at crazytrain.com (farmerdude) Date: Mon, 29 Jan 2007 21:49:26 -0500 Subject: seeking a developper documentation for jbd and ext3 In-Reply-To: <45BE722A.4090902@free.fr> References: <45BE722A.4090902@free.fr> Message-ID: <1170125366.8990.4.camel@oliver> Laurent, GOOGLE is your friend, or any search engine. You'll find; http://www.oreilly.com/catalog/linuxkernel2/chapter/ch17.pdf http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html regards, farmerdude On Mon, 2007-01-29 at 23:16 +0100, Laurent Sebag wrote: > Hi, > > I am a student in computer science and I develop a program that tries to > explain other students the mecanisms of the ext3 filesystem : we show > the content of each structure and explain what it means. > But I was unable to find a developper documentation for the jounalizing > functionality (jbd). > Could you please tell me where can I find one ? ( in english or in french ) > Also a documentation for the ext3 filesystem would be great. > > Thanks, > > Laurent Sebag > > _______________________________________________ > Ext3-users mailing list > Ext3-users at redhat.com > https://www.redhat.com/mailman/listinfo/ext3-users > From tushu1232 at gmail.com Wed Jan 31 06:30:36 2007 From: tushu1232 at gmail.com (tushar) Date: Wed, 31 Jan 2007 12:00:36 +0530 Subject: CHANGE IN THE struct ext3_dir_entry_2 IS SUGGESTED Message-ID: <200976900701302230y49924d22o5bfcac5fdc2aa53d@mail.gmail.com> well a change in the struct ext3_dir_entry_2 like ++ change in the structure struct ext33_dir_entry_2 { ++ union { __le32 inode; ++ struct ext33_inode *emb_i; ++ } u_emb_i; __le16 rec_len; /* Directory entry length */ __u8 name_len; /* Name length */ __u8 file_type; char name[EXT3_NAME_LEN]; /* File name */ }*de; initially the default access was through the *de which referenced only de->inode but the change is as follows well we have reflected the changes in the ext3 filesystem source code using the above structure (but only used the u_emb_i.inode) we just wanted to know is ther any change to be done in EXT3_DIR_REC_LEN macro #define EXT3_DIR_PAD 4 #define EXT3_DIR_ROUND (EXT3_DIR_PAD - 1) #define EXT3_DIR_REC_LEN(name_len) (((name_len) + 8 + EXT3_DIR_ROUND) & \ ~EXT3_DIR_ROUND) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sumanvg at cdactvm.in Wed Jan 24 04:34:59 2007 From: sumanvg at cdactvm.in (Suman V G) Date: Wed, 24 Jan 2007 10:04:59 +0530 Subject: ext3 journal from windows Message-ID: <000801c73f71$02a26540$0c1d10ac@rccf012> hai is there any way to view the contents of ext3 journal form windows? regards ______________________________________ Scanned and protected by Email scanner -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeni at scientist.com Mon Jan 29 15:43:54 2007 From: evgeni at scientist.com (Evgeni) Date: Mon, 29 Jan 2007 07:43:54 -0800 (PST) Subject: Can't mount /home anymore In-Reply-To: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> References: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> Message-ID: <8691524.post@talk.nabble.com> fsck can't help you because bitmaps are damaged, but there is a way to recover your files. 1. Prepair enough space on another partition and create directory where to put recovered files. 2. Boot linux. (for example use Rescue CD or Knoppix Live CD) 3. 
Run debugfs in catastrophic mode (-c option):

   debugfs -c /dev/hdaX

   Catastrophic mode does not read the inode and group bitmaps. If your
   superblock is damaged, consider using the -s (superblock) and -b (block
   size) options to specify a backup superblock (the block size and backup
   superblock locations can be found with dumpe2fs).

4. Inside the debugfs shell run:

   rdump directory_to_recover directory_for_recovered_files

   directory_to_recover is on the damaged partition;
   directory_for_recovered_files is on your active partition (from step 1
   above). For example:

   rdump /home /tmp/recovery

   This will copy the /home directory and all its content, including
   subdirectories and files, to /tmp/recovery.

--
View this message in context: http://www.nabble.com/Can%27t-mount--home-anymore-tf2951542.html#a8691524
Sent from the Ext3 - User mailing list archive at Nabble.com.
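To tie this back to the /dev/sda6 case earlier in the thread, the whole sequence might look roughly like this - an untested sketch only: 4096 and 32768 are just the common defaults (substitute whatever dumpe2fs actually reports for that filesystem), and /mnt/recovery is an example destination on a healthy partition:

# dumpe2fs /dev/sda6 | grep -i superblock
# mkdir /mnt/recovery
# debugfs -c -b 4096 -s 32768 /dev/sda6
debugfs:  rdump / /mnt/recovery
debugfs:  quit

If rdump of the root directory does not work, it can be run per top-level directory instead (rdump /somedir /mnt/recovery, one at a time).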