From sumanvg at cdactvm.in Thu Feb 1 04:13:25 2007
From: sumanvg at cdactvm.in (Suman V G)
Date: Thu, 1 Feb 2007 09:43:25 +0530
Subject: ext3 journal from windows
References: <001801c73f74$185972e0$0c1d10ac@rccf012> <20070124225137.GT5236@schatzie.adilger.int>
Message-ID: <003501c745b7$5264fbf0$0c1d10ac@rccf012>

can you just brief out how compiling of e2fsprogs helps in viewing the ext3 journal. I have downloaded e2fsprogs and i find a journal file in that,but i dont understand what that file is about.pls help

regards
suman

______________________________________
Scanned and protected by Email scanner

From marcop123 at gmail.com Thu Feb 1 15:29:39 2007
From: marcop123 at gmail.com (Marco Polo)
Date: Thu, 1 Feb 2007 10:29:39 -0500
Subject: corrupted FS!
Message-ID: <859813fd0702010729s21df49f3v402568b04b7086fb@mail.gmail.com>

I have an apple Xserve RAID attached to a Linux Box, serving one Raided LUN of 2500Gigs, it was functional until the box crashed. When I hard booted it, even though it mounted the FS, I've been unable to access it. Cd-ing to any of its directories gives "Input/Output error". The ls -la on the FS shows "?" for every field with no permission information. The /var/log/messages is flooded with:

Feb 1 10:12:21 *HOSTNAME* kernel: EXT3-fs error (device sda1): ext3_get_inode_loc: unable to read inode block - inode=178356225, block=356712450

An e2fsck gives

e2fsck 1.35 (28-Feb-2004)
/xraid: recovering journal
The filesystem size (according to the superblock) is 610476016 blocks
The physical size of the device is 73605104 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort?

It seems like the partition table is corrupted and not the superblock, and if so, what would be the right way of going about it?! I am reluctant to answer no to the abort question of e2fsck to try to fix it. Or should I

-M
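A note on the two sizes e2fsck printed above, offered as a hypothesis and not as a diagnosis: they differ by exactly 2^29 blocks. Assuming 4 KiB filesystem blocks (2500 GB divided by 610476016 blocks is about 4096 bytes) and 512-byte sectors, that gap is exactly 2 TiB, the most a 32-bit sector count can address. That would be consistent with the device being seen truncated at the 2 TiB boundary after the reboot, with the superblock itself intact. Nothing in the thread confirms this, and the variable names below are made up for illustration.

```python
# Figures quoted by e2fsck in the message above.
SUPERBLOCK_BLOCKS = 610_476_016  # filesystem size according to the superblock
DEVICE_BLOCKS = 73_605_104       # physical size of the device as e2fsck saw it

# Assumptions, not stated in the thread: 4 KiB blocks, 512-byte sectors.
BLOCK_SIZE = 4096
SECTOR_SIZE = 512

missing_blocks = SUPERBLOCK_BLOCKS - DEVICE_BLOCKS
missing_bytes = missing_blocks * BLOCK_SIZE

# Largest device size that fits in a 32-bit count of sectors.
wrap_bytes = 2**32 * SECTOR_SIZE

print("filesystem size :", SUPERBLOCK_BLOCKS * BLOCK_SIZE, "bytes")
print("device size     :", DEVICE_BLOCKS * BLOCK_SIZE, "bytes")
print("missing         :", missing_bytes, "bytes")
print("exactly 2 TiB?  :", missing_bytes == wrap_bytes)
```

If the last line prints True, checking how the kernel, driver and partition table report the LUN size is probably worth doing before letting e2fsck change anything.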
From nicdnicd at gmail.com Fri Feb 2 15:13:47 2007
From: nicdnicd at gmail.com (Nickel Cadmium)
Date: Fri, 2 Feb 2007 16:13:47 +0100
Subject: Can't mount /home anymore
In-Reply-To: <9ec348a90701200301t11d1c133m926b38a8a1326b31@mail.gmail.com>
References: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> <9ec348a90701200301t11d1c133m926b38a8a1326b31@mail.gmail.com>
Message-ID: <9ec348a90702020713naf4d5fv73e8b60352bda53e@mail.gmail.com>

Hi!

I'm still stuck with my unmountable home partition.
Would it be possible to mount it using a backup block somehow?

Cd

On 1/20/07, Nickel Cadmium wrote:
>
> Hi Christian (& all)!
>
> Thanks for the reply. I was away for some time but here is the extra
> information you requested.
>
> Yes, after the message "fsck.ext3: e2fsck_read_bitmaps: illegal bitmap
> block(s) for /home", fsck just stops.
> The command 'fsck.ext3 /dev/sda6; echo $?' returns the value 8. Looking at
> the man page for fsck, I found that this is an "Operational error". I have
> totally no clue what this means.
>
> With fsck, nothing is reported in the syslog file. If I try mounting the
> partition, I get the following errors reported:
> Jan 20 11:43:57 localhost kernel: EXT3-fs error (device sda6):
> ext3_check_descriptors: Inode bitmap for group 522 not in group (block
> 3271884801)!
> Jan 20 11:43:57 localhost kernel: EXT3-fs: group descriptors corrupted !
>
> I could dd the partition without errors. I did copy the partition two
> times already, I order to be able to try some recovery on it. With
> converting a copy to ext2 and running "fsck.ext2 -v -y" on it (in
> something like two days), I was able to get some files (all?) in the
> lost+found. However, the file names are lost and the directory structure as
> well. It's hard to tell which file is what.
> I'm really wondering if there is a way to mount that partition again.
>
> I run Mandriva on a Pentium PC. My kernel is 2.6.17-5mdv.
> However, I first
> thought than my /home problem was some kind of booting problem. Thus I
> upgraded from Mandriva 2006 to Mandriva 2007. This means that I don't know
> what my kernel was when the problem occurred. It should be 2.6.12 as this
> was a straight out-of-the-box installation.
> My fsck version is "e2fsck 1.39".
>
> Best wishes,
> Cd
>
> On 1/14/07, Christian Kujau wrote:
> >
> > On Wed, 10 Jan 2007, Nickel Cadmium wrote:
> > > # fsck.ext3 /dev/sda6
> > > e2fsck 1.39 (29-May-2006)
> > > Group descriptors look bad... trying backup blocks...
> > > Inode bitmap for group 522 is not in group. (block 3271884801)
> > > Relocate? yes
> > >
> > > fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home
> >
> > ...and after this message, fsck.ext3 just stops? What's the exit code of
> > fsck.ext3? (e.g. 'fsck.ext3 /dev/sda6; echo $?'). Try " fsck.ext3 -v"
> > for more details. Is there anything related in your syslog? Can you dd(1)
> > the device (read! not write! :)) without errors?
> >
> > Which kernel/arch are you running?
> >
> > Christian.
> > --
> > BOFH excuse #99:
> >
> > SIMM crosstalk.
> >

From richard.c.wolber at boeing.com Fri Feb 2 15:31:33 2007
From: richard.c.wolber at boeing.com (Wolber, Richard C)
Date: Fri, 2 Feb 2007 07:31:33 -0800
Subject: Can't mount /home anymore
In-Reply-To: <9ec348a90702020713naf4d5fv73e8b60352bda53e@mail.gmail.com>
Message-ID: <8C7C41A176AC0B468BEFB2EFD9BDAB990242691C@XCH-NW-5V2.nw.nos.boeing.com>

Before you do this, make double sure you have a backup of your disk volume. It can, and probably will, damage some or all of your filesystem.

First you need to find your backup superblocks.
You can calculate them based on the filesystem block size, but I find that it's easier to just do the following:

[root at server ~]# dumpe2fs /dev/sda6 | grep Backup
dumpe2fs 1.35 (28-Feb-2004)
  Backup superblock at 8193, Group descriptors at 8194-8194
  Backup superblock at 24577, Group descriptors at 24578-24578
  Backup superblock at 40961, Group descriptors at 40962-40962
  Backup superblock at 57345, Group descriptors at 57346-57346
  Backup superblock at 73729, Group descriptors at 73730-73730

Now that you have the backup superblocks, you have to replace the old superblock with a backup superblock:

e2fsck -b 8193 /dev/sda6

Then try to mount the filesystem. If it fails to mount, move on down to the next backup superblock (24577) and so on, until you run out of backup superblocks *OR* the filesystem mounts properly.

Once you get it mounted, recover what's left of your files to a safe place, wipe the drive, reformat it, restore your files and then think long and hard about getting a decent nightly backup solution in place!

..Chuck..

________________________________

From: Nickel Cadmium [mailto:nicdnicd at gmail.com]
Sent: Friday, February 02, 2007 7:14 AM
To: ext3-users at redhat.com
Subject: Re: Can't mount /home anymore

Hi!

I'm still stuck with my unmountable home partition.
Would it be possible to mount it using a backup block somehow?

Cd

On 1/20/07, Nickel Cadmium wrote:

Hi Christian (& all)!

Thanks for the reply. I was away for some time but here is the extra information you requested.

Yes, after the message "fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home", fsck just stops.
The command 'fsck.ext3 /dev/sda6; echo $?' returns the value 8. Looking at the man page for fsck, I found that this is an "Operational error". I have totally no clue what this means.

With fsck, nothing is reported in the syslog file.
If I try mounting the partition, I get the following errors reported:
Jan 20 11:43:57 localhost kernel: EXT3-fs error (device sda6): ext3_check_descriptors: Inode bitmap for group 522 not in group (block 3271884801)!
Jan 20 11:43:57 localhost kernel: EXT3-fs: group descriptors corrupted !

I could dd the partition without errors. I did copy the partition two times already, I order to be able to try some recovery on it. With converting a copy to ext2 and running "fsck.ext2 -v -y" on it (in something like two days), I was able to get some files (all?) in the lost+found. However, the file names are lost and the directory structure as well. It's hard to tell which file is what.
I'm really wondering if there is a way to mount that partition again.

I run Mandriva on a Pentium PC. My kernel is 2.6.17-5mdv. However, I first thought than my /home problem was some kind of booting problem. Thus I upgraded from Mandriva 2006 to Mandriva 2007. This means that I don't know what my kernel was when the problem occurred. It should be 2.6.12 as this was a straight out-of-the-box installation.
My fsck version is "e2fsck 1.39".

Best wishes,
Cd

On 1/14/07, Christian Kujau wrote:

On Wed, 10 Jan 2007, Nickel Cadmium wrote:
> # fsck.ext3 /dev/sda6
> e2fsck 1.39 (29-May-2006)
> Group descriptors look bad... trying backup blocks...
> Inode bitmap for group 522 is not in group. (block 3271884801)
> Relocate? yes
>
> fsck.ext3: e2fsck_read_bitmaps: illegal bitmap block(s) for /home

...and after this message, fsck.ext3 just stops? What's the exit code of fsck.ext3? (e.g. 'fsck.ext3 /dev/sda6; echo $?'). Try " fsck.ext3 -v" for more details. Is there anything related in your syslog? Can you dd(1) the device (read! not write! :)) without errors?

Which kernel/arch are you running?

Christian.
--
BOFH excuse #99:

SIMM crosstalk.
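Chuck's reply above mentions that the backup superblock locations can be calculated instead of read from dumpe2fs. A minimal sketch of that calculation follows, assuming the filesystem uses the sparse_super feature (backups only in groups 0, 1 and powers of 3, 5 and 7). The function names are made up for illustration, and the blocks-per-group and first-block values have to come from the real filesystem (dumpe2fs -h prints them).

```python
def is_power_of(n, base):
    """Return True if n equals base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def group_has_backup(group):
    """With sparse_super, backups live in groups 0, 1 and powers of 3, 5, 7."""
    return group <= 1 or any(is_power_of(group, b) for b in (3, 5, 7))


def backup_superblocks(blocks_per_group, first_data_block, group_count):
    """List the block numbers of the backup superblocks (group 0 holds the primary)."""
    return [
        first_data_block + group * blocks_per_group
        for group in range(1, group_count)
        if group_has_backup(group)
    ]


# Values matching the dumpe2fs example above: 1 KiB blocks give
# 8192 blocks per group and a first data block of 1.
print(backup_superblocks(blocks_per_group=8192, first_data_block=1, group_count=10))
```

With 4 KiB blocks (32768 blocks per group, first data block 0) the same function gives 32768 as the first backup. A filesystem made without sparse_super keeps a backup in every group, which this sketch does not cover.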
From lists at nerdbynature.de Fri Feb 2 20:38:02 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Fri, 2 Feb 2007 20:38:02 +0000 (GMT)
Subject: corrupted FS!
In-Reply-To: <859813fd0702010729s21df49f3v402568b04b7086fb@mail.gmail.com>
References: <859813fd0702010729s21df49f3v402568b04b7086fb@mail.gmail.com>
Message-ID:

On Thu, 1 Feb 2007, Marco Polo wrote:
> of its directories gives "Input/Output error". The ls -la on the FS shows
> "?" for every field with no permission information. The /var/log/messages is
> flooded with:
>
> Feb 1 10:12:21 *HOSTNAME* kernel: EXT3-fs error (device sda1):
> ext3_get_inode_loc: unable to read inode block - inode=178356225, block=356712450

hm, are there device related messages in the syslog? "unable to read inode block" sounds like general i/o errors to me.

> e2fsck 1.35 (28-Feb-2004)

can you try a more current version? e2fsprogs.sf.net is at v1.39

> It seems like the partition table is corrupted and not the superblock, and
> if so, what would be the right way of going about it?!

sorry to repeat myself, but at first I'd make sure that there are no device errors. if this can be ruled out: do you have spare 250GB to backup your (corrupt) fs? if yes - please do, then use the current fsck to try to repair the fs.

C.
--
BOFH excuse #45:

virus attack, luser responsible

From lists at nerdbynature.de Fri Feb 2 20:54:06 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Fri, 2 Feb 2007 20:54:06 +0000 (GMT)
Subject: ext3 journal from windows
In-Reply-To: <003501c745b7$5264fbf0$0c1d10ac@rccf012>
References: <001801c73f74$185972e0$0c1d10ac@rccf012> <20070124225137.GT5236@schatzie.adilger.int> <003501c745b7$5264fbf0$0c1d10ac@rccf012>
Message-ID:

On Thu, 1 Feb 2007, Suman V G wrote:
> can you just brief out how compiling of e2fsprogs helps in viewing the ext3
> journal.
> I have downloaded e2fsprogs and i find a journal file in that,but i
> dont understand what that file is about.pls help

viewing the ext3 journal has come up on this list quite a few times:
https://www.redhat.com/archives/ext3-users/2006-June/msg00005.html

However, it's not dumping into a human-readable format. what are you trying to achieve?

C.
--
BOFH excuse #45:

virus attack, luser responsible

From lists at nerdbynature.de Sat Feb 3 00:30:08 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Sat, 3 Feb 2007 00:30:08 +0000 (GMT)
Subject: corrupted FS!
In-Reply-To: <859813fd0702021400q7c34b60fo6ff2e900c66a0c51@mail.gmail.com>
References: <859813fd0702010729s21df49f3v402568b04b7086fb@mail.gmail.com> <859813fd0702021400q7c34b60fo6ff2e900c66a0c51@mail.gmail.com>
Message-ID:

On Fri, 2 Feb 2007, Marco Polo wrote:
> Feb 2 11:49:15 backup kernel: EXT3-fs warning (device sda1):
> ext3_clear_journal_err: Filesystem error recorded from previous mount:
> IO failure

well, that's the one spot where it says "io failure". from looking at the other places in fs/ext3/inode.c I cannot tell for sure if they are related to IO errors though...

> Feb 2 11:49:15 backup kernel: EXT3-fs warning (device sda1):
> ext3_clear_journal_err: Marking fs in need of filesystem check.
> Feb 2 11:49:15 backup kernel: EXT3-fs warning: mounting fs with errors,
> running e2fsck is recommended

so, mounting succeeds and even recovery is fine. looks like filesystem structure is intact.

> Feb 2 11:49:15 backup kernel: SELinux: initialized (dev sda1, type ext3),
> uses xattr
> Feb 2 11:49:23 backup kernel: EXT3-fs error (device sda1):
> ext3_get_inode_loc: unable to read inode block - inode=127238145,
> block=254476290

...yet e2fsck fails to repair the fs? have you tried the current version? but anyway: since you can mount the fs: try to mount it ro, probably without fancy options and try to backup your data if you did not already.

btw, which kernel/distribution is this?
does a more current kernel help? (it won't help when I/O errors really are to blame)

Christian.
--
BOFH excuse #153:

Big to little endian conversion error

From tytso at mit.edu Tue Feb 6 03:36:45 2007
From: tytso at mit.edu (Theodore Tso)
Date: Mon, 5 Feb 2007 22:36:45 -0500
Subject: CHANGE IN THE struct ext3_dir_entry_2 IS SUGGESTED
In-Reply-To: <200976900701302230y49924d22o5bfcac5fdc2aa53d@mail.gmail.com>
References: <200976900701302230y49924d22o5bfcac5fdc2aa53d@mail.gmail.com>
Message-ID: <20070206033645.GC11018@thunk.org>

On Wed, Jan 31, 2007 at 12:00:36PM +0530, tushar wrote:
> well a change in the struct ext3_dir_entry_2 like
>
> ++ change in the structure
>
> struct ext33_dir_entry_2 {
>
> ++ union {
>         __le32 inode;
> ++      struct ext33_inode *emb_i;
>
> ++ } u_emb_i;
>
>         __le16 rec_len; /* Directory entry length */
>         __u8 name_len; /* Name length */
>         __u8 file_type;
>         char name[EXT3_NAME_LEN]; /* File name */
>
> }*de;

This change doesn't make any sense to me. The ext3_dir_entry_2 data structure reflects an on-disk layout. As such putting pointers into an on-disk data structure isn't particularly useful.

If the goal is to make an incompatible change to the filesystem format to store the inode embedded into the directory structure instead of in the inode table, it's something that I've thought about, and in fact there was a Usenix paper exploring this idea about ten years ago. The hard part with doing something like this is managing hard links, particularly when an inode is originally created in one directory, hardlinked in another directory, and then the original directory entry is removed. There are ways of dealing with it, but it's non-trivial.

In any case, an incompatible change like this is not something that would be made to ext3. It's something that potentially could be considered for ext4, but as I said, there are a lot of issues that would have to be considered and thought through first.
Probably better to move that sort of discussion to the linux-ext4 mailing list on vger.kernel.org.

Regards,

- Ted

From nicdnicd at gmail.com Tue Feb 6 18:31:18 2007
From: nicdnicd at gmail.com (Nickel Cadmium)
Date: Tue, 6 Feb 2007 19:31:18 +0100
Subject: Can't mount /home anymore
In-Reply-To: <8691524.post@talk.nabble.com>
References: <9ec348a90701100143i4b085c43v5340d3c192c30384@mail.gmail.com> <8691524.post@talk.nabble.com>
Message-ID: <9ec348a90702061031m7c20c516p73f98b1be6f5aed7@mail.gmail.com>

Hi!

Thanks a lot for your mail: I managed to recover my files!
I don't really know how debugfs works but with the rdump command it could copy all my files to a new partition.

Cheers,
Cd

On 1/29/07, Evgeni wrote:
>
>
> fsck can't help you because bitmaps are damaged,
> but there is a way to recover your files.
>
> 1. Prepair enough space on another partition
> and create directory where to put recovered files.
>
> 2. Boot linux.
> (for example use Rescue CD or Knoppix Live CD)
>
> 3. Run debugfs in catastrophic mode (-c option) :
> debugfs -c /dev/hdaX
> catastrophic mode does not read inode and group bitmaps
> if your superblock is damaged consider using -s (superblock) and -b (block
> size) options
> to specify backup superblock
> (the block size and superblock locations can be found by dumpe2fs)
>
> 4. Inside debugfs shell run:
> rdump directory_to_recover directory_for_recovered_files
> directory_to_recover is in damaged partition
> directory_for_recovered_files is in your active partition (from step 1
> above)
>
> for example:
> rdump /home /tmp/recovery
> This will copy /home directory and all it's content including
> subdirectories
> and files to /tmp/recovery.
> --
> View this message in context:
> http://www.nabble.com/Can%27t-mount--home-anymore-tf2951542.html#a8691524
> Sent from the Ext3 - User mailing list archive at Nabble.com.
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users
>

From sev at bnl.gov Fri Feb 16 16:25:19 2007
From: sev at bnl.gov (Sev Binello)
Date: Fri, 16 Feb 2007 11:25:19 -0500
Subject: how does ext3 handle no communication to storage
In-Reply-To: <20060828205822.GB4944@thunk.org>
References: <44F33E3A.8020805@bnl.gov> <20060828205822.GB4944@thunk.org>
Message-ID: <45D5DAEF.5020702@bnl.gov>

Hi -

We had the conversation below a while back.
I was wondering if a patch was ever put together to address this issue

Thanks
-Sev

--

Sev Binello
Brookhaven National Laboratory
Upton, New York
631-344-5647
sev at bnl.gov

Theodore Tso wrote:
> On Mon, Aug 28, 2006 at 03:04:26PM -0400, Sev Binello wrote:
>
>> Can anyone tell us what the expected behavior is,
>> in the event that ext3 loses total contact with the storage system ?
>>
>> We have found that the file system is put into read only mode,
>> it is then found to contain errors, and requires an fsck.
>> Sometimes the fsck finds numerous (some serious looking) errors,
>> and that running without fsck doesn't seem like a safe option.
>>
>> We are trying to understand why exactly this is.
>> Why do we get errors ? Why serious ones ?
>>
>>
>
> The filesystem should go read-only when you try to modify it.
> HOWEVER, the problem comes when connectivity is restored. When an
> attempt to modify the filesystem fails, the journal is aborted and an
> I/O is returned. However, there may be modified blocks left hanging
> about in the buffer cache before the kernel realized that connectivity
> has been lost, and what we need to do is to make sure that all dirty
> blocks in the buffer cache and page cache are dropped.
>
> Basically, if I'm right, this is a bug, which we need to fix.
> That patch would require flushing all modified buffers and page cache pages
> when the filesystem goes read-only. The modified buffers is the more
> important thing, since that's what causes the filesystem corruption,
> although for correctness's sake we should be flushing any modified
> page cache pages as well. I don't have time to code this right now,
> but I'll try to get a patch out to relatively soonish, if you're
> willing to try it to see if it addresses your observed problem.
>
> - Ted
>
>

From mvolaski at aecom.yu.edu Sat Feb 17 06:32:08 2007
From: mvolaski at aecom.yu.edu (Maurice Volaski)
Date: Sat, 17 Feb 2007 01:32:08 -0500
Subject: Filesystem won't mount because of "unsupported optional features (80)"
In-Reply-To: <20070216170007.A75AD73189@hormel.redhat.com>
References: <20070216170007.A75AD73189@hormel.redhat.com>
Message-ID:

I made a filesystem (mke2fs -j) on a logical volume under kernel 2.6.20 on a 64-bit based system, and when I try to mount it, ext3 complains with

EXT3-fs: dm-1: couldn't mount because of unsupported optional features (80).

I first thought I just forgot to make the filesystem, so I remade it and the error is still present. I ran fsck on this freshly made filesystem, and it completed with no errors.
------------------------------------
Here's the dumpe2fs -h info:

dumpe2fs 1.39 (29-May-2006)
Filesystem volume name:
Last mounted on:
Filesystem UUID:          a4b7ee96-4aa9-4312-9ec9-91059539ece5
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal resize_inode dir_index filetype 64bit sparse_super large_file
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              393216000
Block count:              786432000
Reserved block count:     39321600
Free blocks:              774032395
Free inodes:              393215989
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16384
Inode blocks per group:   512
Filesystem created:       Fri Feb 16 20:00:51 2007
Last mount time:          n/a
Last write time:          Fri Feb 16 20:36:07 2007
Mount count:              0
Maximum mount count:      23
Last checked:             Fri Feb 16 20:36:07 2007
Check interval:           15552000 (6 months)
Next check after:         Wed Aug 15 21:36:07 2007
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      a00b379c-0028-4659-93e8-b0a9198ed6c0
Journal backup:           inode blocks
Journal size:             128M

------------------------------------
Here's the lvdisplay output on the logical volume:

--- Logical volume ---
LV Name                /dev/vgw/lvhall
VG Name                vgw
LV UUID                eq696g-YyLS-xkwV-XGQf-ri1i-4bTK-RaN5s3
LV Write Access        read/write
LV Status              available
# open                 0
LV Size                2.93 TB
Current LE             768000
Segments               2
Allocation             inherit
Read ahead sectors     0
Block device           254:1

What could be causing this error and what does "80" refer to?
--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F.
Kennedy Center
Albert Einstein College of Medicine of Yeshiva University

From adilger at clusterfs.com Sat Feb 17 07:28:27 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Sat, 17 Feb 2007 00:28:27 -0700
Subject: Filesystem won't mount because of "unsupported optional features (80)"
In-Reply-To:
References: <20070216170007.A75AD73189@hormel.redhat.com>
Message-ID: <20070217072827.GB10715@schatzie.adilger.int>

On Feb 17, 2007 01:32 -0500, Maurice Volaski wrote:
> I made a filesystem (mke2fs -j) on a logical volume under kernel
> 2.6.20 on a 64-bit based system, and when I try to mount it, ext3
> complains with
>
> EXT3-fs: dm-1: couldn't mount because of unsupported optional features (80).
>
> I first thought I just forgot to make the filesystem, so I remade it
> and the error is still present. I ran fsck on this freshly made
> filesystem, and it completed with no errors.
>
> Filesystem features: has_journal resize_inode dir_index filetype
> 64bit sparse_super large_file

> What could be causing this error and what does "80" refer to?

For some reason the 64bit feature is set in the filesystem. This should only be set for ext4dev filesystems. Are you specifying this feature yourself e.g. "mke2fs -O 64bit ..."? Your filesystem is only 2.7TB so you don't need the 64bit feature set and it doesn't really help you unless you plan to expand the logical volume past 16TB.

If you do need this functionality then you need to use the ext4dev filesystem, but this is currently a work-in-progress and shouldn't be used for critical data.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
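To make the "80" in the mount error concrete: the kernel prints the unsupported feature bits as a hexadecimal mask, and 0x80 is the bit for the 64bit feature named in the reply above. A small sketch that decodes such a mask follows. The bit values are the incompatible-feature flags as I know them from the ext3/ext4 kernel headers, the function name is made up for illustration, and the table is not a complete list.

```python
# Incompatible-feature bits from the ext3/ext4 headers (partial list;
# treat any bit not in this table as unknown).
INCOMPAT_FLAGS = {
    0x0001: "compression",
    0x0002: "filetype",
    0x0004: "needs_recovery",
    0x0008: "journal_dev",
    0x0010: "meta_bg",
    0x0040: "extents",
    0x0080: "64bit",
    0x0100: "mmp",
    0x0200: "flex_bg",
}


def decode_incompat(mask):
    """Return the feature names set in mask, plus hex strings for unknown bits."""
    names = [name for bit, name in sorted(INCOMPAT_FLAGS.items()) if mask & bit]
    known = 0
    for bit in INCOMPAT_FLAGS:
        known |= bit
    unknown = mask & ~known
    if unknown:
        names.append(hex(unknown))
    return names


# The number in the kernel message is hexadecimal, so parse it as base 16.
print(decode_incompat(int("80", 16)))
```

Run on the value from the error message, this prints a list containing only "64bit", matching the feature that dumpe2fs shows in the features line.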
From mvolaski at aecom.yu.edu Sat Feb 17 08:38:02 2007
From: mvolaski at aecom.yu.edu (Maurice Volaski)
Date: Sat, 17 Feb 2007 03:38:02 -0500
Subject: Filesystem won't mount because of "unsupported optional features (80)"
In-Reply-To: <20070217072827.GB10715@schatzie.adilger.int>
References: <20070216170007.A75AD73189@hormel.redhat.com> <20070217072827.GB10715@schatzie.adilger.int>
Message-ID:

>On Feb 17, 2007 01:32 -0500, Maurice Volaski wrote:
>> I made a filesystem (mke2fs -j) on a logical volume under kernel
>> 2.6.20 on a 64-bit based system, and when I try to mount it, ext3
>> complains with
>>
>> EXT3-fs: dm-1: couldn't mount because of unsupported optional features (80).
>>
>> I first thought I just forgot to make the filesystem, so I remade it
>> and the error is still present. I ran fsck on this freshly made
>> filesystem, and it completed with no errors.
>>
>> Filesystem features: has_journal resize_inode dir_index filetype
>> 64bit sparse_super large_file
>
>> What could be causing this error and what does "80" refer to?
>
>For some reason the 64bit feature is set in the filesystem. This should
>only be set for ext4dev filesystems. Are you specifying this feature
>yourself e.g. "mke2fs -O 64bit ..."? Your filesystem is only 2.7TB so
>you don't need the 64bit feature set and it doesn't really help you
>unless you plan to expand the logical volume past 16TB.
>
>If you do need this functionality then you need to use the ext4dev
>filesystem, but this is currently a work-in-progress and shouldn't
>be used for critical data.
>

I have no idea how it got there. I make the filesystem with mke2fs -j and the mke2fs.conf lists "base_features = sparse_super,filetype,resize_inode,dir_index". I do not have ext4dev compiled into the kernel at all.
My other filesystems, all under 2 TB, seem to be working, so I wonder whether somehow some ext4dev got erroneously added to ext3, and it's being applied to filesystems that are greater than 2 TB and are being compiled with 64-bit gcc, just a wild guess on my part.
--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University

From mvolaski at aecom.yu.edu Sat Feb 17 22:53:56 2007
From: mvolaski at aecom.yu.edu (Maurice Volaski)
Date: Sat, 17 Feb 2007 17:53:56 -0500
Subject: Filesystem won't mount because of "unsupported optional features (80)"
In-Reply-To: <20070217072827.GB10715@schatzie.adilger.int>
References: <20070216170007.A75AD73189@hormel.redhat.com> <20070217072827.GB10715@schatzie.adilger.int>
Message-ID:

>> Filesystem features: has_journal resize_inode dir_index filetype
>> 64bit sparse_super large_file
>
>> What could be causing this error and what does "80" refer to?
>
>For some reason the 64bit feature is set in the filesystem. This should
>only be set for ext4dev filesystems. Are you specifying this feature
>yourself e.g. "mke2fs -O 64bit ..."? Your filesystem is only 2.7TB so
>you don't need the 64bit feature set and it doesn't really help you
>unless you plan to expand the logical volume past 16TB.
>

Indeed, it appears that my initial assumption is correct. mke2fs contains ext4dev code and it erroneously applies the 64bit flag to filesystems > 2 TB by default. I confirmed this by creating a 1999G filesystem and it didn't get the 64bit flag. I then resized it to 3000G, and luckily resize2fs doesn't contain this code or attempt to adjust the flags. The filesystem checked OK in fsck and is now mounting properly.
--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F.
Kennedy Center
Albert Einstein College of Medicine of Yeshiva University

From tytso at mit.edu Sun Feb 18 01:53:27 2007
From: tytso at mit.edu (Theodore Tso)
Date: Sat, 17 Feb 2007 20:53:27 -0500
Subject: Filesystem won't mount because of "unsupported optional features (80)"
In-Reply-To:
References: <20070216170007.A75AD73189@hormel.redhat.com> <20070217072827.GB10715@schatzie.adilger.int>
Message-ID: <20070218015327.GA923@thunk.org>

On Sat, Feb 17, 2007 at 05:53:56PM -0500, Maurice Volaski wrote:
> Indeed, it appears that my initial assumption is correct. mke2fs
> contains ext4dev code and it erroneously applies the 64bit flag to
> filesystems > 2 TB by default.

Are you using the e2fsprogs-interim patchset version e2fsprogs-1.39-tyt1? It should only be used for ext4dev development, and nothing else. I'll look at fixing that bug in e2fsprogs-1.39-tyt1, but that code base has a bunch of interim patches that I'm currently in the process of reworking for e2fsprogs mainline inclusion; it's there only for the convenience of ext4 developers. I've added a README to make this clear for future users.

Regards,

- Ted

From mvolaski at aecom.yu.edu Sun Feb 18 02:57:20 2007
From: mvolaski at aecom.yu.edu (Maurice Volaski)
Date: Sat, 17 Feb 2007 21:57:20 -0500
Subject: Filesystem won't mount because of "unsupported optional features (80)"
In-Reply-To: <20070218015327.GA923@thunk.org>
References: <20070216170007.A75AD73189@hormel.redhat.com> <20070217072827.GB10715@schatzie.adilger.int> <20070218015327.GA923@thunk.org>
Message-ID:

>On Sat, Feb 17, 2007 at 05:53:56PM -0500, Maurice Volaski wrote:
>> Indeed, it appears that my initial assumption is correct. mke2fs
>> contains ext4dev code and it erroneously applies the 64bit flag to
>> filesystems > 2 TB by default.
>
>Are you using the e2fsprogs-interim patchset version
>e2fsprogs-1.39-tyt1? It should only be used for ext4dev development,
>and nothing else.
>I'll look at fixing that bug in
>e2fsprogs-1.39-tyt1, but that code base has a bunch of interim patches
>that I'm currently in the process of reworking for e2fsprogs mainline
>inclusion; it's there only for the convenience of ext4 developers.
>

I'm using Gentoo and it has e2fsprogs-1.39-r1. It says at http://bugs.gentoo.org/show_bug.cgi?id=156697 that this version contains a number of ext4 patches, but I'm not certain if it is equivalent to tyt1 or not.
--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University

From adilger at clusterfs.com Sun Feb 18 08:04:24 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Sun, 18 Feb 2007 01:04:24 -0700
Subject: how does ext3 handle no communication to storage
In-Reply-To: <45D5DAEF.5020702@bnl.gov>
References: <44F33E3A.8020805@bnl.gov> <20060828205822.GB4944@thunk.org> <45D5DAEF.5020702@bnl.gov>
Message-ID: <20070218080423.GF10715@schatzie.adilger.int>

On Feb 16, 2007 11:25 -0500, Sev Binello wrote:
> Theodore Tso wrote:
> >On Mon, Aug 28, 2006 at 03:04:26PM -0400, Sev Binello wrote:
> >>Can anyone tell us what the expected behavior is,
> >>in the event that ext3 loses total contact with the storage system ?
> >>
> >>We have found that the file system is put into read only mode,
> >>it is then found to contain errors, and requires an fsck.
> >>Sometimes the fsck finds numerous (some serious looking) errors,
> >>and that running without fsck doesn't seem like a safe option.
> >>
> >>We are trying to understand why exactly this is.
> >>Why do we get errors ? Why serious ones ?
> >
> >The filesystem should go read-only when you try to modify it.
> >HOWEVER, the problem comes when connectivity is restored. When an
> >attempt to modify the filesystem fails, the journal is aborted and an
> >I/O is returned.
> >However, there may be modified blocks left hanging
> >about in the buffer cache before the kernel realized that connectivity
> >has been lost, and what we need to do is to make sure that all dirty
> >blocks in the buffer cache and page cache are dropped.

In fact, there are a number of other places as well, like the elevator and IDE/SCSI/LVM layers that can be hung up on timeouts and retries for a long time. It would be nice if the filesystem could abort all pending IOs in the underlying layers.

> >Basically, if I'm right, this is a bug, which we need to fix. That
> >patch would require flushing all modified buffers and page cache pages
> >when the filesystem goes read-only. The modified buffers is the more
> >important thing, since that's what causes the filesystem corruption,
> >although for correctness's sake we should be flushing any modified
> >page cache pages as well. I don't have time to code this right now,
> >but I'll try to get a patch out to relatively soonish, if you're
> >willing to try it to see if it addresses your observed problem.

We talked at one time of marking the block device via set_device_ro(). That would prevent any of the blocks from being flushed out by the block layer.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From tytso at mit.edu Mon Feb 19 00:38:30 2007
From: tytso at mit.edu (Theodore Tso)
Date: Sun, 18 Feb 2007 19:38:30 -0500
Subject: Filesystem won't mount because of "unsupported optional features (80)"
In-Reply-To:
References: <20070216170007.A75AD73189@hormel.redhat.com> <20070217072827.GB10715@schatzie.adilger.int> <20070218015327.GA923@thunk.org>
Message-ID: <20070219003830.GB25490@thunk.org>

On Sat, Feb 17, 2007 at 09:57:20PM -0500, Maurice Volaski wrote:
> I'm using Gentoo and it has e2fsprogs-1.39-r1. It says at
> http://bugs.gentoo.org/show_bug.cgi?id=156697 that this version
> contains a number of ext4 patches, but I'm not certain if it is
> equivalent to tyt1 or not.
Sigh, could you feed back to Gentoo that the version of e2fsprogs it is fielding is clearly buggy? Thanks, - Ted From mvolaski at aecom.yu.edu Mon Feb 19 01:32:10 2007 From: mvolaski at aecom.yu.edu (Maurice Volaski) Date: Sun, 18 Feb 2007 20:32:10 -0500 Subject: Filesystem won't mount because of "unsupported optional features (80)" In-Reply-To: <20070219003830.GB25490@thunk.org> References: <20070216170007.A75AD73189@hormel.redhat.com> <20070217072827.GB10715@schatzie.adilger.int> <20070218015327.GA923@thunk.org> <20070219003830.GB25490@thunk.org> Message-ID: >On Sat, Feb 17, 2007 at 09:57:20PM -0500, Maurice Volaski wrote: >> I'm using Gentoo and it has e2fsprogs-1.39-r1. It says at >> http://bugs.gentoo.org/show_bug.cgi?id=156697 that this version >> contains a number of ext4 patches, but I'm not certain if it is >> equivalent to tyt1 or not. > >Sigh, could you feed back to Gentoo that the version of e2fsprogs it >is fielding is clearly buggy? Thanks, Done: http://bugs.gentoo.org/show_bug.cgi?id=167562 -- Maurice Volaski, mvolaski at aecom.yu.edu Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University From Hania.Yassin at motorola.com Tue Feb 20 20:45:34 2007 From: Hania.Yassin at motorola.com (Yassin Hania-CHY002) Date: Tue, 20 Feb 2007 15:45:34 -0500 Subject: Backing up ext3 root partition with dd Message-ID: <93C192319466824A822B9179032077E201401695@de01exm66.ds.mot.com> Is there a reason why an ext3 root partition cannot be copied to an alternate partition using the "dd" command? The dd is copying the mounted root partition into an alternate partition that is not mounted. 
The dd returns success, but the fsck on that partition fails with errors as follows: ----------------- fsck 1.37 (21-Mar-2005) /dev/Active_Update/root2: recovering journal /dev/Active_Update/root2: Clearing orphaned inode 57554 (uid=0, gid=0, mode=010600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57552 (uid=1010, gid=3002, mode=010600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57548 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57547 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57546 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57545 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57544 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57543 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57542 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57541 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57540 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57539 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57538 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57537 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Clearing orphaned inode 57536 (uid=0, gid=0, mode=020600, size=0) /dev/Active_Update/root2: Deleted inode 90369 has zero dtime. FIXED. /dev/Active_Update/root2: Inode 90370 is in use, but has dtime set. FIXED. /dev/Active_Update/root2: Inode 90371 is in use, but has dtime set. FIXED. /dev/Active_Update/root2: Inode 90372 is in use, but has dtime set. FIXED. /dev/Active_Update/root2: Inode 90373 is in use, but has dtime set. FIXED. /dev/Active_Update/root2: Inode 90373 has imagic flag set. 
retfrom fsck is 1024
-------------------------------

When mounting that partition, an error is logged to /var/log/messages as
follows:
EXT3-fs warning: mounting fs with errors, running e2fsck is recommended

We are seeing this on boxes running RHEL4 and even boxes running a PPC
version of Linux.
This failure is not seen on every run of the scripts, and is mostly seen
when mirroring the root partition.

Here is the script which reproduces this ext3 corruption after running
it 5-6 times:
------------------------------------------
#!/usr/bin/perl -w
my $ret;

##
## Determine the backup partitions
##
my @temp = `cat /etc/fstab | grep alt_root`;
my (@temp2, @alt_parts, @parts, @part);
my $cnt = 0;
foreach (@temp)
{
   @temp2 = split / /;
   push @alt_parts, $temp2[0];
}

##
## Determine the active partitions
##
@temp = `cat /etc/fstab | grep Active_Update | grep -v alt_root | grep -v log | grep -v swap`;
foreach (@temp)
{
   @temp2 = split / /;
   push @parts, $temp2[0];
}

while (`cat /var/log/messages | grep -ci "running e2fsck is recommended"` <= 0)
{
   foreach my $alt_part (@alt_parts)
   {
      $alt_part =~ /(\D+)\d/;
      @part = grep /$1/, @parts;
      `dd if=$part[0] of=$alt_part > /dev/null 2>&1`;
      $ret = $?;

      ##
      ## Abort if the partition copy fails as there is likely something
      ## seriously wrong
      ##
      if ($ret != 0)
      {
         `logger "ERROR: Problems copying data to $alt_part."`;
         `logger " Contact Motorola for assistance."`;
         `logger "Partition Copy exit code: $ret"`;
         exit 1;
      }

      ##
      ## Check the filesystem on the copied partition
      ##
      `/sbin/fsck -fp $alt_part > /dev/null 2>&1`;
      $ret = $?;
      `logger "fsck exit code for $alt_part was: $ret"`;
   }

   ##
   ## Modify the copied partition
   ##
   `mount /mnt/alt_root; rm -f /mnt/alt_root/bob; touch /mnt/alt_root/bob; umount /mnt/alt_root`;
}
-------------------------
-------------- next part --------------
An HTML attachment was scrubbed...
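A side note on the "retfrom fsck is 1024" figure above: the script stores Perl's `$?`, which is a raw wait status rather than the exit code itself, so the exit code sits in the high byte. The sketch below (Python, purely for illustration; the function names are made up for this example) decodes such a value against the exit-code bits listed in the fsck(8) man page. It assumes that 1024 really is a raw `$?` value; if so, it corresponds to exit code 4.

```python
# Exit-code bits as listed in the fsck(8) man page.
FSCK_BITS = {
    1: "file system errors corrected",
    2: "system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_wait_status(status):
    """Split a raw 16-bit wait status (what Perl's $? holds) into
    (exit_code, terminating_signal)."""
    return (status >> 8) & 0xFF, status & 0x7F


def describe_fsck_exit(exit_code):
    """Return the meaning of every bit set in an fsck exit code."""
    if exit_code == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_BITS.items()) if exit_code & bit]


# Assumption: 1024 is a raw wait status, as logged by the script above.
exit_code, signal = decode_wait_status(1024)
print(exit_code, signal, describe_fsck_exit(exit_code))
```

In the Perl script itself the equivalent would be to log `$? >> 8` instead of `$?`.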
URL:

From adilger at clusterfs.com Wed Feb 21 08:45:59 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Wed, 21 Feb 2007 01:45:59 -0700
Subject: Backing up ext3 root partition with dd
In-Reply-To: <93C192319466824A822B9179032077E201401695@de01exm66.ds.mot.com>
References: <93C192319466824A822B9179032077E201401695@de01exm66.ds.mot.com>
Message-ID: <20070221084559.GV10715@schatzie.adilger.int>

On Feb 20, 2007 15:45 -0500, Yassin Hania-CHY002 wrote:
> Is there a reason why an ext3 root partition cannot be copied to an
> alternate partition using the "dd" command?
> The dd is copying the mounted root partition into an alternate partition
> that is not mounted. The dd returns success, but the fsck on that
> partition fails with errors as follows:

You can't use 'dd' because it backs up the journal in an unsafe manner.
If you want to do a backup use 'dump'. It won't touch the journal.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From tweeks at rackspace.com Wed Feb 21 23:25:23 2007
From: tweeks at rackspace.com (tweeks)
Date: Wed, 21 Feb 2007 17:25:23 -0600
Subject: Backing up ext3 root partition with dd
In-Reply-To: <93C192319466824A822B9179032077E201401695@de01exm66.ds.mot.com>
References: <93C192319466824A822B9179032077E201401695@de01exm66.ds.mot.com>
Message-ID: <200702211725.24398.tweeks@rackspace.com>

For the same reason you can't xerox a piece of paper that you're wiggling. ;)

More precisely... the act of mounting it in a state other than read only
makes it "uncopy-able" (from an image perspective). It sounds like you want
a type of snapshot feature like what XFS offers.

Tweeks

On Tuesday 20 February 2007 14:45, Yassin Hania-CHY002 wrote:
> Is there a reason why an ext3 root partition cannot be copied to an
> alternate partition using the "dd" command?
> The dd is copying the mounted root partition into an alternate partition
> that is not mounted.
The dd returns success, but the fsck on that > partition fails with errors as follows: > ----------------- > fsck 1.37 (21-Mar-2005) > /dev/Active_Update/root2: recovering journal > /dev/Active_Update/root2: Clearing orphaned inode 57554 (uid=0, gid=0, > mode=010600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57552 (uid=1010, > gid=3002, mode=010600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57548 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57547 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57546 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57545 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57544 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57543 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57542 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57541 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57540 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57539 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57538 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57537 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Clearing orphaned inode 57536 (uid=0, gid=0, > mode=020600, size=0) > /dev/Active_Update/root2: Deleted inode 90369 has zero dtime. FIXED. > /dev/Active_Update/root2: Inode 90370 is in use, but has dtime set. > FIXED. > /dev/Active_Update/root2: Inode 90371 is in use, but has dtime set. > FIXED. > /dev/Active_Update/root2: Inode 90372 is in use, but has dtime set. > FIXED. 
> /dev/Active_Update/root2: Inode 90373 is in use, but has dtime set. > FIXED. > /dev/Active_Update/root2: Inode 90373 has imagic flag set. retfrom fsck > is 1024 > ------------------------------- > > When mounting that partition, an error is logged to /var/log/messages as > follows: > EXT3-fs warning: mounting fs with errors, running e2fsck is recommended > > We are seeing this on boxes running RHEL4 and even boxes running a PPC > version of Linux. > This failure is not seen on every run of the scripts, and is mostly seen > when mirroring the root partition. > > Here is the script which reproduces this ext3 corruption after running > it 5-6 times: > ------------------------------------------ > #!/usr/bin/perl -w > my $ret; > > ## > ## Determine the backup partitions > ## > my @temp = `cat /etc/fstab | grep alt_root`; > my (@temp2, @alt_parts, @parts, @part); > my $cnt = 0; > foreach (@temp) > { > @temp2 = split / /; > push @alt_parts, $temp2[0]; > } > > ## > ## Determine the active partitions > ## > @temp = `cat /etc/fstab | grep Active_Update | grep -v alt_root | grep > -v log | grep -v swap`; > foreach (@temp) > { > @temp2 = split / /; > push @parts, $temp2[0]; > } > > > while (`cat /var/log/messages | grep -ci "running e2fsck is > recommended"` <= 0) > { > foreach my $alt_part (@alt_parts) > { > $alt_part =~ /(\D+)\d/; > @part = grep /$1/, @parts; > `dd if=$part[0] of=$alt_part > /dev/null 2>&1`; > $ret = $?; > > ## > ## Abort if the partition copy fails as there is likely something > ## seriously wrong > ## > if ($ret != 0) > { > `logger "ERROR: Problems copying data to $alt_part."`; > `logger " Contact Motorola for assistance."`; > `logger "Partition Copy exit code: $ret"`; > exit 1; > } > > ## > ## Check the filesystem on the copied partition > ## > `/sbin/fsck -fp $alt_part > /dev/null 2>&1`; > $ret = $?; > `logger "fsck exit code for $alt_part was: $ret"`; > } > > ## > ## Modify the copied partition > ## > `mount /mnt/alt_root; rm -f /mnt/alt_root/bob; 
touch > /mnt/alt_root/bob; umount /mnt/alt_root`; > } > > ------------------------- Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace Managed Hosting. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated. From jss at ast.cam.ac.uk Thu Feb 22 10:34:26 2007 From: jss at ast.cam.ac.uk (Jeremy Sanders) Date: Thu, 22 Feb 2007 10:34:26 +0000 Subject: Very slow ext3 fsck Message-ID: Hi - We have an ext3 file system which is 3.5TB in size (on top of lvm). Free are 172049011 out of 854473728 4096K blocks, and 396540654 out of 427245568 inodes. This is using Scientific Linux 4.4 (a RHEL clone). The filesystem consists of multiple backups created with rsync using --link-dest, which hard links files which haven't been modified to the previous copy. There are several hundred days worth of these backups. I decided to fsck the file system, but unfortunately fsck is extremely slow. It has been going now for 67 hours and appears to be completely cpu bound (no obvious disk access) and stuck at the "Pass 2: Checking directory structure" stage. It doesn't respond to a normal kill or ctrl+c. Does anybody know whether it has got stuck in a loop, or does it really take so long to check so many hardlinks? Would it help moving to a newer e2fsck than RHEL provides (it has version number e2fsprogs-1.35-12.4.EL4). 
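As a rough sanity check on the figures quoted above, here is a small worked calculation (Python, for illustration only). It assumes that "4096K blocks" means 4096-byte blocks, which is the reading that reproduces the stated 3.5TB size; the variable names are made up for this sketch.

```python
# Assumption: the block size is 4096 bytes (not 4096 KB).
BLOCK_SIZE = 4096

total_blocks = 854473728
free_blocks = 172049011
total_inodes = 427245568
free_inodes = 396540654

total_bytes = total_blocks * BLOCK_SIZE
used_blocks = total_blocks - free_blocks
used_inodes = total_inodes - free_inodes

print(f"size:        {total_bytes / 1e12:.2f} TB ({total_bytes / 2**40:.2f} TiB)")
print(f"blocks used: {used_blocks} ({100 * used_blocks / total_blocks:.1f}%)")
print(f"inodes used: {used_inodes} ({100 * used_inodes / total_inodes:.1f}%)")
```

Under that assumption the total is 3,499,924,389,888 bytes, with 682,424,717 blocks and 30,704,914 inodes in use.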
Thanks Jeremy From jss at ast.cam.ac.uk Thu Feb 22 14:26:32 2007 From: jss at ast.cam.ac.uk (Jeremy Sanders) Date: Thu, 22 Feb 2007 14:26:32 +0000 Subject: Very slow ext3 fsck References: Message-ID: Jeremy Sanders wrote: > Does anybody know whether it has got stuck in a loop, or does it really > take so long to check so many hardlinks? Would it help moving to a newer > e2fsck than RHEL provides (it has version number e2fsprogs-1.35-12.4.EL4). I should also add that strace produces no output on the process, so it's apparently not making any system calls. Jeremy -- Jeremy Sanders http://www-xray.ast.cam.ac.uk/~jss/ X-Ray Group, Institute of Astronomy, University of Cambridge, UK. Public Key Server PGP Key ID: E1AAE053 From richard.c.wolber at boeing.com Thu Feb 22 16:02:37 2007 From: richard.c.wolber at boeing.com (Wolber, Richard C) Date: Thu, 22 Feb 2007 08:02:37 -0800 Subject: Backing up ext3 root partition with dd In-Reply-To: <200702211725.24398.tweeks@rackspace.com> References: <93C192319466824A822B9179032077E201401695@de01exm66.ds.mot.com> <200702211725.24398.tweeks@rackspace.com> Message-ID: <8C7C41A176AC0B468BEFB2EFD9BDAB9902426994@XCH-NW-5V2.nw.nos.boeing.com> tweeks said: > > For the same reason you can't xerox a piece of paper that > your wiggling. ;) > > More precisely... the act of mounting it in a state other > than read only makes it "uncopy-able" (from an image > perspective). It sounds like you want a type of snapshot > feature like what XFS offers. Or LVM snapshots. ..Chuck.. From santi at usansolo.net Thu Feb 22 20:22:03 2007 From: santi at usansolo.net (Santi Saez) Date: Thu, 22 Feb 2007 21:22:03 +0100 Subject: Very slow ext3 fsck In-Reply-To: References: Message-ID: <5654D9CF-CF91-4324-8C4A-E8DA78D0F5DE@usansolo.net> El 22/02/2007, a las 15:26, Jeremy Sanders escribi?: > Jeremy Sanders wrote: > >> Does anybody know whether it has got stuck in a loop, or does it >> really >> take so long to check so many hardlinks? 
Would it help moving to a >> newer >> e2fsck than RHEL provides (it has version number >> e2fsprogs-1.35-12.4.EL4). > > I should also add that strace produces no output on the process, so > it's > apparently not making any system calls. I think that fsck spawns a children, try with "strace -f" to trace this child process too. Regards, -- Santi Saez From tytso at mit.edu Thu Feb 22 22:26:38 2007 From: tytso at mit.edu (Theodore Tso) Date: Thu, 22 Feb 2007 17:26:38 -0500 Subject: Very slow ext3 fsck In-Reply-To: References: Message-ID: <20070222222638.GA24594@thunk.org> On Thu, Feb 22, 2007 at 10:34:26AM +0000, Jeremy Sanders wrote: > Hi - > > We have an ext3 file system which is 3.5TB in size (on top of lvm). Free are > 172049011 out of 854473728 4096K blocks, and 396540654 out of 427245568 > inodes. This is using Scientific Linux 4.4 (a RHEL clone). The filesystem > consists of multiple backups created with rsync using --link-dest, which > hard links files which haven't been modified to the previous copy. There > are several hundred days worth of these backups. > > I decided to fsck the file system, but unfortunately fsck is extremely slow. > It has been going now for 67 hours and appears to be completely cpu bound > (no obvious disk access) and stuck at the "Pass 2: Checking directory > structure" stage. It doesn't respond to a normal kill or ctrl+c. Did you run fsck out of a command-line? It should respond to a normal kill or ctrl-c. If it isn't I have to wonder whether the device driver is locked up for some reason. Can you login via ssh or a second console? If so, run "ps aux" and "ps lx" and report back what the e2fsck ps lines shows. Also, how much memory do you have? 3.5TB is pretty big, and if you don't have enough memory, it could just simply be a matter of the system paging its brains out. 
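One way to test the paging theory is to compare the swap counters in /proc/vmstat a few seconds apart. The sketch below (Python, illustrative; the function names and the sample snapshots are invented for this example) parses two snapshots and reports how many pages were swapped in and out between them, using the `pswpin` and `pswpout` counters that 2.6 kernels expose there.

```python
def parse_vmstat(text):
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in/out between two /proc/vmstat snapshots."""
    b, a = parse_vmstat(before), parse_vmstat(after)
    return {key: a.get(key, 0) - b.get(key, 0) for key in ("pswpin", "pswpout")}


# Made-up sample snapshots; on a live system, read /proc/vmstat twice,
# a few seconds apart, and pass the two strings in instead.
first = "pgpgin 1000\npswpin 50\npswpout 70\n"
second = "pgpgin 1800\npswpin 50\npswpout 70\n"
print(swap_delta(first, second))
```

If both deltas stay at zero while e2fsck is running, the machine is not swapping and the slowdown has to come from somewhere else.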
- Ted From jss at ast.cam.ac.uk Thu Feb 22 23:01:21 2007 From: jss at ast.cam.ac.uk (Jeremy Sanders) Date: Thu, 22 Feb 2007 23:01:21 +0000 Subject: Very slow ext3 fsck References: <20070222222638.GA24594@thunk.org> Message-ID: Theodore Tso wrote: > Did you run fsck out of a command-line? It should respond to a normal > kill or ctrl-c. If it isn't I have to wonder whether the device > driver is locked up for some reason. fsck is running over an xterm (over an ssh connection). It doesn't seem to respond to ctrl+c or kill commands (I haven't tried kill -9 yet). > Can you login via ssh or a second console? If so, run "ps aux" and > "ps lx" and report back what the e2fsck ps lines shows. The system is working fine (except for the fsck). I don't think it's paging as the CPU usage is 100%. xback1:~:$ ps aux|grep fsck root 4521 0.0 0.0 51992 404 pts/0 S+ Feb19 0:00 fsck /dev/xbackup1/xback1_backup1 root 4522 91.0 43.8 2126784 446772 pts/0 RN+ Feb19 4589:42 fsck.ext2 /dev/xbackup1/xback1_backup1 xback1:~:$ ps lx |grep fsck 0 914 5278 4990 16 0 51084 688 pipe_w S+ pts/1 0:00 grep fsck > Also, how much memory do you have? 3.5TB is pretty big, and if you > don't have enough memory, it could just simply be a matter of the > system paging its brains out. There is 1GB of memory. fsck seems resident at around 440MB, and uses 2GB total virtual memory. There's nothing else large running on the system, so I don't see why it wouldn't be using all the memory if it's swapping. There is no apparent disk activity. By the way, the system is a hyperthreading P4, running in 64 bit mode. Jeremy -- Jeremy Sanders http://www-xray.ast.cam.ac.uk/~jss/ X-Ray Group, Institute of Astronomy, University of Cambridge, UK. 
Public Key Server PGP Key ID: E1AAE053 From jss at ast.cam.ac.uk Thu Feb 22 23:03:39 2007 From: jss at ast.cam.ac.uk (Jeremy Sanders) Date: Thu, 22 Feb 2007 23:03:39 +0000 Subject: Very slow ext3 fsck References: <5654D9CF-CF91-4324-8C4A-E8DA78D0F5DE@usansolo.net> Message-ID: Santi Saez wrote: > I think that fsck spawns a children, try with "strace -f" to trace > this child process too. The first fsck is doing a wait4 (presumably its child process), the second shows no system activity. Thanks Jeremy -- Jeremy Sanders http://www-xray.ast.cam.ac.uk/~jss/ X-Ray Group, Institute of Astronomy, University of Cambridge, UK. Public Key Server PGP Key ID: E1AAE053 From jss at ast.cam.ac.uk Thu Feb 22 23:05:26 2007 From: jss at ast.cam.ac.uk (Jeremy Sanders) Date: Thu, 22 Feb 2007 23:05:26 +0000 Subject: Very slow ext3 fsck References: <20070222222638.GA24594@thunk.org> Message-ID: Jeremy Sanders wrote: > xback1:~:$ ps lx |grep fsck > 0 914 5278 4990 16 0 51084 688 pipe_w S+ pts/1 0:00 grep > fsck Sorry - just noticed this missed out the real commands: xback1:/home/jss:# ps lx|grep fsck 4 0 4522 4521 39 19 2126784 446772 - RN+ pts/0 4598:06 fsck.ext2 /dev/xbackup1/xback1_backup1 4 0 4521 4491 16 0 51992 404 wait S+ pts/0 0:00 fsck /dev/xbackup1/xback1_backup1 4 0 5500 5437 16 0 51084 696 pipe_w S+ pts/1 0:00 grep fsck -- Jeremy Sanders http://www-xray.ast.cam.ac.uk/~jss/ X-Ray Group, Institute of Astronomy, University of Cambridge, UK. Public Key Server PGP Key ID: E1AAE053 From mb/ext3 at dcs.qmul.ac.uk Fri Feb 23 07:21:03 2007 From: mb/ext3 at dcs.qmul.ac.uk (Matt Bernstein) Date: Fri, 23 Feb 2007 07:21:03 +0000 (GMT) Subject: Very slow ext3 fsck In-Reply-To: <20070222222638.GA24594@thunk.org> References: <20070222222638.GA24594@thunk.org> Message-ID: On Feb 22 Theodore Tso wrote: > On Thu, Feb 22, 2007 at 10:34:26AM +0000, Jeremy Sanders wrote: >> We have an ext3 file system which is 3.5TB in size (on top of lvm). 
Free are >> 172049011 out of 854473728 4096K blocks, and 396540654 out of 427245568 >> inodes. This is using Scientific Linux 4.4 (a RHEL clone). The filesystem >> consists of multiple backups created with rsync using --link-dest, which >> hard links files which haven't been modified to the previous copy. There >> are several hundred days worth of these backups. I have had this exact same problem with this exact same set-up (though it was FC5/x86_64 on a 1.5T volume) just under a year ago. >> I decided to fsck the file system, but unfortunately fsck is extremely slow. >> It has been going now for 67 hours and appears to be completely cpu bound >> (no obvious disk access) and stuck at the "Pass 2: Checking directory >> structure" stage. It doesn't respond to a normal kill or ctrl+c. I sent Ted an e2image of the fs (which admittedly was huge), but suspect he didn't have time or resource to see what was going on. > Did you run fsck out of a command-line? It should respond to a normal > kill or ctrl-c. If it isn't I have to wonder whether the device > driver is locked up for some reason. It's definitely fsck being confused; I also observed it wasn't making any syscalls. We both have large numbers of files with high link counts. I found I _could_ fsck the volume in a couple of hours if I had less than (IIRC) about 50 days' backups, but at least at some point after that fsck would stick in pass 2 for more than a week--at which point I gave up, trashed the fs (since my fsck was necessitated by hardware failure) and started again. You could mount the fs and punt last night's backup to a pristine fs and fsck that if you have the terabytes available. > Also, how much memory do you have? 3.5TB is pretty big, and if you > don't have enough memory, it could just simply be a matter of the > system paging its brains out. In my case the process was 1.6G on a 2G machine. No paging. Definitely e2fsck CPU-bound. 
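To get a feel for how many files carry high link counts in a backup tree like this, something along the following lines could be used. This is a minimal sketch in Python, not anything e2fsck does internally; `link_count_histogram` is a made-up helper name. It walks a directory tree and counts each regular-file inode once, grouped by its hard-link count.

```python
import os
import stat
from collections import Counter


def link_count_histogram(root):
    """Map st_nlink -> number of distinct regular-file inodes under root."""
    seen = set()       # (device, inode) pairs already counted
    hist = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            hist[st.st_nlink] += 1
    return hist
```

Calling `link_count_histogram("/path/to/backups")` (path hypothetical) returns a Counter such as `{1: ..., 200: ...}`. Note that the `seen` set holds one entry per inode, so on a filesystem with tens of millions of inodes this sketch would itself need a lot of memory.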
HTH
Matt

From tushu1232 at gmail.com Fri Feb 23 12:47:55 2007
From: tushu1232 at gmail.com (tushar)
Date: Fri, 23 Feb 2007 18:17:55 +0530
Subject: how to commit a directory entry to the disk
Message-ID: <200976900702230447w51b8955euf9f7b93aed4e9c33@mail.gmail.com>

hey guys,
well i am stuck up at a trivial point of committing the directory entry
buffer to the disk.
i have initialised the values of struct ext3_dir_entry_2 *de and now want
to commit it to the disk in the function

EXT3 FS----linux 2.6.18 ----/ext3/inode.c

static int ext33_do_update_inode(handle_t *handle,
                                 struct inode *inode,
                                 struct ext33_iloc *iloc)
{
  --------------- raw inode updation------------------
  dentry=list_entry(inode->i_dentry.next,struct dentry,d_alias);
  bh1=ext33_find_entry(dentry,&de1);
  ---please tell me how to commit the de1 to the disk data structures after
  this point----------
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tushu1232 at gmail.com Sat Feb 24 03:06:22 2007
From: tushu1232 at gmail.com (tushar)
Date: Sat, 24 Feb 2007 08:36:22 +0530
Subject: how to commit a directory entry to the disk in EXT3 FS
Message-ID: <200976900702231906l5793b70at27dc027d52b48ffa@mail.gmail.com>

hey guys,
well i am stuck up at a trivial point of committing the directory entry
buffer to the disk.
i have initialised the values of struct ext3_dir_entry_2 *de and now want
to commit it to the disk in the function

EXT3 FS----linux 2.6.18 ----fs/ext3/inode.c

static int ext3_do_update_inode(handle_t *handle,
                                struct inode *inode,
                                struct ext3_iloc *iloc)
{
  --------------- raw inode updation------------------
  dentry=list_entry(inode->i_dentry.next,struct dentry,d_alias);
  bh1=ext3_find_entry(dentry,&de1);
  -----please tell me how to commit the de1 to the disk data structures
  after this point----------
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lakshmipathi.g at gmail.com Sat Feb 24 13:22:17 2007
From: lakshmipathi.g at gmail.com (laksmi pathi)
Date: Sat, 24 Feb 2007 18:52:17 +0530
Subject: Hi all
Message-ID:

Hi all,
I wrote a program which recovers deleted files from Ext3/Ext2 FS. It's
like a crash-proof program. For the past few months I've been trying hard to
get feedback, comments or criticism on the tool. I hope to get some from you.
The link is:
https://sourceforge.net/projects/giis/
Warm Regards,
Lakshmipathi.G

From mr._x at shaw.ca Sun Feb 25 06:19:02 2007
From: mr._x at shaw.ca (..:::BeOS Mr. X:::..)
Date: Sat, 24 Feb 2007 22:19:02 -0800
Subject: Hi all
In-Reply-To:
References:
Message-ID: <45E12A56.2010403@shaw.ca>

Yes, but I always hear that recovery from ext3 is not possible...
could you possibly explain some of the technology? I am interested in using
the program if I can in fact figure out how to use it. I recently
accidentally deleted a music folder with many mp3 files in it.

laksmi pathi wrote:
> Hi all,
> I wrote a program which recovers deleted file from Ext3/Ext2 FS.It's
> like crash proof program.For past few months i'm trying hard to get
> feedback or comments or criticizm on the tool.I hope to get from you.
> The link is ,
> https://sourceforge.net/projects/giis/
> Warm Regards,
> Lakshmipathi.G
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users
>

From bruno at wolff.to Sun Feb 25 18:07:39 2007
From: bruno at wolff.to (Bruno Wolff III)
Date: Sun, 25 Feb 2007 12:07:39 -0600
Subject: Hi all
In-Reply-To: <45E12A56.2010403@shaw.ca>
References: <45E12A56.2010403@shaw.ca>
Message-ID: <20070225180739.GA516@wolff.to>

On Sat, Feb 24, 2007 at 22:19:02 -0800,
"..:::BeOS Mr. X:::.." wrote:
> Yes, but I always here that recover from ext3 is not possible...
> possibly explain some of the technology ? I have interest in using the
> program if I can in fact figure out how to use it.
I accidently recently
> deleted a music folder with many mp3 files in it.

You are probably better off regularly making backups rather than beta testing
this software.

From lakshmipathi.g at gmail.com Mon Feb 26 07:45:48 2007
From: lakshmipathi.g at gmail.com (laksmi pathi)
Date: Mon, 26 Feb 2007 13:15:48 +0530
Subject: Hi all
In-Reply-To: <20070225180739.GA516@wolff.to>
References: <45E12A56.2010403@shaw.ca> <20070225180739.GA516@wolff.to>
Message-ID:

Hi Beos,
It's true that you can't recover files from ext3, since the file addresses
are zeroed out on deletion.
This tool is a crash-proof recovery tool.
You can recover only the files which are deleted after its
installation. The concept is that once you install the tool, it makes a
backup copy of your files' addresses. When you delete a file, its address in
the inode is deleted, but we can still access the file through the address
which we copied earlier, provided the content has not been overwritten. So
it's like a crash-proof tool.

Hi Bruno Wolff,
Yes, it's always better to take regular backups.
Fellow developers on freshmeat have tested and rated this tool;
I assume they are quite satisfied with it.
Please check out:
http://freshmeat.net/projects/giis/

Warm Regards,
Lakshmipathi.G

On 2/25/07, Bruno Wolff III wrote:
> On Sat, Feb 24, 2007 at 22:19:02 -0800,
> "..:::BeOS Mr. X:::.." wrote:
> > Yes, but I always here that recover from ext3 is not possible...
> > possibly explain some of the technology ? I have interest in using the
> > program if I can in fact figure out how to use it. I accidently recently
> > deleted a music folder with many mp3 files in it.
>
> You are probably better off regularly making backups rather than beta testing
> This software.
>

From jss at ast.cam.ac.uk Mon Feb 26 10:23:47 2007
From: jss at ast.cam.ac.uk (Jeremy Sanders)
Date: Mon, 26 Feb 2007 10:23:47 +0000
Subject: Very slow ext3 fsck
References:
Message-ID:

Jeremy Sanders wrote:
> We have an ext3 file system which is 3.5TB in size (on top of lvm).
Free > are 172049011 out of 854473728 4096K blocks, and 396540654 out of > 427245568 inodes. This is using Scientific Linux 4.4 (a RHEL clone). The > filesystem consists of multiple backups created with rsync using > --link-dest, which hard links files which haven't been modified to the > previous copy. There are several hundred days worth of these backups. Just to say I've also tried with e2fsprogs-1.39 e2fsck and it hangs indefinitely too :-( Jeremy -- Jeremy Sanders http://www-xray.ast.cam.ac.uk/~jss/ X-Ray Group, Institute of Astronomy, University of Cambridge, UK. Public Key Server PGP Key ID: E1AAE053 From tytso at mit.edu Mon Feb 26 13:07:06 2007 From: tytso at mit.edu (Theodore Tso) Date: Mon, 26 Feb 2007 08:07:06 -0500 Subject: Very slow ext3 fsck In-Reply-To: References: Message-ID: <20070226130706.GA8154@thunk.org> On Mon, Feb 26, 2007 at 10:23:47AM +0000, Jeremy Sanders wrote: > Jeremy Sanders wrote: > > > We have an ext3 file system which is 3.5TB in size (on top of lvm). Free > > are 172049011 out of 854473728 4096K blocks, and 396540654 out of > > 427245568 inodes. This is using Scientific Linux 4.4 (a RHEL clone). The > > filesystem consists of multiple backups created with rsync using > > --link-dest, which hard links files which haven't been modified to the > > previous copy. There are several hundred days worth of these backups. > > Just to say I've also tried with e2fsprogs-1.39 e2fsck and it hangs > indefinitely too :-( What output do you see from e2fsck if you manually run it? - Ted From bimal_pandit at rediffmail.com Mon Feb 26 15:08:18 2007 From: bimal_pandit at rediffmail.com (bimal pandit) Date: 26 Feb 2007 15:08:18 -0000 Subject: Hi all Message-ID: <20070226150818.26821.qmail@webmail89.rediffmail.com> Dear Laxmi, On Mon, 26 Feb 2007 laksmi pathi wrote : >Hi Beos, >It's true you can't recover files from ext3 since file address are >zeroed out while deleting. >This tool is crash proof recovery tool. 
>You can the recover the files which are deleted only after it's >installation.The concept is, once you install the tool,It make backup >copy of your files addresses.When you delete a file , it's address in >inode is deleted ...but we can access file from it's address which we >copied earlier-provided the content is not overwirtte-So it's like a >crash proof tool. >Hi Bruno Wolff , >Yes it's always better to take regular backup- >and fellow developers in freshmeat tested and rated this tool, >i assume they are quite satisfied with the tool. >Please check out : >http://freshmeat.net/projects/giis/ > >Warm Regards, >Lakshmipathi.G > > > > >On 2/25/07, Bruno Wolff III wrote: >>On Sat, Feb 24, 2007 at 22:19:02 -0800, >> "..:::BeOS Mr. X:::.." wrote: >> > Yes, but I always here that recover from ext3 is not possible... >> > possibly explain some of the technology ? I have interest in using the >> > program if I can in fact figure out how to use it. I accidently recently >> > deleted a music folder with many mp3 files in it. >> >>You are probably better off regularly making backups rather than beta testing >>This software. >> > great job, will test it and would be keen to help and support you to the extent and the way I could be ... regards, Bimal Pandit -------------- next part -------------- An HTML attachment was scrubbed... URL: From jss at ast.cam.ac.uk Mon Feb 26 16:12:21 2007 From: jss at ast.cam.ac.uk (Jeremy Sanders) Date: Mon, 26 Feb 2007 16:12:21 +0000 Subject: Very slow ext3 fsck References: <20070226130706.GA8154@thunk.org> Message-ID: Theodore Tso wrote: > What output do you see from e2fsck if you manually run it? Just the usual output: [root at xback1 ~]# fsck /dev/xbackup1/xback1_backup1 fsck 1.35 (28-Feb-2004) e2fsck 1.35 (28-Feb-2004) /dev/xbackup1/xback1_backup1 has gone 191 days without being checked, check forced. Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure And hangs indefinitely. 
Jeremy -- Jeremy Sanders http://www-xray.ast.cam.ac.uk/~jss/ X-Ray Group, Institute of Astronomy, University of Cambridge, UK. Public Key Server PGP Key ID: E1AAE053 From tytso at mit.edu Mon Feb 26 16:55:11 2007 From: tytso at mit.edu (Theodore Tso) Date: Mon, 26 Feb 2007 11:55:11 -0500 Subject: Very slow ext3 fsck In-Reply-To: References: <20070226130706.GA8154@thunk.org> Message-ID: <20070226165511.GC8154@thunk.org> On Mon, Feb 26, 2007 at 04:12:21PM +0000, Jeremy Sanders wrote: > Theodore Tso wrote: > > > What output do you see from e2fsck if you manually run it? > > Just the usual output: > > [root at xback1 ~]# fsck /dev/xbackup1/xback1_backup1 > fsck 1.35 (28-Feb-2004) > e2fsck 1.35 (28-Feb-2004) > /dev/xbackup1/xback1_backup1 has gone 191 days without being checked, check > forced. > Pass 1: Checking inodes, blocks, and sizes > Pass 2: Checking directory structure > > And hangs indefinitely. I would love to get my hands on that filesystem and be able to run e2fsck under gdb to figure out what's going on. This looks like something entirely new. Unfortunately this is so big that e2image will probably create a truly huge file that will take forever to transfer. Is there any possibility I can get remote access to the machine, or perhaps if that's not possible you can run it under a debugger, maybe under some kind of remote VNC connection so I can see what you're seeing and a telephone call so we can work this problem together? Regards, - Ted From richard.c.wolber at boeing.com Tue Feb 27 22:09:42 2007 From: richard.c.wolber at boeing.com (Wolber, Richard C) Date: Tue, 27 Feb 2007 14:09:42 -0800 Subject: e2fsck -p vs -y Message-ID: <8C7C41A176AC0B468BEFB2EFD9BDAB99024269C4@XCH-NW-5V2.nw.nos.boeing.com> Other than what is printed on STD{OUT|ERR}, is there any functional difference between the -p and -y arguments in the e2fsck command? ..Chuck..
-- Chuck Wolber Electronic Flight Bag/ Network File Server Crew Information Systems/ OSS Wonk Mobile: 253.576.1154 Desk: 206.655.6918 "21. A person who is nice to you, but rude to the waiter, is not a nice person." -Dave Barry "25 things I have learned in 50 years!"
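On the -p versus -y question, going by the e2fsck(8) man page: -p ("preen") fixes only those problems that can be safely fixed without human intervention, and if it finds anything beyond that it prints a description and exits with the value 4 or'ed into the exit code, whereas -y assumes an answer of "yes" to every question. So the difference appears to be functional and not only a matter of what gets printed. A caller can tell the outcomes apart from the exit status, which is a bit mask. The following is only a rough Python sketch of decoding it; the bit meanings are taken from the man page, while the names E2FSCK_EXIT_BITS, describe_exit_status and needs_manual_fsck are made up for illustration and are not part of e2fsprogs.

```python
from typing import List

# Exit-status bits as listed in the e2fsck(8) man page.
# The dictionary name is hypothetical, chosen for this example.
E2FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def describe_exit_status(status: int) -> List[str]:
    """Return the meaning of every bit set in an e2fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(E2FSCK_EXIT_BITS.items())
            if status & bit]


def needs_manual_fsck(status: int) -> bool:
    """True when bit 4 is set, i.e. errors were left uncorrected
    (the case a -p run reports when it declines to fix something)."""
    return bool(status & 4)


if __name__ == "__main__":
    # Decode a few sample status values.
    for status in (0, 1, 4, 8, 5):
        print(status, describe_exit_status(status), needs_manual_fsck(status))
```

A boot script that runs e2fsck -p could use a check like needs_manual_fsck to decide whether to stop and ask the administrator to run e2fsck by hand.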