From lists at mjh.name Mon Dec 6 14:23:15 2004 From: lists at mjh.name (Milan Holzäpfel) Date: Mon, 6 Dec 2004 15:23:15 +0100 Subject: 135 GB ext3 on broken drive -- other possibilities than "e2fsck -y"? Message-ID: <20041206152315.5a8a5b03.lists@mjh.name> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello, I got an IDE drive which decided to break. Part of the extended partition table was lost, but I was able to recover it, so I could reach the ext3 filesystem, which is about 135 GB in size. I made a copy of it (luckily the ISP doesn't seem to need the broken drive urgently, hehe) and ran fsck on that copy. The first time I ran fsck I had the partition table slightly changed so I could reach the XFS filesystem which comes after the ext3 partition on the disk; this made the ext3 partition slightly smaller, so fsck complained and I tried to fiddle around with answering fsck's questions somehow. This resulted in about 2.8 GB of data (there were > 10 GB of data before, most importantly some tars sized around 4 GB). The next time I gave the block device the size which the superblock said the FS had before, and I used fsck.ext3 with the -y option. Then I got 3 GB of data, and not all of it was in lost+found (as it was with the first attempt). Much went into lost+found though, and obviously, much didn't make it back into the fs. Since I still have the broken drive (and just as important, enough free space on another drive) available for the next few days: Is there anything more I can try? (Especially to find one of the more recent large tars. They have obviously been in the wrong place at the wrong time, but that's another story.) Regarding these big files of 1+ GB: If I take an ext3 fs of 130 GB with 100+ GB free space, can you estimate the chance of files copied with cp from another drive getting allocated contiguously? Or is this possible with ext3 at all? 
How big is the chance of contiguous allocation if these files are read from another drive, but then sent through bzip2, with the output being written? Or if I use tar to create such a file with the contents read from the same filesystem? (I just thought I'd ask these too in case anyone can tell me that searching for the beginning of one of these files may well be worth the effort.) Here are some more details on the fsck process: Amongst others, I got a lot of these messages: | Deleted inode 8618118 has zero dtime. Fix? | Special (device/socket/fifo/symlink) file (inode 14646819) has immutable | or append-only flag set. Clear? | Special (device/socket/fifo) inode 14679587 has non-zero size. Fix? | Inode 16187144 was part of the orphaned inode list. FIXED. | Inode 16187146 is in use, but has dtime set. | Inode 16187364 has imagic flag set. | Inode 16187340 has compression flag set on filesystem without compression support. | Inode 16187340 has INDEX_FL flag set but is not a directory. | Inode 16187340, i_size is 5912753600013104432, should be 0. | Inode 16187340, i_blocks is 2042526010, should be 0. | Inode 16187020 has illegal block(s). | Illegal block #0 (2315255807) in inode 16187020. CLEARED. | Illegal block #1 (4094814513) in inode 16187020. CLEARED. | Illegal block #2 (3179347967) in inode 16187020. CLEARED. | Illegal block #3 (4294967135) in inode 16187020. CLEARED. | Illegal block #4 (3218371584) in inode 16187020. CLEARED. | Illegal block #6 (4284530057) in inode 16187020. CLEARED. | Illegal block #7 (2106327039) in inode 16187020. CLEARED. | Illegal block #8 (1962902940) in inode 16187020. CLEARED. | Illegal block #9 (1421708237) in inode 16187020. CLEARED. | Illegal block #10 (2248146943) in inode 16187020. CLEARED. | Illegal block #11 (2344842495) in inode 16187020. CLEARED. | Too many illegal blocks in inode 16187020. 
And after my second fsck attempt, every further fsck does this: | linux root # fsck.ext3 /dev/hdc8 -y | e2fsck 1.35 (28-Feb-2004) | /dev/hdc8 contains a file system with errors, check forced. | Pass 1: Checking inodes, blocks, and sizes | Pass 2: Checking directory structure | Pass 3: Checking directory connectivity | '..' in /lost+found/#12713448 (12713448) is (0), should be /lost+found (7208985). | Fix? yes | | Couldn't fix parent of inode 12713448: Couldn't find parent directory entry | | Pass 4: Checking reference counts | Pass 5: Checking group summary information | | /dev/hdc8: ********** WARNING: Filesystem still has errors ********** | | /dev/hdc8: 86053/16990208 files (2.7% non-contiguous), 1320442/33975459 blocks | linux root # Maybe these details can help you tell me whether there's any hope of finding any further data on the drive. TIA for any help Milan Holzäpfel -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQFBtGtT2wyvT2WDeWYRAo5BAJ4lG+pYMUyfKm9LMkz+vnPmgpTHmgCfdl4u aokTS8KDGLsvmQr4C25elDU= =r9D4 -----END PGP SIGNATURE----- From guolin at alexa.com Mon Dec 6 22:54:35 2004 From: guolin at alexa.com (Guolin Cheng) Date: Mon, 6 Dec 2004 14:54:35 -0800 Subject: Maximum ext3 file system size ?? Message-ID: <41089CB27BD8D24E8385C8003EDAF7AB01C60FB0@karl.alexa.com> Hi, Is the ext3 file system maximum size updated, or is it still 4TB for the 2.6.* kernel? The site at http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html still says it is 4TB, but I would like to know if it is possible to create and use stable & easy-to-fix (or at least as stable & easy-to-fix as ext3) file systems as big as 100TB on a 32-bit Linux architecture? Any experience and suggestions are greatly appreciated. Thanks. Q: What is the largest possible size of an ext3 filesystem and of files on ext3? inspired by Andreas Dilger, suggested by Christian Kujau: Ext3 can support files up to 1TB. 
With a 2.4 kernel the filesystem size is limited by the maximal block device size, which is 2TB. In 2.6 the maximum (32-bit CPU) limit of block devices is 16TB, but ext3 supports only up to 4TB. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stripyd at stripydog.com Tue Dec 7 08:19:25 2004 From: stripyd at stripydog.com (Keith Young) Date: Tue, 07 Dec 2004 08:19:25 +0000 Subject: symlink permissions Message-ID: <1102407564.4479.2695.camel@florence.garden.stripydog.com> When CONFIG_EXT3_FS_POSIX_ACL is not defined, ext3_init_acl() is an inline function in fs/ext3/acl.h which doesn't check whether a file is a symlink before applying the umask. I've always liked my ACLs to be available (so I never noticed), but came across this recently when trying to explain why RedHat Enterprise 3's BOOT kernel creates symlinks 755 during kickstart. I'm *assuming* this is a bug (the ACL code treats symlinks specially): It doesn't affect functionality, but those 755 symlinks can be noisy in your security reporting :-) Can anyone tell me if there's a good reason why the umask *should* be applied to symlink permissions? Otherwise I guess (for 2.6.9): --- fs/ext3/acl.h 2004-12-07 08:15:07.859199829 +0000 +++ fs/ext3/acl.h.khy 2004-12-07 08:05:11.631931063 +0000 @@ -5,6 +5,7 @@ */ #include <linux/xattr_acl.h> +#include <linux/stat.h> #define EXT3_ACL_VERSION 0x0001 #define EXT3_ACL_MAX_ENTRIES 32 @@ -79,7 +80,8 @@ static inline int ext3_init_acl(handle_t *handle, struct inode *inode, struct inode *dir) { - inode->i_mode &= ~current->fs->umask; + if (!S_ISLNK(inode->i_mode)) + inode->i_mode &= ~current->fs->umask; return 0; } #endif /* CONFIG_EXT3_FS_POSIX_ACL */ From adilger at clusterfs.com Tue Dec 7 08:57:31 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Tue, 7 Dec 2004 01:57:31 -0700 Subject: Maximum ext3 file system size ?? 
In-Reply-To: <41089CB27BD8D24E8385C8003EDAF7AB01C60FB0@karl.alexa.com> References: <41089CB27BD8D24E8385C8003EDAF7AB01C60FB0@karl.alexa.com> Message-ID: <20041207085731.GJ2064@schnapps.adilger.int> On Dec 06, 2004 14:54 -0800, Guolin Cheng wrote: > If the ext3 file system maximum size updated or it is still 4TB for > 2.6.* kernel? The site at > http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html says that it > is 4TB yet, but I would like to know if it is possible to create and use > stable & easy-to-fix (or at least as stable & easy-to-fix as ext3) file > systems as big as 100TB for 32 bit Linux architecture? I don't think it is practical to have such gigantic filesystems for ext3, even if it would be possible. Currently for ia64 and ppc64 and Alpha you could use larger blocksize (up to 64kB) to give up to 2^31 * 64kB = 2^47 or 128TB filesystems without (I think) any changes. We had reports of one user trying to use a 4TB ext3 filesystem but there were problems when they wrote more than 2TB (though it was unclear whether the problems were from ext3, MD RAID, or the block/SCSI layer). However, with such extremely large filesystems the e2fsck time would be incredibly large I think (it grows with block count and inode count). Not to be self-serving, but Lustre (which uses ext3 as the back-end filesystem) has several customers running with 100TB+ filesystems and will have a 900TB installation next year. It can do this by aggregating multiple independent ext3 filesystems together, and also scales the number of fileservers so that you have better performance in addition to just a very large single-server filesystem. It isn't for everyone (a GPL version is available, but it isn't trivial to set up/use yet) but it is reliable enough to use on half of the world's largest Linux systems. 
Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From agruen at suse.de Tue Dec 7 10:00:40 2004 From: agruen at suse.de (Andreas Gruenbacher) Date: Tue, 07 Dec 2004 11:00:40 +0100 Subject: [FIX] Re: symlink permissions In-Reply-To: <1102407564.4479.2695.camel@florence.garden.stripydog.com> References: <1102407564.4479.2695.camel@florence.garden.stripydog.com> Message-ID: <1102413639.3008.38.camel@winden.suse.de> Hello Keith and Andrew, On Tue, 2004-12-07 at 09:19, Keith Young wrote: > When CONFIG_EXT3_FS_POSIX_ACL is not defined, ext3_init_acl() is an > inline function in fs/ext3/acl.h which doesn't check if a file is a > symlink before applying umask. That's a bug, indeed. Thanks for reporting; a fix is attached. Andrew, could you please push this to Linus? Thanks! Cheers, -- Andreas Gruenbacher SUSE Labs, SUSE LINUX GMBH -------------- next part -------------- An embedded message was scrubbed... From: Andreas Gruenbacher Subject: Ext[23] apply umask to symlinks with ACLs configured out Date: Tue, 07 Dec 2004 10:58:06 +0100 Size: 1503 URL: From tamon at mhp.de Tue Dec 7 15:58:49 2004 From: tamon at mhp.de (Tobias Amon) Date: Tue, 7 Dec 2004 16:58:49 +0100 Subject: Problem with more than 1TB Message-ID: Hello, I'm using Linux kernel 2.6.8-24. I configured it with "CONFIG_EXT3_FS=y" and installed "e2fsprogs". Now I'm trying to format my 2.5 TB HDD (RAID5, 12x250GB). How can I do this? The system's internal tool only formats up to 1TB. 
Thanks Bye From evilninja at gmx.net Tue Dec 7 23:26:25 2004 From: evilninja at gmx.net (Christian) Date: Wed, 08 Dec 2004 00:26:25 +0100 Subject: Problem with more than 1TB In-Reply-To: References: Message-ID: <41B63C21.8050007@gmx.net> Tobias Amon wrote: > Hello, > > I'm using linux kernel 2.6.8-24. what is -24? which distribution? > I configured it with "CONFIGURE_EXT3_FS=y" > and installed "e2fsprogs". > > Now I try formatting my 2,5 TB HDD (Raid5 12x250GB) > how can I do this? the system internal tool does only format 1TB. have you enabled "LBD" too? > config LBD > Say Y here if you want to attach large (bigger than 2TB) discs to > your machine, or if you want to have a raid or loopback device > bigger than 2TB. Otherwise say N. From jacques.duplessis at videotron.ca Tue Dec 7 23:53:18 2004 From: jacques.duplessis at videotron.ca (Jacques Duplessis) Date: Tue, 07 Dec 2004 18:53:18 -0500 Subject: Increase size of ext3 filesystem WHILE MOUNTED Message-ID: <0I8D006W4MD3OU@VL-MO-MR010.ip.videotron.ca> Hi, We will have to go with SuSE (30 servers) just because an ext3 filesystem cannot be increased while it is mounted. I don't understand why RedHat does not support filesystems that can do things that ext3 cannot. When is RedHat going to understand that in a production environment (30 servers that, it seems, will be SuSE) we need to extend filesystems online? IBM JFS, ReiserFS and XFS filesystems allow users to extend their filesystems without unmounting them (when they are in a Volume Group - LVM). I would have liked to go with RedHat, but they do not give me the choice. They stick to ext3 (and refuse to support any other filesystems), while SuSE (an American company now) supports ext3, ReiserFS, JFS and XFS. SuSE offers 4 supported filesystems, while RedHat offers 1 (supported) filesystem, and on top of that we cannot increase it online! 
RedHat will have to review their marketing strategy or they will lose important market share to SuSE, the other big American Linux company. ----- Jacques Duplessis jacques.duplessis at videotron.ca -------------- next part -------------- An HTML attachment was scrubbed... URL: From tytso at mit.edu Wed Dec 8 00:29:26 2004 From: tytso at mit.edu (Theodore Ts'o) Date: Tue, 7 Dec 2004 19:29:26 -0500 Subject: Increase size of ext3 filesystem WHILE MOUNTED In-Reply-To: <0I8D006W4MD3OU@VL-MO-MR010.ip.videotron.ca> References: <0I8D006W4MD3OU@VL-MO-MR010.ip.videotron.ca> Message-ID: <20041208002926.GA5776@thunk.org> On Tue, Dec 07, 2004 at 06:53:18PM -0500, Jacques Duplessis wrote: > Hi, > We will have to go and use SuSE (30 servers) just because ext3 > filesystem cannot be increase while the filesystem is mounted. Actually, the latest 2.6 kernel and the latest Red Hat Fedora Core distribution have an ext3 filesystem that can indeed support on-line resize. - Ted From adilger at clusterfs.com Wed Dec 8 00:36:51 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Tue, 7 Dec 2004 17:36:51 -0700 Subject: Increase size of ext3 filesystem WHILE MOUNTED In-Reply-To: <0I8D006W4MD3OU@VL-MO-MR010.ip.videotron.ca> References: <0I8D006W4MD3OU@VL-MO-MR010.ip.videotron.ca> Message-ID: <20041208003651.GA9872@schnapps.adilger.int> On Dec 07, 2004 18:53 -0500, Jacques Duplessis wrote: > We will have to go and use SuSE (30 servers) just because ext3 > filesystem cannot be increase while the filesystem is mounted. Ironically, ext3 online resizing was added to the vanilla kernel on Oct 29 for 2.6.9, and had been in FC2 for a while before that. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From evilninja at gmx.net Wed Dec 8 16:46:03 2004 From: evilninja at gmx.net (Christian) Date: Wed, 08 Dec 2004 17:46:03 +0100 Subject: Problem with more than 1TB In-Reply-To: References: Message-ID: <41B72FCB.5080005@gmx.net> Tobias Amon wrote: > I use Suse Linux 9.2 > LBD is enabled, too So, do you have any (error) messages in the syslog? Perhaps (!) an strace of the mkfs.ext3 run could help too. "the system internal tool does only format 1TB." is quite vague. > and installed "e2fsprogs". Which version? Oh, and please respond on-list; perhaps other people are curious about some details too ;) Christian. From evilninja at gmx.net Wed Dec 8 17:10:51 2004 From: evilninja at gmx.net (Christian) Date: Wed, 08 Dec 2004 18:10:51 +0100 Subject: AW: Problem with more than 1TB In-Reply-To: References: Message-ID: <41B7359B.5070208@gmx.net> Boy, if you really care about your problem, *USE* ext3-users: > oh, and please respond on-list, perhaps other ppl are curious > about some details too ;) Really. Tobias Amon wrote: > Hi, > > I use e2fsprogs version 1.5.2 (I think) http://e2fsprogs.sourceforge.net/ exists; the latest version is 1.35. > I got another solution. Yast, which came with Suse > is not able to create partitions larger than 2 TB > the "parted" tool can do so. ??? This doesn't make sense. When using "mkfs.*", you are supposed to already *have* created such a large partition. Once that is done, you can format it. > So I have to make partitions manually and format them with "mke2fs" man mkfs.ext3. Hoping not to feed the troll, Christian. 
From adilger at clusterfs.com Wed Dec 8 19:20:23 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 8 Dec 2004 12:20:23 -0700 Subject: AW: Problem with more than 1TB In-Reply-To: <41B7359B.5070208@gmx.net> References: <41B7359B.5070208@gmx.net> Message-ID: <20041208192023.GD2899@schnapps.adilger.int> On Dec 08, 2004 18:10 +0100, Christian wrote: > Tobias Amon wrote: > >I use e2fsprogs version 1.5.2 (I think) > > http://e2fsprogs.sourceforge.net/ exists, latest version is 1.35. > > >I got another solution. Yast, which came with Suse > >is not able to create partitions larger than 2 TB > >the "parted" tool can do so. So this is entirely a problem with Yast, and not mke2fs/e2fsprogs. Please ask SuSE about that. When you do create such a large filesystem, it would be sensible to test it once first by creating some large files in the fs (enough to fill it up, either many small ones or fewer large ones) that have a verifiable data pattern in them, like "64-bit byte_offset:32-bit inum" at the start of each 4kB block. Then read back the files and verify the data is still correct. If this fails, try the same thing on the raw block device to check whether it can write data > 2TB at the correct offsets. The reason I ask is that I'm personally not 100% convinced that larger-than-2TB filesystems are working properly yet. One of our customers was testing with a ~4TB filesystem and they experienced data corruption when writing > 2TB of data. Since it was only for a demo we just went with multiple 2TB filesystems and have not had an opportunity to determine whether the problem was in MD RAID or ext3. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From jacques.duplessis at videotron.ca Thu Dec 9 02:41:46 2004 From: jacques.duplessis at videotron.ca (Jacques Duplessis) Date: Wed, 08 Dec 2004 21:41:46 -0500 Subject: Increase size of ext3 filesystem WHILE MOUNTED In-Reply-To: <20041208002926.GA5776@thunk.org> Message-ID: <0I8F00JO3OTQMD@VL-MO-MR010.ip.videotron.ca> Thank you! Good news at last! Do you know if we can get it to work on Enterprise 3.0? -----Original Message----- From: Theodore Ts'o [mailto:tytso at thunk.org] On Behalf Of Theodore Ts'o Sent: Tuesday, December 07, 2004 7:29 PM To: Jacques Duplessis Cc: ext3-users at redhat.com Subject: Re: Increase size of ext3 filesystem WHILE MOUNTED On Tue, Dec 07, 2004 at 06:53:18PM -0500, Jacques Duplessis wrote: > Hi, > We will have to go and use SuSE (30 servers) just because ext3 > filesystem cannot be increase while the filesystem is mounted. Actually, the latest 2.6 kernel and the latest Red Hat Fedora Core distribution has an ext3 filesystem that can indeed support on-line resize. - Ted From jacques.duplessis at videotron.ca Thu Dec 9 02:42:06 2004 From: jacques.duplessis at videotron.ca (Jacques Duplessis) Date: Wed, 08 Dec 2004 21:42:06 -0500 Subject: Increase size of ext3 filesystem WHILE MOUNTED In-Reply-To: <20041208003651.GA9872@schnapps.adilger.int> Message-ID: <0I8F00JQSOU8MD@VL-MO-MR010.ip.videotron.ca> Thank you! Good news at last! Do you know if we can get it to work on Enterprise 3.0? 
----- Jacques Duplessis jacques.duplessis at videotron.ca -----Original Message----- From: Andreas Dilger [mailto:adilger at clusterfs.com] Sent: Tuesday, December 07, 2004 7:37 PM To: Jacques Duplessis Cc: ext3-users at redhat.com Subject: Re: Increase size of ext3 filesystem WHILE MOUNTED On Dec 07, 2004 18:53 -0500, Jacques Duplessis wrote: > We will have to go and use SuSE (30 servers) just because ext3 > filesystem cannot be increase while the filesystem is mounted. Ironically, ext3 online resizing was added to the vanilla kernel on Oct 29 for 2.6.9, and has been in FC2 for a while before that. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/ From Holger.Kiehl at dwd.de Sun Dec 5 17:02:55 2004 From: Holger.Kiehl at dwd.de (Holger Kiehl) Date: Sun, 5 Dec 2004 17:02:55 +0000 (GMT) Subject: BUG in fs/ext3/dir.c Message-ID: Hello When using readdir() on a directory with many files or long file names it can happen that it returns the same file name twice. Attached is a program that demonstrates this. I have traced this problem back to linux-2.6.10-rc1-bk18 and all kernels after this one are affected. linux-2.6.10-rc1-bk17 is still okay. If I reverse the following patch in linux-2.6.10-rc1-bk18, readdir() works correctly again: diff -Nru linux-2.6.10-rc1-bk17/fs/ext3/dir.c linux-2.6.10-rc1-bk18/fs/ext3/dir.c --- linux-2.6.10-rc1-bk17/fs/ext3/dir.c 2004-10-18 23:54:30.000000000 +0200 +++ linux-2.6.10-rc1-bk18/fs/ext3/dir.c 2004-12-05 16:44:21.000000000 +0100 @@ -418,7 +418,7 @@ get_dtype(sb, fname->file_type)); if (error) { filp->f_pos = curr_pos; - info->extra_fname = fname->next; + info->extra_fname = fname; return error; } fname = fname->next; @@ -457,9 +457,12 @@ * If there are any leftover names on the hash collision * chain, return them first. 
*/ - if (info->extra_fname && - call_filldir(filp, dirent, filldir, info->extra_fname)) - goto finished; + if (info->extra_fname) { + if(call_filldir(filp, dirent, filldir, info->extra_fname)) + goto finished; + else + goto next_entry; + } if (!info->curr_node) info->curr_node = rb_first(&info->root); @@ -492,7 +495,7 @@ info->curr_minor_hash = fname->minor_hash; if (call_filldir(filp, dirent, filldir, fname)) break; - +next_entry: info->curr_node = rb_next(info->curr_node); if (!info->curr_node) { if (info->next_hash == ~0) { Regards, Holger PS: Please CC me since I am not on this list. -------------- next part --------------

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <dirent.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
   int fd, filename_length, i, no_of_files;
   char pathname[256], *ptr, prevname[256], to_pathname[256], *to_ptr;
   DIR *dp;
   struct dirent *p_dir;
   struct stat stat_buf;

   if (argc != 3)
   {
      fprintf(stderr, "Usage: %s <no_of_files> <filename_length>\n", argv[0]);
      exit(1);
   }
   else
   {
      no_of_files = atoi(argv[1]);
      filename_length = atoi(argv[2]);
   }

   /* Create necessary dirs. */
   (void)mkdir("testbugdir", S_IRUSR|S_IWUSR|S_IXUSR);
   (void)mkdir("testbugdir/input", S_IRUSR|S_IWUSR|S_IXUSR);
   (void)mkdir("testbugdir/output", S_IRUSR|S_IWUSR|S_IXUSR);

   /* Create input files. */
   strcpy(pathname, "testbugdir/input/");
   ptr = pathname + strlen(pathname);
   for (i = 0; i < no_of_files; i++)
   {
      sprintf(ptr, "%0*d", filename_length, i);
      if ((fd = open(pathname, O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR)) == -1)
      {
         fprintf(stderr, "open() error %s : %s\n", pathname, strerror(errno));
         exit(1);
      }
      close(fd);
   }

   /* Move input files to output. */
   strcpy(to_pathname, "testbugdir/output/");
   to_ptr = to_pathname + strlen(to_pathname);
   *ptr = '\0';
   if ((dp = opendir(pathname)) == NULL)
   {
      fprintf(stderr, "opendir() error (%s) : %s\n", pathname, strerror(errno));
      exit(1);
   }
   prevname[0] = '\0';
   while ((p_dir = readdir(dp)) != NULL)
   {
      if (p_dir->d_name[0] == '.')
      {
         continue;
      }
      if (strcmp(prevname, p_dir->d_name) == 0)
      {
         fprintf(stderr, "BUG: %s appears twice!\n", p_dir->d_name);
      }
      strcpy(prevname, p_dir->d_name);
      strcpy(ptr, p_dir->d_name);
      if (stat(pathname, &stat_buf) < 0)
      {
         fprintf(stderr, "stat() error (%s) : %s\n", pathname, strerror(errno));
         continue;
      }
      strcpy(to_ptr, p_dir->d_name);
      if (rename(pathname, to_pathname) == -1)
      {
         fprintf(stderr, "rename() error (%s) : %s\n", pathname, strerror(errno));
      }
   }
   (void)closedir(dp);

   /* Remove everything. */
   *to_ptr = '\0';
   if ((dp = opendir(to_pathname)) == NULL)
   {
      fprintf(stderr, "opendir() error (%s) : %s\n", to_pathname, strerror(errno));
      exit(1);
   }
   prevname[0] = '\0';
   while ((p_dir = readdir(dp)) != NULL)
   {
      if (p_dir->d_name[0] == '.')
      {
         continue;
      }
      if (strcmp(prevname, p_dir->d_name) == 0)
      {
         fprintf(stderr, "BUG: %s appears twice!\n", p_dir->d_name);
      }
      strcpy(prevname, p_dir->d_name);
      strcpy(to_ptr, p_dir->d_name);
      if (unlink(to_pathname) == -1)
      {
         fprintf(stderr, "unlink() error (%s) : %s\n", to_pathname, strerror(errno));
         continue;
      }
   }
   (void)closedir(dp);
   if (rmdir("testbugdir/input") == -1)
   {
      fprintf(stderr, "rmdir() error (testbugdir/input) : %s\n", strerror(errno));
   }
   if (rmdir("testbugdir/output") == -1)
   {
      fprintf(stderr, "rmdir() error (testbugdir/output) : %s\n", strerror(errno));
   }
   if (rmdir("testbugdir") == -1)
   {
      fprintf(stderr, "rmdir() error (testbugdir) : %s\n", strerror(errno));
   }
   exit(0);
}

From Holger.Kiehl at dwd.de Sun Dec 5 17:56:18 2004 From: Holger.Kiehl at dwd.de (Holger Kiehl) Date: Sun, 5 Dec 2004 17:56:18 +0000 (GMT) Subject: BUG in fs/ext3/dir.c In-Reply-To: References: Message-ID: On Sun, 5 Dec 2004, Holger Kiehl wrote: > Hello > > 
When using readdir() on a directory with many files or long file names > it can happen that it returns the same file name twice. Attached is > a program that demonstrates this. > I forgot to mention how to call the program to show the bug. This is done as follows: ./a.out 200 20 BUG: 00000000000000000061 appears twice! stat() error (testbugdir/input/00000000000000000061) : No such file or directory BUG: 00000000000000000061 appears twice! unlink() error (testbugdir/output/00000000000000000061) : No such file or directory or: ./a.out 50 61 BUG: 0000000000000000000000000000000000000000000000000000000000020 appears twice! stat() error (testbugdir/input/0000000000000000000000000000000000000000000000000000000000020) : No such file or directory BUG: 0000000000000000000000000000000000000000000000000000000000020 appears twice! unlink() error (testbugdir/output/0000000000000000000000000000000000000000000000000000000000020) : No such file or directory Holger From sct at redhat.com Thu Dec 9 02:58:35 2004 From: sct at redhat.com (Stephen C. Tweedie) Date: 09 Dec 2004 02:58:35 +0000 Subject: Increase size of ext3 filesystem WHILE MOUNTED In-Reply-To: <0I8F00JO3OTQMD@VL-MO-MR010.ip.videotron.ca> References: <0I8F00JO3OTQMD@VL-MO-MR010.ip.videotron.ca> Message-ID: <1102561114.1921.86.camel@sisko.sctweedie.blueyonder.co.uk> Hi, On Thu, 2004-12-09 at 02:41, Jacques Duplessis wrote: > Do you know if we can get it to work on Enterprise 3.0 ? Not currently, no --- the online resize code is a new feature in ext3 on 2.6 kernels only. But future 2.6-based RHEL releases will have it. Cheers, Stephen From swarren at wwwdotorg.org Thu Dec 9 04:16:07 2004 From: swarren at wwwdotorg.org (Stephen Warren) Date: Wed, 08 Dec 2004 21:16:07 -0700 Subject: resize2fs on LVM on MD raid on Fedora Core 3 - inode table conflicts in fsck Message-ID: <1102565691.18351.TMDA@tmda.severn.wwwdotorg.org> Hi. I'm attempting to setup a box here to be a file-server for all my data. 
I'm attempting to resize an ext3 partition to demonstrate this capability to myself before fully committing to this system as the primary data storage. I'm having some problems resizing an ext3 filesystem after I've resized the underlying logical volume. Following the ext3 resize, fsck spits out lots of errors like: Pass 1: Checking inodes, blocks, and sizes Group 49's inode table at 1605636 conflicts with some other fs block. Relocate? no I believe that I'm following the correct procedure for resizing the filesystem. Any pointers greatly appreciated. Thanks. A complete transcript demonstrating this problem follows: SEVERN:~# cat /etc/fedora-release Fedora Core release 3 (Heidelberg) SEVERN:~# uname -a Linux severn.wwwdotorg.org 2.6.9-1.667 #1 Tue Nov 2 14:41:25 EST 2004 i686 athlon i386 GNU/Linux SEVERN:~# cat /proc/mdstat Personalities : [raid1] md1 : active raid1 hdk2[1] hdg2[0] 242685824 blocks [2/2] [UU] md0 : active raid1 hdk1[1] hdg1[0] 104320 blocks [2/2] [UU] unused devices: root@:~# pvscan PV /dev/md1 VG severn_vg0 lvm2 [231.44 GB / 109.88 GB free] Total: 1 [231.44 GB] / in use: 1 [231.44 GB] / in no VG: 0 [0 ] root at SEVERN:~# vgscan Reading all physical volumes. This may take a while... 
Found volume group "severn_vg0" using metadata type lvm2 root at SEVERN:~# lvscan ACTIVE '/dev/severn_vg0/severn_root' [24.44 GB] inherit ACTIVE '/dev/severn_vg0/severn_samba' [128.00 MB] inherit ACTIVE '/dev/severn_vg0/severn_archive' [12.00 GB] inherit ACTIVE '/dev/severn_vg0/severn_photos' [20.00 GB] inherit ACTIVE '/dev/severn_vg0/severn_home' [20.00 GB] inherit ACTIVE '/dev/severn_vg0/severn_backup' [40.00 GB] inherit ACTIVE '/dev/severn_vg0/severn_svn' [5.00 GB] inherit root at SEVERN:~# lvcreate -L 5G -n test severn_vg0 Logical volume "test" created root at SEVERN:~# mke2fs /dev/severn_vg0/test mke2fs 1.35 (28-Feb-2004) max_blocks 1342177280, rsv_groups = 40960, rsv_gdb = 319 Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) 655360 inodes, 1310720 blocks 65536 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=1342177280 40 block groups 32768 blocks per group, 32768 fragments per group 16384 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736 Writing inode tables: 0/40 [...deleted from logfile...] done inode.i_blocks = 20424, i_size = 4243456 Writing superblocks and filesystem accounting information: done This filesystem will be automatically checked every 33 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. root at SEVERN:~# e2fsck /dev/severn_vg0/test e2fsck 1.35 (28-Feb-2004) /dev/severn_vg0/test: clean, 11/655360 files, 23134/1310720 blocks root at SEVERN:~# resize2fs -d 16 /dev/severn_vg0/test resize2fs 1.35 (28-Feb-2004) The filesystem is already 1310720 blocks long. Nothing to do! root at SEVERN:~# lvextend -L +5G /dev/severn_vg0/test Extending logical volume test to 10.00 GB Logical volume test successfully resized root at SEVERN:~# resize2fs -d 16 /dev/severn_vg0/test resize2fs 1.35 (28-Feb-2004) Resizing the filesystem on /dev/severn_vg0/test to 2621440 (4k) blocks. 
The filesystem on /dev/severn_vg0/test is now 2621440 blocks long. root at SEVERN:~# e2fsck /dev/severn_vg0/test e2fsck 1.35 (28-Feb-2004) /dev/severn_vg0/test: clean, 11/1310720 files, 43696/2621440 blocks root at SEVERN:~# e2fsck -f /dev/severn_vg0/test e2fsck 1.35 (28-Feb-2004) Pass 1: Checking inodes, blocks, and sizes Group 49's inode table at 1605636 conflicts with some other fs block. Relocate? no Group 49's inode table at 1605637 conflicts with some other fs block. Relocate? no ... same message repeated for many blocks. ... eventually hit ^C Group 49's inode table at 1605767 conflicts with some other fs block. Relocate? Quit root at SEVERN:~# debugfs /dev/severn_vg0/test debugfs 1.35 (28-Feb-2004) debugfs: stats Filesystem volume name: Last mounted on: Filesystem UUID: c037ef14-2db8-41ce-92bf-3b642a5bad55 Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: resize_inode filetype sparse_super large_file Default mount options: (none) Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 1310720 Block count: 2621440 Reserved block count: 131072 Free blocks: 2577744 Free inodes: 1310709 First block: 0 Block size: 4096 Fragment size: 4096 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 16384 Inode blocks per group: 512 Filesystem created: Wed Dec 8 20:24:58 2004 Last mount time: n/a Last write time: Wed Dec 8 20:26:31 2004 Mount count: 0 Maximum mount count: 33 Last checked: Wed Dec 8 20:24:58 2004 Check interval: 15552000 (6 months) Next check after: Mon Jun 6 21:24:58 2005 Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 128 Default directory hash: tea Directory Hash Seed: ed7f64a9-338d-4938-8e49-b93241bb88b6 Directories: 2 Group 0: block bitmap at 321, inode bitmap at 322, inode table at 323 31927 free blocks, 16373 free inodes, 2 used directories Group 1: block bitmap at 33089, inode bitmap at 33090, inode table at 
33091 31933 free blocks, 16384 free inodes, 0 used directories Group 2: block bitmap at 65536, inode bitmap at 65537, inode table at 65538 32254 free blocks, 16384 free inodes, 0 used directories Group 3: block bitmap at 98625, inode bitmap at 98626, inode table at 98627 31933 free blocks, 16384 free inodes, 0 used directories Group 4: block bitmap at 131072, inode bitmap at 131073, inode table at 131074 32254 free blocks, 16384 free inodes, 0 used directories Group 5: block bitmap at 164161, inode bitmap at 164162, inode table at 164163 31933 free blocks, 16384 free inodes, 0 used directories Group 6: block bitmap at 196608, inode bitmap at 196609, inode table at 196610 32254 free blocks, 16384 free inodes, 0 used directories Group 7: block bitmap at 229697, inode bitmap at 229698, inode table at 229699 31933 free blocks, 16384 free inodes, 0 used directories Group 8: block bitmap at 262144, inode bitmap at 262145, inode table at 262146 32254 free blocks, 16384 free inodes, 0 used directories Group 9: block bitmap at 295233, inode bitmap at 295234, inode table at 295235 31933 free blocks, 16384 free inodes, 0 used directories Group 10: block bitmap at 327680, inode bitmap at 327681, inode table at 327682 32254 free blocks, 16384 free inodes, 0 used directories Group 11: block bitmap at 360448, inode bitmap at 360449, inode table at 360450 32254 free blocks, 16384 free inodes, 0 used directories Group 12: block bitmap at 393216, inode bitmap at 393217, inode table at 393218 32254 free blocks, 16384 free inodes, 0 used directories Group 13: block bitmap at 425984, inode bitmap at 425985, inode table at 425986 32254 free blocks, 16384 free inodes, 0 used directories Group 14: block bitmap at 458752, inode bitmap at 458753, inode table at 458754 32254 free blocks, 16384 free inodes, 0 used directories Group 15: block bitmap at 491520, inode bitmap at 491521, inode table at 491522 32254 free blocks, 16384 free inodes, 0 used directories Group 16: block bitmap at 
524288, inode bitmap at 524289, inode table at 524290 32254 free blocks, 16384 free inodes, 0 used directories Group 17: block bitmap at 557056, inode bitmap at 557057, inode table at 557058 32254 free blocks, 16384 free inodes, 0 used directories Group 18: block bitmap at 589824, inode bitmap at 589825, inode table at 589826 32254 free blocks, 16384 free inodes, 0 used directories Group 19: block bitmap at 622592, inode bitmap at 622593, inode table at 622594 32254 free blocks, 16384 free inodes, 0 used directories Group 20: block bitmap at 655360, inode bitmap at 655361, inode table at 655362 32254 free blocks, 16384 free inodes, 0 used directories Group 21: block bitmap at 688128, inode bitmap at 688129, inode table at 688130 32254 free blocks, 16384 free inodes, 0 used directories Group 22: block bitmap at 720896, inode bitmap at 720897, inode table at 720898 32254 free blocks, 16384 free inodes, 0 used directories Group 23: block bitmap at 753664, inode bitmap at 753665, inode table at 753666 32254 free blocks, 16384 free inodes, 0 used directories Group 24: block bitmap at 786432, inode bitmap at 786433, inode table at 786434 32254 free blocks, 16384 free inodes, 0 used directories Group 25: block bitmap at 819521, inode bitmap at 819522, inode table at 819523 31933 free blocks, 16384 free inodes, 0 used directories Group 26: block bitmap at 851968, inode bitmap at 851969, inode table at 851970 32254 free blocks, 16384 free inodes, 0 used directories Group 27: block bitmap at 885057, inode bitmap at 885058, inode table at 885059 31933 free blocks, 16384 free inodes, 0 used directories Group 28: block bitmap at 917504, inode bitmap at 917505, inode table at 917506 32254 free blocks, 16384 free inodes, 0 used directories Group 29: block bitmap at 950272, inode bitmap at 950273, inode table at 950274 32254 free blocks, 16384 free inodes, 0 used directories Group 30: block bitmap at 983040, inode bitmap at 983041, inode table at 983042 32254 free blocks, 16384 
free inodes, 0 used directories Group 31: block bitmap at 1015808, inode bitmap at 1015809, inode table at 1015810 32254 free blocks, 16384 free inodes, 0 used directories Group 32: block bitmap at 1048576, inode bitmap at 1048577, inode table at 1048578 32254 free blocks, 16384 free inodes, 0 used directories Group 33: block bitmap at 1081344, inode bitmap at 1081345, inode table at 1081346 32254 free blocks, 16384 free inodes, 0 used directories Group 34: block bitmap at 1114112, inode bitmap at 1114113, inode table at 1114114 32254 free blocks, 16384 free inodes, 0 used directories Group 35: block bitmap at 1146880, inode bitmap at 1146881, inode table at 1146882 32254 free blocks, 16384 free inodes, 0 used directories Group 36: block bitmap at 1179648, inode bitmap at 1179649, inode table at 1179650 32254 free blocks, 16384 free inodes, 0 used directories Group 37: block bitmap at 1212416, inode bitmap at 1212417, inode table at 1212418 32254 free blocks, 16384 free inodes, 0 used directories Group 38: block bitmap at 1245184, inode bitmap at 1245185, inode table at 1245186 32254 free blocks, 16384 free inodes, 0 used directories Group 39: block bitmap at 1277952, inode bitmap at 1277953, inode table at 1277954 32254 free blocks, 16384 free inodes, 0 used directories Group 40: block bitmap at 1310720, inode bitmap at 1310721, inode table at 1310722 32254 free blocks, 16384 free inodes, 0 used directories Group 41: block bitmap at 1343488, inode bitmap at 1343489, inode table at 1343490 32254 free blocks, 16384 free inodes, 0 used directories Group 42: block bitmap at 1376256, inode bitmap at 1376257, inode table at 1376258 32254 free blocks, 16384 free inodes, 0 used directories Group 43: block bitmap at 1409024, inode bitmap at 1409025, inode table at 1409026 32254 free blocks, 16384 free inodes, 0 used directories Group 44: block bitmap at 1441792, inode bitmap at 1441793, inode table at 1441794 32254 free blocks, 16384 free inodes, 0 used
directories Group 45: block bitmap at 1474560, inode bitmap at 1474561, inode table at 1474562 32254 free blocks, 16384 free inodes, 0 used directories Group 46: block bitmap at 1507328, inode bitmap at 1507329, inode table at 1507330 32254 free blocks, 16384 free inodes, 0 used directories Group 47: block bitmap at 1540096, inode bitmap at 1540097, inode table at 1540098 32254 free blocks, 16384 free inodes, 0 used directories Group 48: block bitmap at 1572864, inode bitmap at 1572865, inode table at 1572866 32254 free blocks, 16384 free inodes, 0 used directories Group 49: block bitmap at 1605634, inode bitmap at 1605635, inode table at 1605636 32252 free blocks, 16384 free inodes, 0 used directories Group 50: block bitmap at 1638400, inode bitmap at 1638401, inode table at 1638402 32254 free blocks, 16384 free inodes, 0 used directories Group 51: block bitmap at 1671168, inode bitmap at 1671169, inode table at 1671170 32254 free blocks, 16384 free inodes, 0 used directories Group 52: block bitmap at 1703936, inode bitmap at 1703937, inode table at 1703938 32254 free blocks, 16384 free inodes, 0 used directories Group 53: block bitmap at 1736704, inode bitmap at 1736705, inode table at 1736706 32254 free blocks, 16384 free inodes, 0 used directories Group 54: block bitmap at 1769472, inode bitmap at 1769473, inode table at 1769474 32254 free blocks, 16384 free inodes, 0 used directories Group 55: block bitmap at 1802240, inode bitmap at 1802241, inode table at 1802242 32254 free blocks, 16384 free inodes, 0 used directories Group 56: block bitmap at 1835008, inode bitmap at 1835009, inode table at 1835010 32254 free blocks, 16384 free inodes, 0 used directories Group 57: block bitmap at 1867776, inode bitmap at 1867777, inode table at 1867778 32254 free blocks, 16384 free inodes, 0 used directories Group 58: block bitmap at 1900544, inode bitmap at 1900545, inode table at 1900546 32254 free blocks, 16384 free inodes, 0 used directories Group 59:
block bitmap at 1933312, inode bitmap at 1933313, inode table at 1933314 32254 free blocks, 16384 free inodes, 0 used directories Group 60: block bitmap at 1966080, inode bitmap at 1966081, inode table at 1966082 32254 free blocks, 16384 free inodes, 0 used directories Group 61: block bitmap at 1998848, inode bitmap at 1998849, inode table at 1998850 32254 free blocks, 16384 free inodes, 0 used directories Group 62: block bitmap at 2031616, inode bitmap at 2031617, inode table at 2031618 32254 free blocks, 16384 free inodes, 0 used directories Group 63: block bitmap at 2064384, inode bitmap at 2064385, inode table at 2064386 32254 free blocks, 16384 free inodes, 0 used directories Group 64: block bitmap at 2097152, inode bitmap at 2097153, inode table at 2097154 32254 free blocks, 16384 free inodes, 0 used directories Group 65: block bitmap at 2129920, inode bitmap at 2129921, inode table at 2129922 32254 free blocks, 16384 free inodes, 0 used directories Group 66: block bitmap at 2162688, inode bitmap at 2162689, inode table at 2162690 32254 free blocks, 16384 free inodes, 0 used directories Group 67: block bitmap at 2195456, inode bitmap at 2195457, inode table at 2195458 32254 free blocks, 16384 free inodes, 0 used directories Group 68: block bitmap at 2228224, inode bitmap at 2228225, inode table at 2228226 32254 free blocks, 16384 free inodes, 0 used directories Group 69: block bitmap at 2260992, inode bitmap at 2260993, inode table at 2260994 32254 free blocks, 16384 free inodes, 0 used directories Group 70: block bitmap at 2293760, inode bitmap at 2293761, inode table at 2293762 32254 free blocks, 16384 free inodes, 0 used directories Group 71: block bitmap at 2326528, inode bitmap at 2326529, inode table at 2326530 32254 free blocks, 16384 free inodes, 0 used directories Group 72: block bitmap at 2359296, inode bitmap at 2359297, inode table at 2359298 32254 free blocks, 16384 free inodes, 0 used directories Group 73: block bitmap at 2392064,
inode bitmap at 2392065, inode table at 2392066 32254 free blocks, 16384 free inodes, 0 used directories Group 74: block bitmap at 2424832, inode bitmap at 2424833, inode table at 2424834 32254 free blocks, 16384 free inodes, 0 used directories Group 75: block bitmap at 2457600, inode bitmap at 2457601, inode table at 2457602 32254 free blocks, 16384 free inodes, 0 used directories Group 76: block bitmap at 2490368, inode bitmap at 2490369, inode table at 2490370 32254 free blocks, 16384 free inodes, 0 used directories Group 77: block bitmap at 2523136, inode bitmap at 2523137, inode table at 2523138 32254 free blocks, 16384 free inodes, 0 used directories Group 78: block bitmap at 2555904, inode bitmap at 2555905, inode table at 2555906 32254 free blocks, 16384 free inodes, 0 used directories Group 79: block bitmap at 2588672, inode bitmap at 2588673, inode table at 2588674 32254 free blocks, 16384 free inodes, 0 used directories debugfs: quit -- Stephen Warren, Software Engineer, NVIDIA, Fort Collins, CO swarren at wwwdotorg.org http://www.wwwdotorg.org/pgp.html -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 250 bytes Desc: OpenPGP digital signature URL: From adilger at clusterfs.com Thu Dec 9 09:01:13 2004 From: adilger at clusterfs.com ('Andreas Dilger') Date: Thu, 9 Dec 2004 02:01:13 -0700 Subject: Increase size of ext3 filesystem WHILE MOUNTED In-Reply-To: <0I8F00JQSOU8MD@VL-MO-MR010.ip.videotron.ca> References: <20041208003651.GA9872@schnapps.adilger.int> <0I8F00JQSOU8MD@VL-MO-MR010.ip.videotron.ca> Message-ID: <20041209090112.GG2899@schnapps.adilger.int> On Dec 08, 2004 21:42 -0500, Jacques Duplessis wrote: > Thank you. Good news at last! > Do you know if we can get it to work on Enterprise 3.0?
There is an ext3 patch for older 2.4 kernels (maybe 2.4.19 is the most recent one at sourceforge, URL below though I can't seem to access SF right now), but I'd have a hard time believing it would apply without any tweaking to an RH kernel. That said, the "technology" for doing ext3 online resizing hasn't changed very much for years so it wouldn't surprise me if you could take the 2.6 patch and apply it to a 2.4.21-rhel kernel with only minor fixing. You won't get RH support for same, but it will probably work. > -----Original Message----- > From: Andreas Dilger [mailto:adilger at clusterfs.com] > Sent: Tuesday, December 07, 2004 7:37 PM > To: Jacques Duplessis > Cc: ext3-users at redhat.com > Subject: Re: Increase size of ext3 filesystem WHILE MOUNTED > > On Dec 07, 2004 18:53 -0500, Jacques Duplessis wrote: > > We will have to go and use SuSE (30 servers) just because ext3 > > filesystem cannot be increase while the filesystem is mounted. > > Ironically, ext3 online resizing was added to the vanilla kernel on Oct 29 > for 2.6.9, and has been in FC2 for a while before that. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From adilger at clusterfs.com Thu Dec 9 17:55:55 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Thu, 9 Dec 2004 10:55:55 -0700 Subject: resize2fs on LVM on MD raid on Fedora Core 3 - inode table conflicts in fsck In-Reply-To: <1102565691.18351.TMDA@tmda.severn.wwwdotorg.org> References: <1102565691.18351.TMDA@tmda.severn.wwwdotorg.org> Message-ID: <20041209175555.GH2899@schnapps.adilger.int> On Dec 08, 2004 21:16 -0700, Stephen Warren wrote: > I'm attempting to setup a box here to be a file-server for all my data. 
> I'm attempting to resize an ext3 partition to demonstrate this > capability to myself before fully committing to this system as the > primary data storage. I'm having some problems resizing an ext3 > filesystem after I've resized the underlying logical volume. Following > the ext3 resize, fsck spits out lots of errors like: > > Pass 1: Checking inodes, blocks, and sizes > Group 49's inode table at 1605636 conflicts with some other fs block. > Relocate? no > > I believe that I'm following the correct procedure for resizing the > filesystem. Any pointers greatly appreciated. Thanks. > > A complete transcript demonstrating this problem follows: Thanks for the detail, it is clear I think what is happening. > root at SEVERN:~# mke2fs /dev/severn_vg0/test > mke2fs 1.35 (28-Feb-2004) > max_blocks 1342177280, rsv_groups = 40960, rsv_gdb = 319 > Filesystem label= > OS type: Linux > Block size=4096 (log=2) > Fragment size=4096 (log=2) > 655360 inodes, 1310720 blocks > 65536 blocks (5.00%) reserved for the super user > First data block=0 > Maximum filesystem blocks=1342177280 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This implies that the FC3 e2fsprogs has the ext2online patch for mke2fs applied so that it is reserving block group descriptors for online (mounted) filesystem resizing. It seems that resize2fs isn't taking these reserved blocks into account when it is allocating the inode table. > root at SEVERN:~# e2fsck -f /dev/severn_vg0/test > e2fsck 1.35 (28-Feb-2004) > Pass 1: Checking inodes, blocks, and sizes > Group 49's inode table at 1605636 conflicts with some other fs block. > Relocate? 
no > > Group 1: block bitmap at 33089, inode bitmap at 33090, inode table at > 33091 > 31933 free blocks, 16384 free inodes, 0 used directories > Group 3: block bitmap at 98625, inode bitmap at 98626, inode table at > 98627 > 31933 free blocks, 16384 free inodes, 0 used directories > Group 5: block bitmap at 164161, inode bitmap at 164162, inode table > at 164163 > 31933 free blocks, 16384 free inodes, 0 used directories > Group 7: block bitmap at 229697, inode bitmap at 229698, inode table > at 229699 > 31933 free blocks, 16384 free inodes, 0 used directories All of these groups have backup group descriptors, as do all groups numbered {3,5,7}^n. Note free blocks count. > Group 49: block bitmap at 1605634, inode bitmap at 1605635, inode > table at 1605636 > 32252 free blocks, 16384 free inodes, 0 used directories This is the first group with backup descriptors created by resize2fs. It doesn't have the reserved group blocks (about 300 or so) and e2fsck is likely complaining about this. This is obviously a bug that needs to be fixed. The good news is that instead of resizing your filesystem while it is unmounted you can resize it while it is mounted, and that shouldn't suffer from any of these problems (and is much more convenient). You need the ext2resize RPM from sourceforge (don't know why it isn't in FC2 if they have also applied the patch to mke2fs): ftp://rpmfind.net/linux/sourceforge/e/ex/ext2resize/ext2resize-1.1.19-1.i386.rpm Then you can mke2fs a new filesystem, mount it, lvextend, and run "ext2online /dev/severn_vg0/test" and it will grow to fill the LV. There is also a tool that ships with LVM called "e2fsadm" which does this for you, like "e2fsadm -L +5G /dev/severn_vg0/test" should do both the lvextend and ext2online step at once. You should also be able to properly resize it while unmounted with ext2resize, but that is far less interesting... 
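Spelled out as a single session, the grow-while-mounted procedure described above looks roughly like this. This is a sketch only: it reuses the thread's volume group and LV names, assumes a hypothetical mount point of /mnt/test, and needs root plus the ext2online binary from the ext2resize package:

```shell
# Grow a mounted ext3 filesystem on LVM, per the steps described above.
lvextend -L +5G /dev/severn_vg0/test   # 1. grow the logical volume
ext2online /dev/severn_vg0/test        # 2. grow the mounted fs to fill the LV
df -k /mnt/test                        # verify the new size

# Or, where the LVM 1.0 tools ship the e2fsadm wrapper, both steps at once:
# e2fsadm -L +5G /dev/severn_vg0/test
```

Growing the LV first matters: the filesystem can only expand into space the block device already has.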
Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From brugolsky at telemetry-investments.com Thu Dec 9 18:10:22 2004 From: brugolsky at telemetry-investments.com (Bill Rugolsky Jr.) Date: Thu, 9 Dec 2004 13:10:22 -0500 Subject: resize2fs on LVM on MD raid on Fedora Core 3 - inode table conflicts in fsck In-Reply-To: <20041209175555.GH2899@schnapps.adilger.int> References: <1102565691.18351.TMDA@tmda.severn.wwwdotorg.org> <20041209175555.GH2899@schnapps.adilger.int> Message-ID: <20041209181022.GB591@ti64.telemetry-investments.com> On Thu, Dec 09, 2004 at 10:55:55AM -0700, Andreas Dilger wrote: > This is obviously a bug that needs to be fixed. The good news is > that instead of resizing your filesystem while it is unmounted you can > resize it while it is mounted, and that shouldn't suffer from any of > these problems (and is much more convenient). You need the ext2resize > RPM from sourceforge (don't know why it isn't in FC2 if they have also > applied the patch to mke2fs): > > ftp://rpmfind.net/linux/sourceforge/e/ex/ext2resize/ext2resize-1.1.19-1.i386.rpm > > Then you can mke2fs a new filesystem, mount it, lvextend, and run > "ext2online /dev/severn_vg0/test" and it will grow to fill the LV. > There is also a tool that ships with LVM called "e2fsadm" which does > this for you, like "e2fsadm -L +5G /dev/severn_vg0/test" should do > both the lvextend and ext2online step at once. > > You should also be able to properly resize it while unmounted with > ext2resize, but that is far less interesting... Looks like resize2fs has not been fixed to handle the new resizing code, but ext2online is in the Red Hat e2fsprogs. I used it the other day on my root filesystem, and it seemed to work fine. 
[rugolsky at mercury ~]$ rpm -qf /usr/sbin/ext2online e2fsprogs-1.35-11.2 Regards, Bill Rugolsky From swarren at wwwdotorg.org Thu Dec 9 18:55:26 2004 From: swarren at wwwdotorg.org (Stephen Warren) Date: Thu, 09 Dec 2004 11:55:26 -0700 Subject: resize2fs on LVM on MD raid on Fedora Core 3 - inode table conflicts in fsck In-Reply-To: <20041209175555.GH2899@schnapps.adilger.int> References: <1102565691.18351.TMDA@tmda.severn.wwwdotorg.org> <20041209175555.GH2899@schnapps.adilger.int> Message-ID: <1102618527.31951.TMDA@tmda.severn.wwwdotorg.org> Andreas Dilger wrote: > On Dec 08, 2004 21:16 -0700, Stephen Warren wrote: > >>I'm attempting to setup a box here to be a file-server for all my data. >>I'm attempting to resize an ext3 partition to demonstrate this >>capability to myself before fully committing to this system as the >>primary data storage. I'm having some problems resizing an ext3 >>filesystem after I've resized the underlying logical volume. Following >>the ext3 resize, fsck spits out lots of errors like: >>... > This is obviously a bug that needs to be fixed. The good news is > that instead of resizing your filesystem while it is unmounted you can > resize it while it is mounted, and that shouldn't suffer from any of > these problems (and is much more convenient). You need the ext2resize > RPM from sourceforge (don't know why it isn't in FC2 if they have also > applied the patch to mke2fs): > > ftp://rpmfind.net/linux/sourceforge/e/ex/ext2resize/ext2resize-1.1.19-1.i386.rpm > > Then you can mke2fs a new filesystem, mount it, lvextend, and run > "ext2online /dev/severn_vg0/test" and it will grow to fill the LV. > There is also a tool that ships with LVM called "e2fsadm" which does > this for you, like "e2fsadm -L +5G /dev/severn_vg0/test" should do > both the lvextend and ext2online step at once. > > You should also be able to properly resize it while unmounted with > ext2resize, but that is far less interesting... Thanks everyone for your help. 
I have one question and another problem: Question: On the filesystem that I ran resize2fs on, is it now broken beyond repair? I originally found the problem on a 'real' data volume. After the resize, I can certainly mount it (read-only - didn't try rw) and pull the data off it, but I assume that the fsck errors are legitimate and the fs is corrupt, such that I should run mke2fs again from scratch? This isn't a huge problem, since I was attempting to resize my "backup" LV, so it was just copies of other data... Problem: Well, I do have ext2online installed, although there's no e2fsadm. I'll have to check for other RPMs... So, I went ahead and tried ext2online, but I get a bunch of errors during the execution - it's indicating my kernel doesn't have online resize support compiled in. I would have assumed the FC(3) kernel did, but I guess I should go check. Still, the debug trace from ext2online indicates some other errors about allocating things before the kernel problem. Short version of log of ext2online: =============================== ext2online: error reserving block 0x18813e finding 0x18813f in inode adding 0x18813f to inode ext2online: error reserving block 0x18813f finding 0x188140 in inode adding 0x188140 to inode ext2online: error reserving block 0x188140 mark 16384 unavailable end inodes used ... ... Calling mount() for /dev/mapper/severn_vg0-test(/mnt/test) with resize=2621440:319 ext2online: resize failed while in kernel ext2online: Invalid argument ext2online: does the kernel support online resizing? SEVERN:/mnt# echo $? 
4 =============================== Full details: SEVERN:/mnt# lvcreate -L 5G -n test severn_vg0 Logical volume "test" created SEVERN:/mnt# mke2fs /dev/severn_vg0/test mke2fs 1.35 (28-Feb-2004) max_blocks 1342177280, rsv_groups = 40960, rsv_gdb = 319 Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) 655360 inodes, 1310720 blocks 65536 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=1342177280 40 block groups 32768 blocks per group, 32768 fragments per group 16384 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736 Writing inode tables: done inode.i_blocks = 20424, i_size = 4243456 Writing superblocks and filesystem accounting information: done This filesystem will be automatically checked every 23 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. SEVERN:/mnt# fsck -f /dev/severn_vg0/test fsck 1.35 (28-Feb-2004) e2fsck 1.35 (28-Feb-2004) Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information /dev/severn_vg0/test: 11/655360 files (9.1% non-contiguous), 23134/1310720 blocks SEVERN:/mnt# lvextend -L +5G /dev/severn_vg0/test Extending logical volume test to 10.00 GB Logical volume test successfully resized SEVERN:/mnt# mount /dev/severn_vg0/test /mnt/test SEVERN:/mnt# ext2online -d -v /dev/severn_vg0/test ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b ext2_open ext2_bcache_init new filesystem size 2621440 ext2_determine_itoffset setting itoffset to +323 group 2 inode table has offset 2, not 323 group 4 inode table has offset 2, not 323 group 6 inode table has offset 2, not 323 group 8 inode table has offset 2, not 323 group 10 inode table has offset 2, not 323 ... 
same message through to group 39 ext2_get_reserved Found 319 blocks in s_reserved_gdt_blocks using 319 reserved group descriptor blocks 40 old groups, 1 blocks 80 new groups, 1 blocks ext2_ioctl: EXTEND group to 1310720 blocks ext2online: ext2_ioctl: Inappropriate ioctl for device creating group 40 with 32768 blocks (rsvd = 319, newgd = 1) using itoffset of 323 new block bitmap is at 0x140000 new inode bitmap is at 0x140001 new inode table is at 0x140143-0x140342 new group has 32254 free blocks new group has 16384 free inodes (512 blocks) mark 16384 unavailable end inodes used creating group 41 with 32768 blocks (rsvd = 319, newgd = 1) using itoffset of 323 new block bitmap is at 0x148000 new inode bitmap is at 0x148001 new inode table is at 0x148143-0x148342 new group has 32254 free blocks new group has 16384 free inodes (512 blocks) mark 16384 unavailable end inodes used ... same message for many other groups creating group 49 with 32768 blocks (rsvd = 319, newgd = 1) using itoffset of 323 new block bitmap is at 0x188141 new inode bitmap is at 0x188142 new inode table is at 0x188143-0x188342 new group has 31933 free blocks new group has 16384 free inodes (512 blocks) mark superblock 0x188000 used mark group desc. 0x188001-0x188001 used mark reserved descriptors 0x188002-0x188140 used finding 0x188002 in inode adding 0x188002 to inode add 0 as direct block finding 0x188003 in inode adding 0x188003 to inode add 1 as direct block finding 0x188004 in inode adding 0x188004 to inode add 2 as direct block ... 
same message for many inodes ext2online: error reserving block 0x18813e finding 0x18813f in inode adding 0x18813f to inode ext2online: error reserving block 0x18813f finding 0x188140 in inode adding 0x188140 to inode ext2online: error reserving block 0x188140 mark 16384 unavailable end inodes used creating group 50 with 32768 blocks (rsvd = 319, newgd = 1) using itoffset of 323 new block bitmap is at 0x190000 new inode bitmap is at 0x190001 new inode table is at 0x190143-0x190342 new group has 32254 free blocks new group has 16384 free inodes (512 blocks) mark 16384 unavailable end inodes used ... same message for many groups creating group 68 with 32768 blocks (rsvd = 319, newgd = 1) using itoffset of 323 new block bitmap is at 0x220000 new inode bitmap is at 0x220001 new inode table is at 0x220143-0x220342 new group has 32254 free blocks new group has 16384 free inodes (512 blocks) mark 16384 unavailable end inodes used creating group 69 with 32768 blocks (rsvd = 319, newgd = 1) using itoffset of 323 new block bitmap is at 0x228000 new inode bitmap is at 0x228001 new inode table is at 0x228143-0x228342 new group has 32254 free blocks new group has 16384 free inodes (512 blocks) mark 16384 unavailable end inodes used creating group 79 with 32768 blocks (rsvd = 319, newgd = 1) using itoffset of 323 new block bitmap is at 0x278000 new inode bitmap is at 0x278001 new inode table is at 0x278143-0x278342 new group has 32254 free blocks new group has 16384 free inodes (512 blocks) mark 16384 unavailable end inodes used ext2online: resizing to 2621440 blocks ...flushing buffer 81/block 2588673 Calling mount() for /dev/mapper/severn_vg0-test(/mnt/test) with resize=2621440:319 ext2online: resize failed while in kernel ext2online: Invalid argument ext2online: does the kernel support online resizing? SEVERN:/mnt# echo $? 4 SEVERN:/mnt# df -k Filesystem 1K-blocks Used Available Use% Mounted on ... 
/dev/mapper/severn_vg0-test 5160576 10232 4888200 1% /mnt/test -- Stephen Warren, Software Engineer, NVIDIA, Fort Collins, CO swarren at wwwdotorg.org http://www.wwwdotorg.org/pgp.html -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 250 bytes Desc: OpenPGP digital signature URL: From Frank at lists.sytes.net Fri Dec 10 02:12:27 2004 From: Frank at lists.sytes.net (Frank) Date: Fri, 10 Dec 2004 03:12:27 +0100 Subject: Increase size of ext3 filesystem WHILE MOUNTED In-Reply-To: <0I8D006W4MD3OU@VL-MO-MR010.ip.videotron.ca> References: <0I8D006W4MD3OU@VL-MO-MR010.ip.videotron.ca> Message-ID: <20041210021229.18B0175203@mail.figaro.fr> Hello, how can I resize an ext3 FS without LVM? Just a simple mounted ext3 FS. In this worst-case scenario I would try to give the root partition maybe another 200MB. Is this possible online? /dev/hda1 2972236 2322832 495988 83% / /dev/hda3 111791952 90501492 15611888 86% /mnt/hda3 /dev/hdd1 59106972 655884 55448544 2% /mnt/hdd1 Greetings, Frank From adilger at clusterfs.com Fri Dec 10 07:17:37 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Fri, 10 Dec 2004 00:17:37 -0700 Subject: Increase size of ext3 filesystem WHILE MOUNTED In-Reply-To: <20041210021229.18B0175203@mail.figaro.fr> References: <0I8D006W4MD3OU@VL-MO-MR010.ip.videotron.ca> <20041210021229.18B0175203@mail.figaro.fr> Message-ID: <20041210071737.GA9923@schnapps.adilger.int> On Dec 10, 2004 03:12 +0100, Frank wrote: > How can I resize an ext3 FS without LVM? Just a simple mounted ext3 FS. > In this worst-case scenario I would try to give the root partition maybe > another 200MB. Is this possible online? > > /dev/hda1 2972236 2322832 495988 83% / > /dev/hda3 111791952 90501492 15611888 86% /mnt/hda3 > /dev/hdd1 59106972 655884 55448544 2% /mnt/hdd1 Use "GNU parted" to resize while unmounted.
Sorry, but it isn't possible to do filesystem resizing while mounted if the underlying container (DOS partitions in this case) is not also safely resizable. Sure, you can rewrite the partition table, but that doesn't move the data. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From guolin at alexa.com Fri Dec 10 21:49:10 2004 From: guolin at alexa.com (Guolin Cheng) Date: Fri, 10 Dec 2004 13:49:10 -0800 Subject: Problem: FC3 anaconda load kernel sata modules in wrong order Message-ID: <41089CB27BD8D24E8385C8003EDAF7AB01FE6636@karl.alexa.com> Hi, I ran into a problem installing FC3 on one of our machines. The machine has two SATA hard drives connected directly to the motherboard, which use the ata_piix kernel module, and two other SATA hard drives connected through a Promise SATA card, which is driven by the sata_promise kernel module. The problem is that FC3's anaconda loads the sata_promise module before ata_piix by default, so sda and sdb are assigned to the two drives on the Promise card. After the install completes successfully, the reboot fails with a "no operating system found" error: the machine boots from the drives connected directly to the motherboard (set in the BIOS) and cannot boot from drives on a Promise SATA PCI card. Does anyone know if it is possible to specify the kernel module loading order for anaconda during the initial installation stage? Thanks. --Guolin -------------- next part -------------- An HTML attachment was scrubbed...
URL: From adilger at clusterfs.com Sat Dec 11 07:57:08 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Sat, 11 Dec 2004 00:57:08 -0700 Subject: resize2fs on LVM on MD raid on Fedora Core 3 - inode table conflicts in fsck In-Reply-To: <1102618527.31951.TMDA@tmda.severn.wwwdotorg.org> References: <1102565691.18351.TMDA@tmda.severn.wwwdotorg.org> <20041209175555.GH2899@schnapps.adilger.int> <1102618527.31951.TMDA@tmda.severn.wwwdotorg.org> Message-ID: <20041211075708.GB9923@schnapps.adilger.int> On Dec 09, 2004 11:55 -0700, Stephen Warren wrote: > On the filesystem that I ran resize2fs on, is it now broken beyond > repair? I originally found the problem on a 'real' data volume. After > the resize, I can certainly mount it (read-only - didn't try rw) and > pull the data off it, but I assume that the fsck errors are legitimate > and the fs is corrupt, such that I should run mke2fs again from scratch? You could either do a full e2fsck on it (it should be able to recover this since the metadata that the inode table is overlapping is unused and e2fsck will make a copy of the duplicate blocks) or you could use resize2fs to shrink it back to the original size (that should also make the problem go away). Best to do this before doing any writes to the filesystem, and I would suggest making a backup, but since this is already a backup... > Well, I do have ext2online installed, although there's no e2fsadm. I'll > have to check for other RPMs... It would be part of the lvm 1.0 tools if it was anywhere. It is just a convenience wrapper around lvextend and ext2online (or resize2fs for unmounted resizes). > So, I went ahead and tried ext2online, but I get a bunch of errors > during the execution - it's indicating my kernel doesn't have online > resize support compiled in. I would have assumed the FC(3) kernel did, > but I guess I should go check. Still, the debug trace from ext2online > indicates some other errors about allocating things before the kernel > problem. 
> > Short version of log of ext2online: > > =============================== > ext2online: error reserving block 0x18813e > finding 0x18813f in inode > adding 0x18813f to inode > ext2online: error reserving block 0x18813f > finding 0x188140 in inode > adding 0x188140 to inode > ext2online: error reserving block 0x188140 > mark 16384 unavailable end inodes used > ... > ... > Calling mount() for /dev/mapper/severn_vg0-test(/mnt/test) with > resize=2621440:319 > ext2online: resize failed while in kernel > ext2online: Invalid argument > ext2online: does the kernel support online resizing? Having the kernel messages would be very useful here. When I wrote that patch it was an optional feature (under the ext3 filesystem option in the config, maybe marked experimental), but looking at the current Linus kernel it appears to always be enabled (i.e. no config option). > SEVERN:/mnt# lvcreate -L 5G -n test severn_vg0 > Logical volume "test" created > > SEVERN:/mnt# mke2fs /dev/severn_vg0/test > mke2fs 1.35 (28-Feb-2004) > max_blocks 1342177280, rsv_groups = 40960, rsv_gdb = 319 > Filesystem label= > OS type: Linux > Block size=4096 (log=2) > Fragment size=4096 (log=2) > 655360 inodes, 1310720 blocks > 65536 blocks (5.00%) reserved for the super user > First data block=0 > Maximum filesystem blocks=1342177280 > 40 block groups > 32768 blocks per group, 32768 fragments per group > 16384 inodes per group > Superblock backups stored on blocks: > 32768, 98304, 163840, 229376, 294912, 819200, 884736 > > Writing inode tables: done > inode.i_blocks = 20424, i_size = 4243456 Hmm, those are a bit strange, not sure where they are coming from. 
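As an aside on the mke2fs output above: the rsv_gdb = 319 figure is what bounds future growth. A sketch of the arithmetic (my own calculation, using the standard ext2/3 on-disk constants for a 4 KB block size; only the numbers labelled as coming from the mke2fs output are from this thread):

```python
# Sketch: growth headroom implied by the "rsv_gdb = 319" line in the
# mke2fs output above (standard ext2/3 on-disk constants assumed).

BLOCK_SIZE = 4096          # "Block size=4096" from the mke2fs output
DESC_SIZE = 32             # bytes per block-group descriptor
BLOCKS_PER_GROUP = 32768   # "32768 blocks per group" from the output
desc_blocks = 1            # descriptor blocks used by the 40 current groups
rsv_gdb = 319              # reserved group-descriptor blocks

# Each descriptor block can describe BLOCK_SIZE / DESC_SIZE groups.
max_groups = (desc_blocks + rsv_gdb) * (BLOCK_SIZE // DESC_SIZE)
max_blocks = max_groups * BLOCKS_PER_GROUP

print(max_groups)   # 40960 groups
print(max_blocks)   # 1342177280, matching "Maximum filesystem blocks="
print(max_blocks * BLOCK_SIZE // 2**40, "TiB")  # 5 TiB, i.e. 1024x 5 GiB
```

The result matches the "Maximum filesystem blocks=1342177280" line mke2fs printed, i.e. 1024 times the original 5 GiB filesystem.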
> SEVERN:/mnt# ext2online -d -v /dev/severn_vg0/test
> ext2online v1.1.18 - 2001/03/18 for EXT2FS 0.5b
> ext2_open
> ext2_bcache_init
> new filesystem size 2621440
> ext2_determine_itoffset
> setting itoffset to +323
> group 2 inode table has offset 2, not 323
> group 4 inode table has offset 2, not 323
> group 6 inode table has offset 2, not 323
> group 8 inode table has offset 2, not 323
> group 10 inode table has offset 2, not 323
> ... same message through to group 39

These are normal - it just tells us that mke2fs didn't shift the inode tables for non-backup groups.

> ext2_get_reserved
> Found 319 blocks in s_reserved_gdt_blocks
> using 319 reserved group descriptor blocks

This tells us mke2fs reserved 319 group descriptor blocks, so we can resize up to 320 * 4096 / 32 = 40960 groups * 128MB/group = 5TB. By default it reserves enough blocks to resize to 1024x the original fs size, at the cost of 11MB of used space in the original 5GB filesystem.

> 40 old groups, 1 blocks
> 80 new groups, 1 blocks

We are staying within the original group descriptor block (which makes the resize a bit easier).

> ext2_ioctl: EXTEND group to 1310720 blocks
> ext2online: ext2_ioctl: Inappropriate ioctl for device

This is where it starts to look like online resize isn't available. For compatibility with older kernels (2.4.x doing ext2 online resizing, which needs a patch) it doesn't just bail out at this point, since the user-to-kernel resizing mechanism was different there.

> creating group 49 with 32768 blocks (rsvd = 319, newgd = 1)
> using itoffset of 323
> new block bitmap is at 0x188141
> new inode bitmap is at 0x188142
> new inode table is at 0x188143-0x188342
> new group has 31933 free blocks
> new group has 16384 free inodes (512 blocks)
> mark superblock 0x188000 used
> mark group desc.
0x188001-0x188001 used
> mark reserved descriptors 0x188002-0x188140 used
> finding 0x188002 in inode
> adding 0x188002 to inode
> add 0 as direct block
> finding 0x188003 in inode
> adding 0x188003 to inode
> add 1 as direct block
> finding 0x188004 in inode
> adding 0x188004 to inode
> add 2 as direct block
>
> ... same message for many inodes
>
> ext2online: error reserving block 0x18813e
> finding 0x18813f in inode
> adding 0x18813f to inode
> ext2online: error reserving block 0x18813f
> finding 0x188140 in inode
> adding 0x188140 to inode
> ext2online: error reserving block 0x188140

This is probably some bad interaction between the ext2 resize mechanism and the ext3 one. I think this ext2 online resize code should be removed from ext2online at this point, since it isn't really being supported and is just causing grief.

> ext2online: resizing to 2621440 blocks
> ...flushing buffer 81/block 2588673
> Calling mount() for /dev/mapper/severn_vg0-test(/mnt/test) with
> resize=2621440:319
> ext2online: resize failed while in kernel
> ext2online: Invalid argument
> ext2online: does the kernel support online resizing?

Here we probably get another kernel message like 'Unrecognized mount option "resize" or missing value'.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://members.shaw.ca/adilger/
http://members.shaw.ca/golinux/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL:

From puhuri at iki.fi Wed Dec 15 17:01:45 2004
From: puhuri at iki.fi (Markus Peuhkuri)
Date: Wed, 15 Dec 2004 19:01:45 +0200
Subject: Trying to recover corrupted journal
Message-ID: <41C06DF9.5080003@iki.fi>

My IDE disk went bad (surprisingly, S.M.A.R.T. indicated it was OK even while I was still getting errors). I managed to copy most of the 120 GB partition with dd_recover; only something like 600 blocks were unreadable.
After 'e2fsck -y' (or would -p have been better?), most of the important data not found in backups was restored. However, one important file is still missing (an older copy exists, so not everything is lost). Apparently it was open or being written to disk when the disk errors started. If I look at the disk image with debugfs, the file is listed in the right directory, but if I try to dump it I get a zero-sized file (by the way, what are the fields in debugfs's ls command?). One of the problems is that the journal was corrupted (it has a wrong magic number, possibly from a read error right at its start). Is there some way to recover part of the journal? One idea is to search the blocks for the start of the file (OOo sxc: "PK\003\004\024\0\0"..."mimetypeapplication/vnd.sun.xml.calc"), but after that the problem is locating the remaining 15 blocks. How is the journal organized? Is there any detailed documentation on the ext3 journal (in addition to the kernel sources :-)? Most of the documentation I found was for ext2 and didn't cover journalling at all.

--
http://iki.fi/puhuri

From strombrg at dcs.nac.uci.edu Wed Dec 15 23:41:04 2004
From: strombrg at dcs.nac.uci.edu (Dan Stromberg)
Date: Wed, 15 Dec 2004 15:41:04 -0800
Subject: toasted ext3 filesystem under lvm2
Message-ID:

I have a Fedora Core 3 system at home that was running fine but now won't boot. Someone shut the power off on it without doing an orderly shutdown, and I also sometimes apply patches with "yum -y update" without rebooting immediately afterward - I suppose either of these could be related to my system not booting. I have a lot of information about the early stages of the problem at: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=142737 In short, FC3 won't boot, and the FC3 rescue CD's automatic filesystem mounting crashes. Mounting the filesystem manually fails with "invalid argument". I suspect this is either ext3 corruption, or lvm2 corruption, or both.
I've made a copy of the partition on another system and have been testing various things on that copy (various fsck commands, e2salvage, e2extract), but fsck complains about bad superblocks, e2salvage apparently ran out of memory, and e2extract just listed millions of zero-length files.

I wrote a small python script that hunts for ext3 magic numbers, and it found some both in a known-good ext3 and in my corrupt partition image. The first offsets were the same, but the others were different. All ended in hexadecimal 38. Does anyone know how to convert such numbers, relative to the beginning of the partition in bytes, to an appropriate fsck -b argument? What units does fsck -b take?

The disk itself appears to be fine, as I can mount its /boot, and I got no errors when I dd'd off the partition image.

When I left for work this morning, I had a for loop of 1 million fscks with different -b's (and -vn) running against a copy of the partition, to see if it would eventually hit upon a usable superblock (assuming mkfs -n isn't doing what it should, and also, I just don't want to type every last number...). But it doesn't seem likely to bear fruit, really.

I also ran memtest86 on the troubled system for a little over an hour, but found no errors.

The machine was a $299 deal from bilsystem.com, which arrived unassembled. However, it's been stable until now, other than one time when I had to replace its RAM.

Does anyone have any suggestions for me? I'd really like to get this data back!

PS: I wrote something very much like e2extract for the Atari 800 when I was in high school... If anyone has any thoughts about the general structure of such a program for ext3, I might dive into writing one. A small tree diagram of the on-disk data structures involved, with the 1-n, n-1 and n-n relationships, might be enough to get a good start on it. But I'd rather not reinvent the wheel if it's already out there.

Thanks!
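On the question of converting those byte offsets into an fsck -b argument: e2fsck's -b option takes a filesystem block number, and -B sets the block size it assumes. The s_magic field lives 0x38 bytes into the superblock, which is consistent with every hit ending in hex 38, so a hit at byte offset X puts a candidate superblock at X - 0x38, and dividing by the block size gives the number to pass to -b. A rough sketch (my own, assuming 4 KB blocks; the helper name is made up):

```python
def fsck_b_candidate(magic_offset, block_size=4096):
    """Turn the byte offset of an ext2/3 s_magic hit into a block
    number for 'e2fsck -b N -B <block_size>'.

    s_magic sits 0x38 bytes into the superblock.  Backup superblocks
    start exactly at a block-group boundary, so a plausible candidate
    must leave a block-aligned superblock offset.
    """
    sb_offset = magic_offset - 0x38
    if sb_offset < 0 or sb_offset % block_size:
        return None  # not a plausible backup-superblock location
    return sb_offset // block_size

# First backup superblock of a 4 KB-block filesystem is normally 32768:
print(fsck_b_candidate(32768 * 4096 + 0x38))  # -> 32768
```

(The primary superblock always lives at byte 1024, which is not block-aligned for 4 KB blocks; e2fsck finds that one on its own, so only the backup locations are interesting here.)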
From strombrg at dcs.nac.uci.edu Fri Dec 17 18:27:42 2004
From: strombrg at dcs.nac.uci.edu (Dan Stromberg)
Date: Fri, 17 Dec 2004 10:27:42 -0800
Subject: toasted ext3 filesystem under lvm2
References:
Message-ID:

This fixed the problem. On FC3's rescue disk:

1) Start up the network interfaces
2) Don't try to automatically mount the filesystems - not even readonly
3) lvm vgchange --ignorelockingfailure -P -a y
4) fdisk -l, and guess which partition is which based on size: the small one was /boot, and the large one was /
5) mkdir /mnt/boot
6) mount /dev/hda1 /mnt/boot
7) Look up the device node for the root filesystem in /mnt/boot/grub/grub.conf
8) As a first tentative step, to see if things are working: fsck -n /dev/VolGroup00/LogVol00
9) Dive in: fsck -f -y /dev/VolGroup00/LogVol00
10) Wait a while... Be patient. Don't interrupt it
11) Reboot

On Wed, 15 Dec 2004 15:41:04 -0800, Dan Stromberg wrote:

> I have a Fedora Core 3 system at home, that was running fine, but now
> won't boot.
>
> Someone shut the power off on it without doing an orderly shutdown, and
> also I sometimes apply patches with "yum -y update" without doing a reboot
> immediately afterward - I suppose either of these could be related to my
> system not booting.
>
> I have a lot of information about the early stages of the problem at:
>
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=142737
>
> In short, FC3 won't boot, and the FC3 rescue CD's automatic filesystem
> mounting crashes. Mounting the filesystem manually errors with "invalid
> argument". I suspect this is either ext3 corruption, or lvm2 corruption,
> or both.
>
> I've made a copy of the partition on another system, and have been testing
> various things on that copy, like various fsck commands, e2salvage,
> e2extract, but fsck complains about bad superblocks, e2salvage apparently
> ran out of memory, and e2extract just listed millions of 0 length files.
>
> I wrote a small python script that hunts for ext3 magic numbers, and it
> found some in both a known-good ext3, as well as my corrupt partition
> image. The first offsets were the same, but others were different. All
> ended in hexadecimal 38. Does anyone know how to convert such numbers,
> relative to the beginning of the partition in bytes, to an appropriate
> fsck -b argument? What units does fsck -b take?
>
> The disk itself appears to be fine, as I can mount its /boot, and I got no
> errors when I dd'd off the partition image.
>
> When I left for work this morning, I had for loop with 1 million fsck's
> with different -b's (and -vn) running against a copy of the partition, to
> see if it would eventually hit upon a usable superblock (assuming mkfs -n
> isn't doing what it should, and also, I just don't want to type every
> last number...). But it doesn't seem likely to bear fruit, really.
>
> I also ran memtest86 on the system that had the trouble for a little over
> an hour, but found no errors.
>
> The machine was a $299 deal from bilsystem.com, which arrived unassembled.
> However, it's been stable until now, other than a time I had to
> replace its RAM.
>
> Does anyone have any suggestions for me? I'd really like to get this data
> back!
>
> PS: I wrote something very much like e2extract for the atari 800 when I
> was in high school... If anyone has any thoughts about the general
> structure of such a program for ext3... I might dive into writing one. A
> small tree diagram of the on-disk data structures involved with 1-n and
> n-1 and n-n relationships might be enough to get a good start on it. But
> I'd rather not reinvent the wheel if it's already out there.
>
> Thanks!
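The magic-number hunt described in the quoted message can be sketched roughly like this (my own reconstruction, not the script Dan actually wrote; 0xEF53 is the ext2/3 s_magic value, stored little-endian at offset 0x38 of each superblock):

```python
def find_superblocks(image_path, block_size=4096):
    """Scan a partition image for plausible ext2/3 superblocks by
    looking for the little-endian magic 0xEF53 at offset 0x38 of the
    primary superblock (byte 1024) and of every block boundary, which
    is where backup superblocks live.  Returns byte offsets."""
    hits = []
    with open(image_path, "rb") as f:
        f.seek(1024 + 0x38)            # primary superblock, fixed at 1024
        if f.read(2) == b"\x53\xef":
            hits.append(1024)
        offset = block_size            # backup superblocks: block-aligned
        while True:
            f.seek(offset + 0x38)
            chunk = f.read(2)
            if len(chunk) < 2:         # past end of image
                break
            if chunk == b"\x53\xef":
                hits.append(offset)
            offset += block_size
    return hits
```

Each backup hit divided by the block size is then a candidate block number for e2fsck -b, per the question in the quoted message.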
From tytso at mit.edu Thu Dec 23 14:09:04 2004
From: tytso at mit.edu (Theodore Ts'o)
Date: Thu, 23 Dec 2004 09:09:04 -0500
Subject: resize2fs on LVM on MD raid on Fedora Core 3 - inode table conflicts in fsck
In-Reply-To: <20041211075708.GB9923@schnapps.adilger.int>
References: <1102565691.18351.TMDA@tmda.severn.wwwdotorg.org> <20041209175555.GH2899@schnapps.adilger.int> <1102618527.31951.TMDA@tmda.severn.wwwdotorg.org> <20041211075708.GB9923@schnapps.adilger.int>
Message-ID: <20041223140904.GB11120@thunk.org>

On Sat, Dec 11, 2004 at 12:57:08AM -0700, Andreas Dilger wrote:
> You could either do a full e2fsck on it (it should be able to recover this
> since the metadata that the inode table is overlapping is unused and e2fsck
> will make a copy of the duplicate blocks) or you could use resize2fs to
> shrink it back to the original size (that should also make the problem go
> away). Best to do this before doing any writes to the filesystem, and I
> would suggest making a backup, but since this is already a backup...

Actually, this can (fairly often) fail, since the new group descriptor blocks plus the reserved GDT blocks can overlap with the inode table, and e2fsck doesn't deal well with relocating the inode table if there aren't enough contiguous blocks. At that point, e2fsck will fail to correct things.

Shrinking the filesystem *should* work, at least in most cases, but the method which I have tested involves applying the following patch to e2fsprogs:

http://e2fsprogs.bkbits.net:8080/e2fsprogs/gnupatch at 41c232c9j54wXA9ZNSaEuTwkzJI5QQ

and then running the commands:

debugfs -w /dev/hdXXX -R "features ^resize_inode"
e2fsck -f /dev/hdXXXX

to remove the resize_inode feature. If you want on-line resizing, you can then run ext2prepare, and that should restore the resize_inode safely.
I have a patch to resize2fs (attached) to make it aware of the resize_inode feature, but I'm still testing it for 100% correctness, so it's for review and examination purposes only at this point; it hasn't been committed into my sources yet.

- Ted

-------------- next part --------------
===== resize/resize2fs.c 1.33 vs edited =====
--- 1.33/resize/resize2fs.c	2004-09-17 17:10:17 -04:00
+++ edited/resize/resize2fs.c	2004-12-23 08:24:47 -05:00
@@ -46,6 +46,7 @@
 static errcode_t inode_scan_and_fix(ext2_resize_t rfs);
 static errcode_t inode_ref_fix(ext2_resize_t rfs);
 static errcode_t move_itables(ext2_resize_t rfs);
+static errcode_t fix_resize_inode(ext2_filsys fs);
 static errcode_t ext2fs_calculate_summary_stats(ext2_filsys fs);
 
 /*
@@ -133,6 +134,10 @@
 	if (retval)
 		goto errout;
 
+	retval = fix_resize_inode(rfs->new_fs);
+	if (retval)
+		goto errout;
+
 	retval = ext2fs_close(rfs->new_fs);
 	if (retval)
 		goto errout;
@@ -205,12 +210,13 @@
 	 * includes the superblock backup, the group descriptor
 	 * backups, the inode bitmap, the block bitmap, and the inode
 	 * table.
-	 *
-	 * XXX Not all block groups need the descriptor blocks, but
-	 * being clever is tricky...
 	 */
-	overhead = 3 + fs->desc_blocks + fs->inode_blocks_per_group;
-
+	overhead = (int) (2 + fs->inode_blocks_per_group);
+
+	if (ext2fs_bg_has_super(fs, fs->group_desc_count - 1))
+		overhead += 1 + fs->desc_blocks +
+			fs->super->s_reserved_gdt_blocks;
+
 	/*
 	 * See if the last group is big enough to support the
 	 * necessary data structures.  If not, we need to get rid of
@@ -288,6 +294,29 @@
 	}
 
 	/*
+	 * If the resize_inode feature is set, and we are changing the
+	 * number of descriptor blocks, then adjust
+	 * s_reserved_gdt_blocks if possible to avoid needing to move
+	 * the inode table either now or in the future.
+	 */
+	if ((fs->super->s_feature_compat &
+	     EXT2_FEATURE_COMPAT_RESIZE_INODE) &&
+	    (rfs->old_fs->desc_blocks != fs->desc_blocks)) {
+		int new;
+
+		new = ((int) fs->super->s_reserved_gdt_blocks) +
+			(rfs->old_fs->desc_blocks - fs->desc_blocks);
+		if (new < 0)
+			new = 0;
+		if (new > fs->blocksize/4)
+			new = fs->blocksize/4;
+		fs->super->s_reserved_gdt_blocks = new;
+		if (new == 0)
+			fs->super->s_feature_compat &=
+				~EXT2_FEATURE_COMPAT_RESIZE_INODE;
+	}
+
+	/*
 	 * If we are shrinking the number block groups, we're done and
 	 * can exit now.
 	 */
@@ -346,7 +375,8 @@
 	if (fs->super->s_feature_incompat & EXT2_FEATURE_INCOMPAT_META_BG)
 		old_desc_blocks = fs->super->s_first_meta_bg;
 	else
-		old_desc_blocks = fs->desc_blocks;
+		old_desc_blocks = fs->desc_blocks +
+			fs->super->s_reserved_gdt_blocks;
 	for (i = rfs->old_fs->group_desc_count;
 	     i < fs->group_desc_count; i++) {
 		memset(&fs->group_desc[i], 0,
@@ -466,7 +496,8 @@
 	if (fs->super->s_feature_incompat & EXT2_FEATURE_INCOMPAT_META_BG)
 		old_desc_blocks = fs->super->s_first_meta_bg;
 	else
-		old_desc_blocks = fs->desc_blocks;
+		old_desc_blocks = fs->desc_blocks +
+			fs->super->s_reserved_gdt_blocks;
 	for (i = 0; i < fs->group_desc_count; i++) {
 		has_super = ext2fs_bg_has_super(fs, i);
 		if (has_super)
@@ -613,8 +644,8 @@
 		old_blocks = old_fs->super->s_first_meta_bg;
 		new_blocks = fs->super->s_first_meta_bg;
 	} else {
-		old_blocks = old_fs->desc_blocks;
-		new_blocks = fs->desc_blocks;
+		old_blocks = old_fs->desc_blocks + old_fs->super->s_reserved_gdt_blocks;
+		new_blocks = fs->desc_blocks + fs->super->s_reserved_gdt_blocks;
 	}
 
 	if (old_blocks == new_blocks) {
@@ -1100,7 +1131,7 @@
 		if (!ino)
 			break;
 
-		if (inode.i_links_count == 0)
+		if (inode.i_links_count == 0 && ino != EXT2_RESIZE_INO)
 			continue; /* inode not in use */
 		pb.is_dir = LINUX_S_ISDIR(inode.i_mode);
@@ -1424,6 +1455,57 @@
 	return 0;
 
 errout:
+	return retval;
+}
+
+/*
+ * Fix the resize inode
+ */
+static errcode_t fix_resize_inode(ext2_filsys fs)
+{
+	struct ext2_inode	inode;
+	errcode_t		retval;
+	char *			block_buf;
+
+	if (!(fs->super->s_feature_compat &
+	      EXT2_FEATURE_COMPAT_RESIZE_INODE))
+		return 0;
+
+	retval = ext2fs_get_mem(fs->blocksize, &block_buf);
+	if (retval) goto errout;
+
+	retval = ext2fs_read_inode(fs, EXT2_RESIZE_INO, &inode);
+	if (retval) goto errout;
+
+	inode.i_blocks = fs->blocksize/512;
+
+	retval = ext2fs_write_inode(fs, EXT2_RESIZE_INO, &inode);
+	if (retval) goto errout;
+
+	if (!inode.i_block[EXT2_DIND_BLOCK]) {
+		/*
+		 * Avoid zeroing out block #0; that's rude.  This
+		 * should never happen anyway since the filesystem
+		 * should be fsck'ed and we assume it is consistent.
+		 */
+		fprintf(stderr,
+			_("Should never happen resize inode corrupt!\n"));
+		exit(1);
+	}
+
+	memset(block_buf, 0, fs->blocksize);
+
+	retval = io_channel_write_blk(fs->io, inode.i_block[EXT2_DIND_BLOCK],
+				      1, block_buf);
+	if (retval) goto errout;
+
+	retval = ext2fs_create_resize_inode(fs);
+	if (retval)
+		goto errout;
+
+errout:
+	if (block_buf)
+		ext2fs_free_mem(&block_buf);
 	return retval;
 }

From gmallika at corp.untd.com Mon Dec 27 06:34:51 2004
From: gmallika at corp.untd.com (Gogulamudi, Basa Mallika)
Date: Mon, 27 Dec 2004 12:04:51 +0530
Subject: Journaling.
Message-ID: <4D8B620F4FDA414982B23A3E7F6A8EF105056979@hydmail01.hyd.corp.int.untd.com>

Hi, I am using the ext3 file system. Can you tell me where the journal log is stored, and where I can find more details about journaling? A quick response is appreciated. Thanks in advance, Mallika.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tytso at mit.edu Tue Dec 28 19:44:57 2004
From: tytso at mit.edu (Theodore Ts'o)
Date: Tue, 28 Dec 2004 14:44:57 -0500
Subject: Journaling.
In-Reply-To: <4D8B620F4FDA414982B23A3E7F6A8EF105056979@hydmail01.hyd.corp.int.untd.com>
References: <4D8B620F4FDA414982B23A3E7F6A8EF105056979@hydmail01.hyd.corp.int.untd.com>
Message-ID: <20041228194457.GA5944@thunk.org>

On Mon, Dec 27, 2004 at 12:04:51PM +0530, Gogulamudi, Basa Mallika wrote:
> Hi,
>
> I am using ext3 file system. Can I know the place where
> journal log is stored and more details about journaling.

The journal is normally stored in a hidden inode, inode #8 (there are some exceptions, such as when a journal is created on a mounted filesystem via tune2fs). The way the journal works is that metadata blocks that need to be modified are first written to the journal, followed by a commit block, and only then are the modified metadata blocks written to their final location on disk. If the system crashes, the journal gets replayed either when the filesystem is mounted (in the case of the root filesystem) or when fsck is run on the filesystem. The reason we replay the journal as part of fsck is that it allows replay to run in parallel across multiple disk spindles, to speed up boot times.

If you want more details than this about how journalling works, I suggest you ask specific questions. "More details about journalling" doesn't make clear what you already know, and what you're hoping to learn....

- Ted

From brugolsky at telemetry-investments.com Tue Dec 28 23:34:49 2004
From: brugolsky at telemetry-investments.com (Bill Rugolsky Jr.)
Date: Tue, 28 Dec 2004 18:34:49 -0500
Subject: Journaling.
In-Reply-To: <20041228194457.GA5944@thunk.org>
References: <4D8B620F4FDA414982B23A3E7F6A8EF105056979@hydmail01.hyd.corp.int.untd.com> <20041228194457.GA5944@thunk.org>
Message-ID: <20041228233449.GA27122@ti64.telemetry-investments.com>

A good overview of the design of Ext3 and the JBD journaling layer, by the original author, Dr. Stephen C.
Tweedie, can be found here: http://www.kernel.org/pub/linux/kernel/people/sct/ext3/journal-design.ps.gz Some usage notes from Andrew Morton are here: http://www.zip.com.au/~akpm/linux/ext3/ext3-usage.html For more information, see Documentation/filesystems/ext3.txt Documentation/DocBook/journal-api.tmpl in the Linux kernel source tree. The ext3-user mail list is archived in several places, e.g., Downloadable: http://www.redhat.com/archives/ext3-users/ Searchable: http://marc.theaimsgroup.com/?l=ext3-users The ext2-devel list also hosts developer discussions regarding ext3: http://lists.sourceforge.net/lists/listinfo/ext2-devel http://marc.theaimsgroup.com/?l=ext2-devel Regards, Bill Rugolsky From jacques.duplessis at videotron.ca Wed Dec 29 00:26:23 2004 From: jacques.duplessis at videotron.ca (Jacques Duplessis) Date: Tue, 28 Dec 2004 19:26:23 -0500 Subject: Journaling. In-Reply-To: <4D8B620F4FDA414982B23A3E7F6A8EF105056979@hydmail01.hyd.corp.int.untd.com> Message-ID: <0I9G00H0RJW31T@VL-MO-MR010.ip.videotron.ca> The journal is within the filesystem _____ From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com] On Behalf Of Gogulamudi, Basa Mallika Sent: Monday, December 27, 2004 1:35 AM To: ext3-users at redhat.com Subject: Journaling. Hi, I am using ext3 file system. Can I know the place where journal log is stored and more details about journaling. Quick response is appreciated. Thanks in advance, Mallika. -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy at lug.org.uk Fri Dec 31 02:49:05 2004 From: andy at lug.org.uk (Andy Smith) Date: Fri, 31 Dec 2004 02:49:05 +0000 Subject: ext3 journal on software raid Message-ID: <20041231024905.GT99565@caffreys.strugglers.net> Hi, Are the comments made in this posting from the linux-raid list correct? http://marc.theaimsgroup.com/?l=linux-raid&m=110444288429682&w=2 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available URL: