From kapil.sampath at wipro.com Thu Jun 2 04:52:51 2005
From: kapil.sampath at wipro.com (kapil.sampath at wipro.com)
Date: Thu, 2 Jun 2005 10:22:51 +0530
Subject: passwd : Module is unknown (Redhat 9 Enterprise Edition)
Message-ID: <2FEE63312285CF428A8480B07AC1C359526C66@CHN-SNR-MBX01.wipro.com>

Hi All,

Can anyone help me resolve this problem?

I use Redhat 9 Enterprise edition. I have a session in which I am logged in
as root. When I issue the command "su" from any other user, it fails with
the error "su: incorrect password". If I try to change the password from
the root session, it fails with the error "passwd: Module is unknown".

[root at TESTING root]# su
su: incorrect password
[root at TESTING root]# passwd
Changing password for user root.
passwd: Module is unknown
[root at TESTING root]# which passwd
/usr/bin/passwd
[root at TESTING root]# ls /etc/passwd
/etc/passwd
[root at TESTING root]#
[root at TESTING root]# uname -a
Linux TESTING 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux
[root at TESTING root]#

As a normal user other than root:

[kapil at TESTING kapil]$ su
su: incorrect password
[kapil at TESTING kapil]$

Please help me resolve this problem.

---------------------------------------
Thanks & Regards
Kapil Sampath
Wipro Technologies,
475A, Old Mahabalipuram Road,
Shollinganallur, Chennai - 600 119
Dir: +91-44-30691701 Ext : 41701
Mobile : +91-9444101619
__________________________
"Men should be taught to be practical, physically strong. A dozen such
lions will conquer the world, Not millions of sheep." - Swami Vivekananda

From menscher at uiuc.edu Thu Jun 2 05:02:09 2005
From: menscher at uiuc.edu (Damian Menscher)
Date: Thu, 2 Jun 2005 00:02:09 -0500 (CDT)
Subject: passwd : Module is unknown (Redhat 9 Enterprise Edition)
In-Reply-To: <2FEE63312285CF428A8480B07AC1C359526C66@CHN-SNR-MBX01.wipro.com>
References: <2FEE63312285CF428A8480B07AC1C359526C66@CHN-SNR-MBX01.wipro.com>
Message-ID:

On Thu, 2 Jun 2005 kapil.sampath at wipro.com wrote:

> I use Redhat 9 Enterprise edition. I have a session in which I logged in
> as a root. When I issue the command "su" from any other user it is
> throwing error "su : Incorrect password", If I try to change the
> password from the root session, it is throwing error "passwd : module
> unknown".

First of all, there is no such thing as Redhat 9 Enterprise -- it looks
like you're running RHEL3, unpatched.

Secondly, your question has nothing to do with ext3, so you will likely
get a more helpful response elsewhere. But if I had to guess, I'd say
your /etc/pam.d/system-auth is corrupted. You might try reinstalling
your pam rpm.

Damian Menscher
--
-=#| Physics Grad Student & SysAdmin @ U Illinois Urbana-Champaign |#=-
-=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 Ofc:(217)333-0038 |#=-
-=#| 4602 Beckman, VMIL/MS, Imaging Technology Group:(217)244-3074 |#=-
-=#| www.uiuc.edu/~menscher/ Fax:(217)333-9819 |#=-
-=#| The above opinions are not necessarily those of my employers. |#=-

From brugolsky at telemetry-investments.com Tue Jun 7 16:04:34 2005
From: brugolsky at telemetry-investments.com (Bill Rugolsky Jr.)
Date: Tue, 7 Jun 2005 12:04:34 -0400
Subject: transaction->t_forget == NULL assertion failure with data=journal
Message-ID: <20050607160434.GC16192@ti64.telemetry-investments.com>

It appears that this bug in data=journal mode,

   https://listman.redhat.com/archives/ext3-users/2005-February/msg00045.html

isn't fixed in 2.6.11.11.

Andrew, I've CC'd you since you have previously looked at this specific
issue.
I'm seeing this problem on dual-Opteron x86-64 boxes serving NFS + Samba3
to a few dozen clients; it takes several hours at high load to reproduce.

We have not tested 2.6.12-rc6 yet, as I need to schedule time for the
clients on the cluster. I will try to do that ASAP. I see several
important fixes on the bk-commits-head list, but none of them jump out
at me as being obviously more relevant to data=journal than data=ordered.

Meanwhile, I'll endeavor to reproduce this locally. It would be really
useful to hunt this down and kill it, because NFS over Ext3 otherwise
performs very well in data=journal mode.

Suggestions welcome.

   -Bill

Assertion failure in __journal_drop_transaction() at fs/jbd/checkpoint.c:625: "transaction->t_forget == NULL"
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at checkpoint:625
invalid operand: 0000 [1] SMP
CPU 1
Modules linked in: e1000 qla2300 qla2xxx netconsole thermal processor fan button battery ac eeprom adm1026 i2c_sensor i2c_amd756 i2c_core
Pid: 17828, comm: kjournald Not tainted 2.6.11.11
RIP: 0010:[<ffffffff801f347f>] <ffffffff801f347f>{__journal_drop_transaction+319}
RSP: 0018:ffff810028841b58  EFLAGS: 00010296
RAX: 0000000000000071 RBX: ffff8100f840fe00 RCX: ffffffff80612d88
RDX: ffffffff80612d88 RSI: 0000000000000292 RDI: ffffffff80612d80
RBP: ffff8100faf21800 R08: ffff8100f7b45b40 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8100ccba49c0
R13: ffff8100faf21800 R14: 0000000000000000 R15: ffff8100faf2195c
FS:  00002aaaaade8b00(0000) GS:ffffffff80847e00(0000) knlGS:00000000557b26c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaac2000 CR3: 000000007f6fb000 CR4: 00000000000006e0
Process kjournald (pid: 17828, threadinfo ffff810028840000, task ffff8100fb3e0230)
Stack: ffff81007ee618c8 ffff8100faf21800 ffff81003adc3ce8 ffffffff801f36c3
       ffff81002fc7d1e8 ffffffff801f275e 0000000100000000 ffff8100faf21824
       00000e8c00000000 ffff8100b32d0174
Call Trace:
 <ffffffff801f36c3>{__journal_remove_checkpoint+99}
 <ffffffff801f275e>{journal_commit_transaction+3534}
 <ffffffff8014aec0>{autoremove_wake_function+0}
 <ffffffff8014aec0>{autoremove_wake_function+0}
 <ffffffff8012f047>{recalc_task_prio+327}
 <ffffffff801f6e2c>{kjournald+268}
 <ffffffff8014aec0>{autoremove_wake_function+0}
 <ffffffff80175b4e>{filp_close+126}
 <ffffffff8014aec0>{autoremove_wake_function+0}
 <ffffffff801f6fd0>{commit_timeout+0}
 <ffffffff8010f0f7>{child_rip+8}
 <ffffffff801f6d20>{kjournald+0}
 <ffffffff8010f0ef>{child_rip+0}

Code: 0f 0b b9 12 59 80 ff ff ff ff 71 02 66 66 90 66 90 48 83 7b
RIP <ffffffff801f347f>{__journal_drop_transaction+319} RSP <ffff810028841b58>

From nebid2005 at yahoo.com.au Wed Jun 8 05:16:59 2005
From: nebid2005 at yahoo.com.au (Matt Smith)
Date: Wed, 8 Jun 2005 15:16:59 +1000 (EST)
Subject: clone RHEL 4 ext3 partition
Message-ID: <20050608051700.66634.qmail@web33614.mail.mud.yahoo.com>

Hi,

I'm about to roll out a whole bunch of Redhat Enterprise 4 workstations
and have run into problems cloning from the original.

Normally I would use ghost (v7.5) because it does a nice job when cloning
to a different sized disk. Unfortunately it comes up with read error 29004.
Looking around it seems that Symantec don't support Fedora Core 3 (with
Ghost v.8 - don't know if v.9 works ???).

Next option was to use dd.
This worked fine but when I went to resize the partition I noticed that
Redhat have removed resize2fs from e2fsprogs.
After installing the Redhat e2fsprogs source rpm (and the newest from
sourceforge - I tried both) and after compiling the resize2fs binary I got
an error - "Filesystem has unsupported feature(s)"

All other techniques that I know, e.g. dump and cpio, need to resize the
partition after imaging.

THE QUESTION: How do I clone RHEL 4 ext3 partitions to a different sized
disk?
Thanks

Matt

From adilger at clusterfs.com Wed Jun 8 05:55:31 2005
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 7 Jun 2005 23:55:31 -0600
Subject: clone RHEL 4 ext3 partition
In-Reply-To: <20050608051700.66634.qmail@web33614.mail.mud.yahoo.com>
References: <20050608051700.66634.qmail@web33614.mail.mud.yahoo.com>
Message-ID: <20050608055531.GZ14004@schnapps.adilger.int>

On Jun 08, 2005 15:16 +1000, Matt Smith wrote:
> I'm about to roll out a whole bunch of Redhat
> Enterprise 4 workstations and have run into problems
> cloning from the original.
>
> Normally I would use ghost (v7.5) because it does a
> nice job when cloning to a different sized
> disk.Unfortunately it comes up with read error 29004.
> Looking around it seems that Symantec don't support
> Fedora Core 3 (with Ghost v.8 - don't know if v.9
> works ???).
>
> Next option was to use dd.
> This worked fine but when I went to resize the
> partition I noticed that Redhat have removed resize2fs
> from e2fsprogs.
> After installing the Redhat e2fsprogs
> source rpm (and the newest from sourceforge - i tried
> both) and after compiling the resize2fs binary i got
> an error - "Filesystem has unsupported feature(s)"

Both problems are likely from the same root cause - namely
some strange feature enabled in your filesystem.

What does "dumpe2fs -h {dev} | grep feature" say about
your filesystem? AFAIK the latest e2fsprogs should
support all the ext3 features in a released kernel.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From maneesh at in.ibm.com Wed Jun 8 07:36:58 2005
From: maneesh at in.ibm.com (Maneesh Soni)
Date: Wed, 8 Jun 2005 13:06:58 +0530
Subject: [BUGME 4683] oops at log_do_checkpoint+0xa1/0x230
Message-ID: <20050608073658.GD5900@in.ibm.com>

Hello,

I was wondering if we can get some help in resolving this bug.
It was reported earlier and logged in bugme.osdl.org

http://bugme.osdl.org/show_bug.cgi?id=4683

The problem is kernel oops at log_do_checkpoint() due to NULL buffer_head.
This could be because of some race in journalling code for which I don't
have much clue. There is kdump available for analysis as mentioned in the
bugme.

Thanks
Maneesh

--
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: maneesh at in.ibm.com
Phone: 91-80-25044990

From cchan at outblaze.com Thu Jun 9 07:46:30 2005
From: cchan at outblaze.com (Christopher Chan)
Date: Thu, 09 Jun 2005 15:46:30 +0800
Subject: kjournald pegging cpu
Message-ID: <42A7F3D6.4030907@outblaze.com>

kernel version 2.6.10-1.771_FC2smp

We have had quite a few instances of kjournald pegging cpu and thereby
effectively knocking out the system's i/o.

What can we do to provide more information so that the cause can be
identified and fixed?

Thanks
Christopher

From adilger at clusterfs.com Thu Jun 9 07:59:25 2005
From: adilger at clusterfs.com (Andreas Dilger)
Date: Thu, 9 Jun 2005 01:59:25 -0600
Subject: kjournald pegging cpu
In-Reply-To: <42A7F3D6.4030907@outblaze.com>
References: <42A7F3D6.4030907@outblaze.com>
Message-ID: <20050609075925.GJ14004@schnapps.adilger.int>

On Jun 09, 2005 15:46 +0800, Christopher Chan wrote:
> kernel version 2.6.10-1.771_FC2smp
>
> We have had quite a few instances of kjournald pegging cpu and thereby
> effectively knocking out the system's i/o.
>
> What can we do to provide more information so that the cause can be
> identified and fixed?

When the problem hits use sysrq-t and/or sysrq-p on the console to dump
the stack of kjournald, as a starting point to see what it is doing.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From sct at redhat.com Thu Jun 9 13:23:26 2005
From: sct at redhat.com (Stephen C. Tweedie)
Date: Thu, 09 Jun 2005 14:23:26 +0100
Subject: kjournald pegging cpu
In-Reply-To: <20050609075925.GJ14004@schnapps.adilger.int>
References: <42A7F3D6.4030907@outblaze.com> <20050609075925.GJ14004@schnapps.adilger.int>
Message-ID: <1118323405.4851.92.camel@sisko.sctweedie.blueyonder.co.uk>

Hi,

On Thu, 2005-06-09 at 08:59, Andreas Dilger wrote:

> When the problem hits use sysrq-t and/or sysrq-p on the console to dump
> the stack of kjournald, as a starting point to see what it is doing.

A kernel "readprofile" can also be very useful for this.

--Stephen

From nebid2005 at yahoo.com.au Fri Jun 10 06:19:22 2005
From: nebid2005 at yahoo.com.au (Nebid)
Date: Fri, 10 Jun 2005 16:19:22 +1000 (EST)
Subject: clone RHEL 4 ext3 partition
In-Reply-To: <20050608055531.GZ14004@schnapps.adilger.int>
Message-ID: <20050610061922.99105.qmail@web33604.mail.mud.yahoo.com>

Hi Andreas,

Thanks for replying.

dumpe2fs -h /dev/hda3 | grep feature
yields...
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file

Regards,
Matt

--- Andreas Dilger wrote:

> On Jun 08, 2005 15:16 +1000, Matt Smith wrote:
> > I'm about to roll out a whole bunch of Redhat
> > Enterprise 4 workstations and have run into problems
> > cloning from the original.
> >
> > Normally I would use ghost (v7.5) because it does a
> > nice job when cloning to a different sized
> > disk.Unfortunately it comes up with read error 29004.
> > Looking around it seems that Symantec don't support
> > Fedora Core 3 (with Ghost v.8 - don't know if v.9
> > works ???).
> >
> > Next option was to use dd.
> > This worked fine but when I went to resize the
> > partition I noticed that Redhat have removed resize2fs
> > from e2fsprogs.
> > After installing the Redhat e2fsprogs
> > source rpm (and the newest from sourceforge - i tried
> > both) and after compiling the resize2fs binary i got
> > an error - "Filesystem has unsupported feature(s)"
>
> Both problems are likely from the same root cause - namely
> some strange feature enabled in your filesystem.
>
> What does "dumpe2fs -h {dev} | grep feature" say about
> your filesystem? AFAIK the latest e2fsprogs should
> support all the ext3 features in a released kernel.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.

From adilger at clusterfs.com Fri Jun 10 06:34:22 2005
From: adilger at clusterfs.com (Andreas Dilger)
Date: Fri, 10 Jun 2005 00:34:22 -0600
Subject: clone RHEL 4 ext3 partition
In-Reply-To: <20050610061922.99105.qmail@web33604.mail.mud.yahoo.com>
References: <20050608055531.GZ14004@schnapps.adilger.int> <20050610061922.99105.qmail@web33604.mail.mud.yahoo.com>
Message-ID: <20050610063422.GR14004@schnapps.adilger.int>

On Jun 10, 2005 16:19 +1000, Nebid wrote:
> Thanks for replying.
>
> dumpe2fs -h /dev/hda3 | grep feature
> yields...
> Filesystem features: has_journal ext_attr resize_inode
> dir_index filetype needs_recovery sparse_super
> large_file

Well, if your filesystem has "needs_recovery" set that means it is either
mounted (which is OK as long as you do a clean unmount before trying the
clone/resize), or it needs an e2fsck run on it to clean up the journal
before doing the resize ("e2fsck /dev/hda3" will just replay the journal,
as will mounting the filesystem).

I believe the 1.37 e2fsprogs should handle all of the above features
without trouble. It may be that Ghost doesn't understand the
"resize_inode" feature...

Having said that, if you are running the FC3 kernel on these nodes you
can use the online resize feature to do the resizing.
Mount the filesystem, then run "ext2resize -v /dev/hda3" and it will
resize to fill the partition.

> --- Andreas Dilger wrote:
> > On Jun 08, 2005 15:16 +1000, Matt Smith wrote:
> > > I'm about to roll out a whole bunch of Redhat
> > > Enterprise 4 workstations and have run into
> > > problems cloning from the original.
> > >
> > > Normally I would use ghost (v7.5) because it does a
> > > nice job when cloning to a different sized
> > > disk.Unfortunately it comes up with read error 29004.
> > > Looking around it seems that Symantec don't support
> > > Fedora Core 3 (with Ghost v.8 - don't know if v.9
> > > works ???).
> > >
> > > Next option was to use dd.
> > > This worked fine but when I went to resize the
> > > partition I noticed that Redhat have removed
> > > resize2fs from e2fsprogs.
> > > After installing the Redhat e2fsprogs
> > > source rpm (and the newest from sourceforge - I tried
> > > both) and after compiling the resize2fs binary I got
> > > an error - "Filesystem has unsupported feature(s)"
> >
> > Both problems are likely from the same root cause -
> > namely some strange feature enabled in your filesystem.
> >
> > What does "dumpe2fs -h {dev} | grep feature" say
> > about your filesystem? AFAIK the latest e2fsprogs should
> > support all the ext3 features in a released kernel.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From jli at click3x.com Tue Jun 14 15:17:04 2005
From: jli at click3x.com (Jonathan)
Date: Tue, 14 Jun 2005 11:17:04 -0400
Subject: 2.4.20 kernel patch for ext3
Message-ID: <000201c570f4$222fb450$d6895fc7@astroboy>

Hi everyone,

I need a 2.4.20 kernel patch for ext3 file system. It's corrupting our
data. I did some google, and all pages led to this page

http://www.zip.com.au/~akpm/linux/ext3/

Where all the link to the patches I need are broken. Anyone can help me
get these patches?

OR

I'm running 2.4.20 kernel right now, if I just do the kernel upgrade,
do I still need to patch for ext3?
Which is the best kernel to upgrade to? 2.4.30 or 2.4.31?

Thanks everyone!

Jonathan

From evilninja at gmx.net Tue Jun 14 18:43:48 2005
From: evilninja at gmx.net (evilninja)
Date: Tue, 14 Jun 2005 20:43:48 +0200
Subject: 2.4.20 kernel patch for ext3
In-Reply-To: <000201c570f4$222fb450$d6895fc7@astroboy>
References: <000201c570f4$222fb450$d6895fc7@astroboy>
Message-ID: <42AF2564.6090709@gmx.net>

Jonathan wrote:
> Hi everyone, I need a 2.4.20 kernel patch for ext3 file system. It's
> corrupting our data.

um, please be a *bit* more specific (how your data gets corrupted).

> I'm running 2.4.20 kernel right now, if I just do the kernel upgrade,
> do I still need to patch for ext3?

no, when you upgrade to a current 2.4 kernel (2.4.30) no additional
patches are required to use ext3 and upgrading to a more current kernel
really *is* a good idea.

--
BOFH excuse #125:

we just switched to Sprint.

From dshaw at jabberwocky.com Tue Jun 14 21:14:05 2005
From: dshaw at jabberwocky.com (David Shaw)
Date: Tue, 14 Jun 2005 17:14:05 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
Message-ID: <20050614211405.GA26456@jabberwocky.com>

I have seen this happen a number of times:

Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
Jun 13 13:58:16 n202 kernel: Aborting journal on device sda5.
Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
Jun 13 13:58:16 n202 last message repeated 6 times
Jun 13 13:58:18 n202 kernel: ext3_abort called.
Jun 13 13:58:18 n202 kernel: EXT3-fs error (device sda5): ext3_journal_start_sb: Detected aborted journal
Jun 13 13:58:18 n202 kernel: Remounting filesystem read-only

Once this happens, things break quickly (/tmp being readonly, as a
start). Upon reboot, a manual fsck is required, after which the
machine is operational again.

This particular example is a SATA disk, but it has happened to a
regular old IDE disk as well. It is always the root partition. The
bad inode number varies (but is always either 3 or 9). There are no
other errors about the disk in the log.

Kernel: 2.6.11.7
e2fstools: 1.35 (28-Feb-2004)

Any thoughts on how to proceed here? Unfortunately, I'm not able to
duplicate this at will.

David

From bunk at stusta.de Tue Jun 14 21:34:19 2005
From: bunk at stusta.de (Adrian Bunk)
Date: Tue, 14 Jun 2005 23:34:19 +0200
Subject: [2.6 patch] fs/jbd/: possible cleanups
Message-ID: <20050614213419.GK21393@stusta.de>

This patch contains the following possible cleanups:
- make needlessly global functions static
- journal.c: remove the unused global function __journal_internal_check
  and move the check to journal_init
- remove the following write-only global variable:
  - journal.c: current_journal
- remove the following unneeded EXPORT_SYMBOL's:
  - journal.c: journal_check_used_features
  - journal.c: journal_recover

Please check which of these changes do make sense.

Signed-off-by: Adrian Bunk

---

 fs/jbd/journal.c    | 41 ++++++++++++++++++-----------------------
 fs/jbd/revoke.c     | 3 ++-
 include/linux/jbd.h | 3 ---
 3 files changed, 20 insertions(+), 27 deletions(-)

--- linux-2.6.12-rc6-mm1-full/include/linux/jbd.h.old 2005-06-14 03:58:20.000000000 +0200
+++ linux-2.6.12-rc6-mm1-full/include/linux/jbd.h 2005-06-14 04:00:56.000000000 +0200
@@ -900,8 +900,6 @@
 				int start, int len, int bsize);
 extern journal_t * journal_init_inode (struct inode *);
 extern int	   journal_update_format (journal_t *);
-extern int	   journal_check_used_features
-		   (journal_t *, unsigned long, unsigned long, unsigned long);
 extern int	   journal_check_available_features
 		   (journal_t *, unsigned long, unsigned long, unsigned long);
 extern int	   journal_set_features
@@ -914,7 +912,6 @@
 extern int	   journal_skip_recovery (journal_t *);
 extern void	   journal_update_superblock (journal_t *, int);
 extern void	   __journal_abort_hard (journal_t *);
-extern void	   __journal_abort_soft (journal_t *, int);
 extern void	   journal_abort (journal_t *, int);
 extern int	   journal_errno (journal_t *);
 extern void	   journal_ack_err (journal_t *);
--- linux-2.6.12-rc6-mm1-full/fs/jbd/journal.c.old 2005-06-14 03:57:39.000000000 +0200
+++ linux-2.6.12-rc6-mm1-full/fs/jbd/journal.c 2005-06-14 04:08:24.000000000 +0200
@@ -59,13 +59,11 @@
 EXPORT_SYMBOL(journal_init_dev);
 EXPORT_SYMBOL(journal_init_inode);
 EXPORT_SYMBOL(journal_update_format);
-EXPORT_SYMBOL(journal_check_used_features);
 EXPORT_SYMBOL(journal_check_available_features);
 EXPORT_SYMBOL(journal_set_features);
 EXPORT_SYMBOL(journal_create);
 EXPORT_SYMBOL(journal_load);
 EXPORT_SYMBOL(journal_destroy);
-EXPORT_SYMBOL(journal_recover);
 EXPORT_SYMBOL(journal_update_superblock);
 EXPORT_SYMBOL(journal_abort);
 EXPORT_SYMBOL(journal_errno);
@@ -81,6 +79,7 @@
 EXPORT_SYMBOL(journal_force_commit);
 
 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);
+static void __journal_abort_soft (journal_t *journal, int errno);
 
 /*
  * Helper function used to manage commit timeouts
@@ -93,16 +92,6 @@
 	wake_up_process(p);
 }
 
-/* Static check for data structure consistency. There's no code
- * invoked --- we'll just get a linker failure if things aren't right.
- */
-void __journal_internal_check(void)
-{
-	extern void journal_bad_superblock_size(void);
-	if (sizeof(struct journal_superblock_s) != 1024)
-		journal_bad_superblock_size();
-}
-
 /*
  * kjournald: The main thread function used to manage a logging device
  * journal.
@@ -119,16 +108,12 @@
  * known as checkpointing, and this thread is responsible for that job.
  */
 
-journal_t *current_journal;		// AKPM: debug
-
-int kjournald(void *arg)
+static int kjournald(void *arg)
 {
 	journal_t *journal = (journal_t *) arg;
 	transaction_t *transaction;
 	struct timer_list timer;
 
-	current_journal = journal;
-
 	daemonize("kjournald");
 
 	/* Set up an interval timer which can be used to trigger a
@@ -1181,8 +1166,10 @@
  * features. Return true (non-zero) if it does.
  **/
 
-int journal_check_used_features (journal_t *journal, unsigned long compat,
-				 unsigned long ro, unsigned long incompat)
+static int journal_check_used_features (journal_t *journal,
+					unsigned long compat,
+					unsigned long ro,
+					unsigned long incompat)
 {
 	journal_superblock_t *sb;
 
@@ -1439,7 +1426,7 @@
  * device this journal is present.
  */
 
-const char *journal_dev_name(journal_t *journal, char *buffer)
+static const char *journal_dev_name(journal_t *journal, char *buffer)
 {
 	struct block_device *bdev;
 
@@ -1485,7 +1472,7 @@
 /* Soft abort: record the abort error status in the journal superblock,
  * but don't do any other IO.
  */
-void __journal_abort_soft (journal_t *journal, int errno)
+static void __journal_abort_soft (journal_t *journal, int errno)
 {
 	if (journal->j_flags & JFS_ABORT)
 		return;
@@ -1888,7 +1875,7 @@
 
 static struct proc_dir_entry *proc_jbd_debug;
 
-int read_jbd_debug(char *page, char **start, off_t off,
+static int read_jbd_debug(char *page, char **start, off_t off,
 			  int count, int *eof, void *data)
 {
 	int ret;
@@ -1898,7 +1885,7 @@
 	return ret;
 }
 
-int write_jbd_debug(struct file *file, const char __user *buffer,
+static int write_jbd_debug(struct file *file, const char __user *buffer,
 			   unsigned long count, void *data)
 {
 	char buf[32];
@@ -1987,6 +1974,14 @@
 {
 	int ret;
 
+/* Static check for data structure consistency. There's no code
+ * invoked --- we'll just get a linker failure if things aren't right.
+ */
+	extern void journal_bad_superblock_size(void);
+	if (sizeof(struct journal_superblock_s) != 1024)
+		journal_bad_superblock_size();
+
+
 	ret = journal_init_caches();
 	if (ret != 0)
 		journal_destroy_caches();
--- linux-2.6.12-rc6-mm1-full/fs/jbd/revoke.c.old 2005-06-14 03:58:36.000000000 +0200
+++ linux-2.6.12-rc6-mm1-full/fs/jbd/revoke.c 2005-06-14 03:58:41.000000000 +0200
@@ -116,7 +116,8 @@
 		(block << (hash_shift - 12))) & (table->hash_size - 1);
 }
 
-int insert_revoke_hash(journal_t *journal, unsigned long blocknr, tid_t seq)
+static int insert_revoke_hash(journal_t *journal, unsigned long blocknr,
+			      tid_t seq)
 {
 	struct list_head *hash_list;
 	struct jbd_revoke_record_s *record;

From adilger at clusterfs.com Tue Jun 14 23:19:23 2005
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 14 Jun 2005 19:19:23 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050614211405.GA26456@jabberwocky.com>
References: <20050614211405.GA26456@jabberwocky.com>
Message-ID: <20050614231923.GA12320@moraine.clusterfs.com>

On Jun 14, 2005 17:14 -0400, David Shaw wrote:
> Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
>
> This particular example is a SATA disk, but it has happened to a
> regular old IDE disk as well. It is always the root partition. The
> bad inode number varies (but is always either 3 or 9). There are no
> other errors about the disk in the log.

The "bad inode number" check is only for inodes inside the "reserved inode"
area, namely inum < 12. The only commonly used (=valid) inode numbers in
this range are the root inode (=2) and the journal inode (=8), so I suspect
you are getting single-bit memory errors in bit 1, or if the controller
is the same that would also be viewed with suspicion. It is very likely
that you are getting other single-bit errors elsewhere but they are harder
to notice.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From dshaw at jabberwocky.com Wed Jun 15 02:26:52 2005
From: dshaw at jabberwocky.com (David Shaw)
Date: Tue, 14 Jun 2005 22:26:52 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050614231923.GA12320@moraine.clusterfs.com>
References: <20050614211405.GA26456@jabberwocky.com> <20050614231923.GA12320@moraine.clusterfs.com>
Message-ID: <20050615022652.GA27181@jabberwocky.com>

On Tue, Jun 14, 2005 at 07:19:23PM -0400, Andreas Dilger wrote:
> On Jun 14, 2005 17:14 -0400, David Shaw wrote:
> > Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
> >
> > This particular example is a SATA disk, but it has happened to a
> > regular old IDE disk as well. It is always the root partition. The
> > bad inode number varies (but is always either 3 or 9). There are no
> > other errors about the disk in the log.
>
> The "bad inode number" check is only for inodes inside the "reserved inode"
> area, namely inum < 12.
> The only commonly used (=valid) inode numbers in
> this range are the root inode (=2) and the journal inode (=8), so I suspect
> you are getting single-bit memory errors in bit 1, or if the controller
> is the same that would also be viewed with suspicion. It is very likely
> that you are getting other single-bit errors elsewhere but they are harder
> to notice.

This is an interesting idea. Is there any simple way this sort of bit
flip problem could happen outside of the hardware? I've had this
happen on 4 different machines from 3 different vendors, 3 SATA, and 1
IDE. It seems almost impossible that it's a memory or controller
error.

David

From tytso at mit.edu Wed Jun 15 13:19:43 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Wed, 15 Jun 2005 09:19:43 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050615022652.GA27181@jabberwocky.com>
References: <20050614211405.GA26456@jabberwocky.com> <20050614231923.GA12320@moraine.clusterfs.com> <20050615022652.GA27181@jabberwocky.com>
Message-ID: <20050615131943.GC4228@thunk.org>

On Tue, Jun 14, 2005 at 10:26:52PM -0400, David Shaw wrote:
> On Tue, Jun 14, 2005 at 07:19:23PM -0400, Andreas Dilger wrote:
> > On Jun 14, 2005 17:14 -0400, David Shaw wrote:
> > > Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
> > >
> > > This particular example is a SATA disk, but it has happened to a
> > > regular old IDE disk as well. It is always the root partition. The
> > > bad inode number varies (but is always either 3 or 9). There are no
> > > other errors about the disk in the log.
> >
> > The "bad inode number" check is only for inodes inside the "reserved inode"
> > area, namely inum < 12.
> > The only commonly used (=valid) inode numbers in
> > this range are the root inode (=2) and the journal inode (=8), so I suspect
> > you are getting single-bit memory errors in bit 1, or if the controller
> > is the same that would also be viewed with suspicion. It is very likely
> > that you are getting other single-bit errors elsewhere but they are harder
> > to notice.
>
> This is an interesting idea. Is there any simple way this sort of bit
> flip problem could happen outside of the hardware? I've had this
> happen on 4 different machines from 3 different vendors, 3 SATA, and 1
> IDE. It seems almost impossible that it's a memory or controller
> error.

I have to agree with Andreas' analysis. If you could, please send
some compressed raw e2image dump files (see the man page for e2image,
but basically what we need is: "e2image -r /dev/sda5 - | bzip2 >
sda5.e2i.bz2"), taken after the disk is remounted read-only. Then
take another e2image dump after the system has rebooted in single user
mode, but *before* running e2fsck on the filesystem. (That way we can
check to see if the filesystem has changed between reboots --- that
could indicate hardware problems, or in-memory corruption of the
buffer cache due to some kernel bug.) The e2fsck transcript would
also be useful, of course.

The only other possible explanation I can imagine, beyond a hardware
problem, or some strange kernel bug that no one else is seeing, is a
bug in some program that was directly accessing the disk drive; for
example, if the bootloader attempted to update some state and wrote
that state to the wrong place on disk, or some other program that was
doing direct disk accesses, and it was always corrupting the same
block(s) in the same way.

Good luck,

						- Ted

From theman at josephdwagner.info Wed Jun 15 17:13:34 2005
From: theman at josephdwagner.info (Joseph D. Wagner)
Date: Wed, 15 Jun 2005 12:13:34 -0500
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050615131943.GC4228@thunk.org>
Message-ID: <43vtj5$tska2f@mxip09a.cluster1.charter.net>

> The only other possible explanation I can imagine, beyond a
> hardware problem

Just off the top of my head, could the number be reaching its limit
and rolling over somehow?

Joseph D. Wagner

From adilger at clusterfs.com Wed Jun 15 17:22:05 2005
From: adilger at clusterfs.com ('Andreas Dilger')
Date: Wed, 15 Jun 2005 13:22:05 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <43vtj5$tska2f@mxip09a.cluster1.charter.net>
References: <20050615131943.GC4228@thunk.org> <43vtj5$tska2f@mxip09a.cluster1.charter.net>
Message-ID: <20050615172205.GD12320@moraine.clusterfs.com>

On Jun 15, 2005 12:13 -0500, Joseph D. Wagner wrote:
> > The only other possible explanation I can imagine, beyond a
> > hardware problem
>
> Just off the top of my head, could the number be reaching its limit
> and rolling over somehow?

Seems unlikely, as it isn't very easy to create a filesystem that uses
4B inodes. Would have to change a lot of config options to do that and
have a very large device.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From tytso at mit.edu Wed Jun 15 19:00:55 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Wed, 15 Jun 2005 15:00:55 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <43vtj5$tska2f@mxip09a.cluster1.charter.net>
References: <20050615131943.GC4228@thunk.org> <43vtj5$tska2f@mxip09a.cluster1.charter.net>
Message-ID: <20050615190055.GA7722@thunk.org>

On Wed, Jun 15, 2005 at 12:13:34PM -0500, Joseph D. Wagner wrote:
> > The only other possible explanation I can imagine, beyond a
> > hardware problem
>
> Just off the top of my head, could the number be reaching its limit
> and rolling over somehow?

The inode number? No.
- Ted From tytso at mit.edu Wed Jun 15 19:05:47 2005 From: tytso at mit.edu (Theodore Ts'o) Date: Wed, 15 Jun 2005 15:05:47 -0400 Subject: bad inode number followed by ext3_abort and remount readonly In-Reply-To: <20050615172205.GD12320@moraine.clusterfs.com> References: <20050615131943.GC4228@thunk.org> <43vtj5$tska2f@mxip09a.cluster1.charter.net> <20050615172205.GD12320@moraine.clusterfs.com> Message-ID: <20050615190547.GB7722@thunk.org> On Wed, Jun 15, 2005 at 01:22:05PM -0400, 'Andreas Dilger' wrote: > On Jun 15, 2005 12:13 -0500, Joseph D. Wagner wrote: > > > The only other possible explanation I can imagine, beyond a > > > hardware problem > > > > Just off the top of my head, could the number be reaching its limit > > and rolling over somehow? > > Seems unlikely, as it isn't very easy to create a filesystem that uses > 4B inodes. Would have to change a lot of config options to do that and > have a very large device. Not to mention that we don't do any arithmetic operations on inode numbers, so you're not going to see an overflow. - Ted From kapil.sampath at wipro.com Thu Jun 16 14:42:27 2005 From: kapil.sampath at wipro.com (kapil.sampath at wipro.com) Date: Thu, 16 Jun 2005 20:12:27 +0530 Subject: User directories in /home are missing Message-ID: <2FEE63312285CF428A8480B07AC1C35969DCDA@CHN-SNR-MBX01.wipro.com> Hi, I use Redhat Enterprise Edition 2.4.21 kernel version. In my system all the user directories in /home disappeared. No one deleted it. But I don't know how it is missing. I had one or two open sessions where I already logged in as root and cd'ed to one particular home directory. >From there I am able to access files in that home directory. But not from a new session. In new session it always says no such file or directory. All my user accounts still exists. I have totally around 10 user accounts in that machine. I haven't rebooted that machine thinking that I can do something without reboot. 
Do anyone know the reason for this and if so how to recover it. Regards Kapil Sampath "The greatest sin is to think yourself weak" - Swami Vivekananda From manuel at todo-linux.com Thu Jun 16 14:41:59 2005 From: manuel at todo-linux.com (Manuel Arostegui Ramirez) Date: Thu, 16 Jun 2005 16:41:59 +0200 Subject: User directories in /home are missing In-Reply-To: <2FEE63312285CF428A8480B07AC1C35969DCDA@CHN-SNR-MBX01.wipro.com> References: <2FEE63312285CF428A8480B07AC1C35969DCDA@CHN-SNR-MBX01.wipro.com> Message-ID: <200506161641.59909.manuel@todo-linux.com> On Thursday 16 June 2005 16:42, kapil.sampath at wipro.com wrote: > Hi, > > I use Redhat Enterprise Edition 2.4.21 kernel version. In my system all > the user directories in /home disappeared. No one deleted it. But I > don't know how it is missing. I had one or two open sessions where I > already logged in as root and cd'ed to one particular home directory. > From there I am able to access files in that home directory. But not > from a new session. In new session it always says no such file or > directory. > > > > All my user accounts still exists. I have totally around 10 user > accounts in that machine. I haven't rebooted that machine thinking that > I can do something without reboot. > > > > Do anyone know the reason for this and if so how to recover it. > > > > Regards > > Kapil Sampath Have you looked in lost+found?
-- Manuel Arostegui Ramirez #Linux Registered User 295750 Socio de Hispalinux 1813 Red Hat Linux 9, Kernel 2.6.2 ReiserFS Firma cifrada -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE+3O1MqfmPcHTj+twRAm yDAJ9P6ezepIMg06vOet/YPKxVoB+Z/ACfWVhh ---END PGP SIGNATURE----- From michoel_abrams at ml.com Thu Jun 16 15:09:46 2005 From: michoel_abrams at ml.com (Abrams, Michoel (IDS DCS PE)) Date: Thu, 16 Jun 2005 11:09:46 -0400 Subject: User directories in /home are missing Message-ID: <666CE4AC3B5F8E46A67D8F6FE504EBC00E3617F5@mlnya204mb.amrs.win.ml.com> by chance, did you recently enable autofs? a typical mount point for autofs is /home, & if autofs is started w/ that mount point specified, all your local files will not appear until you stop the autofs svc... HTH Mike. -----Original Message----- From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com] On Behalf Of kapil.sampath at wipro.com Sent: Thursday, June 16, 2005 10:42 AM To: ext3-users at redhat.com; linux_lovers at yahoogroups.com Subject: User directories in /home are missing Hi, I use Redhat Enterprise Edition 2.4.21 kernel version. In my system all the user directories in /home disappeared. No one deleted it. But I don't know how it is missing. I had one or two open sessions where I already logged in as root and cd'ed to one particular home directory. >From there I am able to access files in that home directory. But not from a new session. In new session it always says no such file or directory. All my user accounts still exists. I have totally around 10 user accounts in that machine. I haven't rebooted that machine thinking that I can do something without reboot. Do anyone know the reason for this and if so how to recover it. 
Regards Kapil Sampath "The greatest sin is to think yourself weak" - Swami Vivekananda -------------------------------------------------------- If you are not an intended recipient of this e-mail, please notify the sender, delete it and do not read, act upon, print, disclose, copy, retain or redistribute it. Click here for important additional terms relating to this e-mail. http://www.ml.com/email_terms/ -------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvolaski at aecom.yu.edu Fri Jun 17 22:28:03 2005 From: mvolaski at aecom.yu.edu (Maurice Volaski) Date: Fri, 17 Jun 2005 18:28:03 -0400 Subject: [Q] Is this true and does it mean there is dynamic defragmentation in ext2/3? Message-ID: Someone recently posted the following statement midway down the page at http://forums.gentoo.org/viewtopic-t-305871-postdays-0-postorder-asc-highlight-ext3+ordered+data-start-25.html >You don't need to defragment ext2/ext3 because as you use the >filesystem file blocks and inodes are moved around and reallocated >to keep the data nearly contiguous. It's not perfect, but it works >fairly well and you should almost never see a performance >degradation caused by the filesystem's fragmentation. Is this statement accurate and does it mean ext2/3 is performing a sort of dynamic defragmentation? -- Maurice Volaski, mvolaski at aecom.yu.edu Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University From tytso at mit.edu Sat Jun 18 19:14:51 2005 From: tytso at mit.edu (Theodore Ts'o) Date: Sat, 18 Jun 2005 15:14:51 -0400 Subject: [Q] Is this true and does it mean there is dynamic defragmentation in ext2/3? 
In-Reply-To: References: Message-ID: <20050618191451.GC16314@thunk.org> On Fri, Jun 17, 2005 at 06:28:03PM -0400, Maurice Volaski wrote: > >You don't need to defragment ext2/ext3 because as you use the > >filesystem file blocks and inodes are moved around and reallocated > >to keep the data nearly contiguous. It's not perfect, but it works > >fairly well and you should almost never see a performance > >degradation caused by the filesystem's fragmentation. > > Is this statement accurate and does it mean ext2/3 is performing a > sort of dynamic defragmentation? No, not true. (At least not today) Ext2/3 has advanced algorithms to make sure that the blocks that are allocated avoid fragmentation, but it is not doing any kind of dynamic moving of blocks/inodes. (At least, not yet; there has been some talk about creating enough kernel hooks so that a user-space program could do dynamic defragmentation of the filesystem, but none of this exists at the moment.) - Ted From menscher at uiuc.edu Sat Jun 18 19:36:59 2005 From: menscher at uiuc.edu (Damian Menscher) Date: Sat, 18 Jun 2005 14:36:59 -0500 (CDT) Subject: [Q] Is this true and does it mean there is dynamicmentation in ext2/3? In-Reply-To: <20050618191451.GC16314@thunk.org> References: <20050618191451.GC16314@thunk.org> Message-ID: On Sat, 18 Jun 2005, Theodore Ts'o wrote: > On Fri, Jun 17, 2005 at 06:28:03PM -0400, Maurice Volaski wrote: >>> You don't need to defragment ext2/ext3 because as you use the >>> filesystem file blocks and inodes are moved around and reallocated >>> to keep the data nearly contiguous. It's not perfect, but it works >>> fairly well and you should almost never see a performance >>> degradation caused by the filesystem's fragmentation. >> >> Is this statement accurate and does it mean ext2/3 is performing a >> sort of dynamic defragmentation? 
> > Ext2/3 has advanced algorithms to make sure that the blocks that are > allocated avoid fragmentation, but it is not doing any kind of dynamic > moving of blocks/inodes. It's probably worth noting that SGI's XFS filesystem has a userland program to eliminate fragmentation: fsr (file system reorganizer). It basically works by copying files around, and depending on the underlying filesystem to allocate contiguous blocks for the new copies of files. It's a neat hack to allow you to defrag a drive without needing too much kernel-mode involvement. Of course, you probably would need some special stuff to ensure inode numbers don't change (NFS depends on them for filehandles, etc). Damian Menscher -- -=#| Physics Grad Student & SysAdmin @ U Illinois Urbana-Champaign |#=- -=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 Ofc:(217)333-0038 |#=- -=#| 4602 Beckman, VMIL/MS, Imaging Technology Group:(217)244-3074 |#=- -=#| www.uiuc.edu/~menscher/ Fax:(217)333-9819 |#=- -=#| The above opinions are not necessarily those of my employers. |#=- From jjletho67-3txe at yahoo.it Sun Jun 19 12:24:27 2005 From: jjletho67-3txe at yahoo.it (jjletho67-3txe at yahoo.it) Date: Sun, 19 Jun 2005 14:24:27 +0200 (CEST) Subject: ext3 offline resizing Message-ID: <20050619122427.99557.qmail@web25610.mail.ukl.yahoo.com> Hi all, I want to setup a linux workstation with FC4 and with all the partitions (except for /boot) under LVM to be able to resize them in future. I don't need online resizing, I can shutdown the system and reboot with the rescuecd when needed. I have done some test on this configuration and I have sverals doubts: If i format a partition with the resize_inode feature enabled and I resize it offline with resize2fs all is ok until I reach (in a single step or in several step) 1000*(Original Size). When i extend a partition over this size I receive a couple of error when I launch fsck.ext3 and the resize_inode feature disappears. 
If I format a partition without the resize_inode feature then i can resize it to any size, but after the resize the first launch of fsck.ext3 gives always this error: Backing up journal inode block information In both the scenarios after the resize I'm able to mount the partitions. The question is: for offline resizing is better to diseble the resize_inode feature ? Is the error message i wrote really an error ? jjletho P.S. I'm sorry but at the moment the subscription form does not work so if you can put me in cc... ___________________________________ Yahoo! Mail: gratis 1GB per i messaggi e allegati da 10MB http://mail.yahoo.it From jjletho67-3txe at yahoo.it Mon Jun 20 07:23:13 2005 From: jjletho67-3txe at yahoo.it (jjletho67-3txe at yahoo.it) Date: Mon, 20 Jun 2005 09:23:13 +0200 (CEST) Subject: ext3 offline resizing Message-ID: <20050620072313.5810.qmail@web25610.mail.ukl.yahoo.com> Hi all, I want to setup a linux workstation with FC4 and with all the partitions (except for /boot) under LVM to be able to resize them in future. I don't need online resizing, I can shutdown the system and reboot with the rescuecd when needed. I have done some test on this configuration and I have sverals doubts: If i format a partition with the resize_inode feature enabled and I resize it offline with resize2fs all is ok until I reach (in a single step or in several step) 1000*(Original Size). When i extend a partition over this size I receive a couple of error when I launch fsck.ext3 and the resize_inode feature disappears. If I format a partition without the resize_inode feature then i can resize it to any size, but after the resize the first launch of fsck.ext3 gives always this error: Backing up journal inode block information In both the scenarios after the resize I'm able to mount the partitions. The question is: for offline resizing is better to disable the resize_inode feature ? Is the error message i wrote really an error ? 
jjletho ___________________________________ Yahoo! Mail: gratis 1GB per i messaggi e allegati da 10MB http://mail.yahoo.it From amolsurati at yahoo.com Mon Jun 20 09:55:47 2005 From: amolsurati at yahoo.com (amol surati) Date: Mon, 20 Jun 2005 02:55:47 -0700 (PDT) Subject: does fsck.ext3 read data blocks? Message-ID: <20050620095547.96738.qmail@web30402.mail.mud.yahoo.com> hi, I am working on a file system related project. I want to know whether the fsck utility for ext3 reads data blocks (storing user files,etc) at any stage? thankin you, amol __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From theman at josephdwagner.info Mon Jun 20 10:00:59 2005 From: theman at josephdwagner.info (Joseph D. Wagner) Date: Mon, 20 Jun 2005 05:00:59 -0500 Subject: does fsck.ext3 read data blocks? In-Reply-To: <20050620095547.96738.qmail@web30402.mail.mud.yahoo.com> Message-ID: <4403a9$13jpqh0@mxip13a.cluster1.charter.net> > I want to know whether the fsck utility for ext3 > reads data blocks (storing user files,etc) at any > stage? Only if you tell it to check for bad blocks. Joseph D. Wagner From tytso at mit.edu Mon Jun 20 16:58:41 2005 From: tytso at mit.edu (Theodore Ts'o) Date: Mon, 20 Jun 2005 12:58:41 -0400 Subject: does fsck.ext3 read data blocks? In-Reply-To: <20050620095547.96738.qmail@web30402.mail.mud.yahoo.com> References: <20050620095547.96738.qmail@web30402.mail.mud.yahoo.com> Message-ID: <20050620165841.GA30339@thunk.org> On Mon, Jun 20, 2005 at 02:55:47AM -0700, amol surati wrote: > > I am working on a file system related project. > I want to know whether the fsck utility for ext3 > reads data blocks (storing user files,etc) at any > stage? > E2fsck will read data blocks for directory inodes and symbolic links where the target of the symlink is greater than 60 bytes. 
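To make the rule above concrete, here is a minimal Python sketch that classifies a path by file type only. It rests on two assumptions that are not stated in the message: that the 60-byte figure corresponds to the 15 x 4-byte i_block array in an ext2/ext3 inode, and that a symlink target of 60 bytes or more no longer fits there (the exact boundary is treated as an assumption). The function name is hypothetical, and the sketch does not inspect the on-disk layout at all.

```python
import os
import stat
import sys

# Assumption: symlink targets shorter than the 60-byte i_block array
# (15 block pointers x 4 bytes) are stored inside the inode itself.
I_BLOCK_BYTES = 15 * 4


def may_need_data_block_read(path):
    """Guess whether e2fsck would read data blocks for this path.

    Follows the rule quoted above: directories always, symlinks only
    when the target is too long to live inside the inode.
    """
    st = os.lstat(path)  # lstat: do not follow symlinks
    if stat.S_ISDIR(st.st_mode):
        return True
    if stat.S_ISLNK(st.st_mode):
        target = os.fsencode(os.readlink(path))
        return len(target) >= I_BLOCK_BYTES
    # Regular files, devices, etc.: their contents are not examined.
    return False


if __name__ == "__main__":
    for p in sys.argv[1:] or ["."]:
        print(p, may_need_data_block_read(p))
```

Regular file contents never enter into the decision, which matches the answer given: only directory blocks and long symlink targets carry structure that e2fsck has to validate.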
- Ted From domen at coderock.org Mon Jun 20 21:55:53 2005 From: domen at coderock.org (domen at coderock.org) Date: Mon, 20 Jun 2005 23:55:53 +0200 Subject: [patch 1/3] fs/ext3/super.c: fix sparse warnings Message-ID: <20050620215553.356624000@nd47.coderock.org> An embedded and charset-unspecified text was scrubbed... Name: sparse-fs_ext3_super.patch URL: From domen at coderock.org Mon Jun 20 21:55:54 2005 From: domen at coderock.org (domen at coderock.org) Date: Mon, 20 Jun 2005 23:55:54 +0200 Subject: [patch 2/3] fs/ext3/resize.c: fix sparse warnings Message-ID: <20050620215554.514639000@nd47.coderock.org> An embedded and charset-unspecified text was scrubbed... Name: sparse-fs_ext3_resize.patch URL: From domen at coderock.org Mon Jun 20 21:55:55 2005 From: domen at coderock.org (domen at coderock.org) Date: Mon, 20 Jun 2005 23:55:55 +0200 Subject: [patch 3/3] Fix misleading gcc4 warning, size may be used uninitialized (ext3) Message-ID: <20050620215555.322563000@nd47.coderock.org> An embedded and charset-unspecified text was scrubbed... Name: gcc4-fs_ext3_acl.c URL: From tytso at mit.edu Tue Jun 21 02:20:36 2005 From: tytso at mit.edu (Theodore Ts'o) Date: Mon, 20 Jun 2005 22:20:36 -0400 Subject: ext3 offline resizing In-Reply-To: <20050620072313.5810.qmail@web25610.mail.ukl.yahoo.com> References: <20050620072313.5810.qmail@web25610.mail.ukl.yahoo.com> Message-ID: <20050621022036.GB29949@thunk.org> On Mon, Jun 20, 2005 at 09:23:13AM +0200, jjletho67-3txe at yahoo.it wrote: > I want to setup a linux workstation with FC4 and with > all the partitions (except for /boot) under LVM to be > able to resize them in future. I don't need online > resizing, I can shutdown the system and reboot with > the rescuecd when needed. 
> I have done some test on this configuration and I have > sverals doubts: > > If i format a partition with the resize_inode feature > enabled and I resize it offline with resize2fs all is > ok until I reach (in a single step or in several step) > 1000*(Original Size). When i extend a partition over > this size I receive a couple of error when I launch > fsck.ext3 and the resize_inode feature disappears. What error(s) are you getting? Not all messages from e2fsck are errors, you know. Some are informative messages, such as: Backing up journal inode block information > If I format a partition without the resize_inode > feature then i can resize it to any size, but after > the resize the first launch of fsck.ext3 gives always > this error: > > Backing up journal inode block information > > In both the scenarios after the resize I'm able to > mount the partitions. > The question is: for offline resizing is better to > disable the resize_inode feature ? > Is the error message i wrote really an error ? Resize2fs will take advantage of the on-line resizing inode to do off-line resizes faster and more safely. It should work with or without, it, though. It should work fine; send the error messages if you'd like me to give you an opinion about them. - Ted From puhuri at iki.fi Tue Jun 21 05:07:50 2005 From: puhuri at iki.fi (Markus Peuhkuri) Date: Tue, 21 Jun 2005 08:07:50 +0300 Subject: [Q] Is this true and does it mean there is dynamic defragmentation in ext2/3? In-Reply-To: <20050618191451.GC16314@thunk.org> References: <20050618191451.GC16314@thunk.org> Message-ID: <42B7A0A6.7010806@iki.fi> (by mistake only replied to Ted, sorry) Theodore Ts'o wrote: >Ext2/3 has advanced algorithms to make sure that the blocks that are >allocated avoid fragmentation, but it is not doing any kind of dynamic > > And there is a tool 'filefrag' in e2fsprogs that reports how fragmented a particular file is. If your disk grows full (over 90-95%, depending on file sizes etc..) 
then it is more difficult to find continuous blocks for files. Now, if you delete files, then new files most probably are non-fragmented but those files that were written when disk was full are still fragmented. You can "unfragment" those files just by copying them and deleting old ones (if you have plenty of free space), but as Damian told, you must be careful with locks and nfs handles. -- http://www.iki.fi/puhuri/ From jjletho67-3txe at yahoo.it Tue Jun 21 08:02:07 2005 From: jjletho67-3txe at yahoo.it (jjletho67-3txe at yahoo.it) Date: Tue, 21 Jun 2005 10:02:07 +0200 (CEST) Subject: ext3 offline resizing In-Reply-To: <20050621022036.GB29949@thunk.org> Message-ID: <20050621080208.40484.qmail@web25605.mail.ukl.yahoo.com> Hi, > > What error(s) are you getting? Not all messages > from e2fsck are > errors, you know. Some are informative messages, > such as: > > Backing up journal inode block information > > Resize2fs will take advantage of the on-line > resizing inode to do > off-line resizes faster and more safely. It should > work with or > without, it, though. It should work fine; send the > error messages if > you'd like me to give you an opinion about them. > > - Ted > with resize_inode feature DISABLED the only warning/message I obtain when doing "fsck.ext3 -f" after the resize is: " Backing up journal inode block information " with resize_inode feature ENABLED when extending to size > 1000*(originalSize) the errors I obtain are: Resize_inode not enabled, but resize inode is non-zero Clear ... Block bitmap differences: -57 Fix Free blocks count wrong for group #0 (6146, counted=6147) Fix Free blocks count wrong (2194393, counted=2194394) Fix After answering yes to all requests I can mount and use the partition, but the resize_inode feature disappears. The example was on a test partition of 2M extended to 2.1G but I made other tests with greater size (50M -> 110G for example) with the same result. letho ___________________________________ Yahoo! 
Mail: gratis 1GB per i messaggi e allegati da 10MB http://mail.yahoo.it From tytso at mit.edu Tue Jun 21 13:50:24 2005 From: tytso at mit.edu (Theodore Ts'o) Date: Tue, 21 Jun 2005 09:50:24 -0400 Subject: ext3 offline resizing In-Reply-To: <20050621080208.40484.qmail@web25605.mail.ukl.yahoo.com> References: <20050621022036.GB29949@thunk.org> <20050621080208.40484.qmail@web25605.mail.ukl.yahoo.com> Message-ID: <20050621135024.GD13207@thunk.org> On Tue, Jun 21, 2005 at 10:02:07AM +0200, jjletho67-3txe at yahoo.it wrote: > > with resize_inode feature ENABLED when extending to > size > 1000*(originalSize) the errors I obtain are: > > Resize_inode not enabled, but resize inode is non-zero > Clear > ... > Block bitmap differences: -57 Fix > > Free blocks count wrong for group #0 (6146, > counted=6147) Fix > > Free blocks count wrong (2194393, counted=2194394) > Fix Oh, OK. Resize2fs isn't clearing the left-over resize inode after it's used all of the reserved blocks. That should be fixed. - Ted From jjletho67-3txe at yahoo.it Tue Jun 21 18:58:59 2005 From: jjletho67-3txe at yahoo.it (jjletho67-3txe at yahoo.it) Date: Tue, 21 Jun 2005 20:58:59 +0200 (CEST) Subject: ext3 offline resizing In-Reply-To: <20050621135024.GD13207@thunk.org> Message-ID: <20050621185859.49037.qmail@web25603.mail.ukl.yahoo.com> Hi, --- Theodore Ts'o ha scritto: > > Oh, OK. Resize2fs isn't clearing the left-over > resize inode after > it's used all of the reserved blocks. That should > be fixed. > > - Ted Ok now is much more clear. In our opinion in this moment (without waiting for a fix) what is the better solution for an ext3 based system which will often need to resize its partitions (offline with resize2fs) ? Disabling the resize_inode feature is safer ? Or is it better to use the resize_inode feature and choose a better initial size ? Is the "1000*(original size)" limit I guessed correct ? I'm sorry but i wasn't able to find any deep documentation about resize_inode. Thank you very much! 
letho From tytso at mit.edu Tue Jun 21 22:20:31 2005 From: tytso at mit.edu (Theodore Ts'o) Date: Tue, 21 Jun 2005 18:20:31 -0400 Subject: ext3 offline resizing In-Reply-To: <20050621185859.49037.qmail@web25603.mail.ukl.yahoo.com> References: <20050621135024.GD13207@thunk.org> <20050621185859.49037.qmail@web25603.mail.ukl.yahoo.com> Message-ID: <20050621222031.GA17224@thunk.org> On Tue, Jun 21, 2005 at 08:58:59PM +0200, jjletho67-3txe at yahoo.it wrote: > > Ok now is much more clear. In our opinion in this > moment (without waiting for a fix) what is the better > solution for an ext3 based system which will often > need to resize its partitions (offline with resize2fs) > ? > Disabling the resize_inode feature is safer ? Or is it > better to use the resize_inode feature and choose a > better initial size ? > Is the "1000*(original size)" limit I guessed correct > ? > I'm sorry but i wasn't able to find any deep > documentation about resize_inode. It should be better to use the resize inode. The filesystem inconsistency reported by e2fsck is just e2fsck being very nit-picky; there is no danger of losing data as a result of this. If you use the resize inode, it will allow you to do on-line resizes up to 1000*original_size by default; this figure, however, can be adjusted with the "mke2fs -E resize=" option --- see the mke2fs man page for more details. If the resize inode is present, off-line resizes will use those reserved blocks to allow for fast resizing that doesn't require moving data blocks belonging to files, directories, or the inode table around. It's not a limit, though; if you try to resize a filesystem to more than 1000*original size (or whatever on-line resizing limit you specified to mke2fs), resize2fs will still work, but it may have to move filesystem data blocks around in order to accomplish the resize.
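As a rough illustration of the 1000*original_size figure, the following Python sketch checks the two resizes reported earlier in this thread (2M to 2.1G and 50M to 110G) against that default. It assumes binary units (MiB/GiB) and works in bytes for simplicity; the function name and the third test case are hypothetical and only there for contrast.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Default headroom described in the message above.
DEFAULT_RESIZE_FACTOR = 1000


def within_reserved_headroom(original_bytes, target_bytes,
                             factor=DEFAULT_RESIZE_FACTOR):
    """True if target_bytes is at most factor * original_bytes."""
    return target_bytes <= factor * original_bytes


cases = [
    ("2M -> 2.1G (reported)", 2 * MIB, int(2.1 * GIB)),
    ("50M -> 110G (reported)", 50 * MIB, 110 * GIB),
    ("50M -> 40G (hypothetical)", 50 * MIB, 40 * GIB),
]

if __name__ == "__main__":
    for label, original, target in cases:
        ratio = target / original
        ok = within_reserved_headroom(original, target)
        print(f"{label}: {ratio:.0f}x original, within default headroom: {ok}")
```

Both reported resizes come out above 1000x, which is consistent with the reserved blocks being used up in those tests; a resize that stays under the factor would be expected to use only the reserved blocks.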
The cost of the online resize inode is that you have to pay a slight penalty upfront in reserved blocks; but given the size of modern disks, the overhead isn't particularly great. - Ted From mvolaski at aecom.yu.edu Sun Jun 26 05:04:08 2005 From: mvolaski at aecom.yu.edu (Maurice Volaski) Date: Sun, 26 Jun 2005 01:04:08 -0400 Subject: [Q] Is errors=panic safe to use, and will it detect a RAID gone psycho? Message-ID: In years past I have seen hardware (SCSI) RAID controllers lose it electronically, causing the kernel to fill the logs with scary SCSI messages and ext3 to complain about "holes" in the filesystem like so: Sep 7 14:47:17 thewarehouse1 kernel: EXT3-fs error (device sd(8,81)): ext3_readdir: directory #376833 contains a hole at offset 0 I'm using drbd and heartbeat so whatever gets written to the hardware RAID gets written independently to a second RAID on a second computer. It would be nice if, in the unlikely event that the hardware failed and caused something bad like the error above, the computer would fail entirely and force heartbeat/drbd to kick in on the second computer. If I set the error behavior with tune2fs to panic, would this happen? That is, is this the type of error that would trigger a panic? Are there minor errors that could unnecessarily trigger one? -- Maurice Volaski, mvolaski at aecom.yu.edu Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University From yvanoers at xs4all.nl Sun Jun 26 13:30:42 2005 From: yvanoers at xs4all.nl (Yuri van Oers) Date: Sun, 26 Jun 2005 15:30:42 +0200 (CEST) Subject: Assertion failure in do_get_write_access() Message-ID: <20050626151224.L25249-100000@xs3.xs4all.nl> Hi, I just had my server cry this out to the console: Assertion failure in do_get_write_access() at transaction.c:658: "jh->b_transaction == journal->j_committing_transaction" kernel BUG at transaction.c:658!
invalid operand: 0000 CPU: 0 EIP: 0010:[] Not tainted EFLAGS: 00010286 eax: 0000007d ebx: c2ff4200 ecx: c243e000 edx: c068af00 esi: c0d6d900 edi: c2ff4200 ebp: c03f4190 esp: c243fd44 ds: 0018 es: 0018 ss: 0018 Process gzip (pid: 28630, stackpage=c243f000) Stack: c0273660 c027385b c0273640 00000292 c0273920 c2ff4200 c18dd660 c03f4190 c2ff4294 00000001 c0155dd6 00000000 00000000 00000000 c2f0f740 00000000 c243fdc4 c243fdc4 c2ff4200 c015e4c8 c18dd660 c03f4190 00000000 00000006 Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] Code: 0f 0b 92 02 40 36 27 c0 83 c4 14 8b 45 08 83 f8 06 0f 85 85 The version of Linux is 2.4.28. It took a hardware reset to get /dev/sda2 (the partition with the fs I suspect was aching) back online. At the time of the error I was tarring jpegs to that partition. I could not find any references to this sort of bug being encountered or fixed after 2.4.28, so I felt obliged to report it. Is it a known bug? If so, has it been fixed? If not, can it be fixed & do you need my help or any other info? Regards, Yuri From hahaha_30k at yahoo.com Tue Jun 28 22:01:51 2005 From: hahaha_30k at yahoo.com (ha haha) Date: Tue, 28 Jun 2005 15:01:51 -0700 (PDT) Subject: How to figure out underlying failed disk(parttions) and sector(s) position ??? Message-ID: <20050628220151.27698.qmail@web30202.mail.mud.yahoo.com> Hi, with being exposed to more and more failed hard disks reports, I've accumulated several questions of the logged messages in /var/log/messages file: like how to identifying failed disks(partitions), where is the exact failed sector(s) on the hard disk, and why badblocks reports OK to the reported disk failure. Let me explained the above with the following several example. scenario #1, a PATA hard disk failed.. Host: host1 ..... 
Jun 21 16:55:09 host1 kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error } Jun 21 16:55:09 host1 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=234304234, high=13, low=16200426, sector=196487120 Jun 21 16:55:09 host1 kernel: end_request: I/O error, dev 03:0b (hda), sector 196487120 ..... The case #1, showed that /dev/hda failed in a simple way, but some stuff are not obvious: Q1: which failed partition? "dev 03:0b" looks like /dev/hda11, but no quick document explains it, /dev/hda11 is just a guess based on the follwing. -- maybe "/dev/hda11" is a better string to log if my guess is true? root at host1# ls -alF /dev/hda11 brw-rw---- 1 root disk 3, 11 Sep 15 2003 /dev/hda11 Q2: which sector failed? looks like it is sector 234304234 in LBA mode, and relative to the whole disk /dev/hda(relative to sector 0 of /dev/hda); while it is the sector 196487120 when relative to /dev/hda11 (sector 0 of /dev/hda11 partition, which may be Gigabytes offset from the beginning of underlying disk). Is this a reasonable guess? Q3: what does the "high=13, low=16200426" means? Q4: Does linux kernel disk driver tries to relocate the failed sector --mapping access to it to some other good sector first, before failling and logging errors to /var/log/messages? Q5: "badblocks -s -v -n ..." sometimes can not find any disk problems even there were reported disk I/O problems in /var/log/messages? what does that mean? Thanks a lot... ____________________________________________________ Yahoo! Sports Rekindle the Rivalries. Sign up for Fantasy Football http://football.fantasysports.yahoo.com
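The numbers in the quoted log lines can be checked with a little arithmetic, sketched below in Python. The hex pair in "dev 03:0b" decodes to major 3, minor 11, which matches the "3, 11" shown by ls for /dev/hda11. The "high" and "low" fields appear to be the bits above and below bit 24 of LBAsect, since 13 * 2^24 + 16200426 equals 234304234; that reading is inferred from the arithmetic, not taken from driver documentation. Treating LBAsect minus sector as the start offset of the partition is the poster's guess from Q2 and remains an assumption here. The helper names are hypothetical.

```python
import re

# Log lines copied from the message above.
DMA_INTR = ("hda: dma_intr: error=0x40 { UncorrectableError }, "
            "LBAsect=234304234, high=13, low=16200426, sector=196487120")
END_REQUEST = "end_request: I/O error, dev 03:0b (hda), sector 196487120"


def decode_dev(dev):
    """Split a 'MM:mm' hex pair into (major, minor) integers."""
    major_hex, minor_hex = dev.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


def combine_lba(high, low):
    """Rebuild an LBA, assuming 'low' holds the low 24 bits."""
    return (high << 24) | low


# Pick out the decimal key=value fields (the hex error=0x40 is skipped).
fields = {k: int(v) for k, v in re.findall(r"(\w+)=(\d+)\b", DMA_INTR)}
dev = re.search(r"dev ([0-9a-f]{2}:[0-9a-f]{2})", END_REQUEST).group(1)

major, minor = decode_dev(dev)
rebuilt = combine_lba(fields["high"], fields["low"])
# Assumption (the poster's Q2 guess): the difference is the sector at
# which the partition starts on the whole disk.
offset = fields["LBAsect"] - fields["sector"]

if __name__ == "__main__":
    print("major, minor:", major, minor)        # 3 11
    print("rebuilt LBA:", rebuilt)              # 234304234
    print("LBAsect - sector:", offset)          # 37817114
```

If the Q2 guess is right, the last figure should equal the start sector that a partition table listing reports for /dev/hda11; the log alone cannot confirm that.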