From kapil.sampath at wipro.com  Thu Jun  2 04:52:51 2005
From: kapil.sampath at wipro.com (kapil.sampath at wipro.com)
Date: Thu, 2 Jun 2005 10:22:51 +0530
Subject: passwd : Module is unknown (Redhat 9 Enterprise Edition)
Message-ID: <2FEE63312285CF428A8480B07AC1C359526C66@CHN-SNR-MBX01.wipro.com>

Hi All,

Can anyone help me resolve this problem?

I use Redhat 9 Enterprise edition. I have a session in which I am logged in
as root. When I issue the command "su" as any other user, it fails with
the error "su: incorrect password". If I try to change the password from
the root session, it fails with the error "passwd: Module is unknown".

 

[root@TESTING root]# su
su: incorrect password
[root@TESTING root]# passwd
Changing password for user root.
passwd: Module is unknown
[root@TESTING root]# which passwd
/usr/bin/passwd
[root@TESTING root]# ls /etc/passwd
/etc/passwd
[root@TESTING root]# uname -a
Linux TESTING 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux

As a normal user other than root:

[kapil@TESTING kapil]$ su
su: incorrect password

 

Please help me in resolving this problem.

 

---------------------------------------

Thanks & Regards
Kapil Sampath
Wipro Technologies,
475A, Old Mahabalipuram Road,
Shollinganallur, Chennai - 600 119
Dir: +91-44-30691701 Ext : 41701

Mobile : +91-9444101619

__________________________

 

"Men should be taught to be practical, physically strong.

A dozen such lions will conquer the world,

Not millions of sheep."

            - Swami Vivekananda

 

 


From menscher at uiuc.edu  Thu Jun  2 05:02:09 2005
From: menscher at uiuc.edu (Damian Menscher)
Date: Thu, 2 Jun 2005 00:02:09 -0500 (CDT)
Subject: passwd : Module is unknown (Redhat 9 Enterprise Edition)
In-Reply-To: <2FEE63312285CF428A8480B07AC1C359526C66@CHN-SNR-MBX01.wipro.com>
References: <2FEE63312285CF428A8480B07AC1C359526C66@CHN-SNR-MBX01.wipro.com>
Message-ID: <Pine.LNX.4.62.0506012356000.24617@lx2.physics.uiuc.edu>

On Thu, 2 Jun 2005 kapil.sampath at wipro.com wrote:

> I use Redhat 9 Enterprise edition. I have a session in which I logged in
> as a root. When I issue the command "su" from any other user it is
> throwing error "su : Incorrect password", If I try to change the
> password from the root session, it is throwing error "passwd : module
> unknown".

First of all, there is no such thing as Redhat 9 Enterprise -- it looks 
like you're running RHEL3, unpatched.  Secondly, your question has 
nothing to do with ext3, so you will likely get a more helpful response 
elsewhere.  But if I had to guess, I'd say your /etc/pam.d/system-auth 
is corrupted.  You might try reinstalling your pam rpm.

Damian Menscher
-- 
-=#| Physics Grad Student & SysAdmin @ U Illinois Urbana-Champaign |#=-
-=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 Ofc:(217)333-0038 |#=-
-=#| 4602 Beckman, VMIL/MS, Imaging Technology Group:(217)244-3074 |#=-
-=#| <menscher at uiuc.edu> www.uiuc.edu/~menscher/ Fax:(217)333-9819 |#=-
-=#| The above opinions are not necessarily those of my employers. |#=-



From brugolsky at telemetry-investments.com  Tue Jun  7 16:04:34 2005
From: brugolsky at telemetry-investments.com (Bill Rugolsky Jr.)
Date: Tue, 7 Jun 2005 12:04:34 -0400
Subject: transaction->t_forget == NULL assertion failure with data=journal
Message-ID: <20050607160434.GC16192@ti64.telemetry-investments.com>

It appears that this bug in data=journal mode,

   https://listman.redhat.com/archives/ext3-users/2005-February/msg00045.html

isn't fixed in 2.6.11.11.

Andrew, I've CC'd you since you have previously looked at this specific issue.

I'm seeing this problem on dual-Opteron x86-64 boxes serving NFS + Samba3 to
a few dozen clients; it takes several hours at high load to reproduce.
We have not tested 2.6.12-rc6 yet, as I need to schedule time for the
clients on the cluster.  I will try and do that ASAP.   I see several important
fixes on the bk-commits-head list, but none of them jump out at me as being
obviously more relevant to data=journal than data=ordered.

Meanwhile, I'll endeavor to reproduce this locally.  It would be really
useful to hunt this down and kill it, because NFS over Ext3 otherwise
performs very well in data=journal mode.  Suggestions welcome.

	-Bill

Assertion failure in __journal_drop_transaction() at fs/jbd/checkpoint.c:625: "transaction->t_forget == NULL"
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at checkpoint:625
invalid operand: 0000 [1]
SMP
CPU 1

Modules linked in: e1000 qla2300 qla2xxx netconsole thermal processor fan button battery ac eeprom adm1026 i2c_sensor i2c_amd756 i2c_core
Pid: 17828, comm: kjournald Not tainted 2.6.11.11
RIP: 0010:[<ffffffff801f347f>] <ffffffff801f347f>{__journal_drop_transaction+319}
RSP: 0018:ffff810028841b58  EFLAGS: 00010296
RAX: 0000000000000071 RBX: ffff8100f840fe00 RCX: ffffffff80612d88
RDX: ffffffff80612d88 RSI: 0000000000000292 RDI: ffffffff80612d80
RBP: ffff8100faf21800 R08: ffff8100f7b45b40 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8100ccba49c0
R13: ffff8100faf21800 R14: 0000000000000000 R15: ffff8100faf2195c
FS:  00002aaaaade8b00(0000) GS:ffffffff80847e00(0000) knlGS:00000000557b26c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaac2000 CR3: 000000007f6fb000 CR4: 00000000000006e0
Process kjournald (pid: 17828, threadinfo ffff810028840000, task ffff8100fb3e0230)
Stack: ffff81007ee618c8 ffff8100faf21800 ffff81003adc3ce8 ffffffff801f36c3
       ffff81002fc7d1e8 ffffffff801f275e 0000000100000000 ffff8100faf21824
       00000e8c00000000 ffff8100b32d0174
Call Trace:
<ffffffff801f36c3>{__journal_remove_checkpoint+99}
<ffffffff801f275e>{journal_commit_transaction+3534}
<ffffffff8014aec0>{autoremove_wake_function+0}
<ffffffff8014aec0>{autoremove_wake_function+0}
<ffffffff8012f047>{recalc_task_prio+327}
<ffffffff801f6e2c>{kjournald+268}
<ffffffff8014aec0>{autoremove_wake_function+0}
<ffffffff80175b4e>{filp_close+126}
<ffffffff8014aec0>{autoremove_wake_function+0}
<ffffffff801f6fd0>{commit_timeout+0}
<ffffffff8010f0f7>{child_rip+8}
<ffffffff801f6d20>{kjournald+0}
<ffffffff8010f0ef>{child_rip+0}
Code: 0f 0b b9 12 59 80 ff ff ff ff 71 02 66 66 90 66 90 48 83 7b

RIP <ffffffff801f347f>{__journal_drop_transaction+319}
RSP <ffff810028841b58>



From nebid2005 at yahoo.com.au  Wed Jun  8 05:16:59 2005
From: nebid2005 at yahoo.com.au (Matt Smith)
Date: Wed, 8 Jun 2005 15:16:59 +1000 (EST)
Subject: clone RHEL 4 ext3 partition
Message-ID: <20050608051700.66634.qmail@web33614.mail.mud.yahoo.com>

Hi,

I'm about to roll out a whole bunch of Redhat
Enterprise 4 workstations and have run into problems
cloning from the original.

Normally I would use Ghost (v7.5) because it does a
nice job when cloning to a different-sized
disk. Unfortunately it comes up with read error 29004.
Looking around, it seems that Symantec doesn't support
Fedora Core 3 (with Ghost v.8 - I don't know if v.9
works).

Next option was to use dd.
This worked fine, but when I went to resize the
partition I noticed that Red Hat has removed resize2fs
from e2fsprogs.
After installing the Red Hat e2fsprogs
source rpm (and the newest from SourceForge - I tried
both) and compiling the resize2fs binary, I got
an error - "Filesystem has unsupported feature(s)"

All other techniques that I know of, e.g. dump and cpio,
also require resizing the partition after imaging.

THE QUESTION: How do I clone RHEL 4 ext3 partitions to
a different-sized disk?

Thanks
Matt
    





From adilger at clusterfs.com  Wed Jun  8 05:55:31 2005
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 7 Jun 2005 23:55:31 -0600
Subject: clone RHEL 4 ext3 partition
In-Reply-To: <20050608051700.66634.qmail@web33614.mail.mud.yahoo.com>
References: <20050608051700.66634.qmail@web33614.mail.mud.yahoo.com>
Message-ID: <20050608055531.GZ14004@schnapps.adilger.int>

On Jun 08, 2005  15:16 +1000, Matt Smith wrote:
> I'm about to roll out a whole bunch of Redhat
> Enterprise 4 workstations and have run into problems
> cloning from the original.
> 
> Normally I would use ghost (v7.5) because it does a
> nice job when cloning to a different sized
> disk.Unfortunately it comes up with read error 29004.
> Looking around it seems that Symantec don't support
> Fedora Core 3 (with Ghost v.8 - don't know if v.9
> works ???).
> 
> Next option was to use dd.
> This worked fine but when I went to resize the
> partition I noticed that Redhat have removed resize2fs
> from e2fsprogs. 
> After installing the Redhat e2fsprogs
> source rpm (and the newest from sourceforge - i tried
> both) and after compiling the resize2fs binary i got
> an error - "Filesystem has unsupported feature(s)"

Both problems are likely from the same root cause - namely
some strange feature enabled in your filesystem.  

What does "dumpe2fs -h {dev} | grep feature" say about
your filesystem?  AFAIK the latest e2fsprogs should
support all the ext3 features in a released kernel.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From maneesh at in.ibm.com  Wed Jun  8 07:36:58 2005
From: maneesh at in.ibm.com (Maneesh Soni)
Date: Wed, 8 Jun 2005 13:06:58 +0530
Subject: [BUGME 4683] oops at log_do_checkpoint+0xa1/0x230
Message-ID: <20050608073658.GD5900@in.ibm.com>

Hello,

I was wondering if we can get some help in resolving this bug. It was
reported earlier and logged in bugme.osdl.org

http://bugme.osdl.org/show_bug.cgi?id=4683

The problem is a kernel oops in log_do_checkpoint() caused by a NULL buffer_head. This
could be due to a race in the journalling code, but I don't have much of a
clue about that. A kdump is available for analysis, as mentioned in the bugme entry.

Thanks
Maneesh

-- 
Maneesh Soni
Linux Technology Center, 
IBM India Software Labs,
Bangalore, India
email: maneesh at in.ibm.com
Phone: 91-80-25044990



From cchan at outblaze.com  Thu Jun  9 07:46:30 2005
From: cchan at outblaze.com (Christopher Chan)
Date: Thu, 09 Jun 2005 15:46:30 +0800
Subject: kjournald pegging cpu
Message-ID: <42A7F3D6.4030907@outblaze.com>

kernel version 2.6.10-1.771_FC2smp

We have had quite a few instances of kjournald pegging the CPU and thereby
effectively knocking out the system's I/O.

What can we do to provide more information so that the cause can be 
identified and fixed?

Thanks

Christopher



From adilger at clusterfs.com  Thu Jun  9 07:59:25 2005
From: adilger at clusterfs.com (Andreas Dilger)
Date: Thu, 9 Jun 2005 01:59:25 -0600
Subject: kjournald pegging cpu
In-Reply-To: <42A7F3D6.4030907@outblaze.com>
References: <42A7F3D6.4030907@outblaze.com>
Message-ID: <20050609075925.GJ14004@schnapps.adilger.int>

On Jun 09, 2005  15:46 +0800, Christopher Chan wrote:
> kernel version 2.6.10-1.771_FC2smp
> 
> We have had quite a few instances of kjournald pegging cpu and thereby 
> effectively knocking out the system's i/o.
> 
> What can we do to provide more information so that the cause can be 
> identified and fixed?

When the problem hits use sysrq-t and/or sysrq-p on the console to dump
the stack of kjournald, as a starting point to see what it is doing.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From sct at redhat.com  Thu Jun  9 13:23:26 2005
From: sct at redhat.com (Stephen C. Tweedie)
Date: Thu, 09 Jun 2005 14:23:26 +0100
Subject: kjournald pegging cpu
In-Reply-To: <20050609075925.GJ14004@schnapps.adilger.int>
References: <42A7F3D6.4030907@outblaze.com>
	<20050609075925.GJ14004@schnapps.adilger.int>
Message-ID: <1118323405.4851.92.camel@sisko.sctweedie.blueyonder.co.uk>

Hi,

On Thu, 2005-06-09 at 08:59, Andreas Dilger wrote:

> When the problem hits use sysrq-t and/or sysrq-p on the console to dump
> the stack of kjournald, as a starting point to see what it is doing.

A kernel "readprofile" can also be very useful for this.

--Stephen




From nebid2005 at yahoo.com.au  Fri Jun 10 06:19:22 2005
From: nebid2005 at yahoo.com.au (Nebid)
Date: Fri, 10 Jun 2005 16:19:22 +1000 (EST)
Subject: clone RHEL 4 ext3 partition
In-Reply-To: <20050608055531.GZ14004@schnapps.adilger.int>
Message-ID: <20050610061922.99105.qmail@web33604.mail.mud.yahoo.com>

Hi Andreas,

Thanks for replying.

dumpe2fs -h /dev/hda3 | grep feature
yields...
Filesystem features: has_journal ext_attr resize_inode
dir_index filetype needs_recovery sparse_super
large_file

Regards,
Matt



From adilger at clusterfs.com  Fri Jun 10 06:34:22 2005
From: adilger at clusterfs.com (Andreas Dilger)
Date: Fri, 10 Jun 2005 00:34:22 -0600
Subject: clone RHEL 4 ext3 partition
In-Reply-To: <20050610061922.99105.qmail@web33604.mail.mud.yahoo.com>
References: <20050608055531.GZ14004@schnapps.adilger.int>
	<20050610061922.99105.qmail@web33604.mail.mud.yahoo.com>
Message-ID: <20050610063422.GR14004@schnapps.adilger.int>

On Jun 10, 2005  16:19 +1000, Nebid wrote:
> Thanks for replying.
> 
> dumpe2fs -h /dev/hda3 | grep feature
> yields...
> Filesystem features: has_journal ext_attr resize_inode
> dir_index filetype needs_recovery sparse_super
> large_file

Well, if your filesystem has "needs_recovery" set that means it is
either mounted (which is OK as long as you do a clean unmount before
trying the clone/resize), or it needs an e2fsck run on it to clean up
the journal before doing the resize ("e2fsck /dev/hda3" will just
replay the journal, as will mounting the filesystem).
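The "needs_recovery" check described above can be scripted.  What follows is
only a sketch: parse_features is a hypothetical helper, and it assumes its
input is just the grep'd "Filesystem features:" line(s) as posted earlier in
this thread (possibly wrapped by a mail client), not the full dumpe2fs output.

```python
# Hypothetical helper: parse the "Filesystem features:" line printed by
# "dumpe2fs -h {dev} | grep feature" and report whether the journal
# still needs to be replayed before a clone or resize.

def parse_features(feature_text: str) -> set:
    """Return the set of feature names, tolerating wrapped lines."""
    text = " ".join(feature_text.split())  # undo any line wrapping
    marker = "Filesystem features:"
    if marker not in text:
        return set()
    return set(text.split(marker, 1)[1].split())

# Feature list exactly as posted in this thread.
sample = """Filesystem features: has_journal ext_attr resize_inode
dir_index filetype needs_recovery sparse_super
large_file"""

features = parse_features(sample)
if "needs_recovery" in features:
    print("journal needs replay: unmount cleanly or run e2fsck first")
else:
    print("journal is clean")
```

The script only interprets text; it does not touch the device, so it is safe
to run against saved output.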

I believe the 1.37 e2fsprogs should handle all of the above features
without trouble.  It may be that Ghost doesn't understand the "resize_inode"
feature...  Having said that, if you are running the FC3 kernel on
these nodes you can use the online resize feature to do the resizing.

Mount the filesystem, then run "ext2resize -v /dev/hda3" and it will
resize to fill the partition.


Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From jli at click3x.com  Tue Jun 14 15:17:04 2005
From: jli at click3x.com (Jonathan)
Date: Tue, 14 Jun 2005 11:17:04 -0400
Subject: 2.4.20 kernel patch for ext3
Message-ID: <000201c570f4$222fb450$d6895fc7@astroboy>

Hi everyone, I need a 2.4.20 kernel patch for the ext3 file system. It's
corrupting our data.

I did some googling, and all pages led to this page:
http://www.zip.com.au/~akpm/linux/ext3/

All the links there to the patches I need are broken.

Can anyone help me get these patches?  Or:

I'm running a 2.4.20 kernel right now. If I just do the kernel upgrade,
do I still need to patch for ext3?

Which is the best kernel to upgrade to, 2.4.30 or 2.4.31?
 
 
Thanks everyone !
 
Jonathan
 
 
 

From evilninja at gmx.net  Tue Jun 14 18:43:48 2005
From: evilninja at gmx.net (evilninja)
Date: Tue, 14 Jun 2005 20:43:48 +0200
Subject: 2.4.20 kernel patch for ext3
In-Reply-To: <000201c570f4$222fb450$d6895fc7@astroboy>
References: <000201c570f4$222fb450$d6895fc7@astroboy>
Message-ID: <42AF2564.6090709@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jonathan schrieb:
> Hi everyone, I need a 2.4.20 kernel patch for ext3 file system. It's
> corrupting our data.

um, please be a *bit* more specific (how your data gets corrupted).

> I'm running  2.4.20 kernel right now, if I just do the kernel upgrade,
> do I still need to patch for ext3?
>

no, when you upgrade to a current 2.4 kernel (2.4.30) no additional
patches are required to use ext3 and upgrading to a more current kernel
really *is* a  good idea.

- --
BOFH excuse #125:

we just switched to Sprint.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCryVkC/PVm5+NVoYRArSTAJ4/5xAIu0NKnsiaGfb9nS2yINvZoQCgjvlz
v2Z4CMytRGC9oTM1lzUJFyk=
=30ZE
-----END PGP SIGNATURE-----



From dshaw at jabberwocky.com  Tue Jun 14 21:14:05 2005
From: dshaw at jabberwocky.com (David Shaw)
Date: Tue, 14 Jun 2005 17:14:05 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
Message-ID: <20050614211405.GA26456@jabberwocky.com>

I have seen this happen a number of times:

Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
Jun 13 13:58:16 n202 kernel: Aborting journal on device sda5.
Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
Jun 13 13:58:16 n202 last message repeated 6 times
Jun 13 13:58:18 n202 kernel: ext3_abort called.
Jun 13 13:58:18 n202 kernel: EXT3-fs error (device sda5): ext3_journal_start_sb: Detected aborted journal
Jun 13 13:58:18 n202 kernel: Remounting filesystem read-only

Once this happens, things break quickly (/tmp being readonly, as a
start).  Upon reboot, a manual fsck is required, after which the
machine is operational again.

This particular example is a SATA disk, but it has happened to a
regular old IDE disk as well.  It is always the root partition.  The
bad inode number varies (but is always either 3 or 9).  There are no
other errors about the disk in the log.
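To see at a glance which inode numbers and devices are involved, log lines
like the ones above can be tallied.  This is a minimal sketch, assuming the
message format shown in this post; tally_bad_inodes is a hypothetical helper
name, and syslog's "last message repeated N times" lines are not counted.

```python
import re
from collections import Counter

# Matches the kernel message format quoted in this post.
BAD_INODE = re.compile(
    r"EXT3-fs error \(device (?P<dev>\w+)\): "
    r"ext3_get_inode_block: bad inode number: (?P<ino>\d+)"
)

def tally_bad_inodes(lines):
    """Count "bad inode number" errors per (device, inode number)."""
    counts = Counter()
    for line in lines:
        match = BAD_INODE.search(line)
        if match:
            counts[(match.group("dev"), int(match.group("ino")))] += 1
    return counts

# Lines copied from the log excerpt above.
sample_log = [
    "Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): "
    "ext3_get_inode_block: bad inode number: 9",
    "Jun 13 13:58:16 n202 kernel: Aborting journal on device sda5.",
    "Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): "
    "ext3_get_inode_block: bad inode number: 9",
]

for (dev, ino), count in sorted(tally_bad_inodes(sample_log).items()):
    print(f"{dev}: inode {ino} reported {count} time(s)")
```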

Kernel: 2.6.11.7
e2fsprogs: 1.35 (28-Feb-2004)

Any thoughts on how to proceed here?  Unfortunately, I'm not able to
duplicate this at will.

David



From bunk at stusta.de  Tue Jun 14 21:34:19 2005
From: bunk at stusta.de (Adrian Bunk)
Date: Tue, 14 Jun 2005 23:34:19 +0200
Subject: [2.6 patch] fs/jbd/: possible cleanups
Message-ID: <20050614213419.GK21393@stusta.de>

This patch contains the following possible cleanups:
- make needlessly global functions static
- journal.c: remove the unused global function __journal_internal_check
             and move the check to journal_init
- remove the following write-only global variable:
  - journal.c: current_journal
- remove the following unneeded EXPORT_SYMBOL's:
  - journal.c: journal_check_used_features
  - journal.c: journal_recover

Please check which of these changes do make sense.

Signed-off-by: Adrian Bunk <bunk at stusta.de>

---

 fs/jbd/journal.c    |   41 ++++++++++++++++++-----------------------
 fs/jbd/revoke.c     |    3 ++-
 include/linux/jbd.h |    3 ---
 3 files changed, 20 insertions(+), 27 deletions(-)

--- linux-2.6.12-rc6-mm1-full/include/linux/jbd.h.old	2005-06-14 03:58:20.000000000 +0200
+++ linux-2.6.12-rc6-mm1-full/include/linux/jbd.h	2005-06-14 04:00:56.000000000 +0200
@@ -900,8 +900,6 @@
 				int start, int len, int bsize);
 extern journal_t * journal_init_inode (struct inode *);
 extern int	   journal_update_format (journal_t *);
-extern int	   journal_check_used_features 
-		   (journal_t *, unsigned long, unsigned long, unsigned long);
 extern int	   journal_check_available_features 
 		   (journal_t *, unsigned long, unsigned long, unsigned long);
 extern int	   journal_set_features 
@@ -914,7 +912,6 @@
 extern int	   journal_skip_recovery	(journal_t *);
 extern void	   journal_update_superblock	(journal_t *, int);
 extern void	   __journal_abort_hard	(journal_t *);
-extern void	   __journal_abort_soft	(journal_t *, int);
 extern void	   journal_abort      (journal_t *, int);
 extern int	   journal_errno      (journal_t *);
 extern void	   journal_ack_err    (journal_t *);
--- linux-2.6.12-rc6-mm1-full/fs/jbd/journal.c.old	2005-06-14 03:57:39.000000000 +0200
+++ linux-2.6.12-rc6-mm1-full/fs/jbd/journal.c	2005-06-14 04:08:24.000000000 +0200
@@ -59,13 +59,11 @@
 EXPORT_SYMBOL(journal_init_dev);
 EXPORT_SYMBOL(journal_init_inode);
 EXPORT_SYMBOL(journal_update_format);
-EXPORT_SYMBOL(journal_check_used_features);
 EXPORT_SYMBOL(journal_check_available_features);
 EXPORT_SYMBOL(journal_set_features);
 EXPORT_SYMBOL(journal_create);
 EXPORT_SYMBOL(journal_load);
 EXPORT_SYMBOL(journal_destroy);
-EXPORT_SYMBOL(journal_recover);
 EXPORT_SYMBOL(journal_update_superblock);
 EXPORT_SYMBOL(journal_abort);
 EXPORT_SYMBOL(journal_errno);
@@ -81,6 +79,7 @@
 EXPORT_SYMBOL(journal_force_commit);
 
 static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *);
+static void __journal_abort_soft (journal_t *journal, int errno);
 
 /*
  * Helper function used to manage commit timeouts
@@ -93,16 +92,6 @@
 	wake_up_process(p);
 }
 
-/* Static check for data structure consistency.  There's no code
- * invoked --- we'll just get a linker failure if things aren't right.
- */
-void __journal_internal_check(void)
-{
-	extern void journal_bad_superblock_size(void);
-	if (sizeof(struct journal_superblock_s) != 1024)
-		journal_bad_superblock_size();
-}
-
 /*
  * kjournald: The main thread function used to manage a logging device
  * journal.
@@ -119,16 +108,12 @@
  *    known as checkpointing, and this thread is responsible for that job.
  */
 
-journal_t *current_journal;		// AKPM: debug
-
-int kjournald(void *arg)
+static int kjournald(void *arg)
 {
 	journal_t *journal = (journal_t *) arg;
 	transaction_t *transaction;
 	struct timer_list timer;
 
-	current_journal = journal;
-
 	daemonize("kjournald");
 
 	/* Set up an interval timer which can be used to trigger a
@@ -1181,8 +1166,10 @@
  * features.  Return true (non-zero) if it does. 
  **/
 
-int journal_check_used_features (journal_t *journal, unsigned long compat,
-				 unsigned long ro, unsigned long incompat)
+static int journal_check_used_features (journal_t *journal,
+					unsigned long compat,
+					unsigned long ro,
+					unsigned long incompat)
 {
 	journal_superblock_t *sb;
 
@@ -1439,7 +1426,7 @@
  * device this journal is present.
  */
 
-const char *journal_dev_name(journal_t *journal, char *buffer)
+static const char *journal_dev_name(journal_t *journal, char *buffer)
 {
 	struct block_device *bdev;
 
@@ -1485,7 +1472,7 @@
 
 /* Soft abort: record the abort error status in the journal superblock,
  * but don't do any other IO. */
-void __journal_abort_soft (journal_t *journal, int errno)
+static void __journal_abort_soft (journal_t *journal, int errno)
 {
 	if (journal->j_flags & JFS_ABORT)
 		return;
@@ -1888,7 +1875,7 @@
 
 static struct proc_dir_entry *proc_jbd_debug;
 
-int read_jbd_debug(char *page, char **start, off_t off,
+static int read_jbd_debug(char *page, char **start, off_t off,
 			  int count, int *eof, void *data)
 {
 	int ret;
@@ -1898,7 +1885,7 @@
 	return ret;
 }
 
-int write_jbd_debug(struct file *file, const char __user *buffer,
+static int write_jbd_debug(struct file *file, const char __user *buffer,
 			   unsigned long count, void *data)
 {
 	char buf[32];
@@ -1987,6 +1974,14 @@
 {
 	int ret;
 
+/* Static check for data structure consistency.  There's no code
+ * invoked --- we'll just get a linker failure if things aren't right.
+ */
+	extern void journal_bad_superblock_size(void);
+	if (sizeof(struct journal_superblock_s) != 1024)
+		journal_bad_superblock_size();
+
+
 	ret = journal_init_caches();
 	if (ret != 0)
 		journal_destroy_caches();
--- linux-2.6.12-rc6-mm1-full/fs/jbd/revoke.c.old	2005-06-14 03:58:36.000000000 +0200
+++ linux-2.6.12-rc6-mm1-full/fs/jbd/revoke.c	2005-06-14 03:58:41.000000000 +0200
@@ -116,7 +116,8 @@
 		(block << (hash_shift - 12))) & (table->hash_size - 1);
 }
 
-int insert_revoke_hash(journal_t *journal, unsigned long blocknr, tid_t seq)
+static int insert_revoke_hash(journal_t *journal, unsigned long blocknr,
+			      tid_t seq)
 {
 	struct list_head *hash_list;
 	struct jbd_revoke_record_s *record;



From adilger at clusterfs.com  Tue Jun 14 23:19:23 2005
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 14 Jun 2005 19:19:23 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050614211405.GA26456@jabberwocky.com>
References: <20050614211405.GA26456@jabberwocky.com>
Message-ID: <20050614231923.GA12320@moraine.clusterfs.com>

On Jun 14, 2005  17:14 -0400, David Shaw wrote:
> Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
> 
> This particular example is a SATA disk, but it has happened to a
> regular old IDE disk as well.  It is always the root partition.  The
> bad inode number varies (but is always either 3 or 9).  There are no
> other errors about the disk in the log.

The "bad inode number" check is only for inodes inside the "reserved inode"
area, namely inum < 12.  The only commonly used (=valid) inode numbers in
this range are the root inode (=2) and the journal inode (=8), so I suspect
you are getting single-bit memory errors in bit 1, or if the controller
is the same that would also be viewed with suspicion.  It is very likely
that you are getting other single-bit errors elsewhere but they are harder
to notice.
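The single-bit hypothesis above can be checked with a little arithmetic.  The
sketch below is illustrative only and is not kernel code: flipped_bits is a
hypothetical helper, and the inode numbers (2 and 8 as valid, 3 and 9 as
reported) are the ones from this thread.

```python
# Compare the valid reserved inode numbers with the reported bad ones
# and list which bit positions differ.

def flipped_bits(expected: int, observed: int) -> list:
    """Return the zero-based positions of the bits that differ."""
    diff = expected ^ observed
    return [pos for pos in range(diff.bit_length()) if (diff >> pos) & 1]

# (valid inode, reported bad inode) pairs from the thread.
for valid, bad in [(2, 3), (8, 9)]:
    print(f"inode {valid} -> {bad}: differing bit positions "
          f"{flipped_bits(valid, bad)}")
```

Both pairs differ only in the lowest-order bit (the bit with value 1), which
is what one would expect if the same bit were being flipped each time.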

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From dshaw at jabberwocky.com  Wed Jun 15 02:26:52 2005
From: dshaw at jabberwocky.com (David Shaw)
Date: Tue, 14 Jun 2005 22:26:52 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050614231923.GA12320@moraine.clusterfs.com>
References: <20050614211405.GA26456@jabberwocky.com>
	<20050614231923.GA12320@moraine.clusterfs.com>
Message-ID: <20050615022652.GA27181@jabberwocky.com>

On Tue, Jun 14, 2005 at 07:19:23PM -0400, Andreas Dilger wrote:
> On Jun 14, 2005  17:14 -0400, David Shaw wrote:
> > Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
> > 
> > This particular example is a SATA disk, but it has happened to a
> > regular old IDE disk as well.  It is always the root partition.  The
> > bad inode number varies (but is always either 3 or 9).  There are no
> > other errors about the disk in the log.
> 
> The "bad inode number" check is only for inodes inside the "reserved inode"
> area, namely inum < 12.  The only commonly used (=valid) inode numbers in
> this range are the root inode (=2) and the journal inode (=8), so I suspect
> you are getting single-bit memory errors in bit 1, or if the controller
> is the same that would also be viewed with suspicion.  It is very likely
> that you are getting other single-bit errors elsewhere but they are harder
> to notice.

This is an interesting idea.  Is there any simple way this sort of bit
flip problem could happen outside of the hardware?  I've had this
happen on 4 different machines from 3 different vendors, 3 SATA, and 1
IDE.  It seems almost impossible that it's a memory or controller
error.

David



From tytso at mit.edu  Wed Jun 15 13:19:43 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Wed, 15 Jun 2005 09:19:43 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050615022652.GA27181@jabberwocky.com>
References: <20050614211405.GA26456@jabberwocky.com>
	<20050614231923.GA12320@moraine.clusterfs.com>
	<20050615022652.GA27181@jabberwocky.com>
Message-ID: <20050615131943.GC4228@thunk.org>

On Tue, Jun 14, 2005 at 10:26:52PM -0400, David Shaw wrote:
> On Tue, Jun 14, 2005 at 07:19:23PM -0400, Andreas Dilger wrote:
> > On Jun 14, 2005  17:14 -0400, David Shaw wrote:
> > > Jun 13 13:58:16 n202 kernel: EXT3-fs error (device sda5): ext3_get_inode_block: bad inode number: 9
> > > 
> > > This particular example is a SATA disk, but it has happened to a
> > > regular old IDE disk as well.  It is always the root partition.  The
> > > bad inode number varies (but is always either 3 or 9).  There are no
> > > other errors about the disk in the log.
> > 
> > The "bad inode number" check is only for inodes inside the "reserved inode"
> > area, namely inum < 12.  The only commonly used (=valid) inode numbers in
> > this range are the root inode (=2) and the journal inode (=8), so I suspect
> > you are getting single-bit memory errors in bit 1, or if the controller
> > is the same that would also be viewed with suspicion.  It is very likely
> > that you are getting other single-bit errors elsewhere but they are harder
> > to notice.
> 
> This is an interesting idea.  Is there any simple way this sort of bit
> flip problem could happen outside of the hardware?  I've had this
> happen on 4 different machines from 3 different vendors, 3 SATA, and 1
> IDE.  It seems almost impossible that it's a memory or controller
> error.

I have to agree with Andreas' analysis.  If you could, please send
some compressed raw e2image dump files (see the man page for e2image,
but basically what we need is: "e2image -r /dev/sda5 - | bzip2 >
sda5.e2i.bz2"), taken after the disk is remounted read-only.  Then
take another e2image dump after the system has rebooted in single user
mode, but *before* running e2fsck on the filesystem.  (That way we can
check to see if the filesystem has changed between reboots --- that
could indicate hardware problems, or in-memory corruption of the
buffer cache due to some kernel bug.)  The e2fsck transcript would
also be useful, of course.
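For anyone who wants a first look at the two dumps themselves, a block-wise
comparison along the following lines may help.  It is a rough sketch under
stated assumptions: both images have already been decompressed with bunzip2,
the file names in the usage comment are hypothetical, and the 4096-byte block
size is a guess that should be replaced with the filesystem's real block size.

```python
# Report which blocks differ between two raw e2image dumps, e.g. one
# taken after the read-only remount and one taken after the reboot.

def differing_blocks(path_a, path_b, block_size=4096):
    """Yield zero-based indexes of blocks whose contents differ."""
    with open(path_a, "rb") as image_a, open(path_b, "rb") as image_b:
        index = 0
        while True:
            chunk_a = image_a.read(block_size)
            chunk_b = image_b.read(block_size)
            if not chunk_a and not chunk_b:
                break  # both files exhausted
            if chunk_a != chunk_b:
                yield index
            index += 1

# Usage (hypothetical file names):
#   for block in differing_blocks("before-reboot.e2i", "after-reboot.e2i"):
#       print("block", block, "differs")
```

A raw image can be as large as the device, so the loop reads one block at a
time rather than loading either file into memory.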

The only other possible explanation I can imagine, beyond a hardware
problem or some strange kernel bug that no one else is seeing, is
a bug in some program that was directly accessing the disk drive;
for example, if the bootloader attempted to update some state and
wrote that state to the wrong place on disk, or some other program
that was doing direct disk accesses, and it was always corrupting the
same block(s) in the same way.
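
The single-bit hypothesis quoted earlier in this message can be checked
with a quick XOR.  The sketch below is only an illustration: the table
of valid reserved inodes comes from the quoted analysis (root = 2,
journal = 8), and the function name is made up for the example.

```python
VALID_RESERVED = {2: "root inode", 8: "journal inode"}


def single_bit_neighbours(bad_inum):
    """Return (valid_inum, bit_index) pairs that differ from bad_inum in one bit."""
    hits = []
    for good in VALID_RESERVED:
        diff = good ^ bad_inum
        if diff and diff & (diff - 1) == 0:  # exactly one bit is set
            hits.append((good, diff.bit_length() - 1))
    return hits


for bad in (3, 9):
    print(bad, single_bit_neighbours(bad))
# 3 [(2, 0)]
# 9 [(8, 0)]
```

Both reported bad numbers are one flip of the lowest bit away from a
valid inode number.  Counting bits from zero that is bit 0; the "bit 1"
in the quoted text presumably counts from one.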

Good luck,

						- Ted



From theman at josephdwagner.info  Wed Jun 15 17:13:34 2005
From: theman at josephdwagner.info (Joseph D. Wagner)
Date: Wed, 15 Jun 2005 12:13:34 -0500
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050615131943.GC4228@thunk.org>
Message-ID: <43vtj5$tska2f@mxip09a.cluster1.charter.net>

> The only other possible explanation I can imagine, beyond a
> hardware problem

Just off the top of my head, could the number be reaching its limit and rolling over somehow?

Joseph D. Wagner



From adilger at clusterfs.com  Wed Jun 15 17:22:05 2005
From: adilger at clusterfs.com ('Andreas Dilger')
Date: Wed, 15 Jun 2005 13:22:05 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <43vtj5$tska2f@mxip09a.cluster1.charter.net>
References: <20050615131943.GC4228@thunk.org>
	<43vtj5$tska2f@mxip09a.cluster1.charter.net>
Message-ID: <20050615172205.GD12320@moraine.clusterfs.com>

On Jun 15, 2005  12:13 -0500, Joseph D. Wagner wrote:
> > The only other possible explanation I can imagine, beyond a
> > hardware problem
> 
> Just off the top of my head, could the number be reaching its limit
> and rolling over somehow?

Seems unlikely, as it isn't very easy to create a filesystem that uses
4B inodes.  Would have to change a lot of config options to do that and
have a very large device.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



From tytso at mit.edu  Wed Jun 15 19:00:55 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Wed, 15 Jun 2005 15:00:55 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <43vtj5$tska2f@mxip09a.cluster1.charter.net>
References: <20050615131943.GC4228@thunk.org>
	<43vtj5$tska2f@mxip09a.cluster1.charter.net>
Message-ID: <20050615190055.GA7722@thunk.org>

On Wed, Jun 15, 2005 at 12:13:34PM -0500, Joseph D. Wagner wrote:
> > The only other possible explanation I can imagine, beyond a
> > hardware problem
> 
> Just off the top of my head, could the number be reaching its limit
> and rolling over somehow?

The inode number?  No.

					- Ted



From tytso at mit.edu  Wed Jun 15 19:05:47 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Wed, 15 Jun 2005 15:05:47 -0400
Subject: bad inode number followed by ext3_abort and remount readonly
In-Reply-To: <20050615172205.GD12320@moraine.clusterfs.com>
References: <20050615131943.GC4228@thunk.org>
	<43vtj5$tska2f@mxip09a.cluster1.charter.net>
	<20050615172205.GD12320@moraine.clusterfs.com>
Message-ID: <20050615190547.GB7722@thunk.org>

On Wed, Jun 15, 2005 at 01:22:05PM -0400, 'Andreas Dilger' wrote:
> On Jun 15, 2005  12:13 -0500, Joseph D. Wagner wrote:
> > > The only other possible explanation I can imagine, beyond a
> > > hardware problem
> > 
> > Just off the top of my head, could the number be reaching its limit
> > and rolling over somehow?
> 
> Seems unlikely, as it isn't very easy to create a filesystem that uses
> 4B inodes.  Would have to change a lot of config options to do that and
> have a very large device.

Not to mention that we don't do any arithmetic operations on inode
numbers, so you're not going to see an overflow.

						- Ted



From kapil.sampath at wipro.com  Thu Jun 16 14:42:27 2005
From: kapil.sampath at wipro.com (kapil.sampath at wipro.com)
Date: Thu, 16 Jun 2005 20:12:27 +0530
Subject: User directories in /home are missing
Message-ID: <2FEE63312285CF428A8480B07AC1C35969DCDA@CHN-SNR-MBX01.wipro.com>

Hi,

I use Redhat Enterprise Edition with the 2.4.21 kernel. On my system all
the user directories in /home have disappeared. No one deleted them, and
I don't know how they went missing. I had one or two open sessions where
I was already logged in as root and had cd'ed to one particular home
directory. From there I am able to access files in that home directory,
but not from a new session. In a new session it always says "no such
file or directory".

 

All my user accounts still exist. I have around 10 user accounts on
that machine in total. I haven't rebooted the machine, thinking that
I can do something without a reboot. 

 

Does anyone know the reason for this and, if so, how to recover from it?

 

Regards

Kapil Sampath

 

"The greatest sin is to think yourself weak"

                             - Swami Vivekananda

 


From manuel at todo-linux.com  Thu Jun 16 14:41:59 2005
From: manuel at todo-linux.com (Manuel Arostegui Ramirez)
Date: Thu, 16 Jun 2005 16:41:59 +0200
Subject: User directories in /home are missing
In-Reply-To: <2FEE63312285CF428A8480B07AC1C35969DCDA@CHN-SNR-MBX01.wipro.com>
References: <2FEE63312285CF428A8480B07AC1C35969DCDA@CHN-SNR-MBX01.wipro.com>
Message-ID: <200506161641.59909.manuel@todo-linux.com>

On Thursday 16 June 2005 16:42, kapil.sampath at wipro.com wrote:
> Hi,
>
> I use Redhat Enterprise Edition with the 2.4.21 kernel. On my system all
> the user directories in /home have disappeared. No one deleted them, and
> I don't know how they went missing. I had one or two open sessions where
> I was already logged in as root and had cd'ed to one particular home
> directory. From there I am able to access files in that home directory,
> but not from a new session. In a new session it always says "no such
> file or directory".
>
>
>
> All my user accounts still exist. I have around 10 user accounts on
> that machine in total. I haven't rebooted the machine, thinking that
> I can do something without a reboot.
>
>
>
> Does anyone know the reason for this and, if so, how to recover from it?
>
>
>
> Regards
>
> Kapil Sampath

Have you looked at lost+found?

-- 
Manuel Arostegui Ramirez #Linux Registered User 295750
Socio de Hispalinux 1813
Red Hat Linux 9, Kernel 2.6.2 ReiserFS
Firma  cifrada
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE+3O1MqfmPcHTj+twRAm
yDAJ9P6ezepIMg06vOet/YPKxVoB+Z/ACfWVhh
---END PGP SIGNATURE-----



From michoel_abrams at ml.com  Thu Jun 16 15:09:46 2005
From: michoel_abrams at ml.com (Abrams, Michoel (IDS DCS PE))
Date: Thu, 16 Jun 2005 11:09:46 -0400
Subject: User directories in /home are missing
Message-ID: <666CE4AC3B5F8E46A67D8F6FE504EBC00E3617F5@mlnya204mb.amrs.win.ml.com>

by chance, did you recently enable autofs?  a typical mount point for
autofs is /home, & if autofs is started w/ that mount point specified,
all your local files will not appear until you stop the autofs svc...
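
One way to test this idea is to see what is currently mounted on /home.
The sketch below parses text in the /proc/mounts format; the sample
lines are invented for illustration and are not from the poster's
machine, and the function name is an assumption.

```python
# Invented sample in /proc/mounts format (device, mount point, type, ...).
SAMPLE = """\
/dev/sda2 / ext3 rw 0 0
/dev/sda5 /home ext3 rw 0 0
automount(pid1234) /home autofs rw 0 0
"""


def mounts_on(mount_point, mounts_text):
    """Return (device, fstype) for every entry mounted at mount_point, in order."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            found.append((fields[0], fields[2]))
    return found


print(mounts_on("/home", SAMPLE))
# [('/dev/sda5', 'ext3'), ('automount(pid1234)', 'autofs')]
```

On a live system the text would come from reading /proc/mounts.  An
autofs entry listed for /home would be consistent with the local
directories being hidden underneath it.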
 
HTH
 
Mike.

	-----Original Message-----
	From: ext3-users-bounces at redhat.com
[mailto:ext3-users-bounces at redhat.com] On Behalf Of
kapil.sampath at wipro.com
	Sent: Thursday, June 16, 2005 10:42 AM
	To: ext3-users at redhat.com; linux_lovers at yahoogroups.com
	Subject: User directories in /home are missing
	
	

	Hi,

	I use Redhat Enterprise Edition with the 2.4.21 kernel. On my
system all the user directories in /home have disappeared. No one deleted
them, and I don't know how they went missing. I had one or two open
sessions where I was already logged in as root and had cd'ed to one
particular home directory. From there I am able to access files in that
home directory, but not from a new session. In a new session it always
says "no such file or directory".

	 

	All my user accounts still exist. I have around 10 user accounts
on that machine in total. I haven't rebooted the machine, thinking that
I can do something without a reboot. 

	 

	Does anyone know the reason for this and, if so, how to recover
from it?

	 

	Regards

	Kapil Sampath

	 

	"The greatest sin is to think yourself weak"

	                             - Swami Vivekananda

From mvolaski at aecom.yu.edu  Fri Jun 17 22:28:03 2005
From: mvolaski at aecom.yu.edu (Maurice Volaski)
Date: Fri, 17 Jun 2005 18:28:03 -0400
Subject: [Q] Is this true and does it mean there is dynamic
 defragmentation in ext2/3?
Message-ID: <a0621024ebed8fb13822d@[129.98.90.227]>

Someone recently posted the following statement midway down the page 
at 
http://forums.gentoo.org/viewtopic-t-305871-postdays-0-postorder-asc-highlight-ext3+ordered+data-start-25.html

>You don't need to defragment ext2/ext3 because as you use the 
>filesystem file blocks and inodes are moved around and reallocated 
>to keep the data nearly contiguous. It's not perfect, but it works 
>fairly well and you should almost never see a performance 
>degradation caused by the filesystem's fragmentation.

Is this statement accurate and does it mean ext2/3 is performing a 
sort of dynamic defragmentation?
-- 

Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University



From tytso at mit.edu  Sat Jun 18 19:14:51 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Sat, 18 Jun 2005 15:14:51 -0400
Subject: [Q] Is this true and does it mean there is dynamic
	defragmentation in ext2/3?
In-Reply-To: <a0621024ebed8fb13822d@[129.98.90.227]>
References: <a0621024ebed8fb13822d@[129.98.90.227]>
Message-ID: <20050618191451.GC16314@thunk.org>

On Fri, Jun 17, 2005 at 06:28:03PM -0400, Maurice Volaski wrote:
> >You don't need to defragment ext2/ext3 because as you use the 
> >filesystem file blocks and inodes are moved around and reallocated 
> >to keep the data nearly contiguous. It's not perfect, but it works 
> >fairly well and you should almost never see a performance 
> >degradation caused by the filesystem's fragmentation.
> 
> Is this statement accurate and does it mean ext2/3 is performing a 
> sort of dynamic defragmentation?

No, not true.  (At least not today)

Ext2/3 has advanced algorithms to make sure that the blocks that are
allocated avoid fragmentation, but it is not doing any kind of dynamic
moving of blocks/inodes.  

(At least, not yet; there has been some talk about creating enough
kernel hooks so that a user-space program could do dynamic
defragmentation of the filesystem, but none of this exists at the
moment.)

						- Ted



From menscher at uiuc.edu  Sat Jun 18 19:36:59 2005
From: menscher at uiuc.edu (Damian Menscher)
Date: Sat, 18 Jun 2005 14:36:59 -0500 (CDT)
Subject: [Q] Is this true and does it mean there is dynamic
 defragmentation in ext2/3?
In-Reply-To: <20050618191451.GC16314@thunk.org>
References: <a0621024ebed8fb13822d@[129.98.90.227]>
	<20050618191451.GC16314@thunk.org>
Message-ID: <Pine.LNX.4.62.0506181433260.4844@lx2.physics.uiuc.edu>

On Sat, 18 Jun 2005, Theodore Ts'o wrote:
> On Fri, Jun 17, 2005 at 06:28:03PM -0400, Maurice Volaski wrote:
>>> You don't need to defragment ext2/ext3 because as you use the
>>> filesystem file blocks and inodes are moved around and reallocated
>>> to keep the data nearly contiguous. It's not perfect, but it works
>>> fairly well and you should almost never see a performance
>>> degradation caused by the filesystem's fragmentation.
>>
>> Is this statement accurate and does it mean ext2/3 is performing a
>> sort of dynamic defragmentation?
>
> Ext2/3 has advanced algorithms to make sure that the blocks that are
> allocated avoid fragmentation, but it is not doing any kind of dynamic
> moving of blocks/inodes.

It's probably worth noting that SGI's XFS filesystem has a userland 
program to eliminate fragmentation: fsr (file system reorganizer).  It 
basically works by copying files around, and depending on the underlying 
filesystem to allocate contiguous blocks for the new copies of files. 
It's a neat hack to allow you to defrag a drive without needing too much 
kernel-mode involvement.

Of course, you probably would need some special stuff to ensure inode 
numbers don't change (NFS depends on them for filehandles, etc).

Damian Menscher
-- 
-=#| Physics Grad Student & SysAdmin @ U Illinois Urbana-Champaign |#=-
-=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 Ofc:(217)333-0038 |#=-
-=#| 4602 Beckman, VMIL/MS, Imaging Technology Group:(217)244-3074 |#=-
-=#| <menscher at uiuc.edu> www.uiuc.edu/~menscher/ Fax:(217)333-9819 |#=-
-=#| The above opinions are not necessarily those of my employers. |#=-



From jjletho67-3txe at yahoo.it  Sun Jun 19 12:24:27 2005
From: jjletho67-3txe at yahoo.it (jjletho67-3txe at yahoo.it)
Date: Sun, 19 Jun 2005 14:24:27 +0200 (CEST)
Subject: ext3 offline resizing
Message-ID: <20050619122427.99557.qmail@web25610.mail.ukl.yahoo.com>

Hi all,
I want to set up a Linux workstation with FC4 and with
all the partitions (except for /boot) under LVM to be
able to resize them in future. I don't need online
resizing; I can shut down the system and reboot with
the rescue CD when needed.
I have done some tests on this configuration and I have
several doubts:

If I format a partition with the resize_inode feature
enabled and resize it offline with resize2fs, all is
OK until I reach (in a single step or in several steps)
1000*(Original Size). When I extend a partition beyond
this size I receive a couple of errors when I launch
fsck.ext3, and the resize_inode feature disappears.

If I format a partition without the resize_inode
feature then I can resize it to any size, but after
the resize the first launch of fsck.ext3 always gives
this error:

Backing up journal inode block information

In both scenarios I'm able to mount the partitions
after the resize.
The question is: for offline resizing, is it better to
disable the resize_inode feature?
Is the message I quoted really an error?

jjletho

P.S. I'm sorry but at the moment the subscription form
does not work so if you can put me in cc...








	

	
		



From jjletho67-3txe at yahoo.it  Mon Jun 20 07:23:13 2005
From: jjletho67-3txe at yahoo.it (jjletho67-3txe at yahoo.it)
Date: Mon, 20 Jun 2005 09:23:13 +0200 (CEST)
Subject: ext3 offline resizing
Message-ID: <20050620072313.5810.qmail@web25610.mail.ukl.yahoo.com>

Hi all,
I want to set up a Linux workstation with FC4 and with
all the partitions (except for /boot) under LVM to be
able to resize them in future. I don't need online
resizing; I can shut down the system and reboot with
the rescue CD when needed.
I have done some tests on this configuration and I have
several doubts:

If I format a partition with the resize_inode feature
enabled and resize it offline with resize2fs, all is
OK until I reach (in a single step or in several steps)
1000*(Original Size). When I extend a partition beyond
this size I receive a couple of errors when I launch
fsck.ext3, and the resize_inode feature disappears.

If I format a partition without the resize_inode
feature then I can resize it to any size, but after
the resize the first launch of fsck.ext3 always gives
this error:

Backing up journal inode block information

In both scenarios I'm able to mount the partitions
after the resize.
The question is: for offline resizing, is it better to
disable the resize_inode feature?
Is the message I quoted really an error?

jjletho



	

	
		



From amolsurati at yahoo.com  Mon Jun 20 09:55:47 2005
From: amolsurati at yahoo.com (amol surati)
Date: Mon, 20 Jun 2005 02:55:47 -0700 (PDT)
Subject: does fsck.ext3 read data blocks?
Message-ID: <20050620095547.96738.qmail@web30402.mail.mud.yahoo.com>

hi,

I am working on a file system related project. 
I want to know whether the fsck utility for ext3
reads data blocks (storing user files,etc)  at any
stage?

thanking you,
amol




From theman at josephdwagner.info  Mon Jun 20 10:00:59 2005
From: theman at josephdwagner.info (Joseph D. Wagner)
Date: Mon, 20 Jun 2005 05:00:59 -0500
Subject: does fsck.ext3 read data blocks?
In-Reply-To: <20050620095547.96738.qmail@web30402.mail.mud.yahoo.com>
Message-ID: <4403a9$13jpqh0@mxip13a.cluster1.charter.net>

> I want to know whether the fsck utility for ext3
> reads data blocks (storing user files,etc)  at any
> stage?

Only if you tell it to check for bad blocks.

Joseph D. Wagner



From tytso at mit.edu  Mon Jun 20 16:58:41 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Mon, 20 Jun 2005 12:58:41 -0400
Subject: does fsck.ext3 read data blocks?
In-Reply-To: <20050620095547.96738.qmail@web30402.mail.mud.yahoo.com>
References: <20050620095547.96738.qmail@web30402.mail.mud.yahoo.com>
Message-ID: <20050620165841.GA30339@thunk.org>

On Mon, Jun 20, 2005 at 02:55:47AM -0700, amol surati wrote:
> 
> I am working on a file system related project. 
> I want to know whether the fsck utility for ext3
> reads data blocks (storing user files,etc)  at any
> stage?
> 

E2fsck will read data blocks for directory inodes and symbolic links
where the target of the symlink is greater than 60 bytes.
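
The 60-byte figure matches the size of the inode's block-pointer array
(15 pointers of 4 bytes each), which is where short symlink targets are
stored.  A minimal sketch of that rule follows; the function name is
made up, and the exact boundary (whether a terminating NUL byte is
counted) is an assumption rather than something stated in this thread.

```python
I_BLOCK_BYTES = 15 * 4  # 15 block pointers of 4 bytes each = 60 bytes


def symlink_needs_data_block(target: bytes) -> bool:
    """True if the target does not fit in the inode's 60-byte pointer area.

    Assumption: one extra byte is needed for a terminating NUL, so a
    target of 60 bytes or more spills into a separate data block.
    """
    return len(target) + 1 > I_BLOCK_BYTES


print(symlink_needs_data_block(b"/usr/bin/passwd"))  # False
print(symlink_needs_data_block(b"a" * 200))          # True
```

Only symlinks for which this returns True would have a data block for
e2fsck to read.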

					- Ted



From domen at coderock.org  Mon Jun 20 21:55:53 2005
From: domen at coderock.org (domen at coderock.org)
Date: Mon, 20 Jun 2005 23:55:53 +0200
Subject: [patch 1/3] fs/ext3/super.c: fix sparse warnings
Message-ID: <20050620215553.356624000@nd47.coderock.org>

An embedded and charset-unspecified text was scrubbed...
Name: sparse-fs_ext3_super.patch
URL: <http://listman.redhat.com/archives/ext3-users/attachments/20050620/af49961a/attachment.ksh>

From domen at coderock.org  Mon Jun 20 21:55:54 2005
From: domen at coderock.org (domen at coderock.org)
Date: Mon, 20 Jun 2005 23:55:54 +0200
Subject: [patch 2/3] fs/ext3/resize.c: fix sparse warnings
Message-ID: <20050620215554.514639000@nd47.coderock.org>

An embedded and charset-unspecified text was scrubbed...
Name: sparse-fs_ext3_resize.patch
URL: <http://listman.redhat.com/archives/ext3-users/attachments/20050620/8e33ffa8/attachment.ksh>

From domen at coderock.org  Mon Jun 20 21:55:55 2005
From: domen at coderock.org (domen at coderock.org)
Date: Mon, 20 Jun 2005 23:55:55 +0200
Subject: [patch 3/3] Fix misleading gcc4 warning,
	size may be used uninitialized (ext3)
Message-ID: <20050620215555.322563000@nd47.coderock.org>

An embedded and charset-unspecified text was scrubbed...
Name: gcc4-fs_ext3_acl.c
URL: <http://listman.redhat.com/archives/ext3-users/attachments/20050620/80eb5460/attachment.c>

From tytso at mit.edu  Tue Jun 21 02:20:36 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Mon, 20 Jun 2005 22:20:36 -0400
Subject: ext3 offline resizing
In-Reply-To: <20050620072313.5810.qmail@web25610.mail.ukl.yahoo.com>
References: <20050620072313.5810.qmail@web25610.mail.ukl.yahoo.com>
Message-ID: <20050621022036.GB29949@thunk.org>

On Mon, Jun 20, 2005 at 09:23:13AM +0200, jjletho67-3txe at yahoo.it wrote:
> I want to set up a Linux workstation with FC4 and with
> all the partitions (except for /boot) under LVM to be
> able to resize them in future. I don't need online
> resizing; I can shut down the system and reboot with
> the rescue CD when needed.
> I have done some tests on this configuration and I have
> several doubts:
> 
> If I format a partition with the resize_inode feature
> enabled and resize it offline with resize2fs, all is
> OK until I reach (in a single step or in several steps)
> 1000*(Original Size). When I extend a partition beyond
> this size I receive a couple of errors when I launch
> fsck.ext3, and the resize_inode feature disappears.

What error(s) are you getting?  Not all messages from e2fsck are
errors, you know.  Some are informative messages, such as:

	Backing up journal inode block information

> If I format a partition without the resize_inode
> feature then I can resize it to any size, but after
> the resize the first launch of fsck.ext3 always gives
> this error:
> 
> Backing up journal inode block information
> 
> In both scenarios I'm able to mount the partitions
> after the resize.
> The question is: for offline resizing, is it better to
> disable the resize_inode feature?
> Is the message I quoted really an error?

Resize2fs will take advantage of the on-line resizing inode to do
off-line resizes faster and more safely.  It should work with or
without it, though.  It should work fine; send the error messages if
you'd like me to give you an opinion about them.

					- Ted



From puhuri at iki.fi  Tue Jun 21 05:07:50 2005
From: puhuri at iki.fi (Markus Peuhkuri)
Date: Tue, 21 Jun 2005 08:07:50 +0300
Subject: [Q] Is this true and does it mean there is
 dynamic	defragmentation in ext2/3?
In-Reply-To: <20050618191451.GC16314@thunk.org>
References: <a0621024ebed8fb13822d@[129.98.90.227]>
	<20050618191451.GC16314@thunk.org>
Message-ID: <42B7A0A6.7010806@iki.fi>

(by mistake only replied to Ted, sorry)
Theodore Ts'o wrote:

>Ext2/3 has advanced algorithms to make sure that the blocks that are
>allocated avoid fragmentation, but it is not doing any kind of dynamic
>  
>
And there is a tool 'filefrag' in e2fsprogs that reports how fragmented
a particular file is.  If your disk grows full (over 90-95%, depending
on file sizes etc.) then it is more difficult to find contiguous blocks
for files.  If you then delete files, new files will most probably be
non-fragmented, but the files that were written while the disk was full
are still fragmented.

You can "unfragment" those files just by copying them and deleting the
old ones (if you have plenty of free space), but as Damian said, you must
be careful with locks and NFS handles.
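
A minimal sketch of the copy-and-replace approach follows, assuming the
file is not open or locked by anyone else while it runs.  The function
name is made up for the example.  It returns the old and new inode
numbers to make the caveat visible: the copy is a new inode, so NFS
filehandles and hard links to the old file no longer refer to it.

```python
import os
import shutil
import tempfile


def rewrite_file(path):
    """Replace path with a fresh copy of itself; return (old_inode, new_inode)."""
    old_inode = os.stat(path).st_ino
    directory = os.path.dirname(os.path.abspath(path))
    # Create the copy in the same directory so the final rename stays
    # within one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            shutil.copyfileobj(src, dst)
            dst.flush()
            os.fsync(dst.fileno())
        shutil.copystat(path, tmp)  # keep mode and timestamps (not ownership)
        os.replace(tmp, path)       # atomic rename over the original
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return old_inode, os.stat(path).st_ino
```

Whether the new copy ends up less fragmented depends on how much
contiguous free space the allocator can find at that moment; running
filefrag before and after is the way to tell.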

-- 
http://www.iki.fi/puhuri/





From jjletho67-3txe at yahoo.it  Tue Jun 21 08:02:07 2005
From: jjletho67-3txe at yahoo.it (jjletho67-3txe at yahoo.it)
Date: Tue, 21 Jun 2005 10:02:07 +0200 (CEST)
Subject: ext3 offline resizing
In-Reply-To: <20050621022036.GB29949@thunk.org>
Message-ID: <20050621080208.40484.qmail@web25605.mail.ukl.yahoo.com>

Hi,
> 
> What error(s) are you getting?  Not all messages
> from e2fsck are
> errors, you know.  Some are informative messages,
> such as:
> 
> 	Backing up journal inode block information
> 
> Resize2fs will take advantage of the on-line
> resizing inode to do
> off-line resizes faster and more safely.  It should
> work with or
> without it, though.  It should work fine; send the
> error messages if
> you'd like me to give you an opinion about them.
> 
> 					- Ted
> 

with resize_inode feature DISABLED the only
warning/message I obtain when doing "fsck.ext3 -f"
after the resize is:

"
Backing up journal inode block information
"

with resize_inode feature ENABLED when extending to
size > 1000*(originalSize) the errors I obtain are:

Resize_inode not enabled, but resize inode is non-zero
Clear<y>
...
Block bitmap differences: -57 Fix<y>

Free blocks count wrong for group #0 (6146,
counted=6147) Fix<y>

Free blocks count wrong (2194393, counted=2194394)
Fix<y>

After answering yes to all requests I can mount and
use the partition, but the resize_inode feature
disappears.
The example was on a test partition of 2M extended to
2.1G but I made other tests with greater size (50M ->
110G for example) with the same result.

letho



	

	
		



From tytso at mit.edu  Tue Jun 21 13:50:24 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Tue, 21 Jun 2005 09:50:24 -0400
Subject: ext3 offline resizing
In-Reply-To: <20050621080208.40484.qmail@web25605.mail.ukl.yahoo.com>
References: <20050621022036.GB29949@thunk.org>
	<20050621080208.40484.qmail@web25605.mail.ukl.yahoo.com>
Message-ID: <20050621135024.GD13207@thunk.org>

On Tue, Jun 21, 2005 at 10:02:07AM +0200, jjletho67-3txe at yahoo.it wrote:
>
> with resize_inode feature ENABLED when extending to
> size > 1000*(originalSize) the errors I obtain are:
> 
> Resize_inode not enabled, but resize inode is non-zero
> Clear<y>
> ...
> Block bitmap differences: -57 Fix<y>
> 
> Free blocks count wrong for group #0 (6146,
> counted=6147) Fix<y>
> 
> Free blocks count wrong (2194393, counted=2194394)
> Fix<y>

Oh, OK.  Resize2fs isn't clearing the left-over resize inode after
it's used all of the reserved blocks.   That should be fixed.

					- Ted



From jjletho67-3txe at yahoo.it  Tue Jun 21 18:58:59 2005
From: jjletho67-3txe at yahoo.it (jjletho67-3txe at yahoo.it)
Date: Tue, 21 Jun 2005 20:58:59 +0200 (CEST)
Subject: ext3 offline resizing
In-Reply-To: <20050621135024.GD13207@thunk.org>
Message-ID: <20050621185859.49037.qmail@web25603.mail.ukl.yahoo.com>

Hi,

--- Theodore Ts'o <tytso at mit.edu> wrote: 

> 
> Oh, OK.  Resize2fs isn't clearing the left-over
> resize inode after
> it's used all of the reserved blocks.   That should
> be fixed.
> 
> 					- Ted

OK, now it is much clearer. In your opinion, at this
moment (without waiting for a fix), what is the better
solution for an ext3-based system which will often
need to resize its partitions (offline with resize2fs)?
Is disabling the resize_inode feature safer? Or is it
better to use the resize_inode feature and choose a
better initial size?
Is the "1000*(original size)" limit I guessed correct?
I'm sorry but I wasn't able to find any in-depth
documentation about resize_inode.

Thank you very much!

letho


	

	
		



From tytso at mit.edu  Tue Jun 21 22:20:31 2005
From: tytso at mit.edu (Theodore Ts'o)
Date: Tue, 21 Jun 2005 18:20:31 -0400
Subject: ext3 offline resizing
In-Reply-To: <20050621185859.49037.qmail@web25603.mail.ukl.yahoo.com>
References: <20050621135024.GD13207@thunk.org>
	<20050621185859.49037.qmail@web25603.mail.ukl.yahoo.com>
Message-ID: <20050621222031.GA17224@thunk.org>

On Tue, Jun 21, 2005 at 08:58:59PM +0200, jjletho67-3txe at yahoo.it wrote:
> 
> OK, now it is much clearer. In your opinion, at this
> moment (without waiting for a fix), what is the better
> solution for an ext3-based system which will often
> need to resize its partitions (offline with resize2fs)?
> Is disabling the resize_inode feature safer? Or is it
> better to use the resize_inode feature and choose a
> better initial size?
> Is the "1000*(original size)" limit I guessed correct?
> I'm sorry but I wasn't able to find any in-depth
> documentation about resize_inode.

It should be better to use the resize inode.  The filesystem
inconsistency reported by e2fsck is just e2fsck being very nit-picky;
there is no danger of losing data as a result of this.  

If you use the resize inode, it will allow you to do on-line resizes
up to 1000*original_size by default; this figure however can be
adjusted by "mke2fs -E resize=" option --- see the mke2fs man page for
more details.  If the resize inode is present, off-line resizes will use
those reserved blocks to allow for fast resizing that doesn't require
moving data blocks belonging to files, directories, or the inode table
around.  It's not a limit, though; if you try to resize a filesystem
bigger than 1000*original size (or whatever on-line resizing limit you
specified to mke2fs), resize2fs will still work, but it may have to
move filesystem data blocks around in order to accomplish the resize.
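
As a worked example of the default factor, here is the arithmetic
applied to the sizes reported earlier in this thread.  This is only a
sketch: the function name is made up, and reading "M" and "G" as MiB and
GiB is an assumption.

```python
DEFAULT_GROWTH_FACTOR = 1000  # default factor described in the text above


def within_reserved_growth(original_mib, new_mib, factor=DEFAULT_GROWTH_FACTOR):
    """True if the new size is within factor * original size."""
    return new_mib <= original_mib * factor


print(within_reserved_growth(2, 2.1 * 1024))    # 2M -> 2.1G:  False
print(within_reserved_growth(50, 110 * 1024))   # 50M -> 110G: False
print(within_reserved_growth(50, 40 * 1024))    # 50M -> 40G:  True
```

Both reported test cases go past the reserved range, which is
consistent with the resize inode being used up in those runs.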

The cost of the online resize inode is that you have to pay a slight
penalty upfront in reserved blocks; but given the size of modern
disks, the overhead isn't particularly great.  

						- Ted



From mvolaski at aecom.yu.edu  Sun Jun 26 05:04:08 2005
From: mvolaski at aecom.yu.edu (Maurice Volaski)
Date: Sun, 26 Jun 2005 01:04:08 -0400
Subject: [Q] Is errors=panic safe to use, and will it detect a RAID gone
 psycho?
Message-ID: <a06230913bee365817415@[129.98.90.227]>

In years past I have seen hardware (SCSI) RAID controllers lose 
it electronically, causing the kernel to fill the logs with scary SCSI 
messages and ext3 to complain about "holes" in the filesystem like so:

Sep  7 14:47:17 thewarehouse1 kernel: EXT3-fs error (device 
sd(8,81)): ext3_readdir: directory #376833 contains a hole at offset 0

I'm using drbd and heartbeat so whatever gets written to the hardware 
RAID gets written independently to a second RAID on a second 
computer. It would be nice if, in the unlikely event that the hardware 
failed and caused an error such as the one above, the error would 
trigger the computer to fail entirely and force heartbeat/drbd to 
kick in on the second computer.

If I set the error behavior with tune2fs to panic, would this happen? 
That is, is this the type of error that would trigger a panic? Are 
there minor errors that could unnecessarily trigger one?
-- 

Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University



From yvanoers at xs4all.nl  Sun Jun 26 13:30:42 2005
From: yvanoers at xs4all.nl (Yuri van Oers)
Date: Sun, 26 Jun 2005 15:30:42 +0200 (CEST)
Subject: Assertion failure in do_get_write_access()
Message-ID: <20050626151224.L25249-100000@xs3.xs4all.nl>


Hi,

I just had my server cry this out to the console:

Assertion failure in do_get_write_access() at transaction.c:658:
"jh->b_transaction == journal->j_committing_transaction"
kernel BUG at transaction.c:658!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c015e1f6>]    Not tainted
EFLAGS: 00010286
eax: 0000007d   ebx: c2ff4200   ecx: c243e000   edx: c068af00
esi: c0d6d900   edi: c2ff4200   ebp: c03f4190   esp: c243fd44
ds: 0018   es: 0018   ss: 0018
Process gzip (pid: 28630, stackpage=c243f000)
Stack: c0273660 c027385b c0273640 00000292 c0273920 c2ff4200 c18dd660 c03f4190
       c2ff4294 00000001 c0155dd6 00000000 00000000 00000000 c2f0f740 00000000
       c243fdc4 c243fdc4 c2ff4200 c015e4c8 c18dd660 c03f4190 00000000 00000006
Call Trace:    [<c0155dd6>] [<c015e4c8>] [<c0156337>] [<c015668d>] [<c0156788>]
  [<c01319d2>] [<c013226e>] [<c015672c>] [<c0156bc5>] [<c015672c>] [<c0125685>]
  [<c0125ae3>] [<c0117856>] [<c0154906>] [<c012f766>] [<c0106b63>]

Code: 0f 0b 92 02 40 36 27 c0 83 c4 14 8b 45 08 83 f8 06 0f 85 85


The version of Linux is 2.4.28. It took a hardware reset to get
/dev/sda2 (the partition with the fs I suspect was aching) back online. At
the time of the error I was tarring jpegs to that partition.

I could not find any references to this sort of bug being encountered or
fixed after 2.4.28, so I felt obliged to report it.


Is it a known bug? If so, has it been fixed? If not, can it be fixed & do
you need my help or any other info?


Regards,
Yuri



From hahaha_30k at yahoo.com  Tue Jun 28 22:01:51 2005
From: hahaha_30k at yahoo.com (ha haha)
Date: Tue, 28 Jun 2005 15:01:51 -0700 (PDT)
Subject: How to figure out underlying failed disk(parttions) and sector(s)
	position ???
Message-ID: <20050628220151.27698.qmail@web30202.mail.mud.yahoo.com>

Hi,

Having been exposed to more and more reports of failed hard disks,
I've accumulated several questions about the messages logged in
/var/log/messages: how to identify the failed disk (partition),
where exactly the failed sector(s) are on the hard disk, and why
badblocks reports OK for a disk with reported failures.

Let me explain the above with the following example.

Scenario #1: a PATA hard disk failed.

Host:		host1
.....
Jun 21 16:55:09 host1 kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 21 16:55:09 host1 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=234304234, high=13, low=16200426, sector=196487120
Jun 21 16:55:09 host1 kernel: end_request: I/O error, dev 03:0b (hda), sector 196487120
.....

Case #1 shows that /dev/hda failed in a simple way, but
some things are not obvious:

Q1: which failed partition? 

"dev 03:0b" looks like /dev/hda11, but I could not find
a document that explains it quickly; /dev/hda11 is just
a guess based on the following. Maybe "/dev/hda11" would
be a better string to log, if my guess is right?

root at host1# ls -alF /dev/hda11
brw-rw----  1 root disk 3, 11 Sep 15  2003 /dev/hda11
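
The guess above can be checked mechanically. The sketch below
assumes that the "dev 03:0b" field is a hexadecimal major:minor
pair and that the classic IDE numbering applies (major 3 is the
first IDE channel, 64 minors per drive, minor 0 of each drive is
the whole disk). The function names are made up for illustration.

```python
def decode_dev(dev):
    """Split a 'MM:mm' hex pair from the log into (major, minor)."""
    major_hex, minor_hex = dev.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


def ide_name(major, minor):
    """Map (major, minor) to a device name.

    Assumption: classic IDE numbering on major 3 only
    (minors 0-63 belong to hda, 64-127 to hdb).
    """
    if major != 3 or not 0 <= minor < 128:
        raise ValueError("this sketch only handles major 3, minors 0-127")
    drive = "hda" if minor < 64 else "hdb"
    part = minor % 64
    return "/dev/%s%d" % (drive, part) if part else "/dev/%s" % drive


major, minor = decode_dev("03:0b")
print(major, minor, ide_name(major, minor))  # 3 11 /dev/hda11
```

Under those assumptions the result agrees with the "3, 11" shown
by ls for /dev/hda11.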

Q2: which sector failed?

It looks like it is sector 234304234 in LBA mode,
relative to the whole disk (sector 0 of /dev/hda), while
it is sector 196487120 relative to the partition (sector
0 of /dev/hda11, which may be gigabytes away from the
beginning of the underlying disk). Is this a reasonable
guess?
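
If that guess holds, the difference between the two logged
numbers should be close to the starting sector of /dev/hda11.
The sketch below only does the subtraction; the 512-byte sector
size is an assumption, and the result would still have to be
compared by hand against the partition table (for example the
start column of "fdisk -l -u /dev/hda"). The match need not be
exact if the failing sector lies a few sectors into the request.

```python
LBA_SECT = 234304234   # "LBAsect=" from the dma_intr line
REQ_SECT = 196487120   # "sector=" from dma_intr and end_request
SECTOR_BYTES = 512     # assumed sector size


def implied_offset(lba_sect, req_sect):
    """Offset between the two numbers, in sectors."""
    return lba_sect - req_sect


offset = implied_offset(LBA_SECT, REQ_SECT)
print(offset)                                   # 37817114
print(round(offset * SECTOR_BYTES / 2**30, 2))  # offset in GiB
```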

Q3: what does "high=13, low=16200426" mean?
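
One observation that may help here, purely as arithmetic on the
logged values: treating "high" as the bits above a 24-bit "low"
field reproduces the LBAsect number exactly. That suggests, but
does not prove, that the two values are the upper and lower
parts of the same sector address. The function name is
hypothetical.

```python
def combine(high, low):
    """Assumption: 'low' holds the lowest 24 bits, 'high' the bits above."""
    return (high << 24) | low


print(combine(13, 16200426))               # 234304234
print(combine(13, 16200426) == 234304234)  # True, equals LBAsect
```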

Q4: Does the Linux kernel disk driver try to relocate
the failed sector (remapping accesses to some other good
sector) before failing and logging errors to
/var/log/messages?

Q5: "badblocks -s -v -n ..." sometimes cannot find any
disk problems even though disk I/O errors were reported
in /var/log/messages. What does that mean?

Thanks a lot...





		


