From doseyg at r-networks.net  Sun Jul  1 00:59:21 2007
From: doseyg at r-networks.net (Glen Dosey)
Date: Sat, 30 Jun 2007 20:59:21 -0400
Subject: poor read performance
In-Reply-To: <4685E79E.3060709@fnal.gov>
References: <1183159302.30971.37.camel@localhost.localdomain>
	<4685E79E.3060709@fnal.gov>
Message-ID: <1183251561.8003.11.camel@eclipse.office.r-networks.net>

Ling,

Wow, thanks!

That made an enormous difference. Now I'd like to understand why. I had
initially tried read-ahead settings of 256, 512 and 1024 blocks in the OS
with no difference in performance, and went no further. 8192 made a huge
difference, but 16384 brings the performance back to the same level as 256.
Why is 8192 such a magic number? The RAID enclosure itself also has a
read-ahead setting, which I've tried at 512K, 1M and 2M, also with no
difference.

I also want to make sure I understand how read-ahead works in my setup. If
I request some data A, the OS will request 8192 blocks (4MB) past the end of
data A. The controller will then see the OS request for A + 4MB and read an
additional 2MB past that, so the disks have read 6MB beyond the end of what
was initially requested, with 2MB in the controller cache, 4MB in the OS
cache, and data A passed to the application. Is this correct?
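
The arithmetic in the question above can be checked with a minimal sketch.
It assumes that `blockdev --setra` counts 512-byte sectors (as the blockdev
man page states) and that "4G" in the dd tests means 4 GiB; the function
names are illustrative only. It says nothing about whether the controller
really stacks its own read-ahead on top of the OS request.

```python
SECTOR_BYTES = 512  # blockdev --setra / --getra values are 512-byte sectors


def readahead_bytes(setra_sectors: int) -> int:
    """Convert a blockdev --setra value into bytes."""
    return setra_sectors * SECTOR_BYTES


def throughput_mib_s(gib: float, seconds: float) -> float:
    """Average throughput in MiB/s for a transfer of `gib` GiB."""
    return gib * 1024 / seconds


# Read-ahead sizes for the values tried in this thread.
for ra in (256, 512, 1024, 8192, 16384):
    print(f"--setra {ra:>5} -> {readahead_bytes(ra) / 2**20:g} MiB")

# Throughput implied by the dd timings quoted below (assuming 4 GiB).
print(f"raw device read : {throughput_mib_s(4, 32):.1f} MiB/s")
print(f"ext3 file read  : {throughput_mib_s(4, 48):.1f} MiB/s")
```

So a setting of 8192 does correspond to 4 MiB of read-ahead, and the 48
second ext3 read matches the "~90MB/s" ceiling reported in the original
message.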

Thanks,
Glen


On Sat, 2007-06-30 at 00:18 -0500, Ling C. Ho wrote:
> Hi,
> 
> Did you see any difference when a different block size is used (for 
> example, dd with bs=64k or 128k)? Also try changing the read-ahead cache: 
> blockdev --getra /dev/sdd shows the current value, and blockdev 
> --setra 8192 /dev/sdd changes it. 8192 is a good number that has been 
> working well for me for a similar-size setup.
> 
> ...
> ling
> 
> Glen Dosey wrote:
> > I am seeing what seems to be a notable limit on read performance of an
> > ext3 filesystem. If anyone could offer some insight it would be helpful.
> >
> > Background:
> > 12 x 500G SATA disks in a hardware RAID enclosure connected via 2Gb/s FC
> > to a 4 x 2.6 GHz system with 4GB RAM running RHEL4.5. Initially the
> > enclosure was configured as RAID5 10+1 parity, although I've also tried
> > RAID 50 and currently RAID 0. I've varied chunk sizes from 64-256K. 
> >
> > Problem:
> > No matter what I do I cannot get the ext3 read performance above
> > ~90MB/s. Under virtually every configuration listed above the write
> > performance is greater than the read performance. I've run a large
> > number of Bonnie++ and IOzone tests, but for the sake of simplicity in
> > this email I'll just refer to simple dd's with /dev/zero.
> >
> > Details:
> > Under the current RAID0 setup I see the following when dd'ing.
> >
> > DD 4G from /dev/zero to /dev/sdd disk (no filesystem) & sync: 28 seconds
> > DD 4G from /dev/sdd to /dev/null: 32 seconds
> > DD 4G to ext3 on /dev/sdd & sync: 32 seconds
> > DD 4G from ext3 file to /dev/null: 48 seconds
> >
> > I've been watching the port usage on the FC switch and it verifies what
> > I am seeing: writes max out near 2Gb/s, but reads hit some artificial
> > limit around 90 MB/s and never exceed it with the filesystem,
> > regardless of the underlying RAID configuration. Without a filesystem
> > the reads are at least 50% faster, and it can be seen on the FC switch
> > graphs as well.
> >
> > Any help or thoughts would be appreciated.
> >
> > Thanks,
> > ~Glen
> >
> >
> 


From jprats at cesca.es  Tue Jul  3 07:31:45 2007
From: jprats at cesca.es (Jordi Prats)
Date: Tue, 03 Jul 2007 09:31:45 +0200
Subject: journal corrupted on / filesystem
Message-ID: <4689FB61.5090401@cesca.es>

Hi,
I'm getting these errors on the / filesystem:

Jul  3 08:03:54 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 18448395
Jul  3 08:03:54 inf04 kernel: Aborting journal on device cciss/c0d0p2.
Jul  3 08:03:54 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 18448387
Jul  3 08:03:55 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 18448387
Jul  3 08:03:55 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 18448395
Jul  3 08:03:57 inf04 kernel: ext3_abort called.
Jul  3 08:03:57 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_journal_start_sb: Detected aborted journal
Jul  3 08:03:57 inf04 kernel: Remounting filesystem read-only
(...)
Jul  3 08:04:30 inf05 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 49464569
Jul  3 08:04:30 inf05 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 49464569
Jul  3 08:04:30 inf05 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 49464569
Jul  3 08:04:30 inf05 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 49464569

I suppose I should remove the journal (taking it back to ext2) and recreate 
it (tune2fs -j /dev/...). Is it possible to do this without rebooting? 
Turning the filesystem read-only is no problem (it is already in that mode). 
Which command should I use to achieve this?

Thanks!
Jordi


From tweeks at rackspace.com  Tue Jul  3 20:01:01 2007
From: tweeks at rackspace.com (tweeks)
Date: Tue, 3 Jul 2007 15:01:01 -0500
Subject: journal corrupted on / filesystem
In-Reply-To: <4689FB61.5090401@cesca.es>
References: <4689FB61.5090401@cesca.es>
Message-ID: <200707031501.02708.tweeks@rackspace.com>

On Tuesday 03 July 2007 02:31, Jordi Prats wrote:
> Hi,
> I'm getting these errors on the / filesystem:
[...]
> I suppose I should remove the journal (taking it back to ext2) and recreate
> it (tune2fs -j /dev/...). Is it possible to do this without rebooting?
> Turning the filesystem read-only is no problem (it is already in that mode).
> Which command should I use to achieve this?

Just remount as ext2:

# mount -t ext2 -o remount /

Tweeks




From adilger at clusterfs.com  Tue Jul  3 21:42:31 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 3 Jul 2007 15:42:31 -0600
Subject: journal corrupted on / filesystem
In-Reply-To: <200707031501.02708.tweeks@rackspace.com>
References: <4689FB61.5090401@cesca.es>
	<200707031501.02708.tweeks@rackspace.com>
Message-ID: <20070703214231.GI6578@schatzie.adilger.int>

On Jul 03, 2007  15:01 -0500, tweeks wrote:
> On Tuesday 03 July 2007 02:31, Jordi Prats wrote:
> > I'm getting these errors on the / filesystem:
> [...]
> > I suppose I should remove the journal (taking it back to ext2) and recreate
> > it (tune2fs -j /dev/...). Is it possible to do this without rebooting?
> > Turning the filesystem read-only is no problem (it is already in that mode).
> > Which command should I use to achieve this?
> 
> Just remount as ext2:
> 
> # mount -t ext2 -o remount /

Won't work.

You need to unmount the filesystem at least, at which point recreating
the journal with tune2fs is easy.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From ms419 at freezone.co.uk  Wed Jul  4 00:49:26 2007
From: ms419 at freezone.co.uk (Jack Bates)
Date: Tue, 03 Jul 2007 17:49:26 -0700
Subject: fopen extended attributes
Message-ID: <1183510166.4877.3.camel@ket.lat>

Are ext3 file system extended attributes mapped anywhere to file system
paths? e.g. /sys/fs/ext3/<path>/<attr-name>? I want to edit a file's
extended attribute using standard system calls, fopen, fread, fwrite,
etc. Is it possible?

Thanks, Jack

From agruen at suse.de  Wed Jul  4 09:15:50 2007
From: agruen at suse.de (Andreas Gruenbacher)
Date: Wed, 4 Jul 2007 11:15:50 +0200
Subject: fopen extended attributes
In-Reply-To: <1183510166.4877.3.camel@ket.lat>
References: <1183510166.4877.3.camel@ket.lat>
Message-ID: <200707041115.51164.agruen@suse.de>

On Wednesday 04 July 2007 02:49, Jack Bates wrote:
> Are ext3 file system extended attributes mapped anywhere to file system
> paths? e.g. /sys/fs/ext3/<path>/<attr-name>?

No, they're not.

> I want to edit a file's extended attribute using standard system calls,
> fopen, fread, fwrite, etc. Is it possible?

Sorry no. You can only access them using the *xattr syscalls.
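
For illustration, here is a minimal sketch of what going through the *xattr
syscalls looks like from Python, whose os.setxattr and os.getxattr functions
wrap setxattr(2) and getxattr(2) on Linux. The attribute name "user.comment"
and the helper name are made up for the example, and the call only succeeds
if the filesystem holding the temporary file supports user extended
attributes (on ext3 that may require the user_xattr mount option).

```python
import os
import tempfile


def xattr_roundtrip(path, name="user.comment", value=b"hello"):
    """Set one extended attribute and read it back.

    Returns the bytes read back, or None when the platform or the
    filesystem does not support user extended attributes.
    """
    if not hasattr(os, "setxattr"):  # the os.*xattr functions are Linux-only
        return None
    try:
        os.setxattr(path, name, value)   # wraps setxattr(2)
        return os.getxattr(path, name)   # wraps getxattr(2)
    except OSError:
        return None


with tempfile.NamedTemporaryFile() as tmp:
    result = xattr_roundtrip(tmp.name)
    print("read back:", result)
```

The value is passed and returned as bytes; there is no file-like handle for
an attribute, which is why fopen/fread/fwrite cannot be used here.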

Andreas


From jprats at cesca.es  Wed Jul  4 10:59:47 2007
From: jprats at cesca.es (Jordi Prats)
Date: Wed, 04 Jul 2007 12:59:47 +0200
Subject: journal corrupted on / filesystem
In-Reply-To: <20070703214231.GI6578@schatzie.adilger.int>
References: <4689FB61.5090401@cesca.es>	<200707031501.02708.tweeks@rackspace.com>
	<20070703214231.GI6578@schatzie.adilger.int>
Message-ID: <468B7DA3.9070403@cesca.es>

Hi,
Yes, it did not work on a live system; you must reboot after the 
procedure.

For the record, what I did was:

Mark the filesystem as having no journal (taking it back to ext2):

tune2fs -O ^has_journal /dev/cciss/c0d0p2

fsck it to delete the journal:

e2fsck /dev/cciss/c0d0p2

Create the journal (taking it back to ext3):

tune2fs -j /dev/cciss/c0d0p2

And finally, remount it. On a live system, just reboot.
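
The three steps above can be collected into a small dry-run sketch. It only
prints the commands, in the order given in this message, and does not run
them, since they are meant for a filesystem that is not mounted read-write.
The function name is hypothetical; the commands and flags are exactly the
ones listed above.

```python
import shlex


def journal_rebuild_commands(device):
    """Return the journal-rebuild commands from this message as argv lists."""
    return [
        ["tune2fs", "-O", "^has_journal", device],  # drop the journal (ext2)
        ["e2fsck", device],                         # check the filesystem
        ["tune2fs", "-j", device],                  # create a new journal (ext3)
    ]


# Dry run: print shell-quoted commands instead of executing them.
for argv in journal_rebuild_commands("/dev/cciss/c0d0p2"):
    print(shlex.join(argv))
```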

Thank you all,
Jordi


-- 
......................................................................
         __
        / /          Jordi Prats
  C E / S / C A      Dept. de Sistemes
      /_/            Centre de Supercomputació de Catalunya

  Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
  T. 93 205 6464 · F.  93 205 6979 · jprats at cesca.es
...................................................................... 


From ramesh25 at gmail.com  Tue Jul 10 17:28:49 2007
From: ramesh25 at gmail.com (Ramesh Natarajan)
Date: Tue, 10 Jul 2007 12:28:49 -0500
Subject: Ext3 fsck questions
Message-ID: <11e67100707101028k19baca1bt39131edf8d3e8deb@mail.gmail.com>

Hi,

  I currently have a RAID disk configured to appear as
3 ext3 disks (/dev/sda, /dev/sdb and /dev/sdc).

The disks are initially formatted using

mkfs.ext3 /dev/sda

and mounted as follows

mount -t ext3 -o data=ordered -o commit=1 /dev/sda /mnt/san

According to the man page and other mailing list archives, I must run
fsck periodically (by default after 38 mounts or 6 months) to ensure the
filesystem is clean.

Is this still valid if I mount using the following options?

-o data=ordered -o commit=1


Thanks
Ramesh

From jprats at cesca.es  Tue Jul 10 20:50:18 2007
From: jprats at cesca.es (Jordi Prats)
Date: Tue, 10 Jul 2007 22:50:18 +0200
Subject: Get journal position
Message-ID: <4693F10A.5070201@cesca.es>

Hi,
Is there any way to find out where the journal is physically located on an
ext3 fs, and its size?

Thanks!
Jordi


From adilger at clusterfs.com  Wed Jul 11 12:27:45 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Wed, 11 Jul 2007 06:27:45 -0600
Subject: Get journal position
In-Reply-To: <4693F10A.5070201@cesca.es>
References: <4693F10A.5070201@cesca.es>
Message-ID: <20070711122745.GY6417@schatzie.adilger.int>

On Jul 10, 2007  22:50 +0200, Jordi Prats wrote:
> Is there any way to find out where the journal is physically located on an
> ext3 fs, and its size?

debugfs -c -R "stat <8>" {dev}

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From jprats at cesca.es  Wed Jul 11 12:55:15 2007
From: jprats at cesca.es (Jordi Prats)
Date: Wed, 11 Jul 2007 14:55:15 +0200
Subject: Get journal position
In-Reply-To: <20070711122745.GY6417@schatzie.adilger.int>
References: <4693F10A.5070201@cesca.es>
	<20070711122745.GY6417@schatzie.adilger.int>
Message-ID: <4694D333.9090005@cesca.es>

Hi,
But in the output below, what do the numbers mean? Is 2055 a byte count, 
a sector (512-byte) count, or a block (4K) count?

Thanks!

debugfs 1.39 (29-May-2006)
/dev/cciss/c0d0p1: catastrophic mode - not reading inode or group bitmaps
Inode: 8   Type: regular    Mode:  0600   Flags: 0x0   Generation: 0
User:     0   Group:     0   Size: 134217728
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 262416
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x466d1870 -- Mon Jun 11 11:40:00 2007
atime: 0x00000000 -- Thu Jan  1 01:00:00 1970
mtime: 0x466d1870 -- Mon Jun 11 11:40:00 2007
BLOCKS:
(0-11):2055-2066, (IND):2067, (12-1035):2068-3091, (DIND):3092, 
(IND):3093, (1036-2059):3094-4117, (IND):4118, (2060-3083):4119-5142, 
(IND):5143, (3084-4107):5144-6167, (IND):6168, (4108-5131):6169-7192, 
(IND):7193, (5132-6155):7194-8217, (IND):8218, (6156-7179):8219-9242, 
(IND):9243, (7180-8203):9244-10267, (IND):10268, 
(8204-9227):10269-11292, (IND):11293, (9228-10251):11294-12317, 
(IND):12318, (10252-11275):12319-13342, (IND):13343, 
(11276-12299):13344-14367, (IND):14368, (12300-13323):14369-15392, 
(IND):15393, (13324-14347):15394-16417, (IND):16418, 
(14348-15371):16419-17442, (IND):17443, (15372-16395):17444-18467, 
(IND):18468, (16396-17419):18469-19492, (IND):19493, 
(17420-18443):19494-20517, (IND):20518, (18444-19467):20519-21542, 
(IND):21543, (19468-20491):21544-22567, (IND):22568, 
(20492-21515):22569-23592, (IND):23593, (21516-22539):23594-24617, 
(IND):24618, (22540-23563):24619-25642, (IND):25643, 
(23564-24587):25644-26667, (IND):26668, (24588-25611):26669-27692, 
(IND):27693, (25612-26635):27694-28717, (IND):28718, 
(26636-27659):28719-29742, (IND):29743, (27660-28683):29744-30767, 
(IND):30768, (28684-29707):30769-31792, (IND):31793, 
(29708-30673):31794-32759, (30674-30731):34809-34866, (IND):34867, 
(30732-31755):34868-35891, (IND):35892, (31756-32768):35893-36905
TOTAL: 32802
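
One way to work out the units from the output itself is sketched below. It
assumes that debugfs reports Blockcount in 512-byte sectors and that TOTAL
and the BLOCKS list are in filesystem blocks; the excerpt is copied from the
output above and the helper name is illustrative.

```python
import re

# A few lines copied from the debugfs output above.
STAT_EXCERPT = """\
User:     0   Group:     0   Size: 134217728
Links: 1   Blockcount: 262416
(0-11):2055-2066, (IND):2067, (12-1035):2068-3091, (DIND):3092,
TOTAL: 32802
"""


def field(name, text):
    """Return the first integer following 'name:' in the text."""
    return int(re.search(rf"{name}:\s*(\d+)", text).group(1))


size_bytes = field("Size", STAT_EXCERPT)
sectors = field("Blockcount", STAT_EXCERPT)
total_blocks = field("TOTAL", STAT_EXCERPT)

# Assumption: Blockcount is in 512-byte sectors, TOTAL in filesystem blocks.
block_size = sectors * 512 // total_blocks
first_block = int(re.search(r"\(0-11\):(\d+)-", STAT_EXCERPT).group(1))

print("implied block size:", block_size)
print("journal starts at byte offset:", first_block * block_size)
print("journal size in blocks:", size_bytes // block_size)
```

Under those assumptions the numbers are self-consistent with a 4K block
size, which suggests that 2055 is a filesystem block number rather than a
byte or sector count.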




From ulf at atc-onlane.com  Sat Jul 14 09:45:11 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sat, 14 Jul 2007 02:45:11 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>

This may or may not be ext3 related, but I am trying to find any pointers
which might help me. I have a number of HP ProLiant DL380 g5 servers with a
P400 controller and also two qla2400 cards. The OS is RedHat EL4 U5 x86_64.

Every time during reboot these systems panic after the last umount and, I
believe, before the cciss driver is unloaded. The last messages I am able
to see are:

md: stopping all md devices.
md: md0 switched to read-only mode.

After this I get the panic. On other systems (DL380 g4 with 6i controller)
I believe the following messages come next:

cciss: stopping all cciss devices.
cciss: removing controller 0

The last 3 lines of the panic are:

Code: 0f 0b 5d 93 12 a0 ff ff ff ff 7d 01 0f b7 5d 02 85 db 74 08
RIP <ffffffffa011d0f4>{:ext3:dx_probe+427} RSP <00000102259338e8>
 <0>Kernel panic - not syncing: Oops

I cannot reproduce this problem on a DL380 g4 with a 6i controller. I tried
both the cciss driver included in EL4 Update 5 and the one provided by HP,
with no difference.

Any tips on what to look at would be appreciated.

Regards, Ulf.

---------------------------------------------------------------------
ATC-Onlane Inc., T: 650-532-6382, F: 650-532-6441
4600 Bohannon Drive, Suite 100, Menlo Park, CA 94025
---------------------------------------------------------------------


From ulf at atc-onlane.com  Sat Jul 14 20:10:45 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sat, 14 Jul 2007 13:10:45 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>

I have been able to reproduce it on yet another system, and here I was able
to catch the top of the panic: "dx_probe: Unrecognised inode hash
code 28" on cciss/c0d0p6, which is the / file system.


From ulf at atc-onlane.com  Sat Jul 14 20:42:27 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sat, 14 Jul 2007 13:42:27 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com>

OK, found more information. EL4 sets dir_index for / (cciss/c0d0p6 as we
are installing it). The RedHat-provided cciss driver (2.6.14-RH2) has no
problem with that; the latest cciss driver from HP, 2.6.16-6, does.
After turning off dir_index for / and forcing an fsck during reboot,
everything is fine.

Ulf.


From lists at nerdbynature.de  Sun Jul 15 01:56:55 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Sun, 15 Jul 2007 03:56:55 +0200 (CEST)
Subject: Ext3 fsck questions
In-Reply-To: <11e67100707101028k19baca1bt39131edf8d3e8deb@mail.gmail.com>
References: <11e67100707101028k19baca1bt39131edf8d3e8deb@mail.gmail.com>
Message-ID: <alpine.DEB.0.99.0707150347450.18512@sheep.housecafe.de>

On Tue, 10 Jul 2007, Ramesh Natarajan wrote:
> and mounted as follows
> mount -t ext3 -o data=ordered -o commit=1 /dev/sda /mnt/san

data=ordered seems to be the default anyway.

> From what I read from the man page and other maillist archives I must run
> fsck periodically  ( default after 38 mounts or 6 months) to ensure the
> filesystem is clean.
> Is this still valid if I mount using the following options?

It's recommended to run e2fsck once in a while (otherwise there would be 
no need for the 'max-mount-count' and 'interval-between-checks' 
tunables). But since these are tunables, you can of course turn them off.
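
As a rough illustration of how the two tunables interact, here is a minimal
sketch of the "is a periodic check due?" rule. It is a simplified model, not
e2fsck's actual code; the function and parameter names are made up for the
example. A zero limit disables the corresponding test, which is what
`tune2fs -c 0` and `tune2fs -i 0` do.

```python
from datetime import datetime, timedelta


def periodic_check_due(mount_count, max_mount_count, last_check, interval, now):
    """Return True if either periodic-check limit has been reached.

    A max_mount_count <= 0 or an interval of zero disables that test.
    """
    by_mounts = max_mount_count > 0 and mount_count >= max_mount_count
    by_time = interval > timedelta(0) and now - last_check >= interval
    return by_mounts or by_time


now = datetime(2007, 7, 15)
six_months = timedelta(days=180)  # "6 months" approximated as 180 days

# 38th mount reached, last check one month ago: due because of the count.
print(periodic_check_due(38, 38, datetime(2007, 6, 15), six_months, now))
# Few mounts, but last check over 180 days ago: due because of the interval.
print(periodic_check_due(3, 38, datetime(2007, 1, 1), six_months, now))
# Both limits turned off: never due.
print(periodic_check_due(100, 0, datetime(2000, 1, 1), timedelta(0), now))
```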

Really, there is no definite answer here. I for one run e2fsck once in 
a while and see it more as a data point ("fs was OK on 2007-07-15") or 
as a mere sanity check :)

C.
-- 
BOFH excuse #146:

Communications satellite used by the military for star wars.


From lists at nerdbynature.de  Sun Jul 15 02:04:22 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Sun, 15 Jul 2007 04:04:22 +0200 (CEST)
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com>
Message-ID: <alpine.DEB.0.99.0707150400310.18512@sheep.housecafe.de>

On Sat, 14 Jul 2007, Ulf Zimmermann wrote:
>>> believe before the cciss driver is getting unloaded. The last
>>> messages I am able to see are:
>>>
>>> md: stopping all md devices.
>>> md: md0 switched to read-only mode.

I think these messages are the real cause of the ext3 errors.

> OK, found more information. EL4 sets dir_index for / (cciss/c0d0p6 as we
> are installing it). The RedHat-provided cciss driver (2.6.14-RH2) has no
> problem with that; the latest cciss driver from HP, 2.6.16-6, does.
> After turning off dir_index for / and forcing an fsck during reboot,
> everything is fine.

A device driver should not care about filesystem features, IMHO. Either 
there are problems with the cciss driver (syslog messages please) or the 
ext3 fs is corrupted - in which case fsck should be run.

C.
-- 
BOFH excuse #115:

your keyboard's space bar is generating spurious keycodes.


From ulf at atc-onlane.com  Sun Jul 15 03:32:41 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sat, 14 Jul 2007 20:32:41 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <alpine.DEB.0.99.0707150400310.18512@sheep.housecafe.de>
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com>
	<alpine.DEB.0.99.0707150400310.18512@sheep.housecafe.de>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com>

I can reproduce this on 8+ servers, 6 of which were installed only
yesterday afternoon. Running "tune2fs -O ^dir_index /dev/cciss/c0d0p6"
followed by "touch /forcefsck && reboot" leads to no panics at reboot
time.

I have reported this to HP for now.

Ulf.


From walker at stsci.edu  Tue Jul 17 18:07:38 2007
From: walker at stsci.edu (Thomas Walker)
Date: Tue, 17 Jul 2007 14:07:38 -0400
Subject: large ext3 filesystem consistantly locking itself read-only
Message-ID: <469D056A.9020504@stsci.edu>


   We have several large ext3 file system partitions.  One of them sets 
itself read-only after hitting journal problems.  I understand that's 
a good thing, but obviously I need to correct the problem so that it 
will stop locking itself.  Here are some details:

OS is Redhat EL4 x86_64 running on a SunFire v40z; the kernel is 
2.6.9-42.0.2.ELsmp.  The disk storage in question is external, via fiber 
cable.  The fiber HBA is a Qlogic ISP2312 connected to a Qlogic SAN 
switch connected to four Apple Xserve RAIDs.  There are 8 individual 
LUNs coming from the four XRaids; they appear on the host as 
/dev/sd[cdefghij].  Those LUNs are put into two LVM volume groups and 
then mounted from logical volumes.

   The partition in question is 8TB, about 92% full at the moment.  One 
oddity about this partition is that it has a subdirectory containing over 
2700 symbolic links to other partitions.  Here is the output from 
/var/adm/messages the last time the file system locked itself:

Jul 17 09:01:06  kernel: Info fld=0x0, Current sdd: sense key No Sense
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856796
Jul 17 09:01:06  kernel: Aborting journal on device dm-3.
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3) in start_transaction: Readonly filesystem
Jul 17 09:01:06  kernel: Aborting journal on device dm-3.
Jul 17 09:01:06  kernel: ext3_abort called.
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3): ext3_journal_start_sb: Detected aborted journal
Jul 17 09:01:06  kernel: Remounting filesystem read-only
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3) in start_transaction: Journal has aborted
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856797
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856798
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856799
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856800
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3) in ext3_truncate: Journal has aborted
Jul 17 09:01:07  kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
Jul 17 09:01:07  kernel: EXT3-fs error (device dm-3) in ext3_orphan_del: Journal has aborted
Jul 17 09:01:07  kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
Jul 17 09:01:07  kernel: EXT3-fs error (device dm-3) in ext3_delete_inode: Journal has aborted
Jul 17 09:01:07  kernel: __journal_remove_journal_head: freeing b_committed_data

   If I run fsck it does seem to repair bad blocks and clears inodes but 
of course for 8TB it takes a long time to run and the corruption only 
comes back later.

   I have considered upgrading the kernel; it could be done.  I think 
part of the problem is the large number of symbolic links on that 
partition, but without evidence it will be difficult to get people to 
change it.  I also don't like the first line in the messages, about 
device sdd getting a "No Sense" response to a SCSI sense key request.

   Any good advice on how to proceed would be appreciated.  I have 
looked at the dumpe2fs and debugfs tools but I don't see how to put them 
to good use in this case.

   Thomas Walker


From lists at nerdbynature.de  Tue Jul 17 23:10:04 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Wed, 18 Jul 2007 01:10:04 +0200 (CEST)
Subject: large ext3 filesystem consistantly locking itself read-only
In-Reply-To: <469D056A.9020504@stsci.edu>
References: <469D056A.9020504@stsci.edu>
Message-ID: <alpine.DEB.0.99.0707180101320.11743@sheep.housecafe.de>

On Tue, 17 Jul 2007, Thomas Walker wrote:
> Jul 17 09:01:06  kernel: Info fld=0x0, Current sdd: sense key No Sense
> Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: 
> bit already cleared for block 786856796

...the rest of the errors seem to stem from ext3, but what about the 
"sdd: sense key: .." message above? Are there more device related 
messages? If so, this could be the cause for ext3 to barf.

>  If I run fsck it does seem to repair bad blocks

as in "bad blocks on disk"? No good then. Try to rule out device errors 
first (HBA driver, cabling, cooling, etc.) before doing more e2fsck 
work...
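
One way to check whether device-level messages accompany the ext3 errors is to sort the kernel log lines into two piles. The following is only a sketch: the two regular expressions are assumptions about what the relevant lines look like, based on the two lines quoted above, and would need adjusting for a real syslog.

```python
import re

# Assumed patterns, modelled on the two log lines quoted in this thread:
# device-level lines name an sdX device together with a sense key or I/O
# error; filesystem-level lines carry the "EXT3-fs error/warning" prefix.
DEVICE_RE = re.compile(r"\bsd[a-z]+\b.*(sense key|I/O error)", re.IGNORECASE)
EXT3_RE = re.compile(r"EXT3-fs (error|warning)")

def classify(lines):
    """Split kernel log lines into device-level and ext3-level messages."""
    device, ext3 = [], []
    for line in lines:
        if DEVICE_RE.search(line):
            device.append(line)
        elif EXT3_RE.search(line):
            ext3.append(line)
    return device, ext3

sample = [
    "Jul 17 09:01:06  kernel: Info fld=0x0, Current sdd: sense key No Sense",
    "Jul 17 09:01:06  kernel: EXT3-fs error (device dm-3): "
    "ext3_free_blocks_sb: bit already cleared for block 786856796",
]
device, ext3 = classify(sample)
print(len(device), "device-level;", len(ext3), "ext3-level")
```

If the device-level pile is non-empty and its timestamps precede the ext3 errors, that points at the hardware path rather than the filesystem.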

-- 
BOFH excuse #191:

Just type 'mv * /dev/null'.


From alex at alex.org.uk  Wed Jul 18 10:01:48 2007
From: alex at alex.org.uk (Alex Bligh)
Date: Wed, 18 Jul 2007 11:01:48 +0100
Subject: large ext3 filesystem consistantly locking itself read-only
In-Reply-To: <alpine.DEB.0.99.0707180101320.11743@sheep.housecafe.de>
References: <469D056A.9020504@stsci.edu>
	<alpine.DEB.0.99.0707180101320.11743@sheep.housecafe.de>
Message-ID: <B91E6D1E970A9579731C9B49@[192.168.100.25]>

--On 18 July 2007 01:10 +0200 Christian Kujau <lists at nerdbynature.de> wrote:

> BOFH excuse #191: Just type 'mv * /dev/null'.

(OT apols). This doesn't do what you might expect. You will get write
permission denied as non-root...

Alex


From lists at nerdbynature.de  Wed Jul 18 11:31:57 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Wed, 18 Jul 2007 13:31:57 +0200 (CEST)
Subject: [OT] Re: large ext3 filesystem consistantly locking itself 
 read-only
In-Reply-To: <B91E6D1E970A9579731C9B49@[192.168.100.25]>
References: <469D056A.9020504@stsci.edu>
	<alpine.DEB.0.99.0707180101320.11743@sheep.housecafe.de>
	<B91E6D1E970A9579731C9B49@[192.168.100.25]>
Message-ID: <32882.62.180.231.196.1184758317.squirrel@housecafe.dyndns.org>


On Wed, July 18, 2007 12:01, Alex Bligh wrote:
>> BOFH excuse #191: Just type 'mv * /dev/null'.
>>
> (OT apols). This doesn't do what you might expect. You will get write
> permission denied as non-root...

Good catch :)

So, are you suggesting that I should open a bugreport for the
fortunes-bofh-excuses package?

SCNR,
C.
-- 
BOFH excuse #442:

Trojan horse ran out of hay


From wolf at CLEMSON.EDU  Fri Jul 20 20:37:53 2007
From: wolf at CLEMSON.EDU (Randy Martin)
Date: Fri, 20 Jul 2007 16:37:53 -0400
Subject: ext3 partition problems w/Apple Xserve RAID
Message-ID: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu>

I'm running Red Hat AS 4 on a Dell PowerEdge 1950.  It connects to an Apple
Xserve RAID via a Qlogic QLE2460 card.  I am able to create a 4TB ext3
partition with no problems and use it fine.  When the system power drops or
it's rebooted, the file system can't  be mounted again.  It looks like the
partition table is getting corrupted.  Here is some of the doc I gathered:

 
Output from fsck:

 
fsck /dev/sdb1
fsck 1.35 (28-Feb-2004)
e2fsck 1.35 (28-Feb-2004)
The filesystem size (according to the superblock) is 1098848000 blocks
The physical size of the device is 25106176 blocks
Either the superblock or the partition table is likely to be corrupt!

 
----------------------------------------------------------------------------

Output from parted:

 
parted /dev/sdb
GNU Parted 1.6.19
Copyright (C) 1998 - 2004 Free Software Foundation, Inc.
This program is free software, covered by the GNU General Public License.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
details.

Using /dev/sdb
(parted) print
Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
1          0.031  98071.031  primary   ext3        
(parted) check 1
Warning: Partition 1 is 98071.000Mb, but the file system is 4292375.000Mb.

 
---------------------------------------------------------------------------

Output from debugfs:

 
debugfs /dev/sdb
debugfs 1.35 (28-Feb-2004)
/dev/sdb: Bad magic number in super-block while opening filesystem
debugfs:  open /dev/sdb1
/dev/sdb1: Can't read an inode bitmap while reading inode bitmap
debugfs:  quit

----------------------------------------------------------------------------

We connect via a Qlogic QLE2460 to the Apple XServe RAID:

 
qla2400 0000:0c:00.0:
 QLogic Fibre Channel HBA Driver: 8.01.07
  QLogic QLE2460 - PCI-Express to 4Gb FC, Single Channel
  ISP2432: PCIe (2.5Gb/s x4) @ 0000:0c:00.0 hdma+, host#=1, fw=4.00.26 [IP]
  Vendor: APPLE     Model: Xserve RAID       Rev: 1.51
  Type:   Direct-Access                      ANSI SCSI revision: 05
qla2400 0000:0c:00.0: scsi(1:0:0:0): Enabled tagged queuing, queue depth 32.
sdb : very big device. try to use READ CAPACITY(16).
SCSI device sdb: 8790786048 512-byte hdwr sectors (4500882 MB)
SCSI device sdb: drive cache: write back
sdb : very big device. try to use READ CAPACITY(16).
SCSI device sdb: 8790786048 512-byte hdwr sectors (4500882 MB)
SCSI device sdb: drive cache: write back
 sdb: sdb1
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0

 
----------------------------------------------------------------------------

Current output of parted after I remade the file system and restored the
data:

 
parted /dev/sdb1
GNU Parted 1.6.19
Copyright (C) 1998 - 2004 Free Software Foundation, Inc.
This program is free software, covered by the GNU General Public License.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.

Using /dev/sdb1
(parted) print
Disk geometry for /dev/sdb1: 0.000-4292375.967 megabytes
Disk label type: loop
Minor    Start       End     Filesystem  Flags
1          0.000 4292375.967  ext3

 
I'm afraid this will happen again the next time I reboot the system.  Any
ideas what might be causing it and how to fix it?

 
Thanks,

Randy

 

From jlforrest at berkeley.edu  Fri Jul 20 20:55:07 2007
From: jlforrest at berkeley.edu (Jon Forrest)
Date: Fri, 20 Jul 2007 13:55:07 -0700
Subject: ext3 partition problems w/Apple Xserve RAID
In-Reply-To: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu>
References: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu>
Message-ID: <46A1212B.7010501@berkeley.edu>

Randy Martin wrote:
> I'm running Red Hat AS 4 on a Dell PowerEdge 1950.  It connects to an 
> Apple Xserve RAID via a Qlogic QLE2460 card.  I am able to create a 4TB 
> ext3 partition with no problems and use it fine.  When the system power 
> drops or it's rebooted, the file system can't be mounted again.  It 
> looks like the partition table is getting corrupted.  Here is some of 
> the doc I gathered:

> Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes
> 
> Disk label type: msdos

Bingo. That's your problem. You have to use a gpt disk label
for partitions this large. I had the identical problem
and I was able to fix it without losing a single bit.
I described it in a posting to this group on 3/14/2007.
(I'll send it to you directly).
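
The numbers in the fsck and parted output are consistent with a 32-bit wrap-around. An msdos partition table stores partition start and length as 32-bit sector counts, so with 512-byte sectors it cannot describe more than 2 TiB. The sketch below assumes a 4096-byte ext3 block size (not stated in the thread, but it reproduces parted's figures) and checks that the filesystem size taken modulo 2^32 sectors matches the truncated size seen after the reboot.

```python
SECTOR_SIZE = 512        # bytes per sector, as reported in the kernel log
FS_BLOCK_SIZE = 4096     # assumed ext3 block size (not stated in the thread)
MIB = 1024 * 1024

fs_blocks = 1098848000   # filesystem size according to the superblock
seen_blocks = 25106176   # physical size e2fsck saw after the reboot

# Size of the filesystem in sectors, and what remains of that number
# once it is squeezed into a 32-bit field.
fs_sectors = fs_blocks * FS_BLOCK_SIZE // SECTOR_SIZE
wrapped_sectors = fs_sectors % 2**32

wrapped_mib = wrapped_sectors * SECTOR_SIZE / MIB
seen_mib = seen_blocks * FS_BLOCK_SIZE / MIB
limit_mib = 2**32 * SECTOR_SIZE / MIB

print(f"msdos label limit: {limit_mib} MiB")
print(f"wrapped size:      {wrapped_mib} MiB")
print(f"size seen by fsck: {seen_mib} MiB")
```

Both the wrapped size and the size fsck saw come out at 98071 MiB, the partition size parted printed, which supports the disk-label explanation.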

Cordially,
-- 
Jon Forrest
Unix Computing Support
College of Chemistry
173 Tan Hall
University of California Berkeley
Berkeley, CA
94720-1460
510-643-1032
jlforrest at berkeley.edu


From tambewilliam at gmail.com  Sat Jul 21 23:23:24 2007
From: tambewilliam at gmail.com (William Tambe)
Date: Sat, 21 Jul 2007 18:23:24 -0500
Subject: Please How do I calculate the offset of a file within a ext3
	partition
Message-ID: <46A2956C.1020100@gmail.com>

Hi,
I need to understand and to calculate the offset of the beginning of a 
file within my partition which uses an ext3 filesystem.

Can I use dumpe2fs to figure that out, if yes how?

Sincerely,
William Tambe


From duaneg at dghda.com  Mon Jul 23 00:16:03 2007
From: duaneg at dghda.com (Duane Griffin)
Date: Mon, 23 Jul 2007 01:16:03 +0100
Subject: Please How do I calculate the offset of a file within a ext3
	partition
In-Reply-To: <46A2956C.1020100@gmail.com>
References: <46A2956C.1020100@gmail.com>
Message-ID: <e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>

On 22/07/07, William Tambe <tambewilliam at gmail.com> wrote:
> I need to understand and to calculate the offset of the beginning of a
> file within my partition which uses an ext3 filesystem.
>
> Can I use dumpe2fs to figure that out, if yes how?

(Sorry for the duplicate William, forgot to reply to the list)

Not sure about dumpe2fs but you can use debugfs to do so. For example:

/sbin/debugfs <fs> -R "bmap /path/to/file 0"

Will give you the first physical block corresponding to logical block
0 of the file.

Cheers,
Duane.

-- 
"I never could learn to drink that blood and call it wine" - Bob Dylan


From tambewilliam at gmail.com  Mon Jul 23 01:07:31 2007
From: tambewilliam at gmail.com (William Tambe)
Date: Sun, 22 Jul 2007 20:07:31 -0500
Subject: Please How do I calculate the offset of a file within a ext3
 partition
In-Reply-To: <e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>
References: <46A2956C.1020100@gmail.com>
	<e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>
Message-ID: <46A3FF53.4000704@gmail.com>

Thank you for your response, but one more question: does this logical 
block 0 hold the header of the file? If not, where is the header of a 
file located in an ext3 filesystem?

The reason I need to know is that I wish to use swsusp on my 
swap-file, so I really need to know the location of the file's swap header.

Thank you for helping.

Sincerely,
William Tambe

Duane Griffin wrote:
> On 22/07/07, William Tambe <tambewilliam at gmail.com> wrote:
>> I need to understand and to calculate the offset of the beginning of a
>> file within my partition which uses an ext3 filesystem.
>>
>> Can I use dumpe2fs to figure that out, if yes how?
> 
> (Sorry for the duplicate William, forgot to reply to the list)
> 
> Not sure about dumpe2fs but you can use debugfs to do so. For example:
> 
> /sbin/debugfs <fs> -R "bmap /path/to/file 0"
> 
> Will give you the first physical block corresponding to logical block
> 0 of the file.
> 
> Cheers,
> Duane.
> 


From lists at nerdbynature.de  Mon Jul 23 08:20:49 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Mon, 23 Jul 2007 10:20:49 +0200 (CEST)
Subject: ext3 partition problems w/Apple Xserve RAID
In-Reply-To: <46A1212B.7010501@berkeley.edu>
References: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu>
	<46A1212B.7010501@berkeley.edu>
Message-ID: <31412.62.180.231.196.1185178849.squirrel@housecafe.dyndns.org>

On Fri, July 20, 2007 22:55, Jon Forrest wrote:
> fix it without losing a single bit. I described it in a posting to this
> group on 3/14/2007.

https://www.redhat.com/archives/ext3-users/2007-March/msg00023.html

nice tutorial :-)

-- 
BOFH excuse #442:

Trojan horse ran out of hay


From tytso at mit.edu  Mon Jul 23 14:59:03 2007
From: tytso at mit.edu (Theodore Tso)
Date: Mon, 23 Jul 2007 10:59:03 -0400
Subject: Please How do I calculate the offset of a file within a ext3
	partition
In-Reply-To: <46A3FF53.4000704@gmail.com>
References: <46A2956C.1020100@gmail.com>
	<e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>
	<46A3FF53.4000704@gmail.com>
Message-ID: <20070723145902.GF19927@thunk.org>

On Sun, Jul 22, 2007 at 08:07:31PM -0500, William Tambe wrote:
> Thank you for your response, but one more question, does this logical 
> block 0 hold the header of the file, if not where is located the header 
> of a file in a ext3 filesystem.
> 
> The reason why I need to know that is because I wish to use swsusp on my 
> swap-file so I really need to know the location of the file's swap header.

The swap header is located at the beginning of the file, so yes, that
would be found in block 0 of the file.

The bigger question is what are you *doing*?  If you're just trying to
enable swsusp, you shouldn't need to be using debugfs to find the block
number and then manually editing the swap header.  The swap file
should have been set up correctly before you started using it, or if
you want to initialize a new swap-file, you can use the mkswap
command.  If you're needing to manually edit the swap header, you're
almost certainly doing something wrong....

						- Ted


From tambewilliam at gmail.com  Mon Jul 23 19:17:40 2007
From: tambewilliam at gmail.com (William Tambe)
Date: Mon, 23 Jul 2007 14:17:40 -0500
Subject: Please How do I calculate the offset of a file within a ext3
 partition
In-Reply-To: <20070723145902.GF19927@thunk.org>
References: <46A2956C.1020100@gmail.com>
	<e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>
	<46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org>
Message-ID: <46A4FED4.2020004@gmail.com>

Thank you for warning me. I am already using a specific file as my swap, 
so I had already done mkswap on it.
I only wanted to be able to suspend to it and resume from it using swsusp.
To do that I need to give the kernel the following arguments:
resume=<swap_file_partition> resume_offset=<swap_file_header_offset>

So I had to figure out a way to find out where the header of my swap 
file was.
I haven't tried it yet; I would rather back up my file first, in case 
something goes wrong.

Sincerely,
William Tambe

Theodore Tso wrote:
> On Sun, Jul 22, 2007 at 08:07:31PM -0500, William Tambe wrote:
>> Thank you for your response, but one more question, does this logical 
>> block 0 hold the header of the file, if not where is located the header 
>> of a file in a ext3 filesystem.
>>
>> The reason why I need to know that is because I wish to use swsusp on my 
>> swap-file so I really need to know the location of the file's swap header.
> 
> The swap header is located at the beginning of the file, so yes, that
> would be found in block 0 of the file.
> 
> The bigger question is what are you *doing*?  If you're just trying to
> enable swsusp, you shouldn't need to be using debugfs to find the block
> number and then manually editing the swap header.  The swap file
> should have been set up correctly before you started using it, or if
> you want to initialize a new swap-file, you can use the mkswap
> command.  If you're needing to manually edit the swap header, you're
> almost certainly doing something wrong....
> 
> 						- Ted
> 


From adilger at clusterfs.com  Mon Jul 23 20:26:18 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Mon, 23 Jul 2007 14:26:18 -0600
Subject: Please How do I calculate the offset of a file within a ext3
	partition
In-Reply-To: <46A4FED4.2020004@gmail.com>
References: <46A2956C.1020100@gmail.com>
	<e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>
	<46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org>
	<46A4FED4.2020004@gmail.com>
Message-ID: <20070723202618.GC5992@schatzie.adilger.int>

On Jul 23, 2007  14:17 -0500, William Tambe wrote:
> Thank you for warning me, I am already using a specific file as my swap, 
> so I had already done mkswap on it.
> I only wanted to be able suspend on it and resume from it using swsusp.
> To do that I needed to give to the kernel as arguments the following:
> resume=<swap_file_partition> resume_offset=<swap_file_header_offset>

This is in fact very dangerous (I think, though I'm no swsusp user).
There is no guarantee at all that the swap file is contiguous on disk.

If it isn't working at the level of a regular file (at which point it
could just use ->bmap() to find this information itself) then it is
likely expecting some contiguous number of blocks and it may clobber
your filesystem.
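
To make the contiguity concern concrete: given the physical block number for each logical block of a file (for example, collected by repeated bmap queries), the file is contiguous only if those numbers form a single run. The sketch below groups such a mapping into runs. The function name and the sample mapping are made up for illustration and do not come from any real swap file.

```python
def extents(physical_blocks):
    """Group physical block numbers (indexed by logical block) into
    contiguous runs of (logical_start, physical_start, length)."""
    runs = []
    for logical, phys in enumerate(physical_blocks):
        if runs and runs[-1][1] + runs[-1][2] == phys:
            # This block directly follows the previous run: extend it.
            start_l, start_p, length = runs[-1]
            runs[-1] = (start_l, start_p, length + 1)
        else:
            # A gap on disk: start a new run.
            runs.append((logical, phys, 1))
    return runs

# Hypothetical mapping: logical blocks 0..5 and where they live on disk.
mapping = [100, 101, 102, 200, 201, 300]
runs = extents(mapping)
print(runs)
print("contiguous" if len(runs) == 1 else f"{len(runs)} separate runs")
```

Anything that assumes the file occupies one run starting at the first block would, in this made-up example, overwrite blocks 103 onwards, which belong to something else.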

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From tytso at mit.edu  Mon Jul 23 20:34:50 2007
From: tytso at mit.edu (Theodore Tso)
Date: Mon, 23 Jul 2007 16:34:50 -0400
Subject: Please How do I calculate the offset of a file within a ext3
	partition
In-Reply-To: <46A4FED4.2020004@gmail.com>
References: <46A2956C.1020100@gmail.com>
	<e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>
	<46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org>
	<46A4FED4.2020004@gmail.com>
Message-ID: <20070723203450.GD30165@thunk.org>

On Mon, Jul 23, 2007 at 02:17:40PM -0500, William Tambe wrote:
> Thank you for warning me, I am already using a specific file as my swap, 
> so I had already done mkswap on it.
> I only wanted to be able suspend on it and resume from it using swsusp.
> To do that I needed to give to the kernel as arguments the following:
> resume=<swap_file_partition> resume_offset=<swap_file_header_offset>

If you have the filefrag program, you can just do 

# filefrag -v /var/cache/swap  | head
Checking /var/cache/swap
Filesystem type is: ef53
Filesystem cylinder groups is approximately 578
Blocksize of file /var/cache/swap is 4096
File size of /var/cache/swap is 1073741824 (262144 blocks)
First block: 13778944
Last block: 14406757
Discontinuity: Block 6137 is at 13785112 (was 13785087)
Discontinuity: Block 12251 is at 13791992 (was 13791231)

So the first block is 13778944.  So the byte offset is 4096*13778944
or 56438554624.
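
The same arithmetic can be scripted. This sketch parses filefrag output in the format shown above and returns the offset both in bytes and in pages. The page size of 4096 is an assumption, and the function name is made up for illustration. As far as I recall, the kernel document Documentation/power/swsusp-and-swap-files.txt describes resume_offset in PAGE_SIZE units rather than bytes, so it is worth checking which unit your kernel expects before passing either number.

```python
import re

PAGE_SIZE = 4096  # assumption: typical page size on x86

def swap_header_offsets(filefrag_output, page_size=PAGE_SIZE):
    """Return (byte_offset, page_offset) of a file's first block,
    parsed from filefrag -v output in the format shown in this thread."""
    block_size = int(re.search(r"^Blocksize of file .* is (\d+)",
                               filefrag_output, re.M).group(1))
    first_block = int(re.search(r"^First block: (\d+)",
                                filefrag_output, re.M).group(1))
    byte_offset = first_block * block_size
    return byte_offset, byte_offset // page_size

sample = """\
Blocksize of file /var/cache/swap is 4096
File size of /var/cache/swap is 1073741824 (262144 blocks)
First block: 13778944
"""
byte_offset, page_offset = swap_header_offsets(sample)
print(byte_offset, page_offset)
```

When the filesystem block size equals the page size, as assumed here, the page offset is simply the first block number.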


					- Ted


From darkonc at gmail.com  Tue Jul 24 02:49:10 2007
From: darkonc at gmail.com (Stephen Samuel)
Date: Mon, 23 Jul 2007 19:49:10 -0700
Subject: Please How do I calculate the offset of a file within a ext3
	partition
In-Reply-To: <6cd50f9f0707231947k7e880d37x1746a9f0210d3b7@mail.gmail.com>
References: <46A2956C.1020100@gmail.com>
	<e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>
	<46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org>
	<46A4FED4.2020004@gmail.com> <20070723203450.GD30165@thunk.org>
	<6cd50f9f0707231947k7e880d37x1746a9f0210d3b7@mail.gmail.com>
Message-ID: <6cd50f9f0707231949t63fef928n42f1dff6f524b1db@mail.gmail.com>

What I'd note here is that the file has discontinuities, so this file
is probably not appropriate for doing suspends to swap.
At a quick guess, you probably need to either:
  1) set up a proper swap PARTITION.
(e.g. remove the current swap file, shrink the /var (or /, as the case
may be) partition by that much, and then use the newly freed space to
create a proper partition.)

I believe that you can use qtparted to do the work of shrinking the
partition for you.  You might want to download a live-CD linux (like
Knoppix, or the Ubuntu live CD)  so that you can do the resize without
having to worry about the partition being in use.

or
2) Find a program that will allow you to allocate a file as one
contiguous chunk (nothing comes to mind off the top of my head), then
allocate the swap file using that.

> On 7/23/07, Theodore Tso <tytso at mit.edu> wrote:
> > On Mon, Jul 23, 2007 at 02:17:40PM -0500, William Tambe wrote:
> > > Thank you for warning me, I am already using a specific file as my swap,
> > > so I had already done mkswap on it.
> > > I only wanted to be able suspend on it and resume from it using swsusp.
> > > To do that I needed to give to the kernel as arguments the following:
> > > resume=<swap_file_partition> resume_offset=<swap_file_header_offset>
> >
> > If you have the filefrag program, you can just do
> >
> > # filefrag -v /var/cache/swap  | head
> > Checking /var/cache/swap
> > Filesystem type is: ef53
> > Filesystem cylinder groups is approximately 578
> > Blocksize of file /var/cache/swap is 4096
> > File size of /var/cache/swap is 1073741824 (262144 blocks)
> > First block: 13778944
> > Last block: 14406757
> > Discontinuity: Block 6137 is at 13785112 (was 13785087)
> > Discontinuity: Block 12251 is at 13791992 (was 13791231)
> >
> > So the first block is 13778944.  So the byte offset is 4096*13778944
> > or 56438554624.
> >
> >
> >                                         - Ted
> >
> > _______________________________________________
> > Ext3-users mailing list
> > Ext3-users at redhat.com
> > https://www.redhat.com/mailman/listinfo/ext3-users
> >
>
>
> --
> Stephen Samuel http://www.bcgreen.com
> 778-861-7641
>


-- 
Stephen Samuel http://www.bcgreen.com
778-861-7641


From tambewilliam at gmail.com  Tue Jul 24 18:10:02 2007
From: tambewilliam at gmail.com (William Tambe)
Date: Tue, 24 Jul 2007 13:10:02 -0500
Subject: Please How do I calculate the offset of a file within a ext3
 partition
In-Reply-To: <6cd50f9f0707231949t63fef928n42f1dff6f524b1db@mail.gmail.com>
References: <46A2956C.1020100@gmail.com>	<e9e943910707221716p398fd8f5q9d89cd17e4f44f2c@mail.gmail.com>	<46A3FF53.4000704@gmail.com>
	<20070723145902.GF19927@thunk.org>	<46A4FED4.2020004@gmail.com>
	<20070723203450.GD30165@thunk.org>	<6cd50f9f0707231947k7e880d37x1746a9f0210d3b7@mail.gmail.com>
	<6cd50f9f0707231949t63fef928n42f1dff6f524b1db@mail.gmail.com>
Message-ID: <46A6407A.8040407@gmail.com>

The reason I am using a file as my swap is that I want to be able to 
change the size of my swap as easily as creating a smaller or larger 
file.

In the swsusp kernel documentation: 
Documentation/power/swsusp-and-swap-files.txt

It says that swap files need not be contiguous. swsusp only needs 
to find the header of the swap-file to find where all the blocks 
belonging to the swap-file are located and use them.

The reason I wanted to back up my data first, in case something goes 
wrong, was that I was not certain that the header was in the 
first block of the swapfile, and I am not sure whether swsusp checks 
that the file being used is a valid swap-file.

Thank you to Theodore Tso, for reminding me to multiply the block number 
by the size of a single block, otherwise I was going to use the block 
number instead of calculating its offset.

I still haven't tried anything, because I only have one machine and I 
need to wait till the weekend, when I don't need it much for work, 
to try it. That way, if something goes wrong, I have enough time to fix it.

I will let you guys know of the outcome...

William Tambe


Stephen Samuel wrote:
> What I'd note here is that the file has discontinuities, so this file
> is probably not appropriate for doing suspends to swap.
> At a quick guess, you probably need to either:
>  1) set up a proper swap PARTITION.
> (e.g. remove the current swap file, shrink the /var (or /, as the case
> may be) partition by that much, and then use the newly freed space to
> create a proper partition.)
> 
> I believe that you can use qtparted to do the work of shrinking the
> partition for you.  You might want to download a live-CD linux (like
> Knoppix, or the Ubuntu live CD)  so that you can do the resize without
> having to worry about the partition being in use.
> 
> or
> 2) Find a program that will allow you to allocate a file as one
> contiguous chunk (nothing off the top of my head). then allocate the
> swap file using that,
> 
> On 7/23/07, Stephen Samuel <darkonc at gmail.com> wrote:
>> What I'd note here is that the file has discontinuities, so this file
>> is probably not appropriate for doing suspends to swap.
>> At a quick guess, you probably need to either:
>>    1) set up a proper swap PARTITION.
>> (e.g. remove the current swap file, shrink the /var (or /, as the case
>> may be) partition by that much, and then use the newly freed space to
>> create a proper partition.)
>>
>> I believe that you can use qtparted to do the work of shrinking the
>> partition for you.  You might want to download a live-CD linux (like
>> Knoppix, or the Ubuntu live CD)  so that you can do the resize without
>> having to worry about the partition being in use.
>>
>> or
>> 2) Find a program that will allow you to allocate a file as one
>> contiguous chunk (nothing off the top of my head). then allocate the
>> swap file using that,
>>
>> On 7/23/07, Theodore Tso <tytso at mit.edu> wrote:
>> > On Mon, Jul 23, 2007 at 02:17:40PM -0500, William Tambe wrote:
>> > > Thank you for warning me, I am already using a specific file as my 
>> swap,
>> > > so I had already done mkswap on it.
>> > > I only wanted to be able suspend on it and resume from it using 
>> swsusp.
>> > > To do that I needed to give to the kernel as arguments the following:
>> > > resume=<swap_file_partition> resume_offset=<swap_file_header_offset>
>> >
>> > If you have the filefrag program, you can just do
>> >
>> > # filefrag -v /var/cache/swap  | head
>> > Checking /var/cache/swap
>> > Filesystem type is: ef53
>> > Filesystem cylinder groups is approximately 578
>> > Blocksize of file /var/cache/swap is 4096
>> > File size of /var/cache/swap is 1073741824 (262144 blocks)
>> > First block: 13778944
>> > Last block: 14406757
>> > Discontinuity: Block 6137 is at 13785112 (was 13785087)
>> > Discontinuity: Block 12251 is at 13791992 (was 13791231)
>> >
>> > So the first block is 13778944.  So the byte offset is 4096*13778944
>> > or 56438554624.
>> >
>> >
>> >                                         - Ted
>> >
>> > _______________________________________________
>> > Ext3-users mailing list
>> > Ext3-users at redhat.com
>> > https://www.redhat.com/mailman/listinfo/ext3-users
>> >
>>
>>
>> -- 
>> Stephen Samuel http://www.bcgreen.com
>> 778-861-7641
>>
> 
> 


From alvin.cao at gmail.com  Sat Jul 28 02:56:23 2007
From: alvin.cao at gmail.com (Alvin Cao)
Date: Sat, 28 Jul 2007 10:56:23 +0800
Subject: Is ext3 the right choice?
Message-ID: <2fb803240707271956m4c83b7eck30a686b46fea9eb0@mail.gmail.com>

Dear All,

Our mobile device, which runs linux 2.4, uses ext3 as its filesystem.
To make ext3 work, we have Samsung's xrs module, a middle layer which
resembles MTD, to simulate disk devices over Samsung's onenand flash.
Recently some of our phones have been suffering filesystem crashes with
only 30% of the space used on that partition. So I began to doubt whether
it is right to employ a disk filesystem on an embedded system. It seems the
kjournald kernel thread triggers an oops. Assuming the xrs layer
simulates a real disk device perfectly, I want to discuss what the
disadvantages or advantages, if any, of such a design are.

I think the point is that to keep ext3 safe, we must umount these
devices cleanly before rebooting to let the kernel flush useful
information to the disks. On a PC we don't do many reboots. Even if dirty
reboots without umount happen, the data is very likely to be recoverable.
We also have experienced administrators and utilities like e2fsck
to fall back on. But even then there is still a chance that disks could
fail.

Embedded systems are quite different. Developers and customers could
pull out the battery at any time. It's unpredictable. Consequently
a disk failure is much more likely than on a PC. And we can't bet
on the customers. Once the products are
delivered to our customers, any disk failure, either recoverable (I
think that is most cases) or unrecoverable, is unacceptable. We can't
expect the customers to do what we are supposed to do.

Guys, I really want to polish the products as much as I can. Please
comment on what kind of risks we may take by using ext3 in such
a design. And if you have rich experience of using ext3 in an embedded
system, great, please feel free to share it. Any help is
appreciated.

--
Best Regards,
Alvin Cao


From ulf at atc-onlane.com  Mon Jul 30 03:00:16 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sun, 29 Jul 2007 20:00:16 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com><5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com><5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com><alpine.DEB.0.99.0707150400310.18512@sheep.housecafe.de>
	<5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com>

Ok, I finally got a complete message of this panic:

Unmounting pipe file systems:  
Unmounting file systems:  
Halting system...
md: stopping all md devices.
md: md0 switched to read-only mode.
cciss: stopping all cciss devices.
cciss: removing controller 0
Assertion failure in dx_probe() at fs/ext3/namei.c:381:
"dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at namei:381
invalid operand: 0000 [1] SMP 
CPU 2 
Modules linked in: mptctl mptbase sg md5 ipv6 parport_pc lp parport
autofs4 i2c_dev i2c_core ocfs2(U) debugfs(U) ocfs2_dlmfs(U) ocfs2_dlm(U)
ocfs2_nodemanager(U) configfs(U) hangcheck_timer sunrpc dm_mirror
dm_round_robin dm_multipath dm_mod button battery ac joydev ehci_hcd
uhci_hcd hw_random e1000 bnx2(U) ext3 jbd qla2400(U) qla2xxx(U) ata_piix
libata cciss(U) sd_mod scsi_mod
Pid: 4272, comm: khelper Tainted: P      2.6.9-55.ELsmp
RIP: 0010:[<ffffffffa010a0f4>] <ffffffffa010a0f4>{:ext3:dx_probe+427}
RSP: 0018:00000104194738e8  EFLAGS: 00010212
RAX: 0000000000000081 RBX: 000001041e9cd800 RCX: 0000000000000246
RDX: 0000000000007c88 RSI: 0000000000000246 RDI: ffffffff803e4d80
RBP: 000001041e9cd818 R08: 00000000000927bf R09: 000001041e9cd800
R10: ffffffff803184a0 R11: 0000ffff803ffbe0 R12: 00000104194739d8
R13: 0000000000000000 R14: 000001041e58f4a8 R15: 000001041fae3c58
FS:  0000000000000000(0000) GS:ffffffff804ed800(0000)
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000005b98c0 CR3: 000000042153e000 CR4: 00000000000006e0
Process khelper (pid: 4272, threadinfo 0000010419472000, task
000001041821f7f0)
Stack: 1355fd8200000041 00000104194739d4 00000104194739d8
fffffffffffffff4 
       000001041bc19628 000001041fae3c58 0000010421255448
0000010419473c68 
       000001041fae3c58 ffffffffa010a873 
Call Trace:<ffffffffa010a873>{:ext3:ext3_find_entry+293}
<ffffffffa010ad81>{:ext3:ext3_lookup+47} 
       <ffffffff80186665>{do_lookup+230}
<ffffffff80187301>{__link_path_walk+2579} 
       <ffffffff80187801>{link_path_walk+82}
<ffffffff8015ce9d>{generic_file_aio_read+48} 
       <ffffffff801a27c0>{load_script+0}
<ffffffff80187a4f>{path_lookup+452} 
       <ffffffff801a27c0>{load_script+0}
<ffffffff801832fa>{open_exec+30} 
       <ffffffff8015ce9d>{generic_file_aio_read+48}
<ffffffff801a27c0>{load_script+0} 
       <ffffffff801a2977>{load_script+439}
<ffffffff8017466d>{alloc_page_interleave+61} 
       <ffffffff801a2cda>{load_elf_binary+0}
<ffffffff8018448d>{search_binary_handler+210} 
       <ffffffff801847c2>{do_execve+398}
<ffffffff80147639>{__call_usermodehelper+0} 
       <ffffffff8010ee44>{sys_execve+52} <ffffffff80110fb5>{execve+101} 
       <ffffffff80147639>{__call_usermodehelper+0}
<ffffffff80147578>{____exec_usermodehelper+236} 
       <ffffffff801475b8>{____call_usermodehelper+44}
<ffffffff80110f47>{child_rip+8} 
       <ffffffff80147639>{__call_usermodehelper+0}
<ffffffff8014758c>{____call_usermodehelper+0} 
       <ffffffff80110f3f>{child_rip+0} 

Code: 0f 0b 5d 63 11 a0 ff ff ff ff 7d 01 0f b7 5d 02 85 db 74 08 
RIP <ffffffffa010a0f4>{:ext3:dx_probe+427} RSP <00000104194738e8>
 <0>Kernel panic - not syncing: Oops

This only happens when / (c0d0p6) has dir_index set with the latest HP
cciss driver (cpq_cciss-2.6.16-6.x86_64).

Regards, Ulf.

---------------------------------------------------------------------
ATC-Onlane Inc., T: 650-532-6382, F: 650-532-6441
4600 Bohannon Drive, Suite 100, Menlo Park, CA 94025
---------------------------------------------------------------------

> -----Original Message-----
> From: ext3-users-bounces at redhat.com
[mailto:ext3-users-bounces at redhat.com]
> On Behalf Of Ulf Zimmermann
> Sent: 07/14/2007 20:33
> To: Christian Kujau
> Cc: ext3-users at redhat.com
> Subject: RE: Kernel panic in ext3:dx_probe, help needed
> 
> > -----Original Message-----
> > From: "evil at g-house.de"@mail.g-house.de
> > [mailto:"evil at g-house.de"@mail.g-house.de] On Behalf Of Christian Kujau
> > Sent: Saturday, July 14, 2007 19:04
> > To: Ulf Zimmermann
> > Cc: ext3-users at redhat.com
> > Subject: RE: Kernel panic in ext3:dx_probe, help needed
> >
> > On Sat, 14 Jul 2007, Ulf Zimmermann wrote:
> > >>> believe before the cciss driver is getting unloaded. The last
> > >>> messages I am able to see are:
> > >>>
> > >>> md: stopping all md devices.
> > >>> md: md0 switched to read-only mode.
> >
> > I think these messages are the real cause of the ext3 errors.
> >
> > > Ok, found more information. EL4 sets dir_index for / (cciss/c0d0p6 as we
> > > are installing it). The RedHat provided cciss driver (2.6.14-RH2) has no
> > > problem with that, the latest cciss driver from HP, 2.6.16-6, does.
> > > Turning off dir_index for /, forcing fsck during reboot and everything
> > > is fine.
> >
> > A device driver should not care about filesystem features, IMHO. Either
> > there are problems with the cciss driver (syslog messages please) or the
> > ext3 fs is corrupted - in which case fsck should be run.
> 
> I can reproduce this on 8+ servers, 6 of them were just installed
> yesterday afternoon. Using "tune2fs -O ^dir_index /dev/cciss/c0d0p6"
> followed by a "touch /forcefsck && reboot" leads to no panics at
> reboot time.
> 
> I have reported this to HP for now.
> 
> Ulf.
> 
> 
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users


From ionel.gardais at tech-advantage.com  Mon Jul 30 07:38:40 2007
From: ionel.gardais at tech-advantage.com (Ionel GARDAIS)
Date: Mon, 30 Jul 2007 09:38:40 +0200
Subject: ext3 partition problems w/Apple Xserve RAID
In-Reply-To: <46A1212B.7010501@berkeley.edu>
References: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu>
	<46A1212B.7010501@berkeley.edu>
Message-ID: <46AD9580.5080002@tech-advantage.com>

Hi there,

I'm running the same kind of configuration (RHEL ES 4 on a PowerEdge 
2950 with a QLE2460 connected to a SanBox 5200 with 2 XServe RAID 10.5TB).
The four 4.5TB slices are directly formatted in ext3, no partitions were 
created.

Should I expect to get some data corruption on power failure ?

Thanks,
Ionel


Jon Forrest wrote:
> Randy Martin wrote:
>> I'm running Red Hat AS 4 on a Dell PowerEdge 1950.  It connects to an 
>> Apple Xserve RAID via a Qlogic QLE2460 card.  I am able to create a 
>> 4TB ext3 partition with no problems and use it fine.  When the system 
>> power drops or it's rebooted, the file system can't be mounted 
>> again.  It looks like the partition table is getting corrupted.  Here 
>> is some of the doc I gathered:
>
>> Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes
>>
>> Disk label type: msdos
>
> Bingo. That's your problem. You have to use a gpt disk label
> for partitions this large. I had the identical problem
> and I was able to fix it without losing a single bit.
> I described it in a posting to this group on 3/14/2007.
> (I'll send it to you directly).
>
> Cordially,

-- 
Ionel GARDAIS
System-Network Engineer


From adilger at clusterfs.com  Mon Jul 30 10:27:21 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Mon, 30 Jul 2007 04:27:21 -0600
Subject: ext3 partition problems w/Apple Xserve RAID
In-Reply-To: <46AD9580.5080002@tech-advantage.com>
References: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu>
	<46A1212B.7010501@berkeley.edu>
	<46AD9580.5080002@tech-advantage.com>
Message-ID: <20070730102721.GA5992@schatzie.adilger.int>

On Jul 30, 2007  09:38 +0200, Ionel GARDAIS wrote:
> I'm running the same kind of configuration (RHEL ES 4 on a PowerEdge 
> 2950 with a QLE2460 connected to a SanBox 5200 with 2 XServe RAID 10.5TB).
> The four 4.5TB slices are directly formatted in ext3, no partitions were 
> created.
> 
> Should I expect to get some data corruption on power failure ?

No, that's only if you have a DOS partition.  We have lots of > 4TB
ext3 filesystems w/o problem.
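
For reference, the limit being discussed comes from the msdos (MBR)
partition table, which stores partition start and length as 32-bit
sector counts. A rough sketch of the arithmetic, assuming 512-byte
sectors and that the "megabytes" in the quoted parted output are
1024*1024 bytes (both are assumptions; the function names are made up
for illustration):

```python
# Sketch: why an msdos (MBR) disk label cannot describe a ~4TB partition.
SECTOR_SIZE = 512          # bytes per sector (assumed)
MBR_MAX_SECTORS = 2 ** 32  # partition length is a 32-bit sector count


def mbr_max_bytes(sector_size=SECTOR_SIZE):
    """Largest partition size an msdos label can express, in bytes."""
    return MBR_MAX_SECTORS * sector_size


def fits_in_msdos_label(size_bytes, sector_size=SECTOR_SIZE):
    """True if a partition of size_bytes fits within the 32-bit limit."""
    return size_bytes <= mbr_max_bytes(sector_size)


# Size taken from "Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes"
disk_bytes = 4292376 * 1024 * 1024

print(mbr_max_bytes())                  # 2 TiB with 512-byte sectors
print(fits_in_msdos_label(disk_bytes))  # False: needs a gpt label or no label
```

A filesystem made directly on the whole device has no partition table
at all, so this limit does not come into play there.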

> Jon Forrest wrote:
> >Randy Martin wrote:
> >>I'm running Red Hat AS 4 on a Dell PowerEdge 1950.  It connects to an 
> >>Apple Xserve RAID via a Qlogic QLE2460 card.  I am able to create a 
> >>4TB ext3 partition with no problems and use it fine.  When the system 
> >>power drops or it's rebooted, the file system can't be mounted 
> >>again.  It looks like the partition table is getting corrupted.  Here 
> >>is some of the doc I gathered:
> >
> >>Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes
> >>
> >>Disk label type: msdos
> >
> >Bingo. That's your problem. You have to use a gpt disk label
> >for partitions this large. I had the identical problem
> >and I was able to fix it without losing a single bit.
> >I described it in a posting to this group on 3/14/2007.
> >(I'll send it to you directly).
> >
> >Cordially,
> 
> -- 
> Ionel GARDAIS
> System-Network Engineer
> 



Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From darkonc at gmail.com  Tue Jul 31 04:05:13 2007
From: darkonc at gmail.com (Stephen Samuel)
Date: Mon, 30 Jul 2007 21:05:13 -0700
Subject: Is ext3 the right choice?
In-Reply-To: <2fb803240707271956m4c83b7eck30a686b46fea9eb0@mail.gmail.com>
References: <2fb803240707271956m4c83b7eck30a686b46fea9eb0@mail.gmail.com>
Message-ID: <6cd50f9f0707302105s200b9a0cm58516ecdcde11f20@mail.gmail.com>

The only way to be sure about what happened with these phones is to
bring one in and have someone who understands filesystems check it
out and see what's wrong with it.  It could be a software error, or
it could be a hardware error. Jumping to conclusions without testing
those same conclusions could result in you chasing ghosts.

There have been some stories about flash drives failing because of too
many rewrites in one location (caused by things like access time
updates and journaling). That's probably something worth excluding
before you presume that the cause is untimely power cycling.

There are, apparently, some filesystems that are purposely designed to
be used on flash drives and the like.  These are probably going to be
far more useful to you than an ext3 filesystem.  Among other things,
the ext3 filesystem is designed to implicitly minimize fragmentation,
because fragmentation increases access times on rotating disks.  This
isn't as much of an issue with solid state devices, and there may be
other things to worry about, like rewrite fatigue.

On 7/27/07, Alvin Cao <alvin.cao at gmail.com> wrote:
> Dear All,
>
> Our mobile device, which runs linux 2.4, uses ext3 as its filesystem.
> To make ext3 work, we have Samsung's xrs module, a middle layer which
> resembles MTD, to simulate disk devices over Samsung's onenand flash.
> Recently some of our phones are suffering a filesystem crash with only
> 30% space used on that partition. So I began to doubt whether it is
> right to employ a disk filesystem on an embedded system. It seems the
> kjournald kernel thread sends out an oops. Just assuming the xrs layer
> simulates perfectly a real disk device, I want to discuss what the
> disadvantages or advantages, if there are any, are in such a design.
>
>  I think the point is that to keep ext3 safe, we must umount these
> devices cleanly before rebooting to let the kernel flush useful
> information to the disks. On a PC we don't do many reboots. Even when
> dirty reboots without umount happen, data are very likely to be
> recovered. And we still have experienced administrators and utilities
> like e2fsck to resort to. But even then there are still chances that
> disks could fail.
>
> Embedded systems are quite different. Developers and customers could
> pull out the battery at any time. It's unpredictable. Consequently
> there is a much greater chance than on a PC that a disk failure
> happens. And we can't bet on the customers. Once the products are
> delivered to our customers, any disk failure, either recoverable (I
> think that's most cases) or unrecoverable, is unacceptable. We can't
> expect the customers to do what we are supposed to do.
>
> Guys, I really want to polish the products as much as I can. Please give
> your comments on what kind of risks we may take by using ext3 in such
> a design. And if you have rich experience of using ext3 in an embedded
> system, great, please feel free to share it. Any help is
> appreciated.

-- 
Stephen Samuel http://www.bcgreen.com
778-861-7641


From tytso at mit.edu  Tue Jul 31 04:07:05 2007
From: tytso at mit.edu (Theodore Tso)
Date: Tue, 31 Jul 2007 00:07:05 -0400
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com>
Message-ID: <20070731040705.GF25876@thunk.org>

On Sun, Jul 29, 2007 at 08:00:16PM -0700, Ulf Zimmermann wrote:
> Ok, I finally got a complete message of this panic:
> 
> Assertion failure in dx_probe() at fs/ext3/namei.c:381:
> "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"

The filesystem got corrupted (and that's probably a device driver
issue), but ext3 shouldn't have panic'ed the kernel.  The assertion
needs to be replaced by an ext3_error() call.  I'll whip up a patch;
thanks for bringing this to my attention.

      	    	     	   		- Ted


From ulf at atc-onlane.com  Tue Jul 31 04:09:34 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Mon, 30 Jul 2007 21:09:34 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <20070731040705.GF25876@thunk.org>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com>
	<20070731040705.GF25876@thunk.org>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>

> -----Original Message-----
> From: Theodore Tso [mailto:tytso at mit.edu]
> Sent: Monday, July 30, 2007 21:07
> To: Ulf Zimmermann
> Cc: Christian Kujau; ext3-users at redhat.com
> Subject: Re: Kernel panic in ext3:dx_probe, help needed
> 
> On Sun, Jul 29, 2007 at 08:00:16PM -0700, Ulf Zimmermann wrote:
> > Ok, I finally got a complete message of this panic:
> >
> > Assertion failure in dx_probe() at fs/ext3/namei.c:381:
> > "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"
> 
> The filesystem got corrupted (and that's probably a device driver
> issue), but ext3 shouldn't have panic'ed the kernel.  The assertion
> needs to be replaced by an ext3_error() call.  I'll whip up a patch;
> thanks for bringing this to my attention.
> 
>       	    	     	   		- Ted

Over 10 nodes, brand new install from kickstart. I can reproduce it
every time. Use the EL4 Update 5 cciss driver, no problem, install HP
provided driver, it panics at shutdown/reboot.

Turn off dir_index on root and use the HP driver, no problem either.

Ulf.


From ulf at atc-onlane.com  Tue Jul 31 04:10:38 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Mon, 30 Jul 2007 21:10:38 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com><5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com><20070731040705.GF25876@thunk.org>
	<5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3D39@msmpk01.corp.autc.com>

> -----Original Message-----
> From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com]
> On Behalf Of Ulf Zimmermann
> Sent: Monday, July 30, 2007 21:10
> To: Theodore Tso
> Cc: ext3-users at redhat.com
> Subject: RE: Kernel panic in ext3:dx_probe, help needed
> 
> > -----Original Message-----
> > From: Theodore Tso [mailto:tytso at mit.edu]
> > Sent: Monday, July 30, 2007 21:07
> > To: Ulf Zimmermann
> > Cc: Christian Kujau; ext3-users at redhat.com
> > Subject: Re: Kernel panic in ext3:dx_probe, help needed
> >
> > On Sun, Jul 29, 2007 at 08:00:16PM -0700, Ulf Zimmermann wrote:
> > > Ok, I finally got a complete message of this panic:
> > >
> > > Assertion failure in dx_probe() at fs/ext3/namei.c:381:
> > > "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"
> >
> > The filesystem got corrupted (and that's probably a device driver
> > issue), but ext3 shouldn't have panic'ed the kernel.  The assertion
> > needs to be replaced by an ext3_error() call.  I'll whip up a patch;
> > thanks for bringing this to my attention.
> >
> >       	    	     	   		- Ted
> 
> Over 10 nodes, brand new install from kickstart. I can reproduce it
> every time. Use the EL4 Update 5 cciss driver, no problem, install HP
> provided driver, it panics at shutdown/reboot.
> 
> Turn off dir_index on root and use the HP driver, no problem either.
> 
> Ulf.

Or if there is corruption, it gets corrupted by the HP driver. Or
something.


From tytso at mit.edu  Tue Jul 31 06:16:46 2007
From: tytso at mit.edu (Theodore Tso)
Date: Tue, 31 Jul 2007 02:16:46 -0400
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com>
	<20070731040705.GF25876@thunk.org>
	<5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>
Message-ID: <20070731061646.GH25876@thunk.org>

On Mon, Jul 30, 2007 at 09:09:34PM -0700, Ulf Zimmermann wrote:
> 
> Over 10 nodes, brand new install from kickstart. I can reproduce it
> every time. Use the EL4 Update 5 cciss driver, no problem, install HP
> provided driver, it panics at shutdown/reboot.

The assertion failure is a sanity check when looking up a directory
entry via the htree.  It triggers when the size field in the root node
of the htree (i.e., the directory index) is larger than it could
possibly be.  This can happen only if the htree directory is corrupted
on disk, or the copy of the directory in the buffer cache is corrupted.
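
To make the check concrete, here is a rough sketch of the arithmetic
behind the assertion "dx_get_limit(entries) == dx_root_limit(dir,
root->info.info_length)". It is modelled on the ext3 directory layout
but is only an illustration: the 8-byte index entry size, the 8-byte
info block, and the record lengths for "." and ".." are assumptions to
be checked against fs/ext3/namei.c for the kernel in question.

```python
# Sketch: how many index entries can fit in the root block of an htree
# directory, and the comparison the assertion makes.
DX_ENTRY_SIZE = 8  # assumed: 32-bit hash + 32-bit block number


def dir_rec_len(name_len):
    """Directory record length: 8-byte header plus the name, rounded up
    to a multiple of 4 bytes."""
    return (name_len + 8 + 3) & ~3


def dx_root_limit(block_size, info_length=8):
    """Index entries that fit in the root block after the "." and ".."
    records and the info block (info_length bytes, assumed 8)."""
    entry_space = block_size - dir_rec_len(1) - dir_rec_len(2) - info_length
    return entry_space // DX_ENTRY_SIZE


def root_limit_is_sane(stored_limit, block_size, info_length=8):
    """The sanity check: the limit stored in the block must equal the
    limit computed from the block size."""
    return stored_limit == dx_root_limit(block_size, info_length)


print(dx_root_limit(4096))               # limit for a 4096-byte block
print(root_limit_is_sane(0xFFFF, 4096))  # False: a garbage value fails
```

Any stray write into the cached root block that changes the stored
limit makes the comparison fail, which is what the panic reports.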

If you run e2fsck -f on the filesystem after you reboot, does it
report any errors?  If not, it's probably the buffer cache which is
getting corrupted.  This is probably more likely, if I had to guess.

> Turn off dir_index on root and use the HP driver, no problem either.

My guess is that the driver is doing something screwy at shutdown,
corrupting some directory in the buffer cache in memory.  As the
shutdown scripts continue to execute, one of them accesses the
corrupted directory, and this triggers the assertion failure.  I agree
it's odd that it is so repeatable, but the assertion that was generated
is pretty clear about what caused it.  Without directory indexing
enabled, the filesystem code is either not noticing the corruption, or
it's printing a warning which is being ignored instead of causing an
assertion failure.

My recommendation would be to file a bug report with HP about their
device driver.

					- Ted


From duaneg at dghda.com  Tue Jul 31 11:19:22 2007
From: duaneg at dghda.com (Duane Griffin)
Date: Tue, 31 Jul 2007 12:19:22 +0100
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <20070731040705.GF25876@thunk.org>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com>
	<5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com>
	<20070731040705.GF25876@thunk.org>
Message-ID: <e9e943910707310419s23b4e432ie9bf2b3840f8c9c5@mail.gmail.com>

On 31/07/07, Theodore Tso <tytso at mit.edu> wrote:
> On Sun, Jul 29, 2007 at 08:00:16PM -0700, Ulf Zimmermann wrote:
> > Ok, I finally got a complete message of this panic:
> >
> > Assertion failure in dx_probe() at fs/ext3/namei.c:381:
> > "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"
>
> The filesystem got corrupted (and that's probably a device driver
> issue), but ext3 shouldn't have panic'ed the kernel.  The assertion
> needs to be replaced by an ext3_error() call.  I'll whip up a patch;
> thanks for bringing this to my attention.

I've been looking at this very issue recently, following a gentoo bug
report. I have a patch ready that replaces the asserts with a fallback
to a linear directory scan, following the example set by other parts
of that code. I've been waiting to get feedback from the bug reporters
before sending it to you, but if you'd like to take a look at it you
can find it here:

http://bugs.gentoo.org/show_bug.cgi?id=183207
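
The idea of falling back to a linear scan can be illustrated with a toy
model. This is not the patch from the bug report, only a sketch of the
control flow; the dictionary layout and all names below are
hypothetical.

```python
# Toy model: use the directory index when it passes its sanity check,
# otherwise fall back to scanning every entry.
class CorruptIndexError(Exception):
    """Raised when the directory index fails its sanity check."""


def indexed_lookup(directory, name):
    # Stand-in for the htree probe: refuse to trust the index when the
    # stored limit does not match the expected one.
    if directory["stored_limit"] != directory["expected_limit"]:
        raise CorruptIndexError("bad limit in directory index")
    return directory["index"].get(name)


def linear_lookup(directory, name):
    # Slow path: walk all entries in order, as for an unindexed directory.
    for entry_name, inode in directory["entries"]:
        if entry_name == name:
            return inode
    return None


def find_entry(directory, name):
    try:
        return indexed_lookup(directory, name)
    except CorruptIndexError:
        # Report-and-continue instead of asserting.
        return linear_lookup(directory, name)


entries = [("passwd", 11), ("fstab", 12)]
healthy = {"stored_limit": 508, "expected_limit": 508,
           "index": dict(entries), "entries": entries}
corrupt = {"stored_limit": 65535, "expected_limit": 508,
           "index": {}, "entries": entries}

print(find_entry(healthy, "passwd"))  # found via the index
print(find_entry(corrupt, "passwd"))  # found via the linear scan
```

The lookup still succeeds on the damaged directory, only more slowly,
instead of taking the whole machine down.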

Cheers,
Duane.

-- 
"I never could learn to drink that blood and call it wine" - Bob Dylan