From doseyg at r-networks.net  Sun Jul  1 00:59:21 2007
From: doseyg at r-networks.net (Glen Dosey)
Date: Sat, 30 Jun 2007 20:59:21 -0400
Subject: poor read performance
In-Reply-To: <4685E79E.3060709@fnal.gov>
References: <1183159302.30971.37.camel@localhost.localdomain>
 <4685E79E.3060709@fnal.gov>
Message-ID: <1183251561.8003.11.camel@eclipse.office.r-networks.net>

ling,

Wow, Thanks! That made an enormous difference. Now I'd like to try and
understand why.

I had initially tried 256, 512 and 1024 block read ahead settings in
the OS with no difference in performance and went no further. 8192 made
a huge difference, but 16384 brings the performance back to the same as
at 256. Why is 8192 such a magic number?

The RAID enclosure itself also has a read ahead setting, which I've
tried at 512K, 1M and 2M, also with no difference.

I also want to make sure I understand how read-ahead is working in my
setup. If I request some data A, the OS will request 8192 blocks of
512 bytes (4MB) past the end of data A. Now the controller will see the
OS request for A + 4M and read an additional 2MB past that, such that
the disks have read 6MB beyond the end of what was initially requested,
with 2MB being in the controller cache, 4MB in the OS cache, and data A
passed to the application. Is this correct?

Thanks,
Glen

On Sat, 2007-06-30 at 00:18 -0500, Ling C. Ho wrote:
> Hi,
>
> Did you see any difference when different block size is used (for
> example, dd with bs=64k or 128k)? Also try changing the read-ahead cache.
> blockdev --getra /dev/sdd to see what is the current value, and blockdev
> --setra 8192 /dev/sdd to change it. 8192 is a good number that has been
> working well for me for the similar size setup.
>
> ...
> ling
>
> Glen Dosey wrote:
> > I am seeing what seems to be a notable limit on read performance of an
> > ext3 filesystem. If anyone could offer some insight it would be helpful.
> >
> > Background:
> > 12 x 500G SATA disks in a Hardware RAID enclosure connected via 2Gb/s FC
> > to a 4 x 2.6 GHz system with 4GB ram running RHEL4.5. Initially the
> > enclosure was configured RAID5 10+1 parity, although I've also tried
> > RAID 50 and currently RAID 0. I've varied chunk sizes from 64-256K.
> >
> > Problem:
> > No matter what I do I cannot get the ext3 read performance above
> > ~90MB/s. Under virtually every configuration listed above the write
> > performance is greater than the read performance. I've run a large
> > number of Bonnie++ and IOzone tests, but for the sake of simplicity in
> > this email I'll just refer to simple dd's with /dev/zero.
> >
> > Details:
> > Under the current RAID0 setup I see the following when dd'ing.
> >
> > DD 4G from /dev/zero to /dev/sdd disk (no filesystem) & sync   28 seconds
> > DD 4G from /dev/sdd to /dev/null                               32 seconds
> > DD 4G to ext3 on /dev/sdd & sync                               32 seconds
> > DD 4G from ext3 file to /dev/null                              48 seconds
> >
> > I've been watching the port usage on the FC switch and it verifies what
> > I am seeing. Writes max out near 2Gb/s but reads hit some artificial
> > limit around 90 MB/s and never ever exceed it with the filesystem,
> > regardless of the underlying RAID configuration. Without a filesystem
> > the reads are at least 50% faster, and it can be seen on the FC switch
> > graphs as well.
> >
> > Any help or thoughts would be appreciated.
> >
> > Thanks,
> > ~Glen
> >
> >
> > _______________________________________________
> > Ext3-users mailing list
> > Ext3-users at redhat.com
> > https://www.redhat.com/mailman/listinfo/ext3-users
> >
>

From jprats at cesca.es  Tue Jul  3 07:31:45 2007
From: jprats at cesca.es (Jordi Prats)
Date: Tue, 03 Jul 2007 09:31:45 +0200
Subject: journal corrupted on / filesystem
Message-ID: <4689FB61.5090401@cesca.es>

Hi,
I'm getting these errors on the / filesystem:

Jul  3 08:03:54 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 18448395
Jul  3 08:03:54 inf04 kernel: Aborting journal on device cciss/c0d0p2.
Jul  3 08:03:54 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 18448387
Jul  3 08:03:55 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 18448387
Jul  3 08:03:55 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 18448395
Jul  3 08:03:57 inf04 kernel: ext3_abort called.
Jul  3 08:03:57 inf04 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_journal_start_sb: Detected aborted journal
Jul  3 08:03:57 inf04 kernel: Remounting filesystem read-only
(...)
Jul  3 08:04:30 inf05 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 49464569
Jul  3 08:04:30 inf05 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 49464569
Jul  3 08:04:30 inf05 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 49464569
Jul  3 08:04:30 inf05 kernel: EXT3-fs error (device cciss/c0d0p2): ext3_get_inode_block: bad inode number: 49464569

I suppose I should remove the journal (take it back to ext2) and
recreate it (tune2fs -j /dev/...). Is it possible to do it without
rebooting? It's no problem to turn it read-only (it is already in that
mode). Which command should I use to achieve this?

Thanks!
Jordi

From tweeks at rackspace.com  Tue Jul  3 20:01:01 2007
From: tweeks at rackspace.com (tweeks)
Date: Tue, 3 Jul 2007 15:01:01 -0500
Subject: journal corrupted on / filesystem
In-Reply-To: <4689FB61.5090401@cesca.es>
References: <4689FB61.5090401@cesca.es>
Message-ID: <200707031501.02708.tweeks@rackspace.com>

On Tuesday 03 July 2007 02:31, Jordi Prats wrote:
> Hi,
> I'm getting these errors on the / filesystem:
[...]
> I suppose I should remove the journal (get it back to ext2) and recreate
> it (tune2fs -j /dev/...) Is it possible to do it without rebooting it?
> It's no problem to turn it to read-only (it already is in that mode)
> Which command should I use to achieve this?

Just remount as ext2:

# mount -t ext2 -o remount /

Tweeks

From adilger at clusterfs.com  Tue Jul  3 21:42:31 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 3 Jul 2007 15:42:31 -0600
Subject: journal corrupted on / filesystem
In-Reply-To: <200707031501.02708.tweeks@rackspace.com>
References: <4689FB61.5090401@cesca.es>
 <200707031501.02708.tweeks@rackspace.com>
Message-ID: <20070703214231.GI6578@schatzie.adilger.int>

On Jul 03, 2007  15:01 -0500, tweeks wrote:
> On Tuesday 03 July 2007 02:31, Jordi Prats wrote:
> > I'm getting these errors on the / filesystem:
> [...]
> > I suppose I should remove the journal (get it back to ext2) and recreate
> > it (tune2fs -j /dev/...) Is it possible to do it without rebooting it?
> > It's no problem to turn it to read-only (it already is in that mode)
> > Which command should I use to achieve this?
>
> Just remount as ext2:
>
> # mount -t ext2 -o remount /

Won't work.

You need to unmount the filesystem at least, at which point recreating
the journal with tune2fs is easy.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From ms419 at freezone.co.uk  Wed Jul  4 00:49:26 2007
From: ms419 at freezone.co.uk (Jack Bates)
Date: Tue, 03 Jul 2007 17:49:26 -0700
Subject: fopen extended attributes
Message-ID: <1183510166.4877.3.camel@ket.lat>

Are ext3 file system extended attributes mapped anywhere to file system
paths? e.g. /sys/fs/ext3//?

I want to edit a file's extended attribute using standard system calls,
fopen, fread, fwrite, etc. Is it possible?

Thanks, Jack

From agruen at suse.de  Wed Jul  4 09:15:50 2007
From: agruen at suse.de (Andreas Gruenbacher)
Date: Wed, 4 Jul 2007 11:15:50 +0200
Subject: fopen extended attributes
In-Reply-To: <1183510166.4877.3.camel@ket.lat>
References: <1183510166.4877.3.camel@ket.lat>
Message-ID: <200707041115.51164.agruen@suse.de>

On Wednesday 04 July 2007 02:49, Jack Bates wrote:
> Are ext3 file system extended attributes mapped anywhere to file system
> paths? e.g. /sys/fs/ext3//?

No, they're not.

> I want to edit a file's extended attribute using standard system calls,
> fopen, fread, fwrite, etc. Is it possible?

Sorry no. You can only access them using the *xattr syscalls.
Andreas

From jprats at cesca.es  Wed Jul  4 10:59:47 2007
From: jprats at cesca.es (Jordi Prats)
Date: Wed, 04 Jul 2007 12:59:47 +0200
Subject: journal corrupted on / filesystem
In-Reply-To: <20070703214231.GI6578@schatzie.adilger.int>
References: <4689FB61.5090401@cesca.es>
 <200707031501.02708.tweeks@rackspace.com>
 <20070703214231.GI6578@schatzie.adilger.int>
Message-ID: <468B7DA3.9070403@cesca.es>

Hi,
Yes, it did not work on a live system; after the procedure you must
reboot.

For the record, what I did was:

Mark the filesystem as not having a journal (take it to ext2):
tune2fs -O ^has_journal /dev/cciss/c0d0p2

fsck it to delete the journal:
e2fsck /dev/cciss/c0d0p2

Create the journal (take it back to ext3):
tune2fs -j /dev/cciss/c0d0p2

and finally, remount it. On a live system, just reboot it.

Thank you all,
Jordi

Andreas Dilger wrote:
> On Jul 03, 2007  15:01 -0500, tweeks wrote:
>
>> On Tuesday 03 July 2007 02:31, Jordi Prats wrote:
>>
>>> I'm getting these errors on the / filesystem:
>>>
>> [...]
>>
>>> I suppose I should remove the journal (get it back to ext2) and recreate
>>> it (tune2fs -j /dev/...) Is it possible to do it without rebooting it?
>>> It's no problem to turn it to read-only (it already is in that mode)
>>> Which command should I use to achieve this?
>>>
>> Just remount as ext2:
>>
>> # mount -t ext2 -o remount /
>>
>
> Won't work.
>
> You need to unmount the filesystem at least, at which point recreating
> the journal with tune2fs is easy.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
>
>

--
......................................................................
         __
        / /          Jordi Prats
  C E / S / C A      Dept. de Sistemes
      /_/            Centre de Supercomputació de Catalunya

  Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
  T.
93 205 6464 · F. 93 205 6979 · jprats at cesca.es
......................................................................

From ramesh25 at gmail.com  Tue Jul 10 17:28:49 2007
From: ramesh25 at gmail.com (Ramesh Natarajan)
Date: Tue, 10 Jul 2007 12:28:49 -0500
Subject: Ext3 fsck questions
Message-ID: <11e67100707101028k19baca1bt39131edf8d3e8deb@mail.gmail.com>

Hi,

I currently have a RAID disk configured to appear as 3 ext3 disks
(/dev/sda, /dev/sdb and /dev/sdc).

The disks are initially formatted using

mkfs.ext3 /dev/sda

and mounted as follows

mount -t ext3 -o data=ordered -o commit=1 /dev/sda /mnt/san

According to the man page and other mailing list archives, I must run
fsck periodically (default after 38 mounts or 6 months) to ensure the
filesystem is clean.

Is this still valid if I mount using the following options?

-o data=ordered -o commit=1

Thanks
Ramesh

From jprats at cesca.es  Tue Jul 10 20:50:18 2007
From: jprats at cesca.es (Jordi Prats)
Date: Tue, 10 Jul 2007 22:50:18 +0200
Subject: Get journal position
Message-ID: <4693F10A.5070201@cesca.es>

Hi,
Is there any way to figure out where the journal physically is on an
ext3 fs, and its size?

Thanks!
Jordi

From adilger at clusterfs.com  Wed Jul 11 12:27:45 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Wed, 11 Jul 2007 06:27:45 -0600
Subject: Get journal position
In-Reply-To: <4693F10A.5070201@cesca.es>
References: <4693F10A.5070201@cesca.es>
Message-ID: <20070711122745.GY6417@schatzie.adilger.int>

On Jul 10, 2007  22:50 +0200, Jordi Prats wrote:
> Is there any way to figure out where the journal physically is on an ext3
> fs, and its size?

debugfs -c -R "stat <8>" {dev}

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
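
A note on reading that output, offered as a rough sketch rather than anything authoritative: the extents debugfs lists under BLOCKS are filesystem block numbers, so a byte offset is the block number multiplied by the filesystem block size (the "Block size:" value printed by tune2fs -l or dumpe2fs -h). The helper names below are hypothetical, not part of e2fsprogs, and the 4096-byte block size and sample numbers are assumptions chosen to resemble values seen elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helpers for interpreting `debugfs -R "stat <8>"` output.
# Assumes the block size has already been looked up (e.g. with tune2fs -l).

# Byte offset of a filesystem block: block number * block size.
block_to_offset() {
    echo $(( $1 * $2 ))
}

# Number of filesystem blocks needed to hold a size in bytes (rounded up).
bytes_to_blocks() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

block_to_offset 2055 4096         # first journal block as a byte offset: 8417280
bytes_to_blocks 134217728 4096    # a 128 MiB journal inode in 4K blocks: 32768
```

With a different block size (1024 or 2048 bytes are also possible on ext3) the same block numbers would map to different offsets, so the block size should be checked on the actual filesystem first.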

From jprats at cesca.es  Wed Jul 11 12:55:15 2007
From: jprats at cesca.es (Jordi Prats)
Date: Wed, 11 Jul 2007 14:55:15 +0200
Subject: Get journal position
In-Reply-To: <20070711122745.GY6417@schatzie.adilger.int>
References: <4693F10A.5070201@cesca.es>
 <20070711122745.GY6417@schatzie.adilger.int>
Message-ID: <4694D333.9090005@cesca.es>

Hi,
But in the example below, what does this mean? Is 2055 a byte count, a
sector (512 bytes) count, or a block (4K) count?

Thanks!

debugfs 1.39 (29-May-2006)
/dev/cciss/c0d0p1: catastrophic mode - not reading inode or group bitmaps
Inode: 8   Type: regular    Mode:  0600   Flags: 0x0   Generation: 0
User:     0   Group:     0   Size: 134217728
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 262416
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x466d1870 -- Mon Jun 11 11:40:00 2007
atime: 0x00000000 -- Thu Jan  1 01:00:00 1970
mtime: 0x466d1870 -- Mon Jun 11 11:40:00 2007
BLOCKS:
(0-11):2055-2066, (IND):2067, (12-1035):2068-3091, (DIND):3092,
(IND):3093, (1036-2059):3094-4117, (IND):4118, (2060-3083):4119-5142,
(IND):5143, (3084-4107):5144-6167, (IND):6168, (4108-5131):6169-7192,
(IND):7193, (5132-6155):7194-8217, (IND):8218, (6156-7179):8219-9242,
(IND):9243, (7180-8203):9244-10267, (IND):10268,
(8204-9227):10269-11292, (IND):11293, (9228-10251):11294-12317,
(IND):12318, (10252-11275):12319-13342, (IND):13343,
(11276-12299):13344-14367, (IND):14368, (12300-13323):14369-15392,
(IND):15393, (13324-14347):15394-16417, (IND):16418,
(14348-15371):16419-17442, (IND):17443, (15372-16395):17444-18467,
(IND):18468, (16396-17419):18469-19492, (IND):19493,
(17420-18443):19494-20517, (IND):20518, (18444-19467):20519-21542,
(IND):21543, (19468-20491):21544-22567, (IND):22568,
(20492-21515):22569-23592, (IND):23593, (21516-22539):23594-24617,
(IND):24618, (22540-23563):24619-25642, (IND):25643,
(23564-24587):25644-26667, (IND):26668, (24588-25611):26669-27692,
(IND):27693, (25612-26635):27694-28717, (IND):28718,
(26636-27659):28719-29742, (IND):29743,
(27660-28683):29744-30767, (IND):30768, (28684-29707):30769-31792,
(IND):31793, (29708-30673):31794-32759, (30674-30731):34809-34866,
(IND):34867, (30732-31755):34868-35891, (IND):35892,
(31756-32768):35893-36905
TOTAL: 32802

Andreas Dilger wrote:
> On Jul 10, 2007  22:50 +0200, Jordi Prats wrote:
>
>> Is there any way to figure out where the journal physically is on an ext3
>> fs, and its size?
>>
>
> debugfs -c -R "stat <8>" {dev}
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
>
>
>

--
......................................................................
         __
        / /          Jordi Prats
  C E / S / C A      Dept. de Sistemes
      /_/            Centre de Supercomputació de Catalunya

  Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
  T. 93 205 6464 · F. 93 205 6979 · jprats at cesca.es
......................................................................

From ulf at atc-onlane.com  Sat Jul 14 09:45:11 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sat, 14 Jul 2007 02:45:11 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>

This may or may not be ext3 related but I am trying to find any pointers
which might help me. I have a number of HP Proliant DL380 g5 with a P400
controller and also two qla2400 cards. The OS is RedHat EL4 U5 x86_64.

Every time during reboot these systems panic after the last umount and I
believe before the cciss driver is getting unloaded. The last messages I
am able to see are:

md: stopping all md devices.
md: md0 switched to read-only mode.

After this I get the panic. On other systems I believe the following
messages come (DL380 g4 with 6i controller):

cciss: stopping all cciss devices.
cciss: removing controller 0

The last 3 lines of the panic are:

Code: 0f 0b 5d 93 12 a0 ff ff ff ff 7d 01 0f b7 5d 02 85 db 74 08
RIP {:ext3:dx_probe+427} RSP <00000102259338e8>
 <0>Kernel panic - not syncing: Oops

I can not reproduce this problem with a DL380 g4 with 6i controller.
Tried the included cciss driver in EL4 Update 5 and the one provided by
HP. No difference.

Any tips what to look at would be appreciated.

Regards, Ulf.

---------------------------------------------------------------------
ATC-Onlane Inc., T: 650-532-6382, F: 650-532-6441
4600 Bohannon Drive, Suite 100, Menlo Park, CA 94025
---------------------------------------------------------------------

From ulf at atc-onlane.com  Sat Jul 14 20:10:45 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sat, 14 Jul 2007 13:10:45 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>

> -----Original Message-----
> From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com]
> On Behalf Of Ulf Zimmermann
> Sent: Saturday, July 14, 2007 02:45
> To: ext3-users at redhat.com
> Subject: Kernel panic in ext3:dx_probe, help needed
>
> This may or may not be ext3 related but I am trying to find any pointers
> which might help me. I got a number of HP Proliant DL380 g5 with a P400
> controller and also two qla2400 cards. The OS is RedHat EL4 U5 x86_64.
>
> Every time during reboot these systems panic after the last umount and I
> believe before the cciss driver is getting unloaded. The last messages I
> am able to see are:
>
> md: stopping all md devices.
> md: md0 switched to read-only mode.
>
> After this I get the panic, on other systems I believe the following
> messages come (DL380 g4 with 6i controller):
>
> cciss: stopping all cciss devices.
> cciss: removing controller 0
>
> The last 3 lines of the panic are:
>
> Code: 0f 0b 5d 93 12 a0 ff ff ff ff 7d 01 0f b7 5d 02 85 db 74 08
> RIP {:ext3:dx_probe+427} RSP <00000102259338e8>
>  <0>Kernel panic - not syncing: Oops
>
> I can not reproduce this problem with a DL380 g4 with 6i controller. Tried
> the included cciss driver in EL4 Update 5 and the one provided by HP. No
> difference.
>
> Any tips what to look at would be appreciated.
>

I have been able to reproduce it on yet another system, but here I was
able to catch the top of the panic with "dx_probe: Unrecognised inode
hash code 28" on cciss/c0d0p6, which is the / file system.

From ulf at atc-onlane.com  Sat Jul 14 20:42:27 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sat, 14 Jul 2007 13:42:27 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
 <5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com>

> -----Original Message-----
> From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com]
> On Behalf Of Ulf Zimmermann
> Sent: Saturday, July 14, 2007 13:11
> To: ext3-users at redhat.com
> Subject: RE: Kernel panic in ext3:dx_probe, help needed
>
> > -----Original Message-----
> > From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com]
> > On Behalf Of Ulf Zimmermann
> > Sent: Saturday, July 14, 2007 02:45
> > To: ext3-users at redhat.com
> > Subject: Kernel panic in ext3:dx_probe, help needed
> >
> > This may or may not be ext3 related but I am trying to find any pointers
> > which might help me. I got a number of HP Proliant DL380 g5 with a P400
> > controller and also two qla2400 cards. The OS is RedHat EL4 U5 x86_64.
> >
> > Every time during reboot these systems panic after the last umount and I
> > believe before the cciss driver is getting unloaded. The last messages I
> > am able to see are:
> >
> > md: stopping all md devices.
> > md: md0 switched to read-only mode.
> >
> > After this I get the panic, on other systems I believe the following
> > messages come (DL380 g4 with 6i controller):
> >
> > cciss: stopping all cciss devices.
> > cciss: removing controller 0
> >
> > The last 3 lines of the panic are:
> >
> > Code: 0f 0b 5d 93 12 a0 ff ff ff ff 7d 01 0f b7 5d 02 85 db 74 08
> > RIP {:ext3:dx_probe+427} RSP <00000102259338e8>
> >  <0>Kernel panic - not syncing: Oops
> >
> > I can not reproduce this problem with a DL380 g4 with 6i controller. Tried
> > the included cciss driver in EL4 Update 5 and the one provided by HP. No
> > difference.
> >
> > Any tips what to look at would be appreciated.
> >
>
> Have been able to reproduce it on yet another system but here I was able
> to catch the top of the panic with " dx_probe: Unrecognised inode hash
> code 28" on cciss/c0d0p6, which is the / file system.

Ok, found more information. EL4 sets dir_index for / (cciss/c0d0p6 as we
are installing it). The RedHat provided cciss driver (2.6.14-RH2) has no
problem with that; the latest cciss driver from HP, 2.6.16-6, does.
After turning off dir_index for / and forcing fsck during reboot,
everything is fine.

Ulf.

From lists at nerdbynature.de  Sun Jul 15 01:56:55 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Sun, 15 Jul 2007 03:56:55 +0200 (CEST)
Subject: Ext3 fsck questions
In-Reply-To: <11e67100707101028k19baca1bt39131edf8d3e8deb@mail.gmail.com>
References: <11e67100707101028k19baca1bt39131edf8d3e8deb@mail.gmail.com>
Message-ID:

On Tue, 10 Jul 2007, Ramesh Natarajan wrote:
> and mounted as follows
> mount -t ext3 -o data=ordered -o commit=1 /dev/sda /mnt/san

data=ordered seems to be the default anyway.

> From what I read from the man page and other mailing list archives I must run
> fsck periodically ( default after 38 mounts or 6 months) to ensure the
> filesystem is clean.
> Is this still valid if I mount using the following options?

It's recommended to run e2fsck once in a while (otherwise there would be
no need for the 'max-mount-count' and 'interval-between-checks'
tunables). But since it's a tunable you can of course turn it off.
Really, there is no definite answer here. I for one use e2fsck once in a
while and see it more as a datapoint ("fs was OK on 2007-07-15") or as a
mere sanity check :)

C.
--
BOFH excuse #146:

Communications satellite used by the military for star wars.

From lists at nerdbynature.de  Sun Jul 15 02:04:22 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Sun, 15 Jul 2007 04:04:22 +0200 (CEST)
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
 <5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>
 <5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com>
Message-ID:

On Sat, 14 Jul 2007, Ulf Zimmermann wrote:
>>> believe before the cciss driver is getting unloaded. The last
>>> messages I am able to see are:
>>>
>>> md: stopping all md devices.
>>> md: md0 switched to read-only mode.

I think these messages are the real cause of the ext3 errors.

> Ok, found more information. EL4 sets dir_index for / (cciss/c0d0p6 as we
> are installing it). The RedHat provided cciss driver (2.6.14-RH2) has no
> problem with that, the latest cciss driver from HP, 2.6.16-6, does.
> Turning off dir_index for /, forcing fsck during reboot and everything
> is fine.

A device driver should not care about filesystem features, IMHO. Either
there are problems with the cciss driver (syslog messages please) or the
ext3 fs is corrupted - in which case fsck should be run.

C.
--
BOFH excuse #115:

your keyboard's space bar is generating spurious keycodes.

From ulf at atc-onlane.com  Sun Jul 15 03:32:41 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Sat, 14 Jul 2007 20:32:41 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To:
References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com>
 <5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com>
 <5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com>

> -----Original Message-----
> From: "evil at g-house.de"@mail.g-house.de [mailto:"evil at g-house.de"@mail.g-
> house.de] On Behalf Of Christian Kujau
> Sent: Saturday, July 14, 2007 19:04
> To: Ulf Zimmermann
> Cc: ext3-users at redhat.com
> Subject: RE: Kernel panic in ext3:dx_probe, help needed
>
> On Sat, 14 Jul 2007, Ulf Zimmermann wrote:
> >>> believe before the cciss driver is getting unloaded. The last
> >>> messages I am able to see are:
> >>>
> >>> md: stopping all md devices.
> >>> md: md0 switched to read-only mode.
>
> I think these messages are the real cause of the ext3 errors.
>
> > Ok, found more information. EL4 sets dir_index for / (cciss/c0d0p6 as we
> > are installing it). The RedHat provided cciss driver (2.6.14-RH2) has no
> > problem with that, the latest cciss driver from HP, 2.6.16-6, does.
> > Turning off dir_index for /, forcing fsck during reboot and everything
> > is fine.
>
> A device driver should not care about filesystem features, IMHO. Either
> there are problems with the cciss driver (syslog messages please) or the
> ext3 fs is corrupted - in which case fsck should be run.

I can reproduce this on 8+ servers, 6 of them were just installed
yesterday afternoon. Using "tune2fs -O ^dir_index /dev/cciss/c0d0p6"
followed by a "touch /forcefsck && reboot" leads to no panics at reboot
time. I have reported this to HP for now.

Ulf.
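
For anyone wanting to check whether a filesystem has dir_index set before trying the workaround above, here is a minimal sketch. It relies on tune2fs -l printing a "Filesystem features:" line; the has_feature helper is a hypothetical name made up for illustration, not an e2fsprogs command, and the sample line is canned text standing in for real output.

```shell
#!/bin/sh
# has_feature NAME: hypothetical helper. Reads `tune2fs -l` style output on
# stdin and succeeds if NAME appears on the "Filesystem features:" line.
has_feature() {
    grep '^Filesystem features:' | tr -s ' ' '\n' | grep -qxF "$1"
}

# Intended use (needs root and a real device, so it is left commented out):
#   tune2fs -l /dev/cciss/c0d0p6 | has_feature dir_index && echo "dir_index is on"

# Canned sample line standing in for real tune2fs output:
sample='Filesystem features:      has_journal ext_attr dir_index filetype sparse_super'
if printf '%s\n' "$sample" | has_feature dir_index; then
    echo "dir_index is on"
fi
```

Matching whole words with grep -x avoids false hits on features whose names merely contain the one being looked for.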

From walker at stsci.edu  Tue Jul 17 18:07:38 2007
From: walker at stsci.edu (Thomas Walker)
Date: Tue, 17 Jul 2007 14:07:38 -0400
Subject: large ext3 filesystem consistantly locking itself read-only
Message-ID: <469D056A.9020504@stsci.edu>

We have several large ext3 file system partitions. One of them sets
itself to read-only after getting journal problems. I understand that's
a good thing, but obviously I need to correct the problem so that it
will stop locking itself. Here are some details:

OS is Redhat EL4 x86_64 running on a SunFire v40z, kernel is
2.6.9-42.0.2.ELsmp. The disk storage in question is external, via fiber
cable. The fiber HBA is a Qlogic ISP2312 connected to a Qlogic San
Switch connected to four Apple Xserve Raids. There are 8 individual
LUNs coming from the four XRaids; they appear on the host as
/dev/sd[cdefghij]. Those LUNs are put into two LVM volume groups and
then mounted from logical volumes. The partition in question is 8TB,
about 92% full at the moment. One oddity about this partition is that
it has a subdirectory which contains over 2700 symbolic links to other
partitions.

Here is the output from /var/adm/messages the last time the file system
locked itself:

Jul 17 09:01:06 kernel: Info fld=0x0, Current sdd: sense key No Sense
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856796
Jul 17 09:01:06 kernel: Aborting journal on device dm-3.
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3) in start_transaction: Readonly filesystem
Jul 17 09:01:06 kernel: Aborting journal on device dm-3.
Jul 17 09:01:06 kernel: ext3_abort called.
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3): ext3_journal_start_sb: Detected aborted journal
Jul 17 09:01:06 kernel: Remounting filesystem read-only
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3) in start_transaction: Journal has aborted
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856797
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856798
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856799
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb: bit already cleared for block 786856800
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3) in ext3_truncate: Journal has aborted
Jul 17 09:01:07 kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
Jul 17 09:01:07 kernel: EXT3-fs error (device dm-3) in ext3_orphan_del: Journal has aborted
Jul 17 09:01:07 kernel: EXT3-fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
Jul 17 09:01:07 kernel: EXT3-fs error (device dm-3) in ext3_delete_inode: Journal has aborted
Jul 17 09:01:07 kernel: __journal_remove_journal_head: freeing b_committed_data

If I run fsck it does seem to repair bad blocks and clears inodes but of
course for 8TB it takes a long time to run and the corruption only comes
back later. I have considered upgrading the kernel, it could be done. I
think part of the problem is the large number of symbolic links on that
partition but without evidence it will be difficult to get people to
change it. I also don't like the first line in the messages about device
sdd getting a "No Sense" response to a SCSI sense key request.

Any good advice on how to proceed would be appreciated.
I have looked at the dumpe2fs and debugfs tools but I don't see how to
put them to good use in this case.

Thomas Walker

From lists at nerdbynature.de  Tue Jul 17 23:10:04 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Wed, 18 Jul 2007 01:10:04 +0200 (CEST)
Subject: large ext3 filesystem consistantly locking itself read-only
In-Reply-To: <469D056A.9020504@stsci.edu>
References: <469D056A.9020504@stsci.edu>
Message-ID:

On Tue, 17 Jul 2007, Thomas Walker wrote:
> Jul 17 09:01:06 kernel: Info fld=0x0, Current sdd: sense key No Sense
> Jul 17 09:01:06 kernel: EXT3-fs error (device dm-3): ext3_free_blocks_sb:
> bit already cleared for block 786856796

...the rest of the errors seem to stem from ext3, but what about the
"sdd: sense key: .." message above? Are there more device related
messages? If so, this could be the cause for ext3 to barf.

> If I run fsck it does seem to repair bad blocks

as in "bad blocks on disk"? No good then. Try to rule out device errors
first (HBA driver, cabling, cooling, etc.) before doing more e2fsck
work...

--
BOFH excuse #191:

Just type 'mv * /dev/null'.

From alex at alex.org.uk  Wed Jul 18 10:01:48 2007
From: alex at alex.org.uk (Alex Bligh)
Date: Wed, 18 Jul 2007 11:01:48 +0100
Subject: large ext3 filesystem consistantly locking itself read-only
In-Reply-To:
References: <469D056A.9020504@stsci.edu>
Message-ID:

--On 18 July 2007 01:10 +0200 Christian Kujau wrote:

> BOFH excuse #191:
>
> Just type 'mv * /dev/null'.

(OT apols). This doesn't do what you might expect. You will get write
permission denied as non-root...

Alex

From lists at nerdbynature.de  Wed Jul 18 11:31:57 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Wed, 18 Jul 2007 13:31:57 +0200 (CEST)
Subject: [OT] Re: large ext3 filesystem consistantly locking itself read-only
In-Reply-To:
References: <469D056A.9020504@stsci.edu>
Message-ID: <32882.62.180.231.196.1184758317.squirrel@housecafe.dyndns.org>

On Wed, July 18, 2007 12:01, Alex Bligh wrote:
>> BOFH excuse #191: Just type 'mv * /dev/null'.
>>
> (OT apols). This doesn't do what you might expect. You will get write
> permission denied as non-root...

Good catch :)

So, are you suggesting that I should open a bugreport for the
fortunes-bofh-excuses package?

SCNR,
C.
--
BOFH excuse #442:

Trojan horse ran out of hay

From wolf at CLEMSON.EDU  Fri Jul 20 20:37:53 2007
From: wolf at CLEMSON.EDU (Randy Martin)
Date: Fri, 20 Jul 2007 16:37:53 -0400
Subject: ext3 partition problems w/Apple Xserve RAID
Message-ID: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu>

I'm running Red Hat AS 4 on a Dell PowerEdge 1950. It connects to an
Apple Xserve RAID via a Qlogic QLE2460 card. I am able to create a 4TB
ext3 partition with no problems and use it fine. When the system power
drops or it's rebooted, the file system can't be mounted again. It
looks like the partition table is getting corrupted. Here is some of
the doc I gathered:

Output from fsck:

fsck /dev/sdb1
fsck 1.35 (28-Feb-2004)
e2fsck 1.35 (28-Feb-2004)
The filesystem size (according to the superblock) is 1098848000 blocks
The physical size of the device is 25106176 blocks
Either the superblock or the partition table is likely to be corrupt!

----------------------------------------------------------------------------

Output from parted:

parted /dev/sdb
GNU Parted 1.6.19
Copyright (C) 1998 - 2004 Free Software Foundation, Inc.
This program is free software, covered by the GNU General Public License.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. Using /dev/sdb (parted) print Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes Disk label type: msdos Minor Start End Type Filesystem Flags 1 0.031 98071.031 primary ext3 (parted) check 1 Warning: Partition 1 is 98071.000Mb, but the file system is 4292375.000Mb. --------------------------------------------------------------------------- Output from debugfs: debugfs /dev/sdb debugfs 1.35 (28-Feb-2004) /dev/sdb: Bad magic number in super-block while opening filesystem debugfs: open /dev/sdb1 /dev/sdb1: Can't read an inode bitmap while reading inode bitmap debugfs: quit ---------------------------------------------------------------------------- We connect via a Qlogic QLE2460 to the Apple XServe RAID: qla2400 0000:0c:00.0: QLogic Fibre Channel HBA Driver: 8.01.07 QLogic QLE2460 - PCI-Express to 4Gb FC, Single Channel ISP2432: PCIe (2.5Gb/s x4) @ 0000:0c:00.0 hdma+, host#=1, fw=4.00.26 [IP] Vendor: APPLE Model: Xserve RAID Rev: 1.51 Type: Direct-Access ANSI SCSI revision: 05 qla2400 0000:0c:00.0: scsi(1:0:0:0): Enabled tagged queuing, queue depth 32. sdb : very big device. try to use READ CAPACITY(16). SCSI device sdb: 8790786048 512-byte hdwr sectors (4500882 MB) SCSI device sdb: drive cache: write back sdb : very big device. try to use READ CAPACITY(16). SCSI device sdb: 8790786048 512-byte hdwr sectors (4500882 MB) SCSI device sdb: drive cache: write back sdb: sdb1 Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0 ---------------------------------------------------------------------------- Current output of parted after I remade the file system and restored the data: parted /dev/sdb1 GNU Parted 1.6.19 Copyright (C) 1998 - 2004 Free Software Foundation, Inc. 
This program is free software, covered by the GNU General Public License. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. Using /dev/sdb1 (parted) print Disk geometry for /dev/sdb1: 0.000-4292375.967 megabytes Disk label type: loop Minor Start End Filesystem Flags 1 0.000 4292375.967 ext3 I'm afraid this will happen again the next time I reboot the system. Any ideas what might be causing it and how to fix it? Thanks, Randy From jlforrest at berkeley.edu Fri Jul 20 20:55:07 2007 From: jlforrest at berkeley.edu (Jon Forrest) Date: Fri, 20 Jul 2007 13:55:07 -0700 Subject: ext3 partition problems w/Apple Xserve RAID In-Reply-To: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu> References: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu> Message-ID: <46A1212B.7010501@berkeley.edu> Randy Martin wrote: > I'm running Red Hat AS 4 on a Dell PowerEdge 1950. It connects to an > Apple Xserve RAID via a Qlogic QLE2460 card. I am able to create a 4TB > ext3 partition with no problems and use it fine. When the system power > drops or it's rebooted, the file system can't be mounted again. It > looks like the partition table is getting corrupted. Here is some of > the doc I gathered: > Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes > > Disk label type: msdos Bingo. That's your problem. You have to use a gpt disk label for partitions this large. I had the identical problem and I was able to fix it without losing a single bit. I described it in a posting to this group on 3/14/2007. (I'll send it to you directly).
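For readers following the numbers: the sizes in Randy's fsck and parted output look consistent with the partition length wrapping around in a 32-bit sector count, which is all an msdos (MBR) partition entry can store. The Python sketch below only reproduces that arithmetic. It assumes 512-byte sectors and a 4096-byte ext3 block size (the block size is inferred from the reported sizes, it is not stated in the thread), and all variable names are illustrative.

```python
# Values copied from the fsck output quoted in this thread.
FS_BLOCKS = 1098848000   # filesystem size according to the superblock
DEV_BLOCKS = 25106176    # physical size of /dev/sdb1 as reported by e2fsck

SECTOR_SIZE = 512        # bytes, per the "512-byte hdwr sectors" kernel message
BLOCK_SIZE = 4096        # bytes; ASSUMPTION inferred from the sizes, not stated
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE
MIB = 1024 * 1024

# Size of the filesystem expressed in 512-byte sectors.
fs_sectors = FS_BLOCKS * SECTORS_PER_BLOCK

# An msdos partition entry keeps the length in a 32-bit field, so a larger
# value is truncated modulo 2**32 sectors.
wrapped_sectors = fs_sectors % 2**32
wrapped_blocks = wrapped_sectors // SECTORS_PER_BLOCK

# Largest length an msdos label can describe with 512-byte sectors.
mbr_limit_mib = (2**32 * SECTOR_SIZE) // MIB

print("filesystem size :", fs_sectors * SECTOR_SIZE // MIB, "MiB")
print("msdos limit     :", mbr_limit_mib, "MiB")
print("wrapped size    :", wrapped_sectors * SECTOR_SIZE // MIB, "MiB",
      "=", wrapped_blocks, "blocks")
print("matches fsck    :", wrapped_blocks == DEV_BLOCKS)
```

Under those assumptions the wrapped size comes out at 25106176 blocks (98071 MiB), the same figures fsck and parted report for the partition, which supports the suggestion that a gpt label is needed for a device this large.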
Cordially, -- Jon Forrest Unix Computing Support College of Chemistry 173 Tan Hall University of California Berkeley Berkeley, CA 94720-1460 510-643-1032 jlforrest at berkeley.edu From tambewilliam at gmail.com Sat Jul 21 23:23:24 2007 From: tambewilliam at gmail.com (William Tambe) Date: Sat, 21 Jul 2007 18:23:24 -0500 Subject: Please How do I calculate the offset of a file within a ext3 partition Message-ID: <46A2956C.1020100@gmail.com> Hi, I need to understand and to calculate the offset of the beginning of a file within my partition which uses an ext3 filesystem. Can I use dumpe2fs to figure that out, if yes how? Sincerely, William Tambe From duaneg at dghda.com Mon Jul 23 00:16:03 2007 From: duaneg at dghda.com (Duane Griffin) Date: Mon, 23 Jul 2007 01:16:03 +0100 Subject: Please How do I calculate the offset of a file within a ext3 partition In-Reply-To: <46A2956C.1020100@gmail.com> References: <46A2956C.1020100@gmail.com> Message-ID: On 22/07/07, William Tambe wrote: > I need to understand and to calculate the offset of the beginning of a > file within my partition which uses an ext3 filesystem. > > Can I use dumpe2fs to figure that out, if yes how? (Sorry for the duplicate William, forgot to reply to the list) Not sure about dumpe2fs but you can use debugfs to do so. For example: /sbin/debugfs -R "bmap /path/to/file 0" /dev/XXX (where /dev/XXX is the device holding the filesystem, and the path is given relative to the root of that filesystem) will give you the physical block corresponding to logical block 0 of the file. Cheers, Duane.
-- "I never could learn to drink that blood and call it wine" - Bob Dylan From tambewilliam at gmail.com Mon Jul 23 01:07:31 2007 From: tambewilliam at gmail.com (William Tambe) Date: Sun, 22 Jul 2007 20:07:31 -0500 Subject: Please How do I calculate the offset of a file within a ext3 partition In-Reply-To: References: <46A2956C.1020100@gmail.com> Message-ID: <46A3FF53.4000704@gmail.com> Thank you for your response, but one more question, does this logical block 0 hold the header of the file, if not where is located the header of a file in a ext3 filesystem. The reason why I need to know that is because I wish to use swsusp on my swap-file so I really need to know the location of the file's swap header. Thank you for helping. Sincerely, William Tambe Duane Griffin wrote: > On 22/07/07, William Tambe wrote: >> I need to understand and to calculate the offset of the beginning of a >> file within my partition which uses an ext3 filesystem. >> >> Can I use dumpe2fs to figure that out, if yes how? > > (Sorry for the duplicate William, forgot to reply to the list) > > Not sure about dumpe2fs but you can use debugfs to do so. For example: > > /sbin/debugfs -R "bmap /path/to/file 0" > > Will give you the first physical block corresponding to logical block > 0 of the file. > > Cheers, > Duane. > From lists at nerdbynature.de Mon Jul 23 08:20:49 2007 From: lists at nerdbynature.de (Christian Kujau) Date: Mon, 23 Jul 2007 10:20:49 +0200 (CEST) Subject: ext3 partition problems w/Apple Xserve RAID In-Reply-To: <46A1212B.7010501@berkeley.edu> References: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu> <46A1212B.7010501@berkeley.edu> Message-ID: <31412.62.180.231.196.1185178849.squirrel@housecafe.dyndns.org> On Fri, July 20, 2007 22:55, Jon Forrest wrote: > fix it without loosing a single bit. I described it in a posting to this > group on 3/14/2007. 
https://www.redhat.com/archives/ext3-users/2007-March/msg00023.html nice tutorial :-) -- BOFH excuse #442: Trojan horse ran out of hay From tytso at mit.edu Mon Jul 23 14:59:03 2007 From: tytso at mit.edu (Theodore Tso) Date: Mon, 23 Jul 2007 10:59:03 -0400 Subject: Please How do I calculate the offset of a file within a ext3 partition In-Reply-To: <46A3FF53.4000704@gmail.com> References: <46A2956C.1020100@gmail.com> <46A3FF53.4000704@gmail.com> Message-ID: <20070723145902.GF19927@thunk.org> On Sun, Jul 22, 2007 at 08:07:31PM -0500, William Tambe wrote: > Thank you for your response, but one more question, does this logical > block 0 hold the header of the file, if not where is located the header > of a file in a ext3 filesystem. > > The reason why I need to know that is because I wish to use swsusp on my > swap-file so I really need to know the location of the file's swap header. The swap header is located at the beginning of the file, so yes, that would be found in block 0 of the file. The bigger question is what are you *doing*? If you're just trying to enable swsusp, you shouldn't need to be using debugfs to find the block number and then manually editing the swap header. The swap file should have been set up correctly before you started using it, or if you want to initialize a new swap-file, you can use the mkswap command. If you're needing to manually edit the swap header, you're almost certainly doing something wrong.... - Ted From tambewilliam at gmail.com Mon Jul 23 19:17:40 2007 From: tambewilliam at gmail.com (William Tambe) Date: Mon, 23 Jul 2007 14:17:40 -0500 Subject: Please How do I calculate the offset of a file within a ext3 partition In-Reply-To: <20070723145902.GF19927@thunk.org> References: <46A2956C.1020100@gmail.com> <46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org> Message-ID: <46A4FED4.2020004@gmail.com> Thank you for warning me, I am already using a specific file as my swap, so I had already done mkswap on it.
I only wanted to be able suspend on it and resume from it using swsusp. To do that I needed to give to the kernel as arguments the following: resume= resume_offset= So I had to figure out a way to find out where the header of my swap file was. I haven't tried it yet, I rather want to backup my file first, in case something wrong happen. Sincerely, William Tambe Theodore Tso wrote: > On Sun, Jul 22, 2007 at 08:07:31PM -0500, William Tambe wrote: >> Thank you for your response, but one more question, does this logical >> block 0 hold the header of the file, if not where is located the header >> of a file in a ext3 filesystem. >> >> The reason why I need to know that is because I wish to use swsusp on my >> swap-file so I really need to know the location of the file's swap header. > > The swap header is located at the beginning of the file, so yes, that > would be found in block 0 of the file. > > The bugger question is what are you *doing*? If you're just trying to > enable swsusp, you should need to be using debugfs to find the block > number and then manually editing the swap header. The swap file > should have been set up correctly before you started using it, or if > you want to initialize a new swap-file, you can use the mkswap > command. If you're needing to manually edit the swap header, you're > almost certainly doing something wrong.... > > - Ted > From adilger at clusterfs.com Mon Jul 23 20:26:18 2007 From: adilger at clusterfs.com (Andreas Dilger) Date: Mon, 23 Jul 2007 14:26:18 -0600 Subject: Please How do I calculate the offset of a file within a ext3 partition In-Reply-To: <46A4FED4.2020004@gmail.com> References: <46A2956C.1020100@gmail.com> <46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org> <46A4FED4.2020004@gmail.com> Message-ID: <20070723202618.GC5992@schatzie.adilger.int> On Jul 23, 2007 14:17 -0500, William Tambe wrote: > Thank you for warning me, I am already using a specific file as my swap, > so I had already done mkswap on it. 
> I only wanted to be able suspend on it and resume from it using swsusp. > To do that I needed to give to the kernel as arguments the following: > resume= resume_offset= This is in fact very dangerous (I think, though I'm no swsusp user). There is no guarantee at all that the swap file is contiguous on disk. If it isn't working at the level of a regular file (at which point it could just use ->bmap() to find this information itself) then it is likely expecting some contigous number of blocks and it may clobber your filesystem. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From tytso at mit.edu Mon Jul 23 20:34:50 2007 From: tytso at mit.edu (Theodore Tso) Date: Mon, 23 Jul 2007 16:34:50 -0400 Subject: Please How do I calculate the offset of a file within a ext3 partition In-Reply-To: <46A4FED4.2020004@gmail.com> References: <46A2956C.1020100@gmail.com> <46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org> <46A4FED4.2020004@gmail.com> Message-ID: <20070723203450.GD30165@thunk.org> On Mon, Jul 23, 2007 at 02:17:40PM -0500, William Tambe wrote: > Thank you for warning me, I am already using a specific file as my swap, > so I had already done mkswap on it. > I only wanted to be able suspend on it and resume from it using swsusp. > To do that I needed to give to the kernel as arguments the following: > resume= resume_offset= If you have the filefrag program, you can just do # filefrag -v /var/cache/swap | head Checking /var/cache/swap Filesystem type is: ef53 Filesystem cylinder groups is approximately 578 Blocksize of file /var/cache/swap is 4096 File size of /var/cache/swap is 1073741824 (262144 blocks) First block: 13778944 Last block: 14406757 Discontinuity: Block 6137 is at 13785112 (was 13785087) Discontinuity: Block 12251 is at 13791992 (was 13791231) So the first block is 13778944. So the byte offset is 4096*13778944 or 56438554624. 
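The multiplication above can be sketched in a few lines of Python. The block number and block size are taken from the filefrag output quoted above; the 4096-byte page size is an assumption (typical for x86), and the function names are purely illustrative. One caveat worth checking before using the result: the kernel's Documentation/power/swsusp-and-swap-files.txt appears to describe resume_offset in PAGE_SIZE units rather than bytes, so the sketch prints both values and the choice between them should be verified against that document for the kernel in use.

```python
# Values taken from the filefrag output quoted in this message.
BLOCK_SIZE = 4096        # "Blocksize of file /var/cache/swap is 4096"
FIRST_BLOCK = 13778944   # "First block: 13778944"

# ASSUMPTION: 4096-byte pages, as on typical x86 systems.
PAGE_SIZE = 4096


def byte_offset(block: int, block_size: int) -> int:
    """Offset of a filesystem block from the start of the partition, in bytes."""
    return block * block_size


def page_offset(block: int, block_size: int, page_size: int) -> int:
    """The same offset expressed in units of page_size."""
    return byte_offset(block, block_size) // page_size


offset_bytes = byte_offset(FIRST_BLOCK, BLOCK_SIZE)
offset_pages = page_offset(FIRST_BLOCK, BLOCK_SIZE, PAGE_SIZE)

print("offset in bytes:", offset_bytes)
print("offset in pages:", offset_pages)
```

When the filesystem block size equals the page size, as assumed here, the page-unit offset is simply the first block number.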
- Ted From darkonc at gmail.com Tue Jul 24 02:49:10 2007 From: darkonc at gmail.com (Stephen Samuel) Date: Mon, 23 Jul 2007 19:49:10 -0700 Subject: Please How do I calculate the offset of a file within a ext3 partition In-Reply-To: <6cd50f9f0707231947k7e880d37x1746a9f0210d3b7@mail.gmail.com> References: <46A2956C.1020100@gmail.com> <46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org> <46A4FED4.2020004@gmail.com> <20070723203450.GD30165@thunk.org> <6cd50f9f0707231947k7e880d37x1746a9f0210d3b7@mail.gmail.com> Message-ID: <6cd50f9f0707231949t63fef928n42f1dff6f524b1db@mail.gmail.com> What I'd note here is that the file has discontinuities, so this file is probably not appropriate for doing suspends to swap. At a quick guess, you probably need to either: 1) set up a proper swap PARTITION. (e.g. remove the current swap file, shrink the /var (or /, as the case may be) partition by that much, and then use the newly freed space to create a proper partition.) I believe that you can use qtparted to do the work of shrinking the partition for you. You might want to download a live-CD linux (like Knoppix, or the Ubuntu live CD) so that you can do the resize without having to worry about the partition being in use. or 2) Find a program that will allow you to allocate a file as one contiguous chunk (nothing off the top of my head). then allocate the swap file using that, On 7/23/07, Stephen Samuel wrote: > What I'd note here is that the file has discontinuities, so this file > is probably not appropriate for doing suspends to swap. > At a quick guess, you probably need to either: > 1) set up a proper swap PARTITION. > (e.g. remove the current swap file, shrink the /var (or /, as the case > may be) partition by that much, and then use the newly freed space to > create a proper partition.) > > I believe that you can use qtparted to do the work of shrinking the > partition for you. 
You might want to download a live-CD linux (like > Knoppix, or the Ubuntu live CD) so that you can do the resize without > having to worry about the partition being in use. > > or > 2) Find a program that will allow you to allocate a file as one > contiguous chunk (nothing off the top of my head). then allocate the > swap file using that, > > On 7/23/07, Theodore Tso wrote: > > On Mon, Jul 23, 2007 at 02:17:40PM -0500, William Tambe wrote: > > > Thank you for warning me, I am already using a specific file as my swap, > > > so I had already done mkswap on it. > > > I only wanted to be able suspend on it and resume from it using swsusp. > > > To do that I needed to give to the kernel as arguments the following: > > > resume= resume_offset= > > > > If you have the filefrag program, you can just do > > > > # filefrag -v /var/cache/swap | head > > Checking /var/cache/swap > > Filesystem type is: ef53 > > Filesystem cylinder groups is approximately 578 > > Blocksize of file /var/cache/swap is 4096 > > File size of /var/cache/swap is 1073741824 (262144 blocks) > > First block: 13778944 > > Last block: 14406757 > > Discontinuity: Block 6137 is at 13785112 (was 13785087) > > Discontinuity: Block 12251 is at 13791992 (was 13791231) > > > > So the first block is 13778944. So the byte offset is 4096*13778944 > > or 56438554624. 
> > > > > > - Ted > > > > _______________________________________________ > > Ext3-users mailing list > > Ext3-users at redhat.com > > https://www.redhat.com/mailman/listinfo/ext3-users > > > > > -- > Stephen Samuel http://www.bcgreen.com > 778-861-7641 > -- Stephen Samuel http://www.bcgreen.com 778-861-7641 From tambewilliam at gmail.com Tue Jul 24 18:10:02 2007 From: tambewilliam at gmail.com (William Tambe) Date: Tue, 24 Jul 2007 13:10:02 -0500 Subject: Please How do I calculate the offset of a file within a ext3 partition In-Reply-To: <6cd50f9f0707231949t63fef928n42f1dff6f524b1db@mail.gmail.com> References: <46A2956C.1020100@gmail.com> <46A3FF53.4000704@gmail.com> <20070723145902.GF19927@thunk.org> <46A4FED4.2020004@gmail.com> <20070723203450.GD30165@thunk.org> <6cd50f9f0707231947k7e880d37x1746a9f0210d3b7@mail.gmail.com> <6cd50f9f0707231949t63fef928n42f1dff6f524b1db@mail.gmail.com> Message-ID: <46A6407A.8040407@gmail.com> The reason why I am using a file as my swap partition is because, I want to be able to change the size of my swap just as easy as if I was to create a smaller or larger file. In the swsusp kernel documentation: Documentation/power/swsusp-and-swap-files.txt It says that the swap files need not to be contiguous. swsusp need only to find the header of the the swap-file to find where all the blocks belonging to the swap-file are located and use it. The reason why I wanted to backup my data first in case something go wrong was just because, I was not certain that the header was in the first block of the swapfile, and I am not sure whether swsusp do check if the file being used is a valid swap-file. Thank you to Theodore Tso, for reminding me to multiply the block number by the size of a single block, otherwise I was going to use the block number instead of calculating its offset. I still haven't tried anything, because I only have one machine and I need to wait till the weekend when I don't need to use it much for work and try it. 
So if something wrong happen, I have enough time to fix it. I will let you guys know of the outcome... William Tambe Stephen Samuel wrote: > What I'd note here is that the file has discontinuities, so this file > is probably not appropriate for doing suspends to swap. > At a quick guess, you probably need to either: > 1) set up a proper swap PARTITION. > (e.g. remove the current swap file, shrink the /var (or /, as the case > may be) partition by that much, and then use the newly freed space to > create a proper partition.) > > I believe that you can use qtparted to do the work of shrinking the > partition for you. You might want to download a live-CD linux (like > Knoppix, or the Ubuntu live CD) so that you can do the resize without > having to worry about the partition being in use. > > or > 2) Find a program that will allow you to allocate a file as one > contiguous chunk (nothing off the top of my head). then allocate the > swap file using that, > > On 7/23/07, Stephen Samuel wrote: >> What I'd note here is that the file has discontinuities, so this file >> is probably not appropriate for doing suspends to swap. >> At a quick guess, you probably need to either: >> 1) set up a proper swap PARTITION. >> (e.g. remove the current swap file, shrink the /var (or /, as the case >> may be) partition by that much, and then use the newly freed space to >> create a proper partition.) >> >> I believe that you can use qtparted to do the work of shrinking the >> partition for you. You might want to download a live-CD linux (like >> Knoppix, or the Ubuntu live CD) so that you can do the resize without >> having to worry about the partition being in use. >> >> or >> 2) Find a program that will allow you to allocate a file as one >> contiguous chunk (nothing off the top of my head). 
then allocate the >> swap file using that, >> >> On 7/23/07, Theodore Tso wrote: >> > On Mon, Jul 23, 2007 at 02:17:40PM -0500, William Tambe wrote: >> > > Thank you for warning me, I am already using a specific file as my >> swap, >> > > so I had already done mkswap on it. >> > > I only wanted to be able suspend on it and resume from it using >> swsusp. >> > > To do that I needed to give to the kernel as arguments the following: >> > > resume= resume_offset= >> > >> > If you have the filefrag program, you can just do >> > >> > # filefrag -v /var/cache/swap | head >> > Checking /var/cache/swap >> > Filesystem type is: ef53 >> > Filesystem cylinder groups is approximately 578 >> > Blocksize of file /var/cache/swap is 4096 >> > File size of /var/cache/swap is 1073741824 (262144 blocks) >> > First block: 13778944 >> > Last block: 14406757 >> > Discontinuity: Block 6137 is at 13785112 (was 13785087) >> > Discontinuity: Block 12251 is at 13791992 (was 13791231) >> > >> > So the first block is 13778944. So the byte offset is 4096*13778944 >> > or 56438554624. >> > >> > >> > - Ted >> > >> > _______________________________________________ >> > Ext3-users mailing list >> > Ext3-users at redhat.com >> > https://www.redhat.com/mailman/listinfo/ext3-users >> > >> >> >> -- >> Stephen Samuel http://www.bcgreen.com >> 778-861-7641 >> > > From alvin.cao at gmail.com Sat Jul 28 02:56:23 2007 From: alvin.cao at gmail.com (Alvin Cao) Date: Sat, 28 Jul 2007 10:56:23 +0800 Subject: Is ext3 the right choice? Message-ID: <2fb803240707271956m4c83b7eck30a686b46fea9eb0@mail.gmail.com> Dear All, Our mobile device, which runs linux 2.4, uses ext3 as its filesystem. To make ext3 work, we have Samsung's xrs module, a middle layer which resembles MTD, to simulate disk devices over Samsung's onenand flash. Recently some of our phones are suffering a filesystem crash with only 30% space used on that partition. 
So I began to doubt whether it is right to use a disk filesystem on an embedded system. It seems the kjournald kernel thread triggers an oops. Assuming the xrs layer perfectly simulates a real disk device, I want to discuss what the advantages or disadvantages, if any, of such a design are. I think the point is that to keep ext3 safe, we must umount these devices cleanly before rebooting, so that the kernel can flush its pending data to the disks. On a PC we don't reboot often. Even when dirty reboots without umount happen, the data are very likely to be recovered, and we have experienced administrators and utilities like e2fsck to fall back on. But even then there is still a chance that a disk could fail. Embedded systems are quite different. Developers and customers can pull out the battery at any time. It's unpredictable. Consequently a disk failure is much more likely than on a PC. And we can't bet on the customers. Once the products are delivered to our customers, any disk failure, whether recoverable (I think that is most cases) or unrecoverable, is unacceptable. We can't expect the customers to do what we are supposed to do. Guys, I really want to polish the products as much as I can. Please give your comments on what kind of risks we may take by using ext3 in such a design. And if you have extensive experience of using ext3 in an embedded system, great, please feel free to share it. Any help is appreciated.
-- Best Regards, Alvin Cao From ulf at atc-onlane.com Mon Jul 30 03:00:16 2007 From: ulf at atc-onlane.com (Ulf Zimmermann) Date: Sun, 29 Jul 2007 20:00:16 -0700 Subject: Kernel panic in ext3:dx_probe, help needed In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com> References: <5DE4B7D3E79067418154C49A739C1251022E3CC3@msmpk01.corp.autc.com><5DE4B7D3E79067418154C49A739C1251022E3CC6@msmpk01.corp.autc.com><5DE4B7D3E79067418154C49A739C1251022E3CC8@msmpk01.corp.autc.com> <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com> Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com> Ok, I finally got a complete message of this panic: Unmounting pipe file systems: Unmounting file systems: Halting system... md: stopping all md devices. md: md0 switched to read-only mode. cciss: stopping all cciss devices. cciss: removing controller 0 Assertion failure in dx_probe() at fs/ext3/namei.c:381: "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)" ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at namei:381 invalid operand: 0000 [1] SMP CPU 2 Modules linked in: mptctl mptbase sg md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core ocfs2(U) debugfs(U) ocfs2_dlmfs(U) ocfs2_dlm(U) ocfs2_nodemanager(U) configfs(U) hangcheck_timer sunrpc dm_mirror dm_round_robin dm_multipath dm_mod button battery ac joydev ehci_hcd uhci_hcd hw_random e1000 bnx2(U) ext3 jbd qla2400(U) qla2xxx(U) ata_piix libata cciss(U) sd_mod scsi_mod Pid: 4272, comm: khelper Tainted: P 2.6.9-55.ELsmp RIP: 0010:[] {:ext3:dx_probe+427} RSP: 0018:00000104194738e8 EFLAGS: 00010212 RAX: 0000000000000081 RBX: 000001041e9cd800 RCX: 0000000000000246 RDX: 0000000000007c88 RSI: 0000000000000246 RDI: ffffffff803e4d80 RBP: 000001041e9cd818 R08: 00000000000927bf R09: 000001041e9cd800 R10: ffffffff803184a0 R11: 0000ffff803ffbe0 R12: 00000104194739d8 R13: 0000000000000000 R14: 000001041e58f4a8 R15: 000001041fae3c58 FS: 
0000000000000000(0000) GS:ffffffff804ed800(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000005b98c0 CR3: 000000042153e000 CR4: 00000000000006e0 Process khelper (pid: 4272, threadinfo 0000010419472000, task 000001041821f7f0) Stack: 1355fd8200000041 00000104194739d4 00000104194739d8 fffffffffffffff4 000001041bc19628 000001041fae3c58 0000010421255448 0000010419473c68 000001041fae3c58 ffffffffa010a873 Call Trace:{:ext3:ext3_find_entry+293} {:ext3:ext3_lookup+47} {do_lookup+230} {__link_path_walk+2579} {link_path_walk+82} {generic_file_aio_read+48} {load_script+0} {path_lookup+452} {load_script+0} {open_exec+30} {generic_file_aio_read+48} {load_script+0} {load_script+439} {alloc_page_interleave+61} {load_elf_binary+0} {search_binary_handler+210} {do_execve+398} {__call_usermodehelper+0} {sys_execve+52} {execve+101} {__call_usermodehelper+0} {____exec_usermodehelper+236} {____call_usermodehelper+44} {child_rip+8} {__call_usermodehelper+0} {____call_usermodehelper+0} {child_rip+0} Code: 0f 0b 5d 63 11 a0 ff ff ff ff 7d 01 0f b7 5d 02 85 db 74 08 RIP {:ext3:dx_probe+427} RSP <00000104194738e8> <0>Kernel panic - not syncing: Oops This only happens when / (c0d0p6) has dir_index set with the latest HP cciss driver (cpq_cciss-2.6.16-6.x86_64). Regards, Ulf. 
--------------------------------------------------------------------- ATC-Onlane Inc., T: 650-532-6382, F: 650-532-6441 4600 Bohannon Drive, Suite 100, Menlo Park, CA 94025 --------------------------------------------------------------------- > -----Original Message----- > From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com] > On Behalf Of Ulf Zimmermann > Sent: 07/14/2007 20:33 > To: Christian Kujau > Cc: ext3-users at redhat.com > Subject: RE: Kernel panic in ext3:dx_probe, help needed > > > -----Original Message----- > > From: "evil at g-house.de"@mail.g-house.de > [mailto:"evil at g-house.de"@mail.g- > > house.de] On Behalf Of Christian Kujau > > Sent: Saturday, July 14, 2007 19:04 > > To: Ulf Zimmermann > > Cc: ext3-users at redhat.com > > Subject: RE: Kernel panic in ext3:dx_probe, help needed > > > > On Sat, 14 Jul 2007, Ulf Zimmermann wrote: > > >>> believe before the cciss driver is getting unloaded. The last > > >>> messages I am able to see are: > > >>> > > >>> md: stopping all md devices. > > >>> md: md0 switched to read-only mode. > > > > I think these messages are the real cause of the ext3 errors. > > > > > Ok, found more information. EL4 sets dir_index for / (cciss/c0d0p6 > as we > > > are installing it). The RedHat provided cciss driver (2.6.14-RH2) > has no > > > problem with that, the latest cciss driver from HP, 2.6.16-6, does. > > > Turning off dir_index for /, forcing fsck during reboot and > everything > > > is fine. > > > > A device driver should not care about filesystem features, IMHO. > Either > > there are problems with the cciss driver (syslog messages please) or > the > > ext3 fs is corrupted - in which case fsck should be run. > > I can reproduce this on 8+ servers, 6 of them were just installed > yesterday afternoon. Using "tune2fs -O ^dir_index /dev/cciss/c0d0p6" > followed by a "touch /forcefsck && reboot" leads to no panics are reboot > time. > > I have reported this to HP for now. > > Ulf. 
> > > _______________________________________________ > Ext3-users mailing list > Ext3-users at redhat.com > https://www.redhat.com/mailman/listinfo/ext3-users From ionel.gardais at tech-advantage.com Mon Jul 30 07:38:40 2007 From: ionel.gardais at tech-advantage.com (Ionel GARDAIS) Date: Mon, 30 Jul 2007 09:38:40 +0200 Subject: ext3 partition problems w/Apple Xserve RAID In-Reply-To: <46A1212B.7010501@berkeley.edu> References: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu> <46A1212B.7010501@berkeley.edu> Message-ID: <46AD9580.5080002@tech-advantage.com> Hi there, I'm running the same kind of configuration (RHEL ES 4 on a PowerEdge 2950 with a QLE2460 connected to a SanBox 5200 with 2 XServe RAID 10.5TB). The four 4.5TB slices are directly formatted in ext3, no partitions were created. Should I expect to get some data corruption on power failure? Thanks, Ionel Jon Forrest wrote: > Randy Martin wrote: >> I'm running Red Hat AS 4 on a Dell PowerEdge 1950. It connects to an >> Apple Xserve RAID via a Qlogic QLE2460 card. I am able to create a >> 4TB ext3 partition with no problems and use it fine. When the system >> power drops or it's rebooted, the file system can't be mounted >> again. It looks like the partition table is getting corrupted. Here >> is some of the doc I gathered: > >> Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes >> >> Disk label type: msdos > > Bingo. That's your problem. You have to use a gpt disk label > for partitions this large. I had the identical problem > and I was able to fix it without losing a single bit. > I described it in a posting to this group on 3/14/2007. > (I'll send it to you directly). > > Cordially, -- Ionel GARDAIS System-Network Engineer
From adilger at clusterfs.com Mon Jul 30 10:27:21 2007 From: adilger at clusterfs.com (Andreas Dilger) Date: Mon, 30 Jul 2007 04:27:21 -0600 Subject: ext3 partition problems w/Apple Xserve RAID In-Reply-To: <46AD9580.5080002@tech-advantage.com> References: <007401c7cb0d$d8a5b3e0$89f11ba0$@edu> <46A1212B.7010501@berkeley.edu> <46AD9580.5080002@tech-advantage.com> Message-ID: <20070730102721.GA5992@schatzie.adilger.int> On Jul 30, 2007 09:38 +0200, Ionel GARDAIS wrote: > I'm running the same kind of configuration (RHEL ES 4 on a PowerEdge > 2950 with a QLE2460 connected to a SanBox 5200 with 2 XServe RAID 10.5TB). > The four 4.5TB slices are directly formatted in ext3, no partitions were > created. > > Should I expect to get some data corruption on power failure? No, that's only if you have a DOS (msdos-label) partition table. We have lots of > 4TB ext3 filesystems w/o problem. > Jon Forrest wrote: > >Randy Martin wrote: > >>I'm running Red Hat AS 4 on a Dell PowerEdge 1950. It connects to an > >>Apple Xserve RAID via a Qlogic QLE2460 card. I am able to create a > >>4TB ext3 partition with no problems and use it fine. When the system > >>power drops or it's rebooted, the file system can't be mounted > >>again. It looks like the partition table is getting corrupted. Here > >>is some of the doc I gathered: > > > >>Disk geometry for /dev/sdb: 0.000-4292376.000 megabytes > >> > >>Disk label type: msdos > > > >Bingo. That's your problem. You have to use a gpt disk label > >for partitions this large. I had the identical problem > >and I was able to fix it without losing a single bit. > >I described it in a posting to this group on 3/14/2007. > >(I'll send it to you directly).
> >
> >Cordially,
>
> --
> Ionel GARDAIS
> System-Network Engineer

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From darkonc at gmail.com Tue Jul 31 04:05:13 2007
From: darkonc at gmail.com (Stephen Samuel)
Date: Mon, 30 Jul 2007 21:05:13 -0700
Subject: Is ext3 the right choice?
In-Reply-To: <2fb803240707271956m4c83b7eck30a686b46fea9eb0@mail.gmail.com>
References: <2fb803240707271956m4c83b7eck30a686b46fea9eb0@mail.gmail.com>
Message-ID: <6cd50f9f0707302105s200b9a0cm58516ecdcde11f20@mail.gmail.com>

The only way to be sure about what happened with these phones is to bring one in and have someone who understands filesystems check it out and see what's wrong with it. It could be a software error, or it could be a hardware error. Jumping to conclusions without testing those same conclusions could result in you chasing ghosts.

There have been some stories about flash drives failing because of too many rewrites in one location (caused by things like access time updates and journaling). That's probably something worth excluding before you presume that the cause is untimely power cycling.

There are, apparently, some filesystems that are purposely designed to be used on flash drives and the like. These are probably going to be far more useful to you than an ext3 filesystem. (Among other things, the ext3 filesystem is designed to implicitly minimize fragmentation, because fragmentation increases access times. This isn't as much of an issue with solid state devices, and there may be other things to worry about, like rewrite fatigue.)

On 7/27/07, Alvin Cao wrote:
> Dear All,
>
> Our mobile device, which runs Linux 2.4, uses ext3 as its filesystem.
> To make ext3 work, we have Samsung's xrs module, a middle layer which
> resembles MTD, to simulate disk devices over Samsung's onenand flash.
> Recently some of our phones are suffering a filesystem crash with only
> 30% space used on that partition. So I began to doubt whether it is
> right to employ a disk filesystem on an embedded system. It seems the
> kjournald kernel thread sends out an oops. Just assuming the xrs layer
> simulates perfectly a real disk device, I want to discuss what the
> disadvantages or advantages, if there are any, are in such a design.
>
> I think the point is that to keep ext3 safe, we must umount these
> devices cleanly before rebooting to let the kernel flush useful
> information to the disks. On a PC we don't do many reboots. Even if dirty
> reboots without umount happen, data are very likely to be recovered.
> And yet we have experienced administrators and utilities like e2fsck
> to resort to. But even then there are still chances that disks could
> fail.
>
> Embedded systems are quite different. Developers and customers could
> pull out the battery at any time. It's unpredictable. Consequently
> there is a much greater chance than on a PC that a disk failure
> happens. And we can't bet on the customers. Once the products are
> delivered to our customers, any disk failure, either recoverable (I
> think that's most cases) or unrecoverable, is unacceptable. We can't
> expect the customers to do what we are supposed to do.
>
> Guys, I really want to polish the products as much as I can. Please give
> your comments on what kind of risks we may take by using ext3 in such
> a design. And if you have rich experience of using ext3 in an embedded
> system, great, please feel free to share it. Any help is
> appreciated.
--
Stephen Samuel http://www.bcgreen.com
778-861-7641

From tytso at mit.edu Tue Jul 31 04:07:05 2007
From: tytso at mit.edu (Theodore Tso)
Date: Tue, 31 Jul 2007 00:07:05 -0400
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com> <5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com>
Message-ID: <20070731040705.GF25876@thunk.org>

On Sun, Jul 29, 2007 at 08:00:16PM -0700, Ulf Zimmermann wrote:
> Ok, I finally got a complete message of this panic:
>
> Assertion failure in dx_probe() at fs/ext3/namei.c:381:
> "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"

The filesystem got corrupted (and that's probably a device driver issue), but ext3 shouldn't have panic'ed the kernel. The assertion needs to be replaced by an ext3_error() call. I'll whip up a patch; thanks for bringing this to my attention.

- Ted

From ulf at atc-onlane.com Tue Jul 31 04:09:34 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Mon, 30 Jul 2007 21:09:34 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <20070731040705.GF25876@thunk.org>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com> <5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com> <20070731040705.GF25876@thunk.org>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>

> -----Original Message-----
> From: Theodore Tso [mailto:tytso at mit.edu]
> Sent: Monday, July 30, 2007 21:07
> To: Ulf Zimmermann
> Cc: Christian Kujau; ext3-users at redhat.com
> Subject: Re: Kernel panic in ext3:dx_probe, help needed
>
> On Sun, Jul 29, 2007 at 08:00:16PM -0700, Ulf Zimmermann wrote:
> > Ok, I finally got a complete message of this panic:
> >
> > Assertion failure in dx_probe() at fs/ext3/namei.c:381:
> > "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"
>
> The
filesystem got corrupted (and that's probably a device driver
> issue), but ext3 shouldn't have panic'ed the kernel. The assertion
> needs to be replaced by an ext3_error() call. I'll whip up a patch;
> thanks for bringing this to my attention.
>
> - Ted

Over 10 nodes, brand new install from kickstart. I can reproduce it every time. Use the EL4 Update 5 cciss driver, no problem, install HP provided driver, it panics at shutdown/reboot.

Turn off dir_index on root and use the HP driver, no problem either.

Ulf.

From ulf at atc-onlane.com Tue Jul 31 04:10:38 2007
From: ulf at atc-onlane.com (Ulf Zimmermann)
Date: Mon, 30 Jul 2007 21:10:38 -0700
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com><5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com><20070731040705.GF25876@thunk.org> <5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>
Message-ID: <5DE4B7D3E79067418154C49A739C1251022E3D39@msmpk01.corp.autc.com>

> -----Original Message-----
> From: ext3-users-bounces at redhat.com [mailto:ext3-users-bounces at redhat.com]
> On Behalf Of Ulf Zimmermann
> Sent: Monday, July 30, 2007 21:10
> To: Theodore Tso
> Cc: ext3-users at redhat.com
> Subject: RE: Kernel panic in ext3:dx_probe, help needed
>
> > -----Original Message-----
> > From: Theodore Tso [mailto:tytso at mit.edu]
> > Sent: Monday, July 30, 2007 21:07
> > To: Ulf Zimmermann
> > Cc: Christian Kujau; ext3-users at redhat.com
> > Subject: Re: Kernel panic in ext3:dx_probe, help needed
> >
> > On Sun, Jul 29, 2007 at 08:00:16PM -0700, Ulf Zimmermann wrote:
> > > Ok, I finally got a complete message of this panic:
> > >
> > > Assertion failure in dx_probe() at fs/ext3/namei.c:381:
> > > "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"
> >
> > The filesystem got corrupted (and that's probably a device driver
> > issue),
but ext3 shouldn't have panic'ed the kernel. The assertion
> > needs to be replaced by an ext3_error() call. I'll whip up a patch;
> > thanks for bringing this to my attention.
> >
> > - Ted
>
> Over 10 nodes, brand new install from kickstart. I can reproduce it
> every time. Use the EL4 Update 5 cciss driver, no problem, install HP
> provided driver, it panics at shutdown/reboot.
>
> Turn off dir_index on root and use the HP driver, no problem either.
>
> Ulf.

Or if there is corruption, it gets corrupted by the HP driver. Or something.

From tytso at mit.edu Tue Jul 31 06:16:46 2007
From: tytso at mit.edu (Theodore Tso)
Date: Tue, 31 Jul 2007 02:16:46 -0400
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com> <5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com> <20070731040705.GF25876@thunk.org> <5DE4B7D3E79067418154C49A739C1251022E3D38@msmpk01.corp.autc.com>
Message-ID: <20070731061646.GH25876@thunk.org>

On Mon, Jul 30, 2007 at 09:09:34PM -0700, Ulf Zimmermann wrote:
>
> Over 10 nodes, brand new install from kickstart. I can reproduce it
> every time. Use the EL4 Update 5 cciss driver, no problem, install HP
> provided driver, it panics at shutdown/reboot.

The assertion failure is a sanity check when looking up a directory entry via the htree. It triggers when the size field in the root node of the htree (i.e., the directory index) is larger than it could possibly be. This can happen only if the htree directory is corrupted, or the on-disk buffer cache of the directory is corrupted.

If you run e2fsck -f on the filesystem after you reboot, does it report any errors? If not, it's probably the buffer cache which is getting corrupted. This is probably more likely, if I had to guess.

> Turn off dir_index on root and use the HP driver, no problem either.
My guess is that the driver is doing something screwy at shutdown, corrupting some directory in the buffer cache in memory. As the shutdown scripts continue to execute, one of them accesses the corrupted directory, and this triggers the assertion failure. I agree it's odd that it is so repeatable, but the assertion that was generated is pretty clear about what caused it.

Without directory indexing enabled, the filesystem code is either not noticing the corruption, or it's printing a warning which is being ignored instead of causing an assertion failure.

My recommendation would be to file a bug report with HP about their device driver.

- Ted

From duaneg at dghda.com Tue Jul 31 11:19:22 2007
From: duaneg at dghda.com (Duane Griffin)
Date: Tue, 31 Jul 2007 12:19:22 +0100
Subject: Kernel panic in ext3:dx_probe, help needed
In-Reply-To: <20070731040705.GF25876@thunk.org>
References: <5DE4B7D3E79067418154C49A739C1251022E3CCC@msmpk01.corp.autc.com> <5DE4B7D3E79067418154C49A739C1251022E3D28@msmpk01.corp.autc.com> <20070731040705.GF25876@thunk.org>
Message-ID:

On 31/07/07, Theodore Tso wrote:
> On Sun, Jul 29, 2007 at 08:00:16PM -0700, Ulf Zimmermann wrote:
> > Ok, I finally got a complete message of this panic:
> >
> > Assertion failure in dx_probe() at fs/ext3/namei.c:381:
> > "dx_get_limit(entries) == dx_root_limit(dir, root->info.info_length)"
>
> The filesystem got corrupted (and that's probably a device driver
> issue), but ext3 shouldn't have panic'ed the kernel. The assertion
> needs to be replaced by an ext3_error() call. I'll whip up a patch;
> thanks for bringing this to my attention.

I've been looking at this very issue recently, following a Gentoo bug report. I have a patch ready that replaces the asserts with a fallback to a linear directory scan, following the example set by other parts of that code.
I've been waiting to get feedback from the bug reporters before sending it to you, but if you'd like to take a look at it you can find it here:

http://bugs.gentoo.org/show_bug.cgi?id=183207

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan
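
[Illustration, not part of the original thread.] The approach discussed above, treating a failed index sanity check as a recoverable condition and falling back to a linear directory scan instead of asserting, might be sketched roughly as follows. This is a toy model in Python, not the ext3 code and not Duane's patch: ToyDirectory, stored_limit, expected_limit, linear_lookup and lookup are all hypothetical names invented for the example, and the check only loosely mirrors the dx_get_limit()/dx_root_limit() comparison quoted in the assertion message.

```python
import warnings
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyDirectory:
    """Hypothetical stand-in for an indexed directory (not an ext3 structure)."""
    entries: List[Tuple[str, int]]  # (name, inode) pairs: the source of truth
    index: Dict[str, int]           # name -> position in entries (the "index")
    stored_limit: int               # limit value recorded in the index header
    expected_limit: int             # limit value computed from the layout


def linear_lookup(d: ToyDirectory, name: str) -> Optional[int]:
    """Walk every entry in order; slow, but does not depend on the index."""
    for entry_name, inode in d.entries:
        if entry_name == name:
            return inode
    return None


def lookup(d: ToyDirectory, name: str) -> Optional[int]:
    """Use the index when it passes the sanity check, else scan linearly."""
    if d.stored_limit != d.expected_limit:
        # Instead of asserting (and bringing everything down), report the
        # inconsistency and carry on with the slower but safe path.
        warnings.warn("directory index failed sanity check; using linear scan")
        return linear_lookup(d, name)
    pos = d.index.get(name)
    return d.entries[pos][1] if pos is not None else None
```

With a consistent header the indexed path is used; if stored_limit disagrees with expected_limit, the lookup still succeeds through the linear scan and only emits a warning.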