From Arun.Somasundaram at honeywell.com Tue Jun 5 16:31:00 2007 From: Arun.Somasundaram at honeywell.com (Somasundaram, Arun (IE10)) Date: Tue, 5 Jun 2007 22:01:00 +0530 Subject: Help on ext3 file system corruption issue Message-ID: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com> Hi All, I m a novice developer of Linux applications. Recently I faced a file system corruption. (I guess) I have a Kernel 2.4.7-10 with ext3 file system in compact flash. The system was up for 3 months and was running with average load conditions. One fine day, it just started sending kernel messages on the serial console. The message was like this. EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unab le to read inode block - inode=20089, block=81926 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20090, block=81926 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20091, block=81926 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20092, block=81926 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20093, block=81926 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20094, block=81926 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20095, block=81926 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20096, block=81926 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20097, block=81927 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20098, block=81927 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20099, block=81927 Assertion failure in do_get_write_access() at transaction.c:606: "!(((jh2bh(jh)) ->b_state & (1UL << BH_Lock)) != 0)" invalid operand: 0000 CPU: 0 
EIP: 0010:[] EFLAGS: 00010286 eax: 00000021 ebx: c171ce94 ecx: 00000001 edx: 0015991c esi: c171ce00 edi: c788cdc0 ebp: cb5f4690 esp: cf581bf0 ds: 0018 es: 0018 ss: 0018 Process syslogd (pid: 608, stackpage=cf581000) Stack: d0816990 0000025e 00000000 00000000 c171ce00 cf641820 c171ce94 c171ce00 c788cdc0 cb5f4690 d080edb5 c788cdc0 cb5f4690 00000000 00000000 000001ae cfc86400 cfa670e0 d081a0cb c788cdc0 cfa67140 cf581c60 c1448088 cfa66800 Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] Code: 0f 0b 5b 5e 8b 54 24 24 f6 42 10 04 bb e2 ff ff ff b8 01 00 hda: read_intr: status=0x51 { DriveReady SeekComplete Error } hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854 end_request: I/O error, dev 03:02 (hda), sector 163854 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20100, block=81927 hda: read_intr: status=0x51 { DriveReady SeekComplete Error } hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854 end_request: I/O error, dev 03:02 (hda), sector 163854 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20101, block=81927 hda: read_intr: status=0x51 { DriveReady SeekComplete Error } hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854 end_request: I/O error, dev 03:02 (hda), sector 163854 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20102, block=81927 hda: read_intr: status=0x51 { DriveReady SeekComplete Error } hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854 end_request: I/O error, dev 03:02 (hda), sector 163854 EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20103, block=81927 hda: read_intr: status=0x51 { DriveReady SeekComplete Error } hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, sector=163854 end_request: I/O error, dev 03:02 
(hda), sector 163854
EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read inode block - inode=20104, block=81927
30916

Some more information on this system, in case it has a bearing on this issue:

1. LILO boots the 2.4.7-10 kernel image with the option ide=nodma. Can this
   have any impact on these errors?

2. The system has a PostgreSQL database that writes at most 1 record every
   5 seconds. That is all the writing it does.

3. On restart, the fsck in the bootup scripts (rc.sysinit) could not resolve
   this. It said:

Checking filesystems
Could this be a zero-length partition?
fsck.ext3: Attempt to read block from filesystem resulted in short read while trying to open /dev/hda2
/dev/hda3: recovering journal
/dev/hda3: clean, 81/125488 files, 35684/500472 blocks
Checking all file systems.
[/sbin/fsck.ext3 -- /tmp] fsck.ext3 -a /dev/hda2
[/sbin/fsck.ext3 -- /tmp2] fsck.ext3 -a /dev/hda3
[FAILED]

*** An error occurred during the file system check.
*** Dropping you to a shell; the system will reboot
*** when you leave the shell.
Give root password for maintenance
(or type Control-D for normal startup):

I went ahead, gave the root password, and ran this command:

e2fsck -a -c -C 0 /dev/hda2

It said:

e2fsck: Attempt to read block from filesystem resulted in short read while trying to open /dev/hda2
Could this be a zero-length partition?

Please give your advice, as this problem has become a big-bang show stopper
for our product. Your advice will be very helpful for me to go ahead with
this issue.

Thanks in advance,
Arun S

From adilger at clusterfs.com Tue Jun 5 21:35:46 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 5 Jun 2007 15:35:46 -0600
Subject: Help on ext3 file system corruption issue
In-Reply-To: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
Message-ID: <20070605213546.GO5181@schatzie.adilger.int>

On Jun 05, 2007 22:01 +0530, Somasundaram, Arun (IE10) wrote:
> I have a Kernel 2.4.7-10 with ext3 file system in compact flash. The
> system was up for 3 months and was running with average load conditions.

Unless you have a support contract with some vendor, nobody will look at
bugs from such an old kernel. There are a hundred old bugs that might
have been fixed already.

> One fine day, it just started sending kernel messages on the serial
> console. The message was like this.
>
> hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854,
> sector=163854
> end_request: I/O error, dev 03:02 (hda), sector 163854
> EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read
> inode block - inode=20089, block=81926

This is likely a hardware error. Probably due to the fact that ext3
is not a good filesystem to use on CF because the journal is always
overwriting the same part of the CF device. Try something like JFFS2
instead.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From tod at gust.sr.unh.edu Tue Jun 5 22:47:59 2007
From: tod at gust.sr.unh.edu (Tod Hagan)
Date: Tue, 05 Jun 2007 18:47:59 -0400
Subject: Calculating stride values?
Message-ID: <1181083679.8077.23.camel@trop.sr.unh.edu>

All,

I have a question about calculating the value for the -E stride option
to mke2fs.

The mke2fs man page says

    stride=stripe-size
        Configure the filesystem for a RAID array with stripe-size
        filesystem blocks per stripe.
So stride = size of stripe/blocksize.

The size of a stripe is the RAID chunk size * the number of drives in the RAID.

My question: are parity disks included in the number of drives, or are
only data drives counted?

For example, take the example of six drives configured for RAID 5 with a
chunk size of 64 and a 4K blocksize:

1. Parity drive included: 64*6/4 = 96
2. Parity drive excluded: 64*5/4 = 80

Which is correct?

Thanks.

Tod

--
Tod Hagan
Information Technologist
AIRMAP/Climate Change Research Center
Institute for the Study of Earth, Oceans, and Space
University of New Hampshire
Durham, NH 03824
Phone: 603-862-3116

From adilger at clusterfs.com Tue Jun 5 23:23:11 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 5 Jun 2007 17:23:11 -0600
Subject: Calculating stride values?
In-Reply-To: <1181083679.8077.23.camel@trop.sr.unh.edu>
References: <1181083679.8077.23.camel@trop.sr.unh.edu>
Message-ID: <20070605232311.GT5181@schatzie.adilger.int>

On Jun 05, 2007 18:47 -0400, Tod Hagan wrote:
> The mke2fs man page says
>
> stride=stripe-size
> Configure the filesystem for a RAID array with stripe-size filesystem blocks per stripe.
>
> So stride = size of stripe/blocksize.
>
> The size of a stripe is the RAID chunk size * the number of drives in the RAID.

Not really. We submitted a patch to clarify this, so the "stride=" value
is the number of blocks on a SINGLE disk. This ensures that the bitmaps
are round-robined across all disks.

> For example, take the example of six drives configured for RAID 5 with a
> chunk size of 64 and a 4K blocksize:
>
> 1. Parity drive included: 64*6/4 = 96
> 2. Parity drive excluded: 64*5/4 = 80
>
> Which is correct?

-E stride=16, based on 64k / 4k

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
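
[Editorial illustration, not part of the original thread.] The arithmetic in the reply above can be sketched as follows. This is a minimal sketch that assumes the "chunk size of 64" in the question means 64 KiB, as the reply's "64k / 4k" suggests. The helper name stride_blocks is hypothetical and is not an mke2fs interface.

```python
def stride_blocks(chunk_kib: int, block_kib: int) -> int:
    """Filesystem blocks per RAID chunk on ONE disk (the "stride=" value).

    Hypothetical helper for illustration only.
    """
    if chunk_kib <= 0 or block_kib <= 0:
        raise ValueError("sizes must be positive")
    if chunk_kib % block_kib != 0:
        raise ValueError("chunk size must be a whole multiple of the block size")
    return chunk_kib // block_kib


# Example from the thread: RAID 5, 64 KiB chunk, 4 KiB filesystem blocks.
# The number of drives (six, with or without parity) does not enter the
# calculation, because the value describes a single disk.
stride = stride_blocks(chunk_kib=64, block_kib=4)
print(f"mke2fs -E stride={stride}")
```

With the numbers from the question this prints "mke2fs -E stride=16", which matches the value given in the reply. Neither 96 nor 80 applies, since both multiply by a drive count.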
From darkonc at gmail.com Tue Jun 5 23:37:50 2007 From: darkonc at gmail.com (Stephen Samuel) Date: Tue, 5 Jun 2007 16:37:50 -0700 Subject: Help on ext3 file system corruption issue In-Reply-To: <20070605213546.GO5181@schatzie.adilger.int> References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com> <20070605213546.GO5181@schatzie.adilger.int> Message-ID: <6cd50f9f0706051637o69c72f8qb7e186fa31d2ebd9@mail.gmail.com> On 6/5/07, Andreas Dilger wrote: > > On Jun 05, 2007 22:01 +0530, Somasundaram, Arun (IE10) wrote: > > I have a Kernel 2.4.7-10 with ext3 file system in compact flash. The > > system was up for 3 months and was running with average load conditions. > > Unless you have a support contract with some vendor, nobody will look at > bugs from such an old kernel. There are a hundred old bugs that might > have been fixed already. There are extensions to DD that will, on an error allow you to skip over the block(s) in error while zeroing (instead of just ignoring) the blocks on the output... (I think that Knoppix has such a version of DD, if that'll help you) This means that everything that can be read will be where (relative to the start of your recovery partition or file ) ext3fs is expecting to find it. Once you find the recovery DD, use it to copy your filesystem to a hard drive or whatever, You can then recover your data and then --- presuming andreas is right -- you'll have to replace your flash, and then put a filesystem on it that's more conducive to how flash works. > hda: read_intr: status=0x51 { DriveReady SeekComplete Error } > > > > hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854, > > sector=163854 > > > > end_request: I/O error, dev 03:02 (hda), sector 163854 > > > > EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unab > > > > le to read inode block - inode=20089, block=81926 > > This is likely a hardware error. 
Probably due to the fact that ext3
> is not a good filesystem to use on CF because the journal is always
> overwriting the same part of the CF device. Try something like JFFS2
> instead.
>
> Cheers, Andreas
>

--
Stephen Samuel http://www.bcgreen.com
778-861-7641

From Arun.Somasundaram at honeywell.com Wed Jun 6 10:18:23 2007
From: Arun.Somasundaram at honeywell.com (Somasundaram, Arun (IE10))
Date: Wed, 6 Jun 2007 15:48:23 +0530
Subject: Help on ext3 file system corruption issue
In-Reply-To: <20070605213546.GO5181@schatzie.adilger.int>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
	<20070605213546.GO5181@schatzie.adilger.int>
Message-ID: <1E675F21DFB0C74294A0FCA4987BE45F724FE1@IE10EV811.global.ds.honeywell.com>

Hi Andreas,

Thanks for your reply.

Are there any patches in this kernel for these ext3 bugs? Please guide
me to these patches if available.

What is a suitable file system for compact flash? I read that JFFS2 is
suitable for raw NAND flash cards and is not suitable for CF cards. Is
that true?

My application writes to the CF card as often as every 3 seconds (worst
case).

Thanks,
Arun

-----Original Message-----
From: Andreas Dilger [mailto:adilger at clusterfs.com]
Sent: Wednesday, June 06, 2007 3:06 AM
To: Somasundaram, Arun (IE10)
Cc: ext3-users at redhat.com
Subject: Re: Help on ext3 file system corruption issue

On Jun 05, 2007 22:01 +0530, Somasundaram, Arun (IE10) wrote:
> I have a Kernel 2.4.7-10 with ext3 file system in compact flash. The
> system was up for 3 months and was running with average load conditions.

Unless you have a support contract with some vendor, nobody will look at
bugs from such an old kernel. There are a hundred old bugs that might
have been fixed already.

> One fine day, it just started sending kernel messages on the serial
> console. The message was like this.
>
> hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=667854,
> sector=163854
> end_request: I/O error, dev 03:02 (hda), sector 163854
> EXT3-fs error (device ide0(3,2)): ext3_get_inode_loc: unable to read
> inode block - inode=20089, block=81926

This is likely a hardware error. Probably due to the fact that ext3
is not a good filesystem to use on CF because the journal is always
overwriting the same part of the CF device. Try something like JFFS2
instead.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From duaneg at dghda.com Wed Jun 6 12:22:06 2007
From: duaneg at dghda.com (Duane Griffin)
Date: Wed, 6 Jun 2007 13:22:06 +0100
Subject: Help on ext3 file system corruption issue
In-Reply-To: <6cd50f9f0706051637o69c72f8qb7e186fa31d2ebd9@mail.gmail.com>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
	<20070605213546.GO5181@schatzie.adilger.int>
	<6cd50f9f0706051637o69c72f8qb7e186fa31d2ebd9@mail.gmail.com>
Message-ID:

On 06/06/07, Stephen Samuel wrote:
> There are extensions to DD that will, on an error allow you to skip over the
> block(s) in error while zeroing (instead of just ignoring) the blocks on the
> output...

I use ddrescue for this. It works very well.

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

From tod at gust.sr.unh.edu Wed Jun 6 15:23:48 2007
From: tod at gust.sr.unh.edu (Tod Hagan)
Date: Wed, 06 Jun 2007 11:23:48 -0400
Subject: Calculating stride values?
In-Reply-To: <20070605232311.GT5181@schatzie.adilger.int>
References: <1181083679.8077.23.camel@trop.sr.unh.edu>
	<20070605232311.GT5181@schatzie.adilger.int>
Message-ID: <1181143428.17547.5.camel@trop.sr.unh.edu>

On Tue, 2007-06-05 at 17:23 -0600, Andreas Dilger wrote:
> Not really.
We submitted a patch to clarify this, so the "stride=" value
> is the number of blocks on a SINGLE disk. This ensures that the bitmaps
> are round-robined across all disks.
>
> > For example, take the example of six drives configured for RAID 5 with a
> > chunk size of 64 and a 4K blocksize:
>
> -E stride=16, based on 64k / 4k

Thanks for clearing this up. The number of blocks for a single disk
means you don't have to worry about parity drives, so it's much easier
to deal with.

And good to hear about the clarifying patch, as I'm not the only person
confused by this -- currently, the information on the link below is
wrong:

http://wiki.centos.org/HowTos/Disk_Optimization

Tod

--
Tod Hagan
Information Technologist
AIRMAP/Climate Change Research Center
Institute for the Study of Earth, Oceans, and Space
University of New Hampshire
Durham, NH 03824
Phone: 603-862-3116

From lanzi at quantentunnel.de Thu Jun 7 16:56:23 2007
From: lanzi at quantentunnel.de (Jürgen Landsmann)
Date: Thu, 07 Jun 2007 18:56:23 +0200
Subject: Crashed ext3-filesystem
Message-ID: <466838B7.6060800@quantentunnel.de>

Hi!

We have a server still running Debian 3.0 (Woody) that nobody likes to
touch for maintenance ... ;)

Our home-directories are located on a separate HDD (30GB, 1 large
primary ext3 partition) and until yesterday it worked correctly. Because
the partition was nearly full we had to enlarge our home-space by moving
it to a larger HDD.

We decided to try a copy of the whole partition by using gparted from the
"SystemRescueCd" (http://www.sysresccd.org). The first try failed,
because the FS-type of our home-partition wasn't recognized. A second
boot was tried and this time the FS-type was recognized correctly, so we
started copying the partition to another HDD (80GB). After about 40% to
50% the copy-procedure crashed and left our system in an unusable state.
The only way to use the system again was to press the "Reset"-button.
Because we thought that this crash was caused by an error in the
APM-functionality, we tried it once more by booting the kernel using the
"noapm" parameter. But even this try crashed ...

After we rebooted again I mounted the source-partition to check its
content. But all I found were three files visible on the partition.
The directory containing the userfiles was completely gone and in
"lost+found" there are hundreds of items. After this horrifying
discovery I unmounted the partition and subscribed to this mailing-list
... ;)

Unfortunately all of our webserver-files were also located in this
directory ...

Now my question: Is there any possibility to restore my directory
(completely or at least partially)?

Thanx in advance for your help!!!

Bye
Juergen

From lists at nerdbynature.de Sat Jun 9 12:03:36 2007
From: lists at nerdbynature.de (Christian Kujau)
Date: Sat, 9 Jun 2007 14:03:36 +0200 (CEST)
Subject: Crashed ext3-filesystem
In-Reply-To: <466838B7.6060800@quantentunnel.de>
References: <466838B7.6060800@quantentunnel.de>
Message-ID:

On Thu, 7 Jun 2007, Jürgen Landsmann wrote:
> Our home-directories are located on a separate HDD (30GB, 1 large primary
> ext3 partition) and until yesterday it worked correctly.

...30 GB and no backups?

> Because the
> partition was nearly full we had to enlarge our home-space by moving it to a
> larger HDD.
> We decided to try a copy of whole partition by using gparted from the
> "SystemRescueCd" (http://www.sysresccd.org).

Why would you do this? What's wrong with tar/cp?

> Because we thought that the reason of this crash was caused by an error in
> the APM-functionality we tried it once more by booting the kernel using the
> "noapm" parameter. But even this try crashed ...

Any more details regarding the crashes? Log messages, sysrq-t available?

> The directory containing the userfiles was completely gone and in "lost+found"
> there are hundreds of items.

Ouch :( Not much you can do here.
I'd take a first look with "file /lost+found/*" to see if there's
something useful in there. ext2/3-recovery tools are out there, but I
guess you'll have to try a few and see if they can recover anything:

- e2undel, recover (both available as debian packages in unstable)
- R-Linux, a free (as in beer) recovery tool for win32 (works pretty
  good though)
- ...and then there's always grep(1) & friends :(

hth,
Christian.
--
make bzImage, not war

From lanzi at quantentunnel.de Fri Jun 15 09:08:55 2007
From: lanzi at quantentunnel.de (Jürgen Landsmann)
Date: Fri, 15 Jun 2007 11:08:55 +0200
Subject: Crashed ext3-filesystem
In-Reply-To:
References: <466838B7.6060800@quantentunnel.de>
Message-ID: <46725727.8010508@quantentunnel.de>

Christian Kujau wrote:
> On Thu, 7 Jun 2007, Jürgen Landsmann wrote:
>> Our home-directories are located on a separate HDD (30GB, 1 large
>> primary ext3 partition) and until yesterday it worked correctly.
>
> ...30 GB and no backups?

plz don't ask why! ;)

>
>> Because the partition was nearly full we had to enlarge our home-space
>> by moving it to a larger HDD.
>> We decided to try a copy of whole partition by using gparted from the
>> "SystemRescueCd" (http://www.sysresccd.org).
>
> Why would you do this? What's wrong with tar/cp?

It was just a try and the crashed partition wasn't even mounted. The
partition must have had an unrecognized error!

> - e2undel, recover (both available as debian packages in unstable)
> - R-Linux, a free (as in beer) recovery tool for win32 (works pretty
> good though)
> - ...and then there's always grep(1) & friends :(

Some items listed there with the type "file" are directories and others
listed as "directory" are files. I already tried to find some files by
using grep, find, cat (...) but no chance!

I will try some of the utilities mentioned above, but I don't think
there is any possibility to get some data back ...

Thanx for your hints!!

Bye
J.
Landsmann

From public at miernik.name Sat Jun 16 00:41:42 2007
From: public at miernik.name (Miernik)
Date: Sat, 16 Jun 2007 02:41:42 +0200
Subject: Help on ext3 file system corruption issue
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com>
	<20070605213546.GO5181@schatzie.adilger.int>
Message-ID: <20070616004142.6C4A.0.NOFFLE@debian107.local>

Andreas Dilger wrote:
> This is likely a hardware error. Probably due to the fact that ext3
> is not a good filesystem to use on CF because the journal is always
> overwriting the same part of the CF device. Try something like JFFS2
> instead.

Isn't CF always wear-levelled internally, so it shouldn't matter, and the
internal compact flash controller will take care not to write to the
same physical chip all the time?

I wonder, because I recently had two CF cards, used as root disk in a
CF-IDE adapter, go bad. One had unrecoverable bad sectors (ext3 couldn't
be used on it; it was only 32 sectors = 16 kB, but still I couldn't use
the card at all, because these sectors were coming back over and over
again, as if the CF was remapping them somewhere else, and ext3, not
knowing about that, jumped on them again, and so on, very strange). Then
a second card got completely destroyed in just a couple of months of
standard desktop usage as root filesystem. I didn't use swap on any of
the cards, and /home was also somewhere else, so there was no frequently
changing data.

--
Miernik
http://miernik.name/

From public at miernik.name Sat Jun 16 00:56:49 2007
From: public at miernik.name (Miernik)
Date: Sat, 16 Jun 2007 02:56:49 +0200
Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files
Message-ID: <20070616005649.6C4A.1.NOFFLE@debian107.local>

I recently bought 2 different USB flash disks. These are some cheap
no-name devices.
Their parameters:

bytes        C/H/S        ID
4194304512   509/255/63   Vendor: Generic  Model: USB Flash Drive  Rev: 1.00
                          ANSI SCSI revision: 02
4288676352   1023/132/62  Vendor: USB  Model: USB 2.0  Rev: 1.00
                          ANSI SCSI revision: 02

When I put a FAT32 filesystem on them, everything is OK. When I put an
ext3 filesystem on them, everything is OK while I write files to the
disk, and I can fill it with files. But when I remove the disk from the
computer (after a proper umount) and put it in again, most of the files
have corrupted directory entries (they look red in midnight commander,
some of them pink). But some files (about 5 to 10%) are normal, and
normally accessible.

I tried them both on two completely different computers with very
different hardware, and different Linux versions, and the effect is the
same.

One of the computers is a desktop with an old AMD K7 Clayton motherboard
with only old USB1.1: VT82xxxxx UHCI USB 1.1 Controller, and Debian sid
with 2.6.18-4-k7 kernel from Debian. The other computer is a much newer
AMD Athlon64 HP laptop with a USB2.0 port and SuSE 10.2.

Did anyone observe anything similar with any USB flash drives (FAT OK,
ext3 corrupted)?

I've put files on these disks on FAT32 and ran fsck.vfat, and everything
looks fine:

root at tarnica:~# dosfsck -v /dev/sda1
dosfsck 2.11 (12 Mar 2005)
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
Checking we can access the last sector of the filesystem
Boot sector contents:
System ID "mkdosfs"
Media byte 0xf8 (hard disk)
512 bytes per logical sector
4096 bytes per cluster
32 reserved sectors
First FAT starts at byte 16384 (sector 32)
2 FATs, 32 bit entries
4177920 bytes per FAT (= 8160 sectors)
Root directory start at cluster 2 (arbitrary size)
Data area starts at byte 8372224 (sector 16352)
1044477 data clusters (4278177792 bytes)
62 sectors/track, 132 heads
0 hidden sectors
8372170 sectors total
Checking for unused clusters.
Checking free cluster summary.
/dev/sda1: 121 files, 116397/1044477 clusters
root at tarnica:~# echo "$?"
0 root at tarnica:~# With FAT I can read any file I saved to the disk just fine, it simply works. Maybe you'd like my 'lsusb -v' (this is on the USB1.1-only Debian machine): Bus 002 Device 001: ID 0000:0000 Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 1.10 bDeviceClass 9 Hub bDeviceSubClass 0 Unused bDeviceProtocol 0 Full speed hub bMaxPacketSize0 64 idVendor 0x0000 idProduct 0x0000 bcdDevice 2.06 iManufacturer 3 Linux 2.6.18-4-k7 uhci_hcd iProduct 2 UHCI Host Controller iSerial 1 0000:00:07.3 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 25 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0xe0 Self Powered Remote Wakeup MaxPower 0mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 9 Hub bInterfaceSubClass 0 Unused bInterfaceProtocol 0 Full speed hub iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0002 1x 2 bytes bInterval 255 Hub Descriptor: bLength 9 bDescriptorType 41 nNbrPorts 2 wHubCharacteristic 0x000a No power switching (usb 1.0) Per-port overcurrent protection bPwrOn2PwrGood 1 * 2 milli seconds bHubContrCurrent 0 milli Ampere DeviceRemovable 0x00 PortPwrCtrlMask 0xff Hub Port Status: Port 1: 0000.0300 lowspeed power Port 2: 0000.0300 lowspeed power Device Status: 0x0003 Self Powered Remote Wakeup Enabled Bus 001 Device 002: ID 1043:8012 iCreate Technologies Corp. Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x1043 iCreate Technologies Corp. 
idProduct 0x8012 bcdDevice 1.00 iManufacturer 1 USB iProduct 2 USB 2.0 iSerial 0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 32 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 100mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 8 Mass Storage bInterfaceSubClass 6 SCSI bInterfaceProtocol 80 Bulk (Zip) iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Device Qualifier (for other device speed): bLength 10 bDescriptorType 6 bcdUSB 2.00 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 bNumConfigurations 1 Device Status: 0x0000 (Bus Powered) Bus 001 Device 001: ID 0000:0000 Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 1.10 bDeviceClass 9 Hub bDeviceSubClass 0 Unused bDeviceProtocol 0 Full speed hub bMaxPacketSize0 64 idVendor 0x0000 idProduct 0x0000 bcdDevice 2.06 iManufacturer 3 Linux 2.6.18-4-k7 uhci_hcd iProduct 2 UHCI Host Controller iSerial 1 0000:00:07.2 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 25 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0xe0 Self Powered Remote Wakeup MaxPower 0mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 9 Hub bInterfaceSubClass 0 Unused bInterfaceProtocol 0 Full speed hub iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data 
wMaxPacketSize 0x0002 1x 2 bytes bInterval 255 Hub Descriptor: bLength 9 bDescriptorType 41 nNbrPorts 2 wHubCharacteristic 0x000a No power switching (usb 1.0) Per-port overcurrent protection bPwrOn2PwrGood 1 * 2 milli seconds bHubContrCurrent 0 milli Ampere DeviceRemovable 0x00 PortPwrCtrlMask 0xff Hub Port Status: Port 1: 0000.0103 power enable connect Port 2: 0000.0100 power Device Status: 0x0003 Self Powered Remote Wakeup Enabled Let me give you some more diag. Here is what I did: Having read that too large max_sectors sometimes gives problems, I did: root at tarnica:~# echo "64" > /sys/block/sda/device/max_sectors But before I tried without reducing max_sectors from the default - no difference. root at tarnica:~# mkfs.ext3 /dev/sda1 mke2fs 1.40-WIP (07-Apr-2007) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) 523264 inodes, 1046521 blocks 52326 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=1073741824 32 block groups 32768 blocks per group, 32768 fragments per group 16352 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736 Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: done This filesystem will be automatically checked every 24 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. root at tarnica:~# mount /dev/sda1 /mnt/sda1 No errors up till here. root at tarnica:~# cp -dr /usr/share/doc/ /mnt/sda1/ Here we get this error in kern.log: Jun 16 00:47:09 tarnica kernel: scsi0: PCI error Interrupt at seqaddr = 0x8 Jun 16 00:47:09 tarnica kernel: scsi0: Data Parity Error Detected during address or write data phase root at tarnica:~# sync Later I did an 'find -ls' in the /mnt/sda1/ directory, then 'umount /mnt/sda1' and then 'mount /dev/sda1 /mnt/sda1' again. 
The above commands caused that to appear in the output of 'dmesg': EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #11: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0 Aborting journal on device sda1. EXT3-fs error (device sda1) in ext3_ordered_writepage: IO failure ext3_abort called. EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal Remounting filesystem read-only __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #11: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0 EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #11: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0 __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_committed_data __journal_remove_journal_head: freeing b_frozen_data __journal_remove_journal_head: freeing b_committed_data kjournald starting. 
Commit interval 5 seconds EXT3-fs warning (device sda1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure EXT3-fs warning (device sda1): ext3_clear_journal_err: Marking fs in need of filesystem check. EXT3-fs warning: mounting fs with errors, running e2fsck is recommended EXT3 FS on sda1, internal journal EXT3-fs: recovery complete. EXT3-fs: mounted filesystem with ordered data mode. At this point: root at tarnica:~# ls -al /mnt/sda1 total 24 drwxr-xr-x 4 root root 4096 2007-06-16 00:47 . drwxr-xr-x 27 root root 4096 2007-06-09 09:25 .. drwx------ 2 root root 16384 2007-06-16 00:37 lost+found ?--------- ? ? ? ? ? /mnt/sda1/doc root at tarnica:~# So I did 'umount /mnt/sda1' and 'e2fsck -v -y /dev/sda1', which runs endlessly with a zillion errors, for example: Inode 133561 has compression flag set on filesystem without compression support. Clear? yes Inode 133561 has illegal block(s). Clear? yes Illegal block #0 (189057594) in inode 133561. CLEARED. Illegal block #1 (3559149010) in inode 133561. CLEARED. Illegal block #2 (4279737499) in inode 133561. CLEARED. Illegal block #3 (362979125) in inode 133561. CLEARED. Illegal block #4 (3152073428) in inode 133561. CLEARED. Illegal block #5 (679595262) in inode 133561. CLEARED. Illegal block #6 (1924390837) in inode 133561. CLEARED. Illegal block #7 (1058295063) in inode 133561. CLEARED. Illegal block #8 (795243680) in inode 133561. CLEARED. Illegal block #9 (3130620932) in inode 133561. CLEARED. Illegal block #10 (1544529913) in inode 133561. CLEARED. Too many illegal blocks in inode 133561. Clear inode? yes or: Inode 134287 has compression flag set on filesystem without compression support. Clear? yes Inode 134287, i_size is 7400753060221116605, should be 0. Fix? yes Inode 134287, i_blocks is 2017258698, should be 0. Fix? yes Inode 134303 has compression flag set on filesystem without compression support. Clear? yes Inode 134303, i_size is 7400753060221116605, should be 0. Fix? 
yes
Inode 134303, i_blocks is 2017258698, should be 0. Fix? yes

or:

Inode 132391 has imagic flag set. Clear? yes
Special (device/socket/fifo) inode 132391 has non-zero size. Fix? yes
Inode 132392 is in use, but has dtime set. Fix? yes
Inode 132392 has imagic flag set. Clear? yes

and other types of errors in turn, from time to time printing:

Restarting e2fsck from the beginning...
/dev/sda1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes

and it seems this goes on forever. Nothing shows up in 'dmesg', kern.log, 'cat /proc/kmsg' or any other log file or output while the filesystem check is running.

I would suspect broken hardware, but this happens on two different 4 GB USB sticks, from two different sources, one used, one new, on different computers, so it's quite unlikely that both of these sticks are bad. Besides, they work perfectly with FAT32. It is also unlikely that such different USB controllers on two different computers would both be bad at the same time.

Similar symptoms are described in this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=404486

But I have never seen such symptoms with HDDs, CF cards (also 4 GB ones, on the same machine) or 256 MB USB flash disks.

Any clues?

--
Miernik
http://miernik.name/

From jamarconi at sbcglobal.net Sat Jun 16 13:17:37 2007
From: jamarconi at sbcglobal.net (John Marconi)
Date: Sat, 16 Jun 2007 08:17:37 -0500
Subject: kjournald hang on ext3 to ext3 copy
Message-ID: <4673E2F1.2090704@sbcglobal.net>

All,

I am running into a situation in which one of my ext3 filesystems hangs during normal usage. There are three ext3 filesystems on a CompactFlash card. One is mounted as / and one as /tmp. In my test, I am copying a 100 MB file from /root to /tmp repeatedly. While doing this test, I eventually see the copying stop, and any attempt to access /tmp fails: even a plain ls /tmp hangs.
I suspect kjournald because of the following ps output:

  PID  PPID WCHAN:20             PCPU %MEM PSR COMM
 8847    99 start_this_handle     1.1  0.0  28 pdflush
 8853    99 schedule_timeout      0.2  0.0   7 pdflush
  188     1 kswapd                0.0  0.0  19 kswapd0
 8051     1 mtd_blktrans_thread   0.0  0.0  22 mtdblockd
 8243     1 kjournald             0.0  0.0   0 kjournald
 8305     1 schedule_timeout      0.0  0.0   2 udevd
 8378     1 kjournald             0.0  0.0   0 kjournald
 8379     1 journal_commit_trans 16.6  0.0   0 kjournald
 8437     1 schedule_timeout      0.0  0.0   0 evlogd
 8527     1 syslog                0.0  0.0   1 klogd
 8534     1 schedule_timeout      0.0  0.0   0 portmap
 8569     1 schedule_timeout      0.0  0.0   0 rngd
 8639     1 schedule_timeout      0.1  0.0  24 sshd
 8741  8639 schedule_timeout      0.0  0.0   0 sshd
 8743  8741 wait                  0.0  0.0   9 bash
 8857  8743 schedule_timeout      4.9  0.0   7 cp
 8664     1 schedule_timeout      0.0  0.0   0 xinetd
 8679     1 schedule_timeout      0.0  0.0   0 evlnotifyd
 8689     1 schedule_timeout      0.0  0.0   0 evlactiond
 8704     1 wait                  0.0  0.0   1 bash
 8882  8704 -                     0.0  0.0   2 ps

If I run ps repeatedly, I always see process 8379 in journal_commit_transaction, and it is always using between 12% and 20% of processor 0. This process never completes. I also see process 8847 sitting in start_this_handle forever, so I believe the two are related.

This system is using a 2.6.14 kernel. Has anyone seen this type of behaviour before? Note that if I change /tmp to ext2 I never see this issue; it only happens when /tmp is mounted as ext3.
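The "run ps repeatedly" check described above could be automated roughly as follows. This is only a sketch: the function names, the watch list (the two wait channels mentioned in this message) and the sampling interval are illustrative assumptions, and it presumes a procps-style ps that accepts `-eo pid,wchan:20,comm`.

```python
import subprocess
import time

# Wait channels taken from the report above; purely illustrative defaults.
WATCH = ("journal_commit_trans", "start_this_handle")


def parse_ps(text):
    """Map PID -> WCHAN from `ps -eo pid,wchan:20,comm` output (header line skipped)."""
    table = {}
    for line in text.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit():
            table[int(fields[0])] = fields[1]
    return table


def stuck_pids(samples, watch=WATCH):
    """Return {pid: wchan} for PIDs seen in the same watched wait channel in every sample."""
    if not samples:
        return {}
    first = samples[0]
    return {
        pid: wchan
        for pid, wchan in first.items()
        if wchan in watch and all(s.get(pid) == wchan for s in samples)
    }


def sample_ps(count=5, delay=2.0):
    """Collect `count` ps snapshots, `delay` seconds apart (hypothetical helper)."""
    samples = []
    for _ in range(count):
        out = subprocess.run(
            ["ps", "-eo", "pid,wchan:20,comm"],
            capture_output=True, text=True, check=True,
        ).stdout
        samples.append(parse_ps(out))
        time.sleep(delay)
    return samples
```

Calling `stuck_pids(sample_ps())` on the affected system would list the processes that never left one of the watched wait channels during the sampling window. A process that shows up there is only a hint, not proof of a hang, since a busy but healthy journal thread can also be caught in the same function several times in a row.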
Thank you,
John

From tytso at mit.edu Sat Jun 16 15:57:29 2007
From: tytso at mit.edu (Theodore Tso)
Date: Sat, 16 Jun 2007 11:57:29 -0400
Subject: Help on ext3 file system corruption issue
In-Reply-To: <20070616004142.6C4A.0.NOFFLE@debian107.local>
References: <1E675F21DFB0C74294A0FCA4987BE45F724C17@IE10EV811.global.ds.honeywell.com> <20070605213546.GO5181@schatzie.adilger.int> <20070616004142.6C4A.0.NOFFLE@debian107.local>
Message-ID: <20070616155728.GA5351@thunk.org>

On Sat, Jun 16, 2007 at 02:41:42AM +0200, Miernik wrote:
> Isn't CF always wear-levelled internally, so it shouldn't matter, and the
> internal compact flash controller will take care not to write to the
> same physical chip all the time?

Cards do seem to differ in overall quality and in the quality of their wear levelling algorithms (some of which I believe are patented, but I'm not an expert in this area).

> I wonder, because I recently had two CF cards used as root disk in a
> CF-IDE adapter go bad, one with unrecoverable bad sectors (ext3 couldn't
> be used on it, it was only 32 sectors = 16 kB, but still I couldn't use
> the card at all, because these sectors were coming back over and over
> again, as if the CF was remapping them somewhere else, and ext3, not
> knowing about that, stumbled upon them again, and so on, very strange). Then
> a second card got completely destroyed in just a couple of months of standard
> desktop usage as root filesystem. I didn't use swap on any of the cards,
> /home was also somewhere else, no frequently changing data.

Did you mount the filesystems with the noatime mount option? If not, then there was probably a huge number of writes to the CF caused by the last access time getting updated.
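For anyone wanting to check this on their own system, here is a minimal sketch that scans text in /proc/mounts format for ext2/ext3 filesystems mounted without noatime. The function name and the example mount table are made up for illustration; they are not taken from the systems discussed in this thread.

```python
def mounts_without_noatime(mounts_text, fstypes=("ext2", "ext3")):
    """Return (device, mountpoint) pairs whose option list lacks 'noatime'.

    `mounts_text` is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        if fstype in fstypes and "noatime" not in options.split(","):
            missing.append((device, mountpoint))
    return missing


# Hypothetical mount table, for illustration only.
EXAMPLE = (
    "/dev/hde1 / ext3 rw 0 0\n"
    "/dev/hde2 /home ext3 rw,noatime 0 0\n"
    "proc /proc proc rw 0 0\n"
)
```

On a live system the input would come from reading /proc/mounts, for example `mounts_without_noatime(open("/proc/mounts").read())`. Any filesystem it reports can be remounted with `mount -o remount,noatime <mountpoint>`, or given the noatime option in /etc/fstab.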
Regards, - Ted From public at miernik.name Sat Jun 16 16:11:50 2007 From: public at miernik.name (Miernik) Date: Sat, 16 Jun 2007 18:11:50 +0200 Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files References: <20070616005649.6C4A.1.NOFFLE@debian107.local> Message-ID: <20070616161150.6FBD.0.NOFFLE@debian107.local> Posting now to two lists, one about USB and the other about ext3 as I don't know what is the source of the problem. Miernik wrote: > I recently bought 2 different USB flash disks. These are some cheap no-name > devices. Their parameters: > > bytes C/H/S ID > 4288676352 1023/132/62 Vendor: USB Model: USB 2.0 Rev: 1.00 ANSI SCSI revision: 02 Right now after trying to copy about 0.5 GB of files to a freshly created ext3 filesystem on the device, this is the output of dmesg: ontroller doesn't have AUX irq; using default 12 serio: i8042 KBD port at 0x60,0x64 irq 1 mice: PS/2 mouse device common for all mice TCP bic registered NET: Registered protocol family 1 NET: Registered protocol family 17 Using IPI No-Shortcut mode Freeing unused kernel memory: 212k freed input: AT Translated Set 2 keyboard as /class/input/input0 ACPI: CPU0 (power states: C1[C1] C2[C2]) ACPI: Processor [CPU0] (supports 2 throttling states) usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb USB Universal Host Controller Interface driver v3.0 ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11 PCI: setting IRQ 11 as level-triggered ACPI: PCI Interrupt 0000:00:07.2[D] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11 uhci_hcd 0000:00:07.2: UHCI Host Controller uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1 uhci_hcd 0000:00:07.2: irq 11, io base 0x0000d400 usb usb1: configuration #1 chosen from 1 choice hub 1-0:1.0: USB hub found hub 1-0:1.0: 2 ports detected Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override 
with idebus=xx ACPI: PCI Interrupt 0000:00:07.3[D] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11 uhci_hcd 0000:00:07.3: UHCI Host Controller uhci_hcd 0000:00:07.3: new USB bus registered, assigned bus number 2 uhci_hcd 0000:00:07.3: irq 11, io base 0x0000d800 usb usb2: configuration #1 chosen from 1 choice hub 2-0:1.0: USB hub found hub 2-0:1.0: 2 ports detected 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004) SCSI subsystem initialized VP_IDE: IDE controller at PCI slot 0000:00:07.1 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci0000:00:07.1 ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:pio ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:pio, hdd:pio Probing IDE interface ide0... Time: acpi_pm clocksource has been installed. hda: ST33210A, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 Probing IDE interface ide1... 8139cp 0000:00:08.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip 8139cp 0000:00:08.0: Try the "8139too" driver instead. ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [LNKD] -> GSI 11 (level, low) -> IRQ 11 libata version 2.20 loaded. 8139too Fast Ethernet driver 0.9.28 scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0 aic7870: Single Channel A, SCSI Id=7, 16/253 SCBs ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10 PCI: setting IRQ 10 as level-triggered ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 10 (level, low) -> IRQ 10 eth0: RealTek RTL8139 at 0xe800, 00:02:44:29:57:bf, IRQ 10 eth0: Identified 8139 chip type 'RTL-8139C' hda: max request size: 128KiB hda: 6346368 sectors (3249 MB) w/256KiB Cache, CHS=6296/16/63, UDMA(33) hda: cache flushes not supported hda: hda1 hda2 < hda5 > scsi 0:0:2:0: Processor HP C5110A 3638 PQ: 0 ANSI: 2 target0:0:2: Beginning Domain Validation target0:0:2: Ending Domain Validation kjournald starting. 
Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. scsi 0:0:2:0: Attached scsi generic sg0 type 3 pci_hotplug: PCI Hot Plug PCI Core version: 0.5 shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 input: PC Speaker as /class/input/input1 Real Time Clock Driver v1.12ac Linux agpgart interface v0.102 (c) Dave Jones agpgart: Detected VIA Twister-K/KT133x/KM133 chipset agpgart: AGP aperture is 64M @ 0xe0000000 parport_pc: VIA 686A/8231 detected parport_pc: probing current configuration parport_pc: Current parallel port base: 0x378 parport0: PC-style at 0x378, irq 7 [PCSPP,EPP] parport_pc: VIA parallel port: io=0x378, irq=7 ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 12 PCI: setting IRQ 12 as level-triggered ACPI: PCI Interrupt 0000:00:07.5[C] -> Link [LNKC] -> GSI 12 (level, low) -> IRQ 12 PCI: Setting latency timer of device 0000:00:07.5 to 64 EXT3 FS on hda1, internal journal Probing IDE interface ide1... device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel at redhat.com Sound Blaster 16 soundcard not found or device busy In case, if you have non-AWE card, try snd-sb16 module [drm] Initialized drm 1.1.0 20060810 ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKA] -> GSI 10 (level, low) -> IRQ 10 [drm] Initialized radeon 1.25.0 20060524 on minor 0 radeonfb: Found Intel x86 BIOS ROM Image radeonfb: Retrieved PLL infos from BIOS radeonfb: Reference=27.00 MHz (RefDiv=12) Memory=240.00 Mhz, System=166.00 MHz radeonfb: PLL min 20000 max 40000 i2c_adapter i2c-2: unable to read EDID block. i2c_adapter i2c-2: unable to read EDID block. i2c_adapter i2c-2: unable to read EDID block. i2c_adapter i2c-4: unable to read EDID block. i2c_adapter i2c-4: unable to read EDID block. i2c_adapter i2c-4: unable to read EDID block. 
radeonfb: Monitor 1 type CRT found radeonfb: EDID probed radeonfb: Monitor 2 type no found Console: switching to colour frame buffer device 200x75 radeonfb (0000:01:00.0): ATI Radeon Y` Intel ISA PCIC probe: Intel i82365sl B step ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets host opts [0]: none host opts [1]: none ISA irqs (scanned) = 9,15 polling interval = 1000 ms pccard: PCMCIA card inserted into slot 0 TCP hybla registered cs: IO port probe 0x100-0x3af: clean. cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7 cs: IO port probe 0x820-0x8ff: clean. cs: IO port probe 0xc00-0xcf7: clean. cs: IO port probe 0xa00-0xaff: clean. cs: memory probe 0x0d0000-0x0dffff: excluding 0xd0000-0xd7fff cs: memory probe 0x0e0000-0x0effff: clean. pcmcia: registering new device pcmcia0.0 cs: IO port probe 0x100-0x3af: clean. cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7 cs: IO port probe 0x820-0x8ff: clean. cs: IO port probe 0xc00-0xcf7: clean. cs: IO port probe 0xa00-0xaff: clean. Probing IDE interface ide2... hde: SAMSUNG CF/ATA, CFA DISK drive ide2 at 0x100-0x107,0x10e on irq 9 hde: max request size: 128KiB hde: 8211168 sectors (4204 MB) w/0KiB Cache, CHS=8146/16/63 hde: hde1 hde2 ide-cs: hde: Vpp = 0.0 pcmcia: Detected deprecated PCMCIA ioctl usage from process: discover. pcmcia: This interface will soon be removed from the kernel; please expect breakage unless you upgrade to new tools. pcmcia: see http://www.kernel.org/pub/linux/utils/kernel/pcmcia/pcmcia.html for details. 
eth0: link up, 100Mbps, half-duplex, lpa 0x44E1 NET: Registered protocol family 10 lo: Disabled Privacy Extensions eth0: no IPv6 routers present input: Power Button (FF) as /class/input/input2 ACPI: Power Button (FF) [PWRF] input: Power Button (CM) as /class/input/input3 ACPI: Power Button (CM) [PWRB] input: Sleep Button (CM) as /class/input/input4 ACPI: Sleep Button (CM) [SLPB] irda_init() NET: Registered protocol family 23 powernow-k8: Processor cpuid 680 not supported agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0. agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode agpgart: Putting AGP V2 device at 0000:01:00.0 into 1x mode [drm] Setting GART location based on new memory map [drm] Loading R200 Microcode [drm] writeback test succeeded in 1 usecs kjournald starting. Commit interval 5 seconds EXT3 FS on dm-0, internal journal EXT3-fs: mounted filesystem with ordered data mode. agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0. agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode agpgart: Putting AGP V2 device at 0000:01:00.0 into 1x mode [drm] Loading R200 Microcode usb 1-1: new full speed USB device using uhci_hcd and address 2 usb 1-1: configuration #1 chosen from 1 choice Initializing USB Mass Storage driver... scsi1 : SCSI emulation for USB Mass Storage devices usbcore: registered new interface driver usb-storage USB Mass Storage support registered. 
usb-storage: device found at 2 usb-storage: waiting for device to settle before scanning usb-storage: device scan complete scsi 1:0:0:0: Direct-Access USB USB 2.0 1.00 PQ: 0 ANSI: 2 SCSI device sda: 8376321 512-byte hdwr sectors (4289 MB) sda: Write Protect is off sda: Mode Sense: 03 00 00 00 sda: assuming drive cache: write through SCSI device sda: 8376321 512-byte hdwr sectors (4289 MB) sda: Write Protect is off sda: Mode Sense: 03 00 00 00 sda: assuming drive cache: write through sda: sda1 sd 1:0:0:0: Attached scsi removable disk sda sd 1:0:0:0: Attached scsi generic sg1 type 0 kjournald starting. Commit interval 5 seconds EXT3 FS on sda1, internal journal EXT3-fs: mounted filesystem with ordered data mode. scsi0: PCI error Interrupt at seqaddr = 0x7 scsi0: Data Parity Error Detected during address or write data phase usb 1-2: new full speed USB device using uhci_hcd and address 3 usb 1-2: configuration #1 chosen from 1 choice scsi2 : SCSI emulation for USB Mass Storage devices usb-storage: device found at 3 usb-storage: waiting for device to settle before scanning usb-storage: device scan complete scsi 2:0:0:0: Direct-Access USB 2.0 Mobile Disk PMAP PQ: 0 ANSI: 0 CCS SCSI device sdb: 1003520 512-byte hdwr sectors (514 MB) sdb: Write Protect is off sdb: Mode Sense: 23 00 00 00 sdb: assuming drive cache: write through SCSI device sdb: 1003520 512-byte hdwr sectors (514 MB) sdb: Write Protect is off sdb: Mode Sense: 23 00 00 00 sdb: assuming drive cache: write through sdb: sdb1 sd 2:0:0:0: Attached scsi removable disk sdb sd 2:0:0:0: Attached scsi generic sg2 type 0 kjournald starting. Commit interval 5 seconds EXT3 FS on sda1, internal journal EXT3-fs: mounted filesystem with ordered data mode. kjournald starting. Commit interval 5 seconds EXT3 FS on sda1, internal journal EXT3-fs: mounted filesystem with ordered data mode. 
EXT3-fs error (device sda1): ext3_new_block: block(1046522) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046523) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046524) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046525) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046526) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046531) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046532) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046535) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046537) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046541) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046542) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046544) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046546) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046548) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046549) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046550) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046553) >= blocks 
count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046554) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046556) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046558) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046561) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046562) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046563) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046565) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046566) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046567) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046568) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046571) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046573) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046575) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046578) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046579) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046581) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): 
ext3_new_block: block(1046582) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046583) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046585) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046586) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046587) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046589) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046590) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046593) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046594) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046595) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046596) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046597) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046598) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046602) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046603) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046604) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046607) >= blocks count(1046521) - block_group = 31, es 
== d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046610) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046612) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046613) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046614) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046615) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046616) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046620) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046622) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046623) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046624) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046625) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046626) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046627) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046628) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046629) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046633) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046636) >= 
blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046638) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046639) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046641) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046642) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046643) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046646) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046647) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046648) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046650) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046651) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046652) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046653) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046655) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046657) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046661) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046667) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device 
sda1): ext3_new_block: block(1046669) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046671) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046672) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046673) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046674) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046676) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046680) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046681) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046683) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046685) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046686) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046688) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046689) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046690) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046692) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046694) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046697) >= blocks count(1046521) - block_group = 
31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046698) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046701) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046705) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046708) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046713) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046714) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046716) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046718) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046721) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046723) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046726) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046727) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046730) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046731) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046732) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046733) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046735) 
>= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046736) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046738) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046740) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046741) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046743) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046749) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046751) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046752) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046757) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046758) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046759) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046760) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046762) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046764) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046766) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046767) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error 
(device sda1): ext3_new_block: block(1046768) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046770) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046772) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046774) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046776) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046777) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046783) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046784) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046786) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046787) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046788) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046790) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046791) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046792) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046795) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046796) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046797) >= blocks count(1046521) - 
block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046799) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046800) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046804) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046806) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046811) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046812) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046813) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046815) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046816) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046817) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046818) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046821) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046822) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046823) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046825) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046826) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: 
block(1046827) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046828) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046830) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046831) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046832) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046834) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046836) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046838) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046839) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046840) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046841) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046843) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046844) >= blocks count(1046521) - block_group = 31, es == d8f5d400 EXT3-fs error (device sda1): ext3_new_block: block(1046845) >= blocks count(1046521) - block_group = 31, es == d8f5d400 And trying to write any more files gives "No space left on device" message, while only 8% of the device is used: Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda1 4120356 305988 3605064 8% /mnt/sda1 Its mounted like this: miernik at tarnica:~$ cat /proc/mounts | grep sda1 /dev/sda1 /mnt/sda1 ext3 rw,nosuid,nodev,noexec,data=ordered 0 0 miernik at 
tarnica:~$ -- Miernik http://miernik.name/ From public at miernik.name Sat Jun 16 17:48:25 2007 From: public at miernik.name (Miernik) Date: Sat, 16 Jun 2007 19:48:25 +0200 Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files References: <20070616005649.6C4A.1.NOFFLE@debian107.local> <20070616161150.6FBD.0.NOFFLE@debian107.local> Message-ID: <20070616174825.6FBD.1.NOFFLE@debian107.local> Miernik wrote: > And trying to write any more files gives "No space left on device" message, > while only 8% of the device is used: > > Filesystem 1K-blocks Used Available Use% Mounted on > /dev/sda1 4120356 305988 3605064 8% /mnt/sda1 And the device really has the 4 GB: debian105:~# dd if=/dev/zero of=/dev/sda dd: writing to `/dev/sda': No space left on device 8376322+0 records in 8376321+0 records out 4288676352 bytes (4.3 GB) copied, 4145.71 seconds, 1.0 MB/s debian105:~# And no error was encountered while doing this dd write. Neither in dmesg, nor in kern.log. Everything was fine. -- Miernik http://miernik.name/ From public at miernik.name Sat Jun 16 19:01:02 2007 From: public at miernik.name (Miernik) Date: Sat, 16 Jun 2007 21:01:02 +0200 Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files References: <20070616005649.6C4A.1.NOFFLE@debian107.local> <20070616161150.6FBD.0.NOFFLE@debian107.local> Message-ID: <20070616190102.6FBD.2.NOFFLE@debian107.local> Miernik wrote: > And trying to write any more files gives "No space left on device" message, > while only 8% of the device is used: > > Filesystem 1K-blocks Used Available Use% Mounted on > /dev/sda1 4120356 305988 3605064 8% /mnt/sda1 And the device really has the 4 GB: debian105:~# dd if=/dev/zero of=/dev/sda dd: writing to `/dev/sda': No space left on device 8376322+0 records in 8376321+0 records out 4288676352 bytes (4.3 GB) copied, 4145.71 seconds, 1.0 MB/s debian105:~# And no error was encountered while doing this dd write. Neither in dmesg, nor in kern.log. Everything was fine. 
Reading also fine:

debian105:~# dd if=/dev/sda of=/dev/null
8376321+0 records in
8376321+0 records out
4288676352 bytes (4.3 GB) copied, 4128.19 seconds, 1.0 MB/s
debian105:~#

No strange messages in any of the logs while doing that.

Also, I consider it unlikely that these sticks are bad, because the same happens on both of them. I bought them at these Internet auctions:

http://allegro.pl/item203519391_203519391.html
http://allegro.pl/item201343628_pendrive_4_gb_od_1_zl_.html

As you can see, the first one was a multi-item fixed-price sale of 20 such sticks, and many people bought them, some of whom have already given positive comments. The seller has 100% positive comments, many of them from sales of the same type of USB stick. I would contact the seller if only one of the sticks was bad - well, that could happen, but would he give me two different bad sticks? Both were bought from the same seller.

--
Miernik
http://miernik.name/

From adilger at clusterfs.com Mon Jun 18 06:20:27 2007
From: adilger at clusterfs.com (Andreas Dilger)
Date: Mon, 18 Jun 2007 00:20:27 -0600
Subject: kjournald hang on ext3 to ext3 copy
In-Reply-To: <4673E2F1.2090704@sbcglobal.net>
References: <4673E2F1.2090704@sbcglobal.net>
Message-ID: <20070618062027.GB5181@schatzie.adilger.int>

On Jun 16, 2007 08:17 -0500, John Marconi wrote:
> I am running into a situation in which one of my ext3 filesystems is
> getting hung during normal usage. There are three ext3 filesystems on a
> CompactFLASH. One is mounted as / and one as /tmp. In my test, I am
> copying a 100 MB file from /root to /tmp repeatedly. While doing this
> test, I eventually see the copying stop, and any attempts to access /tmp
> fail - if I even do ls /tmp the command will hang.
> > I suspect kjournald because of the following ps output: > PID PPID WCHAN:20 PCPU %MEM PSR COMM > 8847 99 start_this_handle 1.1 0.0 28 pdflush > 8853 99 schedule_timeout 0.2 0.0 7 pdflush > 188 1 kswapd 0.0 0.0 19 kswapd0 > 8051 1 mtd_blktrans_thread 0.0 0.0 22 mtdblockd > 8243 1 kjournald 0.0 0.0 0 kjournald > 8305 1 schedule_timeout 0.0 0.0 2 udevd > 8378 1 kjournald 0.0 0.0 0 kjournald > 8379 1 journal_commit_trans 16.6 0.0 0 kjournald > 8437 1 schedule_timeout 0.0 0.0 0 evlogd > 8527 1 syslog 0.0 0.0 1 klogd > 8534 1 schedule_timeout 0.0 0.0 0 portmap > 8569 1 schedule_timeout 0.0 0.0 0 rngd > 8639 1 schedule_timeout 0.1 0.0 24 sshd > 8741 8639 schedule_timeout 0.0 0.0 0 sshd > 8743 8741 wait 0.0 0.0 9 bash > 8857 8743 schedule_timeout 4.9 0.0 7 cp > 8664 1 schedule_timeout 0.0 0.0 0 xinetd > 8679 1 schedule_timeout 0.0 0.0 0 evlnotifyd > 8689 1 schedule_timeout 0.0 0.0 0 evlactiond > 8704 1 wait 0.0 0.0 1 bash > 8882 8704 - 0.0 0.0 2 ps > > If I run ps repeatedly, I always see process 8379 in > journal_commit_transaction, and it is always taking between 12% and 20% > of processor 0 up. This process never completes. I also see process > 8847 in start_this_handle forever as well - so I believe they are related. > > This system is using a 2.6.14 kernel. Please try to reproduce with a newer kernel, as this kind of problem might have been fixed already. Two tips for debugging this kind of issue: - you need to have detailed stack traces (e.g. sysrq-t) of all the interesting processes - if a process is stuck inside a large function (e.g. 8379 in example) you need to provide the exact line number. this can be found by compiling the kernel with CONFIG_DEBUG_INFO (-g flag to gcc) and then doing "gdb vmlinux" and "p *(journal_commit_transaction+{offset})", where the byte offset is printed in the sysrq-t output, and then include the code surrounding that line from the source file - a process stuck in "start_this_handle()" is often just an innocent bystander. 
It is waiting for the currently committing transaction to complete before it can start a new filesystem-modifying operation (handle). That said, the journal handle acts like a lock and has been the cause of many deadlock problems (e.g. process 1 holds lock, waits for handle; process 2 holds transaction open waiting for lock). pdflush might be one of the "process 1" kind of tasks, and some other process is holding the transaction open, preventing it from completing.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From public at miernik.name Mon Jun 18 08:59:44 2007
From: public at miernik.name (Miernik)
Date: Mon, 18 Jun 2007 10:59:44 +0200
Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files
References: <20070616213253.GA6528@tarnica>
Message-ID: <20070618085944.7924.0.NOFFLE@debian107.local>

Alan Stern wrote:
> What you should do is fill up the drive with known data (not just 0's
> like in your dd test), and then read it back to see if the data has
> changed.

Using dd, I found the cause of the problem:

http://reviews.ebay.co.uk/Beware-of-FAKE-1GB-2GB-4GB-8GB-USB-Flash-Drives-on-eBay_W0QQugidZ10000000000953346
http://blog.uhuru.de/?p=1080
http://projectglop.com/2007/05/14/fake-usb-memory-keys/
http://reviews.ebay.com.au/BEWARE-of-FAKE-1GB-2GB-4GB-8GB-USB-Flash-Drives-on-eBay_W0QQugidZ10000000000706427

These USB sticks are fake, and have a 1 GB flash chip, and a fake controller which makes the computer think it is a 4 GB stick.

Any data written past the first 1066401792 bytes is lost, and reading any data over that boundary gives a copy of the last 2048 bytes of the real flash chip, repeated as many times to fill the whole stick.

I took one of the sticks apart.

The flash chip is FBNM40A4GK3WG
The controller is iCreate I5128-LG L702 CE7103
http://www.icreate.com.tw/img/PDF/i5128-L_datasheet_preliminary_v010.pdf

Sorry for wasting your time.
But the benefit is that searching Gmane or Google for the error messages I encountered will now turn up this thread with the real cause.

It is just very strange that, if 95% of the sticks sold on eBay are such fakes, no one on this USB mailing list knew about it, and my Googling for the error messages didn't turn up any posts about the cause. I hope this post will help spread the knowledge.

I am also very surprised that these sellers manage to get positive comments for these sticks. Don't the people who buy them notice? Don't they fill them past 1 GB? If not, why buy a 4 GB stick when you could have bought a 1 GB one? And when the stick fails long after they bought it, when they finally try to fill it past 1 GB and actually read that data back, maybe so much time has passed on average that they think the stick just broke? Am I one of the few who tried to fill it past 1 GB on the first day I got it?

--
Miernik
http://miernik.name/

From alex at alex.org.uk Mon Jun 18 09:33:26 2007
From: alex at alex.org.uk (Alex Bligh)
Date: Mon, 18 Jun 2007 10:33:26 +0100
Subject: 4 GB USB flash disk with FAT ok, with ext3 corrupted files
In-Reply-To: <20070618085944.7924.0.NOFFLE@debian107.local>
References: <20070616213253.GA6528@tarnica> <20070618085944.7924.0.NOFFLE@debian107.local>
Message-ID:

--On 18 June 2007 10:59 +0200 Miernik wrote:
> These USB sticks are fake, and have a 1 GB flash chip, and a fake
> controller which makes the computer think it is a 4 GB stick.
>
> Any data written past the first 1066401792 bytes is lost, and reading
> any data over that boundary gives a copy of the last 2048 bytes of the
> real flash chip, repeated as many times to fill the whole stick.

Hmmmm.... I wonder whether it would be useful for mke2fs etc. to write to sector n-1 and n-2 (where there are n sectors on the disk) and read the sectors back to check the last sectors on the disk actually work.
This would detect bad extents very easily and quickly. I am sure there are innocent causes of this problem on other media (i.e. it would be useful beyond fake USB drives) Alex From stern at rowland.harvard.edu Sat Jun 16 21:20:46 2007 From: stern at rowland.harvard.edu (Alan Stern) Date: Sat, 16 Jun 2007 17:20:46 -0400 (EDT) Subject: [Linux-usb-users] 4 GB USB flash disk with FAT ok, with ext3 corrupted files In-Reply-To: <20070616174825.6FBD.1.NOFFLE@debian107.local> Message-ID: On Sat, 16 Jun 2007, Miernik wrote: > Miernik wrote: > > And trying to write any more files gives "No space left on device" message, > > while only 8% of the device is used: > > > > Filesystem 1K-blocks Used Available Use% Mounted on > > /dev/sda1 4120356 305988 3605064 8% /mnt/sda1 > > And the device really has the 4 GB: > > debian105:~# dd if=/dev/zero of=/dev/sda > dd: writing to `/dev/sda': No space left on device > 8376322+0 records in > 8376321+0 records out > 4288676352 bytes (4.3 GB) copied, 4145.71 seconds, 1.0 MB/s > debian105:~# > > And no error was encountered while doing this dd write. > Neither in dmesg, nor in kern.log. Everything was fine. This is a little misleading. You are comparing the "df" output for /dev/sda1 with a transfer to /dev/sda. Furthermore the units are different; df uses 1-KB blocks and dd uses 512-byte blocks. It would help to see the output from "fdisk -l /dev/sda". Alan Stern From public at miernik.name Sat Jun 16 21:32:53 2007 From: public at miernik.name (Miernik) Date: Sat, 16 Jun 2007 23:32:53 +0200 Subject: [Linux-usb-users] 4 GB USB flash disk with FAT ok, with ext3 corrupted files In-Reply-To: References: <20070616174825.6FBD.1.NOFFLE@debian107.local> Message-ID: <20070616213253.GA6528@tarnica> On Sat, Jun 16, 2007 at 05:20:46PM -0400, Alan Stern wrote: > This is a little misleading. You are comparing the "df" output for > /dev/sda1 with a transfer to /dev/sda. Furthermore the units are > different; df uses 1-KB blocks and dd uses 512-byte blocks. 
The point of doing dd was to find out if there will be any errors, not to check the size. > It would help to see the output from "fdisk -l /dev/sda". debian105:~# fdisk -l /dev/sda Disk /dev/sda: 4288 MB, 4288676352 bytes 132 heads, 62 sectors/track, 1023 cylinders Units = cylinders of 8184 * 512 = 4190208 bytes Disk /dev/sda doesn't contain a valid partition table debian105:~# Ah, that was after the dd zeroing. I created the partition table and partition, and here it is again: debian105:~# fdisk -l /dev/sda Disk /dev/sda: 4288 MB, 4288676352 bytes 132 heads, 62 sectors/track, 1023 cylinders Units = cylinders of 8184 * 512 = 4190208 bytes Device Boot Start End Blocks Id System /dev/sda1 1 1023 4186085 83 Linux debian105:~# And created the filesystem: debian105:~# mkfs.ext3 /dev/sda1 mke2fs 1.40-WIP (14-Nov-2006) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) 523264 inodes, 1046521 blocks 52326 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=1073741824 32 block groups 32768 blocks per group, 32768 fragments per group 16352 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736 Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: done This filesystem will be automatically checked every 29 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. debian105:~# fdisk -l /dev/sda Disk /dev/sda: 4288 MB, 4288676352 bytes 132 heads, 62 sectors/track, 1023 cylinders Units = cylinders of 8184 * 512 = 4190208 bytes Device Boot Start End Blocks Id System /dev/sda1 1 1023 4186085 83 Linux debian105:~# Did it help? Is this a bug in ext3 code? My USB sticks are broken? Bug in kernel USB subsystem? My USB ports suck? Whatever else? Did anyone see any cheap no-name USB 4 GB stick work with ext3? 
The seller of these sticks got plenty of positive comments on the auction site, and no complaints, so they work for everyone besides me. But probably everyone else uses Windows. I don't have a Windows system to test this.

--
Miernik
http://miernik.name/

From stern at rowland.harvard.edu Sat Jun 16 21:34:05 2007
From: stern at rowland.harvard.edu (Alan Stern)
Date: Sat, 16 Jun 2007 17:34:05 -0400 (EDT)
Subject: [Linux-usb-users] 4 GB USB flash disk with FAT ok, with ext3 corrupted files
In-Reply-To: <20070616161150.6FBD.0.NOFFLE@debian107.local>
Message-ID:

On Sat, 16 Jun 2007, Miernik wrote:
> Posting now to two lists, one about USB and the other about ext3 as I
> don't know what is the source of the problem.
>
> Miernik wrote:
> > I recently bought 2 different USB flash disks. These are some cheap no-name
> > devices. Their parameters:
> >
> > bytes       C/H/S        ID
> > 4288676352  1023/132/62  Vendor: USB Model: USB 2.0 Rev: 1.00 ANSI SCSI revision: 02
>
> Right now after trying to copy about 0.5 GB of files to a freshly created ext3
> filesystem on the device, this is the output of dmesg:

...

> EXT3-fs error (device sda1): ext3_new_block: block(1046522) >= blocks count(1046521) - block_group = 31, es == d8f5d400
> EXT3-fs error (device sda1): ext3_new_block: block(1046523) >= blocks count(1046521) - block_group = 31, es == d8f5d400

> And trying to write any more files gives "No space left on device" message,
> while only 8% of the device is used:
>
> Filesystem  1K-blocks  Used    Available  Use%  Mounted on
> /dev/sda1   4120356    305988  3605064    8%    /mnt/sda1

This doesn't seem to be a USB error. Look at the ext3 error message. It's complaining about a block number being out of range, not any sort of I/O problem.

Also I have no idea where that value of 1046521 for the total block count came from. These are 4-KB size blocks; converting to 1-KB blocks gives 4186084, which is larger than the total size listed above for /dev/sda1.
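
The block-count figures quoted in this thread can be checked against each other with a little arithmetic. The sketch below is only one possible accounting: it assumes 128-byte inodes and 32-byte group descriptors (neither is stated in the thread), and it assumes that df reports the filesystem's block count minus its static metadata. Every other number is copied from the fdisk, mkfs.ext3 and df output posted by Miernik.

```python
BLOCK_SIZE = 4096                    # "Block size=4096" from the mkfs.ext3 output
KB_PER_BLOCK = BLOCK_SIZE // 1024

# Figures copied from the fdisk, mkfs.ext3 and df output in this thread.
PARTITION_KB = 4186085     # "Blocks" column of fdisk -l for /dev/sda1
FS_BLOCKS = 1046521        # "1046521 blocks" reported by mkfs.ext3
GROUPS = 32                # "32 block groups"
INODES_PER_GROUP = 16352   # "16352 inodes per group"
SUPERBLOCK_COPIES = 8      # primary superblock + the 7 backups mkfs listed
DF_TOTAL_KB = 4120356      # "1K-blocks" column of df

# Assumptions, not stated anywhere in the thread.
INODE_SIZE = 128           # bytes per on-disk inode
GROUP_DESC_SIZE = 32       # bytes per block group descriptor


def blocks_from_partition(partition_kb):
    """Whole 4 KB filesystem blocks that fit in the partition."""
    return partition_kb // KB_PER_BLOCK


def static_overhead_blocks():
    """Blocks used by bitmaps, inode tables, superblock copies and descriptors."""
    inode_table = INODES_PER_GROUP * INODE_SIZE // BLOCK_SIZE   # per group
    bitmaps = 2                                                 # block + inode bitmap
    gdt_blocks = -(-GROUPS * GROUP_DESC_SIZE // BLOCK_SIZE)     # ceiling division
    per_group = GROUPS * (inode_table + bitmaps)
    per_superblock_copy = SUPERBLOCK_COPIES * (1 + gdt_blocks)
    return per_group + per_superblock_copy


def df_total_kb():
    """Size in 1 KB units once the static metadata is subtracted."""
    return (FS_BLOCKS - static_overhead_blocks()) * KB_PER_BLOCK


if __name__ == "__main__":
    print("blocks in partition:", blocks_from_partition(PARTITION_KB))
    print("metadata blocks:    ", static_overhead_blocks())
    print("expected df total:  ", df_total_kb(), "KB; df showed", DF_TOTAL_KB)
```

Under those assumptions the partition holds exactly 1046521 blocks, and subtracting the metadata blocks gives the 4120356 KB that df printed, so the gap between 4186084 and the df total would be ordinary filesystem overhead rather than a sign of trouble.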
The output from "fdisk -l /dev/sda" would come in useful here. Alan Stern From stern at rowland.harvard.edu Sat Jun 16 21:47:02 2007 From: stern at rowland.harvard.edu (Alan Stern) Date: Sat, 16 Jun 2007 17:47:02 -0400 (EDT) Subject: [Linux-usb-users] 4 GB USB flash disk with FAT ok, with ext3 corrupted files In-Reply-To: <20070616213253.GA6528@tarnica> Message-ID: On Sat, 16 Jun 2007, Miernik wrote: > Ah, that was after the dd zeroing. I created the partition table and partition, > and here it is again: > > debian105:~# fdisk -l /dev/sda > > Disk /dev/sda: 4288 MB, 4288676352 bytes > 132 heads, 62 sectors/track, 1023 cylinders > Units = cylinders of 8184 * 512 = 4190208 bytes > > Device Boot Start End Blocks Id System > /dev/sda1 1 1023 4186085 83 Linux > debian105:~# Ah, good. Note that 4186085 / 4 = 1046521, which agrees with the value below and explains those ext3 error messages. > And created the filesystem: > > debian105:~# mkfs.ext3 /dev/sda1 > mke2fs 1.40-WIP (14-Nov-2006) > Filesystem label= > OS type: Linux > Block size=4096 (log=2) > Fragment size=4096 (log=2) > 523264 inodes, 1046521 blocks > 52326 blocks (5.00%) reserved for the super user > First data block=0 > Maximum filesystem blocks=1073741824 > 32 block groups > 32768 blocks per group, 32768 fragments per group > 16352 inodes per group > Superblock backups stored on blocks: > 32768, 98304, 163840, 229376, 294912, 819200, 884736 > > Writing inode tables: done > Creating journal (16384 blocks): done > Writing superblocks and filesystem accounting information: done > > This filesystem will be automatically checked every 29 mounts or > 180 days, whichever comes first. Use tune2fs -c or -i to override. > debian105:~# fdisk -l /dev/sda > Did it help? > > Is this a bug in ext3 code? I don't know; maybe. Or maybe the code is okay but it's getting bad data from somewhere. For example, even though the USB reads succeed, they might not return the same data that was originally written to the device. 
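
The possibility raised here, that reads succeed but return different data than was written, can be tested by writing a pattern that differs in every sector and comparing it on read-back. The following is a minimal sketch, not a tested tool: pattern, fill, first_bad_sector and FakeStick are hypothetical names introduced for illustration, and FakeStick is only an in-memory model of the behaviour Miernik reports elsewhere in this thread (writes past the real capacity are lost, reads past it return the last 2048 real bytes repeated).

```python
import hashlib
import io

SECTOR = 512   # bytes; the dd runs in this thread count 512-byte records
TAIL = 2048    # bytes the fake stick is reported to repeat past its real end


def pattern(sector_no):
    """Return SECTOR bytes derived from the sector number, unique per sector."""
    out = b""
    counter = 0
    while len(out) < SECTOR:
        out += hashlib.sha256(b"%d:%d" % (sector_no, counter)).digest()
        counter += 1
    return out[:SECTOR]


def fill(dev, n_sectors):
    """Write the known pattern to every sector of a file-like device."""
    dev.seek(0)
    for s in range(n_sectors):
        dev.write(pattern(s))


def first_bad_sector(dev, n_sectors):
    """Read everything back; return the first mismatching sector, or None."""
    dev.seek(0)
    for s in range(n_sectors):
        if dev.read(SECTOR) != pattern(s):
            return s
    return None


class FakeStick:
    """Hypothetical in-memory model of a stick that claims more than it stores."""

    def __init__(self, real_bytes):
        self.data = bytearray(real_bytes)
        self.pos = 0

    def seek(self, pos):
        self.pos = pos

    def write(self, buf):
        # Bytes that land past the real capacity are silently dropped.
        keep = max(0, min(len(buf), len(self.data) - self.pos))
        if keep:
            self.data[self.pos:self.pos + keep] = buf[:keep]
        self.pos += len(buf)
        return len(buf)

    def _byte_at(self, p):
        real = len(self.data)
        if p < real:
            return self.data[p]
        # Past the real end, the last TAIL bytes repeat over and over.
        return self.data[real - TAIL + (p - real) % TAIL]

    def read(self, n):
        out = bytes(self._byte_at(p) for p in range(self.pos, self.pos + n))
        self.pos += n
        return out


if __name__ == "__main__":
    honest = io.BytesIO()
    fill(honest, 32)
    print("honest device, first bad sector:", first_bad_sector(honest, 32))

    fake = FakeStick(real_bytes=8 * SECTOR)
    fill(fake, 32)
    print("fake device, first bad sector:", first_bad_sector(fake, 32))
```

A zero-filled test cannot catch this kind of fault, because a device that returns stale zeros looks identical to one that stored them. On real hardware the same two functions could be pointed at a block device opened in binary mode, which would destroy whatever is on it.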
> My USB sticks are broken? Maybe. > Bug in kernel USB subsystem? No. > My USB ports suck? No. A problem in the port would cause a USB error, not bad data. What you should do is fill up the drive with known data (not just 0's like in your dd test), and then read it back to see if the data has changed. Alan Stern From jamarconi at sbcglobal.net Tue Jun 19 03:53:02 2007 From: jamarconi at sbcglobal.net (John Marconi) Date: Mon, 18 Jun 2007 22:53:02 -0500 Subject: kjournald hang on ext3 to ext3 copy In-Reply-To: <20070618062027.GB5181@schatzie.adilger.int> References: <4673E2F1.2090704@sbcglobal.net> <20070618062027.GB5181@schatzie.adilger.int> Message-ID: <4677531E.1030108@sbcglobal.net> Andreas Dilger wrote: > On Jun 16, 2007 08:17 -0500, John Marconi wrote: > >> I am running into a situation in which one of my ext3 filesystems is >> getting hung during normal usage. There are three ext3 filesystems on a >> CompactFLASH. One is mounted as / and one as /tmp. In my test, I am >> copying a 100 MB file from /root to /tmp repeatedly. While doing this >> test, I eventually see the copying stop, and any attempts to access /tmp >> fail - if I even do ls /tmp the command will hang. 
>> >> I suspect kjournald because of the following ps output: >> PID PPID WCHAN:20 PCPU %MEM PSR COMM >> 8847 99 start_this_handle 1.1 0.0 28 pdflush >> 8853 99 schedule_timeout 0.2 0.0 7 pdflush >> 188 1 kswapd 0.0 0.0 19 kswapd0 >> 8051 1 mtd_blktrans_thread 0.0 0.0 22 mtdblockd >> 8243 1 kjournald 0.0 0.0 0 kjournald >> 8305 1 schedule_timeout 0.0 0.0 2 udevd >> 8378 1 kjournald 0.0 0.0 0 kjournald >> 8379 1 journal_commit_trans 16.6 0.0 0 kjournald >> 8437 1 schedule_timeout 0.0 0.0 0 evlogd >> 8527 1 syslog 0.0 0.0 1 klogd >> 8534 1 schedule_timeout 0.0 0.0 0 portmap >> 8569 1 schedule_timeout 0.0 0.0 0 rngd >> 8639 1 schedule_timeout 0.1 0.0 24 sshd >> 8741 8639 schedule_timeout 0.0 0.0 0 sshd >> 8743 8741 wait 0.0 0.0 9 bash >> 8857 8743 schedule_timeout 4.9 0.0 7 cp >> 8664 1 schedule_timeout 0.0 0.0 0 xinetd >> 8679 1 schedule_timeout 0.0 0.0 0 evlnotifyd >> 8689 1 schedule_timeout 0.0 0.0 0 evlactiond >> 8704 1 wait 0.0 0.0 1 bash >> 8882 8704 - 0.0 0.0 2 ps >> >> If I run ps repeatedly, I always see process 8379 in >> journal_commit_transaction, and it is always taking between 12% and 20% >> of processor 0 up. This process never completes. I also see process >> 8847 in start_this_handle forever as well - so I believe they are related. >> >> This system is using a 2.6.14 kernel. >> > > Please try to reproduce with a newer kernel, as this kind of problem > might have been fixed already. > > > Two tips for debugging this kind of issue: > - you need to have detailed stack traces (e.g. sysrq-t) of all the > interesting processes > > - if a process is stuck inside a large function (e.g. 8379 in example) > you need to provide the exact line number. 
this can be found by compiling
> the kernel with CONFIG_DEBUG_INFO (-g flag to gcc) and then doing
> "gdb vmlinux" and "p *(journal_commit_transaction+{offset})", where the
> byte offset is printed in the sysrq-t output, and then include the code
> surrounding that line from the source file
>
> - a process stuck in "start_this_handle()" is often just an innocent
> bystander. It is waiting for the currently committing transaction to
> complete before it can start a new filesystem-modifying operation (handle).
> That said, the journal handle acts like a lock and has been the cause of
> many deadlock problems (e.g. process 1 holds lock, waits for handle;
> process 2 holds transaction open waiting for lock). pdflush might be one
> of the "process 1" kind of tasks, and some other process is holding the
> transaction open preventing it from completing.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.
>
>
>

Andreas,

Thanks for the information.

I am not able to update the entire kernel to a new version for a variety of reasons; however, I can update certain parts of my system (such as the filesystem). I did a diff of the 2.6.16 kernel against my kernel, and the changes to jbd were minimal. I plan on looking at the latest versions of the kernel to determine if anything has changed since 2.6.16.

I took a look at the place where kjournald was stuck - it is in the journal_commit_transaction "while (commit_transaction->t_updates)" loop and it is trying to "spin_lock(&journal->j_state_lock)". When I look at pdflush, it is also trying to take the journal->j_state_lock. Do you have any tips on finding out which process might own journal->j_state_lock?
Thanks again, John From adilger at clusterfs.com Tue Jun 19 05:14:02 2007 From: adilger at clusterfs.com (Andreas Dilger) Date: Mon, 18 Jun 2007 23:14:02 -0600 Subject: kjournald hang on ext3 to ext3 copy In-Reply-To: <4677531E.1030108@sbcglobal.net> References: <4673E2F1.2090704@sbcglobal.net> <20070618062027.GB5181@schatzie.adilger.int> <4677531E.1030108@sbcglobal.net> Message-ID: <20070619051402.GO5181@schatzie.adilger.int> On Jun 18, 2007 22:53 -0500, John Marconi wrote: > Andreas Dilger wrote: > >Two tips for debugging this kind of issue: > >- you need to have detailed stack traces (e.g. sysrq-t) of all the > > interesting processes > > > >- if a process is stuck inside a large function (e.g. 8379 in example) > > you need to provide the exact line number. this can be found by > > compiling > > the kernel with CONFIG_DEBUG_INFO (-g flag to gcc) and then doing > > "gdb vmlinux" and "p *(journal_commit_transaction+{offset})", where the > > byte offset is printed in the sysrq-t output, and then include the code > > surrounding that line from the source file > > > >- a process stuck in "start_this_handle()" is often just an innocent > > bystander. It is waiting for the currently committing transaction to > > complete before it can start a new filesystem-modifying operation > > (handle). > > That said, the journal handle acts like a lock and has been the cause of > > many deadlock problems (e.g. process 1 holds lock, waits for handle; > > process 2 holds transaction open waiting for lock). pdflush might be one > > of the "process 1" kind of tasks, and some other process is holding the > > transaction open preventing it from completing. > > I am not able to update the entire kernel to a new version for a variety > of reasons, however I can update certain parts in my system (such as the > filesystem). I did a diff of the 2.6.16 kernel against my kernel, and > the changes to jbd were minimal. 
I plan on looking at the latest
> versions of the kernel to determine if anything has changed since 2.6.16.

The problem may also be in the ext3 layer and not jbd.

> I took a look at the place that kjournald was stuck - it is in the
> journal_commit_transaction "while (commit_transaction->t_updates)" loop
> and it is trying to "spin_lock(&journal->j_state_lock)". When I look at
> pdflush, it is also trying to take the journal->j_state_lock. Do you
> have any tips on finding out which process might own journal->j_state_lock?

You can enable CONFIG_DEBUG_SPINLOCK in newer kernels and it appears the spinlock will set the "owner" field to the task struct. You still need to get access to this via e.g. "crash" or lkcd or something. Hmm, it seems this is only set for ppc and s390??? That is how I would debug this in any case.

The other way (I've done this too many times in the past) is to look through all of the stack traces and figure out which ones are in a filesystem context, then check if any of them are blocked on locks while holding transactions open. This needs a detailed understanding of kernel callpaths.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From doseyg at r-networks.net Fri Jun 29 23:21:42 2007
From: doseyg at r-networks.net (Glen Dosey)
Date: Fri, 29 Jun 2007 19:21:42 -0400
Subject: poor read performance
Message-ID: <1183159302.30971.37.camel@localhost.localdomain>

I am seeing what seems to be a notable limit on read performance of an ext3 filesystem. If anyone could offer some insight it would be helpful.

Background:
12 x 500G SATA disks in a Hardware RAID enclosure connected via 2Gb/s FC to a 4 x 2.6 Ghz system with 4GB ram running RHEL4.5. Initially the enclosure was configured RAID5 10+1 parity, although I've also tried RAID 50 and currently RAID 0. I've varied chunk sizes from 64-256K.

Problem:
No matter what I do I cannot get the ext3 read performance above ~90MB/s.
Under virtually every configuration listed above the write performance is greater than the read performance. I've run a large number of Bonnie++ and IOzone tests, but for the sake of simplicity in this email I'll just refer to simple dd's with /dev/zero.

Details:
Under the current RAID0 setup I see the following when dd'ing.

DD 4G from /dev/zero to /dev/sdd disk (no filesystem) & sync   28 seconds
DD 4G from /dev/sdd to /dev/null                               32 seconds
DD 4G to ext3 on /dev/sdd & sync                               32 seconds
DD 4G from ext3 file to /dev/null                              48 seconds

I've been watching the port usage on the FC switch and it verifies what I am seeing. Writes max out near 2Gb/s, but reads hit some artificial limit around 90 MB/s and never exceed it with the filesystem, regardless of the underlying RAID configuration. Without a filesystem the reads are at least 50% faster, and it can be seen on the FC switch graphs as well.

Any help or thoughts would be appreciated.

Thanks,
~Glen

From ling at fnal.gov Sat Jun 30 05:18:22 2007
From: ling at fnal.gov (Ling C. Ho)
Date: Sat, 30 Jun 2007 00:18:22 -0500
Subject: poor read performance
In-Reply-To: <1183159302.30971.37.camel@localhost.localdomain>
References: <1183159302.30971.37.camel@localhost.localdomain>
Message-ID: <4685E79E.3060709@fnal.gov>

Hi,

Did you see any difference when a different block size is used (for example, dd with bs=64k or 128k)? Also try changing the read-ahead cache.

  blockdev --getra /dev/sdd

to see what the current value is, and

  blockdev --setra 8192 /dev/sdd

to change it. 8192 is a good number that has been working well for me on a similar-size setup.

...
ling

Glen Dosey wrote:
> I am seeing what seems to be a notable limit on read performance of an
> ext3 filesystem. If anyone could offer some insight it would be helpful.
>
> Background:
> 12 x 500G SATA disks in a Hardware RAID enclosure connected via 2Gb/s FC
> to a 4 x 2.6 Ghz system with 4GB ram running RHEL4.5.
Initially the
> enclosure was configured RAID5 10+1 parity, although I've also tried
> RAID 50 and currently RAID 0. I've varied chunk sizes from 64-256K.
>
> Problem:
> No matter what I do I cannot get the ext3 read performance above
> ~90MB/s. Under virtually every configuration listed above the write
> performance is greater than the read performance. I've run a large
> number of Bonnie++ and IOzone tests, but for the sake of simplicity in
> this email I'll just refer to simple dd's with /dev/zero.
>
> Details:
> Under the current RAID0 setup I see the following when dd'ing.
>
> DD 4G from /dev/zero to /dev/sdd disk (no filesystem) & sync   28 seconds
> DD 4G from /dev/sdd to /dev/null                               32 seconds
> DD 4G to ext3 on /dev/sdd & sync                               32 seconds
> DD 4G from ext3 file to /dev/null                              48 seconds
>
> I've been watching the port usage on the FC switch and it verifies what
> I am seeing. Writes max out near 2Gb/s, but reads hit some artificial
> limit around 90 MB/s and never exceed it with the filesystem,
> regardless of the underlying RAID configuration. Without a filesystem
> the reads are at least 50% faster, and it can be seen on the FC switch
> graphs as well.
>
> Any help or thoughts would be appreciated.
>
> Thanks,
> ~Glen
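
The dd timings quoted above can be turned into throughput figures to see how they line up with the "~90MB/s" ceiling. This is a small worked calculation, assuming that "4G" means 4 GiB and that megabytes are decimal, as dd prints them; the labels in TIMINGS are shorthand for the four runs in the message, not anything dd itself reports.

```python
GIB = 1024 ** 3   # assumption: "4G" in the dd runs means 4 GiB
MB = 10 ** 6      # decimal megabytes, the unit dd uses in its summary line

TRANSFER_BYTES = 4 * GIB

# Elapsed seconds for each dd run, as listed in the message.
TIMINGS = {
    "raw write": 28,    # /dev/zero -> /dev/sdd, no filesystem, plus sync
    "raw read": 32,     # /dev/sdd -> /dev/null
    "ext3 write": 32,   # /dev/zero -> file on ext3, plus sync
    "ext3 read": 48,    # file on ext3 -> /dev/null
}


def throughput_mb_s(nbytes, seconds):
    """Average transfer rate in MB/s for nbytes moved in the given time."""
    return nbytes / seconds / MB


if __name__ == "__main__":
    for name, secs in TIMINGS.items():
        rate = throughput_mb_s(TRANSFER_BYTES, secs)
        print(f"{name:10s} {rate:6.1f} MB/s")
    ratio = TIMINGS["ext3 read"] / TIMINGS["raw read"]
    print(f"raw read is {ratio:.2f}x the ext3 read rate")
```

With those assumptions the ext3 read works out to roughly 89 MB/s and the raw-device read to roughly 134 MB/s, a ratio of 1.5, which matches both the "~90MB/s" limit and the "at least 50% faster" remark in the original report.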