From tobi at oetiker.ch Tue Mar 3 13:59:54 2009
From: tobi at oetiker.ch (Tobias Oetiker)
Date: Tue, 3 Mar 2009 14:59:54 +0100 (CET)
Subject: external journal damaged after crash
Message-ID: 

We are running ext3 with external journal on an ssd disk, as described on
http://insights.oetiker.ch/linux/external-journal-on-ssd/

Today our server crashed (reason unknown) and when it booted again, all the ext3 filesystems with external journal refused to mount with the following error message

Mar 3 12:16:58 srv-rhein kernel: [ 331.556972] EXT3-fs: journal UUID does not match [kern.err]
Mar 3 12:16:58 srv-rhein kernel: [ 331.605125] EXT3-fs: journal UUID does not match [kern.err]
Mar 3 12:16:58 srv-rhein kernel: [ 331.641793] EXT3-fs: journal UUID does not match [kern.err]
Mar 3 12:16:58 srv-rhein kernel: [ 331.668446] EXT3-fs: journal UUID does not match [kern.err]
Mar 3 12:16:58 srv-rhein kernel: [ 331.714423] EXT3-fs: journal UUID does not match [kern.err]
Mar 3 12:16:58 srv-rhein kernel: [ 331.749856] EXT3-fs: journal UUID does not match [kern.err]
Mar 3 12:16:58 srv-rhein kernel: [ 331.779711] EXT3-fs: journal UUID does not match [kern.err]

It was the same from ALL filesystems so I suspect there is a systematic problem ...

We are running kernel 2.6.22.18

The filesystems were all mounted with defaults,rw,errors=panic,noatime,data=journal

I could recover from the problem by

- fsck of the filesystems (all were consistent)
- re-initializing all journals
- re-adding all the journals

Any ideas what caused the problem ? After all this is exactly the type of crash we have these journals for, so it is pretty bad when they fail exactly at that moment ...

Should we not be using the LABEL syntax for journal reference ?

cheers
tobi

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch tobi at oetiker.ch ++41 62 775 9902 / sb: -9900

From sandeen at redhat.com Tue Mar 3 15:27:25 2009
From: sandeen at redhat.com (Eric Sandeen)
Date: Tue, 03 Mar 2009 09:27:25 -0600
Subject: Questions regarding journal replay
In-Reply-To: <20090225193108.GE8554@charite.de>
References: <20090205125847.GR23918@charite.de> <20090206142641.9FE446F064@alopias.GreenKey.net> <20090206142822.GE31519@charite.de> <20090225162426.GA26291@charite.de> <49A5726E.6030703@redhat.com> <20090225172334.GF26291@charite.de> <49A58199.2060101@redhat.com> <20090225174038.GH26291@charite.de> <49A59BE3.6070906@redhat.com> <20090225193108.GE8554@charite.de>
Message-ID: <49AD4C5D.7070500@redhat.com>

Ralf Hildebrandt wrote:
> * Eric Sandeen :
>
>>> Journal block size: 4096
>>> Journal length: 8488436
>>> Journal first block: 2
>>> Journal sequence: 0x0027c611
>>> Journal start: 2
>>> Journal number of users: 1
>>> Journal users: 032613d3-6035-4872-bc0a-11db92feec5e
>>
>> Ok we might be getting a little off-track here. Your journal is indeed 32G in size. But you also saw this with an internal journal, which should be limited to 128M, and yet you still saw a very long replay, right?
>
> 800s for 128M, yes

So how well can you reproduce this... it would be interesting to run this recovery under blktrace, to see where the IO is happening. I'd be most interested in that for the "normal" 128M log, not the crazy 32G log :)
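(A rough sketch of how to capture that -- the device names here are only examples, substitute whatever holds your journal and filesystem:

# blktrace -d /dev/sdb -o replay-trace &
# mount /dev/vg/fs /mnt        <- the replay happens during this mount
# kill %1
# blkparse -i replay-trace | less

That should show which sectors the recovery is actually hitting.)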
-Eric

From vegard at svanberg.no Wed Mar 4 10:53:11 2009
From: vegard at svanberg.no (Vegard Svanberg)
Date: Wed, 4 Mar 2009 11:53:11 +0100
Subject: file system, kernel or hardware raid failure?
Message-ID: <20090304105311.GW16295@svanberg.no>

I had a busy mailserver fail on me the other day. Below is what was printed in dmesg. We first suspected a hardware failure (raid controller or something else), so we moved the drives to another (identical hardware) machine and ran fsck. Fsck complained ("short read while reading inode") and asked if I wanted to ignore and rewrite (which I did).

After booting up again, the problem came back immediately and root was remounted read only. We moved the data from the read only drive to a new machine. While copying the data, we got this message from time to time (on various files): "EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=22561891, block=90243144".

I need to find the cause(s) of the problems. So far I have these questions/concerns:

- Kernel bug? (This is Ubuntu 8.10 with 2.6.27-7-server)
- Filesystem bug/failure?
- Did the RAID controller fail to detect a failing drive? This is an Adaptec aoc-usas-s4ir running on a Supermicro motherboard.

I suspect that one of the drives (RAID 6 btw) has failed, but I'm not sure what to do from here.

Any ideas? Thanks in advance.

dmesg:

[ 38.907730] end_request: I/O error, dev sda, sector 284688831
[ 38.907802] EXT3-fs error (device dm-0): read_block_bitmap: Cannot read block bitmap - block_group = 1086, block_bitmap = 35586048
[ 38.907956] Aborting journal on device dm-0.
[ 38.919742] ext3_abort called.
[ 38.919798] EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
[ 38.919942] Remounting filesystem read-only
[ 38.925855] __journal_remove_journal_head: freeing b_committed_data
[ 38.925915] journal commit I/O error
[ 38.925935] journal commit I/O error
[ 38.925953] journal commit I/O error
[ 38.943245] Remounting filesystem read-only
[ 38.958907] EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
[ 38.958988] EXT3-fs error (device dm-0) in ext3_truncate: Journal has aborted
[ 38.959051] EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
[ 38.959137] EXT3-fs error (device dm-0) in ext3_orphan_del: Journal has aborted
[ 38.959222] EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
[ 39.024087] journal commit I/O error
[ 39.024103] journal commit I/O error
[ 39.024117] journal commit I/O error
[ 39.024124] journal commit I/O error
[ 39.024181] journal commit I/O error
[ 39.024201] journal commit I/O error
[ 39.024208] journal commit I/O error
[ 39.024258] journal commit I/O error
[ 39.024275] journal commit I/O error
[ 39.024284] journal commit I/O error
[ 39.024330] journal commit I/O error
[ 39.024358] journal commit I/O error
[ 39.024384] journal commit I/O error
[ 39.024432] journal commit I/O error
[ 39.024481] journal commit I/O error
[ 45.749997] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[ 45.750008] sd 0:0:0:0: [sda] Sense Key : Hardware Error [current]
[ 45.750012] sd 0:0:0:0: [sda] Add. Sense: Internal target failure
[ 45.750017] end_request: I/O error, dev sda, sector 721945599
[ 45.750079] Buffer I/O error on device dm-0, logical block 90243144
[ 45.750137] lost page write due to I/O error on dm-0
[ 87.970284] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[ 87.970292] sd 0:0:0:0: [sda] Sense Key : Hardware Error [current]
[ 87.970296] sd 0:0:0:0: [sda] Add. Sense: Internal target failure
[ 87.970302] end_request: I/O error, dev sda, sector 83324999

-- 
Vegard Svanberg [*Takapa at IRC (EFnet)]

From sandeen at redhat.com Wed Mar 4 17:26:11 2009
From: sandeen at redhat.com (Eric Sandeen)
Date: Wed, 04 Mar 2009 11:26:11 -0600
Subject: file system, kernel or hardware raid failure?
In-Reply-To: <20090304105311.GW16295@svanberg.no>
References: <20090304105311.GW16295@svanberg.no>
Message-ID: <49AEB9B3.5090902@redhat.com>

Vegard Svanberg wrote:
> I had a busy mailserver fail on me the other day. Below is what was printed in dmesg. We first suspected a hardware failure (raid controller or something else), so we moved the drives to another (identical hardware) machine and ran fsck. Fsck complained ("short read while reading inode") and asked if I wanted to ignore and rewrite (which I did).
>
> After booting up again, the problem came back immediately and root was remounted read only. We moved the data from the read only drive to a new machine. While copying the data, we got this message from time to time (on various files): "EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=22561891, block=90243144".
>
> I need to find the cause(s) of the problems. So far I have these questions/concerns:
>
> - Kernel bug? (This is Ubuntu 8.10 with 2.6.27-7-server)
> - Filesystem bug/failure?
> - Did the RAID controller fail to detect a failing drive? This is an Adaptec aoc-usas-s4ir running on a Supermicro motherboard.
>
> I suspect that one of the drives (RAID 6 btw) has failed, but I'm not sure what to do from here.
>
> Any ideas? Thanks in advance.
>
> dmesg:
>
> [ 38.907730] end_request: I/O error, dev sda, sector 284688831

Drive hardware on sda failing; I'd run smart tools or vendor diagnostics, to be sure.

> [ 45.749997] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
> [ 45.750008] sd 0:0:0:0: [sda] Sense Key : Hardware Error [current]
                                             ^^^^^^^^^^^^^^

I can't speak to whether the raid controller should have detected this.

-Eric

From sandeen at redhat.com Mon Mar 9 21:15:49 2009
From: sandeen at redhat.com (Eric Sandeen)
Date: Mon, 09 Mar 2009 16:15:49 -0500
Subject: ext4 and unexpected eh_depth
In-Reply-To: <498954BC.9050805@iki.fi>
References: <498954BC.9050805@iki.fi>
Message-ID: <49B58705.7050003@redhat.com>

Markus Peuhkuri wrote:
> Hi, I'm running Debian lenny with linux-image-2.6.26-1-amd64 (deb 2.6.26-11). I have an lvm stripe over three sata disks (3.5TB total) that is shared over NFS, and I'm getting the following errors
>
> EXT4-fs error (device dm-0): ext4_ext_search_right: bad header in inode #269200: unexpected eh_depth - magic f30a, entries 18, max 340(0), depth 1(2)
>
> A user is having errors concatenating large files (100+GB): basically it seems that the resulting file is the right size and ends with the right data, but he still gets the following error:
> cat: write error: Input/output error
> on the system that has imported the partition over NFS. I'm not sure if the file he was accessing had the same inode.

...

If you are still seeing this, the patch at:

http://bugzilla.kernel.org/show_bug.cgi?id=12821#c8

may resolve it, if you're willing to test.
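(Roughly, assuming you build your own kernels and saved the attachment from that bugzilla entry as eh_depth-fix.patch -- a file name I just made up:

cd linux-2.6.26
patch -p1 < ../eh_depth-fix.patch
make && make modules_install install

then boot the patched kernel and see if the error comes back.)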
Thanks,
-Eric

From mck222 at gmail.com Thu Mar 12 20:27:20 2009
From: mck222 at gmail.com (James McKain [Gmail])
Date: Thu, 12 Mar 2009 16:27:20 -0400
Subject: Ubuntu Ibex on single SATA Seagate disk, ext3
Message-ID: <16d3fdee0903121327h6038dccfif0591855f2c9b328@mail.gmail.com>

I'm having a strange problem I've never seen before. Sometimes my system crashes, and upon restart I am missing *at least* a handful of files. They are completely gone and untraceable. At first I forced fsck on reboot, and that helped recover some of them, but the problem continues. I have no clue even where to start tracing this. Can anyone help? The system is a new AMD Phenom on a SATA Seagate 1TB disk. Ubuntu Ibex is the only OS loaded, on ext3 partitioned 5 ways.

My system just crashed today, and now that I'm back up and running there is one file in particular that is completely gone. I haven't touched this file in months, and it just up and disappeared.

I checked with Seagate to see if my drive was part of the recall, it's not. (so they say)

HELP!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From lists at nerdbynature.de Fri Mar 13 06:25:15 2009
From: lists at nerdbynature.de (Christian Kujau)
Date: Thu, 12 Mar 2009 23:25:15 -0700 (PDT)
Subject: Ubuntu Ibex on single SATA Seagate disk, ext3
In-Reply-To: <16d3fdee0903121327h6038dccfif0591855f2c9b328@mail.gmail.com>
References: <16d3fdee0903121327h6038dccfif0591855f2c9b328@mail.gmail.com>
Message-ID: 

On Thu, 12 Mar 2009, James McKain [Gmail] wrote:
> I'm having a strange problem I've never seen before. Sometimes my system crashes

What's wrong with this system that it sometimes just "crashes"?

> , and upon restart I am missing *at least* a handful of files. They are completely gone and untraceable. At first I forced fsck on reboot, and that helped recover some of them, but the problem continues.

Anything related in the logs, does fsck report anything during bootup?

> My system just crashed today, and now that I'm back up and running there is one file in particular that is completely gone. I haven't touched this file in months, and it just up and disappeared.

Did anything show up in /lost+found?

> I checked with Seagate to see if my drive was part of the recall, it's not.

Speaking of drive errors: did you test your drive, can you read from the beginning to the end of it without any errors?
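(Something like "dd if=/dev/sda of=/dev/null bs=1M" or "badblocks -sv /dev/sda" will do -- both read the entire disk and complain loudly about unreadable sectors. /dev/sda is just an assumption, use whatever your disk actually is.)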
Christian.
-- 
If you want to contact Bruce Schneier, just type his name into your shell prompt.

From tweeks at rackspace.com Fri Mar 13 19:24:30 2009
From: tweeks at rackspace.com (Tweeks)
Date: Fri, 13 Mar 2009 14:24:30 -0500
Subject: Ubuntu Ibex on single SATA Seagate disk, ext3
In-Reply-To: <16d3fdee0903121327h6038dccfif0591855f2c9b328@mail.gmail.com>
References: <16d3fdee0903121327h6038dccfif0591855f2c9b328@mail.gmail.com>
Message-ID: <12529_1236972244_n2DJO4bv003331_200903131424.30798.tweeks@rackspace.com>

On Thursday 12 March 2009, James McKain [Gmail] wrote:
> I'm having a strange problem I've never seen before. Sometimes my system crashes, and upon restart I am missing *at least* a handful of files. They are completely gone and untraceable. At first I forced fsck on reboot, and that helped recover some of them, but the problem continues. I have no clue even where to start tracing this. Can anyone help? The system is a new AMD Phenom on a SATA Seagate 1TB disk. Ubuntu Ibex is the only OS loaded, on ext3 partitioned 5 ways.
>
> My system just crashed today, and now that I'm back up and running there is one file in particular that is completely gone. I haven't touched this file in months, and it just up and disappeared.
>
> I checked with Seagate to see if my drive was part of the recall, it's not. (so they say)

Test it using smart:

# smartctl -T permissive -d ata -s on /dev/sda
# smartctl -T permissive -d ata -t long /dev/sda && sleep 2h && smartctl -T permissive -d ata -l selftest /dev/sda

If the output is all 00%, then the drive is passing its self test. If it's failing before it reaches 00% (down from 100%) then it's failing.. get your stuff off asap.
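(For a quicker go/no-go while the long test runs, something like this works too -- same device name assumption:

# smartctl -T permissive -d ata -H /dev/sda
# smartctl -T permissive -d ata -a /dev/sda

-H prints the overall health verdict, -a dumps all the attributes plus the self-test log.)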
Tweeks

Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated.

From darkonc at gmail.com Fri Mar 13 20:39:27 2009
From: darkonc at gmail.com (Stephen Samuel)
Date: Fri, 13 Mar 2009 13:39:27 -0700
Subject: Ubuntu Ibex on single SATA Seagate disk, ext3
In-Reply-To: <12529_1236972244_n2DJO4bv003331_200903131424.30798.tweeks@rackspace.com>
References: <16d3fdee0903121327h6038dccfif0591855f2c9b328@mail.gmail.com> <12529_1236972244_n2DJO4bv003331_200903131424.30798.tweeks@rackspace.com>
Message-ID: <6cd50f9f0903131339kcfb3707sf7acc81b07f707f3@mail.gmail.com>

One question I have is: why is the system repeatedly rebooting?? The answer to that question may point to something about why you're losing data.

I tend to find that, with Linux (and unlike some other OSs [cough]Windows[cough]), repeated crashes tend to point to some kind of hardware error, and -- where there's a software source to the crashes -- there are people who are genuinely interested in (and capable of) resolving the problem.

On Fri, Mar 13, 2009 at 12:24 PM, Tweeks wrote:
> On Thursday 12 March 2009, James McKain [Gmail] wrote:
> > I'm having a strange problem I've never seen before. Sometimes my system crashes, and upon restart I am missing *at least* a handful of files. They are completely gone and untraceable. At first I forced fsck on reboot, and that helped recover some of them, but the problem continues. I have no clue even where to start tracing this. Can anyone help? The system is a new AMD Phenom on a SATA Seagate 1TB disk. Ubuntu Ibex is the only OS loaded, on ext3 partitioned 5 ways.
> >
> > My system just crashed today, and now that I'm back up and running there is one file in particular that is completely gone. I haven't touched this file in months, and it just up and disappeared.
> >
> > I checked with Seagate to see if my drive was part of the recall, it's not. (so they say)
>
> Test it using smart:
>
> # smartctl -T permissive -d ata -s on /dev/sda
> # smartctl -T permissive -d ata -t long /dev/sda && sleep 2h && smartctl -T permissive -d ata -l selftest /dev/sda
>
> If the output is all 00%, then the drive is passing its self test. If it's failing before it reaches 00% (down from 100%) then it's failing.. get your stuff off asap.
>
> Tweeks
>
> Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated.
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users

-- 
Stephen Samuel http://www.bcgreen.com
778-861-7641
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tweeks at rackspace.com Fri Mar 13 22:55:12 2009
From: tweeks at rackspace.com (Tweeks)
Date: Fri, 13 Mar 2009 17:55:12 -0500
Subject: Ubuntu Ibex on single SATA Seagate disk, ext3
In-Reply-To: <6cd50f9f0903131339kcfb3707sf7acc81b07f707f3@mail.gmail.com>
References: <16d3fdee0903121327h6038dccfif0591855f2c9b328@mail.gmail.com> <12529_1236972244_n2DJO4bv003331_200903131424.30798.tweeks@rackspace.com> <6cd50f9f0903131339kcfb3707sf7acc81b07f707f3@mail.gmail.com>
Message-ID: <200903131755.13324.tweeks@rackspace.com>

But when Linux crashes.. it tends NOT to just automatically reboot (unless specifically set up to do so). It usually either black screens, seg-faults, or stops.

Tweeks

On Friday 13 March 2009, Stephen Samuel wrote:
> One question I have is: why is the system repeatedly rebooting?? The answer to that question may point to something about why you're losing data.
>
> I tend to find that, with Linux (and unlike some other OSs [cough]Windows[cough]), repeated crashes tend to point to some kind of hardware error, and -- where there's a software source to the crashes -- there are people who are genuinely interested in (and capable of) resolving the problem.
>
> On Fri, Mar 13, 2009 at 12:24 PM, Tweeks wrote:
> > On Thursday 12 March 2009, James McKain [Gmail] wrote:
> > > I'm having a strange problem I've never seen before. Sometimes my system crashes, and upon restart I am missing *at least* a handful of files. They are completely gone and untraceable. At first I forced fsck on reboot, and that helped recover some of them, but the problem continues. I have no clue even where to start tracing this. Can anyone help? The system is a new AMD Phenom on a SATA Seagate 1TB disk. Ubuntu Ibex is the only OS loaded, on ext3 partitioned 5 ways.
> > >
> > > My system just crashed today, and now that I'm back up and running there is one file in particular that is completely gone. I haven't touched this file in months, and it just up and disappeared.
> > >
> > > I checked with Seagate to see if my drive was part of the recall, it's not. (so they say)
> >
> > Test it using smart:
> >
> > # smartctl -T permissive -d ata -s on /dev/sda
> > # smartctl -T permissive -d ata -t long /dev/sda && sleep 2h && smartctl -T permissive -d ata -l selftest /dev/sda
> >
> > If the output is all 00%, then the drive is passing its self test. If it's failing before it reaches 00% (down from 100%) then it's failing.. get your stuff off asap.
> >
> > Tweeks
> >
> > Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated.
> >
> > _______________________________________________
> > Ext3-users mailing list
> > Ext3-users at redhat.com
> > https://www.redhat.com/mailman/listinfo/ext3-users

From mck222 at gmail.com Sat Mar 14 03:31:56 2009
From: mck222 at gmail.com (James McKain [Gmail])
Date: Fri, 13 Mar 2009 23:31:56 -0400
Subject: Ubuntu Ibex on single SATA Seagate disk, ext3
In-Reply-To: <6cd50f9f0903131339kcfb3707sf7acc81b07f707f3@mail.gmail.com>
References: <16d3fdee0903121327h6038dccfif0591855f2c9b328@mail.gmail.com> <12529_1236972244_n2DJO4bv003331_200903131424.30798.tweeks@rackspace.com> <6cd50f9f0903131339kcfb3707sf7acc81b07f707f3@mail.gmail.com>
Message-ID: <16d3fdee0903132031v5079257dw91ddaa588e65b947@mail.gmail.com>

Hi group, thanks for the replies. I ran the smartctl diags and they returned successful. No problems there. Thanks for the commands.

The "crash" problems have mainly been freeze-ups. Everything just seems to halt and I have to hit the reset button. I had a couple instances where X crashed but the machine seemed to stay running. I just had to hit the reset button to continue. The other day my keyboard and mouse suddenly froze. I could see the system running fine, but I couldn't interact with it. I could pull up SMB shares and even fired-through a print job, but nothing I did could recover the input devices (not even physical disconnection), except of course for the magic reset button.

The disk/EXT3 problems seem to surface after the crashes; fsck repairs most of the damage, but in every instance I'm missing at least a handful of files from across the disk/partitions. Some files are gone without a trace, others are reverted back to states/timestamps of months prior, with any recent changes (<2wks) completely overwritten. By far the strangest disk behavior I've ever come across... how does a file decide to revert itself, and how does it regurgitate the previous file/timestamp to revert back to!? Almost as if there was a secret CVS system running or something.

What I find most bizarre about the file problem is that my operating system files have remained resilient. Not one missing critical file or lost config. It always seems to be my email (Thunderbird), PHP or Java (.jsp and .java) files that are affected. Any chance this is a permissions issue tip-off?

I have good reason to believe there is a problem with the motherboard. I was getting 4 beeps when it posts so I started debugging via BIOS settings and discovered if I disable USB support it posts cleanly. Running stable now, so far so good anyways (8hrs). I am in the process of getting an RMA from MSI... should have a replacement in 10-15 days they say. They suggested a bad cmos battery could cause the I/O write errors, which I skeptically absorbed (from the tech). I think it's just a bad board. If anyone is interested in the outcome let me know and I'll fill you in a couple weeks from now.
For the rest of the group who get too many emails as it is, thank you for listening!

Best.
-James

2009/3/13 Stephen Samuel 

> One question I have is: why is the system repeatedly rebooting?? The answer to that question may point to something about why you're losing data.
>
> I tend to find that, with Linux (and unlike some other OSs [cough]Windows[cough]), repeated crashes tend to point to some kind of hardware error, and -- where there's a software source to the crashes -- there are people who are genuinely interested in (and capable of) resolving the problem.
>
> On Fri, Mar 13, 2009 at 12:24 PM, Tweeks wrote:
>
>> On Thursday 12 March 2009, James McKain [Gmail] wrote:
>> > I'm having a strange problem I've never seen before. Sometimes my system crashes, and upon restart I am missing *at least* a handful of files. They are completely gone and untraceable. At first I forced fsck on reboot, and that helped recover some of them, but the problem continues. I have no clue even where to start tracing this. Can anyone help? The system is a new AMD Phenom on a SATA Seagate 1TB disk. Ubuntu Ibex is the only OS loaded, on ext3 partitioned 5 ways.
>> >
>> > My system just crashed today, and now that I'm back up and running there is one file in particular that is completely gone. I haven't touched this file in months, and it just up and disappeared.
>> >
>> > I checked with Seagate to see if my drive was part of the recall, it's not. (so they say)
>>
>> Test it using smart:
>>
>> # smartctl -T permissive -d ata -s on /dev/sda
>> # smartctl -T permissive -d ata -t long /dev/sda && sleep 2h && smartctl -T permissive -d ata -l selftest /dev/sda
>>
>> If the output is all 00%, then the drive is passing its self test. If it's failing before it reaches 00% (down from 100%) then it's failing.. get your stuff off asap.
>>
>> Tweeks
>>
>> Confidentiality Notice: This e-mail message (including any attached or embedded documents) is intended for the exclusive and confidential use of the individual or entity to which this message is addressed, and unless otherwise expressly indicated, is confidential and privileged information of Rackspace. Any dissemination, distribution or copying of the enclosed material is prohibited. If you receive this transmission in error, please notify us immediately by e-mail at abuse at rackspace.com, and delete the original message. Your cooperation is appreciated.
>>
>> _______________________________________________
>> Ext3-users mailing list
>> Ext3-users at redhat.com
>> https://www.redhat.com/mailman/listinfo/ext3-users
>
> --
> Stephen Samuel http://www.bcgreen.com
> 778-861-7641
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From laytonjb at att.net Sun Mar 15 19:21:58 2009
From: laytonjb at att.net (Jeff Layton)
Date: Sun, 15 Mar 2009 14:21:58 -0500
Subject: Problem creating external journal with ext3
Message-ID: <49BD5556.6090900@att.net>

Afternoon,

I've been trying to create a new ext3 that has an external journal. I have two 500GB SATA drives (/dev/sda and /dev/sdb) that I'm using for my experiments.
I partitioned /dev/sda into a single partition (/dev/sda1) and I partitioned /dev/sdb into two partitions, /dev/sdb1 that is 1024M in size and /dev/sdb2 that is the remainder. I want to create an ext3 file system on /dev/sda1 and use /dev/sdb1 as an external journal. (BTW - I'm running the 2.6.28.7 kernel and 1.41.4 for e2fsprogs).

I've tried various commands and they all fail. What I'm currently trying,

mke2fs -O journal_dev /dev/sdb1
mke2fs -t ext3 -O ^has_journal /dev/sdba1
tune2fs -o journal_data -j -J device=/dev/sdb1 /dev/sda1

gives me an error at the very end:

tune2fs: The ext2 superblock is corrupt
	while trying to open journal on /dev/sdb1

Any ideas?

TIA!

Jeff

From laytonjb at att.net Sun Mar 15 19:25:22 2009
From: laytonjb at att.net (Jeff Layton)
Date: Sun, 15 Mar 2009 14:25:22 -0500
Subject: Problem creating external journal with ext3
In-Reply-To: <49BD5556.6090900@att.net>
References: <49BD5556.6090900@att.net>
Message-ID: <49BD5622.4060306@att.net>

Jeff Layton wrote:
> mke2fs -t ext3 -O ^has_journal /dev/sdba1

Already found a typo (just in the email, not on the system). Should be,

mke2fs -t ext3 -O ^has_journal /dev/sda1

Sorry for the churn,

Jeff

From vegard at svanberg.no Mon Mar 16 07:57:51 2009
From: vegard at svanberg.no (Vegard Svanberg)
Date: Mon, 16 Mar 2009 08:57:51 +0100
Subject: file system, kernel or hardware raid failure?
In-Reply-To: <49AEB9B3.5090902@redhat.com>
References: <20090304105311.GW16295@svanberg.no> <49AEB9B3.5090902@redhat.com>
Message-ID: <20090316075751.GH3501@svanberg.no>

* Eric Sandeen [2009-03-04 18:26]:
> > [ 38.907730] end_request: I/O error, dev sda, sector 284688831
>
> Drive hardware on sda failing; I'd run smart tools or vendor diagnostics, to be sure.

Late answer, but... After posting this, we figured this had to be due to a power failure occurring some weeks before. But yesterday, we suddenly had one other, identical, machine failing with exactly the same error messages:

[2834866.071770] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[2834866.071778] sd 0:0:0:0: [sda] Sense Key : Hardware Error [current]
[2834866.071782] sd 0:0:0:0: [sda] Add. Sense: Internal target failure
[2834866.071787] end_request: I/O error, dev sda, sector 302515639
[2834866.071823] EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=9455580, block=37814399

Any ideas? Fsck will fix the errors, but report short reads while fixing, and on reboot/remount, the problems are back.

As this probably isn't relevant to the ext3 list anymore (I guess it's more relevant for kernel/SCSI subsystem developers), I'll find some other lists.

-- 
Vegard Svanberg [*Takapa at IRC (EFnet)]

From neishm at atmosp.physics.utoronto.ca Mon Mar 16 16:39:40 2009
From: neishm at atmosp.physics.utoronto.ca (Michael Neish)
Date: Mon, 16 Mar 2009 12:39:40 -0400 (EDT)
Subject: Problem creating external journal with ext3
In-Reply-To: <49BD5556.6090900@att.net>
References: <49BD5556.6090900@att.net>
Message-ID: <46484.99.237.74.5.1237221580.squirrel@mail.atmosp.physics.utoronto.ca>

Can you verify if the journal has the same block size as the fs?

/not an ext3 expert

> Afternoon,
>
> I've been trying to create a new ext3 that has an external journal. I have two 500GB SATA drives (/dev/sda and /dev/sdb) that I'm using for my experiments. I partitioned /dev/sda into a single partition (/dev/sda1) and I partitioned /dev/sdb into two partitions, /dev/sdb1 that is 1024M in size and /dev/sdb2 that is the remainder.
> I want to create an ext3 file system on /dev/sda1 and use /dev/sdb1 as an external journal. (BTW - I'm running the 2.6.28.7 kernel and 1.41.4 for e2fsprogs).
>
> I've tried various commands and they all fail. What I'm currently trying,
>
> mke2fs -O journal_dev /dev/sdb1
> mke2fs -t ext3 -O ^has_journal /dev/sdba1
> tune2fs -o journal_data -j -J device=/dev/sdb1 /dev/sda1
>
> gives me an error at the very end:
>
> tune2fs: The ext2 superblock is corrupt
> 	while trying to open journal on /dev/sdb1
>
> Any ideas?
>
> TIA!
>
> Jeff
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users

From laytonjb at att.net Mon Mar 16 20:11:19 2009
From: laytonjb at att.net (Jeff Layton)
Date: Mon, 16 Mar 2009 15:11:19 -0500
Subject: Problem creating external journal with ext3
In-Reply-To: <46484.99.237.74.5.1237221580.squirrel@mail.atmosp.physics.utoronto.ca>
References: <49BD5556.6090900@att.net> <46484.99.237.74.5.1237221580.squirrel@mail.atmosp.physics.utoronto.ca>
Message-ID: <49BEB267.7010207@att.net>

Michael Neish wrote:
> Can you verify if the journal has the same block size as the fs?
>
> /not an ext3 expert

They had the same block size (but I will rerun the steps and force the block size).

Jeff

From tytso at mit.edu Tue Mar 17 02:20:19 2009
From: tytso at mit.edu (Theodore Tso)
Date: Mon, 16 Mar 2009 22:20:19 -0400
Subject: Problem creating external journal with ext3
In-Reply-To: <49BD5556.6090900@att.net>
References: <49BD5556.6090900@att.net>
Message-ID: <20090317022019.GB15989@mit.edu>

On Sun, Mar 15, 2009 at 02:21:58PM -0500, Jeff Layton wrote:
> Afternoon,
>
> mke2fs -O journal_dev /dev/sdb1
> mke2fs -t ext3 -O ^has_journal /dev/sdba1
> tune2fs -o journal_data -j -J device=/dev/sdb1 /dev/sda1

Ah, I see what is happening. When we added some consistency checks to prevent insane filesystems from causing e2fsprogs to core dump, we accidentally broke the ability of e2fsprogs to manipulate journal device files. Oops. This is a regression in e2fsprogs 1.41.4, that we'll fix in e2fsprogs 1.41.5. I've included the patch below.

- Ted

commit 341b52dfa8c2afc11d8410de9bd381b03e75c6af
Author: Theodore Ts'o 
Date: Mon Mar 16 22:16:44 2009 -0400

    libext2fs: external journal devices should not cause ext2fs_open2 to fail
    
    This fixes a regression introduced in commit 79a9ab14 which caused attempts to open external journals to fail due to overly strict filesystem consistency checks.
    Signed-off-by: "Theodore Ts'o" 

diff --git a/lib/ext2fs/openfs.c b/lib/ext2fs/openfs.c
index cdfeaec..f6fe3f0 100644
--- a/lib/ext2fs/openfs.c
+++ b/lib/ext2fs/openfs.c
@@ -243,10 +243,6 @@ errcode_t ext2fs_open2(const char *name, const char *io_options,
 		goto cleanup;
 	}
 	fs->fragsize = EXT2_FRAG_SIZE(fs->super);
-	if (EXT2_INODES_PER_GROUP(fs->super) == 0) {
-		retval = EXT2_ET_CORRUPT_SUPERBLOCK;
-		goto cleanup;
-	}
 	fs->inode_blocks_per_group = ((EXT2_INODES_PER_GROUP(fs->super) *
 				       EXT2_INODE_SIZE(fs->super) +
 				       EXT2_BLOCK_SIZE(fs->super) - 1) /
@@ -273,6 +269,11 @@ errcode_t ext2fs_open2(const char *name, const char *io_options,
 		return 0;
 	}
 
+	if (EXT2_INODES_PER_GROUP(fs->super) == 0) {
+		retval = EXT2_ET_CORRUPT_SUPERBLOCK;
+		goto cleanup;
+	}
+
 	/*
 	 * Read group descriptors
 	 */

From laytonjb at att.net Tue Mar 17 12:25:31 2009
From: laytonjb at att.net (Jeff Layton)
Date: Tue, 17 Mar 2009 07:25:31 -0500
Subject: Problem creating external journal with ext3
In-Reply-To: <20090317022019.GB15989@mit.edu>
References: <49BD5556.6090900@att.net> <20090317022019.GB15989@mit.edu>
Message-ID: <49BF96BB.2020804@att.net>

Theodore Tso wrote:
> On Sun, Mar 15, 2009 at 02:21:58PM -0500, Jeff Layton wrote:
>> Afternoon,
>>
>> mke2fs -O journal_dev /dev/sdb1
>> mke2fs -t ext3 -O ^has_journal /dev/sdba1
>> tune2fs -o journal_data -j -J device=/dev/sdb1 /dev/sda1
>
> Ah, I see what is happening. When we added some consistency checks to prevent insane filesystems from causing e2fsprogs to core dump, we accidentally broke the ability of e2fsprogs to manipulate journal device files. Oops. This is a regression in e2fsprogs 1.41.4, that we'll fix in e2fsprogs 1.41.5. I've included the patch below.
>
> - Ted

Thanks! I'll give the patch a whirl and keep an eye out for 1.41.5.

Jeff

From lists at nerdbynature.de Wed Mar 18 07:53:04 2009
From: lists at nerdbynature.de (Christian Kujau)
Date: Wed, 18 Mar 2009 00:53:04 -0700 (PDT)
Subject: file system, kernel or hardware raid failure?
In-Reply-To: <20090316075751.GH3501@svanberg.no>
References: <20090304105311.GW16295@svanberg.no> <49AEB9B3.5090902@redhat.com> <20090316075751.GH3501@svanberg.no>
Message-ID: 

On Mon, 16 Mar 2009, Vegard Svanberg wrote:
> [2834866.071770] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
> [2834866.071778] sd 0:0:0:0: [sda] Sense Key : Hardware Error [current]
                                                 ^^^^^^^^^^^^^^
> Any ideas? Fsck will fix the errors, but report short reads while

No, you don't want to run fsck on a faulty device, it'll probably make things even worse. If the data is valuable, make two copies of it (with dd(1) or dd_rescue), then run fsck on one of these.
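(Rough sketch, assuming the filesystem device is /dev/dm-0 as in your logs and that /mnt/spare has enough room -- that path is made up:

# dd if=/dev/dm-0 of=/mnt/spare/dm0.img bs=1M conv=noerror,sync
# cp /mnt/spare/dm0.img /mnt/spare/dm0-work.img
# fsck.ext3 -f /mnt/spare/dm0-work.img

conv=noerror,sync keeps dd going past read errors and pads the unreadable blocks, and e2fsck is perfectly happy to run against an image file.)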
Christian.
-- 
Bruce Schneier has found SHA-512 preimages of all these facts.

From sandro at e-den.it Wed Mar 18 11:03:51 2009
From: sandro at e-den.it (Alessandro Dentella)
Date: Wed, 18 Mar 2009 12:03:51 +0100
Subject: undeletable files - even after fsck.ext3 -f (long)
Message-ID: <20090318110351.GA6753@ubuntu>

Hi,

sorry for the long mail but I'm really in an emergency and don't have enough specific knowledge to understand what is happening.

situation
=========

I have a samba server (debian etch, kernel 2.6.26) on which a user 'silvia' has a profile that got corrupted in some way (see below). I wanted to get rid of the files but there is no way to delete them:

srv-: # ll
total 16
?---------  ? ?      ?               ?                ? compreg.dat
drwxrwsr-x+ 2 silvia amministrazione 6 2007-06-08 18:33 extensions
?---------  ? ?      ?               ?                ? extensions.rdf
drwxrwsr-x+ 3 silvia amministrazione 46 2007-10-31 16:16 ImapMail
-rwx------  1 mauro  amministrazione 62 2007-03-28 15:53 impab.mab
drwxrwsr-x+ 3 silvia amministrazione 26 2007-06-08 18:34 Mail

srv-: # rm compreg.dat
rm: cannot remove `compreg.dat': No such file or directory

I have already used fsck (clean filesystem) and fsck -f (FILESYSTEM MODIFIED, but I think that is just because it added lost+found).

If I look for all files I obtain:

# find ! -user silvia -ls
find: ./guhdjtzd.default/extensions.rdf: No such file or directory
find: ./guhdjtzd.default/extensions.rdf: No such file or directory
53348196 4 -rwx------ 1 mauro amministrazione 62 Mar 28 2007 ./guhdjtzd.default/impab.mab
find: ./guhdjtzd.default/compreg.dat: No such file or directory
find: ./guhdjtzd.default/compreg.dat: No such file or directory

copying files
==============

I also tried to copy using 'cp -a dir_with_corrupted_file /tmp' and I ended up with:

1. filling the / (it was way too big -- / is a different filesystem/disk)

2. a directory that cannot be deleted:

srv-fossati:/tmp/pappo# rm -Rf M
rm: cannot chdir from `.' to `M': Exec format error

origin of these files
=====================

Just in case it is of any help to diagnose the problem, the corrupted files have a very strange origin: they were found there with ownership 'postfix' in a folder of the profile of 'silvia' in a directory of Thunderbird settings. At that moment (apart from the ownership, which was not easy to explain) I could read the file (a normal mail message). What was strange was that I could see 2 occurrences of that file using 'ls -l', so I thought there was some unprintable char in the name and decided to delete it. After the attempt to delete it I obtained the situation described above.

What now?
=========

I'm really lost. I don't have *any* idea of how to repair the filesystem or what the cause can be: virus? memory corruption on the server (I tested it 10 days ago due to some strange freezes, but memtest showed no error in 24 hours)? controller (/ and /home are on different controllers)?

I'd be really thankful for any possible hint.

Thanks in advance

sandro

-- 
Sandro Dentella *:-)
http://sqlkit.argolinux.org SQLkit home page - PyGTK/python/sqlalchemy

From sandro at e-den.it Wed Mar 18 12:11:49 2009
From: sandro at e-den.it (Alessandro Dentella)
Date: Wed, 18 Mar 2009 13:11:49 +0100
Subject: undeletable files - even after fsck.ext3 -f (long)
In-Reply-To: <20090318110351.GA6753@ubuntu>
References: <20090318110351.GA6753@ubuntu>
Message-ID: <20090318121149.GA13636@ubuntu>

On Wed, Mar 18, 2009 at 12:03:51PM +0100, Alessandro Dentella wrote:
> Hi,
>
> sorry for the long mail but I'm really in an emergency and don't have enough specific knowledge to understand what is happening.

I'm sorry for this mail, but I made a big mistake. I checked one filesystem when the file was on another filesystem... checking the correct filesystem (/home, which was xfs) fixed the problem.

Still there are many things that are not at all clear in the genesis of the problem, including the fact that copying the directory propagated the corruption to another filesystem. Should anybody have any idea I'd be pleased to read it.

sandro

From stevec at cisco.com Thu Mar 19 17:55:31 2009
From: stevec at cisco.com (Steven Chervets)
Date: Thu, 19 Mar 2009 11:55:31 -0600
Subject: Sparse File Creation
Message-ID: 

I need to create an 8 GB sized sparse file which will do an actual block allocation.
In other words, I want to allocate the blocks in the directory structure, but not fill them with zeroes until they are actually used. Since this will be implemented in a closed appliance I'm not concerned about security and can execute any command as root.

I have tried the dd seek command. And while it creates a sparse file, it doesn't reserve the blocks.

Thank you for your help.

Steve

From sandeen at redhat.com Thu Mar 19 18:29:23 2009
From: sandeen at redhat.com (Eric Sandeen)
Date: Thu, 19 Mar 2009 13:29:23 -0500
Subject: Sparse File Creation
In-Reply-To: 
References: 
Message-ID: <49C28F03.7000205@redhat.com>

Steven Chervets wrote:
> I need to create an 8 GB sized sparse file which will do an actual block allocation. In other words, I want to allocate the blocks in the directory structure, but not fill them with zeroes until they are actually used. Since this will be implemented in a closed appliance I'm not concerned about security and can execute any command as root.
>
> I have tried the dd seek command. And while it creates a sparse file, it doesn't reserve the blocks.
>
> Thank you for your help.
>
> Steve

Hi Steve -

On ext3 your only option to allocate a block is to write something into it.

On ext4, xfs, and similar extent-based filesystems, blocks are tracked with more metadata (the extent structures) so they can be allocated quickly and flagged as unwritten so that they will be read back as zeros. You can get to this through the fallocate syscall, glibc's posix_fallocate(3) (which falls back to 0-writing if the fs doesn't support sys_fallocate), and in bleeding-edge glibc, the fallocate(2) call.

Since you say you don't care about security, ext3 could probably be hacked to skip the actual block initialization part and expose garbage, but this is generally something filesystems try *very* hard to avoid, and I don't think there'll be any way to do it with the stock code.
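(So with stock ext3, if you can afford the one-time cost, just write the file out in full when you create it, e.g.:

dd if=/dev/zero of=/data/prealloc.img bs=1M count=8192

-- the path is only an example. That really allocates all 8GB up front, so nothing else can steal the space later.)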
-Eric

From dushyanth at gmail.com Tue Mar 24 11:45:08 2009
From: dushyanth at gmail.com (Dushyanth)
Date: Tue, 24 Mar 2009 11:45:08 +0000 (UTC)
Subject: Ext3 - Frequent read-only FS issues
Message-ID: 

Hi all,

I have a bunch of mail servers running postfix (external smtp), qmail (LDA) and courier IMAP/POP. Frequently, the ext3 filesystem goes into read-only mode, forcing recovery using fsck.

Below are the errors we have seen so far on these systems and those systems' config. The ext3 errors are common in many cases.

1. Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.10.el5 #1 SMP x86_64
Adaptec 2420SA RAID controller
Logical disk's write cache disabled, physical disk write cache enabled
Battery not installed
2 * 500GB ST373307LW in RAID1

Instances of EXT3 errors which caused read-only FS:

a. "EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 8766158 in dir #8765708"

2. Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.10.el5 #1 SMP x86_64
Dell PERC 6/i - Write back cache enabled
Battery available
2 * ST3500630AS 500GB in RAID1

Instances of EXT3 errors which caused read-only FS:

a. "EXT3-fs error (device sda3): ext3_lookup: unlinked inode 89065027 in dir #89065024"
b. "EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry #65077525: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0"
c. "EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in directory #65077525: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0"
d. "kernel: EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in directory #65077525: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0"

3. Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.13.el5 #1 SMP x86_64
Dell PERC 6/i - Write back cache enabled
Battery available
2 * ST3500630AS 500GB in RAID1

Instances of EXT3 errors which caused read-only FS:

a. "EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 26968135 in dir #35127737"
b. "EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 9994260 in dir #39518393"

* I am not sure what's causing these errors. These crashes have been happening for over 6 months now on a fairly regular basis - but on different servers.
* The disks on these systems seem to be fine - I haven't checked for badblocks on them yet.
* Each of those servers has its own disks for mail storage - there is no NFS or cluster FS involved.
* The inode numbers in "ext3_lookup: unlinked inode" seem to be referring to a non-existent courier pop/imap server cache file (courierpop3dsizelist).

At this point, I am trying to figure out the possible causes for such ext3 errors. Any pointers/recommendations will be of great help.

P.S.: The logs provided here are not complete and I would be glad to dig up & post complete logs of these events as required.

TIA
Dushyanth

From dushyanth at gmail.com Tue Mar 24 13:58:29 2009
From: dushyanth at gmail.com (Dushyanth)
Date: Tue, 24 Mar 2009 13:58:29 +0000 (UTC)
Subject: Ext3 - Frequent read-only FS issues
References: 
Message-ID: 

> I have a bunch of mail servers running postfix (external smtp), qmail (LDA) and courier IMAP/POP. Frequently, the ext3 filesystem goes into read-only mode, forcing recovery using fsck.
>
> Below are the errors we have seen so far on these systems and those systems' config. The ext3 errors are common in many cases.

Forgot to mention that all ext3 mounts are ordered

/dev/sdX1 on /mountpoint type ext3 (rw,noatime,usrquota,grpquota,data=ordered)

Dushyanth

From stevec at cisco.com Tue Mar 24 15:01:41 2009
From: stevec at cisco.com (Steven Chervets)
Date: Tue, 24 Mar 2009 09:01:41 -0600
Subject: Ext3 - Frequent read-only FS issues
In-Reply-To: 
References: 
Message-ID: 

Dushyanth,

If you have the disk write-cache enabled it means that any power outage can cause file system corruption. I would suggest that you put a battery in your disk controller and enable the write-cache on it, and at the same time, disable the write-cache on the disk itself.

If you can't get your hands on a battery, then disable the write-cache on the hard disk and see if you get any more file system corruption errors.

Good Luck,
Steve

On Mar 24, 2009, at 7:58 AM, Dushyanth wrote:

> I have a bunch of mail servers running postfix (external smtp), qmail (LDA) and courier IMAP/POP. Frequently, the ext3 filesystem goes into read-only mode, forcing recovery using fsck.
>
> Below are the errors we have seen so far on these systems and those systems' config. The ext3 errors are common in many cases.
Forgot to mention that all ext3 mounts are ordered

/dev/sdX1 on /mountpoint type ext3 (rw,noatime,usrquota,grpquota,data=ordered)

Dushyanth

_______________________________________________
Ext3-users mailing list
Ext3-users at redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users

From sandeen at redhat.com Tue Mar 24 15:25:43 2009
From: sandeen at redhat.com (Eric Sandeen)
Date: Tue, 24 Mar 2009 10:25:43 -0500
Subject: Ext3 - Frequent read-only FS issues
In-Reply-To: 
References: 
Message-ID: <49C8FB77.3050206@redhat.com>

Steven Chervets wrote:
> Dushyanth,
>
> If you have the disk write-cache enabled it means that any power outage can cause file system corruption. I would suggest that you put a battery in your disk controller and enable the write-cache on it, and at the same time, disable the write-cache on the disk itself.
>
> If you can't get your hands on a battery, then disable the write-cache on the hard disk and see if you get any more file system corruption errors.
>
> Good Luck,
> Steve

If you think it could be write-cache related, just mount with -o barrier=1, and see if things get better. (or disable write cache on the disks with hdparm as Steve suggests)

Do you lose power much? Do these errors correspond to power loss events?

-Eric

From dushyanth at gmail.com Tue Mar 24 15:37:48 2009
From: dushyanth at gmail.com (Dushyanth)
Date: Tue, 24 Mar 2009 15:37:48 +0000 (UTC)
Subject: Ext3 - Frequent read-only FS issues
References: <49C8FB77.3050206@redhat.com>
Message-ID: 

Hi,

Thanks Steve and Eric. My response below.

> Steven Chervets wrote:
> > Dushyanth,
> >
> > If you have the disk write-cache enabled it means that any power outage can cause file system corruption. I would suggest that you put a battery in your disk controller and enable the write-cache on it, and at the same time, disable the write-cache on the disk itself.
> >
> > If you can't get your hands on a battery, then disable the write-cache on the hard disk and see if you get any more file system corruption errors.
> >
> > Good Luck,
> > Steve
>
> If you think it could be write-cache related, just mount with -o barrier=1, and see if things get better. (or disable write cache on the disks with hdparm as Steve suggests)
>
> Do you lose power much? Do these errors correspond to power loss events?

These errors come up suddenly on running systems. We do IPMI-based reboots for power cycling the boxes when they hang.

Is it possible that some corruption that occurred during one such reboot could cause a read-only FS later on, with the errors I mentioned? I am guessing yes, because a full fsck during boot is not always forced; sometimes a FS check is only suggested by ext3 while mounting the disk.

I will disable the disk caches and observe.

Thanks
Dushyanth

From tytso at mit.edu Tue Mar 24 16:36:18 2009
From: tytso at mit.edu (Theodore Tso)
Date: Tue, 24 Mar 2009 12:36:18 -0400
Subject: Ext3 - Frequent read-only FS issues
In-Reply-To: 
References: <49C8FB77.3050206@redhat.com> 
Message-ID: <20090324163618.GA32307@mit.edu>

On Tue, Mar 24, 2009 at 03:37:48PM +0000, Dushyanth wrote:
> > Do you lose power much? Do these errors correspond to power loss events?
>
> These errors come up suddenly on running systems. We do IPMI-based reboots for power cycling the boxes when they hang.

I assume you do have a full fsck done on the filesystem after you see these errors?
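(By a full fsck I mean explicitly forcing a check of the unmounted filesystem, something like

	e2fsck -f /dev/sdb1

and not just the journal replay that happens automatically at mount time.)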
> Is it possible that some corruption that occurred during one such reboot could cause a read-only FS later on, with the errors I mentioned? I am guessing yes, because a full fsck during boot is not always forced; sometimes a FS check is only suggested by ext3 while mounting the disk.

If an fs check is getting suggested by ext3 during the mounting of the disk, then your boot scripts must not be set up correctly, or /etc/fstab has been set up to disable fsck getting run automatically on said filesystem.

I would strongly recommend running forced fsck to make sure all of your filesystems are clean first; if the errors have been around for a while, and for some reason the automatic fsck has been disabled so they aren't getting checked, that could be really bad.

- Ted

From howachen at gmail.com Tue Mar 24 17:02:27 2009
From: howachen at gmail.com (howard chen)
Date: Wed, 25 Mar 2009 01:02:27 +0800
Subject: Recommended max. limit of number of files per directory?
Message-ID: 

Hello,

In the past (i.e. Ext2), file system performance would get worse if there were too many files (e.g. > 10K) under the same directory. Is it still the same for Ext3?

Since I have a web server which contains a lot of user-uploaded photos, I have already created a directory hierarchy to avoid the performance issue with the file system, but I want to know if there are any exact / tested figures?

Thanks.

From dushyanth at gmail.com Tue Mar 24 18:09:04 2009
From: dushyanth at gmail.com (Dushyanth)
Date: Tue, 24 Mar 2009 18:09:04 +0000 (UTC)
Subject: Ext3 - Frequent read-only FS issues
References: <49C8FB77.3050206@redhat.com> <20090324163618.GA32307@mit.edu>
Message-ID: 

Hi,

Theodore Tso mit.edu> writes:
> On Tue, Mar 24, 2009 at 03:37:48PM +0000, Dushyanth wrote:
> > > Do you lose power much? Do these errors correspond to power loss events?
> >
> > These errors come up suddenly on running systems. We do IPMI-based reboots for power cycling the boxes when they hang.
>
> I assume you do have a full fsck done on the filesystem after you see these errors?

It's quite possible that we might not have run the suggested fsck. Ops/DC staff sometimes skip fsck to get the server up & running quickly.

> > Is it possible that some corruption that occurred during one such reboot could cause a read-only FS later on, with the errors I mentioned? I am guessing yes, because a full fsck during boot is not always forced; sometimes a FS check is only suggested by ext3 while mounting the disk.
>
> If an fs check is getting suggested by ext3 during the mounting of the disk, then your boot scripts must not be set up correctly, or /etc/fstab has been set up to disable fsck getting run automatically on said filesystem.

Boot scripts are Red Hat's defaults and fstab has fsck enabled for the disks in question. If the boot scripts run an fsck on ext3 filesystems that are marked unclean, then I was wrong in guessing that ext3 fs's get mounted in such cases with only a suggestion.

> I would strongly recommend running forced fsck to make sure all of your filesystems are clean first; if the errors have been around for a while, and for some reason the automatic fsck has been disabled so they aren't getting checked, that could be really bad.

Ok. I might as well do the preemptive maintenance just to be sure. Thanks for the suggestions.

TIA
Dushyanth

From bruno at wolff.to Tue Mar 24 19:45:31 2009
From: bruno at wolff.to (Bruno Wolff III)
Date: Tue, 24 Mar 2009 14:45:31 -0500
Subject: Recommended max. limit of number of files per directory?
In-Reply-To: 
References: 
Message-ID: <20090324194531.GA20046@wolff.to>

On Wed, Mar 25, 2009 at 01:02:27 +0800, howard chen wrote:
> Hello,
>
> In the past (i.e. Ext2), file system performance would get worse if there were too many files (e.g. > 10K) under the same directory. Is it still the same for Ext3?

If you turn on directory hashing it scales better but eventually performance will still tank. You can use tune2fs to see how it is set and change it if necessary.
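(e.g., with a made-up device name:

tune2fs -l /dev/sdX1 | grep features   # look for "dir_index" in the list
tune2fs -O dir_index /dev/sdX1         # turn it on
e2fsck -fD /dev/sdX1                   # offline; rebuilds/optimizes existing directories

New directories get hashed as soon as the feature is on; the e2fsck -D pass takes care of the ones that already exist.)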
From magawake at gmail.com  Wed Mar 25 12:04:12 2009
From: magawake at gmail.com (Mag Gam)
Date: Wed, 25 Mar 2009 08:04:12 -0400
Subject: large filesystem with many small files
Message-ID: <1cbd6f830903250504k40dd52c8pa3d4cf36c1886e6f@mail.gmail.com>

Hello All,

I have several filesystems which are in the 5 to 6TB range. I also have
many small files on these filesystems, and I use a lot of hardlinking.
I was wondering when ext3 will start hitting its limits? So far, it has
had no problems. Also, is it better to have small filesystems to avoid
any pitfalls?

TIA

From howachen at gmail.com  Wed Mar 25 14:52:04 2009
From: howachen at gmail.com (howard chen)
Date: Wed, 25 Mar 2009 22:52:04 +0800
Subject: Recommended max. limit of number of files per directory?
In-Reply-To: <20090324194531.GA20046@wolff.to>
References: <20090324194531.GA20046@wolff.to>
Message-ID: 

Hello,

On Wed, Mar 25, 2009 at 3:45 AM, Bruno Wolff III wrote:
> If you turn on directory hashing it scales better, but eventually
> performance will still tank. You can use tune2fs to see how it is set
> and change it if necessary.

Yes, but I want to know whether any testing has been done before, e.g.
ext3 starts to suck when the number of files in the same directory
exceeds X.

From lists at nerdbynature.de  Thu Mar 26 04:58:53 2009
From: lists at nerdbynature.de (Christian Kujau)
Date: Wed, 25 Mar 2009 21:58:53 -0700 (PDT)
Subject: Recommended max. limit of number of files per directory?
In-Reply-To: 
References: <20090324194531.GA20046@wolff.to>
Message-ID: 

On Wed, 25 Mar 2009, howard chen wrote:
>> If you turn on directory hashing it scales better, but eventually
>> performance will still tank. You can use tune2fs to see how it is
>> set and change it if necessary.
>
> Yes, but I want to know whether any testing has been done before?

This made me curious as well, but apart from a rather old benchmark[0]
I found nothing recent. I wrote a small benchmark script that will
touch, cat, and rm a large number of files in/from a single directory.
The script is currently still running, trying to create 10M files on a
4GB partition; the results so far:

http://nerdbynature.de/bench/sid/2009-03-26/di-b.log.txt
http://nerdbynature.de/bench/sid/2009-03-26/
(dmesg, .config, JFS oops, benchmark script)

However, the correct answer is of course: do these tests with your
application and see if these results really match. And publish your
results :)

Christian.

[0] http://lwn.net/Articles/14631/
-- 
Bruce Schneier found the inverse of the constant zero function.

From howachen at gmail.com  Thu Mar 26 13:58:37 2009
From: howachen at gmail.com (howard chen)
Date: Thu, 26 Mar 2009 21:58:37 +0800
Subject: Recommended max. limit of number of files per directory?
In-Reply-To: 
References: <20090324194531.GA20046@wolff.to>
Message-ID: 

Hi,

On Thu, Mar 26, 2009 at 12:58 PM, Christian Kujau wrote:
> http://nerdbynature.de/bench/sid/2009-03-26/di-b.log.txt
> http://nerdbynature.de/bench/sid/2009-03-26/
> (dmesg, .config, JFS oops, benchmark script)

Apparently ext3 starts to suck when files > 1000000. Not bad, in fact.

I will try to run your script on my server for a comparison.

Also, I might try to measure the random read time with many directories
containing many files. But I want to know: if I am writing a script to
do such testing, what steps are needed to prevent effects such as OS
caching (not sure if that is the right name), so I can arrive at a fair
test?

Thanks.

From rwheeler at redhat.com  Thu Mar 26 15:56:32 2009
From: rwheeler at redhat.com (Ric Wheeler)
Date: Thu, 26 Mar 2009 11:56:32 -0400
Subject: Recommended max. limit of number of files per directory?
In-Reply-To: 
References: <20090324194531.GA20046@wolff.to>
Message-ID: <49CBA5B0.5080501@redhat.com>

On 03/26/2009 09:58 AM, howard chen wrote:
> Hi,
>
> On Thu, Mar 26, 2009 at 12:58 PM, Christian Kujau wrote:
>
>> http://nerdbynature.de/bench/sid/2009-03-26/di-b.log.txt
>> http://nerdbynature.de/bench/sid/2009-03-26/
>> (dmesg, .config, JFS oops, benchmark script)
>
> Apparently ext3 starts to suck when files > 1000000. Not bad, in fact.
>
> I will try to run your script on my server for a comparison.
>
> Also, I might try to measure the random read time with many
> directories containing many files. But I want to know: if I am
> writing a script to do such testing, what steps are needed to prevent
> effects such as OS caching (not sure if that is the right name), so I
> can arrive at a fair test?
>
> Thanks.

I ran similar tests using fs_mark: basically, run it against one
directory, writing 10 or 20 thousand files per iteration, and watch as
performance (files/sec) degrades as the file system fills or the
directory limitations kick in.

If you want to be reproducible, you should probably start with a new
file system, but note that this does not reflect the reality of a
naturally aged (say a year or so old) file system well.

You can also unmount/remount to clear out cached state for an older
file system (or tweak the /proc/sys/vm/drop_caches knob to clear out
the cache).
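For example (as root; the sync first makes sure dirty data is written
back before the caches are dropped):

  sync
  echo 3 > /proc/sys/vm/drop_caches   # 3 = pagecache plus dentries/inodes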
Regards,

Ric

From lists at nerdbynature.de  Thu Mar 26 18:59:48 2009
From: lists at nerdbynature.de (Christian Kujau)
Date: Thu, 26 Mar 2009 11:59:48 -0700 (PDT)
Subject: Recommended max. limit of number of files per directory?
In-Reply-To: 
References: <20090324194531.GA20046@wolff.to>
Message-ID: 

On Thu, 26 Mar 2009, howard chen wrote:
> If I am writing a script to do such testing, what steps are needed to
> prevent effects such as OS caching (not sure if that is the right
> name), so I can arrive at a fair test?

Hm, not really sure what you mean here. There are quite a few knobs in
Documentation/sysctl/vm.txt to tune VM-related stuff. You could disable
disk or controller caches - but why would you disable caching at all
from a performance point of view? I'm sure you'll have the disk cache
enabled with your real-world applications as well?

Christian.
-- 
Bruce Schneier only uses condoms with 256-bit protection.

From chitnis.ashay at gmail.com  Tue Mar 31 14:55:32 2009
From: chitnis.ashay at gmail.com (Ashay Chitnis)
Date: Tue, 31 Mar 2009 20:25:32 +0530
Subject: SAN partition with ext3 fs turns read only when shared between two servers
Message-ID: 

Dear All,

Scenario:

I am facing a unique issue, described below. I have two physical
machines with a Windows beta virtualization environment installed on
them. There is one CentOS 5.2 virtual server running on each physical
machine. A SAN (IBM) partition is provided to the virtualized servers
by the host Windows OS on both physical machines. The partition is
formatted as ext3.

Both virtualized CentOS servers are linked using heartbeat, so that if
either server goes down, the other server will take over the resources
(namely a floating IP and the SAN storage) and start the services, so
that there is minimal interruption in service during a hardware failure
of one server. This is similar to heartbeat with drbd, sans drbd. We
have IMAP user data mounted on the system; since we had SAN storage, we
did not opt for internal storage or a drbd setup. A graphical
representation is attached for elaboration.

Problem:

During failover testing we found that the ext3 partition became
read-only during the transition from one server to the other. What I
want to know is:

1. whether the attached scenario is feasible, i.e. whether, when a
failover occurs, the other server can mount the SAN storage so that the
mailbox server is available at all times;

2. what happens when one server has a network issue but has not
unmounted the SAN? Is this perhaps why my partition becomes read-only?

regards,
Ashay.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graphics.pdf
Type: application/pdf
Size: 17018 bytes
Desc: not available
URL: 

From sandeen at redhat.com  Tue Mar 31 15:12:07 2009
From: sandeen at redhat.com (Eric Sandeen)
Date: Tue, 31 Mar 2009 10:12:07 -0500
Subject: SAN partition with ext3 fs turns read only when shared between two servers
In-Reply-To: 
References: 
Message-ID: <49D232C7.8080903@redhat.com>

Ashay Chitnis wrote:
> Dear All,
> ...
> 2. what happens when one server has a network issue but has not
> unmounted the SAN? Is this perhaps why my partition becomes read-only?

Then you have two systems writing to the same block device with no
coherency between them, which will corrupt the filesystem, and one
system or the other (or both) will eventually notice this, throw an
error, and go read-only (if you have errors=ro).
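That policy comes from the errors= mount option (the alternatives are
errors=continue and errors=panic); a hypothetical /etc/fstab entry
selecting it, with placeholder device and mount point, would look like:

  /dev/mapper/vg-mail  /srv/mail  ext3  defaults,errors=remount-ro  0  2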
You'll need to do something like STONITH (shoot the other node in the
head) before the new node can safely take over.

-Eric

From fuller at droog.sdf-eu.org  Tue Mar 31 14:47:55 2009
From: fuller at droog.sdf-eu.org (Felix Resch)
Date: Tue, 31 Mar 2009 14:47:55 +0000
Subject: external journal lost
Message-ID: <20090331144755.GA14155@droog.sdf-eu.org>

Hello,

I ran into a problem after I switched from an in-filesystem journal to
an external journal on my soft-raid5 + lvm2 + ext3 volume.

The procedure was:

- umount the volume
- fsck -fy /dev/vg/space
- tune2fs -O ^has_journal /dev/vg/space
- create a 400M extended partition on my fast disk (sda)
- mke2fs -O journal_dev -L space-journal /dev/sda6
- tune2fs -o journal_data -j -J device=/dev/sda6 /dev/vg/space
- mount the volume; everything seemed OK
- while moving some files around I noticed "attempt to write beyond end
  of device sda6" as well as "read-only filesystem" errors
- umount the volume (clean)
- reboot the system and remove the external journal partition to move
  it elsewhere

Here my main problem began:

1. tune2fs -O ^has_journal /dev/vg/space says "fs has needs_recovery
   set, please fsck"

2. fsck complains about a wrong UUID of the journal dev when used with
   a newly created one, and I can't see how to force mke2fs to give it
   a particular UUID (the one of the old journal partition), so I can't
   fsck the volume

3. man tune2fs tells me that there is the -f switch for this kind of
   accident, but it gives me the same "fs has needs_recovery set,
   please fsck"

I don't think the fs should be needs_recovery, and I don't get why
tune2fs -f -O ^has_journal /dev/vg/space doesn't forcefully remove the
flag as stated in the manpage.

I am running debian5, linux 2.6.26-1-686, tune2fs 1.41.3.

Please someone share your wisdom.

Greetings,
Felix Resch