From clay at exavio.com.cn  Thu Feb 19 03:00:21 2004
From: clay at exavio.com.cn (Isaac Claymore)
Date: Thu, 19 Feb 2004 11:00:21 +0800
Subject: fs block level syncing
In-Reply-To:
References:
Message-ID: <20040219030021.GA29856@exavio.com.cn>

Hi, here's an article about asynchronous block level replication:

http://www.linuxjournal.com/article.php?sid=7265

HTH

On Thu, Feb 12, 2004 at 12:51:36PM -0500, Paul Raines wrote:
>
> Right now we do a lot of hard disk to hard disk backup by using rsync to
> weekly "mirror" the source filesystem to a backup filesystem.  This works
> fairly well for most sources.  However, one issue with rsync is that simple
> things like changing the file name or directory name cause the whole file or
> directory structure to get recopied over a previous sync.  Also, like for
> mail spools, large files that simply get appended to get the whole file
> recopied.
>
> Does anyone know of something that syncs an ext2/3 fs to another
> at the block level which results in less data transfer?
>
> --
> ---------------------------------------------------------------
> Paul Raines                email: raines at nmr.mgh.harvard.edu
> MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
> 149 (2301) 13th Street     Charlestown, MA 02129  USA

--
Regards, Isaac
()  ascii ribbon campaign - against html e-mail
/\                        - against microsoft attachments

From christian.braun at ch.abb.com  Thu Feb 19 13:01:10 2004
From: christian.braun at ch.abb.com (christian.braun at ch.abb.com)
Date: Thu, 19 Feb 2004 14:01:10 +0100
Subject: ext3 Overhead
Message-ID:

Hello!

Message from "Stephen C. Tweedie" received on 13.02.2004 12:12:

On Wed, 2004-02-11 at 14:55, christian.braun at ch.abb.com wrote:
>> I'm using a CompactFlash as storage device. Since those CF cards only
>> have limited write cycles (CF does wear-levelling by itself, but you
>> don't want to write too many times to the card) I was wondering by
>> what factor the journaling of ext3 increases the write accesses to
>> the CompactFlash compared to ext2. Thanks a lot already for your help!
>
>It increases the number of writes a bit, but in many cases might
>actually reduce the number of overwrites for certain blocks like the
>superblock, which can be an advantage for those CF cards that don't do
>wear-levelling.  But it's really not a filesystem designed for flash.

Well, as I said, my CF card does wear-levelling, so that's not something
to worry about. Still, as you said, there is a difference in the number
of write accesses between ext2 and ext3... I just need to know in what
region that difference is... is it 3 times... or 30 times... or 300...
or even more?

Thank you!
Christian Braun

From leandro at dutra.fastmail.fm  Wed Feb 18 14:48:15 2004
From: leandro at dutra.fastmail.fm (Leandro Guimarães Faria Corcete Dutra)
Date: Wed, 18 Feb 2004 14:48:15 +0000 (UTC)
Subject: ext3 badness in 2.6.0-test2
References: <20030804142245.GA1627@nevyn.them.org>
	<20030804132219.2e0c53b4.akpm@osdl.org>
	<16176.41431.279477.273718@gargle.gargle.HOWL>
	<20030805235735.4c180fa4.akpm@osdl.org>
	<16178.63046.43567.551323@gargle.gargle.HOWL>
	<20030807181631.2962dfca.akpm@osdl.org>
Message-ID:

Andrew Morton <akpm at osdl.org> writes:

>
> Neil Brown (cse.unsw.edu.au) wrote:
> >
> > On Tuesday August 5, akpm at osdl.org wrote:
> > > Neil Brown (cse.unsw.edu.au) wrote:
> > > > ...
> > > > Aug  6 15:22:05 adams kernel: EXT3-fs error (device md1): ext3_add_entry: bad entry in directory #41
> > > > 009295: rec_len is smaller than minimal - offset=0, inode=3265411686, rec_len=0, name_len=0
> > >
> > > It looks like we had a block full of zeroes come back from the device
> > > driver.  I find it distinctly fishy how this happens so much with
> > > ext3-on-md, and so little with ext3-on-just-a-disk.
> >
> > Well, they're not *all* zero.....
> >
> > I can reproduce this easily with various configurations of ext3 over
> > raid5, and get a similar problem with ext2 over raid5 (corrupt inodes
> > rather than directory entries) but ext3 over raid0 is rock-solid.
>
> Good news that it is reproducible.

Has anything ever come out of this?

I am setting up a database server with RAID5, LVM2 and ext3fs, and have
just stumbled upon this issue.  Now I am nervous about proceeding.  If
there is a patch in some 2.6.3-rc I'd just upgrade, otherwise perhaps I
have to go back to 2.4.24?

I have seen people reporting this with 2.6.0 and 2.6.1 very recently,
nothing on 2.6.2 yet.  I took a look at the 2.6.2 ChangeLog and found
nothing that seemed relevant.

> Have you tried running fsx-linux?  It is good at picking up data loss.

I will try this as soon as I finish memtest86 and cpuburn.
Meanwhile any information welcome.

From sct at redhat.com  Thu Feb 19 21:16:13 2004
From: sct at redhat.com (Stephen C. Tweedie)
Date: 19 Feb 2004 21:16:13 +0000
Subject: ext3 Overhead
In-Reply-To:
References:
Message-ID: <1077225373.2070.834.camel@sisko.scot.redhat.com>

Hi,

On Thu, 2004-02-19 at 13:01, christian.braun at ch.abb.com wrote:

> Well, as I said my CF card does wear-levelling, so that's not to worry
> about. Still, as you said, there is a difference in the number of write
> accesses between ext2 and ext3... I just need to know in what region that
> difference is... is it 3 times... or 30 times... or 300... or even more?

For data, there's no difference --- unless you're in data=journal mode
--- except for the fact that ext3 usually starts flushing stuff to disk
earlier than ext2, which can mean that ext3 writes temporary data more
often than ext2.  For metadata, I'd expect ext3 is at most twice the
writes of ext2 in most circumstances, but it's not something I've ever
measured.

--Stephen

From bothie at gmx.de  Thu Feb 19 21:25:35 2004
From: bothie at gmx.de (Bodo Thiesen)
Date: Thu, 19 Feb 2004 22:25:35 +0100
Subject: ext3 Overhead
In-Reply-To: <1077225373.2070.834.camel@sisko.scot.redhat.com>
References: <1077225373.2070.834.camel@sisko.scot.redhat.com>
Message-ID: <200402192125.i1JLP1b07955@mx1.redhat.com>

Hello.

"Stephen C. Tweedie" wrote:

>On Thu, 2004-02-19 at 13:01, christian.braun at ch.abb.com wrote:
>
>> Well, as I said my CF card does wear-levelling, so that's not to worry
>> about. Still, as you said, there is a difference in the number of write
>> accesses between ext2 and ext3... I just need to know in what region that
>> difference is... is it 3 times... or 30 times... or 300... or even more?
>
> For data, there's no difference --- unless you're in data=journal mode
> --- except for the fact that ext3 usually starts flushing stuff to disk
> earlier than ext2, which can mean that ext3 writes temporary data more
> often than ext2.  For metadata, I'd expect ext3 is at most twice the
> writes of ext2 in most circumstances, but it's not something I've ever
> measured.

Most probably even more. Imagine deleting a directory recursively. On ext2
most unlink operations will cause only one write operation to disk. On ext3
each unlink operation creates at least one extra journal entry (plus the
write of the directory blocks). The same goes for modifications to the block
bitmaps and so on.

Regards, Bodo

From lfabio_ext3users at smiling-web.com  Sun Feb 22 00:53:07 2004
From: lfabio_ext3users at smiling-web.com (Luigi Fabio)
Date: Sat, 21 Feb 2004 19:53:07 -0500
Subject: Crashed filesystem - directory recovery
Message-ID: <4037B723.28990.D89EB49@localhost>

Hello folks,

I had an ext3 filesystem mounted as the root of a Linux MOO server.
Unfortunately, the filesystem was on one of the infamous DTLA-3070xx
drives - and the drive decided to fail at the worst moment it
possibly could, trashing the filesystem fairly well.

The situation is as follows: I used dd_rescue to create an image of
what is left of the filesystem, but I ended up with some 65MB of
'holes' in the image. Among the 'holes' is the sector that hosts a
directory, /home/weyrmount/MOO (indeed, on the original drive, trying
to cd into it gives an I/O error).

That directory contained three files, plus an 'arch' directory. Now,
while I understand that recovering the MOO dir itself is unrealistic,
is there any way I could recover the arch dir - and the files
therein? Any help will be greatly appreciated.

Regards,
Luigi Fabio - lfabio at smiling-web.com

From adilger at clusterfs.com  Sun Feb 22 01:56:50 2004
From: adilger at clusterfs.com (Andreas Dilger)
Date: Sat, 21 Feb 2004 18:56:50 -0700
Subject: Crashed filesystem - directory recovery
In-Reply-To: <4037B723.28990.D89EB49@localhost>
References: <4037B723.28990.D89EB49@localhost>
Message-ID: <20040222015650.GC17735@schnapps.adilger.int>

On Feb 21, 2004  19:53 -0500, Luigi Fabio wrote:
> I had an ext3 filesystem mounted as the root of a Linux MOO server.
> Unfortunately, the filesystem was on one of the infamous DTLA-3070xx
> drives - and the drive decided to fail at the worst moment it
> possibly could, trashing the filesystem fairly well.
> The situation is as follows: I used dd_rescue to create an image of
> what is left of the filesystem, but I ended up with some 65MB of
> 'holes' in the image. Among the 'holes' is the sector that hosts a
> directory, /home/weyrmount/MOO (indeed, on the original drive, trying
> to cd into it gives an I/O error).
> That directory contained three files, plus an 'arch' directory. Now,
> while I understand that recovering the MOO dir itself is unrealistic,
> is there any way I could recover the arch dir - and the files
> therein? Any help will be greatly appreciated.

If you run e2fsck on the copied device, you should find any unattached
items put into lost+found.  Of course, it is also possible that files
in "arch" are also corrupted, depending on the location of the 65MB of
holes.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

From lfabio_ext3users at smiling-web.com  Sun Feb 22 02:59:08 2004
From: lfabio_ext3users at smiling-web.com (Luigi Fabio)
Date: Sat, 21 Feb 2004 21:59:08 -0500
Subject: Crashed filesystem - directory recovery
In-Reply-To: <20040222015650.GC17735@schnapps.adilger.int>
References: <4037B723.28990.D89EB49@localhost>
Message-ID: <4037D4AC.19956.DFD48ED@localhost>

On 21 Feb 2004 at 18:56, Andreas Dilger wrote:

> If you run e2fsck on the copied device, you should find any unattached
> items put into lost+found.  Of course, it is also possible that files
> in "arch" are also corrupted, depending on the location of the 65MB of
> holes.

Interestingly enough, in lost+found there is no trace of arch - or of
any of the 18xx files it contained. There is, however, a subdir of
arch - I have to wonder, it seems very curious that only that single
file out of 1800 managed to survive. I do get asked by fsck to clear
a few inodes with too many errors at the beginning - would answering
'no' there help, perhaps?

> Cheers, Andreas
> --
> Andreas Dilger
> http://sourceforge.net/projects/ext2resize/
> http://www-mddsp.enel.ucalgary.ca/People/adilger/

Regards,
Luigi Fabio - lfabio at smiling-web.com

From bothie at gmx.de  Sun Feb 22 11:20:01 2004
From: bothie at gmx.de (Bodo Thiesen)
Date: Sun, 22 Feb 2004 12:20:01 +0100
Subject: Crashed filesystem - directory recovery
In-Reply-To: <4037D4AC.19956.DFD48ED@localhost>
References: <4037B723.28990.D89EB49@localhost>
	<4037D4AC.19956.DFD48ED@localhost>
Message-ID: <200402221119.i1MBJQb12044@mx1.redhat.com>

Hello.

"Luigi Fabio" wrote:

> Interestingly enough, in lost+found there is no trace of arch - or of
> any of the 18xx files it contained. There is, however, a subdir of
> arch - I have to wonder, it seems very curious that only that single
> file out of 1800 managed to survive. I do get asked by fsck to clear
> a few inodes with too many errors at the beginning - would answering
> 'no' there help, perhaps?

If you already answered 'yes', then you have to recopy the image from the
broken hard disk again, because saying 'yes' simply means deleting the
inode. On the other hand: if the inode management areas are in broken
blocks, then they cannot be rescued at all. How important are the files?
(There are companies that specialise in data rescue and are able to do
things that you cannot - but it's expensive, as everything needs to be
done by hand ;-)

Regards, Bodo

From lfabio_ext3users at smiling-web.com  Sun Feb 22 13:51:20 2004
From: lfabio_ext3users at smiling-web.com (Luigi Fabio)
Date: Sun, 22 Feb 2004 08:51:20 -0500
Subject: Crashed filesystem - directory recovery
In-Reply-To: <200402221119.i1MBJQb12044@mx1.redhat.com>
References: <4037D4AC.19956.DFD48ED@localhost>
Message-ID: <40386D88.18281.105264E2@localhost>

On 22 Feb 2004 at 12:20, Bodo Thiesen wrote:

> If you already answered 'yes', then you have to recopy the image from the
> broken hard disk again, because saying 'yes' simply means deleting the
> inode. On the other hand: if the inode management areas are in broken
> blocks, then they cannot be rescued at all. How important are the files?
> (There are companies that specialise in data rescue and are able to do
> things that you cannot - but it's expensive, as everything needs to be
> done by hand ;-)

Yes, I realise. I did copy the image again from a backup - several
times - but I didn't get very far. I answered no and told fsck to fix
the filetype instead when it asked me, but I was only able to recover
one file from the arch dir - one much too old to be of interest (as a
matter of fact, fsck did not stop when I answered no, I recalled
correctly).
At this point, any ideas are welcome - both for recovery of this
filesystem and as a choice for future filesystems, because ext3 has
shown itself to be far too brittle for critical usage.

To answer your last question - I asked a data recovery place, in the
hope that they can mount the platters on a working head/spindle
system to retrieve the data that I could not, but I am still waiting
for an estimate. In my past experience, unless they can gain data that
way, data recovery places are much *less* effective than in-house
solutions, because they can't even remotely afford to dedicate the
amount of time that we do in house to the problem while keeping their
bills acceptable.

> Regards, Bodo

Regards,
Luigi Fabio - lfabio at smiling-web.com

From sct at redhat.com  Thu Feb 26 16:50:21 2004
From: sct at redhat.com (Stephen C. Tweedie)
Date: 26 Feb 2004 16:50:21 +0000
Subject: ext3 Overhead
In-Reply-To: <200402192125.i1JLP1b07955@mx1.redhat.com>
References: <1077225373.2070.834.camel@sisko.scot.redhat.com>
	<200402192125.i1JLP1b07955@mx1.redhat.com>
Message-ID: <1077814220.2040.398.camel@sisko.scot.redhat.com>

Hi,

On Thu, 2004-02-19 at 21:25, Bodo Thiesen wrote:

> Most probably even more. Imagine deleting a directory recursively. On ext2
> most unlink operations will cause only one write operation to disk. On ext3
> each unlink operation creates at least one extra journal entry (plus the
> write of the directory blocks).

That's true to some extent --- but remember that journal writes are
batched into big sequential IOs.  So for a whole 5 seconds' worth of
truncates, all of the separate bitmap updates for a single bitmap block
get coalesced; and then all of the separate pieces of metadata that were
touched are themselves coalesced into one journal IO.  And disks are
*MUCH* faster at sequential IO than random metadata IO.

So for embedded flash IOs, the overhead of the extra metadata IO is
significant, because flash doesn't (much) care about seeks.  But for
normal disks, the actual impact is very much less.

--Stephen

From mfedyk at matchmail.com  Thu Feb 26 19:05:30 2004
From: mfedyk at matchmail.com (Mike Fedyk)
Date: Thu, 26 Feb 2004 11:05:30 -0800
Subject: ext3 Overhead
In-Reply-To: <1077225373.2070.834.camel@sisko.scot.redhat.com>
References: <1077225373.2070.834.camel@sisko.scot.redhat.com>
Message-ID: <403E437A.8070900@matchmail.com>

Stephen C. Tweedie wrote:
> Hi,
>
> On Thu, 2004-02-19 at 13:01, christian.braun at ch.abb.com wrote:
>
>
>>Well, as I said my CF card does wear-levelling, so that's not to worry
>>about. Still, as you said, there is a difference in the number of write
>>accesses between ext2 and ext3... I just need to know in what region that
>>difference is... is it 3 times... or 30 times... or 300... or even more?
>
>
> For data, there's no difference --- unless you're in data=journal mode
> --- except for the fact that ext3 usually starts flushing stuff to disk
> earlier than ext2, which can mean that ext3 writes temporary data more
> often than ext2.  For metadata, I'd expect ext3 is at most twice the
> writes of ext2 in most circumstances, but it's not something I've ever
> measured.

That can be configured to 30sec in the source too.  Has the patch to
make that a mount option made it into upstream?

From sct at redhat.com  Thu Feb 26 19:17:56 2004
From: sct at redhat.com (Stephen C. Tweedie)
Date: 26 Feb 2004 19:17:56 +0000
Subject: ext3 Overhead
In-Reply-To: <403E437A.8070900@matchmail.com>
References: <1077225373.2070.834.camel@sisko.scot.redhat.com>
	<403E437A.8070900@matchmail.com>
Message-ID: <1077823075.2040.573.camel@sisko.scot.redhat.com>

Hi,

On Thu, 2004-02-26 at 19:05, Mike Fedyk wrote:

> That can be configured to 30sec in the source too.  Has the patch to
> make that a mount option made it into upstream?

Yes, and "mount -o remount,commit=" should work too even without an
unmount.

--Stephen
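
The "ext3 Overhead" thread leaves the actual ext2-to-ext3 write ratio
unmeasured.  One rough way to put a number on it, assuming a 2.6 kernel
that exposes /proc/diskstats, is to snapshot the per-device write
counters before and after running the same workload on each filesystem
(and, for ext3, with different commit intervals).  The sketch below is
not from the thread: the function names parse_diskstats and write_delta
are made up for illustration, and the field positions are my reading of
the kernel's iostats documentation, so check them against your kernel
before trusting the numbers.

```python
"""Sketch: compare write counters from two /proc/diskstats snapshots."""
from typing import Dict


def parse_diskstats(text: str) -> Dict[str, Dict[str, int]]:
    """Map device name -> write counters from /proc/diskstats-style text.

    Assumed whole-disk layout (whitespace separated):
    major minor name reads reads_merged sectors_read ms_reading
    writes writes_merged sectors_written ...
    """
    stats: Dict[str, Dict[str, int]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            # Blank lines, or shorter lines such as some older kernels
            # emit for partitions, are skipped.
            continue
        stats[fields[2]] = {
            "writes": int(fields[7]),           # writes completed
            "sectors_written": int(fields[9]),  # sectors written
        }
    return stats


def write_delta(before: str, after: str, device: str) -> Dict[str, int]:
    """Return how much each write counter grew between two snapshots."""
    old = parse_diskstats(before)[device]
    new = parse_diskstats(after)[device]
    return {key: new[key] - old[key] for key in new}
```

To use it, read /proc/diskstats into a string, run the workload followed
by a sync, read the file again, and pass both strings plus the whole-disk
device name (for example hda, a hypothetical name here) to write_delta.
Repeating that on an ext2 and an ext3 filesystem on the same device gives
two deltas whose ratio is the figure Christian asked about, at least for
that one workload.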