From daytooner at gmail.com  Fri Jan 17 16:32:48 2014
From: daytooner at gmail.com (Ken Bass)
Date: Fri, 17 Jan 2014 08:32:48 -0800
Subject: Very long delay for first write to big filesystem
Message-ID:

I asked about this a while back. It seems that this problem is getting much worse.

The problem/issue: there is a very long delay when my system does a write to the filesystem. The delay now is over 5 minutes (yes: minutes). This only happens on the first write after booting up the system, and only for large files - 1GB or more. This can be a serious problem, since all access to any hard disk is blocked and hangs until that first write gets going again.

The prevailing thought at the time was that the delay came from loading into memory the directory information while looking for free space, which I can now believe.

The filesystem in question is 7.5TB, with about 4TB used. There are over 250,000 files. I also have another system with 1TB total and 400GB used, with 65,000 files. This system, the smaller one, is beginning to show delays as well, although only a few seconds.

This problem seems to involve several factors: the total size of the filesystem; the current "fragmentation" of that filesystem; and finally the amount of physical memory available.

As to the last factor, the 7.5TB system has only 2GB of memory (I didn't think it would need a lot, since it is mostly being used as a file server). The "fragmentation" factor (I am only guessing here) comes from having many files written and deleted over time.

So my questions are: is there a solution or workaround for this; and is this a bug, or perhaps an undesirable feature? If the latter, should this be reported (somewhere)?

Any suggestions, tips, etc. greatly appreciated.

TIA

ken

From lakshmipathi.g at gmail.com  Sat Jan 18 12:13:11 2014
From: lakshmipathi.g at gmail.com (Lakshmipathi.G)
Date: Sat, 18 Jan 2014 17:43:11 +0530
Subject: File System corruption tool
Message-ID:

Hi -

I'm searching for a file system corruption tool - one that injects disk errors such as multiply-owned blocks, etc. An integrity-scan process (like e2fsck) would then verify the on-disk layout and fix these errors.

I'd like to read/understand such tools before writing one for a proprietary on-disk file system.

Do we have such tools for ext{2,3,4}fs? Thanks for any help or pointers!

--
----
Cheers,
Lakshmipathi.G
FOSS Programmer.
www.giis.co.in

From ricwheeler at gmail.com  Sat Jan 18 12:40:28 2014
From: ricwheeler at gmail.com (Ric Wheeler)
Date: Sat, 18 Jan 2014 07:40:28 -0500
Subject: File System corruption tool
In-Reply-To:
References:
Message-ID: <52DA763C.1090505@gmail.com>

On 01/18/2014 07:13 AM, Lakshmipathi.G wrote:
> Hi -
>
> I'm searching for a file system corruption tool - one that injects disk errors
> such as multiply-owned blocks, etc. An integrity-scan process (like e2fsck)
> would then verify the on-disk layout and fix these errors.
>
> I'd like to read/understand such tools before writing one for a proprietary
> on-disk file system.
>
> Do we have such tools for ext{2,3,4}fs? Thanks for any help or pointers!
>

For SATA drives, you can use hdparm to create a bad sector that will cause an IO error on read.
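For example, a rough sketch - the device name and sector number are placeholders, and the sector's contents are destroyed, so only try this on a scratch disk:

    # mark LBA sector 12345 as bad (it fails on read until rewritten)
    hdparm --yes-i-know-what-i-am-doing --make-bad-sector 12345 /dev/sdX

    # reading that sector now returns an IO error
    dd if=/dev/sdX of=/dev/null bs=512 skip=12345 count=1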
(Writing to the sector will fix it.)

Ric

From adilger at dilger.ca  Sat Jan 18 17:09:20 2014
From: adilger at dilger.ca (Andreas Dilger)
Date: Sat, 18 Jan 2014 10:09:20 -0700
Subject: Very long delay for first write to big filesystem
In-Reply-To:
References:
Message-ID: <9263807E-9BD9-41B0-AC1E-E7D4CBD4CA04@dilger.ca>

On Jan 17, 2014, at 9:32, Ken Bass wrote:
>
> The problem/issue: there is a very long delay when my system does a write to the filesystem. The delay now is over 5 minutes (yes: minutes). This only happens on the first write after booting up the system, and only for large files - 1GB or more. This can be a serious problem, since all access to any hard disk is blocked and hangs until that first write gets going again.
>
> The prevailing thought at the time was that the delay came from loading into memory the directory information while looking for free space, which I can now believe.

It isn't actually directory information that is being loaded, but rather the block bitmaps from each group, and each one needs a seek to read. This will take up to 7.5 TB / 128 MB/group / 100 seeks/sec = 600s if the filesystem is nearly full. After this point, the bitmaps are cached in memory and allocation is faster.
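As shell arithmetic, the same back-of-the-envelope estimate looks like this (assuming 4 KiB blocks, so 32768 blocks and hence 128 MiB per group, and roughly 100 random reads/sec from a single spindle):

    groups=$(( 7680 * 1024 / 128 ))   # 7.5 TiB expressed in MiB / 128 MiB per group = 61440 groups
    echo $(( groups / 100 ))          # one bitmap seek per group at ~100 seeks/sec = ~614 seconds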
> The filesystem in question is 7.5TB, with about 4TB used. There are over 250,000 files. I also have another system with 1TB total and 400GB used, with 65,000 files. This system, the smaller one, is beginning to show delays as well, although only a few seconds.
>
> This problem seems to involve several factors: the total size of the filesystem; the current "fragmentation" of that filesystem; and finally the amount of physical memory available.
>
> As to the last factor, the 7.5TB system has only 2GB of memory (I didn't think it would need a lot, since it is mostly being used as a file server). The "fragmentation" factor (I am only guessing here) comes from having many files written and deleted over time.
>
> So my questions are: is there a solution or workaround for this; and is this a bug, or perhaps an undesirable feature? If the latter, should this be reported (somewhere)?

You might consider mounting the filesystem as ext4 instead of ext3. It will do a slightly better job of finding contiguous free space and avoid loading bitmaps that do not have enough space, but the physics of seeking to read bitmaps is still the same.

If you format a new filesystem as ext4 (as opposed to just mounting the existing filesystem as ext4) you can use a new feature, "flex_bg", that locates the block and inode bitmaps together so that they can be read without so much seeking. You'd need a spare disk to format and copy the data over to.

Using ext4 is also more resistant to fragmentation over time.

Cheers, Andreas

> Any suggestions, tips, etc. greatly appreciated.
>
> TIA
>
> ken
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users

From adilger at dilger.ca  Sat Jan 18 22:29:09 2014
From: adilger at dilger.ca (Andreas Dilger)
Date: Sat, 18 Jan 2014 15:29:09 -0700
Subject: File System corruption tool
In-Reply-To:
References:
Message-ID: <3698A346-9586-4656-8010-F4CDE0D3319E@dilger.ca>

We have a script that adds corruption to ext2/3/4 filesystems and runs e2fsck on it. It definitely could be improved, but it still catches the occasional error:
http://git.whamcloud.com/?p=tools/e2fsprogs.git;a=commit;h=aee44c669bebe29bfdb8a1c86da443234f8bc257

It tries to format the filesystem with different features and options, then adds corruption both from random data and by copying parts of the filesystem internally to other parts of the filesystem. It might also be useful to corrupt some random bits and bytes in the filesystem, but it doesn't do that today.

There is also fsfuzzer, which writes random data to the filesystem and tries to mount it, but I don't know whether that has been tried with e2fsck.

The other major question I have is why you are trying to create a new proprietary filesystem? That is really a ten-year effort, and you would be much better off using one of the many existing filesystems. If the current ones don't meet your exact needs, add the missing features you need instead of creating a whole new one from scratch. While I'm a big fan of ext4, there are many other good filesystems out there - XFS, Btrfs, ZFS, and several flash filesystems.

Cheers, Andreas

> On Jan 18, 2014, at 5:13, "Lakshmipathi.G" wrote:
>
> Hi -
>
> I'm searching for a file system corruption tool - one that injects disk errors
> such as multiply-owned blocks, etc. An integrity-scan process (like e2fsck)
> would then verify the on-disk layout and fix these errors.
>
> I'd like to read/understand such tools before writing one for a proprietary
> on-disk file system.
>
> Do we have such tools for ext{2,3,4}fs? Thanks for any help or pointers!
>
> --
> ----
> Cheers,
> Lakshmipathi.G
> FOSS Programmer.
> www.giis.co.in
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users

From lakshmipathi.g at gmail.com  Sun Jan 19 14:24:28 2014
From: lakshmipathi.g at gmail.com (Lakshmipathi.G)
Date: Sun, 19 Jan 2014 19:54:28 +0530
Subject: File System corruption tool
In-Reply-To: <3698A346-9586-4656-8010-F4CDE0D3319E@dilger.ca>
References: <3698A346-9586-4656-8010-F4CDE0D3319E@dilger.ca>
Message-ID:

> For SATA drives, you can use hdparm to create a bad sector that will
> cause an IO error on read. (Writing to the sector will fix it.)

Thanks Ric Wheeler. Just looked into hdparm - it has a nice option for corrupting sectors, but I also need to manipulate on-disk entries (like putting an invalid nlink value into an inode structure, etc.).
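For ext{2,3,4} it looks like debugfs can do that kind of targeted damage - an untested sketch against a scratch image, with the image name, inode number and link count all arbitrary placeholders:

    # build a small scratch ext4 image
    dd if=/dev/zero of=test.img bs=1M count=64
    mke2fs -F -t ext4 test.img

    # open it writable and plant a bogus link count on inode 12
    debugfs -w -R "set_inode_field <12> links_count 42" test.img

    # e2fsck should detect and repair the bad count
    e2fsck -fy test.img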
> We have a script that adds corruption to ext2/3/4 filesystems and runs
> e2fsck on it. It definitely could be improved, but it still catches the
> occasional error

Thanks Andreas. I went through the script and liked the idea of creating a filesystem image from random files. It has only minimal options as of now. Will look into fsfuzzer.

> The other major question I have is why you are trying to create a new
> proprietary filesystem?

Sorry, I should have been clearer. I was assigned the task of creating a framework to corrupt the disk layout of an existing FreeBSD-based, closed-source file system. We already have an e2fsck-like integrity checker, but the disk-corruption script still needs to be added.

Thanks!

--
----
Cheers,
Lakshmipathi.G
FOSS Programmer.
www.giis.co.in

From daytooner at gmail.com  Mon Jan 20 02:07:54 2014
From: daytooner at gmail.com (Ken Bass)
Date: Sun, 19 Jan 2014 18:07:54 -0800
Subject: Very long delay for first write to big filesystem
In-Reply-To: <9263807E-9BD9-41B0-AC1E-E7D4CBD4CA04@dilger.ca>
References: <9263807E-9BD9-41B0-AC1E-E7D4CBD4CA04@dilger.ca>
Message-ID:

Thx Andreas.

re: block bitmaps - yes, that is what I really meant. My experience with filesystems is mainly from CP/M's BDOS, where "directory" and block mapping are essentially synonymous.

And now I understand about the timing. Makes sense when you describe it that way.

My system is ext4, although I doubt that I used the "flex_bg" option, since this was first created a while back. I did try to run e4defrag. It simply said that no defrag was needed.
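The check was something along these lines, with /dev/sdX1 standing in for my actual data disk:

    # -c only reports a fragmentation score; it doesn't move any data
    e4defrag -c /dev/sdX1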
So now I'm only left in need of a workaround. Perhaps a way to have the system load the bitmaps at boot time in the background? It would need to be done in such a way that it would not block any other access to that system.

Or, is there a better filesystem format that would not have this problem? (Not a really great solution, since I would need to somehow/somewhere back up my 7.5TB system first.)

It does seem strange that this hasn't become a more serious issue, as typical filesystems are getting bigger now. And I can't imagine a really large network server (10TB+) having to deal with this.

Again, thx for the response.

ken

On Sat, Jan 18, 2014 at 9:09 AM, Andreas Dilger wrote:
> On Jan 17, 2014, at 9:32, Ken Bass wrote:
> >
> > The problem/issue: there is a very long delay when my system does a write to the filesystem. The delay now is over 5 minutes (yes: minutes). This only happens on the first write after booting up the system, and only for large files - 1GB or more. This can be a serious problem, since all access to any hard disk is blocked and hangs until that first write gets going again.
> >
> > The prevailing thought at the time was that the delay came from loading into memory the directory information while looking for free space, which I can now believe.
>
> It isn't actually directory information that is being loaded, but rather the block bitmaps from each group, and each one needs a seek to read. This will take up to 7.5 TB / 128 MB/group / 100 seeks/sec = 600s if the filesystem is nearly full. After this point, the bitmaps are cached in memory and allocation is faster.
>
> > The filesystem in question is 7.5TB, with about 4TB used. There are over 250,000 files. I also have another system with 1TB total and 400GB used, with 65,000 files. This system, the smaller one, is beginning to show delays as well, although only a few seconds.
> >
> > This problem seems to involve several factors: the total size of the filesystem; the current "fragmentation" of that filesystem; and finally the amount of physical memory available.
> >
> > As to the last factor, the 7.5TB system has only 2GB of memory (I didn't think it would need a lot, since it is mostly being used as a file server). The "fragmentation" factor (I am only guessing here) comes from having many files written and deleted over time.
> >
> > So my questions are: is there a solution or workaround for this; and is this a bug, or perhaps an undesirable feature? If the latter, should this be reported (somewhere)?
>
> You might consider mounting the filesystem as ext4 instead of ext3. It will do a slightly better job of finding contiguous free space and avoid loading bitmaps that do not have enough space, but the physics of seeking to read bitmaps is still the same.
>
> If you format a new filesystem as ext4 (as opposed to just mounting the existing filesystem as ext4) you can use a new feature, "flex_bg", that locates the block and inode bitmaps together so that they can be read without so much seeking. You'd need a spare disk to format and copy the data over to.
>
> Using ext4 is also more resistant to fragmentation over time.
>
> Cheers, Andreas
>
> > Any suggestions, tips, etc. greatly appreciated.
> >
> > TIA
> >
> > ken
> >
> > _______________________________________________
> > Ext3-users mailing list
> > Ext3-users at redhat.com
> > https://www.redhat.com/mailman/listinfo/ext3-users

From adilger at dilger.ca  Mon Jan 20 21:46:01 2014
From: adilger at dilger.ca (Andreas Dilger)
Date: Mon, 20 Jan 2014 14:46:01 -0700
Subject: Very long delay for first write to big filesystem
In-Reply-To:
References: <9263807E-9BD9-41B0-AC1E-E7D4CBD4CA04@dilger.ca>
Message-ID: <805DD079-E4FD-4C4B-B418-9C165A76328C@dilger.ca>

On Jan 19, 2014, at 7:07 PM, Ken Bass wrote:
> re: block bitmaps - yes, that is what I really meant. My experience with filesystems is mainly from CP/M's BDOS, where "directory" and block mapping are essentially synonymous.
>
> And now I understand about the timing. Makes sense when you describe it that way.
>
> My system is ext4, although I doubt that I used the "flex_bg" option, since this was first created a while back. I did try to run e4defrag. It simply said that no defrag was needed.

Use "dumpe2fs -h /dev/XXX | grep feature" to see if it is listed.

> So now I'm only left in need of a workaround. Perhaps a way to have the system load the bitmaps at boot time in the background? It would need to be done in such a way that it would not block any other access to that system.

We had a similar problem in the past. Run "dumpe2fs /dev/XXX > /dev/null" at startup time (can be before or after mount) to start it reading the block and inode allocation bitmaps.
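Something like this in rc.local (or wherever your boot scripts live) should do - an untested sketch, with /dev/XXX a placeholder and the trailing "&" keeping it from blocking the rest of the boot:

    # pre-read the block/inode bitmaps in the background to warm the cache
    dumpe2fs /dev/XXX > /dev/null 2>&1 &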
> Or, is there a better filesystem format that would not have this problem? (Not a really great solution, since I would need to somehow/somewhere back up my 7.5TB system first.)

Yes, formatting with "mke2fs -t ext4" should enable flex_bg by default.

> It does seem strange that this hasn't become a more serious issue, as typical filesystems are getting bigger now. And I can't imagine a really large network server (10TB+) having to deal with this.

That's why the flex_bg feature was added to ext4 in the first place.

Cheers, Andreas

> Again, thx for the response.
>
> ken
>
> On Sat, Jan 18, 2014 at 9:09 AM, Andreas Dilger wrote:
> > On Jan 17, 2014, at 9:32, Ken Bass wrote:
> > >
> > > The problem/issue: there is a very long delay when my system does a write to the filesystem. The delay now is over 5 minutes (yes: minutes). This only happens on the first write after booting up the system, and only for large files - 1GB or more. This can be a serious problem, since all access to any hard disk is blocked and hangs until that first write gets going again.
> > >
> > > The prevailing thought at the time was that the delay came from loading into memory the directory information while looking for free space, which I can now believe.
> >
> > It isn't actually directory information that is being loaded, but rather the block bitmaps from each group, and each one needs a seek to read. This will take up to 7.5 TB / 128 MB/group / 100 seeks/sec = 600s if the filesystem is nearly full. After this point, the bitmaps are cached in memory and allocation is faster.
> >
> > > The filesystem in question is 7.5TB, with about 4TB used. There are over 250,000 files. I also have another system with 1TB total and 400GB used, with 65,000 files. This system, the smaller one, is beginning to show delays as well, although only a few seconds.
> > >
> > > This problem seems to involve several factors: the total size of the filesystem; the current "fragmentation" of that filesystem; and finally the amount of physical memory available.
> > >
> > > As to the last factor, the 7.5TB system has only 2GB of memory (I didn't think it would need a lot, since it is mostly being used as a file server). The "fragmentation" factor (I am only guessing here) comes from having many files written and deleted over time.
> > >
> > > So my questions are: is there a solution or workaround for this; and is this a bug, or perhaps an undesirable feature? If the latter, should this be reported (somewhere)?
> >
> > You might consider mounting the filesystem as ext4 instead of ext3. It will do a slightly better job of finding contiguous free space and avoid loading bitmaps that do not have enough space, but the physics of seeking to read bitmaps is still the same.
> >
> > If you format a new filesystem as ext4 (as opposed to just mounting the existing filesystem as ext4) you can use a new feature, "flex_bg", that locates the block and inode bitmaps together so that they can be read without so much seeking. You'd need a spare disk to format and copy the data over to.
> >
> > Using ext4 is also more resistant to fragmentation over time.
> >
> > Cheers, Andreas
> >
> > > Any suggestions, tips, etc. greatly appreciated.
> > >
> > > TIA
> > >
> > > ken
> > >
> > > _______________________________________________
> > > Ext3-users mailing list
> > > Ext3-users at redhat.com
> > > https://www.redhat.com/mailman/listinfo/ext3-users
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users

Cheers, Andreas