From markjballard at googlemail.com Tue Sep 3 08:46:39 2013 From: markjballard at googlemail.com (Mark Ballard) Date: Tue, 3 Sep 2013 09:46:39 +0100 Subject: ext3 / ext4 on USB flash drive? In-Reply-To: <20130830133224.GB27699@thunk.org> References: <20130829154657.GC30918@thunk.org> <521F7747.2000003@redhat.com> <20130830133224.GB27699@thunk.org> Message-ID: From the little I have heard about control systems for cars, which was some years ago, they were blockhead proprietary. The analogy would only work if computing were customarily black-box technology, which it isn't. I'd be surprised if there were any branded flash drives that contained less than their advertised amount of storage. That leaves the question of what is going on under the hood in what is probably the vast majority of devices where the flash isn't fraudulent, and whether my system handles it correctly. My system leaves me with no idea of either (though I hold out hope for some tools I bookmarked recently). Reference to forums and specialist websites gives genuine cause for doubt. Yet I thought it was usual for system software to have a good angle on how its hardware was constructed and what it was doing. I thought they worked in symbiosis, and that this was maintained by mutual necessity. I thought the symbiosis was kept unassailably whole by a common purpose: the user. What you say implies that this symbiosis has been broken by the commercial greed of flash manufacturers. Or that it has been broken by neglect on their part, or laziness, or some other cause of a fissure in industry relations. Whatever the reason, it raises another question, and that is what must be done so that I can simply format my USB without a concern and get back to my work. > even for a non-fradulent USB stick or SD card, there is no single way to measure "FTL quality". ... there are some things (such as the erase block size) which would be useful for tuning file system performance. And the technical people I've talked to at various Flash manufacturers all agree it's pointless to hide this information, but the product managers tend to be the roadblock. This is perhaps telling. One would imagine the USB Industry Forum meeting the Association of (File) System Software Scribes or whatever at routine collegiate meetings in Las Vegas hotels, and so on. Which flash manufacturers have refused to collaborate? Why has the fabled industry forum failed? mb. From tytso at mit.edu Tue Sep 3 12:23:56 2013 From: tytso at mit.edu (Theodore Ts'o) Date: Tue, 3 Sep 2013 08:23:56 -0400 Subject: ext3 / ext4 on USB flash drive? In-Reply-To: References: <20130829154657.GC30918@thunk.org> <521F7747.2000003@redhat.com> <20130830133224.GB27699@thunk.org> Message-ID: <20130903122356.GC15457@thunk.org> On Tue, Sep 03, 2013 at 09:46:39AM +0100, Mark Ballard wrote: > isn't. I'd be surprised if there were any branded flash drives that > contained less than their advertised amount of storage. The vast majority of flash sold, especially the cheap-grade flash (i.e., SD Cards and USB sticks) sold through retail channels, is probably unbranded. Because it's cheaper, and for most users, (a) price is a feature, and (b) they are only using flash as a temporary transport medium (e.g., here, let me give you my slide presentation; can I borrow a USB stick?), and (c) they are much more likely to lose said flash device before it goes bad, or even gets 100% filled. It's for the same reason that the quality of experience in airplane travel has degraded so badly. 
The market has spoken; and consumers have said, at least by their actions, that price is more important than anything else. > Whatever the reason, it raises another question, and that is what must > be done so that I can simply format my USB without a concern and get > back to my work. Buy high-quality flash which has been explicitly reviewed by a source you trust. There isn't much else you can really do.... > This is perhaps telling. One would imagine the USB Industry Forum > meeting the Association of (File) System Software Scribes or whatever > at routine collegiate meetings in Las Vegas hotels, and so on. > > Which flash manufacturers have refused to collaborate? Why has the > fabled industry forum failed? They are collaborating --- with the mass buyers of their flash. If you are purchasing flash by the millions, then you can get all of this information (under NDA), and you can dictate the quality of the flash which is appropriate for your use case. This even afflicted Microsoft's Windows Phone, where they had some manufacturers provide an SD Card slot. This meant that end users could replace the carefully tested-and-selected-for-performance SD cards which were shipped with their phones with crap sold at the checkout counter, and since the phone's root file system was stored on the SD card, performance went into the crapper, and guess who the customers blamed? Not the flash manufacturer, and not the handset manufacturer for including a removable SD-card slot instead of using a fixed eMMC flash device, but Microsoft. As a result, many handset manufacturers these days do *not* have an SD card slot, and if they do, they don't allow the root file system to be stored on the SD card, and the SD card can only be used for auxiliary or media storage (for which even really crappy flash is generally good enough). So the market is working; it's just working for the most common use case, and the most common desire of the customers who are doing the buying. And that means there will be high-quality stuff that costs $$$, and really cheap stuff where you get what you pay for, and hardware manufacturers who buy flash devices by the million-unit order will get better deals, and all of the low-level information under NDA. All hail the free market.... as my libertarian friends would say, "Huge success". - Ted From prichard at med.wayne.edu Sat Sep 7 02:46:35 2013 From: prichard at med.wayne.edu (Richards, Paul Franklin) Date: Sat, 7 Sep 2013 02:46:35 +0000 Subject: Strange fsck.ext3 behavior - infinite loop In-Reply-To: <20130830182302.GE30385@thunk.org> References: <90D96432E685D84E9A62AF83962A689737E7AA68@MED-CORE07A.med.wayne.edu> <4764C2B2-63C9-4FC5-A99B-3D8BEB004995@dilger.ca>, <20130830182302.GE30385@thunk.org> Message-ID: <90D96432E685D84E9A62AF83962A689737E7EE25@MED-CORE07A.med.wayne.edu> It appears that the RAID has hardware problems, as three of the disks are being detected as "unhealthy". Thank you all for your help! ________________________________________ From: Theodore Ts'o [tytso at mit.edu] Sent: Friday, August 30, 2013 2:23 PM To: Andreas Dilger Cc: Richards, Paul Franklin; ext3-users at redhat.com Subject: Re: Strange fsck.ext3 behavior - infinite loop On Fri, Aug 30, 2013 at 12:07:22PM -0600, Andreas Dilger wrote: > > > [root at myhost /]# mkfs.ext3 /dev/sda1 > > mke2fs 1.35 (28-Feb-2004) > > First thing I would suggest is to update to a newer version of e2fsprogs, since this one is 9+ years old and that is a lot of > water under the bridge. 
That's definitely good advice, but even with e2fsprogs 1.35, if e2fsck -f is finding errors when run immediately after running mke2fs, it would make me suspect the storage device. Are you sure the RAID controller (is this a hardware RAID, or software RAID?) is working correctly? - Ted From be.nicolas.michel at gmail.com Mon Sep 16 10:16:17 2013 From: be.nicolas.michel at gmail.com (Nicolas Michel) Date: Mon, 16 Sep 2013 12:16:17 +0200 Subject: Numbers behind "df" and "tune2fs" Message-ID: Hello guys, I have some difficulty understanding what the numbers behind "df" and tune2fs really are. You'll find the output of tune2fs and df below, on which my maths are based. Here are my maths: A tune2fs on an ext3 FS tells me the FS is 3284992 blocks large. It also tells me that the size of one block is 4096 (bytes, if I'm not wrong?). So my maths tell me that the disk is 3284992 * 4096 = 13455327232 bytes, or 13455327232 / 1024 / 1024 / 1024 = 12.53 GB. A df --block-size=1 on the same FS tells me the disk is 13243846656 bytes, which is 211480576 bytes smaller than what tune2fs tells me. In gigabytes, it means: * for df, the disk is 12.33 GB * for tune2fs, the disk is 12.53 GB I thought that maybe df is only taking into account the real blocks available for users. So I tried to remove the reserved blocks and the reserved GDT blocks: (3284992 - 164249 - 801) * 4096 = 12779282432, or in GB: 12779282432 / 1024 / 1024 / 1024 = 11.90 GB ... My last thought was that the "Reserved block count" in tune2fs covers not only the blocks reserved for root (which is 5% by default on my system) but also all the other blocks reserved for the FS's internal usage. So: (3284992 - 164249) * 4096 = 12782563328, or in GB: 11.90 GB (the difference is not significant with a precision of two decimals). So I'm lost ... Does anyone have an explanation? I would really, really be grateful. 
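For clarity, here is the same arithmetic as a small Python sketch (every figure in it is taken from the tune2fs and df output below; nothing else is assumed):

GiB = 1024 ** 3
block_count  = 3284992       # tune2fs: Block count
block_size   = 4096          # tune2fs: Block size, in bytes
reserved     = 164249        # tune2fs: Reserved block count (the root 5%)
reserved_gdt = 801           # tune2fs: Reserved GDT blocks
df_total     = 13243846656   # df --block-size=1: total size, in bytes

fs_bytes = block_count * block_size
print(fs_bytes, round(fs_bytes / GiB, 2))    # 13455327232  12.53
print(df_total, round(df_total / GiB, 2))    # 13243846656  12.33
print(fs_bytes - df_total)                   # 211480576 bytes unaccounted for

# My two guesses, both of which end up smaller than what df reports:
guess1 = (block_count - reserved - reserved_gdt) * block_size
guess2 = (block_count - reserved) * block_size
print(round(guess1 / GiB, 2), round(guess2 / GiB, 2))   # 11.9  11.9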
Nicolas ------------------------------ --------- Here is the output of df and tune2fs : $ tune2fs -l /dev/mapper/datavg-datalogslv tune2fs 1.41.9 (22-Aug-2009) Filesystem volume name: Last mounted on: Filesystem UUID: 4e5bea3e-3e61-4fc8-9676-e5177522911c Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file Filesystem flags: unsigned_directory_hash Default mount options: (none) Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 822544 Block count: 3284992 Reserved block count: 164249 Free blocks: 3109325 Free inodes: 822348 First block: 0 Block size: 4096 Fragment size: 4096 Reserved GDT blocks: 801 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 8144 Inode blocks per group: 509 Filesystem created: Wed Aug 28 08:30:10 2013 Last mount time: Wed Sep 11 17:16:56 2013 Last write time: Thu Sep 12 09:38:02 2013 Mount count: 18 Maximum mount count: 27 Last checked: Wed Aug 28 08:30:10 2013 Check interval: 15552000 (6 months) Next check after: Mon Feb 24 07:30:10 2014 Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 256 Required extra isize: 28 Desired extra isize: 28 Journal inode: 8 Default directory hash: half_md4 Directory Hash Seed: ad2251a9-ac33-4e5e-b933-af49cb4f2bb3 Journal backup: inode blocks $ df --block-size=1 /dev/mapper/datavg-datalogslv Filesystem 1B-blocks Used Available Use% Mounted on /dev/mapper/datavg-datalogslv 13243846656 563843072 12007239680 5% /logs -- Nicolas MICHEL From sandeen at redhat.com Mon Sep 16 14:39:23 2013 From: sandeen at redhat.com (Eric Sandeen) Date: Mon, 16 Sep 2013 09:39:23 -0500 Subject: Numbers behind "df" and "tune2fs" In-Reply-To: References: Message-ID: <5237181B.1070109@redhat.com> On 9/16/13 5:16 AM, Nicolas Michel wrote: > Hello guys, > > I have some difficulties to understand what really are the numbers > behing "df" and tune2fs. You'll find the output of tune2fs and df > below, on which my maths are based. > > Here are my maths: > > A tune2fs on an ext3 FS tell me the FS size is 3284992 block large. It > also tell me that the size of one block is 4096 (bytes if I'm not > wrong?). So my maths tell me that the disk is 3284992 * 4096 = > 13455327232 bytes or 13455327232 / 1024 /1024 /1024 = 12.53 GB. > > A df --block-size=1 on the same FS tell me the disk is 13243846656 > which is 211480576 bytes smaller than what tune2fs tell me. By default, df on extN assumes that metadata used by the filesystem was never available for your use and is not part of the filesystem space. Documentation/filesystems/ext3.txt says: bsddf (*) Make 'df' act like BSD. minixdf Make 'df' act like Minix. which is pretty unhelpful I suppose. ;) The mount man page is a little more helpful: bsddf|minixdf Set the behaviour for the statfs system call. The minixdf behaviour is to return in the f_blocks field the total number of blocks of the filesystem, while the bsddf behaviour (which is the default) is to subtract the overhead blocks used by the ext2 filesystem and not available for file storage. You're seeing the latter behavior. if you mount with -o minixdf you should see what you expect. (Too bad there's no "linuxdf?") :) > In gigabytes, it means: > * for df, the disk is 12.33 GB > * for tune2fs, the disk is 12.53 GB > > I thought that maybe df is only taking into account the real blocks > available for users. 
So I tried to remove the reserved blocks and the > GDT blocks: > (3284992 - 164249 - 801) * 4096 = 12779282432 > or in GB : 12779282432 / 1024 / 1024 / 1024 = 11.90 Gb ... you're on the right track, but you forgot the journal space, all the preallocated inode table blocks, etc. -Eric > My last thought was that "Reserved block" in tune2fs was not only the > reserved blocks for root (which is 5% per default on my system) but > take into account all other reserved blocks fo the fs internal usage. > So: > (3284992 - 164249) * 4096 = 12782563328 > In GB : 11.90 Gb (the difference is not significative with a precision of 2. > > So I'm lost ... > > Is someone have an explanation? I would really really be grateful. > Nicolas > > ------------------------------ > --------- > > Here is the output of df and tune2fs : > > $ tune2fs -l /dev/mapper/datavg-datalogslv > tune2fs 1.41.9 (22-Aug-2009) > Filesystem volume name: > Last mounted on: > Filesystem UUID: 4e5bea3e-3e61-4fc8-9676-e5177522911c > Filesystem magic number: 0xEF53 > Filesystem revision #: 1 (dynamic) > Filesystem features: has_journal ext_attr resize_inode dir_index > filetype needs_recovery sparse_super large_file > Filesystem flags: unsigned_directory_hash > Default mount options: (none) > Filesystem state: clean > Errors behavior: Continue > Filesystem OS type: Linux > Inode count: 822544 > Block count: 3284992 > Reserved block count: 164249 > Free blocks: 3109325 > Free inodes: 822348 > First block: 0 > Block size: 4096 > Fragment size: 4096 > Reserved GDT blocks: 801 > Blocks per group: 32768 > Fragments per group: 32768 > Inodes per group: 8144 > Inode blocks per group: 509 > Filesystem created: Wed Aug 28 08:30:10 2013 > Last mount time: Wed Sep 11 17:16:56 2013 > Last write time: Thu Sep 12 09:38:02 2013 > Mount count: 18 > Maximum mount count: 27 > Last checked: Wed Aug 28 08:30:10 2013 > Check interval: 15552000 (6 months) > Next check after: Mon Feb 24 07:30:10 2014 > Reserved blocks uid: 0 (user root) > Reserved blocks gid: 0 (group root) > First inode: 11 > Inode size: 256 > Required extra isize: 28 > Desired extra isize: 28 > Journal inode: 8 > Default directory hash: half_md4 > Directory Hash Seed: ad2251a9-ac33-4e5e-b933-af49cb4f2bb3 > Journal backup: inode blocks > > $ df --block-size=1 /dev/mapper/datavg-datalogslv > Filesystem 1B-blocks Used Available Use% Mounted on > /dev/mapper/datavg-datalogslv 13243846656 563843072 12007239680 5% /logs > > From be.nicolas.michel at gmail.com Mon Sep 16 14:44:59 2013 From: be.nicolas.michel at gmail.com (Nicolas Michel) Date: Mon, 16 Sep 2013 16:44:59 +0200 Subject: Numbers behind "df" and "tune2fs" In-Reply-To: <5237181B.1070109@redhat.com> References: <5237181B.1070109@redhat.com> Message-ID: Thanks for you help. I also tried adding some other informations as you suggest: I can also take into account: - "Reserved block count: XXXXXXX" from tune2fs that gives me the number of blocks reserved for root - Reserved GDT blocks: XXX But I didn't thought about the FS journal. How can I gather information about it? (it's size and any other information?) 2013/9/16 Eric Sandeen : > On 9/16/13 5:16 AM, Nicolas Michel wrote: >> Hello guys, >> >> I have some difficulties to understand what really are the numbers >> behing "df" and tune2fs. You'll find the output of tune2fs and df >> below, on which my maths are based. >> >> Here are my maths: >> >> A tune2fs on an ext3 FS tell me the FS size is 3284992 block large. 
It >> also tell me that the size of one block is 4096 (bytes if I'm not >> wrong?). So my maths tell me that the disk is 3284992 * 4096 = >> 13455327232 bytes or 13455327232 / 1024 /1024 /1024 = 12.53 GB. >> >> A df --block-size=1 on the same FS tell me the disk is 13243846656 >> which is 211480576 bytes smaller than what tune2fs tell me. > > By default, df on extN assumes that metadata used by the filesystem > was never available for your use and is not part of the filesystem > space. > > Documentation/filesystems/ext3.txt says: > > bsddf (*) Make 'df' act like BSD. > minixdf Make 'df' act like Minix. > > which is pretty unhelpful I suppose. ;) > > The mount man page is a little more helpful: > > bsddf|minixdf > Set the behaviour for the statfs system call. The minixdf > behaviour is to return in the f_blocks field the total number > of blocks of the filesystem, while the bsddf behaviour (which > is the default) is to subtract the overhead blocks used by the > ext2 filesystem and not available for file storage. > > You're seeing the latter behavior. if you mount with -o minixdf you should > see what you expect. (Too bad there's no "linuxdf?") :) > >> In gigabytes, it means: >> * for df, the disk is 12.33 GB >> * for tune2fs, the disk is 12.53 GB >> >> I thought that maybe df is only taking into account the real blocks >> available for users. So I tried to remove the reserved blocks and the >> GDT blocks: >> (3284992 - 164249 - 801) * 4096 = 12779282432 >> or in GB : 12779282432 / 1024 / 1024 / 1024 = 11.90 Gb ... > > you're on the right track, but you forgot the journal space, all the > preallocated inode table blocks, etc. > > -Eric > >> My last thought was that "Reserved block" in tune2fs was not only the >> reserved blocks for root (which is 5% per default on my system) but >> take into account all other reserved blocks fo the fs internal usage. >> So: >> (3284992 - 164249) * 4096 = 12782563328 >> In GB : 11.90 Gb (the difference is not significative with a precision of 2. >> >> So I'm lost ... >> >> Is someone have an explanation? I would really really be grateful. 
>> Nicolas >> >> ------------------------------ >> --------- >> >> Here is the output of df and tune2fs : >> >> $ tune2fs -l /dev/mapper/datavg-datalogslv >> tune2fs 1.41.9 (22-Aug-2009) >> Filesystem volume name: >> Last mounted on: >> Filesystem UUID: 4e5bea3e-3e61-4fc8-9676-e5177522911c >> Filesystem magic number: 0xEF53 >> Filesystem revision #: 1 (dynamic) >> Filesystem features: has_journal ext_attr resize_inode dir_index >> filetype needs_recovery sparse_super large_file >> Filesystem flags: unsigned_directory_hash >> Default mount options: (none) >> Filesystem state: clean >> Errors behavior: Continue >> Filesystem OS type: Linux >> Inode count: 822544 >> Block count: 3284992 >> Reserved block count: 164249 >> Free blocks: 3109325 >> Free inodes: 822348 >> First block: 0 >> Block size: 4096 >> Fragment size: 4096 >> Reserved GDT blocks: 801 >> Blocks per group: 32768 >> Fragments per group: 32768 >> Inodes per group: 8144 >> Inode blocks per group: 509 >> Filesystem created: Wed Aug 28 08:30:10 2013 >> Last mount time: Wed Sep 11 17:16:56 2013 >> Last write time: Thu Sep 12 09:38:02 2013 >> Mount count: 18 >> Maximum mount count: 27 >> Last checked: Wed Aug 28 08:30:10 2013 >> Check interval: 15552000 (6 months) >> Next check after: Mon Feb 24 07:30:10 2014 >> Reserved blocks uid: 0 (user root) >> Reserved blocks gid: 0 (group root) >> First inode: 11 >> Inode size: 256 >> Required extra isize: 28 >> Desired extra isize: 28 >> Journal inode: 8 >> Default directory hash: half_md4 >> Directory Hash Seed: ad2251a9-ac33-4e5e-b933-af49cb4f2bb3 >> Journal backup: inode blocks >> >> $ df --block-size=1 /dev/mapper/datavg-datalogslv >> Filesystem 1B-blocks Used Available Use% Mounted on >> /dev/mapper/datavg-datalogslv 13243846656 563843072 12007239680 5% /logs >> >> > -- Nicolas MICHEL From sandeen at redhat.com Mon Sep 16 16:25:37 2013 From: sandeen at redhat.com (Eric Sandeen) Date: Mon, 16 Sep 2013 11:25:37 -0500 Subject: Numbers behind "df" and "tune2fs" In-Reply-To: References: <5237181B.1070109@redhat.com> Message-ID: <52373101.3060802@redhat.com> On 9/16/13 9:44 AM, Nicolas Michel wrote: > Thanks for you help. I also tried adding some other informations as you suggest: > I can also take into account: > - "Reserved block count: XXXXXXX" from tune2fs that gives me the > number of blocks reserved for root > - Reserved GDT blocks: XXX > > But I didn't thought about the FS journal. How can I gather > information about it? (it's size and any other information?) # dumpe2fs /dev/$YOUR_DEVICE | grep Journal dumpe2fs 1.41.12 (17-May-2010) Journal inode: 8 Journal backup: inode blocks Journal features: journal_incompat_revoke Journal size: 128M Journal length: 32768 But you also need to take into account inode tables, inode allocation bitmaps, block allocation bitmaps ... -Eric > 2013/9/16 Eric Sandeen : >> On 9/16/13 5:16 AM, Nicolas Michel wrote: >>> Hello guys, >>> >>> I have some difficulties to understand what really are the numbers >>> behing "df" and tune2fs. You'll find the output of tune2fs and df >>> below, on which my maths are based. >>> >>> Here are my maths: >>> >>> A tune2fs on an ext3 FS tell me the FS size is 3284992 block large. It >>> also tell me that the size of one block is 4096 (bytes if I'm not >>> wrong?). So my maths tell me that the disk is 3284992 * 4096 = >>> 13455327232 bytes or 13455327232 / 1024 /1024 /1024 = 12.53 GB. >>> >>> A df --block-size=1 on the same FS tell me the disk is 13243846656 >>> which is 211480576 bytes smaller than what tune2fs tell me. 
>> >> By default, df on extN assumes that metadata used by the filesystem >> was never available for your use and is not part of the filesystem >> space. >> >> Documentation/filesystems/ext3.txt says: >> >> bsddf (*) Make 'df' act like BSD. >> minixdf Make 'df' act like Minix. >> >> which is pretty unhelpful I suppose. ;) >> >> The mount man page is a little more helpful: >> >> bsddf|minixdf >> Set the behaviour for the statfs system call. The minixdf >> behaviour is to return in the f_blocks field the total number >> of blocks of the filesystem, while the bsddf behaviour (which >> is the default) is to subtract the overhead blocks used by the >> ext2 filesystem and not available for file storage. >> >> You're seeing the latter behavior. if you mount with -o minixdf you should >> see what you expect. (Too bad there's no "linuxdf?") :) >> >>> In gigabytes, it means: >>> * for df, the disk is 12.33 GB >>> * for tune2fs, the disk is 12.53 GB >>> >>> I thought that maybe df is only taking into account the real blocks >>> available for users. So I tried to remove the reserved blocks and the >>> GDT blocks: >>> (3284992 - 164249 - 801) * 4096 = 12779282432 >>> or in GB : 12779282432 / 1024 / 1024 / 1024 = 11.90 Gb ... >> >> you're on the right track, but you forgot the journal space, all the >> preallocated inode table blocks, etc. >> >> -Eric >> >>> My last thought was that "Reserved block" in tune2fs was not only the >>> reserved blocks for root (which is 5% per default on my system) but >>> take into account all other reserved blocks fo the fs internal usage. >>> So: >>> (3284992 - 164249) * 4096 = 12782563328 >>> In GB : 11.90 Gb (the difference is not significative with a precision of 2. >>> >>> So I'm lost ... >>> >>> Is someone have an explanation? I would really really be grateful. 
>>> Nicolas >>> >>> ------------------------------ >>> --------- >>> >>> Here is the output of df and tune2fs : >>> >>> $ tune2fs -l /dev/mapper/datavg-datalogslv >>> tune2fs 1.41.9 (22-Aug-2009) >>> Filesystem volume name: >>> Last mounted on: >>> Filesystem UUID: 4e5bea3e-3e61-4fc8-9676-e5177522911c >>> Filesystem magic number: 0xEF53 >>> Filesystem revision #: 1 (dynamic) >>> Filesystem features: has_journal ext_attr resize_inode dir_index >>> filetype needs_recovery sparse_super large_file >>> Filesystem flags: unsigned_directory_hash >>> Default mount options: (none) >>> Filesystem state: clean >>> Errors behavior: Continue >>> Filesystem OS type: Linux >>> Inode count: 822544 >>> Block count: 3284992 >>> Reserved block count: 164249 >>> Free blocks: 3109325 >>> Free inodes: 822348 >>> First block: 0 >>> Block size: 4096 >>> Fragment size: 4096 >>> Reserved GDT blocks: 801 >>> Blocks per group: 32768 >>> Fragments per group: 32768 >>> Inodes per group: 8144 >>> Inode blocks per group: 509 >>> Filesystem created: Wed Aug 28 08:30:10 2013 >>> Last mount time: Wed Sep 11 17:16:56 2013 >>> Last write time: Thu Sep 12 09:38:02 2013 >>> Mount count: 18 >>> Maximum mount count: 27 >>> Last checked: Wed Aug 28 08:30:10 2013 >>> Check interval: 15552000 (6 months) >>> Next check after: Mon Feb 24 07:30:10 2014 >>> Reserved blocks uid: 0 (user root) >>> Reserved blocks gid: 0 (group root) >>> First inode: 11 >>> Inode size: 256 >>> Required extra isize: 28 >>> Desired extra isize: 28 >>> Journal inode: 8 >>> Default directory hash: half_md4 >>> Directory Hash Seed: ad2251a9-ac33-4e5e-b933-af49cb4f2bb3 >>> Journal backup: inode blocks >>> >>> $ df --block-size=1 /dev/mapper/datavg-datalogslv >>> Filesystem 1B-blocks Used Available Use% Mounted on >>> /dev/mapper/datavg-datalogslv 13243846656 563843072 12007239680 5% /logs >>> >>> >> > > > From be.nicolas.michel at gmail.com Tue Sep 17 06:14:07 2013 From: be.nicolas.michel at gmail.com (Nicolas Michel) Date: Tue, 17 Sep 2013 08:14:07 +0200 Subject: Numbers behind "df" and "tune2fs" In-Reply-To: <52373101.3060802@redhat.com> References: <5237181B.1070109@redhat.com> <52373101.3060802@redhat.com> Message-ID: OK. Thanks for the journal information. I thought tune2fs -l and dumpe2fs were the same. In reality it's almost the same but not entirely ^^ I hear you about all the internal mecanisms that make the FS working or give it some features, and I do understand that it takes some place on the disk. However what I don't understand is why the number given in the "available column" is called "available" if it's not really the case and we have to remove some other thousand/million of bytes for some internal mecanisms. In other words I don't understand why the "used" percentage given by df does not reflects the values given by itself in the other columns. I can live with it but I really would like to understand why things are what they are. Is there an historic reason? Or maybe a technical reason that makes thoses numbers some added values? The least would be to have the df algorithms documented somewhere? A document that explains intentions and how the values are obtained. The same for tune2fs and dumpe2fs (what really means the given numbers?) 2013/9/16 Eric Sandeen : > On 9/16/13 9:44 AM, Nicolas Michel wrote: >> Thanks for you help. 
I also tried adding some other informations as you suggest: >> I can also take into account: >> - "Reserved block count: XXXXXXX" from tune2fs that gives me the >> number of blocks reserved for root >> - Reserved GDT blocks: XXX >> >> But I didn't thought about the FS journal. How can I gather >> information about it? (it's size and any other information?) > > # dumpe2fs /dev/$YOUR_DEVICE | grep Journal > dumpe2fs 1.41.12 (17-May-2010) > Journal inode: 8 > Journal backup: inode blocks > Journal features: journal_incompat_revoke > Journal size: 128M > Journal length: 32768 > > But you also need to take into account inode tables, inode > allocation bitmaps, block allocation bitmaps ... > > -Eric > >> 2013/9/16 Eric Sandeen : >>> On 9/16/13 5:16 AM, Nicolas Michel wrote: >>>> Hello guys, >>>> >>>> I have some difficulties to understand what really are the numbers >>>> behing "df" and tune2fs. You'll find the output of tune2fs and df >>>> below, on which my maths are based. >>>> >>>> Here are my maths: >>>> >>>> A tune2fs on an ext3 FS tell me the FS size is 3284992 block large. It >>>> also tell me that the size of one block is 4096 (bytes if I'm not >>>> wrong?). So my maths tell me that the disk is 3284992 * 4096 = >>>> 13455327232 bytes or 13455327232 / 1024 /1024 /1024 = 12.53 GB. >>>> >>>> A df --block-size=1 on the same FS tell me the disk is 13243846656 >>>> which is 211480576 bytes smaller than what tune2fs tell me. >>> >>> By default, df on extN assumes that metadata used by the filesystem >>> was never available for your use and is not part of the filesystem >>> space. >>> >>> Documentation/filesystems/ext3.txt says: >>> >>> bsddf (*) Make 'df' act like BSD. >>> minixdf Make 'df' act like Minix. >>> >>> which is pretty unhelpful I suppose. ;) >>> >>> The mount man page is a little more helpful: >>> >>> bsddf|minixdf >>> Set the behaviour for the statfs system call. The minixdf >>> behaviour is to return in the f_blocks field the total number >>> of blocks of the filesystem, while the bsddf behaviour (which >>> is the default) is to subtract the overhead blocks used by the >>> ext2 filesystem and not available for file storage. >>> >>> You're seeing the latter behavior. if you mount with -o minixdf you should >>> see what you expect. (Too bad there's no "linuxdf?") :) >>> >>>> In gigabytes, it means: >>>> * for df, the disk is 12.33 GB >>>> * for tune2fs, the disk is 12.53 GB >>>> >>>> I thought that maybe df is only taking into account the real blocks >>>> available for users. So I tried to remove the reserved blocks and the >>>> GDT blocks: >>>> (3284992 - 164249 - 801) * 4096 = 12779282432 >>>> or in GB : 12779282432 / 1024 / 1024 / 1024 = 11.90 Gb ... >>> >>> you're on the right track, but you forgot the journal space, all the >>> preallocated inode table blocks, etc. >>> >>> -Eric >>> >>>> My last thought was that "Reserved block" in tune2fs was not only the >>>> reserved blocks for root (which is 5% per default on my system) but >>>> take into account all other reserved blocks fo the fs internal usage. >>>> So: >>>> (3284992 - 164249) * 4096 = 12782563328 >>>> In GB : 11.90 Gb (the difference is not significative with a precision of 2. >>>> >>>> So I'm lost ... >>>> >>>> Is someone have an explanation? I would really really be grateful. 
>>>> Nicolas >>>> >>>> ------------------------------ >>>> --------- >>>> >>>> Here is the output of df and tune2fs : >>>> >>>> $ tune2fs -l /dev/mapper/datavg-datalogslv >>>> tune2fs 1.41.9 (22-Aug-2009) >>>> Filesystem volume name: >>>> Last mounted on: >>>> Filesystem UUID: 4e5bea3e-3e61-4fc8-9676-e5177522911c >>>> Filesystem magic number: 0xEF53 >>>> Filesystem revision #: 1 (dynamic) >>>> Filesystem features: has_journal ext_attr resize_inode dir_index >>>> filetype needs_recovery sparse_super large_file >>>> Filesystem flags: unsigned_directory_hash >>>> Default mount options: (none) >>>> Filesystem state: clean >>>> Errors behavior: Continue >>>> Filesystem OS type: Linux >>>> Inode count: 822544 >>>> Block count: 3284992 >>>> Reserved block count: 164249 >>>> Free blocks: 3109325 >>>> Free inodes: 822348 >>>> First block: 0 >>>> Block size: 4096 >>>> Fragment size: 4096 >>>> Reserved GDT blocks: 801 >>>> Blocks per group: 32768 >>>> Fragments per group: 32768 >>>> Inodes per group: 8144 >>>> Inode blocks per group: 509 >>>> Filesystem created: Wed Aug 28 08:30:10 2013 >>>> Last mount time: Wed Sep 11 17:16:56 2013 >>>> Last write time: Thu Sep 12 09:38:02 2013 >>>> Mount count: 18 >>>> Maximum mount count: 27 >>>> Last checked: Wed Aug 28 08:30:10 2013 >>>> Check interval: 15552000 (6 months) >>>> Next check after: Mon Feb 24 07:30:10 2014 >>>> Reserved blocks uid: 0 (user root) >>>> Reserved blocks gid: 0 (group root) >>>> First inode: 11 >>>> Inode size: 256 >>>> Required extra isize: 28 >>>> Desired extra isize: 28 >>>> Journal inode: 8 >>>> Default directory hash: half_md4 >>>> Directory Hash Seed: ad2251a9-ac33-4e5e-b933-af49cb4f2bb3 >>>> Journal backup: inode blocks >>>> >>>> $ df --block-size=1 /dev/mapper/datavg-datalogslv >>>> Filesystem 1B-blocks Used Available Use% Mounted on >>>> /dev/mapper/datavg-datalogslv 13243846656 563843072 12007239680 5% /logs >>>> >>>> >>> >> >> >> > -- Nicolas MICHEL From be.nicolas.michel at gmail.com Tue Sep 17 06:34:26 2013 From: be.nicolas.michel at gmail.com (Nicolas Michel) Date: Tue, 17 Sep 2013 08:34:26 +0200 Subject: Numbers behind "df" and "tune2fs" In-Reply-To: References: <5237181B.1070109@redhat.com> <52373101.3060802@redhat.com> Message-ID: In fact the thing I really want to achieve is to be able to find the values and the algorithm that enable me to reproduce the percentage given by df (and to understand deeply what it means). Why do I need it? Because I'm trying to write some script to do capacity planning and space problem forecast. Currently I don't really know which values I should use to do it. (I could use the percentage given by df but it lacks some precisions to make usefull forecasts) 2013/9/17 Nicolas Michel : > OK. Thanks for the journal information. I thought tune2fs -l and > dumpe2fs were the same. In reality it's almost the same but not > entirely ^^ > > I hear you about all the internal mecanisms that make the FS working > or give it some features, and I do understand that it takes some place > on the disk. However what I don't understand is why the number given > in the "available column" is called "available" if it's not really the > case and we have to remove some other thousand/million of bytes for > some internal mecanisms. > > In other words I don't understand why the "used" percentage given by > df does not reflects the values given by itself in the other columns. > > I can live with it but I really would like to understand why things > are what they are. Is there an historic reason? 
Or maybe a technical > reason that makes thoses numbers some added values? > > The least would be to have the df algorithms documented somewhere? A > document that explains intentions and how the values are obtained. > The same for tune2fs and dumpe2fs (what really means the given numbers?) > > 2013/9/16 Eric Sandeen : >> On 9/16/13 9:44 AM, Nicolas Michel wrote: >>> Thanks for you help. I also tried adding some other informations as you suggest: >>> I can also take into account: >>> - "Reserved block count: XXXXXXX" from tune2fs that gives me the >>> number of blocks reserved for root >>> - Reserved GDT blocks: XXX >>> >>> But I didn't thought about the FS journal. How can I gather >>> information about it? (it's size and any other information?) >> >> # dumpe2fs /dev/$YOUR_DEVICE | grep Journal >> dumpe2fs 1.41.12 (17-May-2010) >> Journal inode: 8 >> Journal backup: inode blocks >> Journal features: journal_incompat_revoke >> Journal size: 128M >> Journal length: 32768 >> >> But you also need to take into account inode tables, inode >> allocation bitmaps, block allocation bitmaps ... >> >> -Eric >> >>> 2013/9/16 Eric Sandeen : >>>> On 9/16/13 5:16 AM, Nicolas Michel wrote: >>>>> Hello guys, >>>>> >>>>> I have some difficulties to understand what really are the numbers >>>>> behing "df" and tune2fs. You'll find the output of tune2fs and df >>>>> below, on which my maths are based. >>>>> >>>>> Here are my maths: >>>>> >>>>> A tune2fs on an ext3 FS tell me the FS size is 3284992 block large. It >>>>> also tell me that the size of one block is 4096 (bytes if I'm not >>>>> wrong?). So my maths tell me that the disk is 3284992 * 4096 = >>>>> 13455327232 bytes or 13455327232 / 1024 /1024 /1024 = 12.53 GB. >>>>> >>>>> A df --block-size=1 on the same FS tell me the disk is 13243846656 >>>>> which is 211480576 bytes smaller than what tune2fs tell me. >>>> >>>> By default, df on extN assumes that metadata used by the filesystem >>>> was never available for your use and is not part of the filesystem >>>> space. >>>> >>>> Documentation/filesystems/ext3.txt says: >>>> >>>> bsddf (*) Make 'df' act like BSD. >>>> minixdf Make 'df' act like Minix. >>>> >>>> which is pretty unhelpful I suppose. ;) >>>> >>>> The mount man page is a little more helpful: >>>> >>>> bsddf|minixdf >>>> Set the behaviour for the statfs system call. The minixdf >>>> behaviour is to return in the f_blocks field the total number >>>> of blocks of the filesystem, while the bsddf behaviour (which >>>> is the default) is to subtract the overhead blocks used by the >>>> ext2 filesystem and not available for file storage. >>>> >>>> You're seeing the latter behavior. if you mount with -o minixdf you should >>>> see what you expect. (Too bad there's no "linuxdf?") :) >>>> >>>>> In gigabytes, it means: >>>>> * for df, the disk is 12.33 GB >>>>> * for tune2fs, the disk is 12.53 GB >>>>> >>>>> I thought that maybe df is only taking into account the real blocks >>>>> available for users. So I tried to remove the reserved blocks and the >>>>> GDT blocks: >>>>> (3284992 - 164249 - 801) * 4096 = 12779282432 >>>>> or in GB : 12779282432 / 1024 / 1024 / 1024 = 11.90 Gb ... >>>> >>>> you're on the right track, but you forgot the journal space, all the >>>> preallocated inode table blocks, etc. >>>> >>>> -Eric >>>> >>>>> My last thought was that "Reserved block" in tune2fs was not only the >>>>> reserved blocks for root (which is 5% per default on my system) but >>>>> take into account all other reserved blocks fo the fs internal usage. 
>>>>> So: >>>>> (3284992 - 164249) * 4096 = 12782563328 >>>>> In GB : 11.90 Gb (the difference is not significative with a precision of 2. >>>>> >>>>> So I'm lost ... >>>>> >>>>> Is someone have an explanation? I would really really be grateful. >>>>> Nicolas >>>>> >>>>> ------------------------------ >>>>> --------- >>>>> >>>>> Here is the output of df and tune2fs : >>>>> >>>>> $ tune2fs -l /dev/mapper/datavg-datalogslv >>>>> tune2fs 1.41.9 (22-Aug-2009) >>>>> Filesystem volume name: >>>>> Last mounted on: >>>>> Filesystem UUID: 4e5bea3e-3e61-4fc8-9676-e5177522911c >>>>> Filesystem magic number: 0xEF53 >>>>> Filesystem revision #: 1 (dynamic) >>>>> Filesystem features: has_journal ext_attr resize_inode dir_index >>>>> filetype needs_recovery sparse_super large_file >>>>> Filesystem flags: unsigned_directory_hash >>>>> Default mount options: (none) >>>>> Filesystem state: clean >>>>> Errors behavior: Continue >>>>> Filesystem OS type: Linux >>>>> Inode count: 822544 >>>>> Block count: 3284992 >>>>> Reserved block count: 164249 >>>>> Free blocks: 3109325 >>>>> Free inodes: 822348 >>>>> First block: 0 >>>>> Block size: 4096 >>>>> Fragment size: 4096 >>>>> Reserved GDT blocks: 801 >>>>> Blocks per group: 32768 >>>>> Fragments per group: 32768 >>>>> Inodes per group: 8144 >>>>> Inode blocks per group: 509 >>>>> Filesystem created: Wed Aug 28 08:30:10 2013 >>>>> Last mount time: Wed Sep 11 17:16:56 2013 >>>>> Last write time: Thu Sep 12 09:38:02 2013 >>>>> Mount count: 18 >>>>> Maximum mount count: 27 >>>>> Last checked: Wed Aug 28 08:30:10 2013 >>>>> Check interval: 15552000 (6 months) >>>>> Next check after: Mon Feb 24 07:30:10 2014 >>>>> Reserved blocks uid: 0 (user root) >>>>> Reserved blocks gid: 0 (group root) >>>>> First inode: 11 >>>>> Inode size: 256 >>>>> Required extra isize: 28 >>>>> Desired extra isize: 28 >>>>> Journal inode: 8 >>>>> Default directory hash: half_md4 >>>>> Directory Hash Seed: ad2251a9-ac33-4e5e-b933-af49cb4f2bb3 >>>>> Journal backup: inode blocks >>>>> >>>>> $ df --block-size=1 /dev/mapper/datavg-datalogslv >>>>> Filesystem 1B-blocks Used Available Use% Mounted on >>>>> /dev/mapper/datavg-datalogslv 13243846656 563843072 12007239680 5% /logs >>>>> >>>>> >>>> >>> >>> >>> >> > > > > -- > Nicolas MICHEL -- Nicolas MICHEL From sandeen at redhat.com Wed Sep 18 15:25:48 2013 From: sandeen at redhat.com (Eric Sandeen) Date: Wed, 18 Sep 2013 10:25:48 -0500 Subject: Numbers behind "df" and "tune2fs" In-Reply-To: References: <5237181B.1070109@redhat.com> <52373101.3060802@redhat.com> Message-ID: <5239C5FC.8010405@redhat.com> On 9/17/13 1:34 AM, Nicolas Michel wrote: > In fact the thing I really want to achieve is to be able to find the > values and the algorithm that enable me to reproduce the percentage > given by df (and to understand deeply what it means). > > Why do I need it? Because I'm trying to write some script to do > capacity planning and space problem forecast. Currently I don't really > know which values I should use to do it. (I could use the percentage > given by df but it lacks some precisions to make usefull forecasts) > If you want "the truth" just mount -o minixdf, tune2fs to 0 blocks reserved, and you'll get the actual number of blocks contained in the filesystem, the actual number of blocks used, and the actual blocks free. Why extN made it so complicated, I don't really know. If you want to see how the sausage is made, look at ext3_statfs() for all the hairy calculations. (ext4_statfs() is even more complex). 
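Very roughly, for a filesystem like the one quoted above, the overhead those calculations subtract is the per-group metadata (block bitmap, inode bitmap, inode table) plus the superblock and group-descriptor backups in the sparse_super groups; the journal and the reserved GDT blocks tend to be charged as "used" space instead. A back-of-the-envelope Python sketch of that accounting, using only the tune2fs -l figures quoted earlier -- this is an approximation of the idea, not the actual statfs code, and different kernel versions count these things differently:

import math

block_count      = 3284992   # tune2fs: Block count
block_size       = 4096      # tune2fs: Block size
blocks_per_group = 32768     # tune2fs: Blocks per group
inode_blocks_pg  = 509       # tune2fs: Inode blocks per group

groups = math.ceil(block_count / blocks_per_group)       # 101 groups
per_group_meta = groups * (inode_blocks_pg + 2)           # inode table + 2 bitmaps per group

# With sparse_super, only groups 0, 1 and the powers of 3, 5 and 7
# carry superblock + group-descriptor backups (the descriptors fit in
# a single 4k block for a filesystem this small).
def has_backup(g):
    if g in (0, 1):
        return True
    for p in (3, 5, 7):
        n = p
        while n < g:
            n *= p
        if n == g:
            return True
    return False

backups  = sum(2 for g in range(groups) if has_backup(g))   # sb + descriptor block each
overhead = per_group_meta + backups                          # in filesystem blocks
print((block_count - overhead) * block_size)                 # close to df's 13243846656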
Until recently, it was all complicated enough that even the kernel code got it wrong. ;)

0875a2b448fcaba67010850cf9649293a5ef653d ext4: include journal blocks in df overhead calcs
b72f78cb63fb595af63fc781dced0a6fd354e572 ext4: fix overhead calculations in ext4_stats, again
952fc18ef9ec707ebdc16c0786ec360295e5ff15 ext4: fix overhead calculation used by ext4_statfs()
...

-Eric
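P.S. For the capacity-planning script itself there is no need to redo any of this accounting by hand: df only reports what the statfs()/statvfs() call returns, so a script can read the same numbers directly. A minimal Python sketch (the mount point is just an example; the Use% formula is the one GNU coreutils df appears to use):

import os

def df_like(path):
    st = os.statvfs(path)
    # f_blocks / f_bfree / f_bavail are counted in units of f_frsize bytes.
    total = st.f_blocks * st.f_frsize
    free  = st.f_bfree  * st.f_frsize   # includes the root-reserved blocks
    avail = st.f_bavail * st.f_frsize   # what non-root users can still write
    used  = total - free
    # df computes Use% against used + avail rather than against total,
    # which is one reason the columns never quite add up by eye.
    pct = 100.0 * used / (used + avail) if (used + avail) else 0.0
    return total, used, avail, pct

print(df_like("/logs"))   # the mount point from the thread, as an example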