From jemf at gabcmt.eb.mil.br Sun Jan 1 21:54:40 2006
From: jemf at gabcmt.eb.mil.br (JEMF)
Date: Sun, 01 Jan 2006 19:54:40 -0200
Subject: Questions about partitioning and ext3
Message-ID:

Hello all!

I have a 512 MB Kingston flash disk. When I try to create a partition
with 460 MB (471.040 KB), the partition is created with 460.6 MB
(471.665 KB).

------------------------------------------------
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         951      471665   83  Linux
------------------------------------------------

Why? Geometry?

2nd Question:

When I format the partition with ext3, the df -k command returns:

------------------------------------------------
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb1               456730      8239    424908   2% /mnt
------------------------------------------------

I think 8239 KB (8.05 MB) was used by journal. But the amount of blocks
decreased after formatted (471665 to 456730). Why?

Thanks.

From daniel at rimspace.net Mon Jan 2 00:11:37 2006
From: daniel at rimspace.net (Daniel Pittman)
Date: Mon, 02 Jan 2006 11:11:37 +1100
Subject: Questions about partitioning and ext3
References:
Message-ID: <87lkxzo5dy.fsf@rimspace.net>

JEMF writes:

> I have a 512 MB Kingston flash disk. When I try to create a partition
> with 460 MB (471.040 KB), the partition is created with 460.6 MB
> (471.665 KB).
>
> ------------------------------------------------
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1               1         951      471665   83  Linux
> ------------------------------------------------
>
> Why? Geometry?

Yup; when the flash device pretends to have geometry so that it doesn't
confuse software that still lives in DOS land, it caused that.
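As a rough check of the geometry explanation, the block count in the
fdisk listing can be reproduced from whole cylinders. This is only a
sketch: the thread never states the reported geometry, so the 16 heads
and 62 sectors per track below are an assumption inferred from the
numbers, as is the DOS convention that the first partition starts one
track into the disk.

```python
SECTOR_BYTES = 512
HEADS = 16               # assumed; not given in the thread
SECTORS_PER_TRACK = 62   # assumed; not given in the thread


def partition_blocks(end_cylinder, heads=HEADS, sectors_per_track=SECTORS_PER_TRACK):
    """Size in 1 KiB blocks of a first partition that ends on a cylinder
    boundary and skips the first track (DOS convention, assumed here)."""
    sectors_through_end = end_cylinder * heads * sectors_per_track
    usable_sectors = sectors_through_end - sectors_per_track
    return usable_sectors * SECTOR_BYTES // 1024


cylinder_kib = HEADS * SECTORS_PER_TRACK * SECTOR_BYTES // 1024
print("one cylinder:", cylinder_kib, "KiB")
print("cylinders 1-951:", partition_blocks(951), "blocks")
```

Under that assumed geometry a cylinder is 496 KiB, so a partition can
only grow in 496 KiB steps and the size actually created lands on the
nearest cylinder boundary rather than on the exact size requested.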
> 2nd Question:
>
> When I format the partition with ext3, the df -k command returns:
>
> ------------------------------------------------
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sdb1               456730      8239    424908   2% /mnt
> ------------------------------------------------
>
> I think 8239 KB (8.05 MB) was used by journal. But the amount of blocks
> decreased after formatted (471665 to 456730). Why?

The difference, of around 23,000 blocks, is five percent of the
available space on the filesystem. The default reserved block count for
root is five percent...

        Daniel

From jemf at gabcmt.eb.mil.br Mon Jan 2 01:35:51 2006
From: jemf at gabcmt.eb.mil.br (JEMF)
Date: Sun, 01 Jan 2006 23:35:51 -0200
Subject: Questions about partitioning and ext3
In-Reply-To: <87lkxzo5dy.fsf@rimspace.net>
References: <87lkxzo5dy.fsf@rimspace.net>
Message-ID:

Daniel Pittman wrote:
> JEMF writes:
>>Why? Geometry?
>
> Yup; when the flash device pretends to have geometry so that it doesn't
> confuse software that still lives in DOS land, it caused that.

How does the system calculate this geometry?

>>I think 8239 KB (8.05 MB) was used by journal. But the amount of blocks
>>decreased after formatted (471665 to 456730). Why?
>
> The difference, of around 23,000 blocks, is five percent of the
> available space on the filesystem.

No! The difference is 14935 blocks! I mentioned in the previous message
the difference between the sizes of the unformatted partition and the
formatted partition.

Can you help me again?

Thanks!

From daniel at rimspace.net Mon Jan 2 02:43:32 2006
From: daniel at rimspace.net (Daniel Pittman)
Date: Mon, 02 Jan 2006 13:43:32 +1100
Subject: Questions about partitioning and ext3
References: <87lkxzo5dy.fsf@rimspace.net>
Message-ID: <878xtznycr.fsf@rimspace.net>

JEMF writes:
> Daniel Pittman wrote:
>> JEMF writes:
>>>Why? Geometry?
>> Yup; when the flash device pretends to have geometry so that it doesn't
>> confuse software that still lives in DOS land, it caused that.
>
> How does the system calculate this geometry?

Basically, magic. Seriously, there are a bunch of heuristics, or it can
come from the DOS partition table, or from the BIOS, but it really is
pretty much just invented in (hopefully) the same way that DOS-ish
operating systems and the BIOS will do, so they also work.

>>>I think 8239 KB (8.05 MB) was used by journal. But the amount of blocks
>>>decreased after formatted (471665 to 456730). Why?
>> The difference, of around 23,000 blocks, is five percent of the
>> available space on the filesystem.
>
> No! The difference is 14935 blocks! I mentioned in the previous message
> the difference between the sizes of the unformatted partition and the
> formatted partition.

You are right -- my mental math is broken this morning. I approximated
five percent in my head, then managed to mis-subtract the two numbers to
make them match. How embarrassing. :/

My expectation would be that the difference is caused by filesystem
metadata, such as inode allocation tables, which consume some storage
space on the raw device but are not available for file storage.

The ext3 file system uses tables of fixed size and location, so that
space is consumed regardless of the space used for files.

        Daniel

From bunk at stusta.de Mon Jan 2 16:09:39 2006
From: bunk at stusta.de (Adrian Bunk)
Date: Mon, 2 Jan 2006 17:09:39 +0100
Subject: 2.6.15-rc6 OOPS
In-Reply-To: <20051224200336.GF12561@kmv.ru>
References: <20051224200336.GF12561@kmv.ru>
Message-ID: <20060102160939.GG17398@stusta.de>

On Sat, Dec 24, 2005 at 11:03:36PM +0300, Andrey J. Melnikoff (TEMHOTA) wrote:

> Hello.

Hi Andrey,

> Please, CC me, i'm not subscribed.
> > Kernel 2.6.15-rc6 OOPS: > > kernel: general protection fault: 0000 [#1] > kernel: SMP > kernel: Modules linked in: ipt_REDIRECT ipt_LOG ipt_TOS ipt_TCPMSS ipt_tos > ip_nat_ftp ipt_tcpmss iptable_nat ip_nat iptable_mangle iptable_filter > ipt_multiport ipt_mac ipt_state ipt_limit ipt_conntrack ip_conntrack_ftp > ip_conntrack ip_tables af_packet ipv6 pcspkr floppy i2c_piix4 i2c_core > ohci_hcd usbcore aic7xxx scsi_transport_spi psmouse ide_disk ide_cd > cdrom genrtc > kernel: CPU: 0 > kernel: EIP: 0060:[] Not tainted VLI > kernel: EFLAGS: 00010286 (2.6.15-rc6) > kernel: EIP is at ext3_find_entry+0x18f/0x3e0 > kernel: eax: ffffffff ebx: 00010001 ecx: 00000002 edx: 00000000 > kernel: esi: 00000000 edi: ffffffff ebp: 00000000 esp: f71b9d60 > kernel: ds: 007b es: 007b ss: 0068 > kernel: Process smbd (pid: 2999, threadinfo=f71b8000 task=f7aee530) > kernel: Stack: 00000000 f71b9db8 00000000 00000027 000005b4 ffffffff f71a62e8 00000000 > kernel: f71b9ea8 00001000 f71a636c 00000001 00000001 00010001 00000001 00000000 > kernel: 00000000 00000000 f7caf400 f71b9df0 f71503d4 ffffffff 00000000 f7159c68 > kernel: Call Trace: > kernel: [] memcpy_toiovec+0x29/0x50 > kernel: [] ext3_lookup+0x3a/0xc0 > kernel: [] real_lookup+0xae/0xd0 > kernel: [] do_lookup+0x85/0x90 > kernel: [] __link_path_walk+0x7ef/0xdd0 > kernel: [] link_path_walk+0x4e/0xd0 > kernel: [] path_lookup+0x9f/0x170 > kernel: [] __user_walk+0x2f/0x60 > kernel: [] vfs_stat+0x1d/0x60 > kernel: [] sys_stat64+0xf/0x30 > kernel: [] sys_gettimeofday+0x21/0x60 > kernel: [] syscall_call+0x7/0xb > kernel: Code: 07 7e 88 89 f6 8d bc 27 00 00 00 00 8b 5c 24 34 8b 44 9c 5c 43 89 > 5c 24 34 85 c0 89 44 24 14 89 44 24 54 0f 84 b7 00 00 00 89 c7 <8b> 00 a8 04 75 > 07 8b 47 0c 85 c0 75 11 8b 44 24 14 e8 fb e1 fb is this Oops in any way reproducible? If yes, does it occur in earlier kernels like 2.6.14.x? 
> After OOPS system work, but smbd process in 'D' state: > > kernel: smbd D 00000000 0 3000 2871 3001 2872 (NOTLB) > kernel: f71b9dbc 000005b4 000005b4 00000000 00000000 f71b9ea8 c1b70dc0 00000000 > kernel: 7fffffff c031d940 c1807400 00000000 998db100 003d099f c0300b20 f7aee530 > kernel: f7aee658 f71a63e0 f71a63e8 00000292 f7aee530 c02ba525 00000001 f7aee530 > kernel: Call Trace: > kernel: [] __down+0x75/0xe0 > kernel: [] default_wake_function+0x0/0x10 > kernel: [] __d_lookup+0xa4/0x110 > kernel: [] __down_failed+0x7/0xc > kernel: [] .text.lock.namei+0x8/0x1e6 > kernel: [] do_lookup+0x85/0x90 > kernel: [] __link_path_walk+0x7ef/0xdd0 > kernel: [] link_path_walk+0x4e/0xd0 > kernel: [] __mark_inode_dirty+0x104/0x1b0 > kernel: [] path_lookup+0x9f/0x170 > kernel: [] __user_walk+0x2f/0x60 > kernel: [] vfs_stat+0x1d/0x60 > kernel: [] __mark_inode_dirty+0x104/0x1b0 > kernel: [] current_fs_time+0x5f/0x70 > kernel: [] sys_stat64+0xf/0x30 > kernel: [] update_atime+0x52/0x90 > kernel: [] vfs_readdir+0x85/0x90 > kernel: [] dput+0x71/0x1b0 > kernel: [] mntput_no_expire+0x1b/0x70 > kernel: [] filp_close+0x3c/0x80 > kernel: [] syscall_call+0x7/0xb > > kernel: smbd D 00000000 0 3001 2871 3008 3000 (NOTLB) > kernel: f71fbdbc 000005b4 000005b4 00000000 00000000 f71fbea8 c1b70e60 00000000 > kernel: 7fffffff c031d940 c1807400 00000000 0dfc9800 003d09ad c0300b20 f7aeea30 > kernel: f7aeeb58 f71a63e0 f71a63e8 00000292 f7aeea30 c02ba525 00000001 f7aeea30 > kernel: Call Trace: > kernel: [] __down+0x75/0xe0 > kernel: [] default_wake_function+0x0/0x10 > kernel: [] __d_lookup+0xa4/0x110 > kernel: [] __down_failed+0x7/0xc > kernel: [] .text.lock.namei+0x8/0x1e6 > kernel: [] do_lookup+0x85/0x90 > kernel: [] __link_path_walk+0x7ef/0xdd0 > kernel: [] link_path_walk+0x4e/0xd0 > kernel: [] __mark_inode_dirty+0x62/0x1b0 > kernel: [] path_lookup+0x9f/0x170 > kernel: [] __user_walk+0x2f/0x60 > kernel: [] vfs_stat+0x1d/0x60 > kernel: [] __mark_inode_dirty+0x62/0x1b0 > kernel: [] 
current_fs_time+0x5f/0x70 > kernel: [] sys_stat64+0xf/0x30 > kernel: [] update_atime+0x52/0x90 > kernel: [] vfs_readdir+0x85/0x90 > kernel: [] dput+0x71/0x1b0 > kernel: [] mntput_no_expire+0x1b/0x70 > kernel: [] filp_close+0x3c/0x80 > kernel: [] syscall_call+0x7/0xb > > kernel: smbd D 00000000 0 3008 2871 3015 3001 (NOTLB) > kernel: f736bdbc 000005b4 000005b4 00000000 00000000 f736bea8 f79e4e00 00000000 > kernel: 7fffffff c031d940 c1807400 00000000 66f2b100 003d09bd c0300b20 f7b3b0b0 > kernel: f7b3b1d8 f71a63e0 f71a63e8 00000292 f7b3b0b0 c02ba525 00000001 f7b3b0b0 > kernel: Call Trace: > kernel: [] __down+0x75/0xe0 > kernel: [] default_wake_function+0x0/0x10 > kernel: [] __d_lookup+0xa4/0x110 > kernel: [] __down_failed+0x7/0xc > kernel: [] .text.lock.namei+0x8/0x1e6 > kernel: [] do_lookup+0x85/0x90 > kernel: [] __link_path_walk+0x7ef/0xdd0 > kernel: [] link_path_walk+0x4e/0xd0 > kernel: [] __mark_inode_dirty+0x62/0x1b0 > kernel: [] path_lookup+0x9f/0x170 > kernel: [] __user_walk+0x2f/0x60 > kernel: [] vfs_stat+0x1d/0x60 > kernel: [] __mark_inode_dirty+0x62/0x1b0 > kernel: [] current_fs_time+0x5f/0x70 > kernel: [] sys_stat64+0xf/0x30 > kernel: [] update_atime+0x52/0x90 > kernel: [] vfs_readdir+0x85/0x90 > kernel: [] dput+0x71/0x1b0 > kernel: [] mntput_no_expire+0x1b/0x70 > kernel: [] filp_close+0x3c/0x80 > kernel: [] syscall_call+0x7/0xb > > kernel: smbd D C01641BA 0 3015 2871 3036 3008 (NOTLB) > kernel: f7273f30 bfe3253c 00000000 c01641ba 00000804 00000000 00000000 0048815f > kernel: 000041c0 00000008 c1807400 00000000 66224500 003d09e6 c0300b20 f7b3bab0 > kernel: f7b3bbd8 f71a63e0 f71a63e8 00000286 f7b3bab0 c02ba525 00000001 f7b3bab0 > kernel: Call Trace: > kernel: [] cp_new_stat64+0xea/0x100 > kernel: [] __down+0x75/0xe0 > kernel: [] default_wake_function+0x0/0x10 > kernel: [] __down_failed+0x7/0xc > kernel: [] filldir64+0x0/0xf0 > kernel: [] .text.lock.readdir+0x8/0x29 > kernel: [] sys_getdents64+0x77/0xd7 > kernel: [] do_fcntl+0x16e/0x1e0 > kernel: [] 
syscall_call+0x7/0xb > > > Hardware: IBM eServer xSeries 330, 1Gb memory, ServeRaid 4Mx. > > Config, other data - on request. > > -- > Best regards, TEMHOTA-RIPN aka MJA13-RIPE > System Administrator. mailto:temnota at kmv.ru > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed From jemf at gabcmt.eb.mil.br Thu Jan 5 12:47:25 2006 From: jemf at gabcmt.eb.mil.br (JEMF) Date: Thu, 05 Jan 2006 10:47:25 -0200 Subject: Questions about partitioning and ext3 In-Reply-To: References: Message-ID: Somebody has more ideas about this subject? JEMF escreveu: > Hello all! > > I have a 512 MB Kingston flash disk. When I try to create a partition > with 460 MB (471.040 KB), the partition is created with 460.6 MB > (471.665 KB). > > ------------------------------------------------ > Device Boot Start End Blocks Id System > /dev/sdb1 1 951 471665 83 Linux > ------------------------------------------------ > > Why? Geometry? > > 2nd Question: > > When I format the partition with ext3, the df -k command returns: > > ------------------------------------------------ > Filesystem 1K-blocks Used Available Use% Mounted on > /dev/sdb1 456730 8239 424908 2% /mnt > ------------------------------------------------ > > I think 8239 KB (8.05 MB) was used by journal. But the amount of blocks > decreased after formatted (471665 to 456730). Why? > > Thanks. From teeeelo at googlemail.com Fri Jan 6 10:54:07 2006 From: teeeelo at googlemail.com (Thilo) Date: Fri, 06 Jan 2006 11:54:07 +0100 Subject: Folder with a questionmark "?mnt1" Message-ID: Hey guys, i had a folder ~/mnt1 where a smbmount was mounted in. 
I browsed the subnet with smbc as a normal user, then my ssh connection
crashed because my remote WLAN disconnected. :-/

Now I cannot access mnt1, and even root can't. Listing the folder takes
a long time and no longer shows mnt1. Midnight Commander (mc) shows
that folder in red with the name ?mnt1, also after a long delay. CPU
usage is quite normal.

I tried

  chattr -V -R -i ./mnt1

and got:

  Input/output error while reading the status of ./mnt1

Does anyone know what is happening?

Thanks

From teeeelo at googlemail.com Fri Jan 6 17:02:33 2006
From: teeeelo at googlemail.com (Thilo)
Date: Fri, 06 Jan 2006 18:02:33 +0100
Subject: Folder with a questionmark "?mnt1"
In-Reply-To:
References:
Message-ID:

Hey,

I fixed it. Sorry, my samba caused that error. Anyway, I have to check
my filesystem because fsck.ext3 shows several failures... that's why I
asked my question here in this newsgroup...

But what did that question mark mean?

Have fun! :-)

From temnota at kmv.ru Mon Jan 9 17:28:35 2006
From: temnota at kmv.ru (Andrey J. Melnikoff (TEMHOTA))
Date: Mon, 9 Jan 2006 20:28:35 +0300
Subject: 2.6.15-rc6 OOPS
In-Reply-To: <20060102160939.GG17398@stusta.de>
References: <20051224200336.GF12561@kmv.ru> <20060102160939.GG17398@stusta.de>
Message-ID: <20060109172835.GB2724@kmv.ru>

Hi Adrian Bunk!

On Mon, Jan 02, 2006 at 05:09:39PM +0100, Adrian Bunk wrote:
> On Sat, Dec 24, 2005 at 11:03:36PM +0300, Andrey J. Melnikoff (TEMHOTA) wrote:
>
> > Please, CC me, i'm not subscribed.
> > > > Kernel 2.6.15-rc6 OOPS: > > > > kernel: general protection fault: 0000 [#1] > > kernel: SMP > > kernel: Modules linked in: ipt_REDIRECT ipt_LOG ipt_TOS ipt_TCPMSS ipt_tos > > ip_nat_ftp ipt_tcpmss iptable_nat ip_nat iptable_mangle iptable_filter > > ipt_multiport ipt_mac ipt_state ipt_limit ipt_conntrack ip_conntrack_ftp > > ip_conntrack ip_tables af_packet ipv6 pcspkr floppy i2c_piix4 i2c_core > > ohci_hcd usbcore aic7xxx scsi_transport_spi psmouse ide_disk ide_cd > > cdrom genrtc > > kernel: CPU: 0 > > kernel: EIP: 0060:[] Not tainted VLI > > kernel: EFLAGS: 00010286 (2.6.15-rc6) > > kernel: EIP is at ext3_find_entry+0x18f/0x3e0 > > kernel: eax: ffffffff ebx: 00010001 ecx: 00000002 edx: 00000000 > > kernel: esi: 00000000 edi: ffffffff ebp: 00000000 esp: f71b9d60 > > kernel: ds: 007b es: 007b ss: 0068 > > kernel: Process smbd (pid: 2999, threadinfo=f71b8000 task=f7aee530) > > kernel: Stack: 00000000 f71b9db8 00000000 00000027 000005b4 ffffffff f71a62e8 00000000 > > kernel: f71b9ea8 00001000 f71a636c 00000001 00000001 00010001 00000001 00000000 > > kernel: 00000000 00000000 f7caf400 f71b9df0 f71503d4 ffffffff 00000000 f7159c68 > > kernel: Call Trace: > > kernel: [] memcpy_toiovec+0x29/0x50 > > kernel: [] ext3_lookup+0x3a/0xc0 > > kernel: [] real_lookup+0xae/0xd0 > > kernel: [] do_lookup+0x85/0x90 > > kernel: [] __link_path_walk+0x7ef/0xdd0 > > kernel: [] link_path_walk+0x4e/0xd0 > > kernel: [] path_lookup+0x9f/0x170 > > kernel: [] __user_walk+0x2f/0x60 > > kernel: [] vfs_stat+0x1d/0x60 > > kernel: [] sys_stat64+0xf/0x30 > > kernel: [] sys_gettimeofday+0x21/0x60 > > kernel: [] syscall_call+0x7/0xb > > kernel: Code: 07 7e 88 89 f6 8d bc 27 00 00 00 00 8b 5c 24 34 8b 44 9c 5c 43 89 > > 5c 24 34 85 c0 89 44 24 14 89 44 24 54 0f 84 b7 00 00 00 89 c7 <8b> 00 a8 04 75 > > 07 8b 47 0c 85 c0 75 11 8b 44 24 14 e8 fb e1 fb > > > is this Oops in any way reproducible? No. We replease server and this kernel work on new hardware. 
I think this was a memory/hardware/overheating problem.

> If yes, does it occur in earlier kernels like 2.6.14.x?

Sorry for the noise and the long delay.

--
 Best regards, TEMHOTA-RIPN aka MJA13-RIPE
 System Administrator. mailto:temnota at kmv.ru

From cpwright at cpwright.com Thu Jan 12 17:07:51 2006
From: cpwright at cpwright.com (Charles P. Wright)
Date: Thu, 12 Jan 2006 12:07:51 -0500
Subject: Extended Attribute Write Performance
Message-ID: <1137085671.30101.0.camel@localhost.localdomain>

Hello,

I'm writing an application that makes pretty extensive use of extended
attributes to store file attributes on Ext2. I used a profiling tool
developed by my colleague Nikolai Joukov at SUNY Stony Brook to dig a
bit deeper into the performance of my application.

In the course of my benchmark, there are 54247 setxattr operations
during a 54-second run. They use about 10.56 seconds of the time, which
seemed to be a rather outsized performance toll to me (~40k writes took
only 10% as long).

After looking at the profile, 27 of those writes end up taking 7.74
seconds. That works out to roughly 286 ms per call, which seems a bit
high.

The workload is not memory constrained (the working set is 50MB + 5000
files). Each file has one extended attribute block that contains two
attributes totaling 32 bytes. The attributes are unique (random
actually), so there isn't any sharing.

Can someone provide me with some intuition as to why there are so many
writes that reach the disk, and why they take so long? I would expect
that the operations shouldn't take much longer than a seek (on the order
of 10ms, not 200+)?
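The figures in the message above can be cross-checked with a few lines
of arithmetic. This is a sketch that uses only the numbers quoted in
the message; the variable names are made up for illustration.

```python
total_calls = 54247      # setxattr operations in the benchmark
total_seconds = 10.56    # time spent in all setxattr calls
slow_calls = 27          # the outliers seen in the profile
slow_seconds = 7.74      # time spent in those outliers

ms_per_slow_call = slow_seconds / slow_calls * 1000
ms_per_other_call = ((total_seconds - slow_seconds)
                     / (total_calls - slow_calls) * 1000)
share_of_calls = slow_calls / total_calls * 100   # percent of calls
share_of_time = slow_seconds / total_seconds * 100  # percent of time

print(f"slow calls:  {ms_per_slow_call:.1f} ms each")
print(f"other calls: {ms_per_other_call:.3f} ms each")
print(f"{share_of_calls:.2f}% of the calls use {share_of_time:.0f}% of the time")
```

Put this way, the bulk of the calls never wait for the disk at all,
while a tiny fraction each wait far longer than a single seek.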
Charles From adilger at clusterfs.com Thu Jan 12 19:52:03 2006 From: adilger at clusterfs.com (Andreas Dilger) Date: Thu, 12 Jan 2006 12:52:03 -0700 Subject: Extended Attribute Write Performance In-Reply-To: <1137085671.30101.0.camel@localhost.localdomain> References: <1137085671.30101.0.camel@localhost.localdomain> Message-ID: <20060112195203.GD3682@schatzie.adilger.int> On Jan 12, 2006 12:07 -0500, Charles P. Wright wrote: > I'm writing an application that makes pretty extensive use of extended > attributes to store file attributes on Ext2. I used a profiling tool > developed by my colleague Nikolai Joukov at SUNY Stony Brook to dig a > bit deeper into the performance of my application. Presumably you are using ext3 and not ext2, given posting to this list? > In the course of my benchmark, there are 54247 setxattr operations > during a 54 seconds. They use about 10.56 seconds of the time, which > seemed to be a rather outsized performance toll to me (~40k writes took > only 10% as long). > > After looking at the profile, 27 of those writes end up taking 7.74 > seconds. That works out to roughly 286 ms per call; which seems a bit > high. > > The workload is not memory constrained (the working set is 50MB + 5000 > files). Each file has one extended attribute block that contains two > attributes totaling 32 bytes. The attributes are unique (random > actually), so there isn't any sharing. > > Can someone provide me with some intuition as to why there are so many > writes that reach the disk, and why they take so long. I would expect > that the operations shouldn't take much longer than a seek (on the order > of 10ms, not 200+)? I suspect the reason is that the journal is getting full and jbd is doing a full journal checkpoint because it has run out of space for new transactions. This is because using external EA blocks consume a lot of space (4kB) regardless of how small the EA is, and this can eat up the journal quickly. 
54247 * 4kB = 211MB, much larger than the default 32MB (or maybe 128MB with newer e2fsprogs) journal size. Solutions to your specific problem are to use large inodes and the fast EA space ("mke2fs -j -I 256 ..." makes 256-byte inodes, 128 bytes left for EAs) and/or increasing the journal size ("mke2fs -J size=400", though even 400MB won't be enough for this test case). We implemented the large inodes + fast EAs (included in 2.6.12+ kernels) to avoid the need to do any seeking when reading/writing EAs, in addition to the benefit of not writing so much data (mostly unused) to disk. This showed a huge performance increase for Lustre metadata servers (which use EAs on every file) and also with Samba4 testing. We've run into similar problems recently with test loads that are generating a lot of dirty metadata. The real solution is to fix the jbd layer not to be so aggressive about flushing out the whole journal when it runs out of space, as this introduces gigantic latencies. It should instead only clear out a smaller amount of space in order to allow the new transaction to start and it can again do the checkpoint in the background. Not sure when we'll be able to work on that. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From adilger at clusterfs.com Fri Jan 13 06:54:17 2006 From: adilger at clusterfs.com (Andreas Dilger) Date: Thu, 12 Jan 2006 23:54:17 -0700 Subject: Extended Attribute Write Performance In-Reply-To: <1137117134.8569.7.camel@polarbear.fsl.cs.sunysb.edu> References: <1137085671.30101.0.camel@localhost.localdomain> <20060112195203.GD3682@schatzie.adilger.int> <1137117134.8569.7.camel@polarbear.fsl.cs.sunysb.edu> Message-ID: <20060113065417.GD6006@schatzie.adilger.int> On Jan 12, 2006 20:52 -0500, Charles P. Wright wrote: > On Thu, 2006-01-12 at 12:52 -0700, Andreas Dilger wrote: > > Presumably you are using ext3 and not ext2, given posting to this list? > > Actually this test case was on Ext2, not Ext3. 
I did a quick search for > an ext2-users list and didn't immediately see results, so I figured that > as Ext2 and Ext3 have similar EA implementations, this list would be > appropriate. There is ext2-devel at lists.sourceforge.net, which is listed in the MAINTAINERS file for ext2... You are right that the same people read both lists. > > Solutions to your specific problem are to use large inodes and the > > fast EA space ("mke2fs -j -I 256 ..." makes 256-byte inodes, 128 bytes > > left for EAs) > > Increasing the inode size to 256 bytes made a huge difference under > Ext3. The spikes that I mentioned for Ext2 also existed in Ext3, and > were eliminated by this change. My application's performance increased > by about 40%, and the standard deviations dropped from around 20% to 4%. > > However, for Ext2 it made very little difference. I still have a > handful of operations (.05%) that account for 73% of the time. I know > that Ext2 is optimized for shared attribute blocks (for the case of > ACLs). Is there something about having lots of unique attributes that > results in poor performance? There is no support for fast EAs in ext2 at this time, so it would only slow things down there because you are writing more (useless) data to disk. I honestly have no ideas about ext2 performance, as I only ever use ext3. I would suspect that some of these operations are slower because they are "stuck" with doing some extra amount of work, like reading a bitmap from disk, and the rest of the operations are going to cache. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From cwright at ic.sunysb.edu Fri Jan 13 01:52:14 2006 From: cwright at ic.sunysb.edu (Charles P. 
Wright) Date: Thu, 12 Jan 2006 20:52:14 -0500 Subject: Extended Attribute Write Performance In-Reply-To: <20060112195203.GD3682@schatzie.adilger.int> References: <1137085671.30101.0.camel@localhost.localdomain> <20060112195203.GD3682@schatzie.adilger.int> Message-ID: <1137117134.8569.7.camel@polarbear.fsl.cs.sunysb.edu> Andreas, Thanks for your helpful reply. On Thu, 2006-01-12 at 12:52 -0700, Andreas Dilger wrote: > On Jan 12, 2006 12:07 -0500, Charles P. Wright wrote: > > I'm writing an application that makes pretty extensive use of extended > > attributes to store file attributes on Ext2. I used a profiling tool > > developed by my colleague Nikolai Joukov at SUNY Stony Brook to dig a > > bit deeper into the performance of my application. > > Presumably you are using ext3 and not ext2, given posting to this list? Actually this test case was on Ext2, not Ext3. I did a quick search for an ext2-users list and didn't immediately see results, so I figured that as Ext2 and Ext3 have similar EA implementations, this list would be appropriate. > > In the course of my benchmark, there are 54247 setxattr operations > > during a 54 seconds. They use about 10.56 seconds of the time, which > > seemed to be a rather outsized performance toll to me (~40k writes took > > only 10% as long). > > > > After looking at the profile, 27 of those writes end up taking 7.74 > > seconds. That works out to roughly 286 ms per call; which seems a bit > > high. > > > > The workload is not memory constrained (the working set is 50MB + 5000 > > files). Each file has one extended attribute block that contains two > > attributes totaling 32 bytes. The attributes are unique (random > > actually), so there isn't any sharing. > > > > Can someone provide me with some intuition as to why there are so many > > writes that reach the disk, and why they take so long. I would expect > > that the operations shouldn't take much longer than a seek (on the order > > of 10ms, not 200+)? 
> > I suspect the reason is that the journal is getting full and jbd is > doing a full journal checkpoint because it has run out of space for > new transactions. This is because using external EA blocks consume > a lot of space (4kB) regardless of how small the EA is, and this can > eat up the journal quickly. 54247 * 4kB = 211MB, much larger than > the default 32MB (or maybe 128MB with newer e2fsprogs) journal size. > > Solutions to your specific problem are to use large inodes and the > fast EA space ("mke2fs -j -I 256 ..." makes 256-byte inodes, 128 bytes > left for EAs) and/or increasing the journal size ("mke2fs -J size=400", > though even 400MB won't be enough for this test case). Increasing the inode size to 256 bytes made a huge difference under Ext3. The spikes that I mentioned for Ext2 also existed in Ext3, and were eliminated by this change. My application's performance increased by about 40%, and the standard deviations dropped from around 20% to 4%. However, for Ext2 it made very little difference. I still have a handful of operations (.05%) that account for 73% of the time. I know that Ext2 is optimized for shared attribute blocks (for the case of ACLs). Is there something about having lots of unique attributes that results in poor performance? > We implemented the large inodes + fast EAs (included in 2.6.12+ kernels) > to avoid the need to do any seeking when reading/writing EAs, in addition > to the benefit of not writing so much data (mostly unused) to disk. > This showed a huge performance increase for Lustre metadata servers > (which use EAs on every file) and also with Samba4 testing. I can see why, especially on a journalled file system. 
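The journal estimate quoted above is easy to reproduce. The sketch
below assumes, as the quoted reply does, that every setxattr call
dirties one full 4 kB external EA block; the function name is made up
for illustration, and the two journal sizes are the ones mentioned in
the reply.

```python
KIB_PER_MIB = 1024


def ea_block_traffic_mib(operations, block_kib=4):
    """MiB of EA blocks dirtied if each operation rewrites one whole block."""
    return operations * block_kib / KIB_PER_MIB


traffic = ea_block_traffic_mib(54247)
print(f"EA block traffic: {traffic:.1f} MiB")
for journal_mib in (32, 128):
    ratio = traffic / journal_mib
    print(f"{journal_mib} MiB journal: traffic is {ratio:.1f}x the journal size")
```

With 256-byte inodes the two small attributes fit inside the inode, so
this per-call 4 kB cost is what goes away.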
Thanks, Charles From agruen at suse.de Sun Jan 15 03:12:46 2006 From: agruen at suse.de (Andreas Gruenbacher) Date: Sun, 15 Jan 2006 04:12:46 +0100 Subject: Extended Attribute Write Performance In-Reply-To: <1137117134.8569.7.camel@polarbear.fsl.cs.sunysb.edu> References: <1137085671.30101.0.camel@localhost.localdomain> <20060112195203.GD3682@schatzie.adilger.int> <1137117134.8569.7.camel@polarbear.fsl.cs.sunysb.edu> Message-ID: <200601150412.46782.agruen@suse.de> On Friday 13 January 2006 02:52, Charles P. Wright wrote: > Increasing the inode size to 256 bytes made a huge difference under > Ext3. The spikes that I mentioned for Ext2 also existed in Ext3, and > were eliminated by this change. My application's performance increased > by about 40%, and the standard deviations dropped from around 20% to 4%. > > However, for Ext2 it made very little difference. I still have a > handful of operations (.05%) that account for 73% of the time. I know > that Ext2 is optimized for shared attribute blocks (for the case of > ACLs). Is there something about having lots of unique attributes that > results in poor performance? Without fast xattrs (i.e., bigger inodes), unique attributes can consume lots of memory, you will end up writing entire blocks for each xattr change, and you may also waste a considerable amount of disk space. You already noticed that ext3 fast xattrs are much faster for small attributes, no matter if they are unique or not. Ext2 does not have fast xattr support; it most likely never will. There, the extra space is just wasted and you'll see about the same performance no matter which inode size you choose. 
Regards, Andreas From satimis at yahoo.com Tue Jan 17 06:31:54 2006 From: satimis at yahoo.com (Stephen Liu) Date: Tue, 17 Jan 2006 14:31:54 +0800 (CST) Subject: Mounting problem Message-ID: <20060117063154.17930.qmail@web34715.mail.mud.yahoo.com> Hi folks, For unknown cause I encounter following mounting problem; # /mnt/hda8 mount: wrong fs type, bad option, bad superblock on /dev/hda8, missing codepage or other error In some cases useful info is found in syslog - try dmesg | tail or so # dmesg | tail via82cxxx: timeout while reading AC97 codec (0x9A0000) via82cxxx: timeout while reading AC97 codec (0x9A0000) via82cxxx: timeout while reading AC97 codec (0x9A0000) via82cxxx: timeout while reading AC97 codec (0x9A0000) via82cxxx: timeout while reading AC97 codec (0x9A0000) via82cxxx: timeout while reading AC97 codec (0x9A0000) via82cxxx: timeout while reading AC97 codec (0x9A0000) EXT3-fs: hda8: couldn't mount because of unsupported optional features (2000200). EXT3-fs: hda8: couldn't mount because of unsupported optional features (2000200) Previously this partition can be mounted without problem. I also tried adding -t ext3 without result still having the same warning. # e2fsck -f /dev/hda8 e2fsck 1.38 (30-Jun-2005) e2fsck: Filesystem revision too high while trying to open dev/hda8 The filesystem revision is apparently too high for this version of e2fsck. (Or the filesystem superblock is corrupt) The superblock could not be read or does not describe a correct ext2 filesystem. 
If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock: e2fsck -b 8193 # e2fsck -f 8193 /dev/hda8 Usage: e2fsck [-panyrcdfvstDFSV] [-b superblock] [-B blocksize] [-I inode_buffer_blocks] [-P process_inode_size] [-l|-L bad_blocks_file] [-C fd] [-j external_journal] [-E extended-options] device Emergency help: -p Automatic repair (no questions) -n Make no changes to the filesystem -y Assume "yes" to all questions -c Check for bad blocks and add them to the badblock list -f Force checking even if filesystem is marked clean -v Be verbose -b superblock Use alternative superblock -B blocksize Force blocksize when looking for superblock -j external_journal Set location of the external journal -l bad_blocks_file Add to badblocks list -L bad_blocks_file Set badblocks list Please advise how to fix the problem. TIA B.R. SL From adilger at clusterfs.com Tue Jan 17 07:11:46 2006 From: adilger at clusterfs.com (Andreas Dilger) Date: Tue, 17 Jan 2006 00:11:46 -0700 Subject: Mounting problem In-Reply-To: <20060117063154.17930.qmail@web34715.mail.mud.yahoo.com> References: <20060117063154.17930.qmail@web34715.mail.mud.yahoo.com> Message-ID: <20060117071146.GB8009@schatzie.adilger.int> On Jan 17, 2006 14:31 +0800, Stephen Liu wrote: > # dmesg | tail > via82cxxx: timeout while reading AC97 codec (0x9A0000) > via82cxxx: timeout while reading AC97 codec (0x9A0000) > via82cxxx: timeout while reading AC97 codec (0x9A0000) > EXT3-fs: hda8: couldn't mount because of unsupported optional features > (2000200). > EXT3-fs: hda8: couldn't mount because of unsupported optional features > (2000200) It certainly looks like your disk is corrupted with "0x0200" data. I'm not sure where that would come from. 
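The "0x0200" reading becomes easier to see once the value from the
kernel message is split up. The sketch below assumes the number in
parentheses is a hexadecimal bit mask, which is how the reply above
treats it; the helper name is made up for illustration.

```python
def set_bits(mask):
    """Return each set bit of mask as its own hex string, lowest bit first."""
    return [hex(1 << i) for i in range(mask.bit_length()) if (mask >> i) & 1]


features = 0x2000200  # value printed in the EXT3-fs message

print(set_bits(features))
print(hex(features >> 16), hex(features & 0xFFFF))
```

Both 16-bit halves of the value come out as 0x200, which fits a
repeating 0x0200 pattern written over the superblock rather than a
deliberately set feature flag.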
Please attach output from: dd if=/dev/hda8 bs=4k count=1 | gzip -9 > /tmp/hda8-sb.gz > # e2fsck -f /dev/hda8 > e2fsck 1.38 (30-Jun-2005) > e2fsck: Filesystem revision too high while trying to open dev/hda8 > The filesystem revision is apparently too high for this version of > e2fsck. (Or the filesystem superblock is corrupt) > > The superblock could not be read or does not describe a correct ext2 > filesystem. If the device is valid and it really contains an ext2 > filesystem (and not swap or ufs or something else), then the superblock > is corrupt, and you might try running e2fsck with an alternate > superblock: > e2fsck -b 8193 I believe modern e2fsck's already try the backup superblocks automatically, but I could be wrong. In any case, the number after "-b" usually depends on the size of the filesystem. For smaller filesystems (< 512MB) it is 8193 or 24577 or 8192 * {3,5,7}^n + 1. For larger filesystems it is 32768 or 98304 or 32768 * {3,5,7}^n by default. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From satimis at yahoo.com Tue Jan 17 10:59:11 2006 From: satimis at yahoo.com (Stephen Liu) Date: Tue, 17 Jan 2006 18:59:11 +0800 (CST) Subject: Mounting problem In-Reply-To: <20060117071146.GB8009@schatzie.adilger.int> Message-ID: <20060117105911.48558.qmail@web34705.mail.mud.yahoo.com> Hi Andreas, Tks for your advice. - snip - > I'm > not sure where that would come from. Please attach output from: > > dd if=/dev/hda8 bs=4k count=1 | gzip -9 > /tmp/hda8-sb.gz # dd if=/dev/hda8 bs=4k count=1 | gzip -9 > /tmp/hda8-sb.gz 1+0 records in 1+0 records out 4096 bytes transferred in 0.016240 seconds (252216 bytes/sec) - snip - > For smaller filesystems (< 512MB) it is > 8193 or 24577 or 8192 * {3,5,7}^n + 1. For larger filesystems > it is > 32768 or 98304 or 32768 * {3,5,7}^n by default. This partition is about 5~6G, if I recall correctly. Which number shall I use to test. Besides do I need to backup this partition before test?
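The rule Andreas gives for the number after "-b" can be sketched in a few lines of Python. This is only an illustration, assuming the sparse_super layout (backup superblocks in group 1 and in groups that are powers of 3, 5 and 7) and the usual defaults of 8192 blocks per group with first data block 1 for 1 KiB blocks, and 32768 blocks per group with first data block 0 for 4 KiB blocks. The function names are made up for this example and are not part of e2fsprogs; "mke2fs -n" or "dumpe2fs" remain the authoritative source for a real filesystem.

```python
def backup_groups(group_count):
    """Block groups holding a backup superblock under sparse_super:
    group 1 plus every power of 3, 5 and 7 below group_count."""
    groups = {1} if group_count > 1 else set()
    for base in (3, 5, 7):
        power = base
        while power < group_count:
            groups.add(power)
            power *= base
    return sorted(groups)


def backup_superblocks(group_count, blocks_per_group, first_data_block):
    """Block numbers that could be passed to 'e2fsck -b'."""
    return [group * blocks_per_group + first_data_block
            for group in backup_groups(group_count)]


# Hypothetical 10-group filesystems, one per common block size.
print(backup_superblocks(10, 8192, 1))    # 1 KiB blocks: [8193, 24577, 40961, 57345, 73729]
print(backup_superblocks(10, 32768, 0))   # 4 KiB blocks: [32768, 98304, 163840, 229376, 294912]
```

For a partition of 5 to 6 GB created with default options, the 4 KiB case would normally apply, so 32768 would be the first candidate to try.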
Partition /dev/hda7 has about 5.6G space available. # mount /mnt/hda7 # df -hT /mnt/hda7 Filesystem Type Size Used Avail Use% Mounted on /UNIONFS/dev/hda7 ext3 5.6G 33M 5.3G 1% /mnt/hda7 Would running # dd if=/dev/hda8 of=/dev/hda7 backup its content to /dev/hda7. Please advise. TIA Remark: there are several non-important working files on /dev/hda7. To overwrite them has no problem. The data on /dev/hda8 is about 900MB B.R. SL From satimis at yahoo.com Wed Jan 18 02:11:15 2006 From: satimis at yahoo.com (Stephen Liu) Date: Wed, 18 Jan 2006 10:11:15 +0800 (CST) Subject: Mounting problem In-Reply-To: <20060117071146.GB8009@schatzie.adilger.int> Message-ID: <20060118021115.93572.qmail@web34702.mail.mud.yahoo.com> Hi Damian and Andreas, Tks for your advice. The file is attached to this posting. B.R. SL --- Damian Menscher wrote: > On Tue, 17 Jan 2006, Stephen Liu wrote: > > >> I'm > >> not sure where that would come from. Please attach output from: > >> > >> dd if=/dev/hda8 bs=4k count=1 | gzip -9 > /tmp/hda8-sb.gz > > > > # dd if=/dev/hda8 bs=4k count=1 | gzip -9 > /tmp/hda8-sb.gz > > 1+0 records in > > 1+0 records out > > 4096 bytes transferred in 0.016240 seconds (252216 bytes/sec) > > I think Andreas meant for you to attach the /tmp/hda8-sb.gz file that > > command created. He can then analyze the file to see what went > wrong. > > Damian Menscher -------------- next part -------------- A non-text attachment was scrubbed... Name: hda8-sb.gz Type: application/x-gunzip Size: 202 bytes Desc: 3092802863-hda8-sb.gz URL: From evoltech at 2inches.com Thu Jan 19 00:16:13 2006 From: evoltech at 2inches.com (Dennis Williams) Date: Wed, 18 Jan 2006 16:16:13 -0800 (PST) Subject: ext3 fs errors 3T fs Message-ID: <20060118160515.M49352@periphery.2inches.com> Hello, I looked through the archives a bit and could not find anything relevant, if you know otherwise please point me in the right direction. 
I have a ~3T ext3 filesystem on linux software raid that had been behaving corectly for sometime. Not to long ago it gave the following error after trying to mount it: mount: wrong fs type, bad option, bad superblock on /dev/md0, or too many mounted file systems after a long fsck which I had to do manually I noticed the following in /var/log/messages after trying to mount again: Jan 19 09:13:11 terrorbytes kernel: EXT3-fs error (device md0): ext3_check_descriptors: Block bitmap for group 3584 not in group (block 0)! Jan 19 09:13:11 terrorbytes kernel: EXT3-fs: group descriptors corrupted ! when trying to correct again with e2fsck I get this error: e2fsck 1.34 (25-Jul-2003) Group descriptors look bad... trying backup blocks... e2fsck: Invalid argument while checking ext3 journal for /dev/md0 some more information on the system: os flavor: Suse 9.1 kernel version: 2.6.5-7.202.7-default (various suse patches applied to 2.6.5 kernel) I am not sure where to go from here, any help, experience, or references to documentation that would help me better understand the problem would be apreciated. Sincerely, Dennison Williams "And for all the good or evil, creation or destruction your living might have of accomplished, you might have just never have lived at all" -The Sleeping Beauty From adilger at clusterfs.com Thu Jan 19 10:32:42 2006 From: adilger at clusterfs.com (Andreas Dilger) Date: Thu, 19 Jan 2006 03:32:42 -0700 Subject: Mounting problem In-Reply-To: <20060118021115.93572.qmail@web34702.mail.mud.yahoo.com> References: <20060117071146.GB8009@schatzie.adilger.int> <20060118021115.93572.qmail@web34702.mail.mud.yahoo.com> Message-ID: <20060119103242.GT4124@schatzie.adilger.int> On Jan 18, 2006 10:11 +0800, Stephen Liu wrote: > Tks for your advice. The file is attached to this posting. 
It is pretty clear that there is some type of single-bit corruption with your disk:

000000 00000000 00000000 00000000 00000000
*
000400 0009f400 0013e0e1 0000fc8b 000ef82c
000410 0009a4c4 00000000 00000002 00000002
000420 00008000 00008000 00003dc0 43cc70d6
000430 43cc76cc 02250015 0203ef53 02000201
000440 43ad2a1a 02ed4e00 02000200 02000201
000450 02000200 0200020b 02000280 02000204
000460 02000206 02000201 2be35aee d34ae63d
000470 6eeb1bb3 c368ff68 02000200 02000200
000480 02000200 02000200 02000200 02000200
*
0004e0 02000208 02000200 02000200 5a46b3f1
0004f0 87471f23 6f6ed694 87f63eb6 02000302
000500 02000200 02000200 43ad2a1a 02000207
000510 02000208 02000209 0200020a 0200020b
000520 0200020c 0200020d 0200020e 0200020f
000530 02000210 02000211 02000212 02000213
000540 02000614 02000200 02000200 02000200
000550 02000200 02000200 02000200 02000200
*
001000

Note that all of the "02000200" bits are set for most of the superblock. I'd suspect either something bad in the controller or maybe a cable. If this is present throughout the disk then there is nothing that can be done about it, except restore from backup. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From adilger at clusterfs.com Thu Jan 19 12:26:39 2006 From: adilger at clusterfs.com (Andreas Dilger) Date: Thu, 19 Jan 2006 05:26:39 -0700 Subject: ext3 fs errors 3T fs In-Reply-To: <20060118160515.M49352@periphery.2inches.com> References: <20060118160515.M49352@periphery.2inches.com> Message-ID: <20060119122639.GW4124@schatzie.adilger.int> On Jan 18, 2006 16:16 -0800, Dennis Williams wrote: > I looked through the archives a bit and could not find anything relevant, > if you know otherwise please point me in the right direction. > > I have a ~3T ext3 filesystem on linux software raid that had been behaving > corectly for sometime.
Not to long ago it gave the following error after > trying to mount it: > > mount: wrong fs type, bad option, bad superblock on /dev/md0, > or too many mounted file systems This sounds like the superblock has been overwritten. There are occasional reports from > 2TB filesystem users of similar corruption. It isn't clear if the problem exists in ext3 or if it is in the block or SCSI layer. > some more information on the system: > os flavor: Suse 9.1 > kernel version: 2.6.5-7.202.7-default (various suse patches applied to > 2.6.5 kernel) RHEL4 (2.6.9) claims support for up to 8TB filesystems. I don't know what patches they made, if any, in order to have this working. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From menscher at uiuc.edu Thu Jan 19 16:35:56 2006 From: menscher at uiuc.edu (Damian Menscher) Date: Thu, 19 Jan 2006 10:35:56 -0600 (CST) Subject: ext3 fs errors 3T fs In-Reply-To: <20060119122639.GW4124@schatzie.adilger.int> References: <20060118160515.M49352@periphery.2inches.com> <20060119122639.GW4124@schatzie.adilger.int> Message-ID: On Thu, 19 Jan 2006, Andreas Dilger wrote: > On Jan 18, 2006 16:16 -0800, Dennis Williams wrote: >> >> I have a ~3T ext3 filesystem on linux software raid that had been behaving >> corectly for sometime. Not to long ago it gave the following error after >> trying to mount it: >> >> mount: wrong fs type, bad option, bad superblock on /dev/md0, >> or too many mounted file systems > > This sounds like the superblock has been overwritten. There are occasional > reports from > 2TB filesystem users of similar corruption. It isn't clear > if the problem exists in ext3 or if it is in the block or SCSI layer. > >> some more information on the system: >> os flavor: Suse 9.1 >> kernel version: 2.6.5-7.202.7-default (various suse patches applied to >> 2.6.5 kernel) 32bit or 64bit? > RHEL4 (2.6.9) claims support for up to 8TB filesystems. 
I don't know what > patches they made, if any, in order to have this working. FWIW, when we first tried using a >2TB filesystem on linux (I think it was FC3 at the time), we discovered filesystem corruption once data had been written past the 2TB mark on a 32-bit machine. I'm guessing this is what you're seeing also. We have been using (and filling) >2TB filesystems on 64-bit machines (FC4 and RHEL4) for some time now without problems. Note that we didn't bother doing a detailed analysis of configurations, but rather tried a couple of variations until we found one that worked, so this could be a red herring (for those not familiar with the term, that means a clue that leads you in the wrong direction). Damian Menscher -- -=#| www.uiuc.edu/~menscher/ Ofc:(650)253-2757 |#=- -=#| The above opinions are not necessarily those of my employers. |#=- From evoltech at 2inches.com Thu Jan 19 21:25:29 2006 From: evoltech at 2inches.com (Dennis Williams) Date: Thu, 19 Jan 2006 13:25:29 -0800 (PST) Subject: ext3 fs errors 3T fs In-Reply-To: <20060118160515.M49352@periphery.2inches.com> References: <20060118160515.M49352@periphery.2inches.com> Message-ID: <20060119125736.Q66109@periphery.2inches.com> This is a 64 bit system running a 64 bit kernel. After reading through the manpage for e2fsck a bit I noticed that mke2fs can be used to determine additional superblock backups with the -n flag. Not knowing how the fs was created I assumed that the default blocksize was used. terrorbytes:~ # mke2fs -n /dev/md0 mke2fs 1.34 (25-Jul-2003) warning: 160 blocks unused. 
Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) 403685856 inodes, 805797888 blocks 40289902 blocks (5.00%) reserved for the super user First data block=0 24591 block groups 32768 blocks per group, 32768 fragments per group 16416 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 102400000, 214990848, 512000000, 550731776, 644972544 terrorbytes:~ # e2fsck -yb 229376 /dev/md0 The system has now been corecting errors for the past 12 hours. I hope when it finishes, it will mount without complaints. Sincerely, Dennis Williams "And for all the good or evil, creation or destruction your living might have of accomplished, you might have just never have lived at all" -From: The Sleeping Beauty On Wed, 18 Jan 2006, Dennis Williams wrote: > Hello, > I looked through the archives a bit and could not find anything relevant, > if you know otherwise please point me in the right direction. > > I have a ~3T ext3 filesystem on linux software raid that had been behaving > corectly for sometime. Not to long ago it gave the following error after > trying to mount it: > > mount: wrong fs type, bad option, bad superblock on /dev/md0, > or too many mounted file systems > > after a long fsck which I had to do manually I noticed the following in > /var/log/messages after trying to mount again: > > Jan 19 09:13:11 terrorbytes kernel: EXT3-fs error (device md0): > ext3_check_descriptors: Block bitmap for group 3584 not in group (block > 0)! > Jan 19 09:13:11 terrorbytes kernel: EXT3-fs: group descriptors corrupted ! > > when trying to correct again with e2fsck I get this error: > > e2fsck 1.34 (25-Jul-2003) > Group descriptors look bad... trying backup blocks... 
> e2fsck: Invalid argument while checking ext3 journal for /dev/md0 > > some more information on the system: > os flavor: Suse 9.1 > kernel version: 2.6.5-7.202.7-default (various suse patches applied to > 2.6.5 kernel) > > I am not sure where to go from here, any help, experience, or references > to documentation that would help me better understand the problem would be > apreciated. > > Sincerely, > Dennison Williams > > "And for all the good or evil, creation or destruction > your living might have of accomplished, you might have > just never have lived at all" > -The Sleeping Beauty > > _______________________________________________ > Ext3-users mailing list > Ext3-users at redhat.com > https://www.redhat.com/mailman/listinfo/ext3-users > From satimis at yahoo.com Fri Jan 20 15:45:54 2006 From: satimis at yahoo.com (Stephen Liu) Date: Fri, 20 Jan 2006 23:45:54 +0800 (CST) Subject: Mounting problem In-Reply-To: <20060119103242.GT4124@schatzie.adilger.int> Message-ID: <20060120154554.79769.qmail@web34712.mail.mud.yahoo.com> Hi Andreas, Tks for your advice. > It is pretty clear that there is some type of > single-bit corruption with your disk: - snip - > I'd suspect either something bad in the controller > or maybe a cable. I recall that the screen once hung, forcing me to do a hard reboot. Later I found that the cause was a bad contact on the hard disk's power cable. I got the problem fixed after running: # fsck.ext3 -b 32768 /dev/hda8 and answering several questions. Partition /dev/hda8 is now working. Tks again for your assistance. B.R. SL From evoltech at 2inches.com Fri Jan 20 17:22:03 2006 From: evoltech at 2inches.com (Dennis Williams) Date: Fri, 20 Jan 2006 09:22:03 -0800 (PST) Subject: ext3 fs errors 3T fs In-Reply-To: References: Message-ID: <20060120091159.H84552@periphery.2inches.com> > > The system has now been corecting errors for the past 12 hours. I hope > > when it finishes, it will mount without complaints. > > Never belive fsck here.
It may check heavy corrupted filesystems serval DAYS. > For me (corrupted 120 Gb ext3 partition) "fsck.ext3 -y" work 3 days before i > interrupt it. In manual mode, avoid 'duplicate inode clone' and answer yes to > 'delete file' - only 30 minutes. > Just out of morbid curiosity what does 'duplicate inode clone' mean? And how does the fs get in that state? The fsck finished this morning with the following final statements: /dev/md0: ***** FILE SYSTEM WAS MODIFIED ***** /dev/md0: ********** WARNING: Filesystem still has errors ********** /dev/md0: 1472505/403685856 files (10.3% non-contiguous), 673983041/805797888 blocks 1) Why would the fs still have errors? Is it correct to assume that running fsck again is the answer? (I hope so) 2) What does the last line of this message mean? I did notice that the fs mounted correctly after this with the following errors in /var/log/messages: Jan 21 02:09:48 terrorbytes kernel: kjournald starting. Commit interval 5 seconds Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning (device md0): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning (device md0): ext3_clear_journal_err: Marking fs in need of filesystem check. Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning: mounting unchecked fs, running e2fsck is recommended Jan 21 02:09:48 terrorbytes kernel: EXT3 FS on md0, internal journal Jan 21 02:09:48 terrorbytes kernel: EXT3-fs: mounted filesystem with ordered data mode. after unmounting the filesystem, I ran a standard fsck again: terrorbytes:~ # e2fsck /dev/md0 e2fsck 1.34 (25-Jul-2003) /dev/md0 contains a file system with errors, check forced. Pass 1: Checking inodes, blocks, and sizes Thank you to everyone who has responded to my posts with thier suggestions. 
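On question 2: the last line of the e2fsck output is a usage summary. It gives inodes in use out of total inodes ("files"), the share of files whose blocks are not stored contiguously, and blocks in use out of total blocks. The Python sketch below pulls those numbers out of the line quoted in this message. The 4096-byte block size is an assumption taken from the "mke2fs -n" output earlier in the thread, and the regular expression and names are illustrative only, not part of e2fsprogs.

```python
import re

# Summary line exactly as quoted in this message.
SUMMARY = ("/dev/md0: 1472505/403685856 files (10.3% non-contiguous), "
           "673983041/805797888 blocks")

SUMMARY_RE = re.compile(
    r"(?P<device>\S+): (?P<inodes_used>\d+)/(?P<inodes_total>\d+) files "
    r"\((?P<noncontiguous>[\d.]+)% non-contiguous\), "
    r"(?P<blocks_used>\d+)/(?P<blocks_total>\d+) blocks"
)

ASSUMED_BLOCK_SIZE = 4096  # assumption: default block size, as in the mke2fs -n run


def parse_summary(line):
    """Split an e2fsck summary line into named numeric fields."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not an e2fsck summary line")
    fields = match.groupdict()
    return {
        "device": fields["device"],
        "inodes_used": int(fields["inodes_used"]),
        "inodes_total": int(fields["inodes_total"]),
        "noncontiguous_pct": float(fields["noncontiguous"]),
        "blocks_used": int(fields["blocks_used"]),
        "blocks_total": int(fields["blocks_total"]),
    }


info = parse_summary(SUMMARY)
inode_pct = 100.0 * info["inodes_used"] / info["inodes_total"]
block_pct = 100.0 * info["blocks_used"] / info["blocks_total"]
used_tib = info["blocks_used"] * ASSUMED_BLOCK_SIZE / 2**40

print(f"{info['device']}: {inode_pct:.2f}% of inodes in use")
print(f"{info['device']}: {block_pct:.1f}% of blocks in use, about {used_tib:.2f} TiB")
```

Read this way, the line says the filesystem is mostly full in terms of blocks while using only a small fraction of its inodes.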
Sincerely, Dennison Williams From evoltech at 2inches.com Sat Jan 21 07:07:07 2006 From: evoltech at 2inches.com (Dennis Williams) Date: Fri, 20 Jan 2006 23:07:07 -0800 (PST) Subject: ext3 fs errors 3T fs In-Reply-To: <20060120091159.H84552@periphery.2inches.com> References: <20060120091159.H84552@periphery.2inches.com> Message-ID: <20060120225709.A95489@periphery.2inches.com> Hello, After the fsck finished this evening there were no final statements refering to problems. I remounted the filesystem without any errors. After noticing that there were a number of files missing, I started to attempt to recover from the lost+found directory. I was repeatedly able to get the the filesystem to error and remount read only when find traversed a specific directory in lost+found. This is the error message I recieved from /var/log/messages: Jan 21 16:00:26 terrorbytes kernel: EXT3-fs error (device md0): ext3_readdir: bad entry in directory #73117155: directory entry across blocks - offset=0, inode=0, rec_len=8196, name_len=84 Jan 21 16:00:26 terrorbytes kernel: Aborting journal on device md0. Jan 21 16:00:26 terrorbytes kernel: ext3_abort called. Jan 21 16:00:26 terrorbytes kernel: EXT3-fs abort (device md0): ext3_journal_start: Detected aborted journal Jan 21 16:00:26 terrorbytes kernel: Remounting filesystem read-only 1) Can someone explain what this means, and or why it might happen? 2) Why this condition might exist even after a succesfull fsck? I am planning on running a fsck yet again. Sincerely, Dennis Williams On Fri, 20 Jan 2006, Dennis Williams wrote: > > > > The system has now been corecting errors for the past 12 hours. I hope > > > when it finishes, it will mount without complaints. > > > > Never belive fsck here. It may check heavy corrupted filesystems serval DAYS. > > For me (corrupted 120 Gb ext3 partition) "fsck.ext3 -y" work 3 days before i > > interrupt it. In manual mode, avoid 'duplicate inode clone' and answer yes to > > 'delete file' - only 30 minutes. 
> > > > Just out of morbid curiosity what does 'duplicate inode clone' mean? And > how does the fs get in that state? > > The fsck finished this morning with the following final statements: > > /dev/md0: ***** FILE SYSTEM WAS MODIFIED ***** > > /dev/md0: ********** WARNING: Filesystem still has errors ********** > > /dev/md0: 1472505/403685856 files (10.3% non-contiguous), > 673983041/805797888 blocks > > 1) Why would the fs still have errors? Is it correct to assume that > running fsck again is the answer? (I hope so) > > 2) What does the last line of this message mean? > > I did notice that the fs mounted correctly after this with the following > errors in /var/log/messages: > > Jan 21 02:09:48 terrorbytes kernel: kjournald starting. Commit interval 5 > seconds > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning (device md0): > ext3_clear_journal_err: Filesystem error recorded from previous mount: IO > failure > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning (device md0): > ext3_clear_journal_err: Marking fs in need of filesystem check. > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning: mounting unchecked > fs, running e2fsck is recommended > Jan 21 02:09:48 terrorbytes kernel: EXT3 FS on md0, internal journal > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs: mounted filesystem with > ordered data mode. > > after unmounting the filesystem, I ran a standard fsck again: > terrorbytes:~ # e2fsck /dev/md0 > e2fsck 1.34 (25-Jul-2003) > /dev/md0 contains a file system with errors, check forced. > Pass 1: Checking inodes, blocks, and sizes > > Thank you to everyone who has responded to my posts with thier > suggestions. 
> > Sincerely, > Dennison Williams > > _______________________________________________ > Ext3-users mailing list > Ext3-users at redhat.com > https://www.redhat.com/mailman/listinfo/ext3-users > From adilger at clusterfs.com Sun Jan 22 19:25:25 2006 From: adilger at clusterfs.com (Andreas Dilger) Date: Sun, 22 Jan 2006 12:25:25 -0700 Subject: ext3 fs errors 3T fs In-Reply-To: <20060120225709.A95489@periphery.2inches.com> References: <20060120091159.H84552@periphery.2inches.com> <20060120225709.A95489@periphery.2inches.com> Message-ID: <20060122192525.GL4124@schatzie.adilger.int> On Jan 20, 2006 23:07 -0800, Dennis Williams wrote: > After the fsck finished this evening there were no final statements > refering to problems. I remounted the filesystem without any errors. > After noticing that there were a number of files missing, I started to > attempt to recover from the lost+found directory. I was repeatedly able > to get the the filesystem to error and remount read only when find > traversed a specific directory in lost+found. This is the error message I > recieved from /var/log/messages: > > Jan 21 16:00:26 terrorbytes kernel: EXT3-fs error (device md0): > ext3_readdir: bad entry in directory #73117155: directory entry across > blocks - offset=0, inode=0, rec_len=8196, name_len=84 > Jan 21 16:00:26 terrorbytes kernel: Aborting journal on device md0. > Jan 21 16:00:26 terrorbytes kernel: ext3_abort called. > Jan 21 16:00:26 terrorbytes kernel: EXT3-fs abort (device md0): > ext3_journal_start: Detected aborted journal > Jan 21 16:00:26 terrorbytes kernel: Remounting filesystem read-only > > 1) Can someone explain what this means, and or why it might happen? > 2) Why this condition might exist even after a succesfull fsck? In case it wasn't clear before (I thought it was) you are having problems because this fs is > 2TB. Why, I'm not sure - it may relate to LVM/MD, it may be the block layer, or it may be an ext3 bug. 
The fact that it is at 2TB makes it seem like a block layer bug or lower. I would start by making a backup if you haven't already. I think debugging it would be easiest if you had a backup and were willing to overwrite the device with a test pattern. If you can isolate the corruption to a single file or dir, you may get some insight into the problem by running filefrag on it (or "stat {path}" in debugfs). > I am planning on running a fsck yet again. That won't prevent the problems from recurring. > > Sincerely, > Dennis Williams > > On Fri, 20 Jan 2006, Dennis Williams wrote: > > > > > > > The system has now been corecting errors for the past 12 hours. I hope > > > > when it finishes, it will mount without complaints. > > > > > > Never belive fsck here. It may check heavy corrupted filesystems serval DAYS. > > > For me (corrupted 120 Gb ext3 partition) "fsck.ext3 -y" work 3 days before i > > > interrupt it. In manual mode, avoid 'duplicate inode clone' and answer yes to > > > 'delete file' - only 30 minutes. > > > > > > > Just out of morbid curiosity what does 'duplicate inode clone' mean? And > > how does the fs get in that state? > > > > The fsck finished this morning with the following final statements: > > > > /dev/md0: ***** FILE SYSTEM WAS MODIFIED ***** > > > > /dev/md0: ********** WARNING: Filesystem still has errors ********** > > > > /dev/md0: 1472505/403685856 files (10.3% non-contiguous), > > 673983041/805797888 blocks > > > > 1) Why would the fs still have errors? Is it correct to assume that > > running fsck again is the answer? (I hope so) > > > > 2) What does the last line of this message mean? > > > > I did notice that the fs mounted correctly after this with the following > > errors in /var/log/messages: > > > > Jan 21 02:09:48 terrorbytes kernel: kjournald starting.
Commit interval 5 > > seconds > > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning (device md0): > > ext3_clear_journal_err: Filesystem error recorded from previous mount: IO > > failure > > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning (device md0): > > ext3_clear_journal_err: Marking fs in need of filesystem check. > > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning: mounting unchecked > > fs, running e2fsck is recommended > > Jan 21 02:09:48 terrorbytes kernel: EXT3 FS on md0, internal journal > > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs: mounted filesystem with > > ordered data mode. > > > > after unmounting the filesystem, I ran a standard fsck again: > > terrorbytes:~ # e2fsck /dev/md0 > > e2fsck 1.34 (25-Jul-2003) > > /dev/md0 contains a file system with errors, check forced. > > Pass 1: Checking inodes, blocks, and sizes > > > > Thank you to everyone who has responded to my posts with thier > > suggestions. > > > > Sincerely, > > Dennison Williams > > > > _______________________________________________ > > Ext3-users mailing list > > Ext3-users at redhat.com > > https://www.redhat.com/mailman/listinfo/ext3-users > > > > _______________________________________________ > Ext3-users mailing list > Ext3-users at redhat.com > https://www.redhat.com/mailman/listinfo/ext3-users Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From rkimber at ntlworld.com Mon Jan 23 13:16:40 2006 From: rkimber at ntlworld.com (R Kimber) Date: Mon, 23 Jan 2006 13:16:40 +0000 Subject: Oops Message-ID: <20060123131640.7a7c355b.rkimber@ntlworld.com> I don't know enough about it to know whether this is a known problem (I couldn't make much sense of what I found on Google), but it seems to be a journal-related issue. Is it likely that data has been corrupted? Do I need to take any action? 
2.6.12-9-amd64-k8-smp, Ubuntu 5.10, dual opteron 2GB Jan 23 03:08:22 infinity kernel: [24686.841032] Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: Jan 23 03:08:22 infinity kernel: [24686.841039] {:jbd:journal_commit_transaction+2594} Jan 23 03:08:22 infinity kernel: [24686.841064] PGD 6ea20067 PUD 6ea97067 PMD 0 Jan 23 03:08:22 infinity kernel: [24686.841069] Oops: 0000 [1] SMP Jan 23 03:08:22 infinity kernel: [24686.841073] CPU 1 Jan 23 03:08:22 infinity kernel: [24686.841075] Modules linked in: ext2 binfmt_misc ipt_limit iptable_mangle ipt_LOG ipt_MASQUERADE iptable_nat ipt_TOS ipt_REJECT ip_conntrack_irc ip_conntrack_ftp ipt_state ip_conntrack iptable_filter ip_tables ipv6 pcspkr snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_via82xx gameport snd_ac97_codec snd_mpu401_uart snd_rawmidi snd_seq_device bt878 snd_bt87x snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc tuner tvaudio bttv video_buf firmware_class i2c_algo_bit v4l2_common btcx_risc tveeprom videodev nls_iso8859_1 nls_cp437 vfat fat dm_mod tsdev evdev nvidia w83627hf eeprom i2c_sensor i2c_isa i2c_viapro i2c_core rtc psmouse mousedev parport_pc lp parport sd_mod md ext3 jbd mbcache thermal processor fan ide_cd cdrom usb_storage scsi_mod ehci_hcd uhci_hcd tg3 pdc202xx_new ide_disk ide_generic via82cxxx ide_core unix vesafb capability commoncap vga16fb vgastate softcursor cfbimgblt cfbfillrect cfbcopyarea fbcon tileblit font bitblit Jan 23 03:08:22 infinity kernel: [24686.841128] Pid: 3318, comm: kjournald Tainted: P 2.6.12-9-amd64-k8-smp Jan 23 03:08:22 infinity kernel: [24686.841132] RIP: 0010:[_end+130728930/2132406272] {:jbd:journal_commit_transaction+2594} Jan 23 03:08:22 infinity kernel: [24686.841146] RSP: 0018:ffff81007cf47d88 EFLAGS: 00010286 Jan 23 03:08:22 infinity kernel: [24686.841150] RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000035 Jan 23 03:08:22 infinity kernel: [24686.841155] RDX: ffff81004d06ada0 
RSI: ffff810071d0bb88 RDI: ffff810071d0bb88 Jan 23 03:08:22 infinity kernel: [24686.841159] RBP: ffff81004d06ae00 R08: 0000000000000015 R09: 0000000000000000 Jan 23 03:08:22 infinity kernel: [24686.841162] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 23 03:08:22 infinity kernel: [24686.841166] R13: ffff810064213d40 R14: ffff81007d4c4400 R15: 0000000000000000 Jan 23 03:08:22 infinity kernel: [24686.841171] FS: 00002aaaadd8ebe0(0000) GS:ffffffff804286c0(0000) knlGS:0000000000000000 Jan 23 03:08:22 infinity kernel: [24686.841175] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Jan 23 03:08:22 infinity kernel: [24686.841179] CR2: 0000000000000000 CR3: 000000006e619000 CR4: 00000000000006e0 Jan 23 03:08:22 infinity kernel: [24686.841184] Process kjournald (pid: 3318, threadinfo ffff81007cf46000, task ffff81007d6090b0) Jan 23 03:08:22 infinity kernel: [24686.841187] Stack: ffff81007d4c4424 ffff81007d4c455c 00000fd400000000 ffff810058bb002c Jan 23 03:08:22 infinity kernel: [24686.841196] 0000000002c0e4a0 ffff81007d628000 ffff81004d06ace0 0000000000001163 Jan 23 03:08:22 infinity kernel: [24686.841203] ffffffff8032c340 0000007300000003 Jan 23 03:08:22 infinity kernel: [24686.841208] Call Trace:{__wake_up+67} {:jbd:kjournald +276} Jan 23 03:08:22 infinity kernel: [24686.841260] {autoremove_wake_function+0} {autoremove_wake_function+0} Jan 23 03:08:22 infinity kernel: [24686.841283] {:jbd:commit_timeout+0} {child_rip+8} Jan 23 03:08:22 infinity kernel: [24686.841318] {:jbd:kjournald+0} {child_rip+0} Jan 23 03:08:22 infinity kernel: [24686.841350] Jan 23 03:08:22 infinity kernel: [24686.841357] Jan 23 03:08:22 infinity kernel: [24686.841358] Code: 8b 03 a8 04 74 18 8b 03 a8 04 75 07 8b 43 18 85 c0 75 e0 48 Jan 23 03:08:22 infinity kernel: [24686.841369] RIP {:jbd:journal_commit_transaction +2594} RSP Jan 23 03:08:22 infinity kernel: [24686.841382] CR2: 0000000000000000 Thanks -- Richard Kimber http://www.psr.keele.ac.uk/
From evoltech at 2inches.com Mon Jan 23 17:09:23 2006 From: evoltech at 2inches.com (Dennis Williams) Date: Mon, 23 Jan 2006 09:09:23 -0800 (PST) Subject: ext3 fs errors 3T fs In-Reply-To: <20060122192525.GL4124@schatzie.adilger.int> References: <20060120091159.H84552@periphery.2inches.com> <20060120225709.A95489@periphery.2inches.com> <20060122192525.GL4124@schatzie.adilger.int> Message-ID: <20060123085621.X57575@periphery.2inches.com> > In case it wasn't clear before (I thought it was) you are having problems > because this fs is > 2TB. Why, I'm not sure - it may relate to LVM/MD, > it may be the block layer, or it may be an ext3 bug. The fact that it is > at 2TB makes it seem like a block layer bug or lower. _you_ were clear, though others lead me to believe that on a 64 bit system I should have no problem with a ext3 fs > 2T, further more there are a number of claims on the Internet that ext3 should have no problem being > 2T. http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits. That being said though I do plan on rebuilding the raid + filesystem in chunks < 2T as soon as I get additional storage setup. > If you can isolate the corruption to a single file or dir, you may get some > insight into the problem by running filefrag on it (or "stat {path}" in > debugfs. I was able to isolate the problem to 2 different directories repeatedly. Both of them were in the lost+found directory. I ran "stat {path}" in debugfs. on them but did not see any info that stood out as abnormal. When I get access to the system again, I will repost the output.
> I think debugging it would be easiest if you had a backup and were > willing to overwrite the device with a test pattern. I would like to debug this situation when I get backup storage. What steps would you recommend to do this? Thanks again, to everyone who has offered suggestions. Sincerely, Dennison Williams From bladilo at rice.edu Wed Jan 25 01:12:46 2006 From: bladilo at rice.edu (Franco M. Bladilo) Date: Tue, 24 Jan 2006 19:12:46 -0600 Subject: EXT3: failed to claim external journal device. Message-ID: <43D6D08E.2060107@rice.edu> We are having problems remounting an ext3 filesystem using an external journal device. The filesystem in question was working fine until the server was rebooted. This is what we see on dmesg when trying to mount: EXT3: failed to claim external journal device. The external journal lives on a LVM2 logical volume and it seems to be accessible ( we can dumpe2fs and see filesystem information). Here's the system information and command line used to create the filesystem : SuSE SLES9 2 , kernel 2.6.5 ada718-5:/ # rpm -qa | grep e2fs e2fsprogs-1.36-6.2 ----------------------------------------- mke2fs -O journal_dev /dev/mapper/home_jou_vol_grp-home_jou 400000 mke2fs -E stride=16 -O sparse_super,dir_index -j -J device=/dev/mapper/home_jou_vol_grp-home_jou /dev/mapper/home_vol_grp-home Any ideas? Thanks in advance, -- Franco Bladilo Linux/HPCC Administrator Research Computing Support Group Rice University bladilo at rice.edu From adilger at clusterfs.com Wed Jan 25 10:05:09 2006 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 25 Jan 2006 03:05:09 -0700 Subject: EXT3: failed to claim external journal device. In-Reply-To: <43D6D08E.2060107@rice.edu> References: <43D6D08E.2060107@rice.edu> Message-ID: <20060125100509.GJ11642@schatzie.adilger.int> On Jan 24, 2006 19:12 -0600, Franco M. Bladilo wrote: > We are having problems remounting an ext3 filesystem using an external > journal device. 
The filesystem in question was working fine until the > server was rebooted. > This is what we see on dmesg when trying to mount: > EXT3: failed to claim external journal device. > The external journal lives on a LVM2 logical volume and it seems to be > accessible ( we can dumpe2fs and see filesystem information). > > Here's the system information and command line used to create the > filesystem : > SuSE SLES9 2 , kernel 2.6.5 > ada718-5:/ # rpm -qa | grep e2fs > e2fsprogs-1.36-6.2 > ----------------------------------------- > mke2fs -O journal_dev /dev/mapper/home_jou_vol_grp-home_jou 400000 > mke2fs -E stride=16 -O sparse_super,dir_index -j -J > device=/dev/mapper/home_jou_vol_grp-home_jou /dev/mapper/home_vol_grp-home > > Any ideas? I believe the kernel does the journal device lookup by the device major/minor, and those are not fixed for LVM devices. Bull recently posted a patch here for mount to automatically find the correct block device for this journal UUID. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. From johann.lombardi at bull.net Thu Jan 26 09:52:34 2006 From: johann.lombardi at bull.net (Johann Lombardi) Date: Thu, 26 Jan 2006 10:52:34 +0100 Subject: EXT3: failed to claim external journal device. In-Reply-To: <20060125100509.GJ11642@schatzie.adilger.int> References: <43D6D08E.2060107@rice.edu> <20060125100509.GJ11642@schatzie.adilger.int> Message-ID: <200601261052.35134.johann.lombardi@bull.net> > > Here's the system information and command line used to create the > > filesystem : > > SuSE SLES9 2 , kernel 2.6.5 > > ada718-5:/ # rpm -qa | grep e2fs > > e2fsprogs-1.36-6.2 > > ----------------------------------------- > > mke2fs -O journal_dev /dev/mapper/home_jou_vol_grp-home_jou 400000 > > mke2fs -E stride=16 -O sparse_super,dir_index -j -J > > device=/dev/mapper/home_jou_vol_grp-home_jou > > /dev/mapper/home_vol_grp-home > > > > Any ideas? 
> > I believe the kernel does the journal device lookup by the device > major/minor, and those are not fixed for LVM devices. If the filesystem was _cleanly_ unmounted, you can try to remove/reattach the external journal. It will update the superblock with the new major/minor numbers. You can proceed as follows: # tune2fs -f -O^has_journal /dev/mapper/home_vol_grp-home # tune2fs -J device=/dev/mapper/home_jou_vol_grp-home_jou /dev/mapper/home_vol_grp-home It will work until the journal device's major/minor numbers change again (the next reboot?). > Bull recently posted a patch here for mount to automatically find the > correct block device for this journal UUID. Actually, it was on ext2-devel: http://thread.gmane.org/gmane.comp.file-systems.ext2.devel/2950 Johann -------------- next part -------------- An embedded message was scrubbed... From: Pekka Enberg Subject: [Ext2-devel] [PATCH] ext2: return FSID for statvfs Date: Tue, 06 Dec 2005 22:22:48 +0200 Size: 5581 URL: From johann.lombardi at bull.net Thu Jan 26 10:20:22 2006 From: johann.lombardi at bull.net (Johann Lombardi) Date: Thu, 26 Jan 2006 11:20:22 +0100 Subject: EXT3: failed to claim external journal device. In-Reply-To: <200601261052.35134.johann.lombardi@bull.net> References: <43D6D08E.2060107@rice.edu> <20060125100509.GJ11642@schatzie.adilger.int> <200601261052.35134.johann.lombardi@bull.net> Message-ID: <20060126102022.GA19355@lombardij> oops, my mistake. Please do not pay attention to the attachment of my previous post. From bladilo at rice.edu Thu Jan 26 16:19:17 2006 From: bladilo at rice.edu (Franco M. Bladilo) Date: Thu, 26 Jan 2006 10:19:17 -0600 Subject: EXT3: failed to claim external journal device. 
In-Reply-To: <200601261052.35134.johann.lombardi@bull.net> References: <43D6D08E.2060107@rice.edu> <20060125100509.GJ11642@schatzie.adilger.int> <200601261052.35134.johann.lombardi@bull.net> Message-ID: <43D8F685.2060701@rice.edu> Johann, Andreas, Thanks for the pointers, they certainly explain the issue we were seeing. Has the mount/util-linux external journal patch been accepted ? Franco. Johann Lombardi wrote: >>>Here's the system information and command line used to create the >>>filesystem : >>>SuSE SLES9 2 , kernel 2.6.5 >>>ada718-5:/ # rpm -qa | grep e2fs >>>e2fsprogs-1.36-6.2 >>>----------------------------------------- >>>mke2fs -O journal_dev /dev/mapper/home_jou_vol_grp-home_jou 400000 >>>mke2fs -E stride=16 -O sparse_super,dir_index -j -J >>>device=/dev/mapper/home_jou_vol_grp-home_jou >>>/dev/mapper/home_vol_grp-home >>> >>>Any ideas? >>> >>> >>I believe the kernel does the journal device lookup by the device >>major/minor, and those are not fixed for LVM devices. >> >> > >If the filesystem was _cleanly_ unmounted, you can try to remove/reattach the >external journal. It will update the superblock with the new major/minor >numbers. >You can proceed as follows: ># tune2fs -f -O^has_journal /dev/mapper/home_vol_grp-home ># tune2fs -J device=/dev/mapper/home_jou_vol_grp-home_jou /dev/mapper/home_vol_grp-home > >It will work until the journal device's major/minor numbers change again >(the next reboot?). > > > >>Bull recently posted a patch here for mount to automatically find the >>correct block device for this journal UUID. 
>> >> > >Actually, it was on ext2-devel: >http://thread.gmane.org/gmane.comp.file-systems.ext2.devel/2950 > >Johann > > > > ------------------------------------------------------------------------ > > Subject: > [Ext2-devel] [PATCH] ext2: return FSID for statvfs > From: > Pekka Enberg > Date: > Tue, 06 Dec 2005 22:22:48 +0200 > To: > akpm at osdl.org > > To: > akpm at osdl.org > CC: > linux-kernel at vger.kernel.org, ext2-devel at lists.sourceforge.net > > >This patch changes ext2_statfs() to return a FSID based on least significant >64-bits of the 128-bit filesystem UUID. This patch is a partial fix for >Bugzilla Bug . > >Signed-off-by: Pekka Enberg >--- > > super.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > >Index: 2.6/fs/ext2/super.c >=================================================================== >--- 2.6.orig/fs/ext2/super.c >+++ 2.6/fs/ext2/super.c >@@ -1038,6 +1038,7 @@ restore_opts: > static int ext2_statfs (struct super_block * sb, struct kstatfs * buf) > { > struct ext2_sb_info *sbi = EXT2_SB(sb); >+ struct ext2_super_block *es = sbi->s_es; > unsigned long overhead; > int i; > >@@ -1052,7 +1053,7 @@ static int ext2_statfs (struct super_blo > * All of the blocks before first_data_block are > * overhead > */ >- overhead = le32_to_cpu(sbi->s_es->s_first_data_block); >+ overhead = le32_to_cpu(es->s_first_data_block); > > /* > * Add the overhead attributed to the superblock and >@@ -1073,14 +1074,16 @@ static int ext2_statfs (struct super_blo > > buf->f_type = EXT2_SUPER_MAGIC; > buf->f_bsize = sb->s_blocksize; >- buf->f_blocks = le32_to_cpu(sbi->s_es->s_blocks_count) - overhead; >+ buf->f_blocks = le32_to_cpu(es->s_blocks_count) - overhead; > buf->f_bfree = ext2_count_free_blocks(sb); >- buf->f_bavail = buf->f_bfree - le32_to_cpu(sbi->s_es->s_r_blocks_count); >- if (buf->f_bfree < le32_to_cpu(sbi->s_es->s_r_blocks_count)) >+ buf->f_bavail = buf->f_bfree - le32_to_cpu(es->s_r_blocks_count); >+ if (buf->f_bfree < 
le32_to_cpu(es->s_r_blocks_count)) > buf->f_bavail = 0; >- buf->f_files = le32_to_cpu(sbi->s_es->s_inodes_count); >+ buf->f_files = le32_to_cpu(es->s_inodes_count); > buf->f_ffree = ext2_count_free_inodes (sb); > buf->f_namelen = EXT2_NAME_LEN; >+ buf->f_fsid.val[0] = le32_to_cpup((void *)es->s_uuid); >+ buf->f_fsid.val[1] = le32_to_cpup((void *)es->s_uuid + sizeof(u32)); > return 0; > } > > > > > >------------------------------------------------------- >This SF.net email is sponsored by: Splunk Inc. Do you grep through log files >for problems? Stop! Download the new AJAX search engine that makes >searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! >http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click >_______________________________________________ >Ext2-devel mailing list >Ext2-devel at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/ext2-devel > > -- Franco Bladilo Linux/HPCC Administrator Research Computing Support Group Rice University bladilo at rice.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From johann.lombardi at bull.net Thu Jan 26 17:11:48 2006 From: johann.lombardi at bull.net (Johann Lombardi) Date: Thu, 26 Jan 2006 18:11:48 +0100 Subject: EXT3: failed to claim external journal device. In-Reply-To: <43D8F685.2060701@rice.edu> References: <43D6D08E.2060107@rice.edu> <200601261052.35134.johann.lombardi@bull.net> <43D8F685.2060701@rice.edu> Message-ID: <200601261811.48527.johann.lombardi@bull.net> > Has the mount/util-linux external journal patch been accepted ? Not at the moment. 
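Editorial sketch for the external journal thread above: the explanation given there is that the kernel finds the journal by a device major/minor number recorded in the filesystem superblock, and that LVM may assign different numbers after a reboot. The Python below is a minimal illustration of that comparison, not code from any poster. It assumes the recorded number uses the kernel's 32-bit device encoding and that it has been read out by hand (for example from a hexadecimal "Journal device" field in tune2fs -l output). The value 0xFD03 and the function names are hypothetical.

```python
import os


def decode_dev(dev):
    """Split a 32-bit Linux device number into (major, minor).

    Assumed layout: major in bits 8-19, minor in bits 0-7 plus bits 20-31.
    """
    major = (dev & 0xFFF00) >> 8
    minor = (dev & 0xFF) | ((dev >> 12) & 0xFFF00)
    return major, minor


def journal_dev_matches(stored_dev, journal_path):
    """Return True if the recorded number still matches the device node.

    stored_dev is the number taken from the superblock (hypothetical input);
    journal_path is the block device node, e.g. a /dev/mapper/... path.
    """
    st = os.stat(journal_path)
    current = (os.major(st.st_rdev), os.minor(st.st_rdev))
    return decode_dev(stored_dev) == current


if __name__ == "__main__":
    # Hypothetical recorded value: major 253, minor 3.
    print(decode_dev(0xFD03))
```

If journal_dev_matches() returned False for the journal volume, that would be consistent with the diagnosis in the thread, and the tune2fs remove/reattach steps quoted above would be the workaround to consider.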
From adilger at clusterfs.com Fri Jan 27 01:03:04 2006 From: adilger at clusterfs.com (Andreas Dilger) Date: Thu, 26 Jan 2006 18:03:04 -0700 Subject: ext3 fs errors 3T fs In-Reply-To: <20060123085621.X57575@periphery.2inches.com> References: <20060120091159.H84552@periphery.2inches.com> <20060120225709.A95489@periphery.2inches.com> <20060122192525.GL4124@schatzie.adilger.int> <20060123085621.X57575@periphery.2inches.com> Message-ID: <20060127010304.GY11642@schatzie.adilger.int> On Jan 23, 2006 09:09 -0800, Dennis Williams wrote: > I was able to isolate the problem to 2 different directories repeatedly. > Both of them were in the lost+found directory. I ran "stat {path}" in > debugfs. on them but did not see any info that stood out as abnormal. > When I get access to the system again, I will repost the output. What would be of interest is the block numbers of the lost+found dir, and all of the files therein. Anything with a block number > 250M (at the 2TB = 4B sector boundary) would be of interest. > > I think debugging it would be easiest if you had a backup and were > > willing to overwrite the device with a test pattern. > > I would like to debug this situation when I get backup storage. What > steps would you recommend to do this? If possible, it would be desirable to isolate the exact operation that is causing the corruption. 
Since we are fairly sure it is corrupting the beginning of the filesystem
(which likely aliases to just beyond the 2TB device boundary) we could do
a test like the following:
- do a backup of the first, say, 128kB of the device with dd
- read 50MB of data at 2TB offset
- compare this data - it should probably not be the same
- rewrite out the 50MB of data beyond 2TB
- verify that the first 128kB of data in the device did not change
- do some operation on _one_ file in the lost+found
- verify that the first 128kB of data does not change
- run e2fsck

I don't have anything else specific, just in the nature of "play around"
and see what breaks.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From fk at linuxburg.de Mon Jan 30 18:38:54 2006
From: fk at linuxburg.de (Felix E. Klee)
Date: Mon, 30 Jan 2006 19:38:54 +0100
Subject: df reports false size
Message-ID: <200601301938.54145.fk@linuxburg.de>

On a customer's machine running SuSE 9.2, the size of the occupied space on
the harddisk is reported incorrectly by "df -h". After we noticed the problem,
I rebooted the machine and had it checked by "e2fsck" (check forced with
"tune2fs -C 40", we are not on location). Right after the reboot I proceeded
as follows, but I could not find any information about the cause, and the
problem is still there - see below.

That the value reported by "du -shx" is close to the correct one was verified
by copying the data to an identical partition on a second harddisk: On this
disk "du" and "df" both reported a size of about 4 GB, and not 7.6G, which is
completely off the mark.

# df -h /
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.6G  7.0G  216M  98% /
# du -shx /
4.2G    /
# find / -xdev | wc -l
161021
# tune2fs -l /dev/sda1
tune2fs 1.35 (28-Feb-2004)
Filesystem volume name:
Last mounted on:
Filesystem UUID:          a3f40d6f-51be-448b-bf71-76292772fea0
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal filetype needs_recovery sparse_super
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1005888
Block count:              2010125
Reserved block count:     100506
Free blocks:              155746
Free inodes:              744793
First block:              0
Block size:               4096
Fragment size:            4096
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16224
Inode blocks per group:   507
Filesystem created:       Sat Nov  5 19:00:05 2005
Last mount time:          Mon Jan 30 13:28:19 2006
Last write time:          Mon Jan 30 13:28:19 2006
Mount count:              1
Maximum mount count:      39
Last checked:             Mon Jan 30 13:28:19 2006
Check interval:           15552000 (6 months)
Next check after:         Sat Jul 29 14:28:19 2006
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
First orphan inode:       357173
Default directory hash:   tea
Directory Hash Seed:      59ce6d12-990c-40ad-8268-212ae9bb8291
Journal backup:           inode blocks

Later we also tried out the following commands - apparently sparse files or
unlinked files are not to blame:

# lsof -s | grep deleted
isam       6354 david    0r  REG  8,1       55 357173 /tmp/sh-thd-1138650835 (deleted)
vmware-vm 15452 arzt    48u  REG  8,1 11948032 357177 /tmp/ram0 (deleted)
# df --sync -h /
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.6G  7.0G  212M  98% /
# du -shx --apparent-size /
3.9G    .

Any idea what may be the cause of the problem?

--
Dipl.-Phys. Felix E. Klee
Email: fk at linuxburg.de (work), felix.klee at inka.de (home)
Tel: +49 721 8307937, Fax: +49 721 8307936
Linuxburg, Goethestr. 15A, 76135 Karlsruhe, Germany

From adilger at clusterfs.com Mon Jan 30 23:10:36 2006
From: adilger at clusterfs.com (Andreas Dilger)
Date: Mon, 30 Jan 2006 16:10:36 -0700
Subject: df reports false size
In-Reply-To: <200601301938.54145.fk@linuxburg.de>
References: <200601301938.54145.fk@linuxburg.de>
Message-ID: <20060130231036.GR11642@schatzie.adilger.int>

On Jan 30, 2006 19:38 +0100, Felix E. Klee wrote:
> On a customer's machine running SuSE 9.2, the size of the occupied space on
> the harddisk is reported incorrectly by "df -h".
>
> # df -h /
> Filesystem Size Used Avail Use% Mounted on
> /dev/sda1 7.6G 7.0G 216M 98% /
> # du -shx /
> 4.2G /
> # find / -xdev | wc -l
> 161021
> # tune2fs -l /dev/sda1
> tune2fs 1.35 (28-Feb-2004)
> Inode count: 1005888
> Block count: 2010125
> Reserved block count: 100506
> Free blocks: 155746
> Free inodes: 744793
> First orphan inode: 357173

The "find | wc -l" definitely does not agree with the superblock info.
That reports (1005888 - 744793 = 261095) in use inodes, not 161021.
If those numbers agreed, I'd suspect some space leakage (though not
after an e2fsck run). With 2.6 kernels the ext3 superblock info does
not get updated on disk, except at shutdown (though it would be nice to
have this done at, say, statfs time).

There are no EAs consuming blocks (this would be 4kB per file, so 1GB
in total for 250k files).

> Later we also tried out the following commands - apparently sparse files or
> unlinked files are not to blame:
>
> # lsof -s | grep deleted
> vmware-vm 15452 arzt 48u REG 8,1 11948032
> 357177 /tmp/ram0 (deleted)
> isam 6354 david 0r REG 8,1 55
> 357173 /tmp/sh-thd-1138650835 (deleted)

This is also the file shown in the orphan inode list, so at least it
is consistent. I also wouldn't expect files to be orphaned after e2fsck.

The other thing you can do is run "dumpe2fs /dev/sda1" to see what the
block group descriptors report for free blocks/inodes. You'd need some
scripting to add this up, but fairly easy.
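
Editorial sketch of the "scripting to add this up" mentioned above, in Python. It assumes that dumpe2fs prints one summary line per block group of the form "N free blocks, M free inodes, K directories"; that format is recalled from e2fsprogs output and should be checked against the installed version. The sample text and the function name are made up for illustration.

```python
import re

# Assumed per-group summary line, e.g. "  120 free blocks, 16000 free inodes, 2 directories"
GROUP_LINE = re.compile(r"^\s*(\d+) free blocks, (\d+) free inodes")


def sum_group_free(dumpe2fs_text):
    """Return (free_blocks, free_inodes) summed over all group summary lines."""
    blocks = 0
    inodes = 0
    for line in dumpe2fs_text.splitlines():
        match = GROUP_LINE.match(line)
        if match:
            blocks += int(match.group(1))
            inodes += int(match.group(2))
    return blocks, inodes


# Made-up sample in the assumed format, not taken from the machine in the thread.
SAMPLE = """\
Group 0: (Blocks 0-32767)
  120 free blocks, 16000 free inodes, 2 directories
Group 1: (Blocks 32768-65535)
  30 free blocks, 16224 free inodes, 0 directories
"""

if __name__ == "__main__":
    print(sum_group_free(SAMPLE))  # (150, 32224)
```

The totals from a real "dumpe2fs /dev/sda1" run could then be compared with the superblock values quoted above (Free blocks: 155746, Free inodes: 744793).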

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

From fk at linuxburg.de Tue Jan 31 09:36:41 2006
From: fk at linuxburg.de (Felix E. Klee)
Date: Tue, 31 Jan 2006 10:36:41 +0100
Subject: df reports false size
In-Reply-To: <20060130231036.GR11642@schatzie.adilger.int>
References: <200601301938.54145.fk@linuxburg.de>
 <20060130231036.GR11642@schatzie.adilger.int>
Message-ID: <200601311036.41707.fk@linuxburg.de>

On Tuesday, 31 January 2006 00:10, Andreas Dilger wrote:
> The "find | wc -l" definitely does not agree with the superblock info.
> That reports (1005888 - 744793 = 261095) in use inodes, not 161021.
> If those numbers agreed, I'd suspect some space leakage (though not
> after an e2fsck run). With 2.6 kernels the ext3 superblock info does
> not get updated on disk, except at shutdown (though it would be nice to
> have this done at, say, statfs time).
>
> There are no EAs consuming blocks (this would be 4kB per file, so 1GB
> in total for 250k files).

I understand what you're after, but what's an EA?

> This is also the file shown in the orphan inode list, so at least it
> is consistent. I also wouldn't expect files to be orphaned after e2fsck.
>
> The other thing you can do is run "dumpe2fs /dev/sda1" to see what the
> block group descriptors report for free blocks/inodes. You'd need some
> scripting to add this up, but fairly easy.

Thanks for the hint. However, before following your suggestions, I'd like to
try something else: In another ML someone mentioned that the problem could be
caused by another partition mounted on a non-empty subdirectory. This sounds
quite plausible, especially since we're dealing with partition containing the
root directory. How do I get the complete list of files on an ext3 FS?

--
Dipl.-Phys. Felix E. Klee
Email: fk at linuxburg.de (work), felix.klee at inka.de (home)
Tel: +49 721 8307937, Fax: +49 721 8307936
Linuxburg, Goethestr. 15A, 76135 Karlsruhe, Germany

From bryan at kadzban.is-a-geek.net Tue Jan 31 11:54:40 2006
From: bryan at kadzban.is-a-geek.net (Bryan Kadzban)
Date: Tue, 31 Jan 2006 06:54:40 -0500
Subject: df reports false size
In-Reply-To: <200601311036.41707.fk@linuxburg.de>
References: <200601301938.54145.fk@linuxburg.de>
 <20060130231036.GR11642@schatzie.adilger.int>
 <200601311036.41707.fk@linuxburg.de>
Message-ID: <43DF5000.8030408@kadzban.is-a-geek.net>

Felix E. Klee wrote:
> In another ML someone mentioned that the problem could be caused by
> another partition mounted on a non-empty subdirectory. This sounds
> quite plausible, especially since we're dealing with partition
> containing the root directory. How do I get the complete list of
> files on an ext3 FS?

find / -xdev -type f

?

From fk at linuxburg.de Tue Jan 31 12:06:51 2006
From: fk at linuxburg.de (Felix E. Klee)
Date: Tue, 31 Jan 2006 13:06:51 +0100
Subject: df reports false size
In-Reply-To: <43DF5000.8030408@kadzban.is-a-geek.net>
References: <200601301938.54145.fk@linuxburg.de>
 <200601311036.41707.fk@linuxburg.de>
 <43DF5000.8030408@kadzban.is-a-geek.net>
Message-ID: <200601311306.51975.fk@linuxburg.de>

On Tuesday, 31 January 2006 12:54, Bryan Kadzban wrote:
> find / -xdev -type f
>
> ?

It doesn't work if files are hidden by a mounted partition:

$ mkdir /tmp/foo
$ touch /tmp/foo/bar
$ find / -xdev | grep '^/tmp/foo/bar$'
/tmp/foo/bar
$ mount /dev/hdb1 /tmp/foo
$ find / -xdev | grep '^/tmp/foo/bar$'
[nothing found]

--
Dipl.-Phys. Felix E. Klee
Email: fk at linuxburg.de (work), felix.klee at inka.de (home)
Tel: +49 721 8307937, Fax: +49 721 8307936
Linuxburg, Goethestr. 15A, 76135 Karlsruhe, Germany

From fk at linuxburg.de Tue Jan 31 12:31:20 2006
From: fk at linuxburg.de (Felix E. Klee)
Date: Tue, 31 Jan 2006 13:31:20 +0100
Subject: df reports false size
In-Reply-To: <200601311306.51975.fk@linuxburg.de>
References: <200601301938.54145.fk@linuxburg.de>
 <43DF5000.8030408@kadzban.is-a-geek.net>
 <200601311306.51975.fk@linuxburg.de>
Message-ID: <200601311331.20929.fk@linuxburg.de>

On Tuesday, 31 January 2006 13:06, Felix E. Klee wrote:
> It doesn't work if files are hidden by a mounted partition:

Hey I just found something cool: "debugfs". Here one can see all files on the
file system, even ones that are hidden by mounted partitions. And, as it
looks, this is indeed our problem:

$ mount | grep ' /nfsroot'
/dev/sda7 on /nfsroot type ext3 (rw,acl,user_xattr)

# debugfs /dev/sda1
debugfs 1.35 (28-Feb-2004)
debugfs: ls /nfsroot
 519169 (12) .    2 (12) ..    519171 (4072) 9.2

Problem most likely found! Now, we need to solve it - fortunately someone is
on location today.

--
Dipl.-Phys. Felix E. Klee
Email: fk at linuxburg.de (work), felix.klee at inka.de (home)
Tel: +49 721 8307937, Fax: +49 721 8307936
Linuxburg, Goethestr. 15A, 76135 Karlsruhe, Germany

From adilger at clusterfs.com Tue Jan 31 17:59:24 2006
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 31 Jan 2006 10:59:24 -0700
Subject: df reports false size
In-Reply-To: <200601311331.20929.fk@linuxburg.de>
References: <200601301938.54145.fk@linuxburg.de>
 <43DF5000.8030408@kadzban.is-a-geek.net>
 <200601311306.51975.fk@linuxburg.de>
 <200601311331.20929.fk@linuxburg.de>
Message-ID: <20060131175924.GA11642@schatzie.adilger.int>

On Jan 31, 2006 13:31 +0100, Felix E. Klee wrote:
> Hey I just found something cool: "debugfs". Here one can see all files on the
> file system, even ones that are hidden by mounted partitions. And, as it
> looks, this is indeed our problem:
>
> $ mount | grep ' /nfsroot'
> /dev/sda7 on /nfsroot type ext3 (rw,acl,user_xattr)
>
> # debugfs /dev/sda1
> debugfs 1.35 (28-Feb-2004)
> debugfs: ls /nfsroot
> 519169 (12) . 2 (12) .. 519171 (4072) 9.2
>
> Problem most likely found! Now, we need to solve it - fortunately someone is
> on location today.

You can use "mount -t bind / /mnt" and then "/mnt/nfsroot" will be the
underlying directory.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
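
Editorial sketch for the thread above, in Python: files can be hidden underneath any directory that has another filesystem mounted on it, so a first step is to list those mount points and then look beneath them through a bind mount of the root filesystem. The code assumes the /proc/mounts layout (device, mount point, type, options, dump, pass). The sample table and the function names are illustrative only. The command builder uses the "mount --bind" spelling documented in mount(8); the "-t bind" form quoted above is left as the poster wrote it.

```python
def mounts_below(mount_table, root="/"):
    """List mount points, other than root itself, that sit below root.

    mount_table is text in /proc/mounts layout; the second field of each
    line is taken as the mount point.
    """
    prefix = root.rstrip("/") + "/"
    points = []
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        mount_point = fields[1]
        if mount_point != root and mount_point.startswith(prefix):
            points.append(mount_point)
    return points


def bind_mount_command(source="/", target="/mnt"):
    """Build (but do not run) a non-recursive bind mount command."""
    return ["mount", "--bind", source, target]


# Illustrative table; only the /nfsroot entry mirrors the thread.
SAMPLE = """\
/dev/sda1 / ext3 rw 0 0
proc /proc proc rw 0 0
/dev/sda7 /nfsroot ext3 rw,acl,user_xattr 0 0
"""

if __name__ == "__main__":
    print(mounts_below(SAMPLE))
    print(" ".join(bind_mount_command()))
```

After running the printed command as root, each listed directory can be inspected under the target (for example /mnt/nfsroot), where the contents of the underlying root filesystem are visible instead of the mounted one.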