From ext3 at philwhite.org Wed Mar 3 00:08:20 2004 From: ext3 at philwhite.org (Phil White) Date: Tue, 02 Mar 2004 16:08:20 -0800 Subject: consistent crash with data=journal Message-ID: <404521F4.8060804@philwhite.org> I've been running into a kernel panic pretty consistently when using data=journal. This occurs during heavy IO, and is highly reproducible (only takes about 5 minutes of IO to cause it). The applications being used are MySQL, Postfix, and a mail filtering application which operates on postfix queue files using mmapped IO. Shortly before the crash, the following messages are logged: Mar 2 14:15:30 test5 kernel: Unexpected dirty buffer encountered at do_get_write_access:618 (03:05 blocknr 3688784) Mar 2 14:16:04 test5 kernel: Unexpected dirty buffer encountered at do_get_write_access:618 (03:05 blocknr 3616336) Mar 2 14:18:38 test5 kernel: Unexpected dirty buffer encountered at do_get_write_access:618 (03:05 blocknr 3692808) I've duplicated it in 2.4.22 and 2.4.25. Here's the crash from 2.4.25: Assertion failure in journal_commit_transaction() at commit.c:759: "!(((bh)->b_state & (1UL << BH_Dirty)) != 0)" kernel BUG at commit.c:759!
invalid operand: 0000 CPU: 0 EIP: 0010:[] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00010282 eax: 00000074 ebx: f7221f00 ecx: f7752000 edx: f7795f7c esi: f71a9c00 edi: 00000001 ebp: 00000000 esp: f7753e74 ds: 0018 es: 0018 ss: 0018 Process kjournald (pid: 115, stackpage=f7753000) Stack: c02a2180 c029ff8f c029ffaa 000002f7 c02a47c0 c1c3ecf4 00000000 00000e74 d77fb18c 00000000 f6dfb380 e89d3d20 00000cca d78fb100 da292700 da292680 da292600 da28b980 f65f1480 f65f1580 f65f1380 c3f9b800 ccdc6780 da4cb800 Call Trace: [] [] [] [] Code: 0f 0b f7 02 aa ff 29 c0 e9 1b fe ff ff 89 1c 24 8b 4c 24 28 >>EIP; c0174064 <===== Trace; c0176b64 Trace; c01769f0 Trace; c01072ee Trace; c0176a10 Code; c0174064 00000000 <_EIP>: Code; c0174064 <===== 0: 0f 0b ud2a <===== Code; c0174066 2: f7 02 aa ff 29 c0 testl $0xc029ffaa,(%edx) Code; c017406c 8: e9 1b fe ff ff jmp fffffe28 <_EIP+0xfffffe28> Code; c0174071 d: 89 1c 24 mov %ebx,(%esp,1) Code; c0174074 10: 8b 4c 24 28 mov 0x28(%esp,1),%ecx Any help would be greatly appreciated! Please let me know if any more info is needed. Thanks in advance, --Phil White

From pasdebill at hotmail.com Wed Mar 3 22:24:33 2004 From: pasdebill at hotmail.com (bill root) Date: Wed, 3 Mar 2004 23:24:33 +0100 Subject: Ext3 problem - lost files/directorys Message-ID: Hi, I have been working on this hard drive for 3 days now ... I had noticed kernel errors while working on the drive (120 GB), but everything continued to work fine (files/directories).. So I unmounted the drive and tried e2fsck... bad! e2fsck 1.34 (25-Jul-2003) e2fsck: Attempt to read block from filesystem resulted in short read while trying to open /dev/hdc1 So I tried to create an image of the drive with dd_rhelp, and here are the results: e2fsck 1.34 (25-Jul-2003) Group descriptors look bad... trying backup blocks... Superblock has a bad ext3 journal (inode 8). Clear?
yes *** ext3 journal has been deleted - filesystem is now ext2 only *** /hdd1/hdc1.os was not cleanly unmounted, check forced. Pass 1: Checking inodes, blocks, and sizes Root inode is not a directory. Clear? (...) I have tried with -b but got the same result. When I re-create the filesystem (with -S) and run e2fsck, I get all the files in lost+found (not one file with its real name). So, some questions: - Is there any way to recover files with their names? - Where are file names / directory names stored? - Why do I get the same results when I try the other superblocks? My last chance is to wait for the replacement hard drive and copy it with dd, but I'm afraid I'll get the same results... I don't need to recover all the files, just as many as possible with their names (even if the files are not in their real directories). And a last question: is there a backup of the journal stored on the hard drive, like the backup superblocks? Excuse my poor English :) Note: the hard drive has a very low count of bad blocks, but the current bad block seems to be in a very bad place ...

From guolin at alexa.com Wed Mar 3 23:01:35 2004 From: guolin at alexa.com (Guolin Cheng) Date: Wed, 3 Mar 2004 15:01:35 -0800 Subject: heavily fragmented file system.. How to defrag it on-line?? Message-ID: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> Hi, all, I have machines running continuously for a long time, and the underlying ext3 file systems have become quite heavily fragmented (94% non-contiguous). We just don't have a chance to shut down the machines since they are always busy.. I tried the defrag 0.70 version that comes with the e2fsprogs package and the standalone 0.73 package, but neither helps, since the defrag tool cannot handle ext3. A third-party commercial tool, oodcmd, doesn't help either, since it can only deal with idle, unmounted file systems, nor can it guarantee data integrity.
In that case, I mean, when the machine is booted into repair mode and the file system is not in use, we can use gtar to save and restore the data without data loss, so the commercial tool is of almost no help to us. Does anyone have any ideas on defragmenting ext3 file systems on-line? Thanks a lot. ----------------------------------------------------------------------------- The following is the fragmentation reported by e2fsck.. arc158.example.com guolin 135% sudo e2fsck -f -n -d /dev/hda9 e2fsck 1.27 (8-Mar-2002) Warning! /dev/hda9 is mounted. Warning: skipping journal recovery because doing a read-only filesystem check. Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information /0: 1225/8601600 files (94.3% non-contiguous), 12724107/17181982 blocks --Guolin Cheng

From pegasus at nerv.eu.org Wed Mar 3 23:22:38 2004 From: pegasus at nerv.eu.org (Jure Pečar) Date: Thu, 4 Mar 2004 00:22:38 +0100 Subject: Ext3 problem - lost files/directorys In-Reply-To: References: Message-ID: <20040304002238.04d08d33.pegasus@nerv.eu.org> On Wed, 3 Mar 2004 23:24:33 +0100 "bill root" wrote: > e2fsck 1.34 (25-Jul-2003) > Group descriptors look bad... trying backup blocks... > Superblock has a bad ext3 journal (inode 8). > Clear? yes I managed to reach that same point in a completely unrelated way ... somehow IRQs went foobar on my box and I did sysrq s u b ... after which my root fs was toasted. Obviously some garbage was synced to the drive, and all that fsck managed was to rescue a couple of files and put about 1/10 into lost+found. Most of the data was in digital heaven ...
Luckily I have separate /usr /home /var partitions :) kernel 2.6.1 and fsck 1.34 -- Jure Pečar

From adilger at clusterfs.com Thu Mar 4 02:12:40 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 3 Mar 2004 19:12:40 -0700 Subject: heavily fragmented file system.. How to defrag it on-line?? In-Reply-To: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> References: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> Message-ID: <20040304021240.GQ1219@schnapps.adilger.int> On Mar 03, 2004 15:01 -0800, Guolin Cheng wrote: > I have machines running continuously for a long time, and the underlying > ext3 file systems have become quite heavily fragmented (94% non-contiguous). > We just don't have a chance to shut down the machines since they are always busy.. > Does anyone have any ideas on defragmenting ext3 file systems on-line? Thanks a lot. Why do you think you need to defragment? Do you notice performance loss, or is it just because of the e2fsck number? Given that you have only 1225 files and 50GB of space usage it is almost guaranteed that each file will not be contiguous. > ----------------------------------------------------------------------------- > The following is the fragmentation reported by e2fsck.. > > arc158.example.com guolin 135% sudo e2fsck -f -n -d /dev/hda9 > e2fsck 1.27 (8-Mar-2002) > Warning! /dev/hda9 is mounted. > Warning: skipping journal recovery because doing a read-only filesystem check.
> Pass 1: Checking inodes, blocks, and sizes > Pass 2: Checking directory structure > Pass 3: Checking directory connectivity > Pass 4: Checking reference counts > Pass 5: Checking group summary information > /0: 1225/8601600 files (94.3% non-contiguous), 12724107/17181982 blocks Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/

From tytso at mit.edu Thu Mar 4 02:17:07 2004 From: tytso at mit.edu (Theodore Ts'o) Date: Wed, 3 Mar 2004 21:17:07 -0500 Subject: heavily fragmented file system.. How to defrag it on-line?? In-Reply-To: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> References: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> Message-ID: <20040304021707.GA13386@thunk.org> On Wed, Mar 03, 2004 at 03:01:35PM -0800, Guolin Cheng wrote: > I have machines running continuously for a long time, and the underlying ext3 file systems have become quite heavily fragmented (94% non-contiguous). Note that non-contiguous does not necessarily mean fragmented. Files that are larger than a block group will be non-contiguous by definition. On the other hand, if you have more than one file simultaneously being written to in a directory, then yes, the files will certainly get fragmented. Are you seeing a significant read-performance degradation? If not, it may not be worth bothering to defrag the filesystem. > Does anyone have any ideas on defragmenting ext3 file systems on-line? Thanks a lot. There are no on-line defrag tools right now, sorry. - Ted

From vijayan at cs.wisc.edu Thu Mar 4 02:24:28 2004 From: vijayan at cs.wisc.edu (Vijayan Prabhakaran) Date: Wed, 3 Mar 2004 20:24:28 -0600 (CST) Subject: Journal file location Message-ID: Hi, Is there any way that one can change the location of the journal file when the file system is created?
For example, instead of using inode #7 and the blocks at the beginning of the file system, I'd like to use an inode in a different cylinder group and some other set of blocks. Even if I'm only allowed to use one of the reserved inodes, is it possible to put the journal file somewhere in the "middle" of the file system rather than at the beginning? Thanks, Vijayan

From guolin at alexa.com Thu Mar 4 02:27:51 2004 From: guolin at alexa.com (Guolin Cheng) Date: Wed, 3 Mar 2004 18:27:51 -0800 Subject: heavily fragmented file system.. How to defrag it on-line?? Message-ID: <41089CB27BD8D24E8385C8003EDAF7AB06D29D@karl.alexa.com> Hi, Ted, Thanks for your response. >> I have machines running continuously for a long time, and the underlying ext3 file systems have become quite heavily fragmented (94% non-contiguous). > Note that non-contiguous does not necessarily mean fragmented. Files > that are larger than a block group will be non-contiguous by > definition. On the other hand, if you have more than one file > simultaneously being written to in a directory, then yes, the files > will certainly get fragmented. Yeah, the reading/writing speed is about 10 times slower: the original speed is about 20-40MB/s, while now it is only about 1-5MB/s. We normally write the disk to >90% full, then delete lots of files, write new files again to about >90%, then delete again. > There are no on-line defrag tools right now, sorry. So I have to use gtar to save the data, then re-create the file system, and finally use gtar to copy the data back? That will take a long time and requires stopping the existing processes on the machines. Definitely, that is the last solution I will consider. Are there any plans to develop a tool to defrag ext3 file systems on-line? Thanks.

From guolin at alexa.com Thu Mar 4 02:36:10 2004 From: guolin at alexa.com (Guolin Cheng) Date: Wed, 3 Mar 2004 18:36:10 -0800 Subject: heavily fragmented file system.. How to defrag it on-line??
Message-ID: <41089CB27BD8D24E8385C8003EDAF7AB06D29E@karl.alexa.com> Hi, Andreas, Thanks for your quick response. > Why do you think you need to defragment? Do you notice performance loss, or > is it just because of the e2fsck number? Given that you have only 1225 files > and 50GB of space usage it is almost guaranteed that each file will not be > contiguous. Then how can I figure out whether the files are fragmented? Because the file system's read/write speed is greatly slowed down (about 8-10 times slower in extreme cases). Can you suggest a tool or package to report a file system's fragmentation percentage? I tried a beta version of the oodcmd tool, which reports both a block fragmentation percentage and an inode fragmentation percentage; are those enough, or are there more fragmentation characteristics? I'm a little hesitant to use the commercial oodcmd tool since it can only work when file systems are unmounted and idle.. sigh.. Thanks. --Guolin Cheng

From adilger at clusterfs.com Thu Mar 4 03:01:53 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 3 Mar 2004 20:01:53 -0700 Subject: Journal file location In-Reply-To: References: Message-ID: <20040304030153.GR1219@schnapps.adilger.int> On Mar 03, 2004 20:24 -0600, Vijayan Prabhakaran wrote: > Is there any way that one can change the location of the journal > file when the file system is created? For example, instead of using inode #7 > and the blocks at the beginning of the file system, I'd like to use an > inode in a different cylinder group and some other set of blocks. Even if > I'm only allowed to use one of the reserved inodes, is it possible to put > the journal file somewhere in the "middle" of the file system rather than > at the beginning? You can use any blocks in the filesystem, assuming you can set up a file as you want it.
For example, if you allocate inodes until you get one in a block group of your choice, then create a large enough file there and rename it to "/.journal" in the root of that filesystem, umount, delete the old journal, and then run e2fsck on the filesystem. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/

From adilger at clusterfs.com Thu Mar 4 03:06:16 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 3 Mar 2004 20:06:16 -0700 Subject: heavily fragmented file system.. How to defrag it on-line?? In-Reply-To: <41089CB27BD8D24E8385C8003EDAF7AB06D29E@karl.alexa.com> References: <41089CB27BD8D24E8385C8003EDAF7AB06D29E@karl.alexa.com> Message-ID: <20040304030616.GS1219@schnapps.adilger.int> On Mar 03, 2004 18:36 -0800, Guolin Cheng wrote: > Then how can I figure out whether the files are fragmented? Because the > file system's read/write speed is greatly slowed down (about 8-10 times > slower in extreme cases). I think there was a tool which would map your files for you, using the FIBMAP ioctl, but you could also use debugfs "stat" to tell you the block maps of particular files. > Can you suggest a tool or package to report a file system's fragmentation > percentage? I tried a beta version of the oodcmd tool, which reports both > a block fragmentation percentage and an inode fragmentation percentage; are > those enough, or are there more fragmentation characteristics? I'm a little > hesitant to use the commercial oodcmd tool since it can only work when > file systems are unmounted and idle.. sigh.. If you have times when the filesystem is nearly empty, moving the files to a temporary partition and moving them back would defragment them. Not quite "online", but it avoids the need to unmount/reformat. Sorry, nothing better at this time. Andrew Morton once implemented a simple "move block" ioctl for live filesystems, but it was never merged into the kernel (it could be used by a defragmenter).
Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/

From bomb_hero at 163.com Thu Mar 4 04:50:51 2004 From: bomb_hero at 163.com (bomb) Date: Thu, 4 Mar 2004 12:50:51 +0800 Subject: Ext3-users digest, Vol 1 #1063 - 1 msg Message-ID: <200403040455.i244t3b28763@mx1.redhat.com> Hello ext3-users-request, [the full Ext3-users digest, quoting Phil White's "consistent crash with data=journal" message above verbatim, omitted] Many thanks & best regards, bomb 2004-03-04

From tytso at mit.edu Thu Mar 4 05:34:16 2004 From: tytso at mit.edu (Theodore Ts'o) Date: Thu, 4 Mar 2004 00:34:16 -0500 Subject: heavily fragmented file system.. How to defrag it on-line?? In-Reply-To: <20040304030616.GS1219@schnapps.adilger.int> References: <41089CB27BD8D24E8385C8003EDAF7AB06D29E@karl.alexa.com> <20040304030616.GS1219@schnapps.adilger.int> Message-ID: <20040304053416.GA14016@thunk.org> On Wed, Mar 03, 2004 at 08:06:16PM -0700, Andreas Dilger wrote: > On Mar 03, 2004 18:36 -0800, Guolin Cheng wrote: > > Then how can I figure out whether the files are fragmented? Because the > > file system's read/write speed is greatly slowed down (about 8-10 times > > slower in extreme cases). > > I think there was a tool which would map your files for you, using the FIBMAP > ioctl, but you could also use debugfs "stat" to tell you the block maps of > particular files. E2fsprogs 1.35 comes with a program "filefrag" that will tell you how many extents a particular file has. - Ted

From andrewho at animezone.org Wed Mar 3 01:30:32 2004 From: andrewho at animezone.org (Andrew Ho) Date: Tue, 02 Mar 2004 20:30:32 -0500 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <20040302224758.GK19111@khan.acc.umu.se> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> <20040302224758.GK19111@khan.acc.umu.se> Message-ID: <40453538.8050103@animezone.org> XFS is the best filesystem.
David Weinehall wrote: >On Tue, Mar 02, 2004 at 03:33:13PM -0700, Dax Kelson wrote: >>On Tue, 2004-03-02 at 09:34, Peter Nelson wrote: >>>Hans Reiser wrote: >>>I'm confused as to why performing a benchmark out of cache as opposed to >>>on disk would hurt performance? >>My understanding (which could be completely wrong) is that reiserfs v3 >>and v4 are algorithmically more complex than ext2 or ext3. Reiserfs >>spends more CPU time to make the eventual on-disk operations more >>efficient/faster. >>When operating purely or mostly out of RAM, the higher CPU utilization >>of reiserfs hurts performance compared to ext2 and ext3. >>When your system I/O utilization exceeds cache size and your disks >>start getting busy, the CPU time previously invested by reiserfs pays >>big dividends and provides large performance gains versus more >>simplistic filesystems. >>In other words, the CPU penalty paid by reiserfs v3/v4 is more than made >>up for by the resultant more efficient disk operations. Reiserfs trades >>CPU for disk performance. >>In a nutshell, if you have more memory than you know what to do with, >>stick with ext3. If you spend all your time waiting for disk operations >>to complete, go with reiserfs. >Or rather, if you have more memory than you know what to do with, use >ext3. If you have more CPU power than you know what to do with, use >ReiserFS[34]. >On slower machines, I generally prefer a little slower I/O rather than >having the entire system sluggish because of higher CPU usage.
> > >Regards: David Weinehall > > From david at southpole.se Wed Mar 3 01:41:15 2004 From: david at southpole.se (David Weinehall) Date: Wed, 3 Mar 2004 02:41:15 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <40453538.8050103@animezone.org> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> <20040302224758.GK19111@khan.acc.umu.se> <40453538.8050103@animezone.org> Message-ID: <20040303014115.GP19111@khan.acc.umu.se> On Tue, Mar 02, 2004 at 08:30:32PM -0500, Andrew Ho wrote: > XFS is the best filesystem. Well it'd better be, it's 10 times the size of ext3, 5 times the size of ReiserFS and 3.5 times the size of JFS. And people say size doesn't matter. Regards: David Weinehall -- /) David Weinehall /) Northern lights wander (\ // Maintainer of the v2.0 kernel // Dance across the winter sky // \) http://www.acc.umu.se/~tao/ (/ Full colour fire (/ From ak at suse.de Wed Mar 3 02:39:26 2004 From: ak at suse.de (Andi Kleen) Date: 03 Mar 2004 03:39:26 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <20040303014115.GP19111@khan.acc.umu.se.suse.lists.linux.kernel> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> <20040302224758.GK19111@khan.acc.umu.se> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> Message-ID: David Weinehall writes: > On Tue, Mar 02, 2004 at 08:30:32PM -0500, Andrew Ho wrote: > > XFS is the best filesystem. > > Well it'd better be, it's 10 times the size of ext3, 5 times the size of > ReiserFS and 3.5 times the size of JFS. I think your ext3 numbers are off, most likely you didn't include JBD. > And people say size doesn't matter. 
A lot of this is actually optional features the other filesystems don't have, like support for separate realtime volumes, compat code for old revisions, journaled quotas, etc. I think you could relatively easily do a "mini xfs" that would be a lot smaller. But on today's machines it's not really an issue anymore. -Andi

From feizhou at linuxmail.org Wed Mar 3 02:48:39 2004 From: feizhou at linuxmail.org (Feizhou) Date: Wed, 03 Mar 2004 10:48:39 +0800 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <40453538.8050103@animezone.org> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> <20040302224758.GK19111@khan.acc.umu.se> <40453538.8050103@animezone.org> Message-ID: <40454787.9010901@linuxmail.org> Andrew Ho wrote: > XFS is the best filesystem. Sorry, but saying/writing that does not make it so. Different filesystems have their strengths and weaknesses, and those differ under different circumstances. Where XFS may be fast given a certain set of factors, you will find that other filesystems will excel after a change in one or two of those factors. E.g.: a large directory hash on a file server. You might find that with 8 NFS/SMB clients ext3 wins big time, but with 16 or more clients XFS excels and widens the gap with an ext3-based filesystem as the number of clients grows. There is no perfect filesystem.
From robin.rosenberg.lists at dewire.com Wed Mar 3 06:00:56 2004 From: robin.rosenberg.lists at dewire.com (Robin Rosenberg) Date: Wed, 3 Mar 2004 07:00:56 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <20040303014115.GP19111@khan.acc.umu.se> References: <4044119D.6050502@andrew.cmu.edu> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> Message-ID: <200403030700.57164.robin.rosenberg.lists@dewire.com> On Wednesday 03 March 2004 02:41, David Weinehall wrote: > On Tue, Mar 02, 2004 at 08:30:32PM -0500, Andrew Ho wrote: > > XFS is the best filesystem. > > Well it'd better be, it's 10 times the size of ext3, 5 times the size of > ReiserFS and 3.5 times the size of JFS. > > And people say size doesn't matter. Recoverability matters to me. The driver could be 10 megabytes and *I* would not care. XFS seems to stand no matter how rudely the OS is knocked down. After a few hundred crashes (laptop, kids, drained batteries) I'd expect something bad to happen, but no. XFS returns my data quickly and happily every time (as opposed to most of the time). Maybe there's a bit of luck involved. Salute to XFS! -- robin

From reiser at namesys.com Wed Mar 3 06:30:54 2004 From: reiser at namesys.com (Hans Reiser) Date: Wed, 03 Mar 2004 09:30:54 +0300 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078266793.8582.24.camel@mentor.gurulabs.com> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> Message-ID: <40457B9E.3060706@namesys.com> Unfortunately it is a bit more complex, and the truth is less complimentary to us than what you write. Reiser4's CPU usage has come down a lot, but it still consumes more CPU than V3. It should consume less, and Zam is currently working on making writes more CPU efficient.
As soon as I get funding from somewhere and can stop worrying about money, I will do a complete code review, and CPU usage will go way down. There are always lots of stupid little things that consume a lot of CPU that I find whenever I stop chasing money and review code. We are shipping because CPU usage is not as important as IO efficiency for a filesystem, and while Reiser4 is not as fast as it will be in 3-6 months, it is faster than anything else available, so it should be shipped. Hans Dax Kelson wrote: >On Tue, 2004-03-02 at 09:34, Peter Nelson wrote: >>Hans Reiser wrote: >>I'm confused as to why performing a benchmark out of cache as opposed to >>on disk would hurt performance? >My understanding (which could be completely wrong) is that reiserfs v3 >and v4 are algorithmically more complex than ext2 or ext3. Reiserfs >spends more CPU time to make the eventual on-disk operations more >efficient/faster. >When operating purely or mostly out of RAM, the higher CPU utilization >of reiserfs hurts performance compared to ext2 and ext3. >When your system I/O utilization exceeds cache size and your disks >start getting busy, the CPU time previously invested by reiserfs pays >big dividends and provides large performance gains versus more >simplistic filesystems. >In other words, the CPU penalty paid by reiserfs v3/v4 is more than made >up for by the resultant more efficient disk operations. Reiserfs trades >CPU for disk performance. >In a nutshell, if you have more memory than you know what to do with, >stick with ext3. If you spend all your time waiting for disk operations >to complete, go with reiserfs.
> >Dax Kelson >Guru Labs > > > > > -- Hans From hch at infradead.org Wed Mar 3 07:47:56 2004 From: hch at infradead.org (Christoph Hellwig) Date: Wed, 3 Mar 2004 07:47:56 +0000 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: ; from ak@suse.de on Wed, Mar 03, 2004 at 03:39:26AM +0100 References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> <20040302224758.GK19111@khan.acc.umu.se> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <20040303014115.GP19111@khan.acc.umu.se.suse.lists.linux.kernel> Message-ID: <20040303074756.A25861@infradead.org> On Wed, Mar 03, 2004 at 03:39:26AM +0100, Andi Kleen wrote: > A lot of this is actually optional features the other FS don't have, > like support for separate realtime volumes and compat code for old > revisions, journaled quotas etc. I think you could > relatively easily do a "mini xfs" that would be a lot smaller. And a whole lot of code to stay somewhat in sync with other codebases.. From reiser at namesys.com Wed Mar 3 08:03:37 2004 From: reiser at namesys.com (Hans Reiser) Date: Wed, 03 Mar 2004 11:03:37 +0300 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <20040303074756.A25861@infradead.org> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> <20040302224758.GK19111@khan.acc.umu.se> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <20040303014115.GP19111@khan.acc.umu.se.suse.lists.linux.kernel> <20040303074756.A25861@infradead.org> Message-ID: <40459159.1090501@namesys.com> Christoph Hellwig wrote: >On Wed, Mar 03, 2004 at 03:39:26AM +0100, Andi Kleen wrote: > > >>A lot of this is actually optional features the other FS don't have, >>like support for separate realtime volumes and compat code for old >>revisions, journaled quotas etc. 
I think you could >> relatively easily do a "mini xfs" that would be a lot smaller. > And a whole lot of code to stay somewhat in sync with other codebases.. What is significant is not the effect of code size on modern architectures; code size hurts developers, as the code becomes very hard to make deep changes to. It is very important to carefully design your code to be easy to change. This is why we tossed the V3 code and wrote V4 from scratch, using plugins at every conceivable abstraction layer. I think V4 will be our last rewrite from scratch because of our plugins, and because of how easy we find the code to work on now. I think XFS is going to stagnate over time, based on the former developers who have told me how hard it is to work on the code. Christoph probably disagrees, and he knows the XFS code far better than I. ;-) -- Hans

From arjanv at redhat.com Wed Mar 3 08:16:18 2004 From: arjanv at redhat.com (Arjan van de Ven) Date: Wed, 03 Mar 2004 09:16:18 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <40459159.1090501@namesys.com> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> <20040302224758.GK19111@khan.acc.umu.se> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <20040303014115.GP19111@khan.acc.umu.se.suse.lists.linux.kernel> <20040303074756.A25861@infradead.org> <40459159.1090501@namesys.com> Message-ID: <1078301777.4446.5.camel@laptop.fenrus.com> On Wed, 2004-03-03 at 09:03, Hans Reiser wrote: > I think V4 will be our last rewrite from scratch because of our plugins, > and because of how easy we find the code to work on now. can we quote you on that 3 years from now ? ;-) -------------- next part -------------- A non-text attachment was scrubbed...
From reiser at namesys.com Wed Mar 3 09:35:12 2004 From: reiser at namesys.com (Hans Reiser) Date: Wed, 03 Mar 2004 12:35:12 +0300 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078301777.4446.5.camel@laptop.fenrus.com> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <1078266793.8582.24.camel@mentor.gurulabs.com> <20040302224758.GK19111@khan.acc.umu.se> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <20040303014115.GP19111@khan.acc.umu.se.suse.lists.linux.kernel> <20040303074756.A25861@infradead.org> <40459159.1090501@namesys.com> <1078301777.4446.5.camel@laptop.fenrus.com> Message-ID: <4045A6D0.6050203@namesys.com> Arjan van de Ven wrote: >On Wed, 2004-03-03 at 09:03, Hans Reiser wrote: > > >> I >>think V4 will be our last rewrite from scratch because of our plugins, >>and because of how easy we find the code to work on now. >> >> > >can we quote you on that 3 years from now ? ;-) > > Yes, I think so. We are going to add a nice little optimization for compiles to Reiser4 as a result of thinking about compile benchmarks. We are going to sort filenames (and their corresponding file bodies) whose penultimate character is . by their last character first. It seems this is optimal, and it is simple, and it is without any real world drawbacks. This is easy for us because of our plugin design.
-- Hans From felipe_alfaro at linuxmail.org Wed Mar 3 09:43:53 2004 From: felipe_alfaro at linuxmail.org (Felipe Alfaro Solana) Date: Wed, 03 Mar 2004 10:43:53 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <200403030700.57164.robin.rosenberg.lists@dewire.com> References: <4044119D.6050502@andrew.cmu.edu> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <200403030700.57164.robin.rosenberg.lists@dewire.com> Message-ID: <1078307033.904.1.camel@teapot.felipe-alfaro.com> On Wed, 2004-03-03 at 07:00, Robin Rosenberg wrote: > On Wednesday 03 March 2004 02:41, David Weinehall wrote: > > On Tue, Mar 02, 2004 at 08:30:32PM -0500, Andrew Ho wrote: > > > XFS is the best filesystem. > > > > Well it'd better be, it's 10 times the size of ext3, 5 times the size of > > ReiserFS and 3.5 times the size of JFS. > > > > And people say size doesn't matter. > > Recoverability matters to me. The driver could be 10 megabyte and > *I* would not care. XFS seems to stand no matter how rudely the OS > is knocked down. But XFS easily breaks down due to media defects. Once ago I used XFS, but I lost all data on one of my volumes due to a bad block on my hard disk. XFS was unable to recover from the error, and the XFS recovery tools were unable to deal with the error. From robin.rosenberg.lists at dewire.com Wed Mar 3 09:59:26 2004 From: robin.rosenberg.lists at dewire.com (Robin Rosenberg) Date: Wed, 3 Mar 2004 10:59:26 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078307033.904.1.camel@teapot.felipe-alfaro.com> References: <4044119D.6050502@andrew.cmu.edu> <200403030700.57164.robin.rosenberg.lists@dewire.com> <1078307033.904.1.camel@teapot.felipe-alfaro.com> Message-ID: <200403031059.26483.robin.rosenberg.lists@dewire.com> On Wednesday 03 March 2004 10:43, Felipe Alfaro Solana wrote: > But XFS easily breaks down due to media defects. 
Once ago I used XFS, > but I lost all data on one of my volumes due to a bad block on my hard > disk. XFS was unable to recover from the error, and the XFS recovery > tools were unable to deal with the error. What file systems work on defect media? As for crashed disks I rarely bothered trying to "fix" them anymore. I save what I can and restore what's backed up and recovery tools (other than the undo-delete ones) usually destroy what's left, but that's not unique to XFS. Depending on how good my backups are I sometimes try the recovery tools just to see, but that has never helped so far. -- robin From olaf at cbk.poznan.pl Wed Mar 3 10:13:18 2004 From: olaf at cbk.poznan.pl (Olaf =?iso-8859-2?Q?Fr=B1czyk?=) Date: Wed, 03 Mar 2004 11:13:18 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078307033.904.1.camel@teapot.felipe-alfaro.com> References: <4044119D.6050502@andrew.cmu.edu> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <200403030700.57164.robin.rosenberg.lists@dewire.com> <1078307033.904.1.camel@teapot.felipe-alfaro.com> Message-ID: <1078308797.2641.14.camel@venus.local.navi.pl> On Wed, 2004-03-03 at 10:43, Felipe Alfaro Solana wrote: > On Wed, 2004-03-03 at 07:00, Robin Rosenberg wrote: > > On Wednesday 03 March 2004 02:41, David Weinehall wrote: > > > On Tue, Mar 02, 2004 at 08:30:32PM -0500, Andrew Ho wrote: > > > > XFS is the best filesystem. > > > > > > Well it'd better be, it's 10 times the size of ext3, 5 times the size of > > > ReiserFS and 3.5 times the size of JFS. > > > > > > And people say size doesn't matter. > > > > Recoverability matters to me. The driver could be 10 megabyte and > > *I* would not care. XFS seems to stand no matter how rudely the OS > > is knocked down. > > But XFS easily breaks down due to media defects. Once ago I used XFS, > but I lost all data on one of my volumes due to a bad block on my hard > disk. 
XFS was unable to recover from the error, and the XFS recovery > tools were unable to deal with the error. You lost all data? Or did you just have to restore it from backup? If you didn't have a backup, that is your fault, not XFS's :) But even if you had no backup, why didn't you move your data (using dd or something else) to another drive, one without defects, and run recovery on the new drive? Regards, Olaf From felipe_alfaro at linuxmail.org Wed Mar 3 10:19:01 2004 From: felipe_alfaro at linuxmail.org (Felipe Alfaro Solana) Date: Wed, 03 Mar 2004 11:19:01 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <200403031059.26483.robin.rosenberg.lists@dewire.com> References: <4044119D.6050502@andrew.cmu.edu> <200403030700.57164.robin.rosenberg.lists@dewire.com> <1078307033.904.1.camel@teapot.felipe-alfaro.com> <200403031059.26483.robin.rosenberg.lists@dewire.com> Message-ID: <1078309141.863.3.camel@teapot.felipe-alfaro.com> On Wed, 2004-03-03 at 10:59, Robin Rosenberg wrote: > On Wednesday 03 March 2004 10:43, Felipe Alfaro Solana wrote: > > But XFS easily breaks down due to media defects. Once ago I used XFS, > > but I lost all data on one of my volumes due to a bad block on my hard > > disk. XFS was unable to recover from the error, and the XFS recovery > > tools were unable to deal with the error. > > What file systems work on defect media? It's not a matter of working: it's a matter of recovering. A bad disk block could potentially destroy a file or a directory, but it shouldn't make a filesystem unmountable or unrecoverable. > As for crashed disks I rarely bothered trying to "fix" them anymore. I save > what I can and restore what's backed up and recovery tools (other than > the undo-delete ones) usually destroy what's left, but that's not unique to > XFS. Depending on how good my backups are I sometimes try the recovery > tools just to see, but that has never helped so far.
The problem is that I couldn't save anything: the XFS volume refused to mount and the XFS recovery tools refused to fix anything. It was just a single disk bad block. For example in ext2/3 critical parts are replicated several times over the volume, so there's minimal chance of being unable to mount the volume and recover important files. From mg at sgi.com Wed Mar 3 10:24:10 2004 From: mg at sgi.com (Mike Gigante) Date: Wed, 3 Mar 2004 21:24:10 +1100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <200403031059.26483.robin.rosenberg.lists@dewire.com> Message-ID: On Wednesday 03 March 2004 10:43, Felipe Alfaro Solana wrote: > But XFS easily breaks down due to media defects. Once ago I used XFS, > but I lost all data on one of my volumes due to a bad block on my hard > disk. XFS was unable to recover from the error, and the XFS recovery > tools were unable to deal with the error. A single bad-block rendered the entire filesystem non-recoverable for XFS? Sounds difficult to believe since there is redundancy such as multiple copies of the superblock etc. I can believe you lost *some* data, but "lost all my data"??? -- I believe that you'd have to have had *considerably* more than "a bad block" :-) Mike From felipe_alfaro at linuxmail.org Wed Mar 3 13:07:16 2004 From: felipe_alfaro at linuxmail.org (Felipe Alfaro Solana) Date: Wed, 03 Mar 2004 14:07:16 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078308797.2641.14.camel@venus.local.navi.pl> References: <4044119D.6050502@andrew.cmu.edu> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <200403030700.57164.robin.rosenberg.lists@dewire.com> <1078307033.904.1.camel@teapot.felipe-alfaro.com> <1078308797.2641.14.camel@venus.local.navi.pl> Message-ID: <1078319235.1113.2.camel@teapot.felipe-alfaro.com> On Wed, 2004-03-03 at 11:13, Olaf Frączyk wrote: > > > Recoverability matters to me. The driver could be 10 megabyte and > > > *I* would not care.
XFS seems to stand no matter how rudely the OS > > > is knocked down. > > But XFS easily breaks down due to media defects. Once ago I used XFS, > > but I lost all data on one of my volumes due to a bad block on my hard > > disk. XFS was unable to recover from the error, and the XFS recovery > > tools were unable to deal with the error. > You lost all data? Or you just had to restore them from backup? If you > didn't have a backup it is your fault not XFS one :) Well, it was a testing machine with no important data, so I could afford to lose everything, which is what happened. > But even if you had no backup, why didn't you move your data (using dd > or something else) to another (without defects) drive, and run recovery > on new drive? I tried, but it proved more difficult than expected, since the computer was a laptop and I couldn't move the HDD to another computer. Using the distro rescue CD was useless, as its kernel didn't have XFS support. All in all, XFS recovery was a nightmare compared to ext3 recovery, for example. From felipe_alfaro at linuxmail.org Wed Mar 3 13:14:16 2004 From: felipe_alfaro at linuxmail.org (Felipe Alfaro Solana) Date: Wed, 03 Mar 2004 14:14:16 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: References: Message-ID: <1078319654.1113.10.camel@teapot.felipe-alfaro.com> On Wed, 2004-03-03 at 11:24, Mike Gigante wrote: > On Wednesday 03 March 2004 10:43, Felipe Alfaro Solana wrote: > > But XFS easily breaks down due to media defects. Once ago I used XFS, > > but I lost all data on one of my volumes due to a bad block on my hard > > disk. XFS was unable to recover from the error, and the XFS recovery > > tools were unable to deal with the error. > > A single bad-block rendered the entire filesystem non-recoverable > for XFS? Sounds difficult to believe since there is redundancy such > as multiple copies of the superblock etc. You should believe it... It was a combination of a power failure and some bad disk sectors.
Maybe it was just a kernel bug, after all, as this happened with 2.5 kernels: during kernel bootup, the kernel invoked XFS recovery but it failed due to media errors. > I can believe you lost *some* data, but "lost all my data"??? -- I > believe that you'd have to had had *considerably* more than > "a bad block" :-) It was exactly one disk block, at least that's what the low-level HDD diagnostic program for my IBM/Hitachi laptop drive told me. In fact, the HDD diagnostic was able to recover the media defects. That could have been one of those very improbable cases, but I lost the entire volume. Neither the kernel nor XFS tools were able to recover the XFS volume. However, I must say that I didn't try every single known way of performing the recovery, but recovery with ext2/3 is pretty straightforward. As I said, it could have been a kernel bug, or maybe I simply didn't understand the implications of recovery, but xfs_repair was totally unable to fix the problem. It instructed me to use "dd" to move the volume to a healthy disk and retry the operation, but it was not easy to do that as I explained before. From reiser at namesys.com Wed Mar 3 13:42:25 2004 From: reiser at namesys.com (Hans Reiser) Date: Wed, 03 Mar 2004 16:42:25 +0300 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <200403031059.26483.robin.rosenberg.lists@dewire.com> References: <4044119D.6050502@andrew.cmu.edu> <200403030700.57164.robin.rosenberg.lists@dewire.com> <1078307033.904.1.camel@teapot.felipe-alfaro.com> <200403031059.26483.robin.rosenberg.lists@dewire.com> Message-ID: <4045E0C1.9020806@namesys.com> Robin Rosenberg wrote: >On Wednesday 03 March 2004 10:43, Felipe Alfaro Solana wrote: > > >>But XFS easily breaks down due to media defects. Once ago I used XFS, >>but I lost all data on one of my volumes due to a bad block on my hard >>disk. XFS was unable to recover from the error, and the XFS recovery >>tools were unable to deal with the error. 
>> >> > >What file systems work on defect media? > >As for crashed disks I rarely bothered trying to "fix" them anymore. I save >what I can and restore what's backed up and recovery tools (other than >the undo-delete ones) usually destroy what's left, but that's not unique to >XFS. Depending on how good my backups are I sometimes try the recovery >tools just to see, but that has never helped so far. > >-- robin > > > > Never attempt to recover without first dd_rescue ing to a good hard drive, and doing the recovery there on good hard drive. -- Hans From reiser at namesys.com Wed Mar 3 14:16:16 2004 From: reiser at namesys.com (Hans Reiser) Date: Wed, 03 Mar 2004 17:16:16 +0300 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078319654.1113.10.camel@teapot.felipe-alfaro.com> References: <1078319654.1113.10.camel@teapot.felipe-alfaro.com> Message-ID: <4045E8B0.4090001@namesys.com> Felipe Alfaro Solana wrote: > > >As I said, it could have been a kernel bug, or maybe I simply didn't >understand the implications of recovery, but xfs_repair was totally >unable to fix the problem. It instructed me to use "dd" to move the >volume to a healthy disk and retry the operation, but it was not easy to >do that as I explained before. > > > > > I think that your expectation is unreasonable. XFS was designed for machines where popping in a working hard drive was feasible. Making a disk layout adaptable to any arbitrary block going bad is more work than you might think, and for their intended market (not laptops) they did the right thing. You can buy cables that allow you to connect laptop drives to desktops. 
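[Editorial note: the dd_rescue-first workflow Hans recommends above can be sketched as the following sequence. The device name, image path, and mount point are hypothetical, and exact dd_rescue options vary between versions, so treat this as an illustration rather than a recipe.]

```shell
# Image the failing disk onto a known-good drive first; unlike plain dd,
# dd_rescue keeps going past read errors instead of aborting.
dd_rescue /dev/hdc1 /mnt/good/hdc1.img

# Run all repair attempts against the copy, never against the original media.
e2fsck -f /mnt/good/hdc1.img

# If the copy comes back consistent, mount it read-only via loopback
# and salvage the data.
mount -o loop,ro /mnt/good/hdc1.img /mnt/recovered
```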
-- Hans From js at convergence.de Wed Mar 3 23:41:04 2004 From: js at convergence.de (Johannes Stezenbach) Date: Thu, 4 Mar 2004 00:41:04 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <4044B787.7080301@andrew.cmu.edu> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> Message-ID: <20040303234104.GD1875@convergence.de> Peter Nelson wrote: > Hans Reiser wrote: > > >Are you sure your benchmark is large enough to not fit into memory, > >particularly the first stages of it? It looks like not. reiser4 is > >much faster on tasks like untarring enough files to not fit into ram, > >but (despite your words) your results seem to show us as slower unless > >I misread them.... > > I'm pretty sure most of the benchmarking I am doing fits into ram, > particularly because my system has 1GB of it, but I see this as > realistic. When I download a bunch of debs (or rpms or the kernel) I'm > probably going to install them directly with them still in the file > cache. Same with rebuilding the kernel after working on it. OK, that test is not very interesting for the FS gurus because it doesn't stress the disk enough. Anyway, I have some related questions concerning disk/fs performance: o I see you are using an IDE disk with a large (8MB) write cache. My understanding is that enabling write cache is a risky thing for journaled file systems, so for a fair comparison you would have to enable the write cache for ext2 and disable it for all journaled file systems. It would be nice if someone with more profound knowledge could comment on this, but my understanding of the problem is: - journaled filesystems can only work when they can enforce that journal data is written to the platters at specific times wrt normal data writes - IDE write caching makes the disk "lie" to the kernel, i.e.
it says "I've written the data" when it was only put in the cache - now if a *power failure* keeps the disk from writing the cache contents to the platter, the fs and journal are inconsistent (a kernel crash would not cause this problem because the disk can still write the cache contents to the platters) - at next mount time the fs will read the journal from the disk and try to use it to bring the fs into a consistent state; however, since the journal on disk is not guaranteed to be up to date this can *fail* (I have no idea what various fs implementations do to handle this; I suspect they at least refuse to mount and require you to manually run fsck. Or they don't notice and let you work with a corrupt filesystem until they blow up.) Right? Or is this just paranoia? To me it looks like IDE write barrier support (http://lwn.net/Articles/65353/) would be a way to safely enable IDE write caches for journaled filesystems. Has anyone done any benchmarks concerning write cache and journaling? o And one totally different :-) question: Has anyone benchmarked fs performance on PATA IDE disks vs. otherwise comparable SCSI or SATA disks (I vaguely recall having read that SATA has working TCQ, i.e. not broken by design as with PATA)? I have read a few times that SCSI disks perform much better than IDE disks. The usual reason given is "SCSI disks are built for servers, IDE for desktops". Is this all, or is it TCQ that matters? Or is the Linux SCSI core better than the IDE core? Johannes From d_baron at 012.net.il Thu Mar 4 10:50:57 2004 From: d_baron at 012.net.il (David Baron) Date: Thu, 4 Mar 2004 11:50:57 +0100 Subject: warning: updated with obselete bdflush call Message-ID: <200403041150.57687.d_baron@012.net.il> Get this warning on bootup ext3 file checks on 2.6.* kernels. Apparently harmless, but how do I fix this? 
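[Editorial note: the IDE write-cache behaviour Johannes asks about earlier in the thread can be inspected and toggled per-drive with hdparm. This sketch assumes a drive at /dev/hda and must be run as root; -W is hdparm's write-caching flag.]

```shell
# Query the drive's current write-cache setting.
hdparm -W /dev/hda

# Disable the volatile write cache so the journal's ordering guarantees
# hold even across a power failure (at some cost in write throughput).
hdparm -W0 /dev/hda

# Re-enable it once write barriers are known to work on this setup.
hdparm -W1 /dev/hda
```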
From kris at koehntopp.de Thu Mar 4 09:28:07 2004 From: kris at koehntopp.de (Kristian =?iso-8859-15?q?K=F6hntopp?=) Date: Thu, 4 Mar 2004 10:28:07 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078309141.863.3.camel@teapot.felipe-alfaro.com> References: <4044119D.6050502@andrew.cmu.edu> <200403031059.26483.robin.rosenberg.lists@dewire.com> <1078309141.863.3.camel@teapot.felipe-alfaro.com> Message-ID: <200403041028.07235.kris@koehntopp.de> On Wednesday 03 March 2004 11:19, Felipe Alfaro Solana wrote: > The problem is that I couldn't save anything: the XFS volume refused to > mount and the XFS recovery tools refused to fix anything. It was just a > single disk bad block. For example in ext2/3 critical parts are > replicated several times over the volume, so there's minimal chance of > being unable to mount the volume and recover important files. That is a misconception. What is being replicated multiple times in ext2 is the superblock and the block group descriptors. But these are not really needed for recovery (as long as they have default values, which is the case in the vast majority of installations). What is not being replicated is the block allocation bitmap, inode allocation bitmap and the inodes themselves. By running "mke2fs -S" on a ext2 file system, you will rewrite all superblocks, all block group descriptors, and all allocation bitmaps, but leave the inodes themselves intact.
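[Editorial note: a loop-file sketch of the sequence Kristian describes, assuming e2fsprogs is installed; `test.img` is a scratch file standing in for the damaged device, and `-S` is only safe when mke2fs is run with the same parameters used for the original format.]

```shell
# Create a 16 MiB scratch file and put a default ext2 filesystem on it.
dd if=/dev/zero of=test.img bs=1M count=16 2>/dev/null
mke2fs -F -q test.img

# Rewrite superblocks and group descriptors only, leaving the inode
# tables and data blocks untouched (simulating Kristian's recovery step).
mke2fs -F -q -S test.img

# e2fsck then rebuilds the allocation bitmaps from the surviving inodes.
e2fsck -fy test.img
```

On a real disk the same two commands would be run against the partition device, after imaging it to a healthy drive first.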
You can recreate the filesystem from that with e2fsck, proving that the information from the replicated parts of the file systems is not really necessary. All that e2fsck needs to recover the system is the information from the inodes. If they are damaged (and they are not replicated), the files having inodes in damaged blocks cannot be recovered. Kristian From sct at redhat.com Thu Mar 4 11:44:29 2004 From: sct at redhat.com (Stephen C. Tweedie) Date: 04 Mar 2004 11:44:29 +0000 Subject: warning: updated with obselete bdflush call In-Reply-To: <200403041150.57687.d_baron@012.net.il> References: <200403041150.57687.d_baron@012.net.il> Message-ID: <1078400669.2035.9.camel@sisko.scot.redhat.com> Hi, On Thu, 2004-03-04 at 10:50, David Baron wrote: > Get this warning on bootup ext3 file checks on 2.6.* kernels. > > Apparently harmless, but how do I fix this? bdflush isn't needed on 2.6 at all. The message you get just after that error in the logs is printk(KERN_INFO "Fix your initscripts?\n"); which gives you a clue how to fix it: it's a matter of getting the init scripts not to try starting the bdflush daemon in the first place. But bdflush is a noop on 2.6, it's harmless to leave things as they are for now in case you want to dual-boot between 2.4 and 2.6. --Stephen From sct at redhat.com Thu Mar 4 11:49:13 2004 From: sct at redhat.com (Stephen C. Tweedie) Date: 04 Mar 2004 11:49:13 +0000 Subject: Ext3 problem - lost files/directorys In-Reply-To: References: Message-ID: <1078400953.2035.14.camel@sisko.scot.redhat.com> Hi, On Wed, 2004-03-03 at 22:24, bill root wrote: > e2fsck 1.34 (25-Jul-2003) > Group descriptors look bad... trying backup blocks... > Superblock has a bad ext3 journal (inode 8). > Root inode is not a directory. Clear? Bad news --- looks like the start of the inode table has been stomped on. 
> When i re-create then filesystem (with -S) and e2fsck i have all files in > lost+found (all with , not one file with his real name) > > So some questions : > - There are no way to recover with filename ? No, if the directories are lost then there is no way to recover filenames. > - Where are stored files name / directory's name ? In the parent directory. There is absolutely no record of the filename in the file itself --- there can't be, because when you have hard links, it is possible to have many names for the same inode. > - why, where i try another superblocks i got same results ? Because the superblocks don't hold any directory information, they just contain the overall description of the filesystem such as how large it is, the block size etc. > And last question, there are not backup of journal stored like superblocks? > I mean backup stored on harddrive ? The journal is a dynamic data structure being updated all the time; the data in it is only valid for a very short time. Having a long term backup of the journal contents just doesn't have any value. (Backing up the journal metadata --- where the journal is --- would make sense, though.) But ultimately, everything in the journal eventually makes it back to disk, so losing the journal shouldn't be a complete disaster for the fs (that's why fsck could still work after nuking the journal and converting to ext2.) Cheers, Stephen From d_baron at 012.net.il Thu Mar 4 13:07:13 2004 From: d_baron at 012.net.il (David Baron) Date: Thu, 4 Mar 2004 14:07:13 +0100 Subject: warning: updated with obselete bdflush call In-Reply-To: <1078400669.2035.9.camel@sisko.scot.redhat.com> References: <200403041150.57687.d_baron@012.net.il> <1078400669.2035.9.camel@sisko.scot.redhat.com> Message-ID: <200403041407.13272.d_baron@012.net.il> Thanks. I found references to this word in /etc/init.d/checkroot.sh and /etc/rcS.d/S10checkroot.sh When I am ready to delete 2.4, I will change this. 
A version minus the "update" call will do it, I think From cs at tequila.co.jp Fri Mar 5 01:59:33 2004 From: cs at tequila.co.jp (Clemens Schwaighofer) Date: Fri, 05 Mar 2004 10:59:33 +0900 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078309141.863.3.camel@teapot.felipe-alfaro.com> References: <4044119D.6050502@andrew.cmu.edu> <200403030700.57164.robin.rosenberg.lists@dewire.com> <1078307033.904.1.camel@teapot.felipe-alfaro.com> <200403031059.26483.robin.rosenberg.lists@dewire.com> <1078309141.863.3.camel@teapot.felipe-alfaro.com> Message-ID: <4047DF05.8080209@tequila.co.jp> Felipe Alfaro Solana wrote: | The problem is that I couldn't save anything: the XFS volume refused to | mount and the XFS recovery tools refused to fix anything. It was just a | single disk bad block. For example in ext2/3 critical parts are | replicated several times over the volume, so there's minimal chance of | being unable to mount the volume and recover important files. just my two cents here: if you have an XFS volume, then you mostly do more than just store your baby photos, so you should have a raid below (software or hardware), and then you don't worry about bad blocks, because a) you have a raid (probably with a hot spare drive) and b) a daily (or more often) backup. as for me, I stopped using reiser, jfs or xfs at home. why? too many negative experiences: bad blocks (xfs totally b0rked), reiserfs (similar things), and I didn't even try jfs. with ext3 it works very well. yes, I do have a crappy board with a sucky via chipset and some super super old hds, but with ext3 I have had NO single problem in 6 months (heavily knocking on wood here). all those high end journaling file systems are no good for home systems, in my opinion, but again, those are just my little two cents here -- Clemens Schwaighofer - IT Engineer & System Administration ========================================================== Tequila Japan, 6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN Tel: +81-(0)3-3545-7703 Fax: +81-(0)3-3545-7343 http://www.tequila.jp ========================================================== From pascal.gienger at uni-konstanz.de Thu Mar 4 14:37:47 2004 From: pascal.gienger at uni-konstanz.de (Pascal Gienger) Date: Thu, 4 Mar 2004 15:37:47 +0100 Subject: [Jfs-discussion] Re: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <1078307033.904.1.camel@teapot.felipe-alfaro.com> References: <4044119D.6050502@andrew.cmu.edu> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <200403030700.57164.robin.rosenberg.lists@dewire.com> <1078307033.904.1.camel@teapot.felipe-alfaro.com> Message-ID: <1078411067.40473f3b6b835@webmail.uni-konstanz.de> Quoting Felipe Alfaro Solana : > But XFS easily breaks down due to media defects.
Once ago I used > XFS, > but I lost all data on one of my volumes due to a bad block on my > hard > disk. XFS was unable to recover from the error, and the XFS recovery > tools were unable to deal with the error. 1. How long ago is "Once ago"? Did you report that to the xfs developers? 2. Speaking for servers, we live in a RAID and/or SAN-world. The media error issue is a non-issue. Just my $0.02, Pascal From perbu at linpro.no Thu Mar 4 20:43:44 2004 From: perbu at linpro.no (Per Andreas Buer) Date: Thu, 04 Mar 2004 21:43:44 +0100 Subject: [Jfs-discussion] Re: Desktop Filesystem Benchmarks in 2.6.3 References: <4044119D.6050502@andrew.cmu.edu> <40453538.8050103@animezone.org> <20040303014115.GP19111@khan.acc.umu.se> <200403030700.57164.robin.rosenberg.lists@dewire.com> <1078307033.904.1.camel@teapot.felipe-alfaro.com> <1078411067.40473f3b6b835@webmail.uni-konstanz.de> Message-ID: Pascal Gienger writes: > 2. Speaking for servers, we live in a RAID and/or SAN-world. The media > error issue is a non-issue. If your cooling system stops you will experience media errors. A filesystem which detects this halts the kernel really helps. -- There are only 10 different kinds of people in the world, those who understand binary, and those who don't. From leandro at dutra.fastmail.fm Fri Mar 5 13:45:59 2004 From: leandro at dutra.fastmail.fm (=?iso-8859-1?q?Leandro_Guimar=E3es_Faria_Corsetti_Dutra?=) Date: Fri, 05 Mar 2004 10:45:59 -0300 Subject: PROBLEM: log abort over RAID5 Message-ID: Several people already reported this at linux-kernel and elsewhere with no answers, I thought perhaps this would be a more adequate forum... [1.] One line summary of the problem: After I/O, journal is aborted and filesystems made read-only. [2.] Full description of the problem/report: One can't anymore write to the affected file systems. Upon investigation, ext3fs journal was aborted and affected filesystems are remounted read-only. 
Reboot prompts for fsck to be run manually, with fixes taking several prompts and minutes. [3.] Keywords (i.e., modules, networking, kernel): File systems, I/O, RAID, LVM, kernel. [4.] Kernel version (from /proc/version): Couldn't find /proc/version, uname -a gives: Linux mercurio 2.6.2-1-686-smp #1 SMP Sat Feb 7 15:27:45 EST 2004 i686 GNU/Linux [5.] Output of Oops.. message (if applicable) with symbolic information resolved (see Documentation/oops-tracing.txt) N/A [6.] A small shell script or example program which triggers the problem (if possible) N/A [7.] Environment [7.1.] Software (add the output of the ver_linux script here) [7.2.] Processor information (from /proc/cpuinfo): processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Pentium(R) 4 CPU 2.40GHz stepping : 9 cpu MHz : 2394.494 cache size : 512 KB physical id : 0 siblings : 2 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid bogomips : 4718.59 processor : 1 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Pentium(R) 4 CPU 2.40GHz stepping : 9 cpu MHz : 2394.494 cache size : 512 KB physical id : 0 siblings : 2 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid bogomips : 4784.12 [7.3.] 
Module information (from /proc/modules): ipv6 273856 25 - Live 0xf8d4e000 i810_audio 35476 0 - Live 0xf8cc9000 ac97_codec 19340 1 i810_audio, Live 0xf8cc3000 ide_scsi 15716 0 - Live 0xf8bf1000 parport_pc 37228 1 - Live 0xf8cf4000 lp 11844 0 - Live 0xf8ca9000 parport 45608 2 parport_pc,lp, Live 0xf8ce7000 ext2 75528 3 - Live 0xf8cd3000 dm_mod 42304 8 - Live 0xf8c66000 raid5 23392 1 - Live 0xf8c06000 xor 15848 1 raid5, Live 0xf8b40000 md 48680 2 raid5, Live 0xf8caf000 i810fb 33088 0 - Live 0xf8bfc000 vgastate 10272 1 i810fb, Live 0xf8b91000 e100 65448 0 - Live 0xf8c55000 ntfs 92044 0 - Live 0xf8c3d000 cifs 178380 0 - Live 0xf8c72000 autofs4 17344 0 - Live 0xf8bdc000 smbfs 70424 0 - Live 0xf8c2a000 udf 105188 0 - Live 0xf8c0f000 vfat 16512 0 - Live 0xf8bd6000 msdos 11424 0 - Live 0xf8b8d000 fat 48064 2 vfat,msdos, Live 0xf8be4000 aes 32704 0 - Live 0xf8ba2000 md5 4096 1 - Live 0xf8932000 quota_v2 9536 0 - Live 0xf8b80000 binfmt_misc 11400 0 - Live 0xf8b7c000 microcode 8160 0 - Live 0xf8b4d000 snd_intel8x0 33220 0 - Live 0xf8b98000 snd_ac97_codec 55556 1 snd_intel8x0, Live 0xf8bc7000 snd_pcm 104640 1 snd_intel8x0, Live 0xf8bac000 snd_timer 26788 1 snd_pcm, Live 0xf8b85000 snd_page_alloc 12100 2 snd_intel8x0,snd_pcm, Live 0xf8b16000 snd_mpu401_uart 8352 1 snd_intel8x0, Live 0xf8b3c000 snd_rawmidi 25536 1 snd_mpu401_uart, Live 0xf8b45000 snd_seq_device 8296 1 snd_rawmidi, Live 0xf8b38000 snd 55012 7 snd_intel8x0,snd_ac97_codec,snd_pcm,snd_timer,snd_mpu401_uart,snd_rawmidi,snd_seq_device, Live 0xf8b6d000 soundcore 10752 2 i810_audio,snd, Live 0xf8b0e000 firmware_class 9440 0 - Live 0xf8b12000 cpufreq_powersave 1824 0 - Live 0xf8959000 usbkbd 7552 0 - Live 0xf89ed000 usbmouse 5792 0 - Live 0xf8b0b000 hid 33792 0 - Live 0xf8b2e000 usbcore 114652 4 usbkbd,usbmouse,hid, Live 0xf8b50000 i2c_i810 4740 0 - Live 0xf8b02000 i2c_algo_bit 10024 1 i2c_i810, Live 0xf8b07000 i2c_core 23844 1 i2c_algo_bit, Live 0xf8af4000 i830 75756 2 - Live 0xf8b1a000 raw 8192 0 - Live 0xf89d9000 
pcspkr 3724 0 - Live 0xf8952000 psmouse 20264 0 - Live 0xf8afc000 pcips2 4352 0 - Live 0xf89d6000 analog 11776 0 - Live 0xf8af0000 gameport 5120 1 analog, Live 0xf89cd000 intel_agp 18684 1 - Live 0xf89d0000 nls_cp850 4960 0 - Live 0xf89a6000 nls_cp437 5792 0 - Live 0xf89a3000 nls_iso8859_1 4128 0 - Live 0xf89a0000 nls_iso8859_15 4704 0 - Live 0xf899d000 nls_utf8 2144 0 - Live 0xf8950000 nfs 169244 2 - Live 0xf8a14000 nfsd 181600 0 - Live 0xf8a41000 exportfs 7328 1 nfsd, Live 0xf8927000 lockd 64168 3 nfs,nfsd, Live 0xf89dc000 sunrpc 140328 9 nfs,nfsd,lockd, Live 0xf89f0000 coda 42660 0 - Live 0xf89bc000 mousedev 10264 1 - Live 0xf8955000 nbd 21256 0 - Live 0xf8996000 loop 18504 0 - Live 0xf8965000 floppy 62516 0 - Live 0xf89ab000 agpgart 33132 3 intel_agp, Live 0xf895b000 fan 4140 0 - Live 0xf892a000 thermal 13360 0 - Live 0xf894b000 processor 14608 1 thermal, Live 0xf892d000 ide_detect 1312 0 - Live 0xf8825000 ide_cd 42404 0 - Live 0xf893f000 ide_core 163600 3 ide_scsi,ide_detect,ide_cd, Live 0xf896d000 cdrom 39168 1 ide_cd, Live 0xf8934000 rtc 14120 0 - Live 0xf882b000 ext3 123880 7 - Live 0xf8844000 jbd 70424 1 ext3, Live 0xf8899000 mbcache 10212 2 ext2,ext3, Live 0xf8830000 sd_mod 17024 10 - Live 0xf883e000 BusLogic 82620 7 - Live 0xf8883000 scsi_mod 120208 3 ide_scsi,sd_mod,BusLogic, Live 0xf8864000 unix 31216 543 - Live 0xf8835000 font 8544 0 - Live 0xf8827000 cfbcopyarea 4192 1 i810fb, Live 0xf8820000 cfbimgblt 3360 1 i810fb, Live 0xf881e000 cfbfillrect 4096 1 i810fb, Live 0xf881c000 [7.4.] 
Loaded driver and hardware information (/proc/ioports, /proc/iomem) 0000-001f : dma1 0020-0021 : pic1 0040-005f : timer 0060-006f : keyboard 0070-0077 : rtc 0080-008f : dma page reg 00a0-00a1 : pic2 00c0-00df : dma2 00f0-00ff : fpu 01f0-01f7 : ide0 0378-037a : parport0 03c0-03df : vga+ 03f6-03f6 : ide0 03f8-03ff : serial 04d0-04d1 : pnp 00:0d 0cf8-0cff : PCI conf1 b800-b8ff : 0000:01:00.0 b800-b8ff : BusLogic BT-950 bc00-bc3f : 0000:01:08.0 bc00-bc3f : e100 c400-c41f : 0000:00:1f.3 c800-c81f : 0000:00:1d.0 cc00-cc1f : 0000:00:1d.1 d000-d01f : 0000:00:1d.2 d400-d41f : 0000:00:1d.3 d800-d80f : 0000:00:1f.2 dc00-dc03 : 0000:00:1f.2 e000-e007 : 0000:00:1f.2 e400-e403 : 0000:00:1f.2 e800-e807 : 0000:00:1f.2 ec00-ec07 : 0000:00:02.0 ffa0-ffaf : 0000:00:1f.1 00000000-0009fbff : System RAM 0009fc00-0009ffff : reserved 000a0000-000bffff : Video RAM area 000ca800-000cf7ff : Extension ROM 000f0000-000fffff : System ROM 00100000-3ef2fbff : System RAM 00100000-0028fdc1 : Kernel code 0028fdc2-0032e31f : Kernel data 3ef2fc00-3ef2ffff : ACPI Non-volatile Storage 3ef30000-3ef3ffff : ACPI Tables 3ef40000-3efeffff : ACPI Non-volatile Storage 3eff0000-3effffff : reserved 3f000000-3f0003ff : 0000:00:1f.1 f0000000-f7ffffff : 0000:00:02.0 f8000000-fbffffff : 0000:00:00.0 fe00f000-fe00ffff : 0000:01:00.0 fecf0000-fecf0fff : reserved fed20000-fed9ffff : reserved ff8fe000-ff8fefff : 0000:01:08.0 ff8fe000-ff8fefff : e100 ffa7f400-ffa7f4ff : 0000:00:1f.5 ffa7f400-ffa7f4ff : Intel ICH5 - Controller ffa7f800-ffa7f9ff : 0000:00:1f.5 ffa7f800-ffa7f9ff : Intel ICH5 - AC'97 ffa7fc00-ffa7ffff : 0000:00:1d.7 ffa80000-ffafffff : 0000:00:02.0 [7.5.] PCI information ('lspci -vvv' as root) /sys $ lspci -vvv 00:00.0 Host bridge: Intel Corp. 82865G/PE/P Processor to I/O Controller (rev 02) Subsystem: Intel Corp. 
82865G/PE/P Processor to I/O Controller Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 00:02.0 VGA compatible controller: Intel Corp. 82865G Integrated Graphics Device (rev 02) (prog-if 00 [VGA]) Subsystem: Intel Corp.: Unknown device 4c43 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 00:1d.0 USB Controller: Intel Corp. 82801EB USB (rev 02) (prog-if 00 [UHCI]) Subsystem: Intel Corp.: Unknown device 4c43 Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB PCI Bridge (rev c2) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B- 00:1f.0 ISA bridge: Intel Corp. 82801EB LPC Interface Controller (rev 02) Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- Region 1: I/O ports at Region 2: I/O ports at Region 3: I/O ports at Region 4: I/O ports at ffa0 [size=16] Region 5: Memory at 3f000000 (32-bit, non-prefetchable) [disabled] [size=1K] 00:1f.2 IDE interface: Intel Corp. 
82801EB Ultra ATA Storage Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO]) Subsystem: Intel Corp.: Unknown device 4c43 Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- 01:00.0 SCSI storage controller: BusLogic Flashpoint LT (rev 02) Subsystem: BusLogic Flashpoint LT Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- /sys $ [7.6.] SCSI information (from /proc/scsi/scsi) Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 00 Vendor: SEAGATE Model: ST336607LW Rev: 0006 Type: Direct-Access ANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 02 Lun: 00 Vendor: SEAGATE Model: ST336607LW Rev: 0006 Type: Direct-Access ANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 03 Lun: 00 Vendor: SEAGATE Model: ST336607LW Rev: 0006 Type: Direct-Access ANSI SCSI revision: 03 [7.7.] Other information that might be relevant to the problem (please look in /proc and include all information that you think to be relevant): /var/log/kern.log: Feb 26 06:25:24 mercurio kernel: EXT3-fs error (device dm-2): ext3_readdir: bad entry in directory #381585: directory entry across blocks - offset=0, inode=0, rec_len=4132, name_len=63 Feb 26 06:25:24 mercurio kernel: Aborting journal on device dm-2. Feb 26 06:25:24 mercurio kernel: ext3_abort called. Feb 26 06:25:24 mercurio kernel: EXT3-fs abort (device dm-2): ext3_journal_start: Detected aborted journal Feb 26 06:25:24 mercurio kernel: Remounting filesystem read-only [...] Feb 26 15:57:21 mercurio kernel: EXT3-fs error (device dm-2) in start_transaction: Journal has aborted Feb 26 15:57:21 mercurio kernel: EXT3-fs error (device dm-2) in ext3_delete_inode: Journal has aborted [X.] 
Other notes, patches, fixes, workarounds: No known patch or fix, not enough knowledge or skill, sorry... Have seen various reports by Googling around, from 2.5 something up to 2.6.3. -- Leandro Guimarães Faria Corsetti Dutra +55 (11) 5685 2219 Av Sgto Geraldo Santana, 1100 6/71 +55 (11) 5686 9607 04.674-000 São Paulo, SP BRASIL http://br.geocities.com./lgcdutra/ From bothie at gmx.de Fri Mar 5 16:18:05 2004 From: bothie at gmx.de (Bodo Thiesen) Date: Fri, 5 Mar 2004 17:18:05 +0100 Subject: PROBLEM: log abort over RAID5 In-Reply-To: References: Message-ID: <200403051617.i25GH0b31475@mx1.redhat.com> Leandro Guimarães Faria Corsetti Dutra wrote: > After I/O, journal is aborted and filesystems made read-only. > > One can't anymore write to the affected file systems. Upon >investigation, ext3fs journal was aborted and affected filesystems are >remounted read-only. Reboot prompts for fsck to be run manually, with >fixes taking several prompts and minutes. >/var/log/kern.log: > >Feb 26 06:25:24 mercurio kernel: EXT3-fs error (device dm-2): >ext3_readdir: bad entry in directory #381585: directory entry across >blocks - offset=0, inode=0, rec_len=4132, name_len=63 Feb 26 06:25:24 >mercurio kernel: Aborting journal on device dm-2. Feb 26 06:25:24 mercurio >kernel: ext3_abort called. Feb 26 06:25:24 mercurio kernel: EXT3-fs abort >(device dm-2): ext3_journal_start: Detected aborted journal Feb 26 >06:25:24 mercurio kernel: Remounting filesystem read-only [...] >Feb 26 15:57:21 mercurio kernel: EXT3-fs error (device dm-2) in >start_transaction: Journal has aborted Feb 26 15:57:21 mercurio kernel: >EXT3-fs error (device dm-2) in ext3_delete_inode: Journal has aborted 1. Check if the hard disk has bad blocks. # badblocks -b If you have errors here, then ask again what to do on this list - I don't know how to recover from errors like bad blocks in the journal ;-) Assuming badblocks didn't report any errors, continue: 2. Unmount the filesystem 3.
Run e2fsck If that fails, report it here and don't continue ;-) 4. Try to remount the filesystem as ext3. If that works, you should be done. 5. If that doesn't work then: Report the error messages here. You may continue or wait for answers ... 6. disable usage of journal: tune2fs -O ^has_journal 7. mount the file system as ext2 and update /etc/fstab according to that. Regards, Bodo From leandro at dutra.fastmail.fm Fri Mar 5 21:05:19 2004 From: leandro at dutra.fastmail.fm (=?iso-8859-1?q?Leandro_Guimar=E3es_Faria_Corsetti_Dutra?=) Date: Fri, 05 Mar 2004 18:05:19 -0300 Subject: PROBLEM: log abort over RAID5 References: <200403051617.i25GH0b31475@mx1.redhat.com> Message-ID: On Fri, 05 Mar 2004 17:18:05 +0100, Bodo Thiesen wrote: > # badblocks -b > [...] > 2. Unmount the filesystem BTW badblocks complains I have to unmount the filesystem anyway... -- Leandro Guimarães Faria Corsetti Dutra +55 (11) 5685 2219 Av Sgto Geraldo Santana, 1100 6/71 +55 (11) 5686 9607 04.674-000 São Paulo, SP BRASIL http://br.geocities.com./lgcdutra/ From leandro at dutra.fastmail.fm Fri Mar 5 21:00:25 2004 From: leandro at dutra.fastmail.fm (=?iso-8859-1?q?Leandro_Guimar=E3es_Faria_Corsetti_Dutra?=) Date: Fri, 05 Mar 2004 18:00:25 -0300 Subject: PROBLEM: log abort over RAID5 References: <200403051617.i25GH0b31475@mx1.redhat.com> Message-ID: On Fri, 05 Mar 2004 17:18:05 +0100, Bodo Thiesen wrote: > 1. Check if the hard disk has bad blocks. I will follow all your instructions, but if you Google around you'll see it is a surprisingly common problem that disappears once one turns logging off or goes back to Linux 2.4.X. That, coupled with my having already run stress tests on these spanking new SCSI disks, makes me think we'll find nothing. Will report shortly.
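[Editorial note: Bodo's numbered recovery procedure above can be sketched as one shell function. This is only a sketch under assumptions — the device and mount point are placeholders, step 5 (report back to the list and wait) is human judgment and is not encoded, and you should run badblocks/e2fsck with the options appropriate to your setup.]

```shell
# A sketch of the recovery steps above as a single shell function.
# Usage: recover_ext3 <device> <mountpoint> -- both arguments are placeholders.
recover_ext3() {
    dev=$1; mnt=$2
    badblocks -v "$dev" || return 1       # 1. surface scan (read-only test)
    umount "$dev" 2>/dev/null             # 2. unmount the filesystem
    e2fsck -f "$dev" || return 1          # 3. run e2fsck; stop and report on failure
    if mount -t ext3 "$dev" "$mnt"; then  # 4. try remounting as ext3
        return 0                          #    works: you should be done
    fi
    tune2fs -O ^has_journal "$dev"        # 6. disable the journal as a fallback
    mount -t ext2 "$dev" "$mnt"           # 7. mount as ext2 (update /etc/fstab too)
}
```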
-- Leandro Guimarães Faria Corsetti Dutra +55 (11) 5685 2219 Av Sgto Geraldo Santana, 1100 6/71 +55 (11) 5686 9607 04.674-000 São Paulo, SP BRASIL http://br.geocities.com./lgcdutra/ From bothie at gmx.de Fri Mar 5 21:17:01 2004 From: bothie at gmx.de (Bodo Thiesen) Date: Fri, 5 Mar 2004 22:17:01 +0100 Subject: PROBLEM: log abort over RAID5 In-Reply-To: References: <200403051617.i25GH0b31475@mx1.redhat.com> Message-ID: <200403052116.i25LGJb10549@mx1.redhat.com> Leandro Guimarães Faria Corsetti Dutra wrote: > On Fri, 05 Mar 2004 17:18:05 +0100, Bodo Thiesen wrote: > >> # badblocks -b >[...] >> 2. Unmount the filesystem > > BTW badblocks complains I have to unmount the filesystem anyway... Strange, in general badblocks does work fine on mounted filesystems in the read-only test ... # grep "hdc " /etc/fstab /dev/hdc / ext3 defaults 1 2 # badblocks -v /dev/hdc Checking for bad blocks in read-only mode From block 0 to 156290904 Regards, Bodo From daniel at rimspace.net Fri Mar 5 23:26:46 2004 From: daniel at rimspace.net (Daniel Pittman) Date: Sat, 06 Mar 2004 10:26:46 +1100 Subject: e2image, ext3 and nightly backups. Message-ID: <87y8qeg9jd.fsf@enki.rimspace.net> I have been looking at integrating the e2image tool into my nightly backup routines for my systems, to improve the odds that I can get data back if something disastrous happens to my file system. I have a couple of questions about this, though, to work out if this is actually worth doing. Is e2image worth running if the file system is online and in use, under the 2.6 series kernels, as part of a nightly backup run? If there /is/ a risk of the data being incomplete or incorrect, is there anything that can be done to make this less likely or to detect the issue, other than adding LVM and using a snapshot to image from? Does e2image capture all the information necessary to support ext3 file systems?
It looks to capture everything except the journal content; will this cause problems later if, say, the journal inode is destroyed but the content isn't? What if the journal inode /and/ content are destroyed? Thanks for the advice, Daniel -- Beware of all enterprises that require new clothes. -- Henry David Thoreau From mason at suse.com Sat Mar 6 00:16:22 2004 From: mason at suse.com (Chris Mason) Date: Fri, 05 Mar 2004 19:16:22 -0500 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <20040303234104.GD1875@convergence.de> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <20040303234104.GD1875@convergence.de> Message-ID: <1078532181.25062.144.camel@watt.suse.com> On Wed, 2004-03-03 at 18:41, Johannes Stezenbach wrote: > Peter Nelson wrote: > > Hans Reiser wrote: > > > > >Are you sure your benchmark is large enough to not fit into memory, > > >particularly the first stages of it? It looks like not. reiser4 is > > >much faster on tasks like untarring enough files to not fit into ram, > > >but (despite your words) your results seem to show us as slower unless > > >I misread them.... > > > > I'm pretty sure most of the benchmarking I am doing fits into ram, > > particularly because my system has 1GB of it, but I see this as > > realistic. When I download a bunch of debs (or rpms or the kernel) I'm > > probably going to install them directly with them still in the file > > cache. Same with rebuilding the kernel after working on it. > > OK, that test is not very interesting for the FS gurus because it > doesn't stress the disk enough. > > Anyway, I have some related questions concerning disk/fs performance: > > o I see you are using and IDE disk with a large (8MB) write cache. > > My understanding is that enabling write cache is a risky > thing for journaled file systems, so for a fair comparison you > would have to enable the write cache for ext2 and disable it > for all journaled file systems. 
> > It would be nice if someone with more profound knowledge could comment > on this, but my understanding of the problem is: > Jens just sent me an updated version of his IDE barrier code, and I'm adding support for reiserfs and ext3 to it this weekend. It's fairly trivial to add support for each FS, I just don't know the critical sections of the others as well. The SUSE 2.4 kernels have had various forms of the patch, it took us a while to get things right. It does impact performance slightly, since we are forcing cache flushes that otherwise would not have been done. The common workloads don't slow down with the patch, fsync heavy workloads typically lose around 10%. -chris From bothie at gmx.de Sat Mar 6 01:33:58 2004 From: bothie at gmx.de (Bodo Thiesen) Date: Sat, 6 Mar 2004 02:33:58 +0100 Subject: e2image, ext3 and nightly backups. In-Reply-To: <87y8qeg9jd.fsf@enki.rimspace.net> References: <87y8qeg9jd.fsf@enki.rimspace.net> Message-ID: <200403060132.i261Wdb15555@mx1.redhat.com> Daniel Pittman wrote: > Is e2image worth running if the file system is online and in use, under > the 2.6 series kernels, as part of a nightly backup run? Yes IMHO (but on other kernel versions, too). > If there /is/ a risk of the data being incomplete or incorrect, There *is* (unless you can -o remount,ro before running e2image). > is there > anything that can be done to make this less likely or to detect the > issue, The sledgehammer-method: 1. e2image to a file called a 2. e2image to a file called b 3. Compare the files - if they are identical you are done - remove file b. 4. remove file a 5. rename file b to file a 6. go on at step 2 > Does e2image capture all the information necessary to support ext3 file > systems? The only difference between ext2 and ext3 is the journal. It would be fatal to replay that some days later from a backed up version. So it's senseless at all to backup the journal. 
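[Editorial note: Bodo's "sledgehammer method" above — image the filesystem twice and trust the result only when two consecutive images compare identical — can be sketched as a small shell loop. The output path is whatever you choose; nothing here is specific to one device, and this is a sketch of his steps, not a tested tool.]

```shell
# Repeat e2image until two consecutive images of the device compare equal,
# i.e. the metadata did not change while we were imaging it.
# Usage: stable_image <device> <output-image>
stable_image() {
    dev=$1; out=$2
    e2image "$dev" "$out"                 # step 1: first image ("file a")
    while :; do
        e2image "$dev" "$out.new"         # step 2: second image ("file b")
        if cmp -s "$out" "$out.new"; then
            rm -f "$out.new"              # step 3: identical -- done, keep one copy
            return 0
        fi
        mv "$out.new" "$out"              # steps 4+5: keep the newer image
    done                                  # step 6: and try again
}
```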
> It looks to capture everything except the journal content; will this > cause problems later if, say, the journal inode is destroyed but the > content isn't? What if the journal inode /and/ content are destroyed? See above. In general e2image is no replacement for e2fsck. If the filesystem gets so horribly broken that e2fsck cannot repair it, then the data captured via e2image can be used to rescue the files. But as that will make available only files which are some days old, you shouldn't bother about the journal at all. Think about it this way: In general you will never need the output of e2image at all. So if it is only a little bit incomplete (missing journal or something similar) it's not worth worrying about at all. Regards, Bodo From tytso at mit.edu Sat Mar 6 05:01:11 2004 From: tytso at mit.edu (Theodore Ts'o) Date: Sat, 6 Mar 2004 00:01:11 -0500 Subject: e2image, ext3 and nightly backups. In-Reply-To: <200403060132.i261Wdb15555@mx1.redhat.com> References: <87y8qeg9jd.fsf@enki.rimspace.net> <200403060132.i261Wdb15555@mx1.redhat.com> Message-ID: <20040306050111.GB18647@thunk.org> On Sat, Mar 06, 2004 at 02:33:58AM +0100, Bodo Thiesen wrote: > See above. In general e2image is no replacement for e2fsck. If the filesystem > gets so horribly broken that e2fsck cannot repair it, then the > data captured via e2image can be used to rescue the files. But as that will > make available only files which are some days old, you shouldn't bother > about the journal at all. Think about it this way: In general you will > never need the output of e2image at all. So if it is only a little bit > incomplete (missing journal or something similar) it's not worth worrying > about at all. It depends. If your inode table gets completely trashed, the inode table in the e2image file can give you a last-ditch chance to try to recover certain critical files. Sure, it won't have files that were created after the e2image file was created, but that's true of any backup.
If you're not doing regular file backups, then e2image is a nice additional safety measure. Of course, you *should* be doing regular file backups, in which case e2image isn't really necessary, unless you want to do e2image backups more frequently than you are willing to do data backups. - Ted From Nicolas.Kowalski at imag.fr Fri Mar 5 14:20:25 2004 From: Nicolas.Kowalski at imag.fr (Nicolas.Kowalski at imag.fr) Date: Fri, 5 Mar 2004 14:20:25 +0000 (UTC) Subject: unexpected dirty buffer Message-ID: Hello. On a server running 2.4.25, I have the following two errors in the kernel logfile: Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707) Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707) Should I worry about them (disk failure, filesystem damage)? Thanks. In addition, what does the pair '08:11' mean? Is this the major/minor of the hard disk partition where the filesystem is located? -- Nicolas From pavel at suse.cz Fri Mar 5 18:46:46 2004 From: pavel at suse.cz (Pavel Machek) Date: Fri, 5 Mar 2004 19:46:46 +0100 Subject: Desktop Filesystem Benchmarks in 2.6.3 In-Reply-To: <20040303234104.GD1875@convergence.de> References: <4044119D.6050502@andrew.cmu.edu> <4044366B.3000405@namesys.com> <4044B787.7080301@andrew.cmu.edu> <20040303234104.GD1875@convergence.de> Message-ID: <20040305184643.GA4758@openzaurus.ucw.cz> Hi! > It would be nice if someone with more profound knowledge could comment > on this, but my understanding of the problem is: > > - journaled filesystems can only work when they can enforce that > journal data is written to the platters at specific times wrt > normal data writes > - IDE write caching makes the disk "lie" to the kernel, i.e.
it says > "I've written the data" when it was only put in the cache > - now if a *power failure* keeps the disk from writing the cache > contents to the platter, the fs and journal are inconsistent > (a kernel crash would not cause this problem because the disk can > still write the cache contents to the platters) > - at next mount time the fs will read the journal from the disk > and try to use it to bring the fs into a consistent state; > however, since the journal on disk is not guaranteed to be up to date > this can *fail* (I have no idea what various fs implementations do > to handle this; I suspect they at least refuse to mount and require > you to manually run fsck. Or they don't notice and let you work > with a corrupt filesystem until they blow up.) > > Right? Or is this just paranoia? Twice a year I fsck my reiser drives, and yes there's some corruption there. So you are right, and it's not paranoia. -- 64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms From ivandi at vamo.bg Sat Mar 6 13:00:58 2004 From: ivandi at vamo.bg (Ivan Ivanov) Date: Sat, 6 Mar 2004 15:00:58 +0200 (EET) Subject: Desktop Filesystem Benchmarks in 2.6.3 Message-ID: <48966.212.36.18.2.1078578058.squirrel@mail.vamo.bg> I don't think that XFS is a desktop filesystem at all. This is from the XFS FAQ: quote ------------ Q: Why do I see binary NULLS in some files after recovery when I unplugged the power? If it hurts don't do that! * NOTE: XFS 1.1 and kernels >= 2.4.18 have the asynchronous delete path which means that you will see a lot less of these problems. If you still have not updated to the 1.1 release or later, now would be a good time! Basically this is normal behavior. XFS journals metadata updates, not data updates. After a crash you are supposed to get a consistent filesystem which looks like the state sometime shortly before the crash, NOT what the in memory image looked like the instant before the crash.
Since XFS does not write data out to disk immediately unless you tell it to with fsync or an O_SYNC open (the same is true of other filesystems), you are looking at an inode which was flushed out to disk, but for which the data was never flushed to disk. You will find that the inode is not taking any disk space since all it has is a size, there are no disk blocks allocated for it yet. The same will apply to other metadata-only journaling filesystems. The current Linux kernel VM will write out the metadata after 1/60th of a second and the data after 30 seconds. So the possibility of losing data when unplugging the power within 30 seconds is quite large. The only way of being sure that your data will get to the disk is using fsync in the program or sync after closing the program. ------------ I have been trying XFS since 2.4.6 and I can reproduce this case easily. Simply write some file and unplug the power during the write. And we are talking about the desktop :). XFS is the worst case for recovery too. For a desktop filesystem speed is not mandatory. Hard disk speed is most important. And much more important is recovery. So I think that ext3 is the best solution for a desktop system. It performs well and is the most recoverable Linux filesystem.
If you still > have not updated to the 1.1 release or later, now would be a good time! > > Basically this is normal behavior. XFS journals metadata updates, not > data updates. After a crash you are supposed to get a consistent > filesystem which looks like the state sometime shortly before the crash, > NOT what the in memory image looked like the instant before the crash. > Since XFS does not write data out to disk immediately unless you tell it > to with fsync or an O_SYNC open (the same is true of other filesystems), > you are looking at an inode which was flushed out to disk, but for which > the data was never flushed to disk. You will find that the inode is not > taking any disk space since all it has is a size, there are no disk > blocks allocated for it yet. > > The same will apply to other metadata-only journaling filesystems. The > current Linux kernel VM will write out the metadata after 1/60th of a > second and the data after 30 seconds. So the possibility of losing data > when unplugging the power within 30 seconds is quite large. The only way > of being sure that your data will get to the disk is using fsync in the > program or sync after closing the program. > ------------ > > I have been trying XFS since 2.4.6 and I can reproduce this case easily. Simply > write some file and unplug the power during the write. And we are talking > about the desktop :). XFS is the worst case for recovery too. > > For a desktop filesystem speed is not mandatory. Hard disk speed is most > important. And much more important is recovery. So I think that ext3 is > the best solution for a desktop system. It performs well and is the most > recoverable Linux filesystem. Agreed. Thanks to Suse, reiserfs v3 will have an ordered mode like ext3, and reiser4 does data journaling by default. Until the other filesystems write out the data before the inodes are journaled (i.e. ordered mode), they are not suitable in an environment where the power can go out unexpectedly.
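[Editorial note: the FAQ's point quoted above — only fsync(2), or an explicit sync, guarantees the data has actually been handed to the disk — can be tried from the shell. The file name is a placeholder, and conv=fsync is a GNU dd option; other dd implementations may not have it.]

```shell
# Write 16 KiB and have GNU dd call fsync(2) on the output file before it
# exits, so the data has been pushed toward the platters -- not merely
# left dirty in the page cache -- by the time the command returns.
dd if=/dev/zero of=scratch.dat bs=4096 count=4 conv=fsync
sync               # flush anything else still dirty, belt and braces
rm -f scratch.dat  # clean up the scratch file
```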
Mike From mfedyk at matchmail.com Sun Mar 7 05:35:18 2004 From: mfedyk at matchmail.com (Mike Fedyk) Date: Sat, 06 Mar 2004 21:35:18 -0800 Subject: unexpected dirty buffer In-Reply-To: References: Message-ID: <404AB496.80603@matchmail.com> Nicolas.Kowalski at imag.fr wrote: > Hello. > > On a server running 2.4.25, I have the two following errors in the > kernel logfile: > > Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707) > Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707) > It's probably a driver problem. > > Should I worry about them (disk failure, filesystem damage) ? It's possible, depending on which driver is having trouble. Did you run fsck? > > Thanks. > > > As an addition what does the pair '08:11' means ? Is this the > major/minor of the hard disk partition where the filesystem is located ? Yes, check /proc/partitions. Mike From Nicolas.Kowalski at imag.fr Sun Mar 7 10:52:08 2004 From: Nicolas.Kowalski at imag.fr (Nicolas Kowalski) Date: Sun, 07 Mar 2004 11:52:08 +0100 Subject: unexpected dirty buffer In-Reply-To: <404AB496.80603@matchmail.com> (Mike Fedyk's message of "Sat, 06 Mar 2004 21:35:18 -0800") References: <404AB496.80603@matchmail.com> Message-ID: Mike Fedyk writes: > Nicolas.Kowalski at imag.fr wrote: >> Hello. >> On a server running 2.4.25, I have the two following errors in the >> kernel logfile: >> Unexpected dirty buffer encountered at do_get_write_access:618 >> (08:11 blocknr 920707) >> Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707) >> > > It's probably a driver problem. > >> Should I worry about them (disk failure, filesystem damage) ? > > It's possible, depending on which driver is having trouble. Here is the SCSI adapter output: SCSI subsystem driver Revision: 1.00 Loading Adaptec I2O RAID: Version 2.4 Build 5 Detecting Adaptec I2O RAID controllers... 
sym0: <896> rev 0x5 on pci bus 5 device 5 function 0 irq 24 sym0: using 64 bit DMA addressing sym0: Symbios NVRAM, ID 7, Fast-40, LVD, parity checking sym0: open drain IRQ line driver, using on-chip SRAM sym0: using LOAD/STORE-based firmware. sym0: handling phase mismatch from SCRIPTS. sym0: SCSI BUS has been reset. sym1: <896> rev 0x5 on pci bus 5 device 5 function 1 irq 25 sym1: using 64 bit DMA addressing sym1: Symbios NVRAM, ID 7, Fast-40, LVD, parity checking sym1: open drain IRQ line driver, using on-chip SRAM sym1: using LOAD/STORE-based firmware. sym1: handling phase mismatch from SCRIPTS. sym1: SCSI BUS has been reset. scsi0 : sym-2.1.17a scsi1 : sym-2.1.17a blk: queue c1654418, I/O limit 1048575Mb (mask 0xffffffffff) Vendor: HP Model: 9.10GB C 68-BX02 Rev: BX02 Type: Direct-Access ANSI SCSI revision: 02 blk: queue c1654218, I/O limit 1048575Mb (mask 0xffffffffff) Vendor: HP Model: 9.10GB C 68-BX02 Rev: BX02 Type: Direct-Access ANSI SCSI revision: 02 blk: queue dfeb3e18, I/O limit 1048575Mb (mask 0xffffffffff) sym0:0:0: tagged command queuing enabled, command queue depth 16. sym0:1:0: tagged command queuing enabled, command queue depth 16. Attached scsi disk sda at scsi0, channel 0, id 0, lun 0 Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0 > Did you run fsck? Not yet. It is planned for the next reboot (touch /forcefsck), which will happen next Monday. >> Thanks. >> In addition, what does the pair '08:11' mean? Is this the >> major/minor of the hard disk partition where the filesystem is located? > > Yes, check /proc/partitions. Then I am just more confused. This 08:11 pair does not match anything... olan:~# cat /proc/partitions
major minor  #blocks  name
   8     0    8886762 sda
   8     1     195568 sda1
   8     2     976896 sda2
   8     3    1952768 sda3
   8     4    5761024 sda4
   8    16    8886762 sdb
   8    17    8886256 sdb1
Thanks for your reply.
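[Editorial note: one hedged guess at the mismatch, resting on an assumption not confirmed in this thread — that the 2.4 buffer-layer message prints the device numbers in hexadecimal. Read '08:11' as hex and it becomes major 8, minor 17, which is sdb1 in the /proc/partitions listing above.]

```shell
# If '08:11' is hexadecimal, the minor is 0x11 = 17 decimal,
# i.e. major 8 minor 17 = sdb1 in /proc/partitions.
printf 'major %d minor %d\n' 0x08 0x11
```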
-- Nicolas From leandro at dutra.fastmail.fm Mon Mar 8 10:01:31 2004 From: leandro at dutra.fastmail.fm (=?iso-8859-1?q?Leandro_Guimar=E3es_Faria_Corsetti_Dutra?=) Date: Mon, 08 Mar 2004 07:01:31 -0300 Subject: PROBLEM: log abort over RAID5 References: <200403051617.i25GH0b31475@mx1.redhat.com> <200403052116.i25LGJb10549@mx1.redhat.com> Message-ID: On Fri, 05 Mar 2004 22:17:01 +0100, Bodo Thiesen wrote: > in general badblocks does work fine with mounted filesystems in > read-only test ... Oh yes, I went directly to a non-destructive read-write test. Will report shortly. -- Leandro Guimarães Faria Corsetti Dutra Maringá, PR, BRASIL http://br.geocities.com./lgcdutra/ Soli Deo Gloria! From Nicolas.Kowalski at imag.fr Mon Mar 8 11:44:36 2004 From: Nicolas.Kowalski at imag.fr (Nicolas Kowalski) Date: Mon, 08 Mar 2004 12:44:36 +0100 Subject: unexpected dirty buffer In-Reply-To: (Nicolas Kowalski's message of "Sun, 07 Mar 2004 11:52:08 +0100") References: <404AB496.80603@matchmail.com> Message-ID: Nicolas Kowalski writes: > Mike Fedyk writes: > >> Nicolas.Kowalski at imag.fr wrote: >>> Hello. >>> On a server running 2.4.25, I have the following two errors in the >>> kernel logfile: >>> Unexpected dirty buffer encountered at do_get_write_access:618 >>> (08:11 blocknr 920707) >>> Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707) >> Did you run fsck? > > Not yet. It is planned for the next reboot (touch /forcefsck), which will > happen next Monday. fsck went fine, reporting nothing wrong. If these errors come back again, what can I do to provide you with more information? Thanks. -- Nicolas From leandro at dutra.fastmail.fm Mon Mar 8 12:54:20 2004 From: leandro at dutra.fastmail.fm (=?iso-8859-1?q?Leandro_Guimar=E3es_Faria_Corsetti_Dutra?=) Date: Mon, 08 Mar 2004 09:54:20 -0300 Subject: PROBLEM: log abort over RAID5 References: <200403051617.i25GH0b31475@mx1.redhat.com> Message-ID: On Fri, 05 Mar 2004 17:18:05 +0100, Bodo Thiesen wrote: > 1.
Check if the hard disk has bad blocks. > > # badblocks -b [...] > 2. Unmount the filesystem > > 3. Run e2fsck > > If that fails, report it here and don't continue ;-) I did both steps in one, with e2fsck -cc on all LVMv2 partitions, both ext3 and ext2. No errors. > 4. Try to remount the filesystem as ext3. > > If that works, you should be done. Yes, it works, but it has always worked. After I get the abort log, I can always reboot the system, get lots of seemingly minor corrections, and continue work for one or two days, perhaps almost a week. Then I get another abort log, one filesystem remounted read-only, and have to reboot again. > 5. If that doesn't work then: > > Report the error messages here. You may continue or wait for answers > ... > > 6. disable usage of journal: tune2fs -O ^has_journal > > 7. mount the file system as ext2 and update /etc/fstab according to that. On the off chance e2fsck -cc will have done any good, I will try to run the system on ext3 a few days more. If all continues the same, I will report the errors here and go back to ext2. Be aware this error has been seen by more than poor me. If anyone needs any more information to be able to reproduce and troubleshoot, please feel free. -- Leandro Guimarães Faria Corsetti Dutra +55 (11) 5685 2219 Av Sgto Geraldo Santana, 1100 6/71 +55 (11) 5686 9607 04.674-000 São Paulo, SP BRASIL http://br.geocities.com./lgcdutra/ From ext3 at philwhite.org Mon Mar 8 19:07:43 2004 From: ext3 at philwhite.org (Phil White) Date: Mon, 08 Mar 2004 11:07:43 -0800 Subject: [BULK] - Re: unexpected dirty buffer In-Reply-To: References: <404AB496.80603@matchmail.com> Message-ID: <404CC47F.7090301@philwhite.org> Nicolas, Is this partition mounted with data=journal? I get the same errors consistently whenever I use data=journal.
In my case, it seems to be a bug with certain data access patterns (I have an application that makes heavy use of mmap'ed IO and can trigger these errors once a minute, followed by a kernel panic not long thereafter). Until this bug is fixed, I am using data=ordered (the default), which produces zero errors.

--Phil

Nicolas Kowalski wrote:

>Nicolas Kowalski writes:
>
>>Mike Fedyk writes:
>>
>>>Nicolas.Kowalski at imag.fr wrote:
>>>
>>>>Hello.
>>>>On a server running 2.4.25, I have the two following errors in the
>>>>kernel logfile:
>>>>Unexpected dirty buffer encountered at do_get_write_access:618
>>>>(08:11 blocknr 920707)
>>>>Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707)
>>>
>>>Did you run fsck?
>>>
>>Not yet. It is planned for next reboot (touch /forcefsck), which will
>>happen next Monday.
>>
>
>fsck went fine, reporting nothing wrong.
>
>If these errors come back again, what can I do to provide you more
>information?
>
>Thanks.

From clay at exavio.com.cn Tue Mar 9 07:46:43 2004
From: clay at exavio.com.cn (Isaac Claymore)
Date: Tue, 9 Mar 2004 15:46:43 +0800
Subject: [OT] block allocation algorithm [Was] Re: heavily fragmented file system.. How to defrag it on-line??
In-Reply-To: <20040304021707.GA13386@thunk.org>
References: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> <20040304021707.GA13386@thunk.org>
Message-ID: <20040309074643.GA1557@exavio.com.cn>

On Wed, Mar 03, 2004 at 09:17:07PM -0500, Theodore Ts'o wrote:
> On Wed, Mar 03, 2004 at 03:01:35PM -0800, Guolin Cheng wrote:
> > I got machines running continuously for a long time, but then the underlying ext3 file systems become quite heavily fragmented (94% non-contiguous).
>
> Note that non-contiguous does not necessarily mean fragmented. Files
> that are larger than a block group will be non-contiguous by
> definition.
On the other hand, if you have more than one file
> simultaneously being written to in a directory, then yes the files
> will certainly get fragmented.

Hi,

I've got a workload where several clients tend to write to separate files under the same dir simultaneously, resulting in heavily fragmented files. And, even worse, those files are rarely read simultaneously, thus read performance degrades quite a lot.

I'm wondering whether there's any feature that helps alleviate fragmentation in such workloads. Does writing to different dirs (of the same filesystem) help?

If ext2/3 can't do much in such workloads, do you know of any other filesystem featuring a block allocation algorithm that somewhat differentiates among simultaneous writers?

Thanks a lot for any hint/suggestion.

>
> Are you seeing a sufficient read-performance degradation? If not, it may not
> be worth bothering to defrag the filesystem.
>
> > Anyone have any ideas on defragging ext3 file systems on-line? Thanks a lot.
>
> There are no on-line defrag tools right now, sorry.
>
> - Ted
>
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users

-- 
Regards,
Isaac

() ascii ribbon campaign - against html e-mail
/\ - against microsoft attachments

From d_baron at 012.net.il Thu Mar 4 10:50:57 2004
From: d_baron at 012.net.il (David Baron)
Date: Thu, 4 Mar 2004 11:50:57 +0100
Subject: [debian-knoppix] warning: updated with obsolete bdflush call
Message-ID: <200403041150.57687.d_baron@012.net.il>

I get this warning during the bootup ext3 file checks on 2.6.* kernels. Apparently harmless, but how do I fix this?
_______________________________________________
debian-knoppix mailing list
debian-knoppix at linuxtag.org
http://mailman.linuxtag.org/mailman/listinfo/debian-knoppix

From adilger at clusterfs.com Tue Mar 9 08:02:48 2004
From: adilger at clusterfs.com (Andreas Dilger)
Date: Tue, 9 Mar 2004 01:02:48 -0700
Subject: [OT] block allocation algorithm [Was] Re: heavily fragmented file system.. How to defrag it on-line??
In-Reply-To: <20040309074643.GA1557@exavio.com.cn>
References: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> <20040304021707.GA13386@thunk.org> <20040309074643.GA1557@exavio.com.cn>
Message-ID: <20040309080248.GB1144@schnapps.adilger.int>

On Mar 09, 2004 15:46 +0800, Isaac Claymore wrote:
> I've got a workload where several clients tend to write to separate files
> under the same dir simultaneously, resulting in heavily fragmented files.
> And, even worse, those files are rarely read simultaneously, thus read
> performance degrades quite a lot.
>
> I'm wondering whether there's any feature that helps alleviate
> fragmentation in such workloads. Does writing to different dirs (of the
> same filesystem) help?

Very much yes. Files allocated from different directories will get blocks from different parts of the filesystem (if available), so they should be less fragmented. In 2.6 there is a heuristic whereby files opened by different processes allocate from different parts of a group, even within the same directory, but that only really helps if the files themselves aren't too large (i.e. under 8MB or so).
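The allocator behaviour described above is easy to poke at. A minimal sketch (mktemp stands in for a real ext3 mount point, which is what you would actually test on; filefrag needs root and an ext2/3 filesystem to report meaningful extents):

```shell
# Write two files concurrently, each in its own directory, then compare
# fragmentation. On an ext3 mount the allocator should spread the two
# files into different block groups.
d=$(mktemp -d)                       # substitute a real ext3 mount point
mkdir "$d/dir0" "$d/dir1"
dd if=/dev/zero of="$d/dir0/a" bs=1M count=16 2>/dev/null &
dd if=/dev/zero of="$d/dir1/b" bs=1M count=16 2>/dev/null &
wait
# filefrag prints "N extents found"; fewer extents = less fragmentation
if command -v filefrag >/dev/null 2>&1; then
    filefrag "$d/dir0/a" "$d/dir1/b" || true
fi
```

With both files in one directory on 2.6, the per-process heuristic mentioned above should still keep files under ~8MB reasonably contiguous.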
Cheers, Andreas
-- 
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

From Nicolas.Kowalski at imag.fr Tue Mar 9 08:24:23 2004
From: Nicolas.Kowalski at imag.fr (Nicolas Kowalski)
Date: Tue, 09 Mar 2004 09:24:23 +0100
Subject: [BULK] - Re: unexpected dirty buffer
In-Reply-To: <404CC47F.7090301@philwhite.org> (Phil White's message of "Mon, 08 Mar 2004 11:07:43 -0800")
References: <404AB496.80603@matchmail.com> <404CC47F.7090301@philwhite.org>
Message-ID: 

Phil White writes:

> Nicolas,

Hello.

> Is this partition mounted with data=journal?

One of its partitions (/var) is mounted with this option, yes. Actually, this partition is moderately used (database, syslog server, etc.).

> I get the same errors consistently whenever I use data=journal. In
> my case, it seems to be a bug with certain data access patterns (I
> have an application that makes heavy use of mmap'ed IO and can
> trigger these errors once a minute, followed by a kernel panic not
> long thereafter). Until this bug is fixed, I am using data=ordered
> (the default), which produces zero errors.

I also had three new messages yesterday (after reboot+fsck):

Mar 8 18:03:45 olan kernel: Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920719)
Mar 8 19:03:54 olan kernel: Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707)
Mar 8 19:03:54 olan kernel: Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920707)

I am still confused because the 08:11 pair does not match any partition on my system:

# cat /proc/partitions
major minor  #blocks  name
   8     0   8886762  sda
   8     1    195568  sda1
   8     2    976896  sda2
   8     3   1952768  sda3
   8     4   5761024  sda4
   8    16   8886762  sdb
   8    17   8886256  sdb1

# mount -t ext3
/dev/sda1 on / type ext3 (rw,errors=remount-ro)
/dev/sda3 on /usr type ext3 (rw)
/dev/sdb1 on /var type ext3 (rw,data=journal)

Thanks for your reply.
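A possible explanation for the mismatch, on the assumption that 2.4 prints these device pairs in hexadecimal: 08:11 would then be major 8, minor 0x11 = 17, i.e. /dev/sdb1, which is exactly the data=journal partition. A one-liner to decode it:

```shell
# Decode a kernel-printed device pair, assuming the numbers are hex
# (which is how 2.4's kdevname() appears to format them).
dev="08:11"
major=$((0x${dev%%:*}))
minor=$((0x${dev##*:}))
echo "major=$major minor=$minor"   # minor 17 on major 8 is sdb1 in /proc/partitions
```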
-- Nicolas From ext3 at philwhite.org Tue Mar 9 09:58:55 2004 From: ext3 at philwhite.org (Phil White) Date: Tue, 9 Mar 2004 01:58:55 -0800 (PST) Subject: [BULK] - Re: unexpected dirty buffer In-Reply-To: References: <404AB496.80603@matchmail.com> <404CC47F.7090301@philwhite.org> Message-ID: <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com> Interesting... You might try remounting /var with data=ordered and see if the problem goes away. I'm guessing it will. I would actually recommend doing this, as I experienced regular crashes with data=journal under heavy load in our test lab. Under moderate load, I only saw the "Unexpected dirty buffer..." messages, but when I really cranked up the load (45 emails per second through postfix + spam filter, 2000 mysql queries per second on a 1GB table, syslogging maillog to disk), I was getting kernel panics, 100% reproducible within a few minutes of testing. Because of these problems, I'm sticking to data=ordered for production, even though I want data-journaling. :) I doubt this is a driver problem, as I can duplicate it on two completely different hardware platforms (one with SCSI RAID, the other with an integrated IDE controller), unless of course those two drivers (and yours) have the same bug :) --Phil From Nicolas.Kowalski at imag.fr Tue Mar 9 10:13:35 2004 From: Nicolas.Kowalski at imag.fr (Nicolas Kowalski) Date: Tue, 09 Mar 2004 11:13:35 +0100 Subject: [BULK] - Re: unexpected dirty buffer In-Reply-To: <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com> (Phil White's message of "Tue, 9 Mar 2004 01:58:55 -0800 (PST)") References: <404AB496.80603@matchmail.com> <404CC47F.7090301@philwhite.org> <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com> Message-ID: "Phil White" writes: > Interesting... Indeed. > You might try remounting /var with data=ordered and see if the problem > goes away. I'm guessing it will. 
Ok, I will switch this server back to default journalling, and check if these errors disappear after that.

Thanks for your suggestion.

-- Nicolas

From mfedyk at matchmail.com Tue Mar 9 19:25:29 2004
From: mfedyk at matchmail.com (Mike Fedyk)
Date: Tue, 09 Mar 2004 11:25:29 -0800
Subject: [BULK] - Re: unexpected dirty buffer
In-Reply-To: <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com>
References: <404AB496.80603@matchmail.com> <404CC47F.7090301@philwhite.org> <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com>
Message-ID: <404E1A29.8040208@matchmail.com>

Phil White wrote:
> Interesting...
>
> You might try remounting /var with data=ordered and see if the problem
> goes away. I'm guessing it will.
>
> I would actually recommend doing this, as I experienced regular crashes
> with data=journal under heavy load in our test lab. Under moderate load,
> I only saw the "Unexpected dirty buffer..." messages, but when I really
> cranked up the load (45 emails per second through postfix + spam filter,
> 2000 mysql queries per second on a 1GB table, syslogging maillog to disk),
> I was getting kernel panics, 100% reproducible within a few minutes of
> testing. Because of these problems, I'm sticking to data=ordered for
> production, even though I want data-journaling. :)

Please post those traces to the list so the developers can fix this.

From sct at redhat.com Tue Mar 9 23:07:18 2004
From: sct at redhat.com (Stephen C. Tweedie)
Date: 09 Mar 2004 23:07:18 +0000
Subject: PROBLEM: log abort over RAID5
In-Reply-To: 
References: 
Message-ID: <1078873638.2460.81.camel@sisko.scot.redhat.com>

Hi,

On Fri, 2004-03-05 at 13:45, Leandro Guimarães Faria Corsetti Dutra wrote:

> After I/O, journal is aborted and filesystems made read-only.
That's the default ext3 behaviour when it finds certain types of on-disk corruption, where continuing to allow writes could lead to the corruption just getting worse; and

> Feb 26 06:25:24 mercurio kernel: EXT3-fs error (device dm-2):
> ext3_readdir: bad entry in directory #381585: directory entry across
> blocks - offset=0, inode=0, rec_len=4132, name_len=63

here's the first of the errors being detected. It's just not possible to diagnose _why_ it went bad from the data here. It could be a disk or driver fault; bad memory, overheating CPU, configuration error, anything.

Is there a repeating pattern to the problems?

--Stephen

From leandro at dutra.fastmail.fm Wed Mar 10 00:24:59 2004
From: leandro at dutra.fastmail.fm (Leandro Guimarães Faria Corsetti Dutra)
Date: Tue, 09 Mar 2004 21:24:59 -0300
Subject: PROBLEM: log abort over RAID5
References: <1078873638.2460.81.camel@sisko.scot.redhat.com>
Message-ID: 

On Tue, 09 Mar 2004 23:07:18 +0000, Stephen C. Tweedie wrote:

> It could be a disk or driver fault; bad memory, overheating CPU,
> configuration error, anything.

Is there a full check-list around?

About the machine itself I am quite sure. It is a new Intel board, Adaptec adapter and three SCSI disks. I've run CPU burn-ins and disk tests from the LTS, and the ones which were suggested in this list a few days ago.

> Is there a repeating pattern to the problems?

They tend to happen at the end of the afternoon, beginning of evening. I haven't remembered to check crontab to see if there is something different there; there shouldn't be anything special, it is a Debian testing system I just set up, and I have put nothing there besides Debian packages and Oracle (yikes!)

If you Google around you will see other people have similar patterns. It is reported that going back to either 2.4 or ext2 solves the problem, but not being in production yet I'm still hoping for diagnosis and a fix.

Thanks for your attention!
Please tell me if anyone needs more information to diagnose or reproduce.

-- 
Leandro Guimarães Faria Corsetti Dutra
Maringá, PR, BRASIL
http://br.geocities.com./lgcdutra/
Soli Deo Gloria!

From akpm at osdl.org Wed Mar 10 01:16:31 2004
From: akpm at osdl.org (Andrew Morton)
Date: Tue, 9 Mar 2004 17:16:31 -0800
Subject: [OT] block allocation algorithm [Was] Re: heavily fragmented file system.. How to defrag it on-line??
In-Reply-To: <20040309074643.GA1557@exavio.com.cn>
References: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> <20040304021707.GA13386@thunk.org> <20040309074643.GA1557@exavio.com.cn>
Message-ID: <20040309171631.7b496d6d.akpm@osdl.org>

Isaac Claymore wrote:
>
> I've got a workload where several clients tend to write to separate files
> under the same dir simultaneously, resulting in heavily fragmented files.
> And, even worse, those files are rarely read simultaneously, thus read
> performance degrades quite a lot.

We really, really suck at this.

I have a little hack here which provides an ioctl with which you can instantiate blocks outside the end-of-file, so each time you've written 128M you go into the filesystem and say "reserve me another 128M". This causes the 128M chunks to be laid out very nicely indeed.

It is, however, wildly insecure - it's trivial to use this to read uninitialised disk blocks. But we happen to not care about that.

It is, however, a potential way forward to fix this problem. Do the growth automatically somehow, fix the security problem, stick the inodes on the orphan list so that they get trimmed back to the correct size during recovery, and there we have it. We're a bit short on bodies to do it at present though.

One thing you could do, which _may_ suit, is to write the files beforehand and change your app to perform overwrite. Or just change your app to buffer more data: write 16MB at a time.
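The closing suggestion, buffering more data and issuing large writes, can be approximated from the shell with dd's output block size. This is a generic illustration, not the preallocation ioctl hack described above, and the file name is made up:

```shell
# Coalesce a stream into 16MB write() calls: obs= sets the size of each
# output write, so the filesystem sees a few large requests instead of
# many small interleaved ones.
dd if=/dev/zero bs=64k count=512 2>/dev/null \
  | dd obs=16M of=bigfile 2>/dev/null
ls -l bigfile
```

With several concurrent writers, fewer and larger writes per file give the allocator fewer chances to interleave their blocks.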
From ext3 at philwhite.org Wed Mar 10 07:22:04 2004 From: ext3 at philwhite.org (Phil White) Date: Tue, 9 Mar 2004 23:22:04 -0800 (PST) Subject: [BULK] - Re: [BULK] - Re: unexpected dirty buffer In-Reply-To: <404E1A29.8040208@matchmail.com> References: <404AB496.80603@matchmail.com> <404CC47F.7090301@philwhite.org> <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com> <404E1A29.8040208@matchmail.com> Message-ID: <1099.67.169.114.63.1078903324.squirrel@flight.code-visions.com> > > Please post those traces to the list so the developers can fix this. https://www.redhat.com/archives/ext3-users/2004-March/msg00000.html --Phil From cchan at outblaze.com Wed Mar 10 14:01:30 2004 From: cchan at outblaze.com (Christopher Chan) Date: Wed, 10 Mar 2004 22:01:30 +0800 Subject: mkfs under different kernels Message-ID: <404F1FBA.3090800@outblaze.com> Is there any difference between running mkfs.ext3 under a 2.4 kernel and running that under a 2.6 kernel? Any benefits gained from doing it under 2.6? From cchan at outblaze.com Wed Mar 10 14:25:31 2004 From: cchan at outblaze.com (Christopher Chan) Date: Wed, 10 Mar 2004 22:25:31 +0800 Subject: mkfs under different kernels In-Reply-To: <404F1FBA.3090800@outblaze.com> References: <404F1FBA.3090800@outblaze.com> Message-ID: <404F255B.2060701@outblaze.com> Christopher Chan wrote: > Is there any difference between running mkfs.ext3 under a 2.4 kernel and > running that under a 2.6 kernel? > > Any benefits gained from doing it under 2.6? Let me add some context. I am going to conduct some benchmarks between xfs and ext3 under different configurations and I wondered if running mkfs under different kernels would affect the results. 
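One relevant detail: mke2fs builds the filesystem entirely from userspace and never asks the kernel about the on-disk format, so the running kernel should not influence the result. A sketch showing it can even format a plain file image (guarded, since the e2fsprogs tools may not be installed):

```shell
# mke2fs writes the filesystem image purely from userspace; the kernel
# version only matters once the filesystem is mounted.
dd if=/dev/zero of=fs.img bs=1 count=0 seek=64M 2>/dev/null   # 64MB sparse file
if command -v mke2fs >/dev/null 2>&1; then
    mke2fs -F -j -q fs.img                 # -j adds the ext3 journal
    dumpe2fs -h fs.img 2>/dev/null | grep -i 'features' || true
fi
```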
From adilger at clusterfs.com Wed Mar 10 18:20:00 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 10 Mar 2004 11:20:00 -0700 Subject: mkfs under different kernels In-Reply-To: <404F255B.2060701@outblaze.com> References: <404F1FBA.3090800@outblaze.com> <404F255B.2060701@outblaze.com> Message-ID: <20040310182000.GL1144@schnapps.adilger.int> On Mar 10, 2004 22:25 +0800, Christopher Chan wrote: > Christopher Chan wrote: > >Is there any difference between running mkfs.ext3 under a 2.4 kernel and > >running that under a 2.6 kernel? > > > >Any benefits gained from doing it under 2.6? > > Let me add some context. I am going to conduct some benchmarks between > xfs and ext3 under different configurations and I wondered if running > mkfs under different kernels would affect the results. No. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ From cchan at outblaze.com Wed Mar 10 23:37:49 2004 From: cchan at outblaze.com (Christopher Chan) Date: Thu, 11 Mar 2004 07:37:49 +0800 Subject: mkfs under different kernels In-Reply-To: <20040310182000.GL1144@schnapps.adilger.int> References: <404F1FBA.3090800@outblaze.com> <404F255B.2060701@outblaze.com> <20040310182000.GL1144@schnapps.adilger.int> Message-ID: <404FA6CD.40205@outblaze.com> >> >>>Is there any difference between running mkfs.ext3 under a 2.4 kernel and >>>running that under a 2.6 kernel? >>> > > > No. > Thanks. From Nicolas.Kowalski at imag.fr Fri Mar 12 13:48:28 2004 From: Nicolas.Kowalski at imag.fr (Nicolas Kowalski) Date: Fri, 12 Mar 2004 14:48:28 +0100 Subject: [BULK] - Re: unexpected dirty buffer In-Reply-To: (Nicolas Kowalski's message of "Tue, 09 Mar 2004 11:13:35 +0100") References: <404AB496.80603@matchmail.com> <404CC47F.7090301@philwhite.org> <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com> Message-ID: Nicolas Kowalski writes: > "Phil White" writes: > >> Interesting... > > Indeed. 
>
>
>> You might try remounting /var with data=ordered and see if the problem
>> goes away. I'm guessing it will.
>
> Ok, I will switch this server back to default journalling, and check
> if these errors disappear after that.

The server has been up for 3 days, with the /var partition mounted in the default journalling mode, and I have no errors.

I will try to reproduce these errors on a non-production server now.

-- Nicolas

From mfedyk at matchmail.com Fri Mar 12 18:20:34 2004
From: mfedyk at matchmail.com (Mike Fedyk)
Date: Fri, 12 Mar 2004 10:20:34 -0800
Subject: [BULK] - Re: unexpected dirty buffer
In-Reply-To: 
References: <404AB496.80603@matchmail.com> <404CC47F.7090301@philwhite.org> <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com>
Message-ID: <4051FF72.8070403@matchmail.com>

Nicolas Kowalski wrote:
> Nicolas Kowalski writes:
>
>>"Phil White" writes:
>>
>>>Interesting...
>>
>>Indeed.
>>
>>>You might try remounting /var with data=ordered and see if the problem
>>>goes away. I'm guessing it will.
>>
>>Ok, I will switch this server back to default journalling, and check
>>if these errors disappear after that.
>
> The server has been up for 3 days, with the /var partition mounted in
> the default journalling mode, and I have no errors.
>
> I will try to reproduce these errors on a non-production server now.

Beautiful.

It might be good if you put a stack_dump() call just after the printk() call in the ext3 source.

Mike

From bunk at fs.tum.de Thu Mar 11 20:23:53 2004
From: bunk at fs.tum.de (Adrian Bunk)
Date: Thu, 11 Mar 2004 21:23:53 +0100
Subject: 2.6.4-mm1: modular quota needs unknown symbol
In-Reply-To: <20040310233140.3ce99610.akpm@osdl.org>
References: <20040310233140.3ce99610.akpm@osdl.org>
Message-ID: <20040311202352.GD14833@fs.tum.de>

On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
>...
> ext3-journalled-quotas-2.patch
> ext3: journalled quota
>...
This patch broke modular quota: WARNING: /lib/modules/2.6.4-mm1/kernel/fs/quota_v2.ko needs unknown symbol mark_info_dirty cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed From m.c.p at wolk-project.de Fri Mar 12 08:51:57 2004 From: m.c.p at wolk-project.de (Marc-Christian Petersen) Date: Fri, 12 Mar 2004 09:51:57 +0100 Subject: 2.6.4-mm1: modular quota needs unknown symbol In-Reply-To: <20040311202352.GD14833@fs.tum.de> References: <20040310233140.3ce99610.akpm@osdl.org> <20040311202352.GD14833@fs.tum.de> Message-ID: <200403120951.57637@WOLK> On Thursday 11 March 2004 21:23, Adrian Bunk wrote: Hi Adrian, > On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote: > >... > > ext3-journalled-quotas-2.patch > > ext3: journalled quota > >... > This patch broke modular quota: > WARNING: /lib/modules/2.6.4-mm1/kernel/fs/quota_v2.ko needs unknown > symbol mark_info_dirty Patch attached (again) ;) ciao, Marc -------------- next part -------------- A non-text attachment was scrubbed... Name: 2.6.4-mm1-fixups-0.patch Type: text/x-diff Size: 299 bytes Desc: not available URL: From clay at exavio.com.cn Mon Mar 15 03:13:32 2004 From: clay at exavio.com.cn (Isaac Claymore) Date: Mon, 15 Mar 2004 11:13:32 +0800 Subject: [OT] block allocation algorithm [Was] Re: heavily fragmented file system.. How to defrag it on-line?? 
In-Reply-To: <20040309080248.GB1144@schnapps.adilger.int>
References: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> <20040304021707.GA13386@thunk.org> <20040309074643.GA1557@exavio.com.cn> <20040309080248.GB1144@schnapps.adilger.int>
Message-ID: <20040315031332.GD19159@exavio.com.cn>

On Tue, Mar 09, 2004 at 01:02:48AM -0700, Andreas Dilger wrote:
> On Mar 09, 2004 15:46 +0800, Isaac Claymore wrote:
> > I've got a workload where several clients tend to write to separate files
> > under the same dir simultaneously, resulting in heavily fragmented files.
> > And, even worse, those files are rarely read simultaneously, thus read
> > performance degrades quite a lot.
> >
> > I'm wondering whether there's any feature that helps alleviate
> > fragmentation in such workloads. Does writing to different dirs (of the
> > same filesystem) help?
>
> Very much yes. Files allocated from different directories will get blocks
> from different parts of the filesystem (if available), so they should be
> less fragmented. In 2.6 there is a heuristic that files opened by different
> processes allocate from different parts of a group, even within the same
> directory, but that only really helps if the files themselves aren't too
> large (i.e. under 8MB or so).

Thanks. I ran some tests on this last weekend; here are the results in case someone is interested:

Test environment:
  kernel: 2.6.3 with latest reiser4 patches applied.
  OS: Debian testing/unstable
  HW: Intel(R) Pentium(R) 4 CPU 1.80GHz, 256M RAM

For each FS configuration, the test consisted of dumping 3 files of 1G each simultaneously and measuring the fragmentation with 'filefrag'. Each test iteration was done on a freshly formatted filesystem.

Here are the figures & my evaluations:

1.
reiser3, 3 files under the same dir:

sandbox:/mnt/foo [1016]# dd if=/dev/zero of=f0 bs=16M count=64&;dd if=/dev/zero of=f1 bs=16M count=64&;dd if=/dev/zero of=f2 bs=16M count=64&;wait
sandbox:/mnt/foo [1018]# filefrag f0 f1 f2
f0: 470 extents found
f1: 461 extents found
f2: 470 extents found

My Evaluation: badly fragmented!

2. reiser3, 3 files under 3 different dirs:

sandbox:/mnt [1028]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd if=/dev/zero of=dir1/foo bs=16M count=64&;dd if=/dev/zero of=dir2/foo bs=16M count=64&;wait
sandbox:/mnt [1029]# filefrag dir0/foo dir1/foo dir2/foo
dir0/foo: 448 extents found
dir1/foo: 462 extents found
dir2/foo: 443 extents found

My Evaluation: still bad; spreading the files under different dirs did no visible good.

3. ext3, 3 files under the same dir:

sandbox:/mnt/foo [1041]# dd if=/dev/zero of=f0 bs=16M count=64&;dd if=/dev/zero of=f1 bs=16M count=64&;dd if=/dev/zero of=f2 bs=16M count=64&;wait
sandbox:/mnt/foo [1044]# filefrag f0 f1 f2
f0: 202 extents found, perfection would be 9 extents
f1: 207 extents found, perfection would be 9 extents
f2: 208 extents found, perfection would be 9 extents

My Evaluation: much better than reiser3, yet far from perfection.

4. ext3, 3 files under 3 different dirs:

sandbox:/mnt [1054]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd if=/dev/zero of=dir1/foo bs=16M count=64&;dd if=/dev/zero of=dir2/foo bs=16M count=64&;wait
sandbox:/mnt [1056]# filefrag dir0/foo dir1/foo dir2/foo
dir0/foo: 91 extents found, perfection would be 9 extents
dir1/foo: 9 extents found
dir2/foo: 95 extents found, perfection would be 9 extents

My Evaluation: spreading the files under different dirs DID help quite a lot! But can we get an even better result by spreading the files more sparsely? (see next test)

5.
still ext3, mkdir 10 dirs first, then dumping the files under the 1st, 5th, and 9th dirs:

sandbox:/mnt [1085]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd if=/dev/zero of=dir4/foo bs=16M count=64&;dd if=/dev/zero of=dir9/foo bs=16M count=64&;wait
sandbox:/mnt [1086]# filefrag dir{0,4,9}/foo
dir0/foo: 11 extents found, perfection would be 9 extents
dir4/foo: 11 extents found, perfection would be 9 extents
dir9/foo: 10 extents found, perfection would be 9 extents

My Evaluation: almost perfect!

6. XFS, 3 files under the same dir:

sandbox:/mnt/foo [1112]# dd if=/dev/zero of=f0 bs=16M count=64&;dd if=/dev/zero of=f1 bs=16M count=64&;dd if=/dev/zero of=f2 bs=16M count=64&;wait
sandbox:/mnt/foo [1114]# filefrag f0 f1 f2
f0: 25 extents found
f1: 11 extents found
f2: 20 extents found

My Evaluation: this'd be the BEST result I got when dumping into the same dir.

7. XFS, dumping into 3 dirs among ten, similar to test 5:

sandbox:/mnt [1127]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd if=/dev/zero of=dir4/foo bs=16M count=64&;dd if=/dev/zero of=dir9/foo bs=16M count=64&;wait
sandbox:/mnt [1128]# filefrag dir0/foo dir4/foo dir9/foo
dir0/foo: 1 extent found
dir4/foo: 1 extent found
dir9/foo: 1 extent found

My Evaluation: impressed! Can't get any better now.

8. Reiser4, 1 dir:

sandbox:/mnt/foo [1155]# dd if=/dev/zero of=f0 bs=16M count=64&;dd if=/dev/zero of=f1 bs=16M count=64&;dd if=/dev/zero of=f2 bs=16M count=64&;wait
sandbox:/mnt/foo [1156]# filefrag f0 f1 f2
f0: 45 extents found
f1: 6011 extents found
f2: 45 extents found

My Evaluation: far better than its brother reiser3. The 6011 extents of f1 was weird; I'd have done more iterations to get an average, just blame lazy me ;)

9.
Reiser4, 3 dirs among 10:

sandbox:/mnt [1165]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd if=/dev/zero of=dir4/foo bs=16M count=64&;dd if=/dev/zero of=dir9/foo bs=16M count=64&;wait
sandbox:/mnt [1167]# filefrag dir{0,4,9}/foo
dir0/foo: 42 extents found
dir4/foo: 50 extents found
dir9/foo: 46 extents found

My Evaluation: nice figures, really. And unlike its elder brother, using more dirs DID help.

> Cheers, Andreas
> --
> Andreas Dilger
> http://sourceforge.net/projects/ext2resize/
> http://www-mddsp.enel.ucalgary.ca/People/adilger/
>
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users

-- 
Regards,
Isaac

() ascii ribbon campaign - against html e-mail
/\ - against microsoft attachments

From Nicolas.Kowalski at imag.fr Mon Mar 15 08:15:42 2004
From: Nicolas.Kowalski at imag.fr (Nicolas Kowalski)
Date: Mon, 15 Mar 2004 09:15:42 +0100
Subject: [BULK] - Re: unexpected dirty buffer
In-Reply-To: <4053FCCE.20403@matchmail.com> (Mike Fedyk's message of "Sat, 13 Mar 2004 22:33:50 -0800")
References: <404AB496.80603@matchmail.com> <404CC47F.7090301@philwhite.org> <4616.67.169.114.63.1078826335.squirrel@flight.code-visions.com> <4051FF72.8070403@matchmail.com> <4053FCCE.20403@matchmail.com>
Message-ID: 

Mike Fedyk writes:

> Nicolas Kowalski wrote:
>> Mike Fedyk writes:
>>
>>>Nicolas Kowalski wrote:
>>>
>>>>I will try to reproduce these errors on a non-production server now.
>>>
>>>Beautiful.
>>>
>>>It might be good if you put a stack_dump() call just after the
>>>printk() call in the ext3 source.
>> I apologize (I am not familiar with kernel debugging), but when
>> compiling the kernel with this call inserted after the printk in the
>> sources, it fails with an unresolved symbol error.
...
>> fs/fs.o: In function `__jbd_unexpected_dirty_buffer':
>> fs/fs.o(.text+0x3ab8a): undefined reference to `stack_dump'
...
>> I must be missing an option, but which one?
> > Oh crap. It's called dump_stack().

Ok. I had another similar error this morning:

Unexpected dirty buffer encountered at do_get_write_access:618 (08:11 blocknr 920701)
dba1fddc dba1fe04 c017565e c03054a0 c0305483 c030373b 0000026a c03fc5e0
000e0c7d d1072580 dba1fe4c c016f76b c030373b 0000026a d34f1d80 d1072580
df4c1e94 d34f1d80 c01701dd 00000000 00000000 00000003 df4c1e00 d3615430
Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] []

ksymoops gives me:

Trace; c017565e <__jbd_unexpected_dirty_buffer+3a/74>
Trace; c016f76b 
Trace; c01701dd 
Trace; c016fc10 
Trace; c0167c88 
Trace; c0167c4e 
Trace; c0167e21 
Trace; c0167c74 
Trace; c012f67a 
Trace; c012fb14 
Trace; c01657e2 
Trace; c013c807 
Trace; c0108be3 

Does this help?

-- Nicolas

From Ralf.Hildebrandt at charite.de Mon Mar 15 08:18:06 2004
From: Ralf.Hildebrandt at charite.de (Ralf Hildebrandt)
Date: Mon, 15 Mar 2004 09:18:06 +0100
Subject: Merge ext3/ext2 partitions?
Message-ID: <20040315081806.GT6425@charite.de>

Hi!

Is it possible to merge two ext3/ext2 partitions into ONE ext3/ext2 partition?

-- 
Ralf Hildebrandt (Im Auftrag des Referat V a)   Ralf.Hildebrandt at charite.de
Charite - Universitätsmedizin Berlin            Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax. +49 (0)30-450 570-916
IT-Zentrum Standort Campus Mitte                AIM. ralfpostfix

From manuaroste at yahoo.es Mon Mar 15 09:11:17 2004
From: manuaroste at yahoo.es (Manuel Aróstegui Ramirez)
Date: Mon, 15 Mar 2004 10:11:17 +0100 (CET)
Subject: Merge ext3/ext2 partitions?
In-Reply-To: <20040315081806.GT6425@charite.de>
Message-ID: <20040315091117.129.qmail@web60107.mail.yahoo.com>

--- Ralf Hildebrandt wrote:

> Hi!
>
> Is it possible to merge two ext3/ext2 partitions
> into ONE ext3/ext2
> partition?

Maybe GNU Parted allows you to do that, but I'm not sure about it.

Cheers

=====
-- 
Manuel Aróstegui
Linux user 200896
Madrid, TE QUIERO.

___________________________________________________
Yahoo!
Messenger - new FREE version: Super Webcam, voice, animated smileys, and more... http://messenger.yahoo.es

From Ralf.Hildebrandt at charite.de Mon Mar 15 09:18:14 2004
From: Ralf.Hildebrandt at charite.de (Ralf Hildebrandt)
Date: Mon, 15 Mar 2004 10:18:14 +0100
Subject: Merge ext3/ext2 partitions?
In-Reply-To: <20040315091117.129.qmail@web60107.mail.yahoo.com>
References: <20040315081806.GT6425@charite.de> <20040315091117.129.qmail@web60107.mail.yahoo.com>
Message-ID: <20040315091814.GC6425@charite.de>

* Manuel Aróstegui Ramirez :
> > Is it possible to merge two ext3/ext2 partitions
> > into ONE ext3/ext2
> > partition?
>
> Maybe GNU Parted allows you to do that, but I'm not sure
> about it.

I haven't seen that option, thus I ask if it is possible at all. I could probably remove the 2nd partition, and expand the first. But that requires a backup of the 2nd partition.

-- 
Ralf Hildebrandt (Im Auftrag des Referat V a)   Ralf.Hildebrandt at charite.de
Charite - Universitätsmedizin Berlin            Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax. +49 (0)30-450 570-916
IT-Zentrum Standort Campus Mitte                AIM. ralfpostfix

From bothie at gmx.de Mon Mar 15 11:02:28 2004
From: bothie at gmx.de (Bodo Thiesen)
Date: Mon, 15 Mar 2004 12:02:28 +0100
Subject: Merge ext3/ext2 partitions?
In-Reply-To: <20040315091814.GC6425@charite.de>
References: <20040315081806.GT6425@charite.de> <20040315091117.129.qmail@web60107.mail.yahoo.com> <20040315091814.GC6425@charite.de>
Message-ID: <2004-03-15-12-00-05-gmx-seems-broken@bodo-thiesen-server.dyndns.org>

Ralf Hildebrandt wrote:
> * Manuel Aróstegui Ramirez :
>
>>> Is it possible to merge two ext3/ext2 partitions
>>> into ONE ext3/ext2
>>> partition?
>>
>> Maybe GNU Parted allows you to do that, but I'm not sure
>> about it.
>
> I haven't seen that option, thus I ask if it is possible at all.
> I could probably remove the 2nd partition, and expand the first. But
> that requires a backup of the 2nd partition.
Maybe the commercial/proprietary tool "Partition Magic"[1] is able to do that, since it has supported ext2(3) and has been able to merge FAT & Co. for years already. But I don't have recent versions of Partition Magic, so I don't really know - ask them before purchasing that tool. But backing up before doing that is a good idea in any case ;-)

Regards, Bodo
--
[1] http://www.powerquest.com/partitionmagic/

From eric at interplas.com Mon Mar 15 13:52:28 2004
From: eric at interplas.com (Eric Wood)
Date: Mon, 15 Mar 2004 08:52:28 -0500
Subject: Merge ext3/ext2 partitions?
References: <20040315081806.GT6425@charite.de> <20040315091117.129.qmail@web60107.mail.yahoo.com> <20040315091814.GC6425@charite.de> <2004-03-15-12-00-05-gmx-seems-broken@bodo-thiesen-server.dyndns.org>
Message-ID: <00f701c40a94$c1793440$9100000a@intgrp.com>

Bodo Thiesen wrote:
> Maybe the commercial/proprietary tool "Partition Magic"[1] is able to
> do that, since it has supported ext2(3) and has been able to merge FAT
> & Co. for years already.

After corrupting two NTFS systems, Partition Magic is on my black list.

-eric wood

From tytso at mit.edu Mon Mar 15 17:46:47 2004
From: tytso at mit.edu (Theodore Ts'o)
Date: Mon, 15 Mar 2004 12:46:47 -0500
Subject: Merge ext3/ext2 partitions?
In-Reply-To: <20040315091814.GC6425@charite.de>
References: <20040315081806.GT6425@charite.de> <20040315091117.129.qmail@web60107.mail.yahoo.com> <20040315091814.GC6425@charite.de>
Message-ID: <20040315174647.GD1809@thunk.org>

On Mon, Mar 15, 2004 at 10:18:14AM +0100, Ralf Hildebrandt wrote:
> * Manuel Aróstegui Ramirez :
> > > Is it possible to merge two ext3/ext2 partitions
> > > into ONE ext3/ext2
> > > partition?
> >
> > Maybe GnuParted allows you to do that, but I'm not sure
> > about it.
>
> I haven't seen that option, thus I ask if it is possible at all.
> I could probably remove the 2nd partition, and expand the first. But
> that requires a backup of the 2nd partition.

That's the only possible way of doing it today.
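The backup-then-expand route rests on resize2fs, which grows an ext2/ext3 filesystem in place once its underlying device has been enlarged. It can be rehearsed without touching a real disk, on a file-backed image; the path and sizes below are illustrative, not taken from the thread:

```shell
IMG=/tmp/grow-demo.img
dd if=/dev/zero of="$IMG" bs=1M count=8 2>/dev/null   # 8MB "partition"
mke2fs -F -q -b 1024 "$IMG"                           # fresh fs: 8192 1K blocks
dd if=/dev/zero bs=1M count=8 >> "$IMG" 2>/dev/null   # simulate enlarging the partition
e2fsck -f -p "$IMG" >/dev/null                        # resize2fs wants a freshly checked fs
resize2fs "$IMG"                                      # grow the fs to fill the new size
tune2fs -l "$IMG" | grep 'Block count'                # now reports 16384 blocks
```

On real hardware the append step corresponds to deleting the second partition and growing the first in fdisk (after backing up, as everyone in the thread stresses); the filesystem side is still just e2fsck plus resize2fs.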
Trying to merge two filesystems might be possible if they were adjacent, but it would require a huge amount of work, since inode and block numbers would have to be found and renumbered. This is theoretically doable, but it would be an awful lot of work.

- Ted

From bothie at gmx.de Mon Mar 15 21:53:40 2004
From: bothie at gmx.de (Bodo Thiesen)
Date: Mon, 15 Mar 2004 22:53:40 +0100
Subject: Merge ext3/ext2 partitions?
In-Reply-To: <00f701c40a94$c1793440$9100000a@intgrp.com>
References: <20040315081806.GT6425@charite.de> <20040315091117.129.qmail@web60107.mail.yahoo.com> <20040315091814.GC6425@charite.de> <2004-03-15-12-00-05-gmx-seems-broken@bodo-thiesen-server.dyndns.org> <00f701c40a94$c1793440$9100000a@intgrp.com>
Message-ID: <2004-03-15-22-52-14-dumb-gmx-seems-to-forget-to-create-a-message-id@bodo-thiesen-server.dyndns.org>

"Eric Wood" wrote:
> Bodo Thiesen wrote:
>
>> Maybe the commercial/proprietary tool "Partition Magic"[1] is able to
>> do that, since it has supported ext2(3) and has been able to merge FAT
>> & Co. for years already.
>
> After corrupting two NTFS systems, Partition Magic is on my black list.

The FAT and EXT2 (and other) filesystems are officially documented; NTFS is not. Maybe PowerQuest didn't want to pay $$$^10 to Microshit to get the NTFS sources for their studies? BTW: which version are you talking about? (Remembering the documentation, PowerQuest tells users to do a backup before using PartitionMagic ...) Next thing: there are different NTFS versions around. Using an old Partition Magic version on a new NTFS partition may be a dumb idea, too. In general I can say that no operation I did on FAT partitions has ever destroyed any data. (Of course, FAT != NTFS, and both != EXT2, so maybe not of interest.)

Regards, Bodo

From texmex at uni.de Wed Mar 17 13:39:33 2004
From: texmex at uni.de (Gregor Zattler)
Date: Wed, 17 Mar 2004 14:39:33 +0100
Subject: mke2fs -O dir_index safe to use with kernel >=2.4.25 ?
Message-ID: <20040317133933.GA17840@pit.ID-43118.user.dfncis.de>

Hi,

is it safe to use the ext2 / ext3 hashed b-trees feature with Linux kernel >= 2.4.25 ?

thanx, Gregor
--
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D3F204445524F42snlbxq'|dc

From philip at staff.texas.net Thu Mar 18 01:15:09 2004
From: philip at staff.texas.net (Philip Molter)
Date: Wed, 17 Mar 2004 19:15:09 -0600
Subject: How does ext3 handle drive failures?
Message-ID: <20040318011509.GX26451@staff.texas.net>

We want to run the multi-drive systems we have in a JBOD mode, where each drive is basically a filesystem to itself. With the drives we currently have, we expect to have multiple failures, primarily unrecoverable ECC read errors, or sometimes the drive just dying altogether.

How does ext[23] handle these two primary conditions? Using them in a software RAID mode, I have sometimes seen disk problems hang all access to the filesystem and even the entire system, but I'm not sure at what level that's happening (low-level driver? scsi layer? raid layer? filesystem layer?).

If I have a drive fail, taking out the entire ext3 filesystem, will I be able to stop using the filesystem (say, my application gets the error from the fs indicating some sort of problem in whatever system call it's made, who cares what), forcibly unmount the filesystem, and replace the drive? Or will the system panic? Or worse, will my application just enter an uninterruptible sleep, never to return success or error?

Obviously, we'll be doing our own testing, but any knowledge of these scenarios would be most appreciated.
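Part of what Philip asks about is tunable: what ext2/ext3 does when it detects a metadata error is controlled by the errors= mount option, whose per-filesystem default lives in the superblock and is set with tune2fs -e (continue, remount-ro, or panic). The setting can be inspected on a throwaway file-backed image, no root needed; the path and size here are illustrative:

```shell
IMG=/tmp/err-demo.img
dd if=/dev/zero of="$IMG" bs=1M count=8 2>/dev/null
mke2fs -F -q -j "$IMG"                      # -j adds a journal, i.e. ext3
tune2fs -e remount-ro "$IMG"                # on error: remount read-only
tune2fs -l "$IMG" | grep 'Errors behavior'  # alternatives: continue, panic
```

errors=panic gives fail-stop behavior (the whole box goes down on the first detected corruption); errors=remount-ro keeps the system up while fencing off the bad filesystem, which is closer to the unmount-and-replace workflow Philip describes.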
Philip * Philip Molter * Texas.Net Internet * http://www.texas.net/ * philip at texas.net From guolin at alexa.com Fri Mar 19 01:58:49 2004 From: guolin at alexa.com (Guolin Cheng) Date: Thu, 18 Mar 2004 17:58:49 -0800 Subject: Promise SATA patch for vanlilla linux 2.4.2* kernel Message-ID: <41089CB27BD8D24E8385C8003EDAF7ABBA484F@karl.alexa.com> Hi, Anyone know if there is a Promise SATA patch for general linux 2.4.2* kernel downloaded from www.kernel.org? It should exist somewhere.. Thanks. --Guolin Cheng From m.c.p at gmx.net Fri Mar 19 09:00:45 2004 From: m.c.p at gmx.net (Marc-Christian Petersen) Date: Fri, 19 Mar 2004 10:00:45 +0100 Subject: Promise SATA patch for vanlilla linux 2.4.2* kernel In-Reply-To: <41089CB27BD8D24E8385C8003EDAF7ABBA484F@karl.alexa.com> References: <41089CB27BD8D24E8385C8003EDAF7ABBA484F@karl.alexa.com> Message-ID: <200403191000.45971@WOLK> On Friday 19 March 2004 02:58, Guolin Cheng wrote: Hi Guolin, > Anyone know if there is a Promise SATA patch for general linux 2.4.2* > kernel downloaded from www.kernel.org? It should exist somewhere.. Thanks. even though it is the wrong mailing list to ask those kind of questions ;) ... yes, there is. Imho "Promise PDC ULTRA SATA support v1.00.0.10" on promise.com. If you don't find it, I have it flying around somewhere on my hds. -- ciao, Marc From adilger at clusterfs.com Fri Mar 19 05:46:51 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Thu, 18 Mar 2004 22:46:51 -0700 Subject: How does ext3 handle drive failures? In-Reply-To: <20040318011509.GX26451@staff.texas.net> References: <20040318011509.GX26451@staff.texas.net> Message-ID: <20040319054651.GA1177@schnapps.adilger.int> On Mar 17, 2004 19:15 -0600, Philip Molter wrote: > We want to run multi-drive systems we have in a JBOD mode, where > each drive is basically a filesystem to itself. 
With the drives > we currently have, we expect to have multiple failures, primarily > unrecoverable ECC read errors or sometimes the drive just dying > altogether. > > How does ext[23] handle these two primary conditions? Using them > in a software RAID mode, I have sometimes seen problems with disks > hang all access to the filesystem and even the entire system, but > I'm not sure at what level that's happening (low-level driver? > scsi layer? raid layer? filesystem layer?). This is entirely an issue with the bus or SCSI layer, and not the filesystem. > If I have a drive fail taking out the entire ext3 filesystem, will > I be able to stop using the filesystem (say, my application gets > the error from the fs indicating some sort of problem in whatever > system call it's made, who cares what), forcibly unmount the > filesystem, and replace the drive? Or will the system panic? Or > worse, will my application just enter an uninterruptible sleep > never to return success or error? Of all Linux filesystems, I think you'll find that ext2/ext3 probably handle media and device errors the most gracefully (i.e. not panicing because of cascading errors, unless you want that with errors=panic). Whether you'll be able to unmount is really dependent on a lot of factors so it's hard to comment. When our storage servers (running ext3) have some catastrophic disk problem we can usually unmount. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ From Ralf.Hildebrandt at charite.de Sun Mar 21 22:54:52 2004 From: Ralf.Hildebrandt at charite.de (Ralf.Hildebrandt at charite.de) Date: Mon, 22 Mar 2004 07:54:52 +0900 Subject: Request response Message-ID: An HTML attachment was scrubbed... 
URL: From evilninja at gmx.net Mon Mar 22 04:09:20 2004 From: evilninja at gmx.net (evilninja) Date: Mon, 22 Mar 2004 05:09:20 +0100 Subject: Assertion failure in ext3_put_super() at fs/ext3/super.c:412 Message-ID: <405E66F0.6060702@gmx.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 hi, today i re-organized my data, lots of "mv" and "cp". then, upon unmounting a ext3 partition the following was shown on the console: (the leading "Mar 22 02:40:04 sheep kernel:" is cut here) - ------------------------------- ~ sb orphan head is 940994 ~ sb_info orphan list: ~ inode sdd4:940994 at cf852874: mode 40755, nlink 0, next 486721 ~ inode sdd4:486721 at cfce8674: mode 40755, nlink 0, next 551857 ~ inode sdd4:551857 at cfce80d4: mode 40755, nlink 0, next 0 ~ Assertion failure in ext3_put_super() at fs/ext3/super.c:412: "list_empty(&sbi->s_orphan)" ~ ------------[ cut here ]------------ ~ kernel BUG at fs/ext3/super.c:412! ~ invalid operand: 0000 [#1] ~ PREEMPT ~ CPU: 0 ~ EIP: 0060:[] Not tainted ~ EFLAGS: 00010286 ~ EIP is at ext3_put_super+0x137/0x1a0 ~ eax: 0000005e ebx: c138d364 ecx: 00000001 edx: c03255b8 ~ esi: c138d2e0 edi: cf8e5400 ebp: c1fb6000 esp: c1fb7f10 ~ ds: 007b es: 007b ss: 0068 ~ Process umount (pid: 21926, threadinfo=c1fb6000 task=cfa20c00) ~ Stack: c02f9d80 c02e82eb c02f702e 0000019c c02f7013 cf8e544c cf8e5400 c0329ca0 ~ c0157526 cf8e5400 cf8e5400 cf824060 c0329e20 c015811d cf8e5400 0804d218 ~ cf8e5400 c1fb6000 c015725f cf8e5400 c03bfee0 00000000 c1fb7f7c 0804d218 ~ Call Trace: ~ [] generic_shutdown_super+0x176/0x190 ~ [] kill_block_super+0x1d/0x40 ~ [] deactivate_super+0x5f/0xc0 ~ [] sys_umount+0x3f/0xa0 ~ [] sys_oldumount+0x15/0x20 ~ [] syscall_call+0x7/0xb ~ Code: 0f 0b 9c 01 2e 70 2f c0 e9 60 ff ff ff 89 74 24 04 89 3c 24 - ------------------------------- the filesystem here was existing for a year now, passed regular checks and did not show any corruptions. it has never shown anything similar in the logs and i was not able to reproduce it. 
i have to add, that i did something strange before unmounting. the partition was mounted under "/data", but "umount /data" failed (busy). "lsof" has shown some files locked by the apache webserver marked "deleted". it was true, i mv'ed files to another place, apache was still running and apparently locking files. i then killed the apache process, no locked files any more, so i was able to "umount /data" --> then the error shown above happened. after the error i was not able to "/bin/sync" nor to SYSRQ+S, i had to reboot. probably not a bug, but i found it worth to report. this all happend with a vanilla 2.6.4 (not tainted), compiled with gcc-3.3.3 on i386 (Pentium3), IBM-ESXS disks. Thanks, Christian. - -- BOFH excuse #87: Password is too complex to decrypt -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFAXmbwC/PVm5+NVoYRApOyAJ45VhXeIS/FFTrF+lxTOs1mhGgzHQCfc2vY pOeQTaJKs9pPRjhuDGLSdXM= =sZpV -----END PGP SIGNATURE----- From mb/ext3 at dcs.qmul.ac.uk Mon Mar 22 16:26:51 2004 From: mb/ext3 at dcs.qmul.ac.uk (Matt Bernstein) Date: Mon, 22 Mar 2004 16:26:51 +0000 (GMT) Subject: mke2fs -O dir_index save to use with kernel >=2.4.25 ? In-Reply-To: <20040317133933.GA17840@pit.ID-43118.user.dfncis.de> References: <20040317133933.GA17840@pit.ID-43118.user.dfncis.de> Message-ID: On Mar 17 Gregor Zattler wrote: >is it save to use ext2 / ext3 hashed b-trees feature with linux >kernel >= 2.4.25 ? I believe so, in that the kernel and your filesystem won't blow up! However, to reap the benefits of dir_index, you need to run Linux 2.6, or a specially patched 2.4. I believe that if you're hopping between both versions, you need to run fsck -fD on your ext3 volumes. Unfortunately this takes a lot of time to run on my big volume with over a million files--I'd love to do it on-line to save downtime, but I don't believe such a tool to do this safely[1] exists. Matt [1] by safely I meant atomically. 
If you don't care about this then tar will suffice! Or, better, do a "find -type d", re-make the hierarchy and mv all your files (etc) across. This can be atomic for file handles modulo modulo the big fat race of no opens or creates (etc) are called ;) From mfedyk at matchmail.com Mon Mar 22 23:16:35 2004 From: mfedyk at matchmail.com (Mike Fedyk) Date: Mon, 22 Mar 2004 15:16:35 -0800 Subject: [OT] block allocation algorithm [Was] Re: heavily fragmented file system.. How to defrag it on-line?? In-Reply-To: <20040315031332.GD19159@exavio.com.cn> References: <41089CB27BD8D24E8385C8003EDAF7AB06D296@karl.alexa.com> <20040304021707.GA13386@thunk.org> <20040309074643.GA1557@exavio.com.cn> <20040309080248.GB1144@schnapps.adilger.int> <20040315031332.GD19159@exavio.com.cn> Message-ID: <405F73D3.3060905@matchmail.com> Isaac Claymore wrote: > On Tue, Mar 09, 2004 at 01:02:48AM -0700, Andreas Dilger wrote: > >>On Mar 09, 2004 15:46 +0800, Isaac Claymore wrote: >> >>>I've got a workload that several clients tend to write to separate files >>>under a same dir simultaneously, resulting in heavily fragmented files. >>>And, even worse, those files are rarely read simultaneously, thus read >>>performance degrades quite alot. >>> >>>I'm wondering whether there's any feature that helps alleviating >>>fragmentation in such workloads. Does writing to different dirs(of a same >>>filesystem) help? >> >>Very much yes. Files allocated from different directories will get blocks >>from different parts of the filesystem (if available), so they should be >>less fragmented. In 2.6 there is a heuristic that files opened by different >>processes allocate from different parts of a group, even within the same >>directory, but that only really helps if the files themselves aren't too >>large (i.e. under 8MB or so). >> > > > Thanks, I did some test on this last weekend, and here're the results in > case someone is interested: > Yes, very interesting. It would be nice if JFS was in this comparison. 
Also, the next step is running each filesystem under your workload (Like test 5) and compare the fragmentation and performance over a longer period of time. Read below for some more comments... > > Test environment: > > kernel: 2.6.3 with latest reiser4 patches applied. > OS: Debian testing/unstable > HW: Intel(R) Pentium(R) 4 CPU 1.80GHz, 256M RAM > > For each FS configuration, my test went on as: dumping 3 files of 1G > each simultaneously, and measure the fragmentation with 'filefrag'. > > Each test iteration was done on a freshly formatted filesystem. > > Here goes the figures & my evaluations: > > 1. reiser3, 3 files under a same dir: > > sandbox:/mnt/foo [1016]# dd if=/dev/zero of=f0 bs=16M count=64&;dd > if=/dev/zero of=f1 bs=16M count=64&;dd if=/dev/zero of=f2 bs=16M count=64&;wait > > sandbox:/mnt/foo [1018]# filefrag f0 f1 f2 > f0: 470 extents found > f1: 461 extents found > f2: 470 extents found > > > My Evaluation: badly fragmented! > > > 2. reiser3, 3 files under 3 different dirs: > > sandbox:/mnt [1028]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd > if=/dev/zero of=dir1/foo bs=16M count=64&;dd if=/dev/zero of=dir2/foo > bs=16M count=64&;wait > > sandbox:/mnt [1029]# filefrag dir0/foo dir1/foo dir2/foo > dir0/foo: 448 extents found > dir1/foo: 462 extents found > dir2/foo: 443 extents found > > > My Evaluation: still bad, spreading the files under different dirs did no > visible good. How about with 10 dirs? > > > > 3. ext3, 3 files under a same dir: > > sandbox:/mnt/foo [1041]# dd if=/dev/zero of=f0 bs=16M count=64&;dd > if=/dev/zero of=f1 bs=16M count=64&;dd if=/dev/zero of=f2 bs=16M > count=64&;wait > > sandbox:/mnt/foo [1044]# filefrag f0 f1 f2 > f0: 202 extents found, perfection would be 9 extents > f1: 207 extents found, perfection would be 9 extents > f2: 208 extents found, perfection would be 9 extents > > > My Evaluation: much better than reiser3, yet far from perfection. > > > > 4. 
ext3, 3 files under 3 different dirs: > > sandbox:/mnt [1054]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd > if=/dev/zero of=dir1/foo bs=16M count=64&;dd if=/dev/zero of=dir2/foo > bs=16M count=64&;wait > > sandbox:/mnt [1056]# sandbox:/mnt [1056]# filefrag dir0/foo dir1/foo dir2/foo > dir0/foo: 91 extents found, perfection would be 9 extents > dir1/foo: 9 extents found > dir2/foo: 95 extents found, perfection would be 9 extents > > > My Evaluation: spreading the files under different dirs DID help quite > alot! but can we get even better result by spread the files more sparsely? > (see next test) > > > 5. still ext3, mkdir 10 dirs first, then dumping the files under the > 1st, 5th, and 9th dirs: > > sandbox:/mnt [1085]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd > if=/dev/zero of=dir4/foo bs=16M count=64&;dd if=/dev/zero of=dir9/foo > bs=16M count=64&;wait > > sandbox:/mnt [1086]# filefrag dir{0,4,9}/foo > dir0/foo: 11 extents found, perfection would be 9 extents > dir4/foo: 11 extents found, perfection would be 9 extents > dir9/foo: 10 extents found, perfection would be 9 extents > > > My Evaluation: almost perfect! > > > > 6. XFS, 3 files under a same dir: > > sandbox:/mnt/foo [1112]# dd if=/dev/zero of=f0 bs=16M count=64&;dd if=/dev/zero > of=f1 bs=16M count=64&;dd if=/dev/zero of=f2 bs=16M count=64&;wait > > sandbox:/mnt/foo [1114]# filefrag f0 f1 f2 > f0: 25 extents found > f1: 11 extents found > f2: 20 extents found > > > My Evaluation: this'd be the BEST result I got, when dumping into a same dir. > > > > > 7. XFS, dumping into 3 dirs among ten, similar to test 5: > > sandbox:/mnt [1127]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd if=/dev/zero > of=dir4/foo bs=16M count=64&;dd if=/dev/zero of=dir9/foo bs=16M count=64&;wait > > sandbox:/mnt [1128]# filefrag dir0/foo dir4/foo dir9/foo > dir0/foo: 1 extent found > dir4/foo: 1 extent found > dir9/foo: 1 extent found > > > My Evaluation: impressed! cant be any better now. > How about with 3 dirs? 
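Every measurement in this thread follows the same pattern: parallel dd writers, then filefrag to count extents per file. A scaled-down harness (16MB files instead of the original 1GB; TESTDIR is arbitrary, run it on the filesystem you want to measure):

```shell
#!/bin/sh
TESTDIR=/tmp/frag-test
for i in 0 1 2 3 4 5 6 7 8 9; do
    mkdir -p "$TESTDIR/dir$i"           # ten directories, as in test 5 above
done
for i in 0 4 9; do                      # three concurrent writers, spread out
    dd if=/dev/zero of="$TESTDIR/dir$i/foo" bs=1M count=16 2>/dev/null &
done
wait
# filefrag (from e2fsprogs) prints the extent count per file; fewer is better
command -v filefrag >/dev/null && \
    filefrag "$TESTDIR"/dir0/foo "$TESTDIR"/dir4/foo "$TESTDIR"/dir9/foo || true
```

Note that extent counts are only comparable when each run starts from a freshly formatted filesystem, as the tests above were careful to do.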
> > 8. Reiser4, 1 dir > > sandbox:/mnt/foo [1155]# dd if=/dev/zero of=f0 bs=16M count=64&;dd if=/dev/zero > of=f1 bs=16M count=64&;dd if=/dev/zero of=f2 bs=16M count=64&;wait > > sandbox:/mnt/foo [1156]# filefrag f0 f1 f2 > f0: 45 extents found > f1: 6011 extents found > f2: 45 extents found > > > My Evaluation: far better than it's brother reiser3. the 6011 extents of > f1 was weird, i'd have done more iterations to get an average, just blame > lazy me ;) > > > > 9. Reiser4, 3 dirs among 10: > > sandbox:/mnt [1165]# dd if=/dev/zero of=dir0/foo bs=16M count=64&;dd > if=/dev/zero of=dir4/foo bs=16M count=64&;dd if=/dev/zero of=dir9/foo > bs=16M count=64&;wait > > sandbox:/mnt [1167]# filefrag dir{0,4,9}/foo > dir0/foo: 42 extents found > dir4/foo: 50 extents found > dir9/foo: 46 extents found > > > My Evaluation: nice figures, really. and unlike its elder brother, using > more dirs DID help. How about with 3 dirs? Mike From ldutra at wlt.com.br Fri Mar 19 15:03:27 2004 From: ldutra at wlt.com.br (=?iso-8859-1?q?Leandro_Guimar=E3es_Faria_Corsetti_Dutra?=) Date: Fri, 19 Mar 2004 12:03:27 -0300 Subject: How does ext3 handle drive failures? References: <20040318011509.GX26451@staff.texas.net> Message-ID: On Wed, 17 Mar 2004 19:15:09 -0600, Philip Molter wrote: > Using them in a > software RAID mode, I have sometimes seen problems with disks hang all > access to the filesystem and even the entire system, but I'm not sure at > what level that's happening Check my posts to this list... there is some nasty interaction between soft RAID and ext3 since 2.5.X, already reported by several people here, at linux-kernel and linux-raid.
Until now these reports have gone unanswered, presumably because the people In The Know are busy either trying to reproduce and diagnose it or because there are even more critical -- or more interesting -- things to do. So for now the options are either not using software RAID, not using 2.6, or not using ext3. If you can afford it, I'd suggest hardware RAID, and SCSI if you're really rich.
--
Leandro Guimarães Faria Corsetti Dutra +55 (44) 3028 7467
WebLink Tecnologia +55 (44) 269 71 78
Maringá, PR BRAZIL

From crosser at rol.ru Wed Mar 24 10:47:19 2004
From: crosser at rol.ru (Eugene Crosser)
Date: Wed, 24 Mar 2004 13:47:19 +0300
Subject: stalled 'sync' on ext3+quota over drbd
Message-ID: <1080125239.4717.33.camel@ariel.sovam.com>

I don't know yet if this is an ext3, quota or drbd issue, but I'll ask anyway. I am building an HA NFS server using two Dell 1750s and drbd. I have an ext3 filesystem with quota built on a drbd device running over a 200GB disk partition (hardware raid0+1), drbd-mirrored across servers. The kernel is 2.4.25, so hopefully the quota deadlock should not be a problem (it was on 2.4.24).

Now, the setup mostly works fine. But if you actively use the filesystem for some time (an hour of copying a large tree over NFS) and then try the 'sync' command, the latter runs very long (10 minutes or more), eating 99% CPU according to top, and the system becomes very sluggish (leading to stalled replication, heartbeat misbehavior) and in fact unusable.

Any ideas why this happens and/or suggestions for further investigation?

Eugene
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
URL:

From d_baron at 012.net.il Wed Mar 24 11:55:32 2004
From: d_baron at 012.net.il (David Baron)
Date: Wed, 24 Mar 2004 12:55:32 +0100
Subject: Busy BLKFLSBUF on bootup
Message-ID: <200403241255.32904.d_baron@012.net.il>

I get this all the time.
The system works otherwise, and forced fscks come out clean. What is it, and what needs to be done?

From texmex at uni.de Wed Mar 24 14:07:21 2004
From: texmex at uni.de (Gregor Zattler)
Date: Wed, 24 Mar 2004 15:07:21 +0100
Subject: patches? (was: Re: mke2fs -O dir_index safe to use with kernel >=2.4.25 ?)
In-Reply-To:
References: <20040317133933.GA17840@pit.ID-43118.user.dfncis.de>
Message-ID: <20040324140721.GA5374@pit.ID-43118.user.dfncis.de>

Hi ext2/3 developers, hi Matt,

* Matt Bernstein [22. Mär. 2004]:
> >is it safe to use the ext2 / ext3 hashed b-trees feature with Linux
> >kernel >= 2.4.25 ?
>
> I believe so, in that the kernel and your filesystem won't blow up!
>
> However, to reap the benefits of dir_index, you need to run Linux 2.6, or
> a specially patched 2.4.

Are patches still needed in recent 2.4 kernels? Where can I get them?

Thanks for your support, Gregor

From borise at comcast.net Wed Mar 24 20:19:46 2004
From: borise at comcast.net (Boris Erl)
Date: Wed, 24 Mar 2004 12:19:46 -0800
Subject: ext3 performance with external log
Message-ID:

Hi,

I have strange benchmarking results: ext3 with an internal log performs 3 times faster than with an external log.

Linux ES 3.0 (kernel 2.4.21-9.0.1ESsmp), Dell PE 2600, 2 CPU 2.4GHz, 6GB RAM, Adaptec Ultra 160 SCSI 29160 controller
External hardware 9-disk RAID 0; 512MB write-back controller cache; U160
Benchmark: SpecSFS v 3.0
Mount options: data=journal,sync
Export options: sync

With ext3 FS on external 9-disk RAID 0:
internal log ~ 3000 NFS Ops/sec
external log on a second partition on the same RAID 0 ~ 1000 Ops/sec
external log on a dedicated HD (without write-back cache) or NVRAM card ~ 1000 Ops/sec

Will appreciate any ideas.
Boris Erlikhman From mfedyk at matchmail.com Thu Mar 25 21:09:39 2004 From: mfedyk at matchmail.com (Mike Fedyk) Date: Thu, 25 Mar 2004 13:09:39 -0800 Subject: mkfs under different kernels In-Reply-To: <404F255B.2060701@outblaze.com> References: <404F1FBA.3090800@outblaze.com> <404F255B.2060701@outblaze.com> Message-ID: <40634A93.9040709@matchmail.com> Christopher Chan wrote: > Christopher Chan wrote: > >> Is there any difference between running mkfs.ext3 under a 2.4 kernel >> and running that under a 2.6 kernel? >> >> Any benefits gained from doing it under 2.6? > > > Let me add some context. I am going to conduct some benchmarks between > xfs and ext3 under different configurations and I wondered if running > mkfs under different kernels would affect the results. mkfs writes the filesystem meta-data from userspace to a block device. The only thing different between kernels might be speed. Now, once it is mounted, there is a difference between ext3 on 2.4 and 2.6 with the orlov allocator... Mike From preining at logic.at Sat Mar 27 16:41:53 2004 From: preining at logic.at (Norbert Preining) Date: Sat, 27 Mar 2004 17:41:53 +0100 Subject: Oops with md/ext3 on 2.4.25 on alpha architecture In-Reply-To: <20031027141358.GA26271@gamma.logic.tuwien.ac.at> References: <20031027141358.GA26271@gamma.logic.tuwien.ac.at> Message-ID: <20040327164153.GA7324@gamma.logic.tuwien.ac.at> HI list! We regularly experience kernel Ooops on our alpha SX164 with ext3 on md on ide controller: No modules loaded, lsmod empty, self compiled kernel ksymoops 2.4.5 on alpha 2.4.25. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.25/ (default) -m /boot/System.map-2.4.25 (default) Warning: You did not tell me where to find symbol information. I will assume that the log matches the kernel and modules that are running right now and I'll use the default options above for symbol resolution. 
If the current kernel and/or modules do not match the log, you can get more accurate output by telling me the kernel version and where to find map, modules, ksyms etc. ksymoops -h explains the options. No modules in ksyms, skipping objects Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file? Mar 27 06:33:36 beta kernel: Unable to handle kernel paging request at virtual address 0000440003af267c Mar 27 06:33:36 beta kernel: find(826): Oops 0 Mar 27 06:33:36 beta kernel: pc = [raid1_read_balance+384/512] ra = [raid1_make_request+476/1152] ps = 0000 Not tainted Mar 27 06:33:36 beta kernel: v0 = 0000000000001068 t0 = fffffffffff7fff0 t1 = 0000000000080010 Mar 27 06:33:36 beta kernel: t2 = 0000440003af269c t3 = 0000000000000000 t4 = 0000440003af267c Mar 27 06:33:36 beta kernel: t5 = 0000440003af2674 t6 = 0000000000000002 t7 = fffffc000ec04000 Mar 27 06:33:36 beta kernel: s0 = fffffc000cb419c0 s1 = 0000000000000900 s2 = fffffc0000221000 Mar 27 06:33:36 beta kernel: s3 = fffffc000cb419c0 s4 = fffffc000fe94cc0 s5 = 0000000000000000 Mar 27 06:33:36 beta kernel: s6 = fffffc000098d8a0 Mar 27 06:33:36 beta kernel: a0 = fffffc0000221000 a1 = fffffc000cb419c0 a2 = fffffc000cb419c0 Mar 27 06:33:36 beta kernel: a3 = 0000000000001000 a4 = 0000000000000008 a5 = fffffc0000221018 Mar 27 06:33:36 beta kernel: t8 = 0000000000000000 t9 = 00004800038d1680 t10= 0000000000080010 Mar 27 06:33:36 beta kernel: t11= fffffc0000221020 pv = fffffc00004a8ba0 at = fffffc000022101c Mar 27 06:33:36 beta kernel: gp = fffffc000062fbe8 sp = fffffc000ec07c18 Mar 27 06:33:36 beta kernel: Trace:fffffc00004ab950 fffffc0000355ba0 fffffc0000392abc fffffc000043fba4 fffffc000043fcac fffffc000044000c fffffc0000355ba0 fffffc0000392b04 fffffc000036b5e0 fffffc000038ee28 fffffc000036ac28 fffffc000036b5e0 fffffc000036489c fffffc00003648f8 fffffc00003634b4 fffffc000036b5e0 fffffc000036b808 fffffc0000310c40 Mar 27 06:33:36 beta kernel: Code: 42fc0403 2ffe0000 42f90405 a0f003d8 40a49525 40c49526 
40649523 Using defaults from ksymoops -t elf64-alpha -a alpha Trace; fffffc00004ab950 Trace; fffffc0000355ba0 Trace; fffffc0000392abc Trace; fffffc000043fba4 Trace; fffffc000043fcac Trace; fffffc000044000c Trace; fffffc0000355ba0 Trace; fffffc0000392b04 Trace; fffffc000036b5e0 Trace; fffffc000038ee28 Trace; fffffc000036ac28 Trace; fffffc000036b5e0 Trace; fffffc000036489c Trace; fffffc00003648f8 Trace; fffffc00003634b4 Trace; fffffc000036b5e0 Trace; fffffc000036b808 Trace; fffffc0000310c40 Code; ffffffffffffffe8 0000000000000000 <_PC>: Code; ffffffffffffffe8 0: 03 04 fc 42 addq t9,at,t2 Code; ffffffffffffffec 4: 00 00 fe 2f unop Code; fffffffffffffff0 8: 05 04 f9 42 addq t9,t11,t4 Code; fffffffffffffff4 c: d8 03 f0 a0 ldl t6,984(a0) Code; fffffffffffffff8 10: 25 95 a4 40 subq t4,0x24,t4 Code; fffffffffffffffc 14: 26 95 c4 40 subq t5,0x24,t5 Code; 0000000000000000 Before first symbol 18: 00 00 25 a0 ldl t0,0(t4) Code; 0000000000000004 Before first symbol 1c: 23 95 64 40 subq t2,0x24,t2 2 warnings issued. Results may not be reliable. Can you help us with this oops and how to fix it? Attached is the config file and the output of dmesg. Best wishes Norbert ------------------------------------------------------------------------------- Norbert Preining Technische Universit?t Wien gpg DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094 ------------------------------------------------------------------------------- AHENNY (adj.) The way people stand when examining other people's bookshelves. 
--- Douglas Adams, The Meaning of Liff -------------- next part -------------- # # Automatically generated make config: don't edit # CONFIG_ALPHA=y # CONFIG_UID16 is not set # CONFIG_RWSEM_GENERIC_SPINLOCK is not set CONFIG_RWSEM_XCHGADD_ALGORITHM=y # # Code maturity level options # CONFIG_EXPERIMENTAL=y # # Loadable module support # CONFIG_MODULES=y # CONFIG_MODVERSIONS is not set CONFIG_KMOD=y # # General setup # # CONFIG_ALPHA_GENERIC is not set # CONFIG_ALPHA_ALCOR is not set # CONFIG_ALPHA_XL is not set # CONFIG_ALPHA_BOOK1 is not set # CONFIG_ALPHA_AVANTI is not set # CONFIG_ALPHA_CABRIOLET is not set # CONFIG_ALPHA_DP264 is not set # CONFIG_ALPHA_EB164 is not set # CONFIG_ALPHA_EB64P is not set # CONFIG_ALPHA_EB66 is not set # CONFIG_ALPHA_EB66P is not set # CONFIG_ALPHA_EIGER is not set # CONFIG_ALPHA_JENSEN is not set # CONFIG_ALPHA_LX164 is not set # CONFIG_ALPHA_LYNX is not set # CONFIG_ALPHA_MARVEL is not set # CONFIG_ALPHA_MIATA is not set # CONFIG_ALPHA_MIKASA is not set # CONFIG_ALPHA_NAUTILUS is not set # CONFIG_ALPHA_NONAME is not set # CONFIG_ALPHA_NORITAKE is not set # CONFIG_ALPHA_PC164 is not set # CONFIG_ALPHA_P2K is not set # CONFIG_ALPHA_RAWHIDE is not set # CONFIG_ALPHA_RUFFIAN is not set # CONFIG_ALPHA_RX164 is not set CONFIG_ALPHA_SX164=y # CONFIG_ALPHA_SABLE is not set # CONFIG_ALPHA_SHARK is not set # CONFIG_ALPHA_TAKARA is not set # CONFIG_ALPHA_TITAN is not set # CONFIG_ALPHA_WILDFIRE is not set CONFIG_ISA=y CONFIG_EISA=y # CONFIG_SBUS is not set # CONFIG_MCA is not set CONFIG_PCI=y CONFIG_ALPHA_EV5=y CONFIG_ALPHA_EV56=y CONFIG_ALPHA_CIA=y CONFIG_ALPHA_PYXIS=y CONFIG_ALPHA_SRM=y CONFIG_VERBOSE_MCHECK=y CONFIG_EARLY_PRINTK=y # CONFIG_DISCONTIGMEM is not set # CONFIG_ALPHA_LARGE_VMALLOC is not set CONFIG_PCI_NAMES=y # CONFIG_HOTPLUG is not set # CONFIG_PCMCIA is not set CONFIG_NET=y CONFIG_SYSVIPC=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y CONFIG_KCORE_ELF=y # CONFIG_KCORE_AOUT is not set CONFIG_SRM_ENV=y 
CONFIG_BINFMT_AOUT=m # CONFIG_OSF4_COMPAT is not set CONFIG_BINFMT_ELF=y CONFIG_BINFMT_MISC=m CONFIG_BINFMT_EM86=y # # Parallel port support # # CONFIG_PARPORT is not set # # Memory Technology Devices (MTD) # # CONFIG_MTD is not set # # Plug and Play configuration # # CONFIG_PNP is not set # CONFIG_ISAPNP is not set # # Block devices # CONFIG_BLK_DEV_FD=y # CONFIG_BLK_DEV_XD is not set # CONFIG_PARIDE is not set # CONFIG_BLK_CPQ_DA is not set # CONFIG_BLK_CPQ_CISS_DA is not set # CONFIG_CISS_SCSI_TAPE is not set # CONFIG_CISS_MONITOR_THREAD is not set # CONFIG_BLK_DEV_DAC960 is not set # CONFIG_BLK_DEV_UMEM is not set CONFIG_BLK_DEV_LOOP=m # CONFIG_BLK_DEV_NBD is not set CONFIG_BLK_DEV_RAM=m CONFIG_BLK_DEV_RAM_SIZE=8192 # CONFIG_BLK_DEV_INITRD is not set # CONFIG_BLK_STATS is not set # # Multi-device support (RAID and LVM) # CONFIG_MD=y CONFIG_BLK_DEV_MD=y CONFIG_MD_LINEAR=y CONFIG_MD_RAID0=y CONFIG_MD_RAID1=y CONFIG_MD_RAID5=m # CONFIG_MD_MULTIPATH is not set # CONFIG_BLK_DEV_LVM is not set # # Networking options # CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set CONFIG_NETLINK_DEV=y # CONFIG_NETFILTER is not set CONFIG_FILTER=y CONFIG_UNIX=y CONFIG_INET=y # CONFIG_IP_MULTICAST is not set # CONFIG_IP_ADVANCED_ROUTER is not set # CONFIG_IP_PNP is not set # CONFIG_NET_IPIP is not set # CONFIG_NET_IPGRE is not set # CONFIG_ARPD is not set # CONFIG_INET_ECN is not set # CONFIG_SYN_COOKIES is not set # CONFIG_IPV6 is not set # CONFIG_KHTTPD is not set # # SCTP Configuration (EXPERIMENTAL) # CONFIG_IPV6_SCTP__=y # CONFIG_IP_SCTP is not set # CONFIG_ATM is not set # CONFIG_VLAN_8021Q is not set # # # # CONFIG_IPX is not set # CONFIG_ATALK is not set # # Appletalk devices # # CONFIG_DEV_APPLETALK is not set # CONFIG_DECNET is not set # CONFIG_BRIDGE is not set # CONFIG_X25 is not set # CONFIG_LAPB is not set # CONFIG_LLC is not set # CONFIG_NET_DIVERT is not set # CONFIG_ECONET is not set # CONFIG_WAN_ROUTER is not set # CONFIG_NET_FASTROUTE is not set # 
CONFIG_NET_HW_FLOWCONTROL is not set # # QoS and/or fair queueing # # CONFIG_NET_SCHED is not set # # Network testing # # CONFIG_NET_PKTGEN is not set # # ATA/IDE/MFM/RLL support # CONFIG_IDE=y MAX_HWIFS=4 # # IDE, ATA and ATAPI Block devices # CONFIG_BLK_DEV_IDE=y # # Please see Documentation/ide.txt for help/info on IDE drives # # CONFIG_BLK_DEV_HD_IDE is not set # CONFIG_BLK_DEV_HD is not set CONFIG_BLK_DEV_IDEDISK=y # CONFIG_IDEDISK_MULTI_MODE is not set # CONFIG_IDEDISK_STROKE is not set # CONFIG_BLK_DEV_IDECS is not set CONFIG_BLK_DEV_IDECD=m CONFIG_BLK_DEV_IDETAPE=m # CONFIG_BLK_DEV_IDEFLOPPY is not set CONFIG_BLK_DEV_IDESCSI=m # CONFIG_IDE_TASK_IOCTL is not set # # IDE chipset support/bugfixes # # CONFIG_BLK_DEV_CMD640 is not set # CONFIG_BLK_DEV_CMD640_ENHANCED is not set # CONFIG_BLK_DEV_ISAPNP is not set CONFIG_BLK_DEV_IDEPCI=y # CONFIG_BLK_DEV_GENERIC is not set CONFIG_IDEPCI_SHARE_IRQ=y CONFIG_BLK_DEV_IDEDMA_PCI=y CONFIG_BLK_DEV_OFFBOARD=y # CONFIG_BLK_DEV_IDEDMA_FORCED is not set CONFIG_IDEDMA_PCI_AUTO=y # CONFIG_IDEDMA_ONLYDISK is not set CONFIG_BLK_DEV_IDEDMA=y # CONFIG_IDEDMA_PCI_WIP is not set # CONFIG_BLK_DEV_ADMA100 is not set # CONFIG_BLK_DEV_AEC62XX is not set # CONFIG_BLK_DEV_ALI15X3 is not set # CONFIG_WDC_ALI15X3 is not set # CONFIG_BLK_DEV_AMD74XX is not set # CONFIG_AMD74XX_OVERRIDE is not set CONFIG_BLK_DEV_CMD64X=y # CONFIG_BLK_DEV_TRIFLEX is not set # CONFIG_BLK_DEV_CY82C693 is not set # CONFIG_BLK_DEV_CS5530 is not set # CONFIG_BLK_DEV_HPT34X is not set # CONFIG_HPT34X_AUTODMA is not set # CONFIG_BLK_DEV_HPT366 is not set # CONFIG_BLK_DEV_PIIX is not set # CONFIG_BLK_DEV_NS87415 is not set # CONFIG_BLK_DEV_OPTI621 is not set CONFIG_BLK_DEV_PDC202XX_OLD=m # CONFIG_PDC202XX_BURST is not set # CONFIG_BLK_DEV_PDC202XX_NEW is not set # CONFIG_PDC202XX_FORCE is not set # CONFIG_BLK_DEV_RZ1000 is not set # CONFIG_BLK_DEV_SC1200 is not set # CONFIG_BLK_DEV_SVWKS is not set CONFIG_BLK_DEV_SIIMAGE=y # CONFIG_BLK_DEV_SIS5513 is not set # 
CONFIG_BLK_DEV_SLC90E66 is not set # CONFIG_BLK_DEV_TRM290 is not set # CONFIG_BLK_DEV_VIA82CXXX is not set # CONFIG_IDE_CHIPSETS is not set CONFIG_IDEDMA_AUTO=y # CONFIG_IDEDMA_IVB is not set # CONFIG_DMA_NONPCI is not set CONFIG_BLK_DEV_PDC202XX=y # CONFIG_BLK_DEV_ATARAID is not set # CONFIG_BLK_DEV_ATARAID_PDC is not set # CONFIG_BLK_DEV_ATARAID_HPT is not set # CONFIG_BLK_DEV_ATARAID_SII is not set # # SCSI support # CONFIG_SCSI=y # # SCSI support type (disk, tape, CD-ROM) # CONFIG_BLK_DEV_SD=y CONFIG_SD_EXTRA_DEVS=40 CONFIG_CHR_DEV_ST=y # CONFIG_CHR_DEV_OSST is not set CONFIG_BLK_DEV_SR=y # CONFIG_BLK_DEV_SR_VENDOR is not set CONFIG_SR_EXTRA_DEVS=2 CONFIG_CHR_DEV_SG=m # # Some SCSI devices (e.g. CD jukebox) support multiple LUNs # # CONFIG_SCSI_DEBUG_QUEUES is not set # CONFIG_SCSI_MULTI_LUN is not set # CONFIG_SCSI_CONSTANTS is not set CONFIG_SCSI_LOGGING=y # # SCSI low-level drivers # # CONFIG_BLK_DEV_3W_XXXX_RAID is not set # CONFIG_SCSI_7000FASST is not set # CONFIG_SCSI_ACARD is not set # CONFIG_SCSI_AHA152X is not set # CONFIG_SCSI_AHA1542 is not set # CONFIG_SCSI_AHA1740 is not set # CONFIG_SCSI_AACRAID is not set # CONFIG_SCSI_AIC7XXX is not set # CONFIG_SCSI_AIC79XX is not set # CONFIG_SCSI_AIC7XXX_OLD is not set # CONFIG_SCSI_DPT_I2O is not set # CONFIG_SCSI_ADVANSYS is not set # CONFIG_SCSI_IN2000 is not set # CONFIG_SCSI_AM53C974 is not set # CONFIG_SCSI_MEGARAID is not set # CONFIG_SCSI_MEGARAID2 is not set # CONFIG_SCSI_BUSLOGIC is not set # CONFIG_SCSI_CPQFCTS is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_DTC3280 is not set # CONFIG_SCSI_EATA is not set # CONFIG_SCSI_EATA_DMA is not set # CONFIG_SCSI_EATA_PIO is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set # CONFIG_SCSI_GDTH is not set # CONFIG_SCSI_GENERIC_NCR5380 is not set # CONFIG_SCSI_INITIO is not set # CONFIG_SCSI_INIA100 is not set # CONFIG_SCSI_NCR53C406A is not set # CONFIG_SCSI_NCR53C7xx is not set # CONFIG_SCSI_SYM53C8XX_2 is not set # CONFIG_SCSI_NCR53C8XX is not set 
CONFIG_SCSI_SYM53C8XX=y CONFIG_SCSI_NCR53C8XX_DEFAULT_TAGS=8 CONFIG_SCSI_NCR53C8XX_MAX_TAGS=32 CONFIG_SCSI_NCR53C8XX_SYNC=20 # CONFIG_SCSI_NCR53C8XX_PROFILE is not set # CONFIG_SCSI_NCR53C8XX_IOMAPPED is not set # CONFIG_SCSI_NCR53C8XX_PQS_PDS is not set CONFIG_SCSI_NCR53C8XX_SYMBIOS_COMPAT=y # CONFIG_SCSI_PAS16 is not set # CONFIG_SCSI_PCI2000 is not set # CONFIG_SCSI_PCI2220I is not set # CONFIG_SCSI_PSI240I is not set # CONFIG_SCSI_QLOGIC_FAS is not set # CONFIG_SCSI_QLOGIC_ISP is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set # CONFIG_SCSI_SIM710 is not set # CONFIG_SCSI_SYM53C416 is not set # CONFIG_SCSI_DC390T is not set # CONFIG_SCSI_T128 is not set # CONFIG_SCSI_U14_34F is not set # CONFIG_SCSI_NSP32 is not set # CONFIG_SCSI_DEBUG is not set # # Fusion MPT device support # # CONFIG_FUSION is not set # CONFIG_FUSION_BOOT is not set # CONFIG_FUSION_ISENSE is not set # CONFIG_FUSION_CTL is not set # CONFIG_FUSION_LAN is not set # # IEEE 1394 (FireWire) support (EXPERIMENTAL) # # CONFIG_IEEE1394 is not set # # Network device support # CONFIG_NETDEVICES=y # # ARCnet devices # # CONFIG_ARCNET is not set # CONFIG_DUMMY is not set # CONFIG_BONDING is not set # CONFIG_EQUALIZER is not set # CONFIG_TUN is not set # CONFIG_ETHERTAP is not set # # Ethernet (10 or 100Mbit) # CONFIG_NET_ETHERNET=y # CONFIG_SUNLANCE is not set # CONFIG_HAPPYMEAL is not set # CONFIG_SUNBMAC is not set # CONFIG_SUNQE is not set # CONFIG_SUNGEM is not set # CONFIG_NET_VENDOR_3COM is not set # CONFIG_LANCE is not set # CONFIG_NET_VENDOR_SMC is not set # CONFIG_NET_VENDOR_RACAL is not set # CONFIG_AT1700 is not set # CONFIG_DEPCA is not set # CONFIG_HP100 is not set CONFIG_NET_ISA=y # CONFIG_E2100 is not set # CONFIG_EWRK3 is not set # CONFIG_EEXPRESS is not set # CONFIG_EEXPRESS_PRO is not set # CONFIG_HPLAN_PLUS is not set # CONFIG_HPLAN is not set # CONFIG_LP486E is not set # CONFIG_ETH16I is not set # CONFIG_NE2000 is not set CONFIG_NET_PCI=y # 
CONFIG_PCNET32 is not set # CONFIG_AMD8111_ETH is not set # CONFIG_ADAPTEC_STARFIRE is not set # CONFIG_AC3200 is not set # CONFIG_APRICOT is not set # CONFIG_B44 is not set # CONFIG_CS89x0 is not set CONFIG_TULIP=y # CONFIG_TULIP_MWI is not set CONFIG_TULIP_MMIO=y # CONFIG_DE4X5 is not set # CONFIG_DGRS is not set # CONFIG_DM9102 is not set # CONFIG_EEPRO100 is not set # CONFIG_EEPRO100_PIO is not set # CONFIG_E100 is not set # CONFIG_LNE390 is not set # CONFIG_FEALNX is not set # CONFIG_NATSEMI is not set # CONFIG_NE2K_PCI is not set # CONFIG_NE3210 is not set # CONFIG_ES3210 is not set # CONFIG_8139CP is not set # CONFIG_8139TOO is not set # CONFIG_8139TOO_PIO is not set # CONFIG_8139TOO_TUNE_TWISTER is not set # CONFIG_8139TOO_8129 is not set # CONFIG_8139_OLD_RX_RESET is not set # CONFIG_SIS900 is not set # CONFIG_EPIC100 is not set # CONFIG_SUNDANCE is not set # CONFIG_SUNDANCE_MMIO is not set # CONFIG_TLAN is not set # CONFIG_VIA_RHINE is not set # CONFIG_VIA_RHINE_MMIO is not set # CONFIG_WINBOND_840 is not set # CONFIG_NET_POCKET is not set # # Ethernet (1000 Mbit) # # CONFIG_ACENIC is not set # CONFIG_DL2K is not set # CONFIG_E1000 is not set # CONFIG_MYRI_SBUS is not set # CONFIG_NS83820 is not set # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set # CONFIG_R8169 is not set # CONFIG_SK98LIN is not set # CONFIG_TIGON3 is not set # CONFIG_FDDI is not set # CONFIG_HIPPI is not set # CONFIG_PLIP is not set # CONFIG_PPP is not set # CONFIG_SLIP is not set # # Wireless LAN (non-hamradio) # # CONFIG_NET_RADIO is not set # # Token Ring devices # # CONFIG_TR is not set # CONFIG_NET_FC is not set # CONFIG_RCPCI is not set # CONFIG_SHAPER is not set # # Wan interfaces # # CONFIG_WAN is not set # # Amateur Radio support # # CONFIG_HAMRADIO is not set # # ISDN subsystem # # CONFIG_ISDN is not set # # Old CD-ROM drivers (not SCSI, not IDE) # # CONFIG_CD_NO_IDESCSI is not set # # Input core support # # CONFIG_INPUT is not set # CONFIG_INPUT_KEYBDEV is not set # 
CONFIG_INPUT_MOUSEDEV is not set # CONFIG_INPUT_JOYDEV is not set # CONFIG_INPUT_EVDEV is not set # CONFIG_INPUT_UINPUT is not set # # Character devices # CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_SERIAL=y CONFIG_SERIAL_CONSOLE=y # CONFIG_SERIAL_EXTENDED is not set # CONFIG_SERIAL_NONSTANDARD is not set CONFIG_UNIX98_PTYS=y CONFIG_UNIX98_PTY_COUNT=256 # # I2C support # # CONFIG_I2C is not set # # Mice # # CONFIG_BUSMOUSE is not set # CONFIG_MOUSE is not set # # Joysticks # # CONFIG_INPUT_GAMEPORT is not set # # Input core support is needed for gameports # # # Input core support is needed for joysticks # # CONFIG_QIC02_TAPE is not set # CONFIG_IPMI_HANDLER is not set # CONFIG_IPMI_PANIC_EVENT is not set # CONFIG_IPMI_DEVICE_INTERFACE is not set # CONFIG_IPMI_KCS is not set # CONFIG_IPMI_WATCHDOG is not set # # Watchdog Cards # # CONFIG_WATCHDOG is not set # CONFIG_SCx200 is not set # CONFIG_SCx200_GPIO is not set # CONFIG_AMD_PM768 is not set # CONFIG_NVRAM is not set CONFIG_RTC=y # CONFIG_DTLK is not set # CONFIG_R3964 is not set # CONFIG_APPLICOM is not set # # Ftape, the floppy tape device driver # # CONFIG_FTAPE is not set # CONFIG_AGP is not set # # Direct Rendering Manager (XFree86 DRI support) # # CONFIG_DRM is not set # # Multimedia devices # # CONFIG_VIDEO_DEV is not set # # File systems # CONFIG_QUOTA=y # CONFIG_QFMT_V2 is not set # CONFIG_AUTOFS_FS is not set # CONFIG_AUTOFS4_FS is not set # CONFIG_REISERFS_FS is not set # CONFIG_REISERFS_CHECK is not set # CONFIG_REISERFS_PROC_INFO is not set # CONFIG_ADFS_FS is not set # CONFIG_ADFS_FS_RW is not set # CONFIG_AFFS_FS is not set # CONFIG_HFS_FS is not set # CONFIG_HFSPLUS_FS is not set # CONFIG_BEFS_FS is not set # CONFIG_BEFS_DEBUG is not set # CONFIG_BFS_FS is not set CONFIG_EXT3_FS=y CONFIG_JBD=y # CONFIG_JBD_DEBUG is not set CONFIG_FAT_FS=y CONFIG_MSDOS_FS=y # CONFIG_UMSDOS_FS is not set CONFIG_VFAT_FS=y # CONFIG_EFS_FS is not set # CONFIG_JFFS_FS is not set # CONFIG_JFFS2_FS is not set # CONFIG_CRAMFS 
is not set CONFIG_TMPFS=y CONFIG_RAMFS=y CONFIG_ISO9660_FS=y CONFIG_JOLIET=y # CONFIG_ZISOFS is not set # CONFIG_JFS_FS is not set # CONFIG_JFS_DEBUG is not set # CONFIG_JFS_STATISTICS is not set CONFIG_MINIX_FS=m # CONFIG_VXFS_FS is not set # CONFIG_NTFS_FS is not set # CONFIG_NTFS_RW is not set # CONFIG_HPFS_FS is not set CONFIG_PROC_FS=y # CONFIG_DEVFS_FS is not set # CONFIG_DEVFS_MOUNT is not set # CONFIG_DEVFS_DEBUG is not set CONFIG_DEVPTS_FS=y # CONFIG_QNX4FS_FS is not set # CONFIG_QNX4FS_RW is not set # CONFIG_ROMFS_FS is not set CONFIG_EXT2_FS=y # CONFIG_SYSV_FS is not set # CONFIG_UDF_FS is not set # CONFIG_UDF_RW is not set # CONFIG_UFS_FS is not set # CONFIG_UFS_FS_WRITE is not set # CONFIG_XFS_FS is not set # CONFIG_XFS_QUOTA is not set # CONFIG_XFS_RT is not set # CONFIG_XFS_TRACE is not set # CONFIG_XFS_DEBUG is not set # # Network File Systems # # CONFIG_CODA_FS is not set # CONFIG_INTERMEZZO_FS is not set CONFIG_NFS_FS=y CONFIG_NFS_V3=y # CONFIG_NFS_DIRECTIO is not set # CONFIG_ROOT_NFS is not set CONFIG_NFSD=y CONFIG_NFSD_V3=y # CONFIG_NFSD_TCP is not set CONFIG_SUNRPC=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_SMB_FS=y # CONFIG_SMB_NLS_DEFAULT is not set CONFIG_SMB_UNIX=y # CONFIG_NCP_FS is not set # CONFIG_NCPFS_PACKET_SIGNING is not set # CONFIG_NCPFS_IOCTL_LOCKING is not set # CONFIG_NCPFS_STRONG is not set # CONFIG_NCPFS_NFS_NS is not set # CONFIG_NCPFS_OS2_NS is not set # CONFIG_NCPFS_SMALLDOS is not set # CONFIG_NCPFS_NLS is not set # CONFIG_NCPFS_EXTRAS is not set # CONFIG_ZISOFS_FS is not set # # Partition Types # CONFIG_PARTITION_ADVANCED=y # CONFIG_ACORN_PARTITION is not set CONFIG_OSF_PARTITION=y # CONFIG_AMIGA_PARTITION is not set # CONFIG_ATARI_PARTITION is not set # CONFIG_MAC_PARTITION is not set CONFIG_MSDOS_PARTITION=y CONFIG_BSD_DISKLABEL=y # CONFIG_MINIX_SUBPARTITION is not set # CONFIG_SOLARIS_X86_PARTITION is not set # CONFIG_UNIXWARE_DISKLABEL is not set # CONFIG_LDM_PARTITION is not set # CONFIG_SGI_PARTITION is not set # 
CONFIG_ULTRIX_PARTITION is not set # CONFIG_SUN_PARTITION is not set # CONFIG_EFI_PARTITION is not set CONFIG_SMB_NLS=y CONFIG_NLS=y # # Native Language Support # CONFIG_NLS_DEFAULT="cp437" CONFIG_NLS_CODEPAGE_437=y CONFIG_NLS_CODEPAGE_737=m CONFIG_NLS_CODEPAGE_775=m CONFIG_NLS_CODEPAGE_850=m CONFIG_NLS_CODEPAGE_852=m CONFIG_NLS_CODEPAGE_855=m CONFIG_NLS_CODEPAGE_857=m CONFIG_NLS_CODEPAGE_860=m CONFIG_NLS_CODEPAGE_861=m CONFIG_NLS_CODEPAGE_862=m CONFIG_NLS_CODEPAGE_863=m CONFIG_NLS_CODEPAGE_864=m CONFIG_NLS_CODEPAGE_865=m CONFIG_NLS_CODEPAGE_866=m CONFIG_NLS_CODEPAGE_869=m CONFIG_NLS_CODEPAGE_936=m CONFIG_NLS_CODEPAGE_950=m CONFIG_NLS_CODEPAGE_932=m CONFIG_NLS_CODEPAGE_949=m CONFIG_NLS_CODEPAGE_874=m CONFIG_NLS_ISO8859_8=m # CONFIG_NLS_CODEPAGE_1250 is not set # CONFIG_NLS_CODEPAGE_1251 is not set CONFIG_NLS_ISO8859_1=y CONFIG_NLS_ISO8859_2=m CONFIG_NLS_ISO8859_3=m CONFIG_NLS_ISO8859_4=m CONFIG_NLS_ISO8859_5=m CONFIG_NLS_ISO8859_6=m CONFIG_NLS_ISO8859_7=m CONFIG_NLS_ISO8859_9=m # CONFIG_NLS_ISO8859_13 is not set CONFIG_NLS_ISO8859_14=m CONFIG_NLS_ISO8859_15=y CONFIG_NLS_KOI8_R=m # CONFIG_NLS_KOI8_U is not set # CONFIG_NLS_UTF8 is not set # # Console drivers # CONFIG_VGA_CONSOLE=y # # Frame-buffer support # # CONFIG_FB is not set # # Sound # # CONFIG_SOUND is not set # # USB support # # CONFIG_USB is not set # # Support for USB gadgets # # CONFIG_USB_GADGET is not set # # Bluetooth support # # CONFIG_BLUEZ is not set # # Kernel hacking # CONFIG_ALPHA_LEGACY_START_ADDRESS=y CONFIG_DEBUG_KERNEL=y CONFIG_MATHEMU=y # CONFIG_DEBUG_SLAB is not set CONFIG_MAGIC_SYSRQ=y # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_RWLOCK is not set # CONFIG_DEBUG_SEMAPHORE is not set CONFIG_LOG_BUF_SHIFT=17 # # Cryptographic options # # CONFIG_CRYPTO is not set # # Library routines # # CONFIG_CRC32 is not set CONFIG_ZLIB_INFLATE=m CONFIG_ZLIB_DEFLATE=m -------------- next part -------------- Linux version 2.4.25 (root at beta) (gcc version 2.95.4 20011002 (Debian prerelease)) #6 Thu 
Mar 25 16:35:34 CET 2004 Booting on EB164 variation SX164 using machine vector SX164 from SRM Major Options: EV56 LEGACY_START MAGIC_SYSRQ Command line: ro root=/dev/sda1 memcluster 0, usage 1, start 0, end 256 memcluster 1, usage 0, start 256, end 32758 memcluster 2, usage 1, start 32758, end 32768 freeing pages 256:384 freeing pages 827:32758 reserving pages 827:828 pci: cia revision 1 (pyxis) On node 0 totalpages: 32758 zone(0): 32758 pages. zone(1): 0 pages. zone(2): 0 pages. Kernel command line: ro root=/dev/sda1 Using epoch = 1952 Console: colour VGA+ 80x25 Calibrating delay loop... 1055.68 BogoMIPS Memory: 253616k/262064k available (2175k kernel code, 6400k reserved, 577k data, 368k init) Dentry cache hash table entries: 32768 (order: 6, 524288 bytes) Inode cache hash table entries: 16384 (order: 5, 262144 bytes) Mount cache hash table entries: 512 (order: 0, 8192 bytes) Buffer cache hash table entries: 8192 (order: 3, 65536 bytes) Page-cache hash table entries: 32768 (order: 5, 262144 bytes) POSIX conformance testing by UNIFIX pci: passed tb register update test pci: passed sg loopback i/o read test pci: passed pte write cache snoop test pci: failed valid tag invalid pte reload test (mcheck; workaround available) pci: passed pci machine check test pci: tbia workaround enabled pci: enabling save/restore of SRM state SMC37c669 Super I/O Controller found @ 0x3f0 Linux NET4.0 for Linux 2.4 Based upon Swansea University Computer Society NET3.039 Initializing RT netlink socket srm_env: version 0.0.5 loaded successfully Starting kswapd VFS: Disk quotas vdquot_6.5.1 Journalled Block Device driver loaded Installing knfsd (copyright (C) 1996 okir at monad.swb.de). 
pty: 256 Unix98 ptys configured Serial driver version 5.05c (2001-07-08) with MANY_PORTS SHARE_IRQ SERIAL_PCI enabled ttyS00 at 0x03f8 (irq = 4) is a 16550A ttyS01 at 0x02f8 (irq = 3) is a 16550A rtc: Digital UNIX epoch (1952) detected Real Time Clock Driver v1.10f Floppy drive(s): fd0 is 2.88M FDC 0 is a post-1991 82077 Linux Tulip driver version 0.9.15-pre12 (Aug 9, 2002) tulip0: EEPROM default media type Autosense. tulip0: Index #0 - Media MII (#11) described by a 21140 MII PHY (1) block. tulip0: MII transceiver #0 config 0000 status 780d advertising 01e1. eth0: Digital DS21140 Tulip rev 34 at 0xfffffc880a104000, 00:40:05:36:50:D4, IRQ 25. Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx SiI680: IDE controller at PCI slot 00:06.0 SiI680: chipset revision 1 SiI680: not 100% native mode: will probe irqs later SiI680: BASE CLOCK == 133 ide0: MMIO-DMA , BIOS settings: hda:pio, hdb:pio ide1: MMIO-DMA , BIOS settings: hdc:pio, hdd:pio hda: HDS722516VLAT80, ATA DISK drive blk: queue fffffc0000609c58, no I/O memory limit hdc: HDS722516VLAT80, ATA DISK drive blk: queue fffffc000060a3c0, no I/O memory limit ide0 at 0xfffffc880a102080-0xfffffc880a102087,0xfffffc880a10208a on irq 27 ide1 at 0xfffffc880a1020c0-0xfffffc880a1020c7,0xfffffc880a1020ca on irq 27 hda: attached ide-disk driver. hda: host protected area => 1 hda: 321672960 sectors (164697 MB) w/7938KiB Cache, CHS=20023/255/63, UDMA(100) hdc: attached ide-disk driver. 
hdc: host protected area => 1 hdc: 321672960 sectors (164697 MB) w/7938KiB Cache, CHS=20023/255/63, UDMA(100) Partition check: hda: hda1 hdc: hdc1 SCSI subsystem driver Revision: 1.00 sym53c8xx: at PCI bus 0, device 7, function 0 sym53c8xx: setting PCI_COMMAND_PARITY...(fix-up) sym53c8xx: 53c875 detected with Symbios NVRAM sym53c875-0: rev 0x26 on pci bus 0 device 7 function 0 irq 26 sym53c875-0: Symbios format NVRAM, ID 7, Fast-20, Parity Checking sym53c875-0: on-chip RAM at 0xa100000 sym53c875-0: restart (scsi reset). sym53c875-0: Downloading SCSI SCRIPTS. scsi0 : sym53c8xx-1.7.3c-20010512 blk: queue fffffc000098d2d0, no I/O memory limit Vendor: IBM Model: DCAS-34330W Rev: S65A Type: Direct-Access ANSI SCSI revision: 02 blk: queue fffffc000098d4d0, no I/O memory limit Vendor: IBM Model: DDYS-T36950N Rev: S93E Type: Direct-Access ANSI SCSI revision: 03 blk: queue fffffc000098d6d0, no I/O memory limit sym53c875-0-<0,0>: tagged command queue depth set to 8 sym53c875-0-<2,0>: tagged command queue depth set to 8 Attached scsi disk sda at scsi0, channel 0, id 0, lun 0 Attached scsi disk sdb at scsi0, channel 0, id 2, lun 0 sym53c875-0-<0,*>: FAST-20 SCSI 20.0 MB/s (50.0 ns, offset 15) SCSI device sda: 8467200 512-byte hdwr sectors (4335 MB) sda: sda1 sda2 sda3 sym53c875-0-<2,*>: FAST-20 SCSI 20.0 MB/s (50.0 ns, offset 16) SCSI device sdb: 71687340 512-byte hdwr sectors (36704 MB) sdb: sdb1 md: linear personality registered as nr 1 md: raid0 personality registered as nr 2 md: raid1 personality registered as nr 3 md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27 md: Autodetecting RAID arrays. [events: 00000001] [events: 00000001] md: autorun ... md: considering hdc1 ... md: adding hdc1 ... md: adding hda1 ... md: created md0 md: bind md: bind md: running: md: hdc1's event counter: 00000001 md: hda1's event counter: 00000001 md: md0: raid array is not clean -- starting background reconstruction md: RAID level 1 does not need chunksize! Continuing anyway. 
md0: max total readahead window set to 248k md0: 1 data-disks, max readahead per data-disk: 248k raid1: device hdc1 operational as mirror 0 raid1: device hda1 operational as mirror 1 raid1: raid set md0 not clean; reconstructing mirrors raid1: raid set md0 active with 2 out of 2 mirrors md: updating md0 RAID superblock on device md: hdc1 [events: 00000002]<6>(write) hdc1's sb offset: 160834624 md: syncing RAID array md0 md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc. md: using maximum available idle IO bandwith (but not more than 100000 KB/sec) for reconstruction. md: using 248k window, over a total of 160834624 blocks. md: hda1 [events: 00000002]<6>(write) hda1's sb offset: 160834624 md: ... autorun DONE. NET4: Linux TCP/IP 1.0 for NET4.0 IP Protocols: ICMP, UDP, TCP IP: routing cache hash table of 4096 buckets, 32Kbytes TCP: Hash tables configured (established 32768 bind 16384) NET4: Unix domain sockets 1.0/SMP for Linux NET4.0. VFS: Mounted root (ext2 filesystem) readonly. Freeing unused kernel memory: 368k freed Adding Swap: 265848k swap-space (priority -1) kjournald starting. Commit interval 5 seconds EXT3 FS 2.4-0.9.19, 19 August 2002 on md(9,0), internal journal EXT3-fs: mounted filesystem with ordered data mode. eth0: Setting full-duplex based on MII#0 link partner capability of 01e1. md: md0: sync done. 
Unable to handle kernel paging request at virtual address 0000440003af267c
find(826): Oops 0
pc = []  ra = []  ps = 0000    Not tainted
v0 = 0000000000001068  t0 = fffffffffff7fff0  t1 = 0000000000080010
t2 = 0000440003af269c  t3 = 0000000000000000  t4 = 0000440003af267c
t5 = 0000440003af2674  t6 = 0000000000000002  t7 = fffffc000ec04000
s0 = fffffc000cb419c0  s1 = 0000000000000900  s2 = fffffc0000221000
s3 = fffffc000cb419c0  s4 = fffffc000fe94cc0  s5 = 0000000000000000
s6 = fffffc000098d8a0
a0 = fffffc0000221000  a1 = fffffc000cb419c0  a2 = fffffc000cb419c0
a3 = 0000000000001000  a4 = 0000000000000008  a5 = fffffc0000221018
t8 = 0000000000000000  t9 = 00004800038d1680  t10= 0000000000080010
t11= fffffc0000221020  pv = fffffc00004a8ba0  at = fffffc000022101c
gp = fffffc000062fbe8  sp = fffffc000ec07c18
Trace: fffffc00004ab950 fffffc0000355ba0 fffffc0000392abc fffffc000043fba4
fffffc000043fcac fffffc000044000c fffffc0000355ba0 fffffc0000392b04
fffffc000036b5e0 fffffc000038ee28 fffffc000036ac28 fffffc000036b5e0
fffffc000036489c fffffc00003648f8 fffffc00003634b4 fffffc000036b5e0
fffffc000036b808 fffffc0000310c40
Code: 42fc0403 2ffe0000 42f90405 a0f003d8 40a49525 40c49526 40649523

From kotaiah at iitg.ernet.in Mon Mar 29 19:58:04 2004
From: kotaiah at iitg.ernet.in (Anandanam Rama Kotaiah)
Date: Tue, 30 Mar 2004 01:28:04 +0530 (IST)
Subject: information on block size in ext2
Message-ID: 

Hi all,

I want to know what s_log_blocksize represents in ext2's in-memory
superblock structure.  Is it 1024 for a 1 KB block size on an ext2 file
system?  Also, what do the s_blocksize and s_blocksize_bits members of
the VFS superblock represent?

BTW, are there any ext2-specific mailing lists? 
Bye,

a Linux lover

From Paul.Libert at naturalsciences.be Mon Mar 29 09:56:02 2004
From: Paul.Libert at naturalsciences.be (Paul Libert)
Date: Mon, 29 Mar 2004 11:56:02 +0200
Subject: impossible to create a ext3 filesystem on a LVM2 Logical Volume
Message-ID: <4067F2B2.6090704@naturalsciences.be>

Dear all,

I'm faced with the above-mentioned problem.  The system is an HP ProLiant
DL380 with an HP SmartArray 5304-256 controller and an HP StorageWorks
4300 disk shelf.  The OS is Debian/testing with kernel 2.6.3 and LVM2.

I've created a new LV, and when I try to create the FS on it, it hangs
after printing "Writing superblocks and filesystem accounting
information:".  sar output shows the CPU at 100% wio.  The only way to
recover is a hard reboot (killing the process does nothing).

I've run strace on 'mkfs.ext3'; the last system call is a call to fsync
(a small part of the trace file follows).

Any help appreciated!

Paul

write(1, "done\n", 5done
) = 5
write(1, "Writing superblocks and filesyst"..., 59Writing superblocks and filesystem accounting information: 
) = 59
time(NULL) = 1080538375
lseek(3, 24576, SEEK_SET) = 24576
write(3, "\0\0\0\0\0\0\0\0\6\265g@\6\265g@\6\265g@\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 35725312, SEEK_SET) = 35725312
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 6385664, SEEK_SET) = 6385664
write(3, "\30\6\0\0\31\n\0\0\32\16\0\0\33\22\0\0\34\26\0\0\35\32"..., 4096) = 4096
lseek(3, 35713024, SEEK_SET) = 35713024
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 35717120, SEEK_SET) = 35717120
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 35721216, SEEK_SET) = 35721216
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 35729408, SEEK_SET) = 35729408
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 31580160, SEEK_SET) = 31580160 
write(3, "\37\36\0\0 \36\0\0!\36\0\0\"\36\0\0#\36\0\0$\36\0\0%\36"..., 4096) = 4096
lseek(3, 1024, SEEK_SET) = 1024
........
_llseek(3, 47915732992, [47915732992], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 48049950720, [48049950720], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 48184168448, [48184168448], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 48318386176, [48318386176], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 48452603904, [48452603904], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 48586821632, [48586821632], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 48721039360, [48721039360], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 48855257088, [48855257088], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 49392128000, [49392128000], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 49526345728, [49526345728], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 49660563456, [49660563456], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 49794781184, [49794781184], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 49928998912, [49928998912], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 48989474816, [48989474816], SEEK_SET) = 0
write(3, 
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 49123692544, [49123692544], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
_llseek(3, 49257910272, [49257910272], SEEK_SET) = 0
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
fsync(3

From sct at redhat.com Wed Mar 31 11:12:01 2004
From: sct at redhat.com (Stephen C. Tweedie)
Date: 31 Mar 2004 12:12:01 +0100
Subject: PROBLEM: log abort over RAID5
In-Reply-To: 
References: <1078873638.2460.81.camel@sisko.scot.redhat.com>
Message-ID: <1080731519.1991.3.camel@sisko.scot.redhat.com>

Hi,

On Wed, 2004-03-10 at 00:24, Leandro Guimarães Faria Corsetti Dutra wrote:
> On Tue, 09 Mar 2004 23:07:18 +0000, Stephen C. Tweedie wrote:
>
> > It could be a disk or driver fault; bad memory, overheating CPU,
> > configuration error, anything.
>
> Is there a full check-list around?

A _full_ checklist would include every piece of hardware in your machine,
and every module you've got compiled or loaded into the kernel, plus a
ton of privileged applications such as X.

> If you Google around you will see other people have similar
> patterns.  It is reported that going back to either 2.4 or ext2 solves
> the problem, but not being in production yet I'm still hoping for a
> diagnosis and a fix.

I've been seeing some reports on raid5, yes.  Current kernels look OK in
the main for most people, though occasional problems are still being
discovered: such is 2.6.  Nothing springs to mind that particularly
matches your own symptoms, though.

--Stephen

From sct at redhat.com Wed Mar 31 12:45:28 2004
From: sct at redhat.com (Stephen C. 
Tweedie)
Date: 31 Mar 2004 13:45:28 +0100
Subject: Assertion failure in ext3_put_super() at fs/ext3/super.c:412
In-Reply-To: <405E66F0.6060702@gmx.net>
References: <405E66F0.6060702@gmx.net>
Message-ID: <1080737128.1991.7.camel@sisko.scot.redhat.com>

Hi,

On Mon, 2004-03-22 at 04:09, evilninja wrote:
> today i re-organized my data, lots of "mv" and "cp".
> then, upon unmounting a ext3 partition the following was shown on the
> console:
> ~ Assertion failure in ext3_put_super() at fs/ext3/super.c:412:
> "list_empty(&sbi->s_orphan)"

There's a bug we've been chasing recently regarding a race between
link/rename and unlink.  We know what's going wrong; it's just a matter
of determining what the right fix is (there's a workaround in a couple of
filesystems, but it doesn't seem to be quite the right fix).  It could
well be the cause of your problem, as the symptoms on ext3 normally
manifest as orphan list corruption.

--Stephen

From sct at redhat.com Wed Mar 31 12:46:28 2004
From: sct at redhat.com (Stephen C. Tweedie)
Date: 31 Mar 2004 13:46:28 +0100
Subject: stalled 'sync' on ext3+quota over drbd
In-Reply-To: <1080125239.4717.33.camel@ariel.sovam.com>
References: <1080125239.4717.33.camel@ariel.sovam.com>
Message-ID: <1080737188.1991.9.camel@sisko.scot.redhat.com>

Hi,

On Wed, 2004-03-24 at 10:47, Eugene Crosser wrote:
> Now, the setup mostly works fine.  But if you actively use the
> filesystem for some time (an hour of copying a large tree over NFS) and
> then try the 'sync' command, the latter runs very long (10 minutes or
> more), eating 99% CPU according to top, and the system becomes very
> sluggish (leading to stalled replication, heartbeat misbehavior) and in
> fact unusable.

You'd need to try capturing a profile of the 99% cpu loop for us to be
able to investigate this any further.

Cheers,
 Stephen

From sct at redhat.com Wed Mar 31 12:51:35 2004
From: sct at redhat.com (Stephen C. 
Tweedie)
Date: 31 Mar 2004 13:51:35 +0100
Subject: impossible to create a ext3 filesystem on a LVM2 Logical Volume
In-Reply-To: <4067F2B2.6090704@naturalsciences.be>
References: <4067F2B2.6090704@naturalsciences.be>
Message-ID: <1080737494.1991.14.camel@sisko.scot.redhat.com>

Hi,

On Mon, 2004-03-29 at 10:56, Paul Libert wrote:
> I'm faced with the above-mentioned problem.

Well, I know for a fact it's not impossible: I've got a box here running
the Fedora Core 2 test release, and it's using lvm2 for all of its
filesystems except for /boot.  I wonder what's different about your setup?

> I've created a new LV and when I try to create the FS on it, it fails
> when saying "Writing superblocks and filesystem accounting information:"
> sar output shows that CPU is in 100% wio.

If it's in wio, chances are the fs is waiting for an lvm2 IO to complete.
That could be stuck in the device-mapper layer, or in the underlying
device driver.  We'd need to see a stack trace of the stuck process to
investigate.  (alt-sysrq-t should give you that.)

> I've performed a strace of the 'mkfs.ext3' and the last system call is
> a call to fsync (see a small part of the trace file hereafter)

Yep, that's consistent with the device being stuck.  In any case, if
"mkfs" is hanging, it's not an ext3 fault, because the filesystem isn't
running by that stage.

Cheers,
 Stephen

From sct at redhat.com Wed Mar 31 12:56:04 2004
From: sct at redhat.com (Stephen C. Tweedie)
Date: 31 Mar 2004 13:56:04 +0100
Subject: information on block size in ext2
In-Reply-To: 
References: 
Message-ID: <1080737764.1991.18.camel@sisko.scot.redhat.com>

Hi,

On Mon, 2004-03-29 at 20:58, Anandanam Rama Kotaiah wrote:
> I want to know what s_log_blocksize represents in ext2's in-memory
> superblock structure.  Is it 1024 for a 1 KB block size on an ext2 file
> system?  Also, what do the s_blocksize and s_blocksize_bits members of
> the VFS superblock represent?

Checking the source code is always the best way to find the answers to
such questions. 
In particular, see how s_log_blocksize is used in ext2_fill_super(); for the other two, check sb_set_blocksize(). > BTW are there any ext2 file system specific mailing lists? ext2-devel at lists.sourceforge.net although it's a developer list, not a general ext2 user list. Cheers, Stephen From crosser at rol.ru Wed Mar 31 13:05:46 2004 From: crosser at rol.ru (Eugene Crosser) Date: Wed, 31 Mar 2004 17:05:46 +0400 Subject: stalled 'sync' on ext3+quota over drbd In-Reply-To: <1080737188.1991.9.camel@sisko.scot.redhat.com> References: <1080125239.4717.33.camel@ariel.sovam.com> <1080737188.1991.9.camel@sisko.scot.redhat.com> Message-ID: <1080738345.22942.53.camel@ariel.sovam.com> On Wed, 2004-03-31 at 16:46, Stephen C. Tweedie wrote: > > Now, the setup mostly works fine. But if you actively use the > > filesystem for some time (hour of copying a large tree over NFS), then > > try the 'sync' command, the latter runs very long (10 minutes or more), > > eating 99% CPU according to top, and the system becomes very sluggish > > (leading to stalled replication, heartbeat misbehavior) and in fact > > unusable. > > You'd need to try capturing a profile of the 99% cpu loop for us to be > able to investigate this any further. That'd be tricky: it is somewhere in the kernel (top shows 99% CPU used by "system", and strace attached to sync does not show anything). Another thing, possibly related: when I try `quotaoff', the machine hangs for 10+ minutes, and does not respond to *anything* but ping. Then it gets alive again. I'd be happy to provide more information but so far I cannot decide where to look... Should I learn to use "kernel profiling"? Eugene From sct at redhat.com Wed Mar 31 13:49:34 2004 From: sct at redhat.com (Stephen C.
Tweedie) Date: 31 Mar 2004 14:49:34 +0100 Subject: stalled 'sync' on ext3+quota over drbd In-Reply-To: <1080738345.22942.53.camel@ariel.sovam.com> References: <1080125239.4717.33.camel@ariel.sovam.com> <1080737188.1991.9.camel@sisko.scot.redhat.com> <1080738345.22942.53.camel@ariel.sovam.com> Message-ID: <1080740974.1991.28.camel@sisko.scot.redhat.com> Hi, On Wed, 2004-03-31 at 14:05, Eugene Crosser wrote: > That'd be tricky: it is somewhere in the kernel (top shows 99% CPU used > by "system", and strace attached to sync does not show anything). You'd need a kernel profile, not a user-space one, if it's all in system time. > I'd be happy to provide more information but so far I cannot decide > where to look... Should I learn to use "kernel profiling"? Sounds like it. You've got two choices --- the simple "readprofile" (boot with profile=2), or set up an oprofile kernel. For complex user/kernel interactions oprofile can be really helpful, but for something that's simply stuck in the kernel, readprofile is fine. If the kernel is in a really tight loop, though, then simply using the "altgr-scrlck" combination a few times is usually enough to find out where it's looping. That keystroke just dumps a stack trace of whatever CPU caught the keyboard interrupt, and if you're spinning in kernel space, that backtrace will show whatever kernel code is running in the foreground at the time. Cheers, Stephen From adilger at clusterfs.com Wed Mar 31 16:28:14 2004 From: adilger at clusterfs.com (Andreas Dilger) Date: Wed, 31 Mar 2004 09:28:14 -0700 Subject: tests to see how ext3 reiserfs 3.6 and jfs survive disk errors.
In-Reply-To: <1080734403.16245.159.camel@tribesman.namesys.com> References: <32823.193.110.218.57.1080733440.squirrel@mail.vamo.bg> <1080734403.16245.159.camel@tribesman.namesys.com> Message-ID: <20040331162814.GP1177@schnapps.adilger.int> On Mar 31, 2004 16:00 +0400, Vladimir Saveliev wrote: > On Wed, 2004-03-31 at 15:44, Ivan Ivanov wrote: > > I made some tests to see how ext3 reiserfs 3.6 and jfs survive disk > > errors. > > > > The test is simple: > > format a partition, copy the kernel source, unmount and do "dd > > if=/dev/zero of=/dev/hdd bs=512 count=100000 seek=30000" to simulate > > disk surface damage and then run fsck. > > > > seek=30000 - this must be the second half of the journal in reiserfs and > > ext3, for jfs I don't know > > > > Well, not that I defend reiserfs's i/o error handling > > But I do not think that your test is a fair one. You overwrote the area > where reiserfs stored metadata for the data you copied into it. (not sure > about jfs, it probably has the same problem). Do you want to try to > overwrite ext3's inode tables? Actually, with a 51MB write it is guaranteed to overwrite at least one inode table somewhere in the filesystem (one inode table per 32MB of disk). One of the reasons that ext2/ext3 can survive such actions is that the metadata lives at fixed locations, so even if everything is overwritten it knows which blocks are inode tables, which are data blocks, etc. This makes ext3 less flexible (i.e. no dynamic inode allocation) but also more robust. > > jfs: > > ---- > > total data loss, can't mount, fsck didn't help > > > > reiserfs: > > --------- > > doing "reiserfsck --rebuild-tree" moves all recovered data into lost+found, > > but the information is almost unusable > > > > ext3: > > ----- > > after "fsck.ext3 -f -y" almost everything was usable, the directory > > structure was untouched, some files were moved into lost+found, but in > > general > > everything was usable.
> > > > My opinion: > > I can't use anything but ext2/3 in a system where there is no RAID - 99% > > of desktops and most of web and mail servers. If you have time, you may want to try overwriting some other parts of the filesystem, just to see if the results change. I don't think it will make a huge difference in the end, but it might. Note that 51MB is a large fraction of the size of a Linux kernel source tree, so you might end up overwriting 1/4 of all the data. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ From ivandi at vamo.bg Wed Mar 31 11:40:39 2004 From: ivandi at vamo.bg (Ivan Ivanov) Date: Wed, 31 Mar 2004 14:40:39 +0300 (EEST) Subject: (no subject) Message-ID: <32815.193.110.218.57.1080733239.squirrel@mail.vamo.bg> I made some tests to see how ext3 reiserfs 3.6 and jfs survive disk errors. The test is simple: format a partition, copy the kernel source, unmount and do "dd if=/dev/zero of=/dev/hdd bs=512 count=100000 seek=30000" to simulate disk surface damage and then run fsck. seek=30000 - this must be the second half of the journal in reiserfs and ext3, for jfs I don't know Result: jfs: ---- total data loss, can't mount, fsck didn't help reiserfs: --------- doing "reiserfsck --rebuild-tree" moves all recovered data into lost+found, but the information is almost unusable ext3: ----- after "fsck.ext3 -f -y" almost everything was usable, the directory structure was untouched, some files were moved into lost+found, but in general everything was usable. My opinion: I can't use anything but ext2/3 in a system where there is no RAID - 99% of desktops and most of web and mail servers. Cheers Ivan From ivandi at vamo.bg Wed Mar 31 11:44:00 2004 From: ivandi at vamo.bg (Ivan Ivanov) Date: Wed, 31 Mar 2004 14:44:00 +0300 (EEST) Subject: tests to see how ext3 reiserfs 3.6 and jfs survive disk errors.
Message-ID: <32823.193.110.218.57.1080733440.squirrel@mail.vamo.bg> I made some tests to see how ext3 reiserfs 3.6 and jfs survive disk errors. The test is simple: format a partition, copy the kernel source, unmount and do "dd if=/dev/zero of=/dev/hdd bs=512 count=100000 seek=30000" to simulate disk surface damage and then run fsck. seek=30000 - this must be the second half of the journal in reiserfs and ext3, for jfs I don't know Result: jfs: ---- total data loss, can't mount, fsck didn't help reiserfs: --------- doing "reiserfsck --rebuild-tree" moves all recovered data into lost+found, but the information is almost unusable ext3: ----- after "fsck.ext3 -f -y" almost everything was usable, the directory structure was untouched, some files were moved into lost+found, but in general everything was usable. My opinion: I can't use anything but ext2/3 in a system where there is no RAID - 99% of desktops and most of web and mail servers. Cheers Ivan From vs at namesys.com Wed Mar 31 12:00:03 2004 From: vs at namesys.com (Vladimir Saveliev) Date: Wed, 31 Mar 2004 16:00:03 +0400 Subject: tests to see how ext3 reiserfs 3.6 and jfs survive disk errors. In-Reply-To: <32823.193.110.218.57.1080733440.squirrel@mail.vamo.bg> References: <32823.193.110.218.57.1080733440.squirrel@mail.vamo.bg> Message-ID: <1080734403.16245.159.camel@tribesman.namesys.com> Hello On Wed, 2004-03-31 at 15:44, Ivan Ivanov wrote: > I made some tests to see how ext3 reiserfs 3.6 and jfs survive disk > errors. > > The test is simple: > format a partition, copy the kernel source, unmount and do "dd > if=/dev/zero of=/dev/hdd bs=512 count=100000 seek=30000" to simulate > disk surface damage and then run fsck. > > seek=30000 - this must be the second half of the journal in reiserfs and > ext3, for jfs I don't know > Well, not that I defend reiserfs's i/o error handling But I do not think that your test is a fair one. You overwrote the area where reiserfs stored metadata for the data you copied into it.
(not sure about jfs, it probably has the same problem). Do you want to try to overwrite ext3's inode tables? > Result: > > jfs: > ---- > total data loss, can't mount, fsck didn't help > > reiserfs: > --------- > doing "reiserfsck --rebuild-tree" moves all recovered data into lost+found, > but the information is almost unusable > > ext3: > ----- > after "fsck.ext3 -f -y" almost everything was usable, the directory > structure was untouched, some files were moved into lost+found, but in > general > everything was usable. > > My opinion: > I can't use anything but ext2/3 in a system where there is no RAID - 99% > of desktops and most of web and mail servers. > > Cheers > Ivan From ivandi at vamo.bg Wed Mar 31 15:14:44 2004 From: ivandi at vamo.bg (Ivan Ivanov) Date: Wed, 31 Mar 2004 18:14:44 +0300 (EEST) Subject: tests to see how ext3 reiserfs 3.6 and jfs survive disk errors. In-Reply-To: <1080734403.16245.159.camel@tribesman.namesys.com> References: <32823.193.110.218.57.1080733440.squirrel@mail.vamo.bg> <1080734403.16245.159.camel@tribesman.namesys.com> Message-ID: <34060.193.110.218.57.1080746084.squirrel@mail.vamo.bg> Hi, > > But I do not think that your test is a fair one. > Of course the test is very simple. I am trying to simulate physical damage to the disk surface. It is simply a scratch leading to several damaged tracks. > > You overwrote the area > where reiserfs stored metadata for the data you copied into it. (not sure > about jfs, it probably has the same problem). Do you want to try to > overwrite ext3's inode tables? > That is the problem. The ext2/3 inode tables are spread across all block groups over the entire disk. And damaging several contiguous tracks doesn't result in such a big data loss. Cheers Ivan
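[Editor's note: the 51MB figure Andreas quotes, and the number of block groups a scratch like this would touch, follow directly from the dd parameters. A minimal sketch of the arithmetic - the 32MB-per-inode-table spacing is Andreas's approximation, the real group size is 8 * block_size**2 bytes and depends on mkfs options:]

```python
SECTOR = 512  # dd was run with bs=512

def damage_span(seek, count, sector=SECTOR):
    """Byte range overwritten by: dd if=/dev/zero bs=512 seek=SEEK count=COUNT."""
    start = seek * sector
    return start, start + count * sector

def groups_hit(start, end, group_bytes=32 * 1024**2):
    """Indices of the ~32MB block-group-sized regions touched by [start, end).

    32MB per inode table is the approximation used in the thread; an actual
    ext2/ext3 group spans 8 * block_size**2 bytes (e.g. 128MB at 4KB blocks).
    """
    return list(range(start // group_bytes, (end - 1) // group_bytes + 1))

start, end = damage_span(seek=30000, count=100000)
print(end - start)                 # 51200000 bytes - the ~51MB write Andreas mentions
print(groups_hit(start, end))      # at least one inode table is guaranteed to be hit
```

Whatever spacing is assumed, a contiguous ~51MB overwrite always crosses at least one group boundary, which is why the fixed-location inode tables both guarantee some metadata loss and bound how much of it a localized scratch can destroy.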