From Sinha_Himanshu at emc.com  Thu Jul  6 18:02:16 2006
From: Sinha_Himanshu at emc.com (Sinha_Himanshu at emc.com)
Date: Thu, 6 Jul 2006 14:02:16 -0400 
Subject: Limited write bandwidth from ext3
Message-ID: <7E76AE153FDC3240BA7E82E23972F9FE01B6037D@CORPUSMX30B.corp.emc.com>


We tried the extents+mballoc+delalloc patches suggested by Andreas and found
that they significantly improved our benchmark: write bandwidth increased
from 144 MBps to 214 MBps. We are at about 85% of the bandwidth that one can
get writing to an ext2 file, which in turn is about 82% of the bandwidth one
can get writing to the block device. We are analyzing our traces to determine
the cause of these differences. So far, we see that during writes to the ext3
file, LUN writes periodically wait for 5 reads, while during writes to the
ext2 file, LUN writes periodically wait for only one read.

Workload: Single threaded 512 KB writes to a new file.

                                RedHat 4 U1             2.6.16.8 kernel
                                (2.6.9 based kernel)
Block Device                    308 MBps                306 MBps
Ext2 file                       267                     255
Ext3                            138                     144
Ext3 with patches               N/A                     216
Ext3 with patches,                                      215
  journal on separate LUN
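
As a quick arithmetic check on the ratios discussed above, the sketch below
recomputes them from the 2.6.16.8 column of the table. The variable and
function names are only illustrative; the figures are the ones reported in
this message.

```python
# Bandwidth figures (MBps) taken from the 2.6.16.8 column of the table above.
block_device = 306
ext2_file = 255
ext3_stock = 144
ext3_patched = 216


def ratio_percent(numerator, denominator):
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


print("patched ext3 vs ext2:  ", ratio_percent(ext3_patched, ext2_file), "%")
print("ext2 vs block device:  ", ratio_percent(ext2_file, block_device), "%")
print("patched vs stock ext3: ", ratio_percent(ext3_patched, ext3_stock), "%")
```

The same helper can be pointed at the RedHat 4 U1 column to compare the two
kernels.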

Himanshu


-----Original Message-----
From: Andreas Dilger [mailto:adilger at clusterfs.com] 
Sent: Wednesday, June 21, 2006 4:54 PM
To: Sinha, Himanshu
Cc: ext3-users at redhat.com
Subject: Re: Limited write bandwidth from ext3

On Jun 19, 2006  14:18 -0400, Sinha_Himanshu at emc.com wrote:
> We measured the write bandwidth for writes to the block device
> corresponding to the lun (e.g. /dev/sdb), a file in an ext2 filesystem
> and to a file in an ext3 file system.
> 		Write b/w for 512 KB writes
> Block device	312 MBps
> Ext2 file		247 MBps
> Ext3 file		130 MBps
> 
> We are looking for ways to improve the ext3 file write bandwidth.

Have a look at the extents+mballoc+delalloc patches from Alex Tomas:

ftp://ftp.lustre.org/pub/people/alex/2.6.16.8/

Mount the filesystem with "-o extents,mballoc,delalloc" to enable this.

They noticeably improve IO performance while also reducing the CPU load
for ext3.  The extent patches are approved by all of the ext3 developers
and will be supported upstream fairly soon (in the kernel and e2fsprogs),
and mballoc+delalloc will follow on afterward.

NOTE: the extents on-disk format is incompatible with older kernels, so
      at this stage consider it "for benchmarking only".

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From herta.vandeneynde at cc.kuleuven.be  Mon Jul 10 10:29:50 2006
From: herta.vandeneynde at cc.kuleuven.be (Herta Van den Eynde)
Date: Mon, 10 Jul 2006 12:29:50 +0200
Subject: chattr +T not implemented?
Message-ID: <44B22C1E.6050908@cc.kuleuven.be>

We run a third party application that creates an inordinate amount of 
subdirectories in a single directory.  To speed up I/O, I wanted to set 
the T attribute on the directory that will hold the subdirectories.  The 
"chattr +T /usr/local/lepus-bb/a-0607" command returns status 0, but 
when I verify the setting, the attribute isn't there:

   # lsattr -d /usr/local/lepus-bb/a-0607
   ------------- /usr/local/lepus-bb/a-0607

Is this attribute implemented?  The manual pages entry for chattr 
suggests it is, but when I check the chattr usage, "T" isn't listed:

   #chattr -v
   usage: chattr [-RV] [-+=AacDdijsSu] [-v version] files...

FWIW
   # cat /proc/version
   Linux version 2.4.21-40.ELsmp (bhcompile at hs20-bc1-7.build.redhat.com)
   (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-54)) #1 SMP Thu Feb 2
   22:22:39 EST 2006

Kind regards,

Herta

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm


From tytso at mit.edu  Mon Jul 10 18:08:00 2006
From: tytso at mit.edu (Theodore Tso)
Date: Mon, 10 Jul 2006 14:08:00 -0400
Subject: chattr +T not implemented?
In-Reply-To: <44B22C1E.6050908@cc.kuleuven.be>
References: <44B22C1E.6050908@cc.kuleuven.be>
Message-ID: <20060710180800.GB16137@thunk.org>

On Mon, Jul 10, 2006 at 12:29:50PM +0200, Herta Van den Eynde wrote:
> Is this attribute implemented?  The manual pages entry for chattr 
> suggests it is, but when I check the chattr usage, "T" isn't listed:
> 
>   #chattr -v
>   usage: chattr [-RV] [-+=AacDdijsSu] [-v version] files...
> 
> FWIIW
>   # cat /proc/version
>   Linux version 2.4.21-40.ELsmp (bhcompile at hs20-bc1-7.build.redhat.com)
>   (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-54)) #1 SMP Thu Feb 2
>   22:22:39 EST 2006

To quote from the man page:

       A  directory  with  the 'T' attribute will be deemed to be the top of
       directory hierarchies for the purposes of  the  Orlov  block  allocator
       (which is used in on systems with Linux 2.5.46 or later).

You're using Linux version 2.4.21....

							- Ted


From adilger at clusterfs.com  Mon Jul 10 18:37:02 2006
From: adilger at clusterfs.com (Andreas Dilger)
Date: Mon, 10 Jul 2006 12:37:02 -0600
Subject: chattr +T not implemented?
In-Reply-To: <44B22C1E.6050908@cc.kuleuven.be>
References: <44B22C1E.6050908@cc.kuleuven.be>
Message-ID: <20060710183702.GF15380@schatzie.adilger.int>

On Jul 10, 2006  12:29 +0200, Herta Van den Eynde wrote:
> We run a third party application that creates an inordinate amount of 
> subdirectories in a single directory.  To speed up I/O, I wanted to set 
> the T attribute on the directory that will hold the subdirectories.  The 
> "chattr +T /usr/local/lepus-bb/a-0607" command returns status 0, but 
> when I verify the setting, the attribute isn't there:
> 
>   # lsattr -d /usr/local/lepus-bb/a-0607
>   ------------- /usr/local/lepus-bb/a-0607
> 
> Is this attribute implemented?  The manual pages entry for chattr 
> suggests it is, but when I check the chattr usage, "T" isn't listed:
> 
>   #chattr -v
>   usage: chattr [-RV] [-+=AacDdijsSu] [-v version] files...

man chattr(1) reports:
	A  directory  with  the  'T'  attribute will be deemed to be the top of
	directory hierarchies for the purposes of the  Orlov  block allocator
	(which is used in on systems with Linux 2.5.46 or later).

You can also check with "debugfs -c -R 'stat lepus-bb/a-0607' /dev/XXXX"
(assuming /usr/local/ is the mountpoint).  It may be that the kernel is
not allowing the T attribute in the EXT3_FL_USER_VISIBLE mask, though it
does show correctly in my kernel.

#define EXT3_TOPDIR_FL                  0x00020000 /* Top of directory hierarchies*/
#define EXT3_FL_USER_VISIBLE            0x0003DFFF /* User visible flags */
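
To make the mask check concrete, here is a minimal sketch using the two
values from the #define lines above. The helper name and the second, reduced
mask are hypothetical and only there for contrast; they are not taken from
any kernel source.

```python
# Flag values copied from the two #define lines above.
EXT3_TOPDIR_FL = 0x00020000        # top of directory hierarchies ('T')
EXT3_FL_USER_VISIBLE = 0x0003DFFF  # flags reported back to user space

# Hypothetical mask with the TOPDIR bit cleared, for contrast only.
MASK_WITHOUT_TOPDIR = EXT3_FL_USER_VISIBLE & ~EXT3_TOPDIR_FL


def is_user_visible(flag, visible_mask):
    """True if every bit of `flag` is present in `visible_mask`."""
    return (flag & visible_mask) == flag


print(is_user_visible(EXT3_TOPDIR_FL, EXT3_FL_USER_VISIBLE))
print(is_user_visible(EXT3_TOPDIR_FL, MASK_WITHOUT_TOPDIR))
```

With the mask shown above the 'T' bit is inside the visible set; a kernel
whose mask lacks that bit would accept the chattr call but not show the flag
in lsattr.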


Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From herta.vandeneynde at cc.kuleuven.be  Mon Jul 10 22:11:54 2006
From: herta.vandeneynde at cc.kuleuven.be (Herta Van den Eynde)
Date: Tue, 11 Jul 2006 00:11:54 +0200
Subject: chattr +T not implemented?
In-Reply-To: <20060710180800.GB16137@thunk.org>
References: <44B22C1E.6050908@cc.kuleuven.be>
	<20060710180800.GB16137@thunk.org>
Message-ID: <44B2D0AA.7000809@cc.kuleuven.be>

Theodore Tso wrote:
> On Mon, Jul 10, 2006 at 12:29:50PM +0200, Herta Van den Eynde wrote:
> 
>>Is this attribute implemented?  The manual pages entry for chattr 
>>suggests it is, but when I check the chattr usage, "T" isn't listed:
>>
>>  #chattr -v
>>  usage: chattr [-RV] [-+=AacDdijsSu] [-v version] files...
>>
>>FWIIW
>>  # cat /proc/version
>>  Linux version 2.4.21-40.ELsmp (bhcompile at hs20-bc1-7.build.redhat.com)
>>  (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-54)) #1 SMP Thu Feb 2
>>  22:22:39 EST 2006
> 
> 
> To quote from the man page:
> 
>        A  directory  with  the 'T' attribute will be deemed to be the top of
>        directory hierarchies for the purposes of  the  Orlov  block  allocator
>        (which is used in on systems with Linux 2.5.46 or later).
> 
> You're using Linux version 2.4.21....
> 
Ouch.  Missed that.  Thanks for pointing it out, Theodore.

Kind regards,

Herta

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm


From zeremski.boris at nsinfo.co.yu  Thu Jul 13 07:15:47 2006
From: zeremski.boris at nsinfo.co.yu (Zeremski Boris)
Date: Thu, 13 Jul 2006 09:15:47 +0200
Subject: detail explain of file creation process
Message-ID: <200607130800.k6D80n6u024206@mx1.redhat.com>

Hi,

 
Could someone point me to documentation on, or explain in detail, the
process of creating a file (space reservation, inode, ...)?

What happens at the low level?

 
Thanks


From evilninja at gmx.net  Thu Jul 13 17:24:34 2006
From: evilninja at gmx.net (christian)
Date: Thu, 13 Jul 2006 18:24:34 +0100 (BST)
Subject: detail explain of file creation process
In-Reply-To: <200607130800.k6D80n6u024206@mx1.redhat.com>
References: <200607130800.k6D80n6u024206@mx1.redhat.com>
Message-ID: <Pine.NEB.4.64.0607131823290.2444@vaio.testbed.de>

On Thu, 13 Jul 2006, Zeremski Boris wrote:
> Could someone point me to documentation or explain
> in detail, process of creating file.(space reservation, inode....)
> What is happen at low lavel?

does this: http://e2fsprogs.sourceforge.net/ext2intro.html suffice?

-- 
BOFH excuse #332:

suboptimal routing experience


From zeremski.boris at nsinfo.co.yu  Fri Jul 14 05:09:57 2006
From: zeremski.boris at nsinfo.co.yu (Zeremski Boris)
Date: Fri, 14 Jul 2006 07:09:57 +0200
Subject: detail explain of file creation process
In-Reply-To: <Pine.NEB.4.64.0607131823290.2444@vaio.testbed.de>
Message-ID: <200607140510.k6E5AGnC008664@mx1.redhat.com>


Hi, this link is great; it explains the basic concepts of the ext2/3 file
system (inodes, directories, soft/hard links...).

What I am interested in is a more detailed account of the file creation
process: what happens from the moment one runs, for example,
'touch test.file' until that file really exists on the file system.

Where can I find this kind of information?

> 
> > Could someone point me to documentation or explain
> > in detail, process of creating file.(space reservation, inode....)
> > What is happen at low lavel?
> 
> does this: http://e2fsprogs.sourceforge.net/ext2intro.html suffice?
> 


From adilger at clusterfs.com  Fri Jul 14 05:20:57 2006
From: adilger at clusterfs.com (Andreas Dilger)
Date: Thu, 13 Jul 2006 23:20:57 -0600
Subject: detail explain of file creation process
In-Reply-To: <200607140510.k6E5AGnC008664@mx1.redhat.com>
References: <Pine.NEB.4.64.0607131823290.2444@vaio.testbed.de>
	<200607140510.k6E5AGnC008664@mx1.redhat.com>
Message-ID: <20060714052057.GL15380@schatzie.adilger.int>

On Jul 14, 2006  07:09 +0200, Zeremski Boris wrote:
> Hi, this link is great, explain basic concept of ext2/3 file system (inode,
> directory, soft/hard links...).
> 
> What I am interested in, is more detail process of creating file. What is
> going on when, for example, make 'touch test.file' till that file really
> start existing on file system. 
> 
> Where can I find his kind of information?

If you run UML with GDB, you can set a breakpoint at "sys_open" and follow
it around from there.  Also of interest are ext3_lookup, ext3_create. 
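
For something to trigger that breakpoint with, here is a minimal sketch of
the user-space side: creating a new empty file the way 'touch' does for a
missing name comes down to an open() call with O_CREAT. The function name
and the temporary directory are only illustrative.

```python
import os
import tempfile


def create_empty_file(path):
    """Create `path` if it is missing and return the stat of the open file."""
    # O_CREAT is what makes the kernel call the filesystem's create method
    # (ext3_create on ext3) when the name lookup finds no existing entry.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        return os.fstat(fd)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    st = create_empty_file(os.path.join(tmp, "test.file"))
    print("inode:", st.st_ino, "links:", st.st_nlink, "size:", st.st_size)
```

Running this under strace shows the open call that the breakpoint would
catch on the kernel side.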

If you don't find any documentation, you might consider writing a wiki
page for this as you figure it out.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From ling at aliko.com  Thu Jul 13 20:13:07 2006
From: ling at aliko.com (Ling C. Ho)
Date: Thu, 13 Jul 2006 15:13:07 -0500
Subject: Ext3 overhead vs Raw
Message-ID: <44B6A953.7010209@aliko.com>

Hi,

I am trying to find a way to speed up read access on an ext3 filesystem.
I did some tests using dd with different block sizes, with and without
direct I/O, etc. The test file is about 1Gig in size, and spread across 25
fragments (found using filefrag). Block size is 4k. I have also tried
setting the readahead buffer using blockdev, from 256 to 32767.

time /root/dd conv=nocreat  ibs=4096 obs=4096 if=/sam/cache/test/test3 
of=/dev/null
The best real elapsed time I get is about 23.5s.

If I dd the same amount of data from the disk device itself, I get about 
18.5s, which matches what hdparm -tT gives me.

Comparing strace outputs, I can see that the read system calls on the
ext3 file take 30-35% longer to complete compared to the raw device. Is this
expected, or can I expect better performance?
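
For reference, the two elapsed times above can be turned into throughput
figures with a few lines of arithmetic. This is only a sketch: it assumes
the "about 1Gig" test file is exactly 1 GiB, and the names are illustrative.

```python
FILE_SIZE_MIB = 1024   # assumption: "about 1Gig" taken as exactly 1 GiB
EXT3_SECONDS = 23.5    # best elapsed time reading the file through ext3
RAW_SECONDS = 18.5     # elapsed time reading the same amount from the device


def throughput_mib_s(size_mib, seconds):
    """Average throughput in MiB/s for a transfer of size_mib in seconds."""
    return size_mib / seconds


ext3_rate = throughput_mib_s(FILE_SIZE_MIB, EXT3_SECONDS)
raw_rate = throughput_mib_s(FILE_SIZE_MIB, RAW_SECONDS)
extra_time_percent = 100.0 * (EXT3_SECONDS - RAW_SECONDS) / RAW_SECONDS

print(round(ext3_rate, 1), "MiB/s via ext3")
print(round(raw_rate, 1), "MiB/s via the raw device")
print(round(extra_time_percent, 1), "% more wall-clock time via ext3")
```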

I am running kernel.org kernel 2.6.12 .

Thanks,

...
ling


From Martin at lichtvoll.de  Fri Jul 14 15:47:22 2006
From: Martin at lichtvoll.de (Martin Steigerwald)
Date: Fri, 14 Jul 2006 17:47:22 +0200
Subject: Write barrier support in ext3
Message-ID: <200607141747.22884.Martin@lichtvoll.de>

Hello ext3 users and developers,

I am gathering information for an article about journal filesystems with 
emphasis on write barrier functionality, how it works, why journalling 
filesystems need write barrier and the current implementation of write 
barrier support for different filesystems. 

Background of this is my own experience of three XFS crashes in one week:
http://bugzilla.kernel.org/show_bug.cgi?id=6380

With 2.6.17.1 XFS seems to work stably with write caches after applying a 
(write cache unrelated) fix:
http://bugzilla.kernel.org/show_bug.cgi?id=6757

But I would like to provide information on ext3, jfs, reiserfs 3 and
reiser 4 as well.

I would like to ask you:

1) Since which kernel release are write barriers officially supported and 
stable in ext3? Is it 2.6.16? I found the barrier option in 
filesystem/ext3.txt in my 2.6.17 kernel.

2) Since which kernel release are write barriers enabled by default in 
ext3 if any ?

3) Are there any performance measurements for ext3? It is expected that 
write barriers will be slower than no write barriers but faster than 
disabled write caches. 

4) Have there been any issues regarding write barrier support in ext3 that 
are worth mentioning in the article?

If you have any links to relevant information, please share them 
with me. Nonetheless I will continue grepping kernel changelogs and the 
internet until I have the information I want for that article.

Please CC to me personally as I am not subscribed to the list...

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


From sct at redhat.com  Fri Jul 14 21:47:51 2006
From: sct at redhat.com (Stephen C. Tweedie)
Date: Fri, 14 Jul 2006 22:47:51 +0100
Subject: Ext3 overhead vs Raw
In-Reply-To: <44B6A953.7010209@aliko.com>
References: <44B6A953.7010209@aliko.com>
Message-ID: <1152913671.13275.38.camel@sisko.sctweedie.blueyonder.co.uk>

Hi,

On Thu, 2006-07-13 at 15:13 -0500, Ling C. Ho wrote:

> If I dd the same amount of data from the disk device itself, I get about 
> 18.5s, which matches what hdparm -tT gives me.

Be aware, disks typically have different performance depending on where
the data is, with data on the outermost cylinders getting higher
throughput than data on innermost cylinders (there's constant rotational
velocity for the surface, but the outer tracks are longer so each
rotation carries more data past the heads.)

So all sorts of things like the exact data placement can come into
effect.  Are you sure you're using the same bits of the disk for the raw
and filesystem cases?

--Stephen


From ling at aliko.com  Fri Jul 14 22:51:10 2006
From: ling at aliko.com (Ling C. Ho)
Date: Fri, 14 Jul 2006 17:51:10 -0500
Subject: Ext3 overhead vs Raw
In-Reply-To: <1152913671.13275.38.camel@sisko.sctweedie.blueyonder.co.uk>
References: <44B6A953.7010209@aliko.com>
	<1152913671.13275.38.camel@sisko.sctweedie.blueyonder.co.uk>
Message-ID: <44B81FDE.9020906@aliko.com>

Hi Stephen,

That's a great point. I recreated the filesystem again, using default 
options, and then created a directory.
Here is some info:
1864 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group

# mount /dev/hdb /sam/cache
# ls -ldi /sam/cache/test
11485185 drwxr-xr-x  2 root root 4096 Jul 14 17:39 /sam/cache/test

The directory inode is in the ~701st block group if I am not mistaken, which 
is nowhere near the beginning of the filesystem.
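
The block group can be worked out from the inode number and the figures
listed above. The sketch below does that; the 4 KiB block size is an
assumption carried over from the earlier test, and the function names are
only illustrative.

```python
INODES_PER_GROUP = 16384   # from the filesystem info listed above
BLOCKS_PER_GROUP = 32768   # from the filesystem info listed above
TOTAL_GROUPS = 1864        # from the filesystem info listed above
BLOCK_SIZE = 4096          # assumption: 4 KiB blocks, as in the earlier test


def inode_to_group(ino):
    """ext2/ext3 inode numbers start at 1, so group = (ino - 1) // per-group."""
    return (ino - 1) // INODES_PER_GROUP


def group_start_gib(group):
    """Approximate offset of the group's first block from the start, in GiB."""
    return group * BLOCKS_PER_GROUP * BLOCK_SIZE / 2**30


group = inode_to_group(11485185)
print("block group:", group)
print("offset into filesystem (GiB):", group_start_gib(group))
print("position (% of groups):", round(100.0 * group / TOTAL_GROUPS, 1))
```

Under these assumptions the directory inode is the first inode of group 701
(counting from 0), a bit more than a third of the way into the device.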

This looks to me like it has changed since the kernel 2.4 days. But is it 
still true that any files created under the directory will have their 
data written into free space in the same block group as the 
directory, and onwards?

So, how does it work now? Is a directory randomly placed now, even on an 
empty file system?
Is there any way to force it to be created near the beginning of the file 
system, and thus toward the outermost cylinders?
The application I am working with only uses one directory on a file 
system, so I don't really care where it is placed. But for performance 
testing, like the comparison against raw access, it would be nice to test 
against a file written at the beginning of the filesystem.

Thanks,
...
ling


Stephen C. Tweedie wrote:

>Hi,
>
>On Thu, 2006-07-13 at 15:13 -0500, Ling C. Ho wrote:
>
>  
>
>>If I dd the same amount of data from the disk device itself, I get about 
>>18.5s, which matches what hdparm -tT gives me.
>>    
>>
>
>Be aware, disks typically have different performance depending on where
>the data is, with data on the outermost cylinders getting higher
>throughput than data on innermost cylinders (there's constant rotational
>velocity for the surface, but the outer tracks are longer so each
>rotation carries more data past the heads.)
>
>So all sorts of things like the exact data placement can come into
>effect.  Are you sure you're using the same bits of the disk for the raw
>and filesystem cases?
>
>--Stephen
>
>
>  
>


From zeremski.boris at nsinfo.co.yu  Mon Jul 17 05:13:43 2006
From: zeremski.boris at nsinfo.co.yu (Zeremski Boris)
Date: Mon, 17 Jul 2006 07:13:43 +0200
Subject: detail explain of file creation process
In-Reply-To: <20060714052057.GL15380@schatzie.adilger.int>
Message-ID: <200607170514.k6H5E4FQ031473@mx1.redhat.com>


Thanks,

I will try to spend some time on this problem.
If you have any suggestions, please feel free to tell me.
Any help is welcome.

Bye

> -----Original Message-----
> From: Andreas Dilger [mailto:adilger at clusterfs.com]
> Sent: Friday, July 14, 2006 7:21 AM
> To: Zeremski Boris
> Cc: 'christian'; Ext3-users at redhat.com
> Subject: Re: detail explain of file creation process
> 
> On Jul 14, 2006  07:09 +0200, Zeremski Boris wrote:
> > Hi, this link is great, explain basic concept of ext2/3 file system (inode,
> > directory, soft/hard links...).
> >
> > What I am interested in, is more detail process of creating file. What is
> > going on when, for example, make 'touch test.file' till that file really
> > start existing on file system.
> >
> > Where can I find his kind of information?
> 
> If you run UML with GDB, you can set a breakpoint at "sys_open" and follow
> it around from there.  Also of interest are ext3_lookup, ext3_create.
> 
> If you don't find any documentation, you might consider writing a wiki
> page for this as you figure it out.
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.


From mfaine at knology.net  Wed Jul 19 12:10:56 2006
From: mfaine at knology.net (Mark F)
Date: Wed, 19 Jul 2006 07:10:56 -0500
Subject: create very large file system
Message-ID: <e9l7bh$134$1@sea.gmane.org>

Suse Linux Enterprise Server 9 SP3

I've tried to create a large 5TB file system using both reiserfs and ext3 and both have failed.

I end up with only a 1.5TB file system.  Does anyone know why this doesn't work, what to do to fix it?

Others have suggested that  only XFS or JFS will work.  Is this so?

Thanks,
-Mark


From ulf at autotradecenter.com  Thu Jul 20 00:00:19 2006
From: ulf at autotradecenter.com (Ulf Zimmermann)
Date: Wed, 19 Jul 2006 17:00:19 -0700
Subject: Problems under Redhat EL3 and ext3
Message-ID: <5DE4B7D3E79067418154C49A739C1251D81503@msmpk01.corp.autc.com>

I am running into performance issues with ext3. Historically we had our
image files (pictures of cars, currently 5.3 million) sub divided into a
directory structure [0-9]/[0-9]/[0-9]/[0-9], where we would take the
first 4 letters/numbers of the file name and use that to put it into
this structure. Letters [a-cA-C] would become a 0, [d-fD-F] a 1, etc. As
the file names used to be based on VIN numbers of vehicles, that wasn't
a problem. But then our developers changed the image file names using a
vehicle ID from the database. And as we rolled over 1,000,000 in vehicle
ids we would get large numbers of files into directories. And files do
not get well distributed.

So we changed the method to use [0-9a-f]/[0-9a-f]/[0-9a-f], taking the md5
of the file name and using its first 3 letters/numbers to file it away. On
initial testing this worked well: the distribution across the directories
was nice, so we could split this across separate file systems or disks.
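
The md5-based layout described above can be sketched roughly as follows.
This is an illustration only: it assumes the whole file name is hashed as
UTF-8 text, and the function name and sample file name are hypothetical.

```python
import hashlib


def bucket_path(filename):
    """Map a file name to h0/h1/h2/filename from the first 3 hex digits of its MD5."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return "/".join([digest[0], digest[1], digest[2], filename])


# Three hex levels give 16**3 leaf directories; with 5.3 million files an
# even spread works out to roughly 1,294 files per directory.
leaf_directories = 16 ** 3
average_per_directory = round(5_300_000 / leaf_directories)

print(bucket_path("hello"))
print(leaf_directories, average_per_directory)
```

Because the hash output is close to uniform, sequential vehicle IDs no
longer pile up in a handful of directories the way the prefix-based scheme
did.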

When we actually got to do this, a decision was made to use hard links
from the old structure to the new structure for backward compatibility. And
this turned into a disaster. Rsync or find on the new structure takes
dramatically longer: about 5 minutes for a find on the old structure
versus hours on the new structure. Using strace I tracked it down to
lstat64. On the old structure lstat64 takes on average 37 usecs/call,
while on the new structure it is over 2,400 usecs/call.
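
A comparable per-call figure can also be collected without strace. The
sketch below walks a tree, calls lstat() on every entry and reports the
average time per call; the function name is illustrative, and the numbers
include interpreter overhead, so they are only useful for comparing one
tree against another on the same machine.

```python
import os
import time


def average_lstat_usec(root):
    """Walk `root`, lstat() every entry, return (calls, average usec per call)."""
    calls = 0
    elapsed = 0.0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            start = time.perf_counter()
            os.lstat(path)  # same metadata lookup that find and rsync perform
            elapsed += time.perf_counter() - start
            calls += 1
    average = (elapsed / calls) * 1e6 if calls else 0.0
    return calls, average
```

Calling average_lstat_usec() once on the old tree and once on the new tree,
each time with a cold cache, should reproduce the gap if it comes from the
inode lookups themselves.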

EL4 does not seem to have this problem; unfortunately I can't just
upgrade, for other reasons. So does anyone have ideas why lstat64 would be
so much slower on the new structure? Any help, hints, or suggestions would
be great.


Regards, Ulf.

---------------------------------------------------------------------
Autotradecenter.com Inc, T: 650-532-6382, F: 650-532-6441
4600 Bohannon Drive, Suite 100, Menlo Park, CA 94025
---------------------------------------------------------------------


From adilger at clusterfs.com  Thu Jul 20 06:26:46 2006
From: adilger at clusterfs.com (Andreas Dilger)
Date: Thu, 20 Jul 2006 02:26:46 -0400
Subject: create very large file system
In-Reply-To: <200607191657.38644.zam@namesys.com>
References: <e9l7bh$134$1@sea.gmane.org> <200607191657.38644.zam@namesys.com>
Message-ID: <20060720062646.GA6174@schatzie.adilger.int>

On Jul 19, 2006  16:57 +0400, Alexander Zarochentsev wrote:
> On Wednesday 19 July 2006 16:10, Mark F wrote:
> > I've tried to create a large 5TB file system using both reiserfs and
> > ext3 and both have failed.
> 
> you might need to convert the partition table to GPT format for 
> supporting 2TB+ partitions.  it can be done by the gnu parted tool.

Or, for that matter, don't use a partition table at all, since this
adds an unhelpful offset to all the filesystem structures and can
hurt performance on RAID where the filesystem is trying to align IO
to RAID stripe boundaries.
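
To put numbers on both points, here is a small arithmetic sketch. It assumes
512-byte sectors, the traditional DOS-style start of the first partition at
sector 63, and a 64 KiB RAID stripe; the stripe size in particular is purely
illustrative and should be replaced with the real array geometry.

```python
SECTOR_SIZE = 512                 # assumption: 512-byte sectors
MBR_MAX_SECTORS = 2 ** 32         # MBR partition entries use 32-bit sector counts
DOS_FIRST_PARTITION_SECTOR = 63   # traditional DOS-style first partition start
STRIPE_SIZE = 64 * 1024           # assumption: 64 KiB stripe, illustrative only


def misalignment(offset_bytes, stripe_bytes):
    """Bytes by which an offset overshoots the previous stripe boundary (0 = aligned)."""
    return offset_bytes % stripe_bytes


mbr_limit_tib = MBR_MAX_SECTORS * SECTOR_SIZE // 2 ** 40
partition_offset = DOS_FIRST_PARTITION_SECTOR * SECTOR_SIZE

print("MBR partition size limit (TiB):", mbr_limit_tib)
print("misalignment with partition table:", misalignment(partition_offset, STRIPE_SIZE))
print("misalignment on the whole device: ", misalignment(0, STRIPE_SIZE))
```

With these assumptions every filesystem structure on the partition sits
32256 bytes past a stripe boundary, whereas a filesystem created directly on
the device starts aligned.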

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From adilger at clusterfs.com  Thu Jul 20 07:17:44 2006
From: adilger at clusterfs.com (Andreas Dilger)
Date: Thu, 20 Jul 2006 03:17:44 -0400
Subject: Problems under Redhat EL3 and ext3
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251D81503@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251D81503@msmpk01.corp.autc.com>
Message-ID: <20060720071744.GE6174@schatzie.adilger.int>

On Jul 19, 2006  17:00 -0700, Ulf Zimmermann wrote:
> I am running into performance issues with ext3. Historically we had our
> image files (pictures of cars, currently 5.3 million) sub divided into a
> directory structure [0-9]/[0-9]/[0-9]/[0-9], where we would take the
> first 4 letters/numbers of the file name and use that to put it into
> this structure. Letters [a-cA-C] would become a 0, [d-fD-F] a 1, etc. As
> the file names used to be based on VIN numbers of vehicles, that wasn't
> a problem. But then our developers changed the image file names using a
> vehicle ID from the database. And as we rolled over 1,000,000 in vehicle
> ids we would get large numbers of files into directories. And files do
> not get well distributed.
> 
> So we changed the method using [0-9a-f]/[0-9a-f]/[0-9a-f] and md5 on the
> file name, using then the first 3 letters/numbers to file it away. On
> initial testing this worked well, distribution nice across the
> directories, so we could split this on separate file systems or disks.
> 
> When we actually got to do this, a decision was made to use hard links
> from the old structure to the new structure for backward capability. And
> this turned into a disaster. Rsync or find on the new structure takes
> dramatic longer, talking about 5 minutes for a find on the old structure
> and hours on the new structure. Using strace I tracked it down to
> lstat64. On the old structure lstat64 takes on average 37 usecs/call
> while on the new structure it is over 2,400 usecs/call.
> 
> EL4 does not seem to have this problem, unfortunately I can't just
> upgrade, out of other reasons. So anyone have ideas why lstat64 would be
> so much slower on the new structure? Any help, hints, suggestions would
> be great.

Do you have directories with more than, say, 10-15,000 entries?
Do you have dir_index (directory indexing) feature enabled on your
filesystem?  This is done with "tune2fs -O dir_index" (even while
mounted) but only affects new directories.  I believe the RHEL3 code
has this functionality, but it isn't enabled by default like I
suspect it is on FC4.

Once you have enabled this, then an OFFLINE run of "e2fsck -fD {dev}"
will rebuild the directory indexes for existing directories.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From ulf at autotradecenter.com  Thu Jul 20 07:24:41 2006
From: ulf at autotradecenter.com (Ulf Zimmermann)
Date: Thu, 20 Jul 2006 00:24:41 -0700
Subject: Problems under Redhat EL3 and ext3
Message-ID: <5DE4B7D3E79067418154C49A739C1251D8150F@msmpk01.corp.autc.com>

> -----Original Message-----
> From: Andreas Dilger [mailto:adilger at clusterfs.com]
> Sent: 07/20/2006 12:18 AM
> To: Ulf Zimmermann
> Cc: ext3-users at redhat.com
> Subject: Re: Problems under Redhat EL3 and ext3
> 
> On Jul 19, 2006  17:00 -0700, Ulf Zimmermann wrote:
> > I am running into performance issues with ext3. Historically we had our
> > image files (pictures of cars, currently 5.3 million) sub divided into a
> > directory structure [0-9]/[0-9]/[0-9]/[0-9], where we would take the
> > first 4 letters/numbers of the file name and use that to put it into
> > this structure. Letters [a-cA-C] would become a 0, [d-fD-F] a 1, etc. As
> > the file names used to be based on VIN numbers of vehicles, that wasn't
> > a problem. But then our developers changed the image file names using a
> > vehicle ID from the database. And as we rolled over 1,000,000 in vehicle
> > ids we would get large numbers of files into directories. And files do
> > not get well distributed.
> >
> > So we changed the method using [0-9a-f]/[0-9a-f]/[0-9a-f] and md5 on the
> > file name, using then the first 3 letters/numbers to file it away. On
> > initial testing this worked well, distribution nice across the
> > directories, so we could split this on separate file systems or disks.
> >
> > When we actually got to do this, a decision was made to use hard links
> > from the old structure to the new structure for backward capability. And
> > this turned into a disaster. Rsync or find on the new structure takes
> > dramatic longer, talking about 5 minutes for a find on the old structure
> > and hours on the new structure. Using strace I tracked it down to
> > lstat64. On the old structure lstat64 takes on average 37 usecs/call
> > while on the new structure it is over 2,400 usecs/call.
> >
> > EL4 does not seem to have this problem, unfortunately I can't just
> > upgrade, out of other reasons. So anyone have ideas why lstat64 would be
> > so much slower on the new structure? Any help, hints, suggestions would
> > be great.
> 
> Do you have directories with more than, say, 10-15,000 entries?
> Do you have dir_index (directory indexing) feature enabled on your
> filesystem?  This is done with "tune2fs -O dir_index" (even while
> mounted) but only affects new directories.  I believe the RHEL3 code
> has this functionality, but it isn't enabled by default like I
> suspect it is on FC4.

The filesystem was created under EL3. I am currently copying everything
in the new structure into a new directory and it seems to be fast. My
plan at this point is to rename the hard linked new structure at the
end, and use that copy. I did run on one of the nodes e2fsck -D but that
did not help.

Hmmm, I just ran "tune2fs -O dir_index" on one node, tune2fs -l does
show dir_index enabled now. But I am not sure if that will help, as
getdents64 wasn't showing much difference in a strace -c, lstat64 on the
other hand did.

> 
> Once you have enabled this, then an OFFLINE run of "e2fsck -fD {dev}"
> will rebuild the directory indexes for existing directories.
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.


From tytso at mit.edu  Thu Jul 20 18:25:29 2006
From: tytso at mit.edu (Theodore Tso)
Date: Thu, 20 Jul 2006 14:25:29 -0400
Subject: Problems under Redhat EL3 and ext3
In-Reply-To: <5DE4B7D3E79067418154C49A739C1251D8150F@msmpk01.corp.autc.com>
References: <5DE4B7D3E79067418154C49A739C1251D8150F@msmpk01.corp.autc.com>
Message-ID: <20060720182529.GB6634@thunk.org>

On Thu, Jul 20, 2006 at 12:24:41AM -0700, Ulf Zimmermann wrote:
> The filesystem was created under EL3. I am currently copying everything
> in the new structure into a new directory and it seems to be fast. My
> plan at this point is to rename the hard linked new structure at the
> end, and use that copy. I did run on one of the nodes e2fsck -D but that
> did not help.

e2fsck -D, or e2fsck -fD?  You need the -f option in order to force
e2fsck to scan the whole filesystem and optimize all directories.

						- Ted


From ulf at autotradecenter.com  Thu Jul 20 19:07:23 2006
From: ulf at autotradecenter.com (Ulf Zimmermann)
Date: Thu, 20 Jul 2006 12:07:23 -0700
Subject: Problems under Redhat EL3 and ext3
Message-ID: <5DE4B7D3E79067418154C49A739C1251D81515@msmpk01.corp.autc.com>

> -----Original Message-----
> From: Theodore Tso [mailto:tytso at mit.edu]
> Sent: 07/20/2006 11:25 AM
> To: Ulf Zimmermann
> Cc: Andreas Dilger; ext3-users at redhat.com
> Subject: Re: Problems under Redhat EL3 and ext3
> 
> On Thu, Jul 20, 2006 at 12:24:41AM -0700, Ulf Zimmermann wrote:
> > The filesystem was created under EL3. I am currently copying everything
> > in the new structure into a new directory and it seems to be fast. My
> > plan at this point is to rename the hard linked new structure at the
> > end, and use that copy. I did run on one of the nodes e2fsck -D but that
> > did not help.
> 
> e2fsck -D, or e2fsck -fD?  You need the -f option in order to force
> e2fsck to scan the whole filesystem and optimize all filesystems.
> 
> 						- Ted

On the one node I did, it was -D, which did do a forced check, not
because I specified -f, but because the file system hadn't been checked
in > 192 days.

Ulf.


From ulf at autotradecenter.com  Thu Jul 20 19:10:25 2006
From: ulf at autotradecenter.com (Ulf Zimmermann)
Date: Thu, 20 Jul 2006 12:10:25 -0700
Subject: Problems under Redhat EL3 and ext3
Message-ID: <5DE4B7D3E79067418154C49A739C1251D81516@msmpk01.corp.autc.com>

> -----Original Message-----
> From: ext3-users-bounces at redhat.com
[mailto:ext3-users-bounces at redhat.com]
> On Behalf Of Ulf Zimmermann
> Sent: 07/20/2006 12:07 PM
> To: Theodore Tso
> Cc: Andreas Dilger; ext3-users at redhat.com
> Subject: RE: Problems under Redhat EL3 and ext3
> 
> > -----Original Message-----
> > From: Theodore Tso [mailto:tytso at mit.edu]
> > Sent: 07/20/2006 11:25 AM
> > To: Ulf Zimmermann
> > Cc: Andreas Dilger; ext3-users at redhat.com
> > Subject: Re: Problems under Redhat EL3 and ext3
> >
> > On Thu, Jul 20, 2006 at 12:24:41AM -0700, Ulf Zimmermann wrote:
> > > The filesystem was created under EL3. I am currently copying everything
> > > in the new structure into a new directory and it seems to be fast. My
> > > plan at this point is to rename the hard linked new structure at the
> > > end, and use that copy. I did run on one of the nodes e2fsck -D but that
> > > did not help.
> >
> > e2fsck -D, or e2fsck -fD?  You need the -f option in order to force
> > e2fsck to scan the whole filesystem and optimize all directories.
> >
> > 						- Ted
> 
> On the one node I did, I ran it with -D only. It did do a forced check,
> not because I specified -f but because the file system hadn't been
> checked in more than 192 days.
> 
> Ulf.

One other thing I hadn't answered before: each directory has on average
1,293 files, with a deviation of less than 100 in each direction. In the
old structure some directories had over 50,000 files and it didn't seem
to slow things down. Dir_index was not enabled on the systems, so I
enabled it on one node; I am waiting for something to finish before I
can unmount it and run e2fsck -fD on it.

Ulf.


From adilger at clusterfs.com  Thu Jul 20 15:02:09 2006
From: adilger at clusterfs.com (Andreas Dilger)
Date: Thu, 20 Jul 2006 11:02:09 -0400
Subject: create very large file system
In-Reply-To: <200607201317.54566.chrivers@iversen-net.dk>
References: <e9l7bh$134$1@sea.gmane.org> <200607191657.38644.zam@namesys.com>
	<20060720062646.GA6174@schatzie.adilger.int>
	<200607201317.54566.chrivers@iversen-net.dk>
Message-ID: <20060720150209.GA5299@schatzie.adilger.int>

On Jul 20, 2006  13:17 +0200, Christian Iversen wrote:
> On Thursday 20 July 2006 08:26, Andreas Dilger wrote:
> > On Jul 19, 2006  16:57 +0400, Alexander Zarochentsev wrote:
> > > On Wednesday 19 July 2006 16:10, Mark F wrote:
> > > > I've tried to create a large 5TB file system using both reiserfs and
> > > > ext3 and both have failed.
> > >
> > > you might need to convert the partition table to GPT format for
> > > supporting 2TB+ partitions.  it can be done by the gnu parted tool.
> >
> > Or, for that matter, don't use a partition table at all, since this
> > adds an unhelpful offset to all the filesystem structures and can
> > hurt performance on RAID where the filesystem is trying to align IO
> > to RAID stripe boundaries.
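
To put a number on that offset, here is a minimal sketch. It assumes the
traditional DOS layout, where the first partition starts at sector 63 with
512-byte sectors, and a hypothetical 64 KiB stripe size; neither figure
comes from this thread.

```python
# Sketch: how far a partition start lands from a RAID stripe boundary.
SECTOR_BYTES = 512
PARTITION_START_SECTOR = 63      # assumption: traditional DOS partitioning default
STRIPE_BYTES = 64 * 1024         # assumption: hypothetical RAID stripe size


def misalignment(offset_bytes, stripe_bytes):
    """Bytes by which an offset overshoots the previous stripe boundary."""
    return offset_bytes % stripe_bytes


partition_offset = PARTITION_START_SECTOR * SECTOR_BYTES

# With a partition table, every filesystem structure is shifted by this much.
print("partitioned device:", misalignment(partition_offset, STRIPE_BYTES))
# With the filesystem directly on the device, the shift is zero.
print("whole device:      ", misalignment(0, STRIPE_BYTES))
```

A nonzero result means that IO the filesystem believes is stripe-aligned
actually straddles two stripes on the array.
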
> 
> Can linux still auto-detect raid volumes if there's no partition table?

Hmm, that I'm not sure of - we mostly deal with external RAID devices.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From zam at namesys.com  Wed Jul 19 12:57:38 2006
From: zam at namesys.com (Alexander Zarochentsev)
Date: Wed, 19 Jul 2006 16:57:38 +0400
Subject: create very large file system
In-Reply-To: <e9l7bh$134$1@sea.gmane.org>
References: <e9l7bh$134$1@sea.gmane.org>
Message-ID: <200607191657.38644.zam@namesys.com>

On Wednesday 19 July 2006 16:10, Mark F wrote:
> Suse Linux Enterprise Server 9 SP3
>
> I've tried to create a large 5TB file system using both reiserfs and
> ext3 and both have failed.

how did they fail?

>
> I end up with only a 1.5TB file system.  Does anyone know why this
> doesn't work, what to do to fix it?

you have a single 5TB device? h/w raid, I think ?

you might need to convert the partition table to GPT format for 
supporting 2TB+ partitions.  it can be done by the gnu parted tool.
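
a rough sketch of where the 2TB figure comes from, assuming the classic
MS-DOS partition table with 32-bit sector counts and 512-byte sectors
(the sector size is my assumption, it is not stated in this thread):

```python
# Sketch: the size ceiling of one MS-DOS (MBR) partition entry.
SECTOR_BYTES = 512               # assumption: 512-byte sectors
MAX_SECTORS = 2**32 - 1          # largest value a 32-bit sector count can hold

mbr_limit_bytes = MAX_SECTORS * SECTOR_BYTES
device_bytes = 5 * 2**40         # hypothetical 5 TiB device

fits_in_mbr = device_bytes <= mbr_limit_bytes

print(f"MBR partition limit: {mbr_limit_bytes / 2**40:.2f} TiB")
print(f"5 TiB device fits in one MBR partition: {fits_in_mbr}")
```

GPT stores 64-bit sector addresses, so the same calculation leaves plenty
of room for a 5TB device.
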

> Others have suggested that  only XFS or JFS will work.  Is this so?
>
> Thanks,
> -Mark


-- 
Alex.


From mcguire at lzu.edu.cn  Thu Jul 20 00:30:12 2006
From: mcguire at lzu.edu.cn (mcguire at lzu.edu.cn)
Date: Thu, 20 Jul 2006 08:30:12 +0800
Subject: [RTLWS8-CFP] Eighth Real-Time Linux Workshop 2nd CFP
Message-ID: <200607200030.k6K0UCLq021220@opentech.lzu.edu.cn>


We apologize for multiple receipts.


--------------------------------------------------------------------------------


                      Eighth Real-Time Linux Workshop

                            October 12-15, 2006
                         Lanzhou University - SISE
                          Tianshui South Road 222
                           Lanzhou, Gansu 730000
                                 P.R.China


  General

   Following the meetings of developers and users at the previous 7
   successful real-time Linux workshops held in Vienna, Orlando, Milano,
   Boston, Valencia, Singapore, and Lille, the Real-Time Linux Workshop
   for 2006 will come back to Asia again, to be held at the School for
   Information Science and Engineering, Lanzhou University, in Lanzhou,
   China.

   Embedded and real-time Linux is rapidly gaining traction in the Asia
   Pacific region. Embedded systems in both automation/control and
   entertainment are moving to 32/64-bit systems, opening the door for
   the use of a full-featured OS like GNU/Linux on COTS-based systems.
   With real-time capabilities being a common demand for embedded
   systems, the soft and hard real-time variants are an important
   extension to the versatile GNU/Linux GPOS.

   Authors  are  invited  to  submit  original  work dealing with general
   topics  related  to  real-time  Linux  research,  experiments and case
   studies,  as  well  as issues of integration of real-time and embedded
   Linux.  A  special focus will be on industrial case studies. Topics of
   interest include, but are not limited to:

     * Modifications and variants of the GNU/Linux operating system
       extending its real-time capabilities,
     * Contributions to real-time Linux variants, drivers and extensions,
     * User-mode real-time concepts, implementation and experience,
     * Real-time Linux applications, in academia, research and industry,
     * Work in progress reports, covering recent developments,
     * Educational material on real-time Linux,
     * Tools for embedding Linux or real-time Linux and embedded
       real-time Linux applications,
     * RTOS core concepts, RT-safe synchronization mechanisms,
     * RT-safe interaction of RT and non RT components,
     * IPC mechanisms in RTOS,
     * Analysis and Benchmarking methods and results of 
       real-time GNU/Linux variants,
     * Debugging techniques and tools, both for code and temporal
       debugging of core RTOS components, drivers and real-time
       applications,
     * Real-time related extensions to development environments.
  
  Further information:
 
  EN: http://www.realtimelinuxfoundation.org/events/rtlws-2006/ws.html 
  CN: http://dslab.lzu.edu.cn/rtlws8/index.html

  Awarded papers

  The Programme Committee will award a best paper in the category
  Real-Time Systems Theory. This best paper will be invited for
  publication in the Real-Time Systems Journal, RTSJ.

  The Programme Committee will award a best paper in the category
  Real-Time Systems Application. This best paper will be invited for
  publication in Dr. Dobb's Journal. Moreover, the publication of the
  other papers in a special issue of Dr. Dobb's Journal is under
  discussion.

  Abstract submission

  In order to register an abstract, please go to:
  http://www.realtimelinuxfoundation.org/rtlf/register-abstract.html

  Venue

  Lanzhou University Information Building, School of Information Science
  and Engineering, Lanzhou University, http://www.lzu.edu.cn/.

  Registration

  In order to participate in the workshop, please register on the
  registration page at:
  http://www.realtimelinuxfoundation.org/rtlf/register-participant.html

  Accommodation

  Please refer to the Lanzhou hotel page for accommodation at
  http://dslab.lzu.edu.cn/rtlws8/hotels/hotels.htm

  Travel information

  For travel information and directions on how to get to Lanzhou from an
  international airport in China please refer to:
  http://www.realtimelinuxfoundation.org/events/rtlws-2006/

  Important dates

  August    28:  Abstract submission
  September 15:  Notification of acceptance
  September 29:  Final paper

  Panel Participants:

     o Roberto Bucher - Scuola Universitaria Professionale della Svizzera
       Italiana, Switzerland, RTAI/ADEOS/RTAI-Lab.

     o Alfons Crespo Lorente - University of Valencia, Spain, Departament
       d'Informàtica de Sistemes i Computadors, XtratuM.

     o Hermann Haertig - Technical University Dresden, Germany, Institute
       for System Architecture, L4/Fiasco/L4Linux.

     o Nicholas Mc Guire - Lanzhou University, P.R. China, Distributed and
       Embedded Systems Lab, RTLinux/GPL.

     o Douglas Niehaus - University of Kansas, USA, Information and
       Telecommunication Technology Center, RT-preempt.

  Organization committee:

     * Prof. Li LIAN (Co-Chair), (SISE, Lanzhou University, CHINA)
     * Xiaoping ZHANG, LZU, CHINA
     * Jiming WANG, PKU, CHINA
     * Zhibing LI, ECNU, China
     * Prof.  Nicholas  MCGUIRE  (Co-Chair),  Real  Time Linux Foundation
       (RTLF)
     * Dr. Peter WURMSDOBLER, Real Time Linux Foundation (RTLF)
     * Dr.  Qingguo  ZHOU, (Distributed and Embedded Systems Lab, Lanzhou
       University, CHINA)

  Program committee:

     * Prof. Li Xing (Co-Chair), (Tsinghua University, CHINA)
     * Dr.  Zhang  Yunquan,  (Institute  of  Software, Chinese Academy of
       Science, CHINA)
     * Dr. Chen Yu, (Tsinghua University, CHINA)
     * Dr. Chen Maoke, (Tsinghua University, CHINA)
     * Dr. Yu Guanghui, (Dalian University of Technology, CHINA)
     * Prof.   Dr.   Paolo   Mantegazza,   (Dipartimento   di  Ingegneria
       Aerospaziale, ITALY)
     * Prof. Dr. Bernhard Zagar, (Johannes Kepler Universität Linz,
       AUSTRIA)
     * Prof. Dr. Hermann Härtig, (Technische Universität Dresden,
       Fakultät Informatik, GERMANY)
     * Prof. Tei-Wei Kuo, (National Taiwan University, Department of
       Computer Science and Information Engineering, TAIWAN)
     * Anthony Skjellum, (Mississippi State University, USA)
     * Ing. Pavel Pisa, (Czech Technical University, CZECH REPUBLIC)
     * Prof. Alfons Crespo, (Universidad Politécnica de Valencia, SPAIN)
     * Dr. Qingguo Zhou, (Lanzhou University, CHINA)
     * PhD. Jaesoon Choi, (National Cancer Center, KOREA)
     * Prof. Douglas Niehaus, (Kansas University, USA)
     * Dr. Michael Hohmuth, (Technische Universität Dresden, GERMANY)
     * Prof.  Thambipillai Srikanthan, (Nanyang Technological University,
       SINGAPORE)
     * Zhengting He, (University of Texas, USA)
     * Martin Terbuc, (University of Maribor, SLOVENIA)
     * Yoshinori Sato, (the H8/300 project, JAPAN)
     * Yuqing Lan, (China Standard Software Co., Ltd., CHINA)
     * Dr. Peter Wurmsdobler, (Real Time Linux Foundation, USA)
     * Prof. Nicholas Mc Guire (Co-Chair), (Lanzhou University, CHINA)

  Workshop organizers:

     * School  for  Information  Science and Engineering (SISE) , Lanzhou
       University , CHINA
     * IBM China, Xi'an Branch , China
     * Haag Embedded Systems, Austria


Peter Wurmsdobler <peter at wurmsdobler.org>
Nicholas Mc Guire <mcguire at lzu.edu.cn>
Zhou Qingguo <zhouqg at lzu.edu.cn>


From chrivers at iversen-net.dk  Thu Jul 20 11:17:54 2006
From: chrivers at iversen-net.dk (Christian Iversen)
Date: Thu, 20 Jul 2006 13:17:54 +0200
Subject: create very large file system
In-Reply-To: <20060720062646.GA6174@schatzie.adilger.int>
References: <e9l7bh$134$1@sea.gmane.org> <200607191657.38644.zam@namesys.com>
	<20060720062646.GA6174@schatzie.adilger.int>
Message-ID: <200607201317.54566.chrivers@iversen-net.dk>

On Thursday 20 July 2006 08:26, Andreas Dilger wrote:
> On Jul 19, 2006  16:57 +0400, Alexander Zarochentsev wrote:
> > On Wednesday 19 July 2006 16:10, Mark F wrote:
> > > I've tried to create a large 5TB file system using both reiserfs and
> > > ext3 and both have failed.
> >
> > you might need to convert the partition table to GPT format for
> > supporting 2TB+ partitions.  it can be done by the gnu parted tool.
>
> Or, for that matter, don't use a partition table at all, since this
> adds an unhelpful offset to all the filesystem structures and can
> hurt performance on RAID where the filesystem is trying to align IO
> to RAID stripe boundaries.

Can linux still auto-detect raid volumes if there's no partition table?

-- 
Regards,
Christian Iversen


From jbriggs at esoft.com  Thu Jul 20 16:22:55 2006
From: jbriggs at esoft.com (Jonathan Briggs)
Date: Thu, 20 Jul 2006 10:22:55 -0600
Subject: create very large file system
In-Reply-To: <200607201317.54566.chrivers@iversen-net.dk>
References: <e9l7bh$134$1@sea.gmane.org> <200607191657.38644.zam@namesys.com>
	<20060720062646.GA6174@schatzie.adilger.int>
	<200607201317.54566.chrivers@iversen-net.dk>
Message-ID: <1153412575.9802.7.camel@localhost>

On Thu, 2006-07-20 at 13:17 +0200, Christian Iversen wrote:
> On Thursday 20 July 2006 08:26, Andreas Dilger wrote:
> > On Jul 19, 2006  16:57 +0400, Alexander Zarochentsev wrote:
> > > On Wednesday 19 July 2006 16:10, Mark F wrote:
> > > > I've tried to create a large 5TB file system using both reiserfs and
> > > > ext3 and both have failed.
> > >
> > > you might need to convert the partition table to GPT format for
> > > supporting 2TB+ partitions.  it can be done by the gnu parted tool.
> >
> > Or, for that matter, don't use a partition table at all, since this
> > adds an unhelpful offset to all the filesystem structures and can
> > hurt performance on RAID where the filesystem is trying to align IO
> > to RAID stripe boundaries.
> 
> Can linux still auto-detect raid volumes if there's no partition table?

You're not supposed to be doing it that way these days.  RAID autodetect
is getting tossed out of the kernel in the future (probably still many
versions away though), and RAID, DM, LVM, and maybe even regular
partition setup is going to be done in initramfs / initrd.

At least, that is what I read.
-- 
Jonathan Briggs <jbriggs at esoft.com>
eSoft, Inc.

From avuton at gmail.com  Thu Jul 20 16:56:03 2006
From: avuton at gmail.com (Avuton Olrich)
Date: Thu, 20 Jul 2006 09:56:03 -0700
Subject: create very large file system
In-Reply-To: <1153412575.9802.7.camel@localhost>
References: <e9l7bh$134$1@sea.gmane.org> <200607191657.38644.zam@namesys.com>
	<20060720062646.GA6174@schatzie.adilger.int>
	<200607201317.54566.chrivers@iversen-net.dk>
	<1153412575.9802.7.camel@localhost>
Message-ID: <3aa654a40607200956j37aeed0o66218c7ff94d815a@mail.gmail.com>

On 7/20/06, Jonathan Briggs <jbriggs at esoft.com> wrote:
> You're not supposed to be doing it that way these days.  RAID autodetect
> is getting tossed out of the kernel in the future (probably still many

Bit OT, but is there something that is supposed to replace RAID
autodetect, or we're just supposed to make initscripts to run mdadm?
-- 
avuton
--
 Anyone who quotes me in their sig is an idiot. -- Rusty Russell.


From sct at redhat.com  Fri Jul 21 15:37:22 2006
From: sct at redhat.com (Stephen Tweedie)
Date: Fri, 21 Jul 2006 11:37:22 -0400
Subject: create very large file system
In-Reply-To: <e9l7bh$134$1@sea.gmane.org>
References: <e9l7bh$134$1@sea.gmane.org>
Message-ID: <20060721153722.GA20270@devserv.devel.redhat.com>

Hi,

On Wed, Jul 19, 2006 at 07:10:56AM -0500, Mark F wrote:
> Suse Linux Enterprise Server 9 SP3
> 
> I've tried to create a large 5TB file system using both reiserfs and ext3 
> and both have failed.
> 
> I end up with only a 1.5TB file system.  Does anyone know why this doesn't 
> work, what to do to fix it?

I fixed a bug in mke2fs that had this result over a year ago, so a
recent e2fsprogs should fix it.

Failing that, there's a workaround: use "mke2fs -b 4096" to prevent
mke2fs from trying to work out the device size in units of 1k blocks.
Counting in 4k blocks prevents a 32-bit overflow.
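
As a rough illustration of the overflow, here is a minimal sketch. It
assumes the block count is simply truncated to an unsigned 32-bit value
and uses a hypothetical device of exactly 5 TiB; the precise behaviour of
the old mke2fs and the exact device size may well differ.

```python
# Sketch: block counts for a large device, as seen through a 32-bit integer.
def block_count_32bit(device_bytes, block_bytes):
    """Block count after truncation to an unsigned 32-bit integer (assumed)."""
    return (device_bytes // block_bytes) % 2**32


device_bytes = 5 * 2**40          # hypothetical 5 TiB device

for block_bytes in (1024, 4096):
    true_blocks = device_bytes // block_bytes
    seen_blocks = block_count_32bit(device_bytes, block_bytes)
    seen_tib = seen_blocks * block_bytes / 2**40
    print(f"{block_bytes}-byte blocks: true count {true_blocks}, "
          f"32-bit count {seen_blocks}, apparent size {seen_tib:.2f} TiB")
```

With 1k blocks the true count no longer fits in 32 bits, so the apparent
size comes out far smaller than the device; with 4k blocks the count fits
and the full size is seen.
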

--Stephen


From mfaine at knology.net  Fri Jul 21 15:58:44 2006
From: mfaine at knology.net (Mark F)
Date: Fri, 21 Jul 2006 10:58:44 -0500
Subject: create very large file system
In-Reply-To: <20060721153722.GA20270@devserv.devel.redhat.com>
References: <e9l7bh$134$1@sea.gmane.org>
	<20060721153722.GA20270@devserv.devel.redhat.com>
Message-ID: <e9qtel$hcf$1@sea.gmane.org>

Stephen Tweedie wrote:
> Hi,
> 
> On Wed, Jul 19, 2006 at 07:10:56AM -0500, Mark F wrote:
>> Suse Linux Enterprise Server 9 SP3
>>
>> I've tried to create a large 5TB file system using both reiserfs and ext3 
>> and both have failed.
>>
>> I end up with only a 1.5TB file system.  Does anyone know why this doesn't 
>> work, what to do to fix it?
> 
> I fixed a bug in mke2fs that had this result over a year ago, so a
> recent e2fsprogs should fix it.
> 
> Failing that, there's a workaround: use "mke2fs -b 4096" to prevent
> mke2fs from trying to work out the device size in units of 1k blocks.
> Counting in 4k blocks prevents a 32-bit overflow.
> 
> --Stephen

Thanks,

I finally got it formatted using the GPT label with parted.  I formatted it
as reiserfs; it takes a few seconds to mount but seems to work fine and
shows up at full size.

-Mark


From bandurin at fnal.gov  Wed Jul 26 00:27:11 2006
From: bandurin at fnal.gov (Dmitry Bandurin)
Date: Tue, 25 Jul 2006 19:27:11 -0500
Subject: data recovering in EXT3
Message-ID: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>

Hello,

By chance we ran, and then stopped, the command "fsck -y" on one of our RAID
disks (with an ext3 file system). After that we found that SOME files had
disappeared (they are no longer seen in the directories where they were before).
The data are extremely important and contain a lot of programs and
scripts for data analysis, and are very hard to recover by hand.
I have run "fsck -y" once more and it recovered just a few files.
Is there any way, any tool, that would allow us to recover the data?
Perhaps there are some specific options for recovery related to
journaling in ext3?

I have used debugfs; it produced the following, if it helps:
debugfs:  open -f -w /dev/sdb1
debugfs:  features
Filesystem features: has_journal resize_inode filetype needs_recovery
sparse_super large_file


thanks,
Dmitry


From jlb17 at duke.edu  Thu Jul 27 10:58:01 2006
From: jlb17 at duke.edu (Joshua Baker-LePain)
Date: Thu, 27 Jul 2006 06:58:01 -0400 (EDT)
Subject: data recovering in EXT3
In-Reply-To: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
References: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
Message-ID: <Pine.LNX.4.62.0607270656230.4795@chaos.egr.duke.edu>

On Tue, 25 Jul 2006 at 7:27pm, Dmitry Bandurin wrote

> By chance we ran, and then stopped, the command "fsck -y" on one of our RAID
> disks (with an ext3 file system). After that we found that SOME files had
> disappeared (they are no longer seen in the directories where they were before).
> The data are extremely important and contain a lot of programs and
> scripts for data analysis, and are very hard to recover by hand.
> I have run "fsck -y" once more and it recovered just a few files.
> Is there any way, any tool, that would allow us to recover the data?
> Perhaps there are some specific options for recovery related to
> journaling in ext3?

Have you looked in the lost+found directory?  That's my only idea, other 
than recovering the files from your backups.

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University


From mr._x at shaw.ca  Thu Jul 27 14:36:16 2006
From: mr._x at shaw.ca (..:::BeOS Mr. X:::..)
Date: Thu, 27 Jul 2006 07:36:16 -0700
Subject: data recovering in EXT3
In-Reply-To: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
References: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
Message-ID: <44C8CF60.2090805@shaw.ca>

http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html
----
Q: How can I recover (undelete) deleted files from my ext3 partition?
Actually, you can't! This is what one of the developers, Andreas Dilger, 
said about it:

In order to ensure that ext3 can safely resume an unlink after a crash, 
it actually zeros out the block pointers in the inode, whereas
ext2 just marks these blocks as unused in the block bitmaps and marks 
the inode as "deleted" and leaves the block pointers alone.

Your only hope is to "grep" for parts of your files that have been 
deleted and hope for the best.
----
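
The "grep" approach from the FAQ can be sketched roughly as follows. This
is only an illustration: the function name and chunk size are made up
here, and it simply reports byte offsets at which a known fragment of a
lost file appears in a raw device or image file.

```python
# Sketch: scan a raw device or image for a known fragment of a lost file.
def find_pattern(path, pattern, chunk_size=1 << 20):
    """Yield byte offsets at which `pattern` occurs in the file at `path`.

    The file is read in chunks; the last len(pattern) - 1 bytes of each
    buffer are carried over so matches spanning two chunks are not missed.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1
    offset = 0                    # file offset of the first byte in `buf`
    buf = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            start = 0
            while True:
                hit = buf.find(pattern, start)
                if hit == -1:
                    break
                yield offset + hit
                start = hit + 1
            # Keep only the tail that could still begin a match.
            keep = buf[-overlap:] if overlap else b""
            offset += len(buf) - len(keep)
            buf = keep
```

Something like list(find_pattern("/dev/sdb1", b"some unique line")) would
then give candidate offsets to inspect. Read from the device (or better, a
copy of it) read-only, and write anything recovered to a different disk.
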

You can try to contact Andreas Dilger and maybe he can help.
adilger at clusterfs.com

Mr. X

Dmitry Bandurin wrote:
> Hello,
> 
> By chance we ran, and then stopped, the command "fsck -y" on one of our RAID
> disks (with an ext3 file system). After that we found that SOME files had
> disappeared (they are no longer seen in the directories where they were before).
> The data are extremely important and contain a lot of programs and
> scripts for data analysis, and are very hard to recover by hand.
> I have run "fsck -y" once more and it recovered just a few files.
> Is there any way, any tool, that would allow us to recover the data?
> Perhaps there are some specific options for recovery related to
> journaling in ext3?
> 
> I have used debugfs; it produced the following, if it helps:
> debugfs:  open -f -w /dev/sdb1
> debugfs:  features
> Filesystem features: has_journal resize_inode filetype needs_recovery
> sparse_super large_file
> 
> 
> thanks,
> Dmitry
> 
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users
> 


From gelma_mailinglist at gelma.net  Thu Jul 27 15:15:38 2006
From: gelma_mailinglist at gelma.net (Andrea Gelmini)
Date: Thu, 27 Jul 2006 17:15:38 +0200
Subject: data recovering in EXT3
In-Reply-To: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
References: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
Message-ID: <20060727151538.GF13602@jnb.gelma.net>

On Tue, Jul 25, 2006 at 07:27:11PM -0500, Dmitry Bandurin wrote:
> The data are extremely important and contain a lot of programs and
> scripts for data analysis, and are very hard to recover by hand.

maybe it could help:
http://dirk.eddelbuettel.com/blog/2006/07/20#ext3_undelete

ciao,
gelma 


From bandurin at fnal.gov  Thu Jul 27 19:04:19 2006
From: bandurin at fnal.gov (Dmitry Bandurin)
Date: Thu, 27 Jul 2006 14:04:19 -0500
Subject: data recovering in EXT3
In-Reply-To: <20060727151538.GF13602@jnb.gelma.net>
References: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
	<20060727151538.GF13602@jnb.gelma.net>
Message-ID: <22a6e1680607271204l70329230h79ce53e78aa08852@mail.gmail.com>

Thanks a lot! It looks like a really useful tool. I'll try it.

On 7/27/06, Andrea Gelmini <gelma_mailinglist at gelma.net> wrote:
> On Tue, Jul 25, 2006 at 07:27:11PM -0500, Dmitry Bandurin wrote:
> > The data are extremely important and contain a lot of programs and
> > scripts for data analysis, and are very hard to recover by hand.
>
> maybe it could help:
> http://dirk.eddelbuettel.com/blog/2006/07/20#ext3_undelete
>
> ciao,
> gelma
>


From ext3 at jks.tupari.net  Thu Jul 27 22:28:56 2006
From: ext3 at jks.tupari.net (Joseph Shraibman)
Date: Thu, 27 Jul 2006 18:28:56 -0400 (EDT)
Subject: maximums of ext3?
Message-ID: <Pine.LNX.4.63.0607271826400.19437@tupari.net>

Where can I find the maximums of ext3?  Today I ran into trouble after a 
directory had 31999 subdirectories in it (not including . or ..).  I know 
that ext3 can hold many more regular files than that, but where are the 
limits defined?


From adilger at clusterfs.com  Fri Jul 28 00:07:20 2006
From: adilger at clusterfs.com (Andreas Dilger)
Date: Thu, 27 Jul 2006 18:07:20 -0600
Subject: maximums of ext3?
In-Reply-To: <Pine.LNX.4.63.0607271826400.19437@tupari.net>
References: <Pine.LNX.4.63.0607271826400.19437@tupari.net>
Message-ID: <20060728000720.GP6452@schatzie.adilger.int>

On Jul 27, 2006  18:28 -0400, Joseph Shraibman wrote:
> Where can I find the maximums of ext3?  Today I ran into trouble after a 
> directory had 31999 subdirectories in it (not including . or ..).  I know 
> that ext3 can hold many more regular files than that, but where are the 
> limits defined?

linux/Documentation/filesystems/ext2.txt is a good starting place,
or wikipedia.
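
For the subdirectory case specifically, the limit follows from the
per-inode link count. A minimal sketch, assuming the EXT3_LINK_MAX value
of 32000 from the kernel's ext3 headers (the helper name is made up for
illustration):

```python
# Sketch: how the link-count ceiling bounds the number of subdirectories.
EXT3_LINK_MAX = 32000   # assumption: value of EXT3_LINK_MAX in the ext3 headers


def max_subdirectories(link_max=EXT3_LINK_MAX):
    """Subdirectories that fit before the parent's link count hits link_max.

    A directory starts with 2 links (its name in its parent and its own
    "."); each subdirectory adds one more through its ".." entry.
    """
    return link_max - 2


print(max_subdirectories())
```

That lands in the same neighbourhood as the count reported above. Regular
files do not add links to the directory that holds them, which is why far
more of them fit.
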

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


From bandurin at fnal.gov  Fri Jul 28 01:26:48 2006
From: bandurin at fnal.gov (Dmitry Bandurin)
Date: Thu, 27 Jul 2006 20:26:48 -0500
Subject: data recovering in EXT3
In-Reply-To: <20060727151538.GF13602@jnb.gelma.net>
References: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
	<20060727151538.GF13602@jnb.gelma.net>
Message-ID: <22a6e1680607271826w30dede03u90bd820320f38058@mail.gmail.com>

Hello,

magicrescue is aimed at recovering many file types, but unfortunately
it does not seem to contain recipes for recovering the most common kind,
ASCII (text) files.
Does anybody have such extensions to the standard magicrescue
for restoring text files?

Dmitry

On 7/27/06, Andrea Gelmini <gelma_mailinglist at gelma.net> wrote:
> On Tue, Jul 25, 2006 at 07:27:11PM -0500, Dmitry Bandurin wrote:
> > The data are extremely important and contain a lot of programs and
> > scripts for some data analysis, and are very hard to recover by hand.
>
> maybe it could help:
> http://dirk.eddelbuettel.com/blog/2006/07/20#ext3_undelete
>
> ciao,
> gelma
>


From keld at dkuug.dk  Fri Jul 28 06:30:53 2006
From: keld at dkuug.dk (Keld =?iso-8859-1?Q?J=F8rn?= Simonsen)
Date: Fri, 28 Jul 2006 08:30:53 +0200
Subject: data recovering in EXT3
In-Reply-To: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
References: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
Message-ID: <20060728063053.GA16598@rap.rap.dk>

On Tue, Jul 25, 2006 at 07:27:11PM -0500, Dmitry Bandurin wrote:
> Hello,
> 
> We have run and stopped, by chance, the command "fsck -y" on one of our
> raid disks (with an ext3 file system). After that we found that SOME files
> disappeared (they are no longer seen in the directories where they were
> before).
> The data are extremely important and contain a lot of programs and
> scripts for some data analysis, and are very hard to recover by hand.
> I have run "fsck -y" once more and it recovered just a few files.
> Is there any way, any tool, that would allow us to recover the data?
> Perhaps there are some specific options for recovery related to
> journaling in ext3?
> 
> I have used debugfs; it produced the following, if it helps:
> debugfs:  open -f -w /dev/sdb1
> debugfs:  features
> Filesystem features: has_journal resize_inode filetype needs_recovery
> sparse_super large_file

You could look at my patched version of debugfs - 
http://std.dkuug.dk/keld/readme-salvage.html

Best regards
Keld


From keld at dkuug.dk  Fri Jul 28 11:39:02 2006
From: keld at dkuug.dk (Keld =?iso-8859-1?Q?J=F8rn?= Simonsen)
Date: Fri, 28 Jul 2006 13:39:02 +0200
Subject: data recovering in EXT3
In-Reply-To: <20060728063053.GA16598@rap.rap.dk>
References: <22a6e1680607251727g2f11ad62g53e92dca8d042967@mail.gmail.com>
	<20060728063053.GA16598@rap.rap.dk>
Message-ID: <20060728113902.GA22518@rap.rap.dk>

On Fri, Jul 28, 2006 at 08:30:53AM +0200, Keld Jørn Simonsen wrote:
> On Tue, Jul 25, 2006 at 07:27:11PM -0500, Dmitry Bandurin wrote:
> > Hello,
> > 
> > We have run and stopped, by chance, the command "fsck -y" on one of our
> > raid disks (with an ext3 file system). After that we found that SOME files
> > disappeared (they are no longer seen in the directories where they were
> > before).
> > The data are extremely important and contain a lot of programs and
> > scripts for some data analysis, and are very hard to recover by hand.
> > I have run "fsck -y" once more and it recovered just a few files.
> > Is there any way, any tool, that would allow us to recover the data?
> > Perhaps there are some specific options for recovery related to
> > journaling in ext3?
> > 
> > I have used debugfs; it produced the following, if it helps:
> > debugfs:  open -f -w /dev/sdb1
> > debugfs:  features
> > Filesystem features: has_journal resize_inode filetype needs_recovery
> > sparse_super large_file
> 
> You could look at my patched version of debugfs - 
> http://std.dkuug.dk/keld/readme-salvage.html

It salvages files by looking at the data blocks only.
It does not need a directory structure at all. So even if you have deleted
all files on the disk, for example by reformatting it, you can still
salvage most of the files.
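
[Editorial sketch, not part of the original message.] To illustrate the
general idea of working from data blocks alone, the following scans a raw
image in fixed-size blocks and reports runs of blocks that look like ASCII
text. It is not how the patched debugfs works; the 4096-byte block size, the
95% printable threshold, and the function names are all assumptions chosen
for illustration. Work on a copy of the device (an image file), never on the
damaged disk itself.

```python
import string

# Bytes treated as "text": printable ASCII plus common whitespace.
TEXT_BYTES = frozenset(string.printable.encode("ascii"))


def looks_like_text(block, threshold=0.95):
    """Heuristic: a non-empty block whose bytes are mostly printable ASCII."""
    # The last block of a file is usually zero-padded, so ignore the padding.
    data = block.rstrip(b"\x00")
    if not data:
        return False
    printable = sum(b in TEXT_BYTES for b in data)
    return printable / len(data) >= threshold


def find_text_runs(image_path, block_size=4096):
    """Yield (first_block, block_count) for each run of text-like blocks."""
    start = None
    index = 0
    with open(image_path, "rb") as image:
        while True:
            block = image.read(block_size)
            if not block:
                break
            if looks_like_text(block):
                if start is None:
                    start = index
            elif start is not None:
                yield start, index - start
                start = None
            index += 1
    # Close a run that extends to the end of the image.
    if start is not None:
        yield start, index - start
```

Each reported run is only a candidate: fragmented files will show up as
several runs, and the blocks still have to be copied out and inspected by
hand.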

Best regards
Keld