Poor Performance When Number of Files > 1M

Stephen Samuel darkonc at gmail.com
Thu Aug 2 05:11:29 UTC 2007

Searching a directory for an existing name (to ensure no duplicates,
etc.) gets slower as the directory grows, so creating N files is going
to be order N^2.

Size of the directory is likely to be a limiting factor.
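
A rough way to see where the N^2 comes from, as a Python sketch. It
assumes every create first scans the entries already in its directory
one by one, and that files are spread evenly over the directories.
That is a simplified model, not a measurement of ext3, and the function
name scan_cost is made up for illustration.

```python
def scan_cost(num_files, num_dirs):
    """Total entries examined while creating num_files files spread evenly
    over num_dirs directories, ASSUMING each create linearly scans the
    entries already present in its directory (simplified model)."""
    per_dir, extra = divmod(num_files, num_dirs)

    def cost(k):
        # A directory that ends up with k files costs 0 + 1 + ... + (k - 1).
        return k * (k - 1) // 2

    # 'extra' directories hold one more file than the rest.
    return extra * cost(per_dir + 1) + (num_dirs - extra) * cost(per_dir)
```

Under this model, doubling the number of files in a single directory
roughly quadruples the cost, while spreading the same files over ten
times as many directories cuts it by about a factor of ten.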

Try increasing to 10000 directories (in two layers of 100 each).  I'll
bet you that the result will be a pretty good increase in speed
(getting back to the speeds that you had with 1M files).
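
One way to get the two-layer layout, sketched in Python. The hash
(CRC32), the two-digit directory names, and the function name
bucket_path are all my assumptions for illustration; any stable hash of
the file name would do.

```python
import os
import zlib

def bucket_path(root, name, fanout=100):
    """Map a file name to root/<a>/<b>/name with a and b in [0, fanout),
    giving fanout * fanout leaf directories (10000 by default).
    The hash and naming scheme are illustrative assumptions."""
    h = zlib.crc32(name.encode("utf-8")) % (fanout * fanout)
    a, b = divmod(h, fanout)
    return os.path.join(root, "%02d" % a, "%02d" % b, name)
```

Before writing a file you would create its parent with
os.makedirs(os.path.dirname(path), exist_ok=True).  With 1M files this
works out to about 100 files per leaf directory.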

On 8/1/07, Sean McCauliff <smccauliff at mail.arc.nasa.gov> wrote:
> Hi all,
> I plan on having about 100M files totaling about 8.5 TiB.  To see
> how ext3 would perform with large numbers of files I've written a test
> program which creates a configurable number of files into a configurable
> number of directories, reads from those files, lists them and then
> deletes them.  Even up to 1M files ext3 seems to perform well and scale
> linearly; the time to execute the program on 1M files is about double
> the time it takes it to execute on .5M files.  But past 1M files it
> seems to have n^2 scalability.  Test details appear below.
> Looking at the various options for ext3 nothing jumps out as the obvious
> one to use to improve performance.
> Any recommendations?
Stephen Samuel

