Poor Performance When Number of Files > 1M

David Schwartz davids at webmaster.com
Thu Aug 2 04:42:28 UTC 2007


> Hi all,
>
> I plan on having about 100M files totaling about 8.5 TiB.  To see
> how ext3 would perform with large numbers of files, I've written a test
> program that creates a configurable number of files across a configurable
> number of directories, reads from those files, lists them, and then
> deletes them.  Up to 1M files ext3 seems to perform well and scale
> linearly; the time to execute the program on 1M files is about double
> the time it takes on 0.5M files.  But past 1M files it seems to
> scale as O(n^2).  Test details appear below.
>
> Looking at the various options for ext3, nothing jumps out as the
> obvious one for improving performance.
>
> Any recommendations?

If you want performance that isn't O(n^2), the number of directory levels
must increase by one each time the number of files grows by an order of
magnitude. That is, the number of files per directory must stay roughly
constant.

Suppose you have a single directory of N files. Locating each of the N
files requires a lookup in that directory, and each lookup scans an average
of N/2 entries. So the total work is O(N * (N/2)), which is O(N^2).
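To make the arithmetic concrete, here is a rough back-of-the-envelope cost
model (plain Python; the fan-out of 100 and the level count are illustrative
assumptions, and it models linear-scan directories, i.e. no directory index):

import math

# Model: looking up one file in a directory of D entries scans ~D/2
# of them on average (linear scan, no directory index).

def flat_cost(n):
    """Entries scanned to look up all n files kept in one directory."""
    return n * (n / 2)

def sharded_cost(n, fanout=100):
    """Entries scanned when every directory holds ~fanout entries:
    each lookup walks one directory per level, scanning ~fanout/2."""
    levels = max(1, math.ceil(math.log(n, fanout)))
    return n * levels * (fanout / 2)

for n in (10**5, 10**6, 10**7):
    print("%10d files: flat ~%.1e scans, sharded ~%.1e scans"
          % (n, flat_cost(n), sharded_cost(n)))

At 1M files the flat layout costs on the order of 5e11 entry scans versus
about 1.5e8 for the sharded one, which is the n^2 vs. n log n gap you are
seeing.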

Add another level of directories each time you increase the number of files
by a factor of 10.
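For example, a minimal sketch of such a layout (plain Python; the
hash-prefix scheme, the two levels, and the 256-way fan-out are assumptions
for illustration, not anything ext3-specific):

import hashlib
import os

def sharded_path(root, name, levels=2):
    """Spread files across a fixed tree: each pair of hex digits of
    the name's hash picks one of 256 subdirectories per level. Add a
    level whenever the file count outgrows the current fan-out."""
    digest = hashlib.md5(name.encode()).hexdigest()
    parts = [digest[2*i:2*i + 2] for i in range(levels)]
    return os.path.join(root, *parts, name)

path = sharded_path("/data", "somefile.dat")
os.makedirs(os.path.dirname(path), exist_ok=True)
print(path)  # something like /data/xx/yy/somefile.dat

With 256^2 = 65536 leaf directories, 100M files average about 1,500 per
directory; a third level drops that to about 6.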

DS




