optimising filesystem for many small files

Theodore Tso tytso at mit.edu
Sat Oct 17 22:26:19 UTC 2009


On Sat, Oct 17, 2009 at 11:26:04PM +0530, Viji V Nair wrote:
> these files are not in a single directory, this is a pyramid
> structure. There are total 15 pyramids and coming down from top to
> bottom the sub directories and files  are multiplied by a factor of 4.
> 
> The IO is scattered all over!!!! and this is a single disk file system.
> 
> Since the python application is creating files, it is creating
> multiple files to multiple sub directories at a time.

What is the application trying to do, at a high level?  Sometimes it's
not possible to optimize a filesystem against a badly designed
application.  :-(

It sounds like it is generating files distributed in subdirectories in
a completely random order.  How are the files going to be read
afterwards?  In the order they were created, or in some other order
different from the order in which they were written?

With sufficiently bad access patterns, there may not be a lot you
can do, other than (a) throw hardware at the problem, or (b) fix or
redesign the application to be more intelligent (if possible).
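
As a rough illustration of option (b), here is a minimal sketch of one
way the application could batch its output so that all files for one
subdirectory are written together, instead of hopping between
subdirectories.  It assumes the application can hold a batch of
generated files in memory before writing; the function name and the
(relative_path, data) batch format are hypothetical, not taken from
the actual application.  Whether this helps also depends on the order
in which the files are read back later.

```python
import os
from collections import defaultdict


def write_grouped_by_directory(root, pending):
    """Write (relative_path, data) pairs one directory at a time.

    'pending' is a hypothetical in-memory batch of files that the
    application has generated but not yet written.  Returns the
    relative paths in the order they were written.
    """
    # Group the batch by parent directory.
    by_dir = defaultdict(list)
    for rel_path, data in pending:
        by_dir[os.path.dirname(rel_path)].append((rel_path, data))

    written = []
    # Visit directories in sorted order and finish each one before
    # moving on to the next.
    for subdir in sorted(by_dir):
        os.makedirs(os.path.join(root, subdir), exist_ok=True)
        for rel_path, data in sorted(by_dir[subdir]):
            with open(os.path.join(root, rel_path), "wb") as f:
                f.write(data)
            written.append(rel_path)
    return written
```

The batch size is a trade-off: a larger batch gives longer runs of
writes into the same directory, at the cost of more memory held by
the application.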

                                                - Ted
