optimising filesystem for many small files

Ken Shelby kshelby at optonline.net
Sat Oct 17 20:35:45 UTC 2009


IMHO, software tuning will only yield incremental improvements.  I suggest 
that you throw more and better hardware at the problem.  And, as always, 
YMMV.

- Ken



----- Original Message ----- 
From: "Viji V Nair" <viji at fedoraproject.org>
To: "Eric Sandeen" <sandeen at redhat.com>
Cc: <linux-ext4 at vger.kernel.org>; <ext3-users at redhat.com>
Sent: Saturday, October 17, 2009 1:56 PM
Subject: Re: optimising filesystem for many small files


> These files are not in a single directory; this is a pyramid
> structure. There are 15 pyramids in total, and going from top to
> bottom the subdirectories and files are multiplied by a factor of 4.
>
> The IO is scattered all over, and this is a single-disk file system.
>
> Since the Python application is creating the files, it is creating
> multiple files in multiple subdirectories at a time.
>
> On Sat, Oct 17, 2009 at 8:02 PM, Eric Sandeen <sandeen at redhat.com> wrote:
>> Viji V Nair wrote:
>>>
>>> Hi,
>>>
>>> System : Fedora 11 x86_64
>>> Current Filesystem: 150G ext4 (formatted with "-T small" option)
>>> Number of files: 50 Million, 1 to 30K png images
>>>
>>> We are generating these files using a Python programme and getting very
>>> slow IO performance. During generation there is only write, no read.
>>> After generation there is heavy read and no write.
>>>
>>> I am looking for best practices/recommendations to get better
>>> performance.
>>>
>>> Any suggestions on the above are greatly appreciated.
>>>
>>> Viji
>>>
>>
>> I would start with using blktrace and/or seekwatcher to see what your IO
>> patterns look like when you're populating the disk; I would guess that
>> you're seeing IO scattered all over.
>>
>> How you are placing the files in subdirectories will affect this quite a
>> lot; sitting in 1 directory for a while, filling with images, before
>> moving on to the next directory, will probably help. Putting each new
>> file in a new subdirectory will probably give very bad results.
>>
>> -Eric
>>
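
Eric's suggestion above (stay in one directory for a while before moving on
to the next) could be sketched in Python roughly as follows. This is only an
illustration, not the original application's code: the function name
write_grouped_by_directory and the (relative_path, data) item format are
made up for the example, it assumes Python 3, and it assumes a whole batch
of images fits in memory before being flushed to disk.

```python
import os
from collections import defaultdict


def write_grouped_by_directory(items, root):
    """Write a batch of files one directory at a time.

    items: iterable of (relative_path, data_bytes) pairs (hypothetical format).
    root:  directory under which the relative paths are created.
    Returns the relative paths in the order they were written.
    """
    # Bucket the pending files by their destination directory.
    by_dir = defaultdict(list)
    for rel_path, data in items:
        by_dir[os.path.dirname(rel_path)].append((rel_path, data))

    written = []
    # Visit directories in sorted order and finish each one before
    # starting the next, instead of interleaving writes across directories.
    for directory in sorted(by_dir):
        os.makedirs(os.path.join(root, directory), exist_ok=True)
        for rel_path, data in sorted(by_dir[directory], key=lambda t: t[0]):
            with open(os.path.join(root, rel_path), "wb") as fh:
                fh.write(data)
            written.append(rel_path)
    return written
```

Whether this helps in practice would still need to be confirmed with
blktrace or seekwatcher, as suggested above; the batch size is a trade-off
between memory use and how long the writer stays in each directory.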



