Huge number of files in a directory

m.roth at 5-cent.us m.roth at 5-cent.us
Wed Oct 7 21:51:49 UTC 2009


> The issue with 'ls' is that it wants to sort the output. You may want to
> try using "-f", which says "do not sort"
>
> In any case, huge numbers of files mean longer times to look through the
> directory nodes to find the file.  And "huge" will depend on your system.
> One approach that I've used is to create a directory structure based on
> the filename;  for example "abcdefgh.dat" might be stored as
> "a/b/c/d/abcdefgh.dat".  This preserves the filename, but keeps each
> directory to a manageable size.
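
The per-character fan-out described above could be sketched like this (a minimal illustration, assuming filenames are at least four characters long; the sample filename is just for demonstration):

```shell
#!/bin/bash
# Sketch: place "abcdefgh.dat" under a/b/c/d/ based on its own name.
f="abcdefgh.dat"
touch "$f"                                 # sample file for the demonstration
dir="${f:0:1}/${f:1:1}/${f:2:1}/${f:3:1}"  # first four characters of the name
mkdir -p "$dir"
mv "$f" "$dir/"
```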

Another, possibly more appropriate, answer would be to automagically
create dated directories and have the system write into the current one
each day. For example, at midnight you could have a script that makes a
directory for 20091007, then does "rm today; ln -s 20091007 today".
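
A minimal sketch of that nightly job (the ./images root is hypothetical):

```shell
#!/bin/bash
# Nightly: create today's dated directory and repoint the "today" symlink.
base="./images"                  # hypothetical storage root
day="$(date +%Y%m%d)"            # e.g. 20091007
mkdir -p "$base/$day"
rm -f "$base/today"
ln -s "$day" "$base/today"       # writers keep using $base/today
```

Run from cron at midnight, writers never need to know the date; they just keep writing into today.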

Then you'd only have to migrate the existing files once, for which find
would be the right tool, I think.
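
For moving what's already there, a one-off pass with find might look like this (an untested sketch; it assumes each file's mtime reflects the day it was written, GNU date is available, and ./flat stands in for the real directory):

```shell
#!/bin/bash
# One-off sketch: bucket a flat directory's files into dated subdirectories.
src="./flat"
mkdir -p "$src" && touch "$src/12345.png"   # sample data for the demonstration
find "$src" -maxdepth 1 -type f | while read -r f; do
  day="$(date -r "$f" +%Y%m%d)"             # GNU date: file's mtime as YYYYMMDD
  mkdir -p "$src/$day"
  mv "$f" "$src/$day/"
done
```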

       mark
> ________________________________________
> From: redhat-list-bounces at redhat.com [redhat-list-bounces at redhat.com] On
> Behalf Of "Fábio Jr." [fjuniorlista at gmail.com]
> Sent: Wednesday, October 07, 2009 16:48
> To: General Red Hat Linux discussion list
> Subject: Re: Huge number of files in a directory
>
> Thanks, Andrew, for the reply.
>
> Actually, 'ls' on the directory isn't working; it takes too much time
> and drives the load way up. The number of files is a projection based
> on the size of the directory and the average size of the files, which
> doesn't vary much.
>
> I have 2 SAS disks in RAID 1. Before this, all the data was on a single
> disk without RAID. When we migrated to a more powerful server with the
> new disks, it took 4 days to finish the transfer, and that was about
> 1 year ago.
>
> I haven't had a throughput problem yet, neither on the network nor on
> reads or writes. But I'm concerned about when this will start to bother
> me. The longer I wait to start applying a solution, the harder it will
> be to accomplish.
>
> I'll really do some deep research before starting to migrate anything.
> I just wanted to know if anyone has already had a similar problem, to
> share experiences.
>
> Thanks again.
>
> []s
>     Fábio Jr.
>
> Andrew Elliott escreveu:
>> The limit on the number of files in a directory is variable and set
>> when the FS is created. It depends on the size of the volume; if you
>> do a 'df -i' it will tell you the available inodes you have. This will
>> help to determine the limits of the fs, which could help in
>> determining where the bottleneck is.
>>
>> Quantity?... I've had problems doing an 'ls -al' on a single directory
>> with 45000 files in it (EXT3 on external scsi array)... so I'm
>> surprised that you're not having problems already...
>>
>> Suggestions? I've read that XFS and ReiserFS are the best fs types for
>> working with a large number of small files, although the article was
>> old (2006). Reiser will consume more CPU than the others, but that's
>> just from my personal experience.
>>
>> If you do find the number of files is a bottleneck, hardware is the
>> easiest fix. I'd recommend getting the fastest drives and bus that you
>> can afford...
>>
>> I would definitely research the issue before doing anything about it...
>>
>> Andrew.
>>
>>
>>
>> -------
>> Hello,
>>
>> I serve static content with an Apache server and store the files on a
>> storage server, which is mounted on the webserver via NFS. 95% of the
>> files that I serve are images, and the format of the file name is
>> {number}.png.
>>
>> I have these images all together in a single directory, and there were
>> about 4 million files in this folder.
>>
>> I want to change this directory structure to something more secure and
>> dynamic, to permit an easier way to scale and back up these files.
>>
>> My questions are:
>>
>> - When will the quantity of files start to become a bottleneck for my
>> filesystem? (they are stored in an ext3 partition)
>>
>> - When will the quantity of files start to become a bottleneck for my
>> OS?
>>
>> - Suggestions?
>>
>> Thanks
>>
>> []s
>>     Fábio Jr.
>>
>>
>
> --
> redhat-list mailing list
> unsubscribe mailto:redhat-list-request at redhat.com?subject=unsubscribe
> https://www.redhat.com/mailman/listinfo/redhat-list
>
>




