Huge number of files in a directory

Miner, Jonathan W (US SSA) jonathan.w.miner at baesystems.com
Wed Oct 7 20:57:32 UTC 2009


The issue with 'ls' is that it wants to sort the output. You may want to try "-f", which says "do not sort".
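For example (untested sketch; '/path/to/images' is just a placeholder for your directory):

    # List entries without sorting them first (and without the stat-heavy -l):
    ls -f /path/to/images | head -n 20

    # Count the entries without building the whole listing first:
    find /path/to/images -maxdepth 1 -type f | wc -l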

In any case, huge numbers of files mean longer times to look through the directory nodes to find the file. And "huge" will depend on your system. One approach that I've used is to create a directory structure based on the filename; for example, "abcdefgh.dat" might be stored as "a/b/c/d/abcdefgh.dat". This preserves the filename, but keeps each directory to a manageable size.
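A rough sketch of that idea in shell, assuming a filename like "abcdefgh.dat" and a destination root of /data (both invented for the example):

    f="abcdefgh.dat"
    # Build the fan-out path from the first four characters of the name:
    dest="/data/${f:0:1}/${f:1:1}/${f:2:1}/${f:3:1}"
    mkdir -p "$dest"
    mv "$f" "$dest/"     # ends up as /data/a/b/c/d/abcdefgh.dat

For numeric names like yours ({number}.png), fanning out on the last few digits instead tends to spread the files more evenly across the buckets.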
________________________________________
From: redhat-list-bounces at redhat.com [redhat-list-bounces at redhat.com] On Behalf Of "Fábio Jr." [fjuniorlista at gmail.com]
Sent: Wednesday, October 07, 2009 16:48
To: General Red Hat Linux discussion list
Subject: Re: Huge number of files in a directory

Thanks, Andrew, for the reply.

Actually, running 'ls' in the directory doesn't work; it takes too much time
and drives the load way up. The number of files is a projection based on the
size of the directory and the average size of the files, which doesn't vary
much.

I have two SAS disks in RAID 1. Before this, all the data was on a single
disk without RAID. When we migrated to a more powerful server with the new
disks, about a year ago, the transfer took four days to finish.

I haven't had a throughput problem yet, neither on the network nor on reads
or writes. But I'm concerned about when this will start to bother me; the
longer I wait to start applying a solution, the harder it will be to carry
out.

I'll really do some deep research before I start working on migrating
anything. I just wanted to know if anyone has already had a similar problem,
to share experiences.

Thanks again.

[]s
    Fábio Jr.

Andrew Elliott wrote:
> The limit on the number of files in a directory is variable and set when the
> FS is created. It depends on the size of the volume; if you do a 'df -i' it
> will tell you how many inodes you have available. This will help to determine
> the limits of the FS, which could help in figuring out where the bottleneck
> is.
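> For example (the mount point below is just a placeholder):
>
>     df -i /storage
>
> The IUsed and IFree columns show how many inodes are consumed and how many
> are left; once inodes run out, file creation fails even if 'df -h' still
> shows free space.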
>
> Quantity?...I've had problems doing an 'ls -al' on a single directory with
> 45,000 files in it (ext3 on an external SCSI array)...so I'm surprised that
> you're not having problems already...
>
> Suggestions?  I've read that XFS and ReiserFS are the best filesystem types
> for working with a large number of small files, although the article was old
> (2006).  Reiser will consume more CPU than the others, but that's just from
> my personal experience.
>
> If you do find the number of files is a bottleneck, hardware is the easiest
> fix.  I'd recommend getting the fastest drives and bus that you can
> afford...
>
> I would definitely research the issue before doing anything about it...
>
> Andrew.
>
>
>
> -------
> Hello,
>
> I serve static content with an Apache server, and store the files on a
> storage server, which is mounted on the webserver via NFS. 95% of the
> files that I serve are images, and the file names have the format
> {number}.png.
>
> I have these images all together in a single directory, and there are
> about 4 million files in this folder.
>
> I want to change this directory structure to something more secure and
> dynamic, to make it easier to scale and back up these files.
>
> My questions are:
>
> - When will the quantity of files start to become a bottleneck for my
> filesystem? (They are stored in an ext3 partition.)
>
> - When will the quantity of files start to become a bottleneck for my OS?
>
> - Suggestions?
>
> Thanks
>
> []s
>     Fábio Jr.
>
>
