Huge number of files in a directory

Cameron Simpson cs at zip.com.au
Thu Oct 8 04:30:49 UTC 2009


On 07Oct2009 16:57, Miner, Jonathan W (US SSA) <jonathan.w.miner at baesystems.com> wrote:
| The issue with 'ls' is that it wants to sort the output. You may want to try using "-f", which says "do not sort"

No, sorting is actually pretty cheap: an in-memory sort of the names
is only O(n log n), and it isn't where the time goes.

The issue with ls and large directories is usually the fact that ls
stat()s all the names. Plenty of other things need to stat() everything
too; backups of all kinds, for example. A stat() requires the OS to
search the directory to map the stat()ed name to an inode, and that's a
linear operation on ext3 if you haven't turned on directory hashing. In
consequence, the 'ls' cost goes as the square of the number of directory
entries (n names, each asking for a stat() whose cost is O(n), so
O(n^2) for the whole thing).
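
To see where the time goes, compare a bare directory read with one
that stat()s every entry. A minimal Python sketch, assuming a
hypothetical /var/spool/bigdir with many entries:

    import os
    import stat

    d = "/var/spool/bigdir"  # hypothetical large directory

    # One linear pass over the directory: readdir() only, no per-name
    # lookups -- O(n). Roughly what an unsorted, unadorned listing does.
    names = os.listdir(d)

    # What 'ls -l' effectively does: one lstat() per name. Without
    # directory hashing each lookup is itself O(n), so this loop is
    # O(n^2) overall.
    for name in names:
        st = os.lstat(os.path.join(d, name))
        if stat.S_ISREG(st.st_mode):
            pass  # e.g. accumulate st.st_size

On ext3, directory hashing is the dir_index feature; tune2fs -O
dir_index enables it, and an e2fsck -D pass rebuilds existing
directories to use the hashed index.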

The usual approach is to make a tree of subdirectories to mitigate the
per-directory cost (keeping the n in that n^2 small).
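
One common layout fans the names out by a hash prefix so each leaf
directory stays small. A sketch, with hypothetical names and paths:

    import hashlib
    import os

    def shard_path(base, name, levels=2, width=2):
        """Map a flat name to base/ab/cd/name using hex-digest
        prefixes; each leaf then holds roughly n / 256**levels
        entries instead of n."""
        h = hashlib.md5(name.encode()).hexdigest()
        parts = [h[i * width:(i + 1) * width] for i in range(levels)]
        return os.path.join(base, *parts, name)

    p = shard_path("/var/spool/bigdir", "message-000123.eml")
    os.makedirs(os.path.dirname(p), exist_ok=True)
    # open/create p here rather than /var/spool/bigdir/message-000123.eml

Because the path is recomputed from the name alone, no index is
needed: any program that knows the scheme can find a file directly.
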
-- 
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Contrary to what you may think, your hacker is fully aware
of your company's dress code. He is fully aware of the fact that it
doesn't help him do his job.
- Gregory Hosler <gregory.hosler at eno.ericsson.se>