Many small files, best practise.
Peter Grandi
pg_ext3 at ext3.for.sabi.co.UK
Mon Sep 14 09:40:18 UTC 2009
>> RHEL 5.3
>> ~1000.000.000 files (1-30k)
>> ~7TB in total
>> //
>> I'm looking for a best practice when implementing this using
>> EXT3 (or some other FS if it shouldn't do the job.).
"best practice" would be a rather radical solution.
>> On average the reads dominate (99%); writes are only used for
>> updating and aren't part of the service provided. The data
>> is divided into 200k directories, each with some 5k files.
>> This ratio (dir/files) can be altered to optimize FS
>> performance.
> If you are writing to a local S-ATA disk, ext3/4 can write a
> few thousand files/sec without doing any fsync() operations.
> With fsync(), you will drop down quite a lot.
Unfortunately using 'fsync' is a good idea for production
systems.
Also note that writing 10^9 files at a rate of 10^3/s takes
10^6 seconds: roughly 11.5 days to populate the filesystem (or at
least that long to restore it from backups).
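As a sanity check on that figure, here is a short sketch of the
arithmetic. The 10^9 files and 10^3 files/s come from this thread; the
100 files/s rate with 'fsync' is purely an assumed number for
illustration, not a measurement.

```python
# Back-of-the-envelope time to populate (or restore) the filesystem.
SECONDS_PER_DAY = 24 * 60 * 60

def populate_days(n_files, files_per_second):
    """Days needed to create n_files at a sustained files_per_second."""
    return n_files / files_per_second / SECONDS_PER_DAY

# Figures quoted in the thread: 10^9 files at 10^3 files/s.
print(f"{populate_days(10**9, 10**3):.1f} days without fsync")

# ASSUMPTION: a hypothetical 100 files/s once every file is fsync'ed.
print(f"{populate_days(10**9, 10**2):.1f} days at an assumed 100 files/s")
```

Whatever the real rate with 'fsync' turns out to be, the result scales
inversely with it, so a tenfold drop in rate means a tenfold longer
restore.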
> One layout for directories that works well with this kind of
> thing is a time based one (say YEAR/MONTH/DAY/HOUR/MIN where
> MIN might be 0, 5, 10, ..., 55 for example).
As to the problem above and this kind of solution, I reckon that
it is utterly absurd (and I could have used much stronger words).
BTW, the sort of people who seriously consider such utter
absurdities tend to do a thorough job of it, so I don't want to
know how the underlying storage system is structured :-).
If anything, consider the obvious (obvious except to those who
want to use a filesystem as a small record database), which is
'fsck' time, in particular given the structure of 'ext3' (or
'ext4') metadata.
So: just don't use a filesystem as a database, spare us the
horror; use a database, even a simple one, which is not utterly
absurd.
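To make "even a simple one" concrete, here is a minimal sketch that
keeps small records as rows instead of files. SQLite is chosen here
only for illustration (no particular database is named in this
thread), and the table and function names are hypothetical. Nothing
here claims that this exact setup would suit 10^9 records.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a hypothetical key -> record store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        " key TEXT PRIMARY KEY,"   # what would have been the file path
        " body BLOB NOT NULL)"     # what would have been the file contents
    )
    return conn

def put(conn, key, body):
    # One transaction per call; for bulk loads, group many inserts
    # into a single transaction instead.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO records (key, body) VALUES (?, ?)",
            (key, body),
        )

def get(conn, key):
    row = conn.execute(
        "SELECT body FROM records WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]

store = open_store()
put(store, "dir0001/file0042", b"x" * 2048)
print(len(get(store, "dir0001/file0042")))
```

The point of the sketch is only that a lookup by key replaces a
directory walk, and that the whole store is a handful of large files
as far as the filesystem and 'fsck' are concerned.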
Compare these two:
http://lists.gllug.org.uk/pipermail/gllug/2005-October/055445.html
http://lists.gllug.org.uk/pipermail/gllug/2005-October/055488.html
Anyhow I do see a lot of inane questions and "solutions" like
the above in various lists (usually the XFS one, which attracts
a lot of utter absurdities).
> When reading files in ext3 (and ext4) or doing other bulk
> operations like a large deletion, it is important to sort the
> files by inode (do the readdir, get say all of the 5k files in
> your subdir and then sort by inode before doing your bulk
> operation).
Good idea, but it is best to avoid the cases where this matters.
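For the cases where it cannot be avoided, the inode-sorting suggestion
quoted above might look roughly like this. It is a sketch only: the
function names are made up for illustration, and the temporary
directory stands in for one of the 5k-file subdirectories.

```python
import os
import tempfile

def files_in_inode_order(directory):
    """Return paths of regular files in directory, sorted by inode number."""
    with os.scandir(directory) as it:
        entries = [
            (entry.inode(), entry.path)
            for entry in it
            if entry.is_file(follow_symlinks=False)
        ]
    entries.sort()  # tuples sort by inode number first
    return [path for _, path in entries]

def read_all(directory):
    """Read every file in inode order; return the total bytes read."""
    total = 0
    for path in files_in_inode_order(directory):
        with open(path, "rb") as f:
            total += len(f.read())
    return total

# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    for i in range(5):
        with open(os.path.join(d, f"f{i}"), "wb") as f:
            f.write(b"x" * 100)
    print(read_all(d))
```

The same ordering can be used for a bulk deletion by calling
'os.unlink' on each path instead of reading it.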