Many small files, best practice.

Andreas Dilger adilger at sun.com
Wed Sep 16 22:28:29 UTC 2009


On Sep 14, 2009  22:08 +0100, Peter Grandi wrote:
> > When you deal with systems that store millions of files,
> 
> Millions of files may work; but 1 billion is an utter absurdity.
> A filesystem that can store reasonably 1 billion small files in
> 7TB is an unsolved research issue...

I'd disagree.  We have Lustre filesystems with 500M files on
the ext4(ish) metadata server, and those MDS filesystems are
only 4TB.  Note there is NO DATA in the files on the MDS, so
it isn't quite like a normal filesystem.
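
As a rough sanity check on the densities being compared, here is a
back-of-envelope sketch.  It assumes decimal terabytes (10^12 bytes)
and simply divides device size by file count; the helper name
bytes_per_file is made up for illustration.

```python
# Back-of-envelope density check (assumes decimal TB = 10**12 bytes).
TB = 10**12


def bytes_per_file(capacity_bytes, file_count):
    """Average device space available per file (hypothetical helper)."""
    return capacity_bytes / file_count


# Lustre MDS from this thread: 500M files on a 4TB metadata filesystem.
mds_density = bytes_per_file(4 * TB, 500_000_000)

# The case under discussion: 1 billion files on a 7TB filesystem.
proposed_density = bytes_per_file(7 * TB, 1_000_000_000)

print(f"MDS:      {mds_density:.0f} bytes per file (metadata only)")
print(f"Proposed: {proposed_density:.0f} bytes per file (metadata + data)")
```

That works out to about 8kB of device per file on the MDS and about
7kB per file in the 1 billion file case.  The caveat above still
applies: the MDS figure covers metadata only.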

It also depends on what you mean by "small files".
We've previously discussed storing small file data in an
extended attribute.  If you are tuning for this and the
file size is small enough (3kB or less), the file data could
be stored inside the inode (i.e. zero-seek data IO).

> > fsck time has improved quite a lot recently with ext4 (and
> > with xfs).
> 
> How many months do you think a 7TB filesystem with 1 billion
> files would take to 'fsck' even with those improvements? Even
> with the nice improvements?

I think you aren't backing your comments with any facts.

The e2fsck time on our MDS filesystems with 500M IN USE inodes
is on the order of 4 hours (disk-based RAID-1+0 array).  If
this were on a RAID-1+0 SSD it could be noticeably faster.

Ric also commented previously about single-digit hours for e2fsck
on a test 1B file ext4 filesystem.
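
To put those numbers in perspective, here is a rough sketch of the
implied check rate.  It assumes e2fsck time scales linearly with the
number of in-use inodes, which is only an approximation (real runs
depend on directory layout, fragmentation and the storage), and the
function names are made up for illustration.

```python
# Rough e2fsck throughput estimate from the figures quoted above.
# Assumption: check time scales linearly with in-use inode count.
SECONDS_PER_HOUR = 3600


def inodes_per_second(inodes, hours):
    """Average inodes checked per second (hypothetical helper)."""
    return inodes / (hours * SECONDS_PER_HOUR)


def projected_hours(inodes, rate):
    """Hours to check `inodes` at `rate` inodes per second."""
    return inodes / rate / SECONDS_PER_HOUR


# 500M in-use inodes checked in roughly 4 hours.
rate = inodes_per_second(500_000_000, 4)

# Linear extrapolation to 1 billion inodes on the same hardware.
hours_for_1b = projected_hours(1_000_000_000, rate)

print(f"about {rate:.0f} inodes/s, about {hours_for_1b:.1f} hours for 1B inodes")
```

That is roughly 35,000 inodes per second, or about 8 hours for 1B
inodes on the same disks.  It is in line with the single-digit hours
mentioned above, and nowhere near months.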

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.




More information about the Ext3-users mailing list