
Re: ext2/ext3 directory handling

"Alan R.Becker" <beckera mail-now com> wrote:
> (1) Is the assumption that directories don't compress when deleting 
> files correct?  How is this handled (in general terms)?

That is correct.  A deleted file leaves a "hole" in the directory
which a new addition can fill (if it fits).

> (2) Is there any difference between ext2 and ext3?  


> (3) Does the htree code change the picture any (even 
> though I don't use it, and won't until it is production) ?

No, htree will not release directory blocks.

> (4) Is it possible that the directories themselves 
> were fragmented?

Yes, very probable.

However to understand why things slowed down a bit more info is needed.

It is probable that the many little files in one typical directory are
splattered all over the disk.  Does your workload regularly touch all the
files in these directories?  If so then it may be suffering from this lack of
inter-file locality.

If not then yes, perhaps the problem is due to large, fragmented
directories.

How many bytes does a typical directory consume?  If you have the disk
space, and are confident that (say) 64k is "enough" then perhaps you could
grow each user's mail directory to (say) 64k when that user is created. 
This way they will have a nice unfragmented directory for all time.
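
A rough sketch of how that pre-growing could be scripted, assuming plain
(non-htree) ext2/ext3 with 4k blocks.  The function name and the "pregrow-"
placeholder names are made up for illustration: it fills the directory with
long-named empty files until the requested number of blocks is in use, then
deletes them, relying on the fact that directory blocks are not released.

```shell
# Pre-grow directory "$1" to about "$2" bytes (default 65536).
# Hypothetical helper; names and sizes are illustrative assumptions.
pregrow_dir() {
    dir=$1
    target=${2:-65536}
    # Each placeholder name is 200 characters, so one entry takes 8 + 200 bytes.
    per_block=$(( 4096 / (8 + 200) ))
    count=$(( target / 4096 * per_block ))
    i=0
    while [ "$i" -lt "$count" ]; do
        # "pregrow-" (8 chars) + 192 zero-padded digits = 200-character name
        : > "$dir/$(printf 'pregrow-%0192d' "$i")" || return 1
        i=$(( i + 1 ))
    done
    # Remove the placeholders; the directory keeps its allocated blocks.
    find "$dir" -maxdepth 1 -name 'pregrow-*' -exec rm -f {} +
}
```

Something like `pregrow_dir /path/to/newuser/mail 65536` straight after the
mkdir, followed by `ls -ld` on the directory, should show whether it reached
the size you wanted.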

> (5) After doing a "mkdir" to create a new directory, how many 
> file entries can it hold before it would be expanded to accept 
> another file?

4 kilobytes.  Each directory entry consumes eight bytes, plus the length of
the name rounded up to a multiple of 4 bytes.
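
To put numbers on that, here is a small sketch of the arithmetic.  The
function names are made up, names are assumed to be plain ASCII, and the
"." and ".." entries that take up a little of the first block are ignored.

```shell
# Bytes one directory entry occupies: 8 bytes plus the name length
# rounded up to a multiple of 4.
dirent_size() {
    echo $(( 8 + (${#1} + 3) / 4 * 4 ))
}

# How many entries with a name like "$1" fit in one 4096-byte block.
entries_per_4k() {
    echo $(( 4096 / $(dirent_size "$1") ))
}

dirent_size "msg.0001"       # 8-character name: prints 16
entries_per_4k "msg.0001"    # prints 256
```

So short names give you a couple of hundred entries per block, and longer
names proportionally fewer.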

> When a directory is expanded, how many additional 
> file entries can be stored before needing another expansion?

Another 4 kilobytes.

> (6) Say I have a directory containing some files, then I delete 
> some files, and finally I start adding files.  Will new file 
> entries use empty or vacated directory slots before expanding 
> the directory?

Deletion causes holes.  Holes are coalesced within a 4k block.  New entries
are allocated from holes on a first-fit basis.

> (7) I am aware of e2defrag (latest version I have found is 0.73). 
> Does this program (or any other any tool) perform any 
> directory optimization that would affect this problem?

It's obsolete.

For your purposes, all you'd need to do to defrag a directory is

	mkdir new
	ln old/* new/
	rm -rf old
	mv new old

If you use `cp' instead of `ln' then you'll defrag the files themselves,
and lay them out close to each other.  Which is only important if your app
regularly touches lots of files in a single directory.  It probably does
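
A slightly more defensive sketch of the hard-link recipe above, for what it
is worth.  The function name and the temporary directory name are made up,
and it assumes the directory is flat (files only) and that nothing is
writing into it while it runs.

```shell
# Rebuild directory "$1" (given without a trailing slash) so that its
# entries land in freshly allocated directory blocks.  Hypothetical helper.
rebuild_dir() {
    old=$1
    new="$old.rebuild.$$"    # illustrative temporary name next to the original
    # Hard links cannot point at directories, so refuse if any are present.
    if [ -n "$(find "$old" -mindepth 1 -type d | head -n 1)" ]; then
        echo "rebuild_dir: $old has subdirectories, not touching it" >&2
        return 1
    fi
    mkdir "$new" || return 1
    # Unlike old/*, find also picks up dotfiles and avoids argument-list limits.
    find "$old" -mindepth 1 -maxdepth 1 -exec ln {} "$new"/ \;
    # Only remove the original once every entry has a link in the new directory.
    if [ "$(ls -A "$old" | wc -l)" -ne "$(ls -A "$new" | wc -l)" ]; then
        echo "rebuild_dir: entry count mismatch, leaving $new in place" >&2
        return 1
    fi
    rm -rf "$old" && mv "$new" "$old"
}
```

Note that the new directory comes from a plain mkdir, so you may need to
chown/chmod it to match the original.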

> (8) If e2defrag would be helpful, has it/is it being brought 
> forward to operate correctly with current (RH 8/9) systems?
> I see some warnings about blocksize restrictions, etc.

I haven't heard of anyone using it in ages.

> (9) In designing new systems, are there some useful guidelines 
> about the maximum number of files that can exist in a single 
> directory without significant performance loss?  
> I am interested in ext2, ext3, and htree.

Non-htree gets awkward at a few thousand.  htree appears to be OK up to
hundreds of thousands.  Its practical scalability is unknown, really.
