Filesystem fragmentation and scatter-gather DMA

Ling C. Ho ling at aliko.com
Mon Mar 17 05:48:03 UTC 2008


I had this experience a couple of years ago. Under some version of 
Red Hat Enterprise Linux 3 with a 2.4.x kernel, I tested scp'ing two files, 
each slightly over 1 GB, to a freshly formatted ext3 filesystem 
simultaneously. It turned out that version of ext3 did not have block 
reservation implemented, and we ended up with two files with more than 
10,000 non-contiguous fragments.
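
To illustrate why missing reservation hurts when two writers run at once, here is a toy model in Python. It is only a sketch, not ext3's actual allocator: the function names, the block counts, and the window size of 256 blocks are all assumptions made up for the example.

```python
from collections import deque


def allocate(n_files, blocks_per_file, window):
    """Simulate writers that take turns appending one block each.

    window=1 models "no reservation": every request simply gets the next
    free block on disk. window>1 models a per-file reservation: a writer
    claims `window` consecutive blocks and uses them up before claiming more.
    Returns one list of disk block numbers per file, in file order.
    """
    next_free = 0
    files = [[] for _ in range(n_files)]
    reserved = [deque() for _ in range(n_files)]
    for _ in range(blocks_per_file):
        for f in range(n_files):
            if not reserved[f]:
                # Claim a fresh run of consecutive blocks for this file.
                reserved[f].extend(range(next_free, next_free + window))
                next_free += window
            files[f].append(reserved[f].popleft())
    return files


def count_fragments(blocks):
    """Count runs of consecutive block numbers in a file's block list."""
    if not blocks:
        return 0
    return 1 + sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)


# Two simultaneous writers, 4096 blocks each (illustrative sizes only).
no_reservation = allocate(2, 4096, window=1)
with_reservation = allocate(2, 4096, window=256)

print(count_fragments(no_reservation[0]))    # 4096: every block is its own fragment
print(count_fragments(with_reservation[0]))  # 16: one fragment per reserved window
```

Without reservation the two files end up interleaved block by block, which is roughly the pattern we saw on disk.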

Even though the two files sat physically very close together on disk, 
the fragmentation was so bad that instead of the 50MB/s or more we 
expected from reading one file at a time, we were getting about 
10MB/s. That's not a day-to-day usage pattern on many desktops or servers, 
but unfortunately for us, that's what hundreds of our servers were 
set up to do: run 2 jobs at a time, where each would first 
copy the data files from somewhere else, read them, analyze 
the data, and then write some results onto another filesystem.
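
As a rough back-of-the-envelope check on those numbers, the sketch below works out what each fragment must have cost in time. It assumes a file of exactly 1024 MB and exactly 10,000 fragments in that one file; the real figures were "slightly over" and "more than", so treat the result as an order of magnitude only.

```python
FILE_MB = 1024.0     # assumed: "slightly over 1Gig", rounded down
SEQ_MB_S = 50.0      # read rate we expected for an unfragmented file
OBS_MB_S = 10.0      # read rate we actually measured
FRAGMENTS = 10_000   # assumed: "more than 10,000", taken per file


def implied_penalty_ms(file_mb, seq_mb_s, obs_mb_s, fragments):
    """Extra time per fragment, in milliseconds, implied by the slowdown."""
    t_seq = file_mb / seq_mb_s   # seconds to read the file contiguously
    t_obs = file_mb / obs_mb_s   # seconds it actually took
    return (t_obs - t_seq) / fragments * 1000.0


penalty = implied_penalty_ms(FILE_MB, SEQ_MB_S, OBS_MB_S, FRAGMENTS)
print(f"{penalty:.1f} ms per fragment")
```

That comes out to a few milliseconds per fragment, which is about what one would expect if each jump costs a short seek plus some rotational delay, even with the pieces close together on disk.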

So fragmentation can be very bad, but fortunately later versions 
of ext3 do a much better job of preventing exactly that.

...
ling


Jon Forrest wrote:
> The following is a short note I wrote a while back,
> mainly in response to a discussion of filesystem
> fragmentation in Windows operating systems. Most
> of what I say also applies to *nix systems.
>
> Jon Forrest
>
> ----------------
> Why PC Disk Fragmentation Doesn't Matter (much)
>
> Jon Forrest (jlforrest at berkeley.edu)
>
> [The following is an hypothesis. I don't have
> any real data to back this up. I'd like to know
> if I'm overlooking any technical details.]
>
> Disk fragmentation can mean several things.
> On one hand it can mean that the disk blocks
> that a file occupies aren't right next to each
> other physically. The more pieces that make up a file, the
> more fragmented the file is. Or, it can mean
> that the unused blocks on a disk aren't all right
> next to each other. Win9X, Windows 2000, and Windows XP
> come with defragmentation programs. Such programs
> are also available for other Microsoft and non-Microsoft
> operating systems from commercial vendors.
>
> The question of whether a fragmented disk really
> results in anything bad has always been a topic
> of heated discussion. On one side of the issue
> the vendors of disk defragmentation programs can
> always be found. The other side is usually occupied
> by skeptical system managers, such as yours truly.
>
> For example, the following claim is made by the
> vendor of one commercial product:
>
> "Disk fragmentation can cripple performance even worse
> than running with insufficient memory. Eliminate it
> and you've eliminated the primary performance bottleneck
> plaguing even the best-equipped systems." But can it, and
> does it? The user's guide for this product spends some 60 pages
> describing how to run the product but never justifies this
> claim.
>
> I'm not saying that fragmentation is good. That's one reason
> why you can't buy a product whose purpose is to fragment a disk.
> But, it's hard to imagine how fragmentation can cause any noticeable
> performance problems. Here's why:
>
> 1) The greatest benefit from having a contiguous file would
> be when the whole file is read (let's stick with reads) in
> one I/O operation. This would result in the minimal amount of
> disk arm movement, which is the slowest part of a disk I/O
> operation. But, this isn't the way most I/Os take place. Instead,
> most I/Os are fairly small. Plus, and this is the kicker, on
> a modern multitasking operating system, those small I/Os are coming
> from different processes reading from different files. Assuming that the
> data to be read isn't in a memory cache, this means that the disk arm is
> going to be flying all over the place, trying to satisfy all
> the seek operations being issued by the operating system.
> Sure, the operating system, and maybe even the disk controller,
> might be trying to re-order I/Os but there's only so much of
> this that can be done. A contiguous file doesn't really help
> much because there's a very good chance that the disk arm is
> going to have to move elsewhere on the disk between the time
> that pieces of a file are read.
>
> 2) The metadata for managing a filesystem is probably
> cached in RAM. This means when a file is created, or
> extended, the necessary metadata updates are done at memory
> speed, not at disk speed. So, the overhead of allocating
> multiple pieces for a new file is probably in the noise.
> Of course, the in-memory metadata eventually has to be flushed
> to disk but this is usually done after the original I/O completes,
> so there won't be any visible slowdown in the program that issued
> the I/O.
>
> 3) Modern disks do all kinds of internal block remapping so there's
> no guarantee that what appears to be contiguous to the operating
> system is actually really and truly contiguous on the disk. I have
> no idea how often this possibility occurs, or how bad the skew is
> between "fake" blocks and "real" blocks. But, it could happen.
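
To make point 1 above concrete, here is a toy model of disk head travel in Python. It is a sketch under loud assumptions: the read size, the number of reads, and the distance between the two files are invented, and it deliberately ignores read-ahead and I/O reordering, both of which reduce the effect on a real system.

```python
def head_travel(requests):
    """Total distance (in blocks) the head moves, starting at the first request."""
    travel = 0
    pos = requests[0]
    for block in requests[1:]:
        travel += abs(block - pos)
        pos = block
    return travel


def interleave(a, b):
    """Alternate requests from two processes: a0, b0, a1, b1, ..."""
    out = []
    for x, y in zip(a, b):
        out.extend((x, y))
    return out


IO_SIZE = 8        # blocks per small read (assumed)
N_READS = 1000     # reads issued by each process (assumed)
FAR = 1_000_000    # starting block of the second file (assumed)

# Both files are perfectly contiguous; only their position on disk differs.
file_a = [i * IO_SIZE for i in range(N_READS)]
file_b = [FAR + i * IO_SIZE for i in range(N_READS)]

alone = head_travel(file_a)                        # one process reading file A
shared = head_travel(interleave(file_a, file_b))   # two processes taking turns

print(alone, shared)
```

In this model the head travel with two readers is dominated by the jumps between the two files, so the contiguity of either file contributes almost nothing to the total. The case in my reply above is different, since there a single reader was streaming one badly fragmented file.
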
>
> So, go ahead and run your favorite disk defragmenter. I know I do.
> Now that W2K and later have an official API for moving files in an 
> atomic operation, such programs probably can't cause any harm. But
> don't be surprised if you don't see any noticeable performance
> improvements.
>
> The mystery that really puzzles and sometimes frightens me is
> why an NTFS file system becomes fragmented so easily in the first
> place. Let's say I'm installing Windows 2000 on a newly formatted
> 20GB disk. Let's say that the total amount of space used by the
> new installation is 600MB. Why should I see any fragmented files,
> other than registry files, after such an installation? I have no
> idea. My thinking is that all files that aren't created and then
> later extended should be able to be created contiguously to begin with.
>



