stride (fwd)

Tue Jun 24 12:30:29 UTC 2008

Two things:

1. Most likely I missed it, but I could not find how to report the stride
   setting for an ext3 filesystem.  I do not see stride mentioned in the
   man pages for dumpe2fs and tune2fs, nor in the dumpe2fs report.

2. It has been pointed out that the mke2fs man page description for stride
   needs improvement.  Andreas Dilger, in a post last year,
   http://osdir.com/ml/file-systems.ext3.user/2007-06/msg00003.html,
   mentioned that a patch was submitted, I assume to address the mke2fs
   man page.

   If this is not the case, then I suggest adding something similar to Ted's
   or Andreas' descriptions to replace the current stride description in
   the mke2fs man page.

   If nothing else, change it from

        stride=<stripe-size>
                Configure the filesystem for a RAID array with
                <stripe-size> filesystem blocks per stripe.

   to

        stride=<stride-size>
                The number of filesystem blocks in one RAID chunk, that is,
                the number of blocks written to a single disk before moving
                on to the next disk.  The purpose is to spread the filesystem
                metadata across the disks.  For example, if the RAID
                chunk/segment size is 64KB and the filesystem block size is
                4KB, then the stride size is 16 (64KB/4KB).

These types of explanations are more helpful than something like...

  -f fragment-size
              Specify the size of fragments in bytes.

taken from the mke2fs man pages.  As you can see, that explanation adds very
little value.  The current stride explanation simply seems wrong.
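
The arithmetic behind the proposed wording is simple enough to sketch in
Python.  This is only an illustration: stride_blocks is a hypothetical
helper, not part of e2fsprogs, and it assumes the RAID chunk size is a
whole multiple of the filesystem block size.

```python
def stride_blocks(chunk_kib, block_kib):
    """Return the number of filesystem blocks in one RAID chunk.

    Hypothetical helper for illustration only. Both sizes are in KiB.
    """
    if chunk_kib <= 0 or block_kib <= 0:
        raise ValueError("sizes must be positive")
    if chunk_kib % block_kib != 0:
        # Assumption: a chunk holds a whole number of filesystem blocks.
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_kib // block_kib


# The example from the proposed man page text: 64KB chunk, 4KB blocks.
print(stride_blocks(64, 4))  # 16
```

The result is the value that would be given as stride= when the
filesystem is created.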

Richard

Forwarded message:
> On Thu, Jun 19, 2008 at 06:21:24AM -0400, Mag Gam wrote:
> > ok, in a way it's like a stripe? I thought when you do a stripe you put the
> > metadata on a number of disks too. How is that different? Is there a diagram I
> > can refer to?
> 
> Yes, which is why the mke2fs man page states:
> 
> 	stride=<stripe-size>
> 		Configure  the	filesystem  for	 a  RAID  array with
> 		<stripe-size> filesystem blocks per stripe.
> 
> So if the size of a stripe on each disk is 64k, and you are using a
> 4k filesystem blocksize, then 64k/4k == 16, and that would be an
> "ideal" stride size, in that for each successive block group, the
> inode and block bitmaps would be shifted by a further offset of 16
> blocks from the beginning of the block group.
> 
> The reason for doing this is to avoid problems where the block bitmap
> ends up on the same disk for every single block group.  The classic
> case where this would happen is if you have 5 disks in a RAID 5
> configuration, which means 4 data disks per stripe, and 8192 blocks in
> a block group; then, if the block bitmap is always at the same offset
> from the beginning of the block group, one disk will get all of the
> block bitmaps, and that ends up being a major hot spot problem for the
> hard drive.
> 
> As it turns out, if you use 4 disks in a RAID 5 configuration, or 6
> disks in a RAID 5 configuration, this problem doesn't arise at all,
> and you don't need to use the stride option.  And in most cases,
> simply using stride=1 is actually enough to make sure that the
> block and inode bitmaps get forced onto successively different
> disks.
> 
> With ext4's flex_bg enhancement, the need to specify the stride option
> for RAID arrays will also go away.
> 
> 							- Ted
> 
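Ted's hot-spot argument can be checked with a toy model.  The sketch
below assumes filesystem blocks map onto the data disks round-robin in
chunk-sized units and ignores RAID 5 parity rotation.  The function name
bitmap_disks and its parameters are made up for illustration, and the
way the bitmap offset grows by stride per group is a simplification of
what Ted describes, not the actual mke2fs placement code.

```python
def bitmap_disks(data_disks, chunk_blocks, blocks_per_group, stride, groups):
    """Return the data-disk index holding each group's block bitmap.

    Toy model only: blocks go to data disks round-robin, one chunk at a
    time, and parity rotation is ignored.
    """
    disks = []
    for g in range(groups):
        # With stride, each successive group shifts its bitmap further
        # from the start of the group (simplified, wraps within the group).
        offset = (g * stride) % blocks_per_group
        block = g * blocks_per_group + offset  # absolute filesystem block
        disks.append((block // chunk_blocks) % data_disks)
    return disks


# 5-disk RAID 5 (4 data disks per stripe), 64k chunk / 4k block = 16,
# 8192 blocks per group, first 8 block groups.
print(bitmap_disks(4, 16, 8192, stride=0, groups=8))   # [0, 0, 0, 0, 0, 0, 0, 0]
print(bitmap_disks(4, 16, 8192, stride=16, groups=8))  # [0, 1, 2, 3, 0, 1, 2, 3]

# 4-disk RAID 5 (3 data disks): the bitmaps already move around
# without any stride in this model.
print(bitmap_disks(3, 16, 8192, stride=0, groups=6))   # [0, 2, 1, 0, 2, 1]
```

Under these assumptions, every bitmap lands on one disk without stride
in the 5-disk case, while stride=16 rotates them across all four data
disks.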

-- 
Regards,                                                   /~\ The ASCII
Richard Jackson                                            \ / Ribbon Campaign
Computer Systems Engineer,                                  X  Against HTML
Information Technology Unit, Technology Systems Division   / \ Email!
Enterprise Servers and Operations Department
George Mason University, Fairfax, Virginia