Proper alignment between disk HW blocks, mdadm strides, and ext[23] blocks

Andreas Dilger adilger at sun.com
Sat Nov 10 06:16:41 UTC 2007


On Nov 09, 2007  19:11 -0700, Chris Worley wrote:
> How do you measure/gauge/assure proper alignment?
> 
> The physical disk has a block structure.  What is it or how do you
> find it?  I'm guessing it's best to not partition disks in order to
> assure that whatever its block read/write is isn't bisected by the
> partition.

For Lustre we never partition the disks for exactly this reason, and if
you are using LVM/md on the whole device, partitioning doesn't make sense
either.

> Then, mdadm has some block structure.  The "-c" ("chunk") is in
> "kibibytes" (feed the dog kibbles?), with a default of 64.  Not a clue
> what they're trying to do.

That just means that for RAID 0/5/6 the amount of data or parity in a
stripe is a multiple of the chunk size, e.g. for a 4+1 RAID5 you get:

	disk0 disk1 disk2 disk3 disk4
	[64kB][64kB][64kB][64kB][64kB]
	[64kB][64kB]...
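
As a rough sketch of the arithmetic behind that layout (the variable names
are only for illustration, and the numbers are the 4+1 RAID5 with 64kB
chunks from the example, not anything read from a real array):

```shell
#!/bin/sh
# Illustrative arithmetic only: 5 disks in a 4+1 RAID5 with a 64kB chunk.
chunk_kb=64      # mdadm chunk size in kB (assumed, matches the example)
total_disks=5    # 4 data chunks + 1 parity chunk per stripe

# RAID5 spends one chunk per stripe on parity.
data_disks=$((total_disks - 1))

data_per_stripe_kb=$((chunk_kb * data_disks))
raw_per_stripe_kb=$((chunk_kb * total_disks))

echo "data per stripe:        ${data_per_stripe_kb}kB"
echo "data+parity per stripe: ${raw_per_stripe_kb}kB"
```

So one full stripe in this example holds 256kB of data plus one 64kB
parity chunk.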

> Finally, mkfs.ext[23] has a "stride", which is defined as a "stripe
> size" in the man page (and I thought all your stripes added together
> are a "stride"), as well as a block size.

For ext2/3/4 the stride should cover the same amount of data (in kB) as
the mdadm chunk size.  Note that the ext2/3/4 stride is given in units of
filesystem blocks, so if you have 4kB filesystem blocks (default for
filesystems > 500MB) and a 64kB RAID5 chunk size, this is 64kB / 4kB = 16:

	mke2fs -E stride=16 /dev/md0
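
A minimal sketch of that calculation, assuming the 64kB chunk and 4kB block
size from the example (the variable names are made up, and the script only
prints the resulting mke2fs command line, it does not run it):

```shell
#!/bin/sh
# Derive the ext2/3/4 stride (in filesystem blocks) from the RAID chunk size.
chunk_kb=64   # mdadm chunk size in kB (assumed)
block_kb=4    # filesystem block size in kB (assumed)

# The chunk should be a whole number of filesystem blocks.
if [ $((chunk_kb % block_kb)) -ne 0 ]; then
	echo "chunk size is not a multiple of the block size" >&2
	exit 1
fi

stride=$((chunk_kb / block_kb))
echo "stride=${stride}"

# Only print the command; creating a filesystem destroys existing data.
echo "mke2fs -E stride=${stride} /dev/md0"
```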

> It's important to make sure these all align properly, but their definitions
> do.

... do not?

> Could somebody please clarify... with an example?

Yes, I constantly wish the terminology were consistent across the different
tools, but sadly there isn't any "proper" terminology out there as far as
I've been able to see.

Cheers, Andreas
--
Andreas Dilger
Sr. Software Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



