[Linux-cluster] Distributed RAID

Gordan Bobic gordan at bobich.net
Thu Nov 6 09:38:02 UTC 2008


Michael O'Sullivan wrote:

> I have just read that GFS on mdadm does not work because mdadm is not 
> cluster aware. I really hoped to build a n + 1 RAID of the disks I have 
> presented to the RHCS nodes via iSCSI. I had a look at DDRAID which is 
> old and looks like it only supports 3, 5 and 9 disks in the distributed 
> RAID. I currently only have two (multipathed) devices, but I want them 
> to be active-active. If I put them into a mirrored logical volume in 
> CLVM, will this do the trick? Or will I have to install DRBD? Are there 
> any more up-to-date distributed RAID options available for when I want 
> to make a 2+1, 3+1, etc. storage array? There are some posts that say 
> this may be available in CLVM soon or that mdadm may be cluster aware 
> soon. Any progress on either of these options?

You probably saw me asking these very same questions in the archives, 
without any response.

DDRAID is unmaintained, and IIRC the code was removed from the current 
development tree a while back. So don't count on it ever getting 
resurrected.

I rather doubt md will become cluster aware any time soon. CLVM doesn't 
even support more fundamental features like snapshots yet, so I wouldn't 
count on it gaining anything more advanced.

For straight mirroring (which is all you could sensibly do with 2 nodes 
anyway), I can highly recommend DRBD. It "just works" and works well. I 
have a number of shared-root 2-node clusters deployed on it.

If you really want larger-scale clustering with n+m redundancy, have a 
look at cleversafe.org. There's a thread I started on the forum there 
looking into exactly this sort of thing. I'll be testing it in the next 
month or so, when I get the hardware together, but it looks plausible, 
and it provides proper n+m redundancy.

Another thing to note is that RAID5 is not really usable on today's big 
disks in arrays of more than about 6 drives. Think about the expected 
read failure rate on modern disks: around 10^-14 per bit, which works 
out to roughly one unrecoverable read error per 10TB read. So if you 
have a 6x1TB array and you lose a disk, you have to read 5TB of data to 
reconstruct onto a fresh disk. At one unrecoverable error per 10TB, 
that gives you roughly a 50/50 chance of another disk hitting an error 
during the rebuild, dropping out of the array and losing all your data. 
These days higher RAID levels are a necessity rather than an optional 
extra, and since read error rates have stayed constant while disk sizes 
have exploded, even RAID6 won't last long at this rate.
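For what it's worth, here is a rough back-of-the-envelope sketch of that 
calculation in Python. It is purely my own illustration (nothing from md 
or DRBD); the 10^-14 per-bit URE rate, the 6x1TB array and the decimal 
units are assumptions taken from the figures above. A Poisson-style 
estimate comes out a little under 50%, but it is the same ballpark as 
the naive 5TB-out-of-10TB figure:

import math

# Assumptions, taken from the example above:
URE_RATE = 1e-14        # unrecoverable read errors per bit read
DISK_TB = 1             # size of each disk in decimal TB (10^12 bytes)
SURVIVING_DISKS = 5     # disks that must be read in full to rebuild

# 5 TB expressed in bits
bits_to_read = SURVIVING_DISKS * DISK_TB * 1e12 * 8

# Expected number of unrecoverable errors during the rebuild (~0.4)
expected_errors = bits_to_read * URE_RATE

# Probability of at least one URE during the rebuild (Poisson approximation)
p_failure = 1 - math.exp(-expected_errors)

print(f"Expected UREs during rebuild: {expected_errors:.2f}")
print(f"Probability of >=1 URE (failed rebuild): {p_failure:.0%}")
# Prints roughly 0.40 expected errors and ~33% failure probability;
# the rough 5TB/10TB estimate rounds this up to "about 50/50".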

Gordan



