[Linux-cluster] Re: GFS on md on shared disks?
Ed L Cashin
ecashin at coraid.com
Thu Oct 7 18:39:24 UTC 2004
Ken Preslan <kpreslan at redhat.com> writes:
...
> Suppose Node A writes inode 23 and Node B writes inode 24 (both at the
> same time). The following sequence of events could occur:
>
> 1) Node A locks inode 23 exclusively
> 2) Node B locks inode 24 exclusively
> 3) Node A starts writing inode 23. This consists of:
> A) Reading the inode off of Disk 0
> B) Reading the parity block off of Disk 2
> C) XORing the old version of the Disk 0 block out of the Disk 2 block
> D) XORing the new version of the Disk 0 block into the Disk 2 block
> 4) Node B starts writing inode 24. This consists of:
> A) Reading the inode off of Disk 1
> B) Reading the parity block off of Disk 2
> C) XORing the old version of the Disk 1 block out of the Disk 2 block
> D) XORing the new version of the Disk 1 block into the Disk 2 block
> 5) Node A completes writing inode 23. This consists of:
> A) Writing the new block to Disk 0
> B) Writing the new parity block to Disk 2
> 6) Node A completes writing inode 24. This consists of:
That's node B if I am following you correctly.
> A) Writing the new block to Disk 1
> B) Writing the new parity block to Disk 2
>
> The problem is that you had two simultaneous read-modify-write operations
> on the parity block. Neither operation took the other one into account.
> So, the data in the non-parity blocks is correct, but the parity block is
> now corrupt. As long as you don't lose a disk, you're fine. But, as soon
> as a disk dies, the values you'll get from reading inode 23 and 24 will
> be completely bogus.
Thanks for the example. That's about as concrete as one could hope for!
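
For anyone who wants to poke at it, the sequence above can be replayed in a few lines of Python. This is only a sketch under my own assumptions: a three-disk stripe held in a dict, four-byte blocks, and made-up byte values. The names (xor, new_parity, simulate_race) are hypothetical and have nothing to do with md or GFS code.

```python
def xor(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def new_parity(old_parity, old_data, new_data):
    """RAID5 read-modify-write: XOR the old data out, the new data in."""
    return xor(xor(old_parity, old_data), new_data)


def simulate_race():
    # Disk 0 holds inode 23, Disk 1 holds inode 24, Disk 2 holds parity.
    old23 = bytes([0x11] * 4)
    old24 = bytes([0x22] * 4)
    disks = {0: old23, 1: old24, 2: xor(old23, old24)}

    new23 = bytes([0x5A] * 4)  # what Node A wants to write
    new24 = bytes([0xC3] * 4)  # what Node B wants to write

    # Steps 3 and 4: both nodes read the SAME old parity block.
    parity_a = new_parity(disks[2], disks[0], new23)
    parity_b = new_parity(disks[2], disks[1], new24)

    # Step 5: Node A writes its data block and its parity block.
    disks[0] = new23
    disks[2] = parity_a

    # Step 6: Node B writes its data block and overwrites the parity.
    disks[1] = new24
    disks[2] = parity_b
    return disks


disks = simulate_race()

# Parity no longer matches the data blocks.
parity_ok = disks[2] == xor(disks[0], disks[1])

# Pretend Disk 0 died and rebuild inode 23 from Disk 1 and parity.
rebuilt_inode23 = xor(disks[1], disks[2])
rebuild_ok = rebuilt_inode23 == disks[0]
```

Both parity_ok and rebuild_ok come out False: Node B's parity write never saw Node A's update, so the rebuild of Disk 0 does not return what Node A wrote.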
> A cluster aware software RAID5 implementation would lock stripes so that
> only one machine could modify a stripe at a time.
It sounds like it would be slow. Maybe not in a situation with
reader-writer locks, where writes were infrequent, though.
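
To make the stripe-locking idea concrete, here is a rough single-process sketch. It assumes a plain threading.Lock as a stand-in for whatever cluster-wide lock a real implementation would use; the Stripe class and its methods are hypothetical, not an existing API. Every writer to the stripe is serialized, which is exactly the cost I am worried about.

```python
import threading


def xor(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class Stripe:
    """One RAID5 stripe: a list of data blocks plus one parity block."""

    def __init__(self, blocks):
        self.data = list(blocks)
        parity = bytes(len(blocks[0]))  # all-zero block to start from
        for block in blocks:
            parity = xor(parity, block)
        self.parity = parity
        # Assumption: a local lock stands in for a cluster-wide stripe lock.
        self.lock = threading.Lock()

    def write_block(self, index, new_data):
        # The whole read-modify-write of data and parity happens under
        # the stripe lock, so no two writers can interleave.
        with self.lock:
            old_data = self.data[index]
            self.parity = xor(xor(self.parity, old_data), new_data)
            self.data[index] = new_data

    def parity_ok(self):
        expected = bytes(len(self.parity))
        for block in self.data:
            expected = xor(expected, block)
        return expected == self.parity


stripe = Stripe([bytes([0x11] * 4), bytes([0x22] * 4)])

# "Node A" and "Node B" as two threads writing different blocks.
node_a = threading.Thread(target=stripe.write_block, args=(0, bytes([0x5A] * 4)))
node_b = threading.Thread(target=stripe.write_block, args=(1, bytes([0xC3] * 4)))
node_a.start()
node_b.start()
node_a.join()
node_b.join()

consistent = stripe.parity_ok()
```

With the lock held across the read-modify-write, the parity stays consistent whichever thread goes first. Across machines the lock would need a network round trip per stripe write, and a reader-writer variant would only help if reads had to take the stripe lock too.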
--
Ed L Cashin <ecashin at coraid.com>