[dm-devel] Proper way to test RAID456?

Qu Wenruo quwenruo.btrfs at gmx.com
Sun Jan 9 12:13:36 UTC 2022



On 2022/1/9 18:04, David Woodhouse wrote:
> On Sun, 2022-01-09 at 07:55 +0800, Qu Wenruo wrote:
>> On 2022/1/9 04:29, Lukas Straub wrote:
>>> But there is an even simpler solution for btrfs: It could just not touch
>>> stripes that already contain data.
>>
>> That would waste a lot of space, if the fs is fragmented.
>>
>> Or we have to write into data stripes when free space is low.
>>
>> That's why I'm trying to implement a PPL-like journal for btrfs RAID56.
>
> PPL writes the P/Q of the unmodified chunks from the stripe, doesn't
> it?

Did I miss something, or is PPL not what I thought?

I thought PPL either:

a) Just writes a metadata entry into the journal to indicate that a full
    stripe (along with its location) is going to be written.

b) Writes a metadata entry into the journal about a non-full stripe
    write, then writes the new data and new P/Q into the journal.

And this happens before we start any data/P/Q write to the stripe itself.

And after the related data/P/Q write is finished, the corresponding
metadata and data entries are removed from the journal.
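
To make the ordering concrete, here is a minimal Python sketch of the
journal behaviour described in (a) and (b) above. It is not md's actual
PPL code and not btrfs code: the Journal class, the in-memory "stripes"
dict and all function names are made up for illustration, and only P
(plain XOR) is modelled, not Q.

```python
from dataclasses import dataclass, field
from functools import reduce


def xor_parity(chunks):
    """Byte-wise XOR (RAID5-style P) of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))


@dataclass
class Journal:
    # Hypothetical in-memory stand-in for an on-disk journal.
    entries: dict = field(default_factory=dict)

    def log_full_stripe(self, stripe_no):
        # Case (a): metadata only, no payload.
        self.entries[stripe_no] = {"full": True}

    def log_partial_stripe(self, stripe_no, new_data, new_p):
        # Case (b): metadata plus the new data and the new P.
        self.entries[stripe_no] = {"full": False,
                                   "data": dict(new_data), "p": new_p}

    def commit(self, stripe_no):
        # The stripe write finished, so drop the journal entry.
        self.entries.pop(stripe_no, None)


def partial_write(journal, stripes, stripe_no, updates):
    """Journal first, then overwrite the stripe in place, then commit."""
    stripe = stripes[stripe_no]
    new_chunks = list(stripe["data"])
    for idx, buf in updates.items():
        new_chunks[idx] = buf
    new_p = xor_parity(new_chunks)
    journal.log_partial_stripe(stripe_no, updates, new_p)
    stripe["data"] = new_chunks
    stripe["p"] = new_p
    journal.commit(stripe_no)


def replay(journal, stripes):
    """After a crash, redo every write that is still in the journal."""
    for stripe_no, entry in list(journal.entries.items()):
        stripe = stripes[stripe_no]
        if entry["full"]:
            # Assumption: no old data to protect, so just recompute P.
            stripe["p"] = xor_parity(stripe["data"])
        else:
            for idx, buf in entry["data"].items():
                stripe["data"][idx] = buf
            stripe["p"] = entry["p"]
        journal.commit(stripe_no)
```

If the crash happens between log_partial_stripe() and commit(), replay()
rewrites the journaled data and P, so the stripe never stays in a state
where data and parity disagree.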

Or does PPL have an even better solution?
>
> An alternative in a true file system which can do its own block
> allocation is to just calculate the P/Q of the final stripe after it's
> been modified, and write those (and) the updated data out to newly-
> allocated blocks instead of overwriting the original.

This is what Johannes is considering, but for a different purpose.
Johannes' idea is to support zoned devices, as the physical location of
a zoned append write is only known after it's written.

So his idea is to maintain another mapping tree for zoned writes, so
that full stripe updates will also happen in that tree.
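
As a rough illustration of why the extra mapping is needed, here is a
small Python sketch. Zone and MappingTree are hypothetical names, not
btrfs structures; the only point is that the device picks the physical
location on append, so the logical-to-physical entry can only be
recorded after the write returns.

```python
class Zone:
    """Hypothetical model of a sequential-write zone."""

    def __init__(self, start, capacity):
        self.start = start        # first physical block of the zone
        self.capacity = capacity  # number of blocks the zone can hold
        self.blocks = []          # blocks written so far, in order

    def append(self, block):
        # The caller does not choose the location; the zone does.
        if len(self.blocks) >= self.capacity:
            raise OSError("zone full")
        self.blocks.append(block)
        return self.start + len(self.blocks) - 1


class MappingTree:
    """Hypothetical logical -> physical map, updated after each append."""

    def __init__(self):
        self.l2p = {}

    def write(self, zone, logical, block):
        physical = zone.append(block)  # location is known only now
        self.l2p[logical] = physical   # so record it afterwards
        return physical
```

Rewriting the same logical block lands at a new physical block, and the
old one simply becomes stale until the zone is reset.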

But that idea is still in the future. On the other hand, I still prefer
some tried-and-true method, as I'm 100% sure there will be new
difficulties waiting for us with the new mapping tree method.

Thanks,
Qu

>
> Then the final step is to free the original data blocks and P/Q.
>
> This means that your RAID stripes no longer have a fixed topology; you
> need metadata to be able to *find* the component data and P/Q chunks...
> it ends up being non-trivial, but it has attractive properties if we
> can work it out.
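
The copy-on-write update quoted above could look roughly like the
following Python sketch. CowStripeStore and its fields are invented for
illustration, only P (XOR) is computed since Q would need GF(2^8)
arithmetic, and real code would have to make the metadata switch atomic
on disk.

```python
from functools import reduce


def xor_parity(chunks):
    """Byte-wise XOR (RAID5-style P) of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))


class CowStripeStore:
    """Hypothetical store where a stripe is found only via metadata."""

    def __init__(self):
        self.blocks = {}      # physical block number -> contents
        self.next_free = 0    # trivial bump allocator
        self.stripe_map = {}  # stripe id -> {"data": [blk...], "p": blk}

    def _alloc(self, buf):
        blk = self.next_free
        self.next_free += 1
        self.blocks[blk] = buf
        return blk

    def create(self, sid, chunks):
        data = [self._alloc(c) for c in chunks]
        self.stripe_map[sid] = {"data": data,
                                "p": self._alloc(xor_parity(chunks))}

    def update(self, sid, idx, buf):
        old = self.stripe_map[sid]
        # Compute P of the final stripe, i.e. after the modification.
        chunks = [self.blocks[b] for b in old["data"]]
        chunks[idx] = buf
        # Write the new data and new P to newly allocated blocks.
        new_data = list(old["data"])
        new_data[idx] = self._alloc(buf)
        new_p = self._alloc(xor_parity(chunks))
        # Switch the metadata over to the new blocks.
        self.stripe_map[sid] = {"data": new_data, "p": new_p}
        # Final step: free the original data block and the original P.
        del self.blocks[old["data"][idx]]
        del self.blocks[old["p"]]
```

Until the metadata switch, the old stripe is untouched and still
self-consistent, which is what removes the need to overwrite in place.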
