[dm-devel] dm-bufio

Tue Mar 27 09:56:35 UTC 2012

On Sat, Mar 24, 2012 at 3:07 AM, Mikulas Patocka <mpatocka at redhat.com> wrote:
>
>
> On Sat, 24 Mar 2012, Kasatkin, Dmitry wrote:
>
>> Hi,
>>
>> Thanks for the clarification.
>> Indeed, everything works with just dm_bufio_write_dirty_buffers().
>> The reboot notifier is there only to issue the flush.
>> As I understand it, dm-bufio will do the flush itself, but currently only
>> once per 10 seconds.
>>
>> If the data on one block device and the metadata on another block device
>> get out of sync, what can you do then?
>> How does a journal help in that case?
>>
>> - Dmitry
>
> It depends what you're trying to do.
>
> If you're trying to do something like "dm-verity", but with a possibility
> to write to the device, there are several possibilities:
>
> * keep two checksums per 512-byte sector, the old checksum and the new
> checksum. If you update the block, you update the new checksum, sync the
> metadata device and then write to the data device (obviously you need to
> batch this update-sync-write for many blocks written concurrently to get
> decent performance). When you verify a block, you allow either checksum to
> match. When you sync caches on the data device, you must forget all the
> old checksums.
>

Right. It requires double the space and more I/O.
Then it will certainly be more robust against failures.
But what if a data block is corrupted during the write, due to a power or
other failure?
In that case, obviously, neither checksum will match...
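
To make the three cases concrete, here is a minimal user-space sketch of the
two-checksum scheme. It is only an illustration under my own assumptions, not
dm-bufio or dm-verity code: the class name, the in-memory "devices", the use
of SHA-256 and the crash_before_data/torn parameters are all hypothetical,
and it assumes at most one update per sector between flushes.

```python
import hashlib

SECTOR = 512  # sector size taken from the scheme described above


def digest(data: bytes) -> bytes:
    # SHA-256 is an arbitrary choice for this sketch.
    return hashlib.sha256(data).digest()


class TwoChecksumStore:
    """Toy model: one [old, new] checksum pair per sector (hypothetical)."""

    def __init__(self, nsectors: int):
        blank = bytes(SECTOR)
        self.data = [blank] * nsectors                      # "data device"
        self.meta = [[digest(blank), digest(blank)]         # "metadata device"
                     for _ in range(nsectors)]

    def update(self, i, payload, crash_before_data=False, torn=False):
        # Step 1: record the new checksum (a real target would now sync
        # the metadata device before touching the data device).
        self.meta[i][1] = digest(payload)
        if crash_before_data:
            return  # simulated crash: data device still holds the old sector
        if torn:
            # Simulated torn write: only the first half of the sector lands.
            half = SECTOR // 2
            payload = payload[:half] + self.data[i][half:]
        # Step 2: write the data.
        self.data[i] = payload

    def flush(self):
        # After the data device's caches are synced, forget old checksums.
        for pair in self.meta:
            pair[0] = pair[1]

    def verify(self, i) -> bool:
        # Either checksum is allowed to match.
        return digest(self.data[i]) in self.meta[i]
```

With this model, a crash before the data write still verifies (old checksum
matches), a completed write verifies (new checksum matches), and a torn write
fails verification because it matches neither, which is the case I am asking
about.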

> * use journaling, write data block and its checksum to a journal. If the
> computer crashes, you just replay the journal (so you don't care what data
> was present at that place, you overwrite it with data from the journal).
> The downside is that this doubles the required I/O throughput, so you
> should have the journal and the data on different devices.
>

That definitely looks more reliable.
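
As I understand the idea, replay could look roughly like the sketch below.
This is a simplified user-space model, not kernel code: the record layout
(sector, payload, checksum) and the function names are hypothetical, and
stopping at the first record whose checksum does not match is my assumption
about how a torn journal tail would be handled.

```python
import hashlib


def digest(data: bytes) -> bytes:
    # SHA-256 is an arbitrary choice for this sketch.
    return hashlib.sha256(data).digest()


def journal_append(journal, sector, payload):
    # Each record carries the data block together with its checksum.
    journal.append((sector, payload, digest(payload)))


def replay(journal, data, checksums):
    """Overwrite data and checksums from the journal after a crash."""
    for sector, payload, recorded in journal:
        if digest(payload) != recorded:
            break  # assumed convention: a damaged record ends the replay
        # Whatever was at this place before does not matter.
        data[sector] = payload
        checksums[sector] = recorded
    journal.clear()
```

Here `data` and `checksums` are plain dicts standing in for the data device
and the hash device.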

> * do nothing and rebuild the checksums in case of crash. It is simplest,
> but it doesn't protect from data damage that happens due to the crash (for
> example, some SSDs corrupt their metadata on power failure and you can't
> detect this if you rebuild checksums after a power failure).
>

easy and nice :)

>> Yes, I am aware of the dm-verity target.
>> It suits read-only cases well.
>> It is questionable how a tree-based approach would work with read-write.
>> Every single update would cause recalculation of the whole tree.
>
> A write would recalculate hashes only in the branch from tree bottom to
> tree top. The obvious downside is that there is no protection from a crash.
>

Yes.. I noticed.
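
For reference, the bottom-to-top update can be sketched like this. It is a
toy binary hash tree kept in a Python list, under my own assumptions: the
class name is hypothetical, the number of blocks is assumed to be a power of
two, and the fan-out of two is a simplification (a real on-disk tree would
normally pack many hashes into each hash block).

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class HashTree:
    """Binary hash tree in heap layout: node 1 is the root, the children
    of node i are 2*i and 2*i+1, and the leaves occupy n .. 2n-1."""

    def __init__(self, blocks):
        n = len(blocks)  # assumed to be a power of two
        self.n = n
        self.nodes = [b""] * (2 * n)
        for i, block in enumerate(blocks):
            self.nodes[n + i] = h(block)
        for i in range(n - 1, 0, -1):
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])

    @property
    def root(self) -> bytes:
        return self.nodes[1]

    def update(self, index, block) -> int:
        """Rehash only the branch from the leaf to the root.
        Returns the number of nodes that were recalculated."""
        i = self.n + index
        self.nodes[i] = h(block)
        touched = 1
        i //= 2
        while i >= 1:
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])
            touched += 1
            i //= 2
        return touched
```

With 8 blocks, one write recalculates 4 of the 15 nodes (the leaf plus three
levels above it), so the cost grows with the tree depth and not with the
number of blocks.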

>
> BTW. regarding that reboot notifier with
> "dm_bufio_write_dirty_buffers(d->bufio)" ... there could be another
> problem ... what if another reboot notifier (maybe for a completely
> different driver) writes to the device after
> "dm_bufio_write_dirty_buffers(d->bufio)" was performed?
>

The "target" owns the device and is the only one writing to it...
What other driver would do it?
Also, the rootfs is remounted read-only before rebooting.
It is actually not about syncing the data device, but about syncing the
"hash device".

> - would it be possible to install your notifier again?
>
> - or switch to synchronous updates? --- i.e. set a flag in your reboot
> notifier and if the flag is on, call
> "dm_bufio_write_dirty_buffers(d->bufio)" after every write.
>

I see the idea.
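
Roughly, as a user-space model of the flag approach (a sketch only: the
class and its methods are hypothetical stand-ins, and write_dirty_buffers()
here merely imitates what calling dm_bufio_write_dirty_buffers(d->bufio)
would do in the target):

```python
class HashDeviceCache:
    """Toy write-back cache for the hash device (hypothetical model)."""

    def __init__(self):
        self.dirty = {}         # buffers not yet written out
        self.disk = {}          # contents of the "hash device"
        self.sync_mode = False  # set by the reboot notifier

    def write_dirty_buffers(self):
        # Stand-in for dm_bufio_write_dirty_buffers(d->bufio).
        self.disk.update(self.dirty)
        self.dirty.clear()

    def on_reboot(self):
        # Reboot notifier: flush once and switch to synchronous updates.
        self.sync_mode = True
        self.write_dirty_buffers()

    def write(self, block, value):
        self.dirty[block] = value
        if self.sync_mode:
            # After the notifier has run, every write is flushed at once,
            # so a late write from elsewhere is not left in the cache.
            self.write_dirty_buffers()
```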

> Mikulas
>

Thanks,

Dmitry
