[dm-devel] [PATCH] dm-verity: Fix FEC for RS roots non-aligned to block size
Sami Tolvanen
samitolvanen at google.com
Tue Feb 23 16:51:01 UTC 2021
Hi Milan,
On Mon, Feb 22, 2021 at 1:15 PM Milan Broz <gmazyland at gmail.com> wrote:
>
> Optional Forward Error Correction (FEC) code in dm-verity uses
> Reed-Solomon code and should support roots from 2 to 24.
>
> The error correction parity bytes (of roots lengths per RS block) are stored
> on a separate device in sequence without any padding.
>
> Currently, to access FEC device, the dm-verity-fec code uses dm-bufio client
> with block size set to verity data block (usually 4096 or 512 bytes).
>
> Because this block size is not divisible by some (most!) of the roots
> supported lengths, data repair cannot work for partially stored
> parity bytes.
Ah, good catch!
> This patch changes FEC device dm-bufio block size to "roots << SECTOR_SHIFT"
> where we can be sure that the full parity data is always available.
> (There cannot be partial FEC blocks because parity must cover whole sectors.)
>
> Because the optional FEC starting offset could be unaligned to this
> new block size, we have to use dm_bufio_set_sector_offset() to configure it.
>
> The problem is easily reproducible using veritysetup,
> here for example for roots=13:
>
> # create verity device with RS FEC
> dd if=/dev/urandom of=data.img bs=4096 count=8 status=none
> veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=13 | awk '/^Root hash/{ print $3 }' >roothash
>
> # create an erasure that should be always repairable with this roots setting
> dd if=/dev/zero of=data.img conv=notrunc bs=1 count=8 seek=4088 status=none
>
> # try to read it through dm-verity
> veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=13 $(cat roothash)
> dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer
> # wait for possible recursive recovery in kernel
> udevadm settle
> veritysetup close test
>
> Without it, FEC code usually ends on unrecoverable failure in RS decoder:
> device-mapper: verity-fec: 7:1: FEC 0: failed to correct: -74
> ...
>
> With the patch, errors are properly repaired.
> device-mapper: verity-fec: 7:1: FEC 0: corrected 8 errors
> ...
>
> This problem is present in all kernels since the FEC code introduction (kernel 4.5).
>
> AFAIK the problem is not visible in Android ecosystem because it always
> use default RS roots=2.
That's correct, Android always uses the default value to minimize
space overhead, which is why we never ran into this.
>
> Signed-off-by: Milan Broz <gmazyland at gmail.com>
> ---
> drivers/md/dm-verity-fec.c | 23 +++++++++++++----------
> 1 file changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
> index fb41b4f23c48..be170581eb69 100644
> --- a/drivers/md/dm-verity-fec.c
> +++ b/drivers/md/dm-verity-fec.c
> @@ -61,18 +61,18 @@ static int fec_decode_rs8(struct dm_verity *v, struct dm_verity_fec_io *fio,
> static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index,
> unsigned *offset, struct dm_buffer **buf)
> {
> - u64 position, block;
> + u64 position, block, rem;
> u8 *res;
>
> position = (index + rsb) * v->fec->roots;
> - block = position >> v->data_dev_block_bits;
> - *offset = (unsigned)(position - (block << v->data_dev_block_bits));
> + block = div64_u64_rem(position, v->fec->roots << SECTOR_SHIFT, &rem);
> + *offset = (unsigned)rem;
>
> - res = dm_bufio_read(v->fec->bufio, v->fec->start + block, buf);
> + res = dm_bufio_read(v->fec->bufio, block, buf);
> if (IS_ERR(res)) {
> DMERR("%s: FEC %llu: parity read failed (block %llu): %ld",
> v->data_dev->name, (unsigned long long)rsb,
> - (unsigned long long)(v->fec->start + block),
> + (unsigned long long)(block),
Nit: parentheses not needed around block.
> PTR_ERR(res));
> *buf = NULL;
> }
> @@ -155,7 +155,7 @@ static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_fec_io *fio,
>
> /* read the next block when we run out of parity bytes */
> offset += v->fec->roots;
> - if (offset >= 1 << v->data_dev_block_bits) {
> + if (offset >= (v->fec->roots << SECTOR_SHIFT)) {
Same here, but I suppose it might help with readability.
> dm_bufio_release(buf);
>
> par = fec_read_parity(v, rsb, block_offset, &offset, &buf);
> @@ -674,7 +674,7 @@ int verity_fec_ctr(struct dm_verity *v)
> {
> struct dm_verity_fec *f = v->fec;
> struct dm_target *ti = v->ti;
> - u64 hash_blocks;
> + u64 hash_blocks, start_blocks, fec_blocks;
> int ret;
>
> if (!verity_fec_is_enabled(v)) {
> @@ -744,15 +744,18 @@ int verity_fec_ctr(struct dm_verity *v)
> }
>
> f->bufio = dm_bufio_client_create(f->dev->bdev,
> - 1 << v->data_dev_block_bits,
> + f->roots << SECTOR_SHIFT,
> 1, 0, NULL, NULL);
> if (IS_ERR(f->bufio)) {
> ti->error = "Cannot initialize FEC bufio client";
> return PTR_ERR(f->bufio);
> }
>
> - if (dm_bufio_get_device_size(f->bufio) <
> - ((f->start + f->rounds * f->roots) >> v->data_dev_block_bits)) {
> + dm_bufio_set_sector_offset(f->bufio, f->start << v->data_dev_block_bits >> SECTOR_SHIFT);
> +
> + start_blocks = div64_u64(f->start << v->data_dev_block_bits, v->fec->roots << SECTOR_SHIFT);
> + fec_blocks = div64_u64(f->rounds * f->roots, v->fec->roots << SECTOR_SHIFT);
> + if (dm_bufio_get_device_size(f->bufio) < (start_blocks + fec_blocks)) {
> ti->error = "FEC device is too small";
> return -E2BIG;
> }
I haven't tested the code, but it looks correct to me. Thanks for fixing this!
Reviewed-by: Sami Tolvanen <samitolvanen at google.com>
Sami
More information about the dm-devel
mailing list