[linux-lvm] LVM RAID5 out-of-sync recovery

Slava Prisivko vprisivko at gmail.com
Thu Oct 13 20:44:19 UTC 2016


On Wed, Oct 12, 2016 at 10:02 AM Giuliano Procida
<giuliano.procida at gmail.com> wrote:

> On 9 October 2016 at 20:00, Slava Prisivko <vprisivko at gmail.com> wrote:
>
> > I tried to reassemble the array using 3 different pairs of correct LV
> > images, but it doesn't work (I am sure because I cannot luksOpen a LUKS
> > image which is in the LV, which is almost surely uncorrectable).
>
> I would hope that a luks volume would at least be recognisable using
> file -s. If you extract the image data into a regular file you should
> be able to losetup that and then luksOpen the loop device.
>
Yes, it's recognizable. I can run luksDump and luksOpen, but for the latter
the password just doesn't work. Since cryptsetup works on files just as
well as on devices, the loop device doesn't add anything, but I tried it
anyway just to be sure and, as expected, it doesn't work either.
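(For completeness, this is roughly what I tried; /tmp/vg-test.img is a
placeholder for the manually reassembled LV image:)

    losetup --find --show /tmp/vg-test.img       # prints e.g. /dev/loop0
    cryptsetup luksDump /dev/loop0               # header dumps fine
    cryptsetup luksOpen /dev/loop0 test-luks     # but the password is rejected
    # cryptsetup also accepts the file directly, with the same result:
    cryptsetup luksOpen /tmp/vg-test.img test-luks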

> > This is as useful as it gets (-vvvv -dddd):
> >
> >     Loading vg-test_rmeta_0 table (253:35)
> >         Adding target to (253:35): 0 8192 linear 8:34 2048
> >         dm table   (253:35) [ opencount flush ]   [16384] (*1)
> >     Suppressed vg-test_rmeta_0 (253:35) identical table reload.
> >     Loading vg-test_rimage_0 table (253:36)
> >         Adding target to (253:36): 0 65536 linear 8:34 10240
> >         dm table   (253:36) [ opencount flush ]   [16384] (*1)
> >     Suppressed vg-test_rimage_0 (253:36) identical table reload.
> >     Loading vg-test_rmeta_1 table (253:37)
> >         Adding target to (253:37): 0 8192 linear 8:2 1951688704
> >         dm table   (253:37) [ opencount flush ]   [16384] (*1)
> >     Suppressed vg-test_rmeta_1 (253:37) identical table reload.
> >     Loading vg-test_rimage_1 table (253:38)
> >         Adding target to (253:38): 0 65536 linear 8:2 1951696896
> >         dm table   (253:38) [ opencount flush ]   [16384] (*1)
> >     Suppressed vg-test_rimage_1 (253:38) identical table reload.
> >     Loading vg-test_rmeta_2 table (253:39)
> >         Adding target to (253:39): 0 8192 linear 8:18 1217423360
> >         dm table   (253:39) [ opencount flush ]   [16384] (*1)
> >     Suppressed vg-test_rmeta_2 (253:39) identical table reload.
> >     Loading vg-test_rimage_2 table (253:40)
> >         Adding target to (253:40): 0 65536 linear 8:18 1217431552
> >         dm table   (253:40) [ opencount flush ]   [16384] (*1)
> >     Suppressed vg-test_rimage_2 (253:40) identical table reload.
> >     Creating vg-test
> >         dm create vg-test LVM-Pgjp5f2PRJipxvoNdsYmq0olg9iWwY5pJjiPmiesfxvdeF5zMvTsJC6vFfqNgNnZ [ noopencount flush ]   [16384] (*1)
> >     Loading vg-test table (253:84)
> >         Adding target to (253:84): 0 131072 raid raid5_ls 3 128 region_size 1024 3 253:35 253:36 253:37 253:38 253:39 253:40
> >         dm table   (253:84) [ opencount flush ]   [16384] (*1)
> >         dm reload   (253:84) [ noopencount flush ]   [16384] (*1)
> >   device-mapper: reload ioctl on (253:84) failed: Invalid argument
> >
> > I don't see any problems here.
>
> In my case I got (for example, and Gmail is going to fold the lines,
> sorry):
>
> [...]
>     Loading vg0-photos table (254:45)
>         Adding target to (254:45): 0 1258291200 raid raid6_zr 3 128 region_size 1024 5 254:73 254:74 254:37 254:38 254:39 254:40 254:41 254:42 254:43 254:44
>         dm table   (254:45) [ opencount flush ]   [16384] (*1)
>         dm reload   (254:45) [ noopencount flush ]   [16384] (*1)
>   device-mapper: reload ioctl on (254:45) failed: Invalid argument
>
> The actual errors are in the kernel logs:
>
> [...]
> [144855.931712] device-mapper: raid: New device injected into existing array without 'rebuild' parameter specified
> [144855.935523] device-mapper: table: 254:45: raid: Unable to assemble array: Invalid superblocks
> [144855.939290] device-mapper: ioctl: error adding target to table
>

I had the following the first time:
[   74.743051] device-mapper: raid: Failed to read superblock of device at position 1
[   74.761094] md/raid:mdX: device dm-73 operational as raid disk 2
[   74.765707] md/raid:mdX: device dm-67 operational as raid disk 0
[   74.770911] md/raid:mdX: allocated 3219kB
[   74.773571] md/raid:mdX: raid level 5 active with 2 out of 3 devices, algorithm 2
[   74.775964] RAID conf printout:
[   74.775968]  --- level:5 rd:3 wd:2
[   74.775971]  disk 0, o:1, dev:dm-67
[   74.775973]  disk 2, o:1, dev:dm-73
[   74.793120] created bitmap (1 pages) for device mdX
[   74.822333] mdX: bitmap initialized from disk: read 1 pages, set 2 of 64 bits

After that I had only the previously mentioned errors in the kernel log:

device-mapper: table: 253:84: raid: Cannot change device positions in RAID array
device-mapper: ioctl: error adding target to table

>
> 128 means 128 * 512-byte sectors, so this is a 64k chunk size, as in
> your case. I was able to verify that my extracted images matched the
> RAID device. My problem was not assembling the array; it was that the
> array would be rebuilt on every subsequent use:
>
>     Loading vg0-var table (254:21)
>         Adding target to (254:21): 0 52428800 raid raid5_ls 5 128 region_size 1024 rebuild 0 5 254:11 254:12 254:13 254:14 254:15 254:16 254:17 254:18 254:19 254:20
>         dm table   (254:21) [ opencount flush ]   [16384] (*1)
>         dm reload   (254:21) [ noopencount flush ]   [16384] (*1)
>         Table size changed from 0 to 52428800 for vg0-var (254:21).
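(For reference, a byte-for-byte check like that can be done with cmp once
the reassembled data is in a file; the path below is hypothetical:)

    cmp /tmp/vg0-var.reassembled /dev/mapper/vg0-var && echo identical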
>
> >> You can check the rmeta superblocks with
> >> https://drive.google.com/open?id=0B8dHrWSoVcaDUk0wbHQzSEY3LTg
> >
> > Thanks, it's very useful!
> >
> > /dev/mapper/vg-test_rmeta_0
> > found RAID superblock at offset 0
> >  magic=1683123524
> >  features=0
> >  num_devices=3
> >  array_position=0
> >  events=56
> >  failed_devices=0
> >  disk_recovery_offset=18446744073709551615
> >  array_resync_offset=18446744073709551615
> >  level=5
> >  layout=2
> >  stripe_sectors=128
> > found bitmap file superblock at offset 4096:
> >          magic: 6d746962
> >        version: 4
> >           uuid: 00000000.00000000.00000000.00000000
> >         events: 56
> > events cleared: 33
> >          state: 00000000
> >      chunksize: 524288 B
> >   daemon sleep: 5s
> >      sync size: 32768 KB
> > max write behind: 0
> >
> > /dev/mapper/vg-test_rmeta_1
> > found RAID superblock at offset 0
> >  magic=1683123524
> >  features=0
> >  num_devices=3
> >  array_position=4294967295
> >  events=62
> >  failed_devices=1
> >  disk_recovery_offset=0
> >  array_resync_offset=18446744073709551615
> >  level=5
> >  layout=2
> >  stripe_sectors=128
> > found bitmap file superblock at offset 4096:
> >          magic: 6d746962
> >        version: 4
> >           uuid: 00000000.00000000.00000000.00000000
> >         events: 60
> > events cleared: 33
> >          state: 00000000
> >      chunksize: 524288 B
> >   daemon sleep: 5s
> >      sync size: 32768 KB
> > max write behind: 0
> >
> > /dev/mapper/vg-test_rmeta_2
> > found RAID superblock at offset 0
> >  magic=1683123524
> >  features=0
> >  num_devices=3
> >  array_position=2
> >  events=62
> >  failed_devices=1
> >  disk_recovery_offset=18446744073709551615
> >  array_resync_offset=18446744073709551615
> >  level=5
> >  layout=2
> >  stripe_sectors=128
> > found bitmap file superblock at offset 4096:
> >          magic: 6d746962
> >        version: 4
> >           uuid: 00000000.00000000.00000000.00000000
> >         events: 62
> > events cleared: 33
> >          state: 00000000
> >      chunksize: 524288 B
> >   daemon sleep: 5s
> >      sync size: 32768 KB
> > max write behind: 0
> >
> > The problem I see here is that the events count is different for the
> > three rmetas.
>
> The event counts relate to the intent bitmap (I believe).
>
> That looks OK, because failed_devices is 1, meaning 0b0...01; i.e.,
> device 0 of the array is "failed". The real problem is device 1, which
> has
>
> >  array_position=4294967295
>
> This should be 1 instead; 4294967295 is 32-bit unsigned 0xf...f. It may
> be that it has special significance in kernel or LVM code. I've not
> checked beyond noticing one test: role < 0.
>
> I recommend using diff3 or pairwise diff on the metadata dumps to
> ensure you have not missed any other differences.
>
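(With each dump saved to a text file, that check is a one-liner; the file
names are placeholders:)

    diff3 rmeta_0.txt rmeta_1.txt rmeta_2.txt
    # or pairwise:
    diff rmeta_0.txt rmeta_1.txt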
>
> One possible way forward:
>
> (Optionally) adapt my resync code so it writes back to the original
> files instead of outputting corrected linear data.
>
> Modify the rmeta data to remove the failed flag and reset the bad
> position to the correct value. sync and power off (or otherwise
> prevent the device mapper from writing back bad data).
>
> It's possible the RAID volume will fail to sync due to bitmap
> inconsistencies. I don't know how to re-write the superblocks to say
> "trust me, all data are in sync".
>
Thanks for the tip! But how would that help if the manual data reassembly
using your code doesn't work either? I don't see what fixing the metadata
could do about that.
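That said, if I do end up trying the metadata route, I suppose the edit
would look something like the following, done on a copy of the rmeta LV.
The offsets are taken from struct dm_raid_superblock in
drivers/md/dm-raid.c and should be double-checked against the running
kernel before writing anything back:

    # work on a copy, never on the live rmeta LV
    dd if=/dev/mapper/vg-test_rmeta_1 of=/tmp/rmeta_1.img
    # array_position is a le32 at offset 12: 4294967295 -> 1
    printf '\001\000\000\000' | dd of=/tmp/rmeta_1.img bs=1 seek=12 conv=notrunc
    # failed_devices is a le64 at offset 24: 1 -> 0 (rmeta_2 needs the same)
    printf '\000\000\000\000\000\000\000\000' | dd of=/tmp/rmeta_1.img bs=1 seek=24 conv=notrunc
    # the events counters (le64 at offset 16, plus the bitmap superblock
    # at offset 4096) still disagree between legs and may also need to be
    # made consistent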
