[linux-lvm] Raid 10 - recovery after a disk failure
Pavlik Kirilov
pavllik at yahoo.ca
Mon Feb 1 17:06:42 UTC 2016
The method for restoring RAID 10 that I posted in my previous email works very well for RAID 5 on 4 PVs. I have tried the "--uuid" method from the link you sent me many times, and I always end up with destroyed data. Here is the output of the tests I performed:
## Ubuntu VM with 4 new disks (qcow files) vda,vdb,vdc,vdd, one physical partition per disk.
vgcreate vg_data /dev/vda1 /dev/vdb1 /dev/vdc1 /dev/vdd1
Volume group "vg_data" successfully created
lvcreate --type raid10 -L3g -i 2 -I 256 -n lv_r10 vg_data /dev/vda1:1-900 /dev/vdb1:1-900 /dev/vdc1:1-900 /dev/vdd1:1-900
Logical volume "lv_r10" created
mkfs.ext4 /dev/vg_data/lv_r10
mount /dev/vg_data/lv_r10 /mnt/r10/
mount | grep vg_data
/dev/mapper/vg_data-lv_r10 on /mnt/r10 type ext4 (rw)
echo "some data" > /mnt/r10/testr10.txt
dmesg -T | tail -n 70
------------------
[ 3822.367551] EXT4-fs (dm-8): mounted filesystem with ordered data mode. Opts: (null)
[ 3851.317428] md: mdX: resync done.
[ 3851.440927] RAID10 conf printout:
[ 3851.440935] --- wd:4 rd:4
[ 3851.440941] disk 0, wo:0, o:1, dev:dm-1
[ 3851.440945] disk 1, wo:0, o:1, dev:dm-3
[ 3851.440949] disk 2, wo:0, o:1, dev:dm-5
[ 3851.440953] disk 3, wo:0, o:1, dev:dm-7
lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi-aor-- 3.00g 100.00 lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data iwi-aor-- 1.50g /dev/vda1(2)
[lv_r10_rimage_1] vg_data iwi-aor-- 1.50g /dev/vdb1(2)
[lv_r10_rimage_2] vg_data iwi-aor-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data iwi-aor-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi-aor-- 4.00m /dev/vda1(1)
[lv_r10_rmeta_1] vg_data ewi-aor-- 4.00m /dev/vdb1(1)
[lv_r10_rmeta_2] vg_data ewi-aor-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi-aor-- 4.00m /dev/vdd1(1)
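For reference, the 1.50g rimage sizes shown above follow directly from the lvcreate options: with 4 PVs and -i 2, the LV is two stripes, each mirrored, so every rimage holds half the logical size. A quick arithmetic sketch:

```shell
# RAID10 sizing sketch: -L3g with -i 2 means two stripes, each mirrored,
# so each rimage sub-LV holds half of the LV's logical size.
lv_size_mib=$((3 * 1024))          # -L3g
stripes=2                          # -i 2
rimage_mib=$((lv_size_mib / stripes))
echo "${rimage_mib} MiB per rimage"   # 1536 MiB = the 1.50g shown by lvs
```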
###
### Shutting down, replacing vdb with a new disk, starting the system ###
###
lvs -a -o +devices
Couldn't find device with uuid GjkgzF-18Ls-321G-SaDW-4vp0-d04y-Gd4xRp.
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi---r-p 3.00g lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data Iwi---r-- 1.50g /dev/vda1(2)
[lv_r10_rimage_1] vg_data Iwi---r-p 1.50g unknown device(2)
[lv_r10_rimage_2] vg_data Iwi---r-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data Iwi---r-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi---r-- 4.00m /dev/vda1(1)
[lv_r10_rmeta_1] vg_data ewi---r-p 4.00m unknown device(1)
[lv_r10_rmeta_2] vg_data ewi---r-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi---r-- 4.00m /dev/vdd1(1)
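Note the trailing "p" in the attr column above: the ninth lvs attribute character flags a partial LV, i.e. one whose underlying PVs are missing. A minimal way to check for it in a script (attr string copied from the output above):

```shell
# The 9th character of the lvs attr field is 'p' when the LV is partial
# (one or more of its PVs are missing).
attr="rwi---r-p"                       # taken from the degraded lvs output
flag=$(printf '%s' "$attr" | cut -c9)
if [ "$flag" = "p" ]; then
    echo "LV is partial: a PV is missing"
fi
```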
grep description /etc/lvm/backup/vg_data
description = "Created *after* executing 'lvcreate --type raid10 -L3g -i 2 -I 256 -n lv_r10 vg_data /dev/vda1:1-900 /dev/vdb1:1-900 /dev/vdc1:1-900 /dev/vdd1:1-900'"
pvcreate --uuid GjkgzF-18Ls-321G-SaDW-4vp0-d04y-Gd4xRp --restorefile /etc/lvm/backup/vg_data /dev/vdb1
Couldn't find device with uuid GjkgzF-18Ls-321G-SaDW-4vp0-d04y-Gd4xRp.
Physical volume "/dev/vdb1" successfully created
vgcfgrestore vg_data
Restored volume group vg_data
lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi-d-r-- 3.00g 0.00 lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data iwi-a-r-- 1.50g /dev/vda1(2)
[lv_r10_rimage_1] vg_data iwi-a-r-- 1.50g /dev/vdb1(2)
[lv_r10_rimage_2] vg_data iwi-a-r-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data iwi-a-r-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi-a-r-- 4.00m /dev/vda1(1)
[lv_r10_rmeta_1] vg_data ewi-a-r-- 4.00m /dev/vdb1(1)
[lv_r10_rmeta_2] vg_data ewi-a-r-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi-a-r-- 4.00m /dev/vdd1(1)
lvchange --resync vg_data/lv_r10
Do you really want to deactivate logical volume lv_r10 to resync it? [y/n]: y
lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi-a-r-- 3.00g 100.00
---------
dmesg | tail
------------
[ 708.691297] md: mdX: resync done.
[ 708.765376] RAID10 conf printout:
[ 708.765379] --- wd:4 rd:4
[ 708.765381] disk 0, wo:0, o:1, dev:dm-1
[ 708.765382] disk 1, wo:0, o:1, dev:dm-3
[ 708.765383] disk 2, wo:0, o:1, dev:dm-5
[ 708.765384] disk 3, wo:0, o:1, dev:dm-7
mount /dev/vg_data/lv_r10 /mnt/r10/
cat /mnt/r10/testr10.txt
some data
### Suppose now that vda must be replaced too.
### Shutting down again, replacing vda with a new disk, starting the system ###
lvs -a -o +devices
Couldn't find device with uuid KGf6QK-1LrJ-JDaA-bJJY-pmLb-l9eV-LEXgT2.
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi---r-p 3.00g lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data Iwi---r-p 1.50g unknown device(2)
[lv_r10_rimage_1] vg_data Iwi---r-- 1.50g /dev/vdb1(2)
[lv_r10_rimage_2] vg_data Iwi---r-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data Iwi---r-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi---r-p 4.00m unknown device(1)
[lv_r10_rmeta_1] vg_data ewi---r-- 4.00m /dev/vdb1(1)
[lv_r10_rmeta_2] vg_data ewi---r-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi---r-- 4.00m /dev/vdd1(1)
grep description /etc/lvm/backup/vg_data
description = "Created *after* executing 'vgscan'"
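Unlike the first recovery, the backup description here no longer mentions the original lvcreate; it was rewritten after a plain vgscan. A small sanity-check sketch (hypothetical helper, string copied from the grep output above) for telling the two cases apart before restoring:

```shell
# Sanity-check sketch: inspect the backup's description before restoring.
# A description written by 'vgscan' may have been saved while the VG was
# already degraded, unlike a creation-time snapshot.
desc="Created *after* executing 'vgscan'"
case "$desc" in
    *vgscan*) echo "backup written after vgscan - verify it predates the failure" ;;
    *)        echo "backup looks like a creation-time snapshot" ;;
esac
```

(vgcfgrestore --list vg_data would also show the archived metadata versions to choose from.)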
pvcreate --uuid KGf6QK-1LrJ-JDaA-bJJY-pmLb-l9eV-LEXgT2 --restorefile /etc/lvm/backup/vg_data /dev/vda1
Couldn't find device with uuid KGf6QK-1LrJ-JDaA-bJJY-pmLb-l9eV-LEXgT2.
Physical volume "/dev/vda1" successfully created
vgcfgrestore vg_data
Restored volume group vg_data
lvchange --resync vg_data/lv_r10
Do you really want to deactivate logical volume lv_r10 to resync it? [y/n]: y
lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi-a-r-- 3.00g 100.00 lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data iwi-aor-- 1.50g /dev/vda1(2)
[lv_r10_rimage_1] vg_data iwi-aor-- 1.50g /dev/vdb1(2)
[lv_r10_rimage_2] vg_data iwi-aor-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data iwi-aor-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi-aor-- 4.00m /dev/vda1(1)
[lv_r10_rmeta_1] vg_data ewi-aor-- 4.00m /dev/vdb1(1)
[lv_r10_rmeta_2] vg_data ewi-aor-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi-aor-- 4.00m /dev/vdd1(1)
mount -t ext4 /dev/vg_data/lv_r10 /mnt/r10/
mount: wrong fs type, bad option, bad superblock on /dev/mapper/vg_data-lv_r10,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
dmesg | tail
-------------
[ 715.361985] EXT4-fs (dm-8): VFS: Can't find ext4 filesystem
[ 715.362248] EXT4-fs (dm-8): VFS: Can't find ext4 filesystem
[ 715.362548] EXT4-fs (dm-8): VFS: Can't find ext4 filesystem
[ 715.362846] FAT-fs (dm-8): bogus number of reserved sectors
[ 715.362933] FAT-fs (dm-8): Can't find a valid FAT filesystem
[ 729.843473] EXT4-fs (dm-8): VFS: Can't find ext4 filesystem
As you can see, after more than one disk failure and RAID repair, I lost the file system on the RAID 10 volume. Please suggest what I am doing wrong. Thanks.
Pavlik