[linux-lvm] raid10 with missing redundancy, but health status claims it is ok.
Olaf Seibert
o.seibert at syseleven.de
Fri May 27 13:56:03 UTC 2022
Hi all, I'm new to this list. I hope somebody here can help me.
We had a disk go bad in our mirrored LVM installation (commands to the disk
timed out, taking many seconds to do so). With some trouble, we managed to
pvremove the offending disk, and then used `lvconvert --repair -y
nova/$lv` to repair (restore redundancy on) the logical volumes.
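For reference, the repair pass was essentially the following loop. This is a
reconstruction from memory, not a verbatim paste of our shell history:
```
# Reconstructed from memory, not copied from our shell history.
# Run the repair for every LV in the "nova" VG; lvconvert --repair is
# supposed to re-allocate the images that lived on the failed PV onto
# the remaining PVs.
for lv in $(lvs --noheadings -o lv_name nova); do
    lvconvert --repair -y "nova/$lv"
done
```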
One logical volume still seems to have trouble though. In `lvs -o
devices -a` it shows no devices for 2 of its subvolumes, and it has the
weird 'v' status:
```
  LV                VG   Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  lvname            nova Rwi-aor--- 800.00g                                    100.00           lvname_rimage_0(0),lvname_rimage_1(0),lvname_rimage_2(0),lvname_rimage_3(0)
  [lvname_rimage_0] nova iwi-aor--- 400.00g                                                      /dev/sdc1(19605)
  [lvname_rimage_1] nova iwi-aor--- 400.00g                                                      /dev/sdi1(19605)
  [lvname_rimage_2] nova vwi---r--- 400.00g
  [lvname_rimage_3] nova iwi-aor--- 400.00g                                                      /dev/sdj1(19605)
  [lvname_rmeta_0]  nova ewi-aor---  64.00m                                                      /dev/sdc1(19604)
  [lvname_rmeta_1]  nova ewi-aor---  64.00m                                                      /dev/sdi1(19604)
  [lvname_rmeta_2]  nova ewi---r---  64.00m
  [lvname_rmeta_3]  nova ewi-aor---  64.00m                                                      /dev/sdj1(19604)
```
and also according to `lvdisplay -am` there is a problem with
`..._rimage_2` and `..._rmeta_2`:
```
  --- Logical volume ---
  Internal LV Name       lvname_rimage_2
  VG Name                nova
  LV UUID                xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  LV Write Access        read/write
  LV Creation host, time xxxxxxxxx, 2021-07-09 16:45:21 +0000
  LV Status              NOT available
  LV Size                400.00 GiB
  Current LE             6400
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

  --- Segments ---
  Virtual extents 0 to 6399:
    Type                error

  --- Logical volume ---
  Internal LV Name       lvname_rmeta_2
  VG Name                nova
  LV UUID                xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  LV Write Access        read/write
  LV Creation host, time xxxxxxxxx, 2021-07-09 16:45:21 +0000
  LV Status              NOT available
  LV Size                64.00 MiB
  Current LE             1
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

  --- Segments ---
  Virtual extents 0 to 0:
    Type                error
```
Correspondingly, the volume group metadata shows an error segment:
lvname_rimage_2 {
    id = "..."
    status = ["READ", "WRITE"]
    flags = []
    creation_time = 1625849121    # 2021-07-09 16:45:21 +0000
    creation_host = "cbk130133"
    segment_count = 1

    segment1 {
        start_extent = 0
        extent_count = 6400    # 400 Gigabytes

        type = "error"
    }
}
On the other hand, the health status appears to read out normal:
[13:38:20] root@cbk130133:~# lvs -o +lv_health_status
  LV     VG   Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Health
  lvname nova Rwi-aor--- 800.00g ..                                 100.00
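(If it helps with the diagnosis, I can also post the kernel's own view of the
raid target. A sketch of the command, assuming the usual VG-LV device-mapper
name mangling:)
```
# Ask device-mapper for the raid10 target status (per-leg health characters).
# For nova/lvname the dm name should simply be nova-lvname, since neither
# name contains a dash that would need doubling.
dmsetup status nova-lvname
```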
We tried various combinations of `lvconvert --repair -y nova/$lv` and
`lvchange --syncaction repair` on it without effect.
`lvchange -ay` doesn't work either:
$ sudo lvchange -ay nova/lvname_rmeta_2
Operation not permitted on hidden LV nova/lvname_rmeta_2.
$ sudo lvchange -ay nova/lvname
$ # (no effect)
$ sudo lvconvert --repair nova/lvname_rimage_2
WARNING: Disabling lvmetad cache for repair command.
WARNING: Not using lvmetad because of repair.
Command on LV nova/lvname_rimage_2 does not accept LV type error.
Command not permitted on LV nova/lvname_rimage_2.
$ sudo lvchange --resync nova/lvname_rimage_2
WARNING: Not using lvmetad because a repair command was run.
Command on LV nova/lvname_rimage_2 does not accept LV type error.
Command not permitted on LV nova/lvname_rimage_2.
$ sudo lvchange --resync nova/lvname
WARNING: Not using lvmetad because a repair command was run.
Logical volume nova/lvname in use.
Can't resync open logical volume nova/lvname.
$ lvchange --rebuild /dev/sdf1 nova/lvname
WARNING: Not using lvmetad because a repair command was run.
Do you really want to rebuild 1 PVs of logical volume nova/lvname [y/n]: y
device-mapper: create ioctl on lvname_rmeta_2 LVM-blah failed: Device or resource busy
Failed to lock logical volume nova/lvname.
$ lvchange --raidsyncaction repair nova/lvname
# (took a long time to complete but didn't change anything)
$ sudo lvconvert --mirrors +1 nova/lvname
Using default stripesize 64.00 KiB.
--mirrors/-m cannot be changed with raid10.
Any idea how to restore redundancy on this logical volume? It is in
continuous use, of course...
It seems like we must somehow convince LVM to allocate real space for it,
instead of using the error segments (there is plenty of free space left in
the volume group).
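In case it clarifies what I'm after, this is the direction I have been
considering. It is untested, and the backup file name below is just a
placeholder:
```
# Untested sketch of what I imagine the fix involves -- corrections welcome.

# 1. Confirm the VG really has room for a fresh 400g image plus a 64m rmeta:
vgs -o vg_name,vg_size,vg_free nova

# 2. Keep a copy of the current metadata before experimenting any further:
vgcfgbackup -f /root/nova-before-repair.vg nova

# 3. The step I don't know how to do: make LVM replace the "error" segments
#    of lvname_rimage_2 / lvname_rmeta_2 with real extents -- either via some
#    lvconvert invocation I haven't found, or by hand-editing the backup and
#    restoring it with vgcfgrestore, which I assume is risky on a live VG.
```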
Thanks in advance.
-Olaf
--
SysEleven GmbH
Boxhagener Straße 80
10245 Berlin
T +49 30 233 2012 0
F +49 30 616 7555 0
http://www.syseleven.de
http://www.facebook.com/SysEleven
https://www.instagram.com/syseleven/
Current system status always at:
http://www.twitter.com/syseleven
Registered office: Berlin
Register court: AG Berlin Charlottenburg, HRB 108571 B
Managing directors: Marc Korthaus, Jens Ihlenfeld, Andreas Hermann