[lvm-devel] [PATCH] handle transient errors in lvconvert --repair
Takahiro Yasui
tyasui at redhat.com
Wed May 19 15:06:04 UTC 2010
Hi Petr,
On 05/19/10 08:06, Petr Rockai wrote:
> Takahiro Yasui <tyasui at redhat.com> writes:
> The catch is that this won't work correctly in other cases, especially
> with transient errors. I suspect the real problem is in not calling
> _lv_update_log_type in the new code path -- but see below: I cannot
> reliably fix this without having a reproducer. Also, I would very much
> like to have the tests you had failing on our regression suite, to avoid
> similar problem in the future.
> ...
> Unfortunately, I still cannot reproduce the problem -- I have written a
> few testcases that only fail the log, or fail a log and some other
> things and I can't seem to trigger the bug. I have tried with both
> normal and cluster locking.
>
> It would be very useful if you could provide more specific instructions
> on how to trigger this.
Here are the instructions. I first hit this with 2.02.65, and I also
reproduced it with 2.02.66.
0. environment
# lvm version
LVM version: 2.02.66(2)-cvs (2010-05-17)
Library version: 1.02.49-cvs (2010-05-17)
Driver version: 4.16.0
# grep mirror_log_fault_policy /etc/lvm/lvm.conf
# 'mirror_image_fault_policy' and 'mirror_log_fault_policy' define
mirror_log_fault_policy = "remove"
1. create vg and lv
# vgcreate vg00 /dev/sd[c-e]; lvcreate --ig -m1 -L12m -nlv00 vg00
Volume group "vg00" successfully created
Logical volume "lv00" created
2. disable log device (/dev/sde in my environment)
# echo offline > /sys/block/sde/device/state
3. run 'lvconvert --repair'
# lvconvert --config devices{ignore_suspended_devices=1} --repair --use-policies vg00/lv00
Mirrored transient status: "2 253:1 253:2 24/24 1 AA 3 disk 253:0 D"
Mirror log status: 1 of 1 images failed - switching to core
WARNING: Failed to replace 1 of 1 logs in volume lv00
4. check logical volumes
# lvs
LV        VG   Attr   LSize  Origin Snap%  Move Log Copy%  Convert
lv00      vg00 mwi-a- 12.00M                        100.00
lv00_mlog vg00 -wi---  4.00M
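The leftover log in step 4 can also be checked mechanically. The following is only a rough sketch: the function name find_orphan_mlogs is made up for illustration, and it assumes lvs is run with "-a --noheadings --separator : -o lv_name,mirror_log", which is not the invocation used above. It prints every LV whose name ends in _mlog but which no mirror lists as its log.

```shell
#!/bin/sh
# Hypothetical helper, not part of LVM.
# Input (stdin): lines of "lv_name:mirror_log", e.g. from
#   lvs -a --noheadings --separator : -o lv_name,mirror_log vg00
# Output: names of *_mlog LVs that no mirror references as its log.
find_orphan_mlogs() {
    awk -F: '
        {
            gsub(/[ \t]/, "")        # drop the padding lvs adds
            gsub(/\[|\]/, "")        # hidden LVs are shown as [name]
            name[NR] = $1
            if ($2 != "") used[$2] = 1
        }
        END {
            for (i = 1; i <= NR; i++)
                if (name[i] ~ /_mlog$/ && !(name[i] in used))
                    print name[i]
        }
    '
}
```

With the state shown in step 4 (lv00 has no log, lv00_mlog still exists), this would report lv00_mlog; on a healthy disk-logged mirror it should print nothing.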
> aux prepare_vg 5
> lvcreate -m 1 --ig -L 1 -n 2way $vg $dev1 $dev2 $dev3:0
> disable_dev $dev3
> echo n | lvconvert --repair $vg/2way
> check mirror $vg 2way core
> lvs -a -o +devices | not grep unknown
> lvs -a -o +devices | not grep mlog
> vgreduce --removemissing $vg
> enable_dev $dev3
This issue didn't occur with your test case in my environment, either.
So the difference between our test cases seems to be the 'policy': I used
the same lvconvert options as dmeventd does.
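Since the policy seems to be what matters, it helps to know which value is really in effect. Note that the plain grep in step 0 also matches a comment line. A minimal sketch of a stricter check follows; the function name effective_policy is hypothetical, and it assumes the setting is written as key = "value" with spaces around the equals sign, as in the output of step 0.

```shell
#!/bin/sh
# Hypothetical helper, not part of LVM.
# $1    = setting name, e.g. mirror_log_fault_policy
# stdin = configuration text, e.g. the contents of /etc/lvm/lvm.conf
# Prints the first uncommented value of the setting, without quotes.
effective_policy() {
    awk -v key="$1" '
        { sub(/#.*/, "") }                 # ignore comments
        $1 == key && $2 == "=" {
            v = $3
            gsub(/"/, "", v)               # strip the surrounding quotes
            print v
            exit
        }
    '
}

# Example (assumes the default config path):
#   effective_policy mirror_log_fault_policy < /etc/lvm/lvm.conf
```

For the configuration shown in step 0 this should print remove, and the comment line is skipped.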
> During a call to lv_remove_mirrors above, we call through to
> _remove_mirror_images, with remove_log = 1. We have this:
>
> ... if (remove_log)
> detached_log_lv = detach_mirror_log(mirrored_seg);
>
> ...
>
> if (detached_log_lv && !_delete_lv(lv, detached_log_lv))
> return_0;
>
> So the log *should* be gone after this is finished. Since you see the
> log hanging around, I suspect that this code has some bugs (this part of
> the code is known to be problematic, unfortunately). Apart from actual
> steps to reproduce the problem, the output from lvconvert doing the
> repair would be helpful. It should be printing things like "Mirror
> status" and "Mirror log status", please paste these.
Yes, see steps 3 and 4 above: step 3 has the lvconvert output, and step 4
shows the leftover lv00_mlog.
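For reading the status string printed in step 3, here is a small sketch that splits it into its parts. It assumes the usual dm mirror status layout as I understand it (image count, image devices, in-sync/total regions, a literal 1, one health character per image, then the log arguments), and the function name mirror_health is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of LVM.
# $1 = mirror status parameters, e.g.
#      "2 253:1 253:2 24/24 1 AA 3 disk 253:0 D"
# Assumed layout: <n> <dev>*n <sync>/<total> 1 <health> <nlogargs> <logtype> ...
# The function body runs in a subshell so no variables leak out.
mirror_health() (
    set -f                    # no globbing while splitting the string
    set -- $1                 # split on whitespace
    n=$1
    shift $((n + 1))          # drop the count and the n image devices
    sync=$1
    images=$3                 # one character per image, A = alive, D = dead
    log_type=$5
    log_health=-              # a core log has no health field
    if [ "$log_type" = disk ]; then
        log_health=$7
    fi
    echo "sync=$sync images=$images log=$log_type log_health=$log_health"
)
```

Applied to the string from step 3, this reports both images as alive (AA) and the disk log as dead (D), which matches the "1 of 1 images failed" message for the log.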
Thanks,
Taka