[linux-lvm] Repair thin pool

M.H. Tsai mingnus at gmail.com
Fri Feb 5 11:44:46 UTC 2016


Hi,

The order of your steps looks wrong: you should run thin_repair before
swapping the pool metadata.
Also, thin_restore expects XML (text) input, not binary metadata, so
it is not surprising that it ended in a segmentation fault.
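
To illustrate the difference in input formats, here is a minimal dry-run
sketch of the XML route: thin_dump turns binary metadata into XML, and
only that XML is fed to thin_restore. The function name, the "run"
helper and the three paths are hypothetical and not taken from this
thread. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: commands are printed, not executed, unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# dump_then_restore BINARY_META XML_FILE NEW_META
# All three arguments are hypothetical placeholders.
dump_then_restore() {
    # Binary metadata -> XML text.
    run thin_dump "$1" -o "$2"
    # XML text -> fresh binary metadata written to another device.
    run thin_restore -i "$2" -o "$3"
}
```

Feeding the binary metadata device straight to "thin_restore -i" skips
the first step, which is what happened in step 2e of the quoted message.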

"lvconvert --repair ..." wraps "thin_repair + swapping metadata" into
a single step.
If it doesn't work, you may need to dump the metadata manually to
check whether there is serious corruption in the mapping trees.
(I recommend using the newest thin-provisioning-tools for better results.)
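
As a rough picture of what "thin_repair + swapping metadata" looks like
when done by hand, here is a dry-run sketch. It is not the actual
implementation of "lvconvert --repair"; the function name, the "run"
helper, the new LV name and its size are assumptions for illustration.
By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: commands are printed, not executed, unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# repair_then_swap VG POOL NEW_META_LV SIZE
# NEW_META_LV and SIZE are hypothetical; SIZE should be at least as
# large as the existing metadata LV.
repair_then_swap() {
    vg=$1
    pool=$2
    newmeta=$3
    size=$4
    # A plain LV that will receive the repaired metadata.
    run lvcreate -L "$size" -n "$newmeta" "$vg"
    # Repair first: binary metadata in, binary metadata out.
    # (The /dev/mapper name assumes a VG name without hyphens.)
    run thin_repair -i "/dev/mapper/${vg}-${pool}_tmeta" -o "/dev/${vg}/${newmeta}"
    # Only then deactivate the pool and swap in the repaired metadata.
    run lvchange -an "${vg}/${pool}"
    run lvconvert --thinpool "${vg}/${pool}" --poolmetadata "${vg}/${newmeta}"
}
```

For example, "repair_then_swap vgg1 pool_nas pool_nas_meta_new 16g"
prints the four commands in order. Check lvconvert(8) for your LVM
version before running any of this for real.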

1. Activate the pool metadata. (It's okay if the command fails; we just
want to activate the hidden metadata LV.)
lvchange -ay vgg1/pool_nas

2. Dump the metadata (-r enables repair mode), then check the output XML.
thin_dump /dev/mapper/vgg1-pool_nas_tmeta -o thin_dump.xml -r
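
For a first look at the dumped XML, a small sketch like the following
counts the device and mapping entries. It assumes the usual thin_dump
layout with one element per line; the function name is hypothetical.

```shell
#!/bin/sh
# summarize_dump XML_FILE
# Counts lines containing <device>, <single_mapping> and <range_mapping>
# elements. Assumes thin_dump wrote one element per line.
summarize_dump() {
    xml=$1
    # grep -c prints 0 but exits non-zero when nothing matches,
    # hence the "|| true".
    devices=$(grep -c '<device ' "$xml" || true)
    singles=$(grep -c '<single_mapping ' "$xml" || true)
    ranges=$(grep -c '<range_mapping ' "$xml" || true)
    echo "devices=$devices single_mappings=$singles range_mappings=$ranges"
}
```

Running "summarize_dump thin_dump.xml" gives a quick sanity check. The
quoted lvs output lists four thin volumes, so one would hope to see at
least four device entries; far fewer, or no mappings at all, would
suggest that parts of the mapping trees were lost.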

I have repaired many seriously corrupted thin pools. If the physical
medium is okay, I think most cases are repairable.
I have also written some extensions to thin-provisioning-tools (not yet
published; the code still needs some refinement), which might help.


Ming-Hung Tsai


2016-02-05 9:21 GMT+08:00 Mars <kirapangzi at gmail.com>:
>
> Hi there,
>
> We're using CentOS 7.0 with lvm 2.02.105 and have run into the following problem:
> After a power outage in the datacenter, the thin-provisioned volumes came up in the wrong state:
>
> [root@storage ~]# lvs -a
>   dm_report_object: report function failed for field data_percent
>   LV                              VG               Attr       LSize   Pool        Origin           Data%  Move Log Cpy%Sync Convert
>   DailyBuild                      vgg145155121036c Vwi-d-tz--   5.00t pool_nas
>   dat                             vgg145155121036c Vwi-d-tz--  10.00t pool_nas
>   lvol0                           vgg145155121036c -wi-a-----  15.36g
>   [lvol3_pmspare]                 vgg145155121036c ewi-------  15.27g
>   market                          vgg145155121036c Vwi-d-tz--   3.00t pool_nas
>   pool_nas                        vgg145155121036c twi-a-tz--  14.90t                                0.00
>   [pool_nas_tdata]                vgg145155121036c Twi-ao----  14.90t
>   [pool_nas_tmeta]                vgg145155121036c ewi-ao----  15.27g
>   share                           vgg145155121036c Vwi-d-tz--  10.00t pool_nas
>
>
> The thin pool "pool_nas" and the ordinary LV "lvol0" are active, but the thin-provisioned volumes cannot be activated, even with the command "lvchange -ay thin_volume_name".
>
> To recover it, we tried the following approaches, based on these mailing list threads: http://www.spinics.net/lists/lvm/msg22629.html and http://comments.gmane.org/gmane.linux.lvm.general/14828.
>
> 1. Run "lvconvert --repair vgg145155121036c/pool_nas".
> The output is below, and the thin volumes still cannot be activated.
> WARNING: If everything works, remove "vgg145155121036c/pool_nas_tmeta0".
> WARNING: Use pvmove command to move "vgg145155121036c/pool_nas_tmeta" on the best fitting PV.
>
> 2. Manual repair steps:
> 2a: Deactivate the thin pool.
> 2b: Create a temporary LV "metabak".
> 2c: Swap the thin pool's metadata LV: "lvconvert --thinpool vgg145155121036c/pool_nas --poolmetadata metabak -y". The command is only accepted with the "-y" option.
> 2d: Activate the temporary LV "metabak" and create another, bigger LV "metabak1".
> 2e: Repair the metadata: "thin_restore -i /dev/vgg145155121036c/metabak-o /dev/vgg145155121036c/metabak1", which ended in a segmentation fault.
>
> So, is there any other way to recover this, or did we get some of the steps wrong?
>
> Thank you very much.
> Mars



