When LVM Goes Bad

Christopher K. Johnson ckjohnson at gwi.net
Thu Jun 22 02:42:56 UTC 2006


Paul Howarth wrote:
> Andy Green wrote:
>> A story about LVM.  I believe LVM is now the default in Fedora 
>> partitioning; at least, I never liked it enough to have selected it 
>> myself, and yet it is on all my boxes now.
>>
>> LVM can make a lot of sense for large storage binding together 
>> multiple devices or raids into a single logical storage device, in 
>> fact I use it for that too.  However LVM makes less sense on, say, a 
>> laptop which has and will only ever have a single 2.5" HDD for 
>> storage that is permanently available with the laptop.
>>
>> Now it doesn't matter too much when everything is working, because 
>> LVM is a fairly lightweight additional layer AFAIK.  However, on a box 
>> here the sole SATA drive went bad without warning: some dozens of 
>> sectors were goneski after a recent period of high temperature here. 
>> The resulting symptom was that the partition contents were no longer 
>> recognized as containing a logical volume or a volume group, and 
>> pvscan did not find it either, although pvdisplay could see it was a 
>> physical volume if pointed directly at the partition.
>>
>> Recovery from LVM metadata corruption is not something that is 
>> overburdened by tools to help out; in fact I couldn't find anything 
>> useful.  By using dd I probed the damaged region and found that it 
>> started 33214 512-byte blocks into the partition and ended 33336 
>> 512-byte blocks in, so it trashed 122 blocks, something like 
>> 60Kbytes.  Touching this region spewed IO errors to the console.  
>> Whether this damage explained the loss of LVMness, or whether some 
>> subsequent logical brain damage elsewhere did it, I don't know.
>>
>> What I did was to add a new HDD and install FC5 on it and boot into 
>> it, with the old HDD on as /dev/sdb.  I then used dd to copy the 
>> first 33214 512-byte blocks to a file on the new drive, dd'ed 122 
>> 512-byte blocks from /dev/zero and appended that on the end of the 
>> first file, and then used dd with bs=512 skip=33336 to copy the 
>> remainder of the damaged partition to this file also.  So after this 
>> I had a copy of the partition as a file on the new HDD with 
>> everything in the right place and the damaged area zeroed out.
>>
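
For future Googlers, the three dd passes described above can be sketched as
one Python function. This is only an illustration of the idea, not what was
actually run: the function name, the chunk size and the paths are
hypothetical, and the sector numbers (33214 to 33336) are the ones quoted in
the post. It assumes the source size is a multiple of 512 bytes.

```python
import os

SECTOR = 512  # block size used throughout the post


def copy_skipping_bad(src_path, dst_path, bad_start, bad_end, chunk_sectors=128):
    """Copy src to dst, writing zeros for sectors [bad_start, bad_end)
    instead of reading them, so offsets in the copy match the original."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        total = src.seek(0, os.SEEK_END) // SECTOR
        pos = 0
        while pos < total:
            if bad_start <= pos < bad_end:
                # Never touch the damaged sectors; substitute zeros.
                n = min(bad_end, total) - pos
                dst.write(bytes(n * SECTOR))
                pos += n
                continue
            # Read up to the start of the bad region, or to the end.
            limit = min(bad_start, total) if pos < bad_start else total
            n = min(chunk_sectors, limit - pos)
            src.seek(pos * SECTOR)
            dst.write(src.read(n * SECTOR))
            pos += n


# Hypothetical usage with the numbers from the post:
# copy_skipping_bad("/dev/sdb2", "/mnt/new/sdb2.img", 33214, 33336)
```
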
>> Now naturally this file will not loop-mount because of the LVM: it's 
>> not a valid ext3 image.  I googled around some more and went on the 
>> LVM IRC channel and explained my problem.  No help, in fact no 
>> response.  There don't seem to be any tools or readily findable 
>> advice for recovering from this situation.
>>
>> I created a new 10MB file with dd and used mkfs.ext3 on it, and 
>> examined the first part of it using hexdump.  With the help of Google 
>> I found that the ext3 magic is present at offset +0x438, and I 
>> noticed that the first 1Kbyte of the image is zeroed.  I then used 
>> hexdump and grep to search for the same pattern in the copied LVM 
>> partition file, and found it with the magic at offset 0x30438.
>>
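
The hexdump-and-grep search described above could also be done along these
lines. A rough sketch only: the function name, the 512-byte alignment and
the 16MB search limit are assumptions for illustration. The ext2/ext3 magic
is 0xEF53 stored little-endian, i.e. the bytes 53 EF, at offset 0x38 of the
superblock, which itself starts 1024 bytes into the filesystem.

```python
EXT_MAGIC = b"\x53\xef"  # 0xEF53 as it appears on disk (little-endian)
MAGIC_OFFSET = 0x438     # 1024-byte superblock offset + 0x38


def find_ext_candidates(image_path, align=512, limit=16 * 1024 * 1024):
    """Return offsets that look like the start of an ext2/ext3 filesystem:
    first 1K all zero (the heuristic from the post) and magic at +0x438."""
    hits = []
    with open(image_path, "rb") as f:
        start = 0
        while start < limit:
            f.seek(start)
            head = f.read(MAGIC_OFFSET + 2)
            if len(head) < MAGIC_OFFSET + 2:
                break  # ran off the end of the image
            if head[MAGIC_OFFSET:] == EXT_MAGIC and not any(head[:1024]):
                hits.append(start)
            start += align
    return hits
```

A hit at 0x30000 corresponds to the magic at 0x30438 reported in the post.
The magic is only two bytes, so any hit is a candidate to be checked by
hand, not a guarantee.
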
>> I decided to remove the first 0x30000 bytes of my copied partition 
>> image, so that the ext3 filesystem would start at the beginning of 
>> the file.  This took a while because the partition was 60GB; in fact 
>> the whole process was agonizingly slow.
>>
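
Dropping the leading 0x30000 bytes (196608 bytes, or 384 512-byte blocks)
amounts to a copy starting at that offset. A minimal sketch, with a
hypothetical function name and chunk size:

```python
import shutil


def strip_prefix(src_path, dst_path, offset=0x30000, chunk=1024 * 1024):
    """Copy everything from `offset` to the end of src into dst."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)
        # copyfileobj copies from the current position in `chunk`-sized reads.
        shutil.copyfileobj(src, dst, chunk)
```

Where mount's loop support accepts an offset= option (for example
-o loop,offset=196608), it may be possible to skip this copy entirely and
mount the image in place; that was not tried here.
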
>> After this, I was able to mount the resulting file with 
>> -t ext3 -o loop and I recovered my data.  The zeroed/damaged region 
>> trashed a small part of two directories whose contents were 
>> noncritical.  This story is offered in the hope that future Googlers 
>> will have better luck than I did.
>>
>> I wouldn't say that LVM is evil from this, but I would suggest that 
>> you simply turn it off for partitioning actions where you know there 
>> will be no expansion, because the only thing it will ever do for you 
>> in that case is to stress you out when you least need it.
>
> Had a similar issue last week actually. It's not put me off LVM but it 
> made me glad I do regular backups.
>
> Paul.
>
I use raid-1 devices as my LVM PVs to reduce the risk of such problems.

-- 
   "Spend less!  Do more!  Go Open Source..." -- Dirigo.net
   Chris Johnson, RHCE #804005699817957



