[linux-lvm] Failed PV recovery

Lamont R. Peterson peregrine at openbrainstem.net
Mon Jul 24 02:19:47 UTC 2006


Here's the setup:  home file server has 3 drives, 4.3GB, 45GB, 120GB; all IDE.  
The 4.3GB drive has a /boot/ partition and a small swap with the rest 
allocated to an LVM partition which is the only member of the "system" VG.  
The other two drives are single LVM partitions and comprise the "data" VG.  
That's how it was configured for over a year.

A few months ago, I started seeing some unreadable sectors on the 45GB drive.  
I purchased a 320GB SATA drive and a PCI controller (no SATA on this 
motherboard) to replace the two drives (I'll get more SATA disks and convert 
to LVM on RAID as I can afford them).  Long story short, motherboard needed 
BIOS flash and a little coaxing to recognize the PCI SATA controller, but 
that's sorted out now.

I partitioned the 320GB drive with 1 LVM PV and added it to the data VG.  I 
ran "pvmove /dev/hde1 /dev/sda1" (120GB -> 320GB), which took about 75 
minutes (the 120GB was almost completely full) with no issues.

At that point, I *should* have run "vgreduce data /dev/hde1" so that I 
wouldn't have the 120GB drive in the VG anymore, but I didn't.  20/20 
hindsight.
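
For reference, the sequence I meant to follow for each old drive can be sketched as a small script.  This is only a sketch under assumptions: the "run" and "migrate_pv" helpers are hypothetical names I made up for illustration, the device and VG names are the ones from my box, and by default it only prints the commands instead of running them (set RUN=1 to execute for real, as root).

```shell
#!/bin/sh
# Dry-run sketch of the PV migration sequence described above.
# Hypothetical helpers; device and VG names are taken from the post.
VG=data
NEW=/dev/sda1
OLD=/dev/hde1

# Print the command unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# $1 = VG, $2 = old PV, $3 = new PV
# Move all extents off the old PV, then drop it from the VG right away,
# so a later failure on another disk cannot leave it stuck in the VG.
migrate_pv() {
    run pvmove "$2" "$3" &&
    run vgreduce "$1" "$2"
}

run pvcreate "$NEW"          # initialize the new partition as a PV
run vgextend "$VG" "$NEW"    # add it to the VG
migrate_pv "$VG" "$OLD" "$NEW"
```

The point of pairing pvmove with vgreduce in one helper is simply that the second step can't be forgotten; vgreduce only runs if pvmove succeeded.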

Next I ran "pvmove /dev/hdg1 /dev/sda1" (45GB -> 320GB).  About 45% of the way 
through, it crashes:

  /dev/hdg1: Moved: 45.0%
  /dev/hdg1: read failed after 0 of 1024 at 4096: Input/output error
  /dev/hdg1: read failed after 0 of 2048 at 0: Input/output error
  Failed to read existing physical volume '/dev/hdg1'
  Physical volume /dev/hdg1 not found
  ABORTING: Can't reread PV /dev/hdg1
  ABORTING: Can't reread VG for /dev/hdg1

The system was still running, but the /dev/hdg disk no longer showed up.  In 
the past, I could power down for an hour or so (let the drive cool down) and 
then it would show up again.  It looked like the mounted LVs which are on 
data were fine (I could read & write), so I powered off.  Rebooting, I get 
kernel panics.  I can bring the box up in "emergency" mode or with a rescue 

Prior to this, only one LV was unusable.  I was able to read every bit of the 
rest of them just fine (I have backups of everything important).  The one bad 
LV (due to unreadable sectors on the 45GB drive) was for /var/spool/up2date 
when I was running RHEL3, which I have obviously replaced since RHEL3 
wouldn't support SATA (I have SUSE Linux 10.1 on there now).

If I had already removed the 120GB drive from the VG, I would try dd_rescue 
and copy the entire 45GB drive over to the 120GB one.  I can't get vgreduce 
to run correctly and pull it out of the VG.  When I run pvscan, I get:

NOTE:  I just booted up the box to get the output, and the 45GB disk was 
working.  It hasn't been for about a week now.  I have successfully removed 
the 120GB drive from the data VG.  Man, I gotta love having a little bit of 
luck!  Wow. :D

I could just blow it all away and recreate the data VG from scratch, reloading 
from backups (and pulling down things like .iso images, etc.).  I would like 
to figure out some techniques to try to recover this from here.  As I make my 
living teaching over 1,000 people/year (newbies and experts alike) to use 
Linux, I'd like to be able to use this experience to teach others how to 
recover if they find themselves up the "Creek Who Should Not Be Named".

1.  How can I take an unused PV out of a VG with another PV that's broken?

2.  Once I have a copy of the entire bad drive's contents, how do I alter the 
VG (hand edit?) so that it is using the copy instead of the original.

3.  What am I not asking/seeing?

4.  Are there better ways I could have handled this (other than the obvious 
like RAID to start with, etc.)?
Lamont R. Peterson <peregrine at OpenBrainstem.net>
Founder [ http://blog.OpenBrainstem.net/peregrine/ ]
GPG Key fingerprint: 0E35 93C5 4249 49F0 EC7B  4DDD BE46 4732 6460 CCB5
  ___                   ____            _           _
 / _ \ _ __   ___ _ __ | __ ) _ __ __ _(_)_ __  ___| |_ ___ _ __ ___
| | | | '_ \ / _ \ '_ \|  _ \| '__/ _` | | '_ \/ __| __/ _ \ '_ ` _ \
| |_| | |_) |  __/ | | | |_) | | | (_| | | | | \__ \ ||  __/ | | | | |
 \___/| .__/ \___|_| |_|____/|_|  \__,_|_|_| |_|___/\__\___|_| |_| |_|
      |_|               Intelligent Open Source Software Engineering
                              [ http://www.OpenBrainstem.net/ ]