[linux-lvm] Fwd: [Linux-cluster] inconsistent volume group after pvmove
Jonathan Brassow
jbrassow at redhat.com
Wed Jul 2 14:16:48 UTC 2008
Spotted this message on linux-cluster...
It seems to me that the LVM label on /dev/sdh still needs to be wiped
(pvremove /dev/sdh)... [Although I'm not sure the PV has been
removed from the VG yet... unless by chance it was failing when they
did the 'vgreduce'... but that wouldn't explain why there were
problems /before/ the vgreduce.] I'm not sure how it got to the point
of having inconsistent metadata after running the pvmove. Also note
that this is done using CLVM.
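One way to check where that PV actually stands before wiping it might be
something like this (just a sketch; device and VG names are the ones from
the post below, not verified against their setup):

    # Is /dev/sdh still listed as a member of the VG?
    pvdisplay /dev/sdh
    pvs -o pv_name,vg_name,pv_uuid /dev/sdh

    # If the VG metadata still references the (failing) PV, drop it from
    # the VG first; pvremove refuses to wipe a label that still belongs
    # to a VG unless forced with -ff.
    vgreduce --removemissing myvol_vg
    pvremove /dev/sdh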
Anyone have ideas?
brassow
Begin forwarded message:
> From: "Andreas Schneider" <andreas.schneider at f-it.biz>
> Date: July 1, 2008 3:02:18 AM CDT
> To: <linux-cluster at redhat.com>
> Subject: [Linux-cluster] inconsistent volume group after pvmove
> Reply-To: linux clustering <linux-cluster at redhat.com>
>
> Hello,
> This is our setup: we have 3 Linux servers (2.6.18, CentOS 5),
> clustered, with clvmd running one "big" volume group (15 SCSI
> disks of 69.9 GB each).
> After we got a hardware I/O error on one disk, our GFS filesystem
> began to loop.
> So we stopped all services, we determined the corrupted disk
> (/dev/sdh), and my intention was to do the following:
> - pvmove /dev/sdh
> - vgreduce my_volumegroup /dev/sdh
> - do an intensive hardware check on the volume
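[For reference, that intended sequence with an explicit check before the
vgreduce might look roughly like this; VG/device names are the ones from
the post, and note that pvmove still has to read every extent from the
failing disk:

    pvmove -v /dev/sdh                 # migrate all extents off the failing PV
    pvdisplay /dev/sdh                 # sanity check: "Allocated PE" should now be 0
    vgreduce my_volumegroup /dev/sdh   # drop the now-empty PV from the VG
    pvremove /dev/sdh                  # wipe the LVM label before the hardware check
]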
>
>
> But: that's what happened during pvmove -v /dev/sdh:
> [...]
> /dev/sdh: Moved: 78,6%
> /dev/sdh: Moved: 79,1%
> /dev/sdh: Moved: 79,7%
> /dev/sdh: Moved: 80,0%
> Updating volume group metadata
>   Creating volume group backup "/etc/lvm/backup/myvol_vg" (seqno 46).
>   Error locking on node server1: device-mapper: reload ioctl failed: Invalid argument
> Unable to reactivate logical volume "pvmove0"
> ABORTING: Segment progression failed.
> Removing temporary pvmove LV
> Writing out final volume group after pvmove
>   Creating volume group backup "/etc/lvm/backup/myvol_vg" (seqno 48).
> [root@hpserver1 ~]# pvscan
>   PV /dev/cciss/c0d0p2   VG VolGroup00   lvm2 [33,81 GB / 0 free]
>   PV /dev/sda            VG fit_vg       lvm2 [68,36 GB / 0 free]
>   PV /dev/sdb            VG fit_vg       lvm2 [68,36 GB / 0 free]
>   PV /dev/sdc            VG fit_vg       lvm2 [68,36 GB / 0 free]
>   PV /dev/sdd            VG fit_vg       lvm2 [68,36 GB / 0 free]
>   PV /dev/sde            VG fit_vg       lvm2 [66,75 GB / 46,75 GB free]
>   PV /dev/sdf            VG fit_vg       lvm2 [68,36 GB / 0 free]
>   PV /dev/sdg            VG fit_vg       lvm2 [68,36 GB / 0 free]
>   PV /dev/sdh            VG fit_vg       lvm2 [68,36 GB / 58,36 GB free]
>   PV /dev/sdj            VG fit_vg       lvm2 [68,36 GB / 54,99 GB free]
>   PV /dev/sdi            VG fit_vg       lvm2 [68,36 GB / 15,09 GB free]
>   PV /dev/sdk1           VG fit_vg       lvm2 [68,36 GB / 55,09 GB free]
>   Total: 12 [784,20 GB] / in use: 12 [784,20 GB] / in no VG: 0 [0   ]
>
> That sounded bad, and I didn't have any idea what to do, but I read
> that pvmove can resume from the point where it was interrupted, so I
> started pvmove again and this time it could move all the data.
> pvscan and vgscan -vvv showed me that all data had been moved from the
> /dev/sdh volume to the other volumes.
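[Side note: the behaviour relied on here is that an unfinished move can be
picked up again; a rough sketch of the two options at that point:

    pvmove              # with no arguments: continue any interrupted pvmove
    pvmove --abort      # alternatively: back out and remove the temporary pvmove LV

    # afterwards, check that nothing is mapped onto the source disk any more
    pvdisplay -m /dev/sdh    # segment listing; no LV segments should remain
]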
>
> To be sure I restarted my cluster nodes, but they encountered
> problems mounting the gfs filesystems.
> I got this error:
>
> [root@server1 ~]# /etc/init.d/clvmd stop
> Deactivating VG myvol_vg:   Volume group "myvol_vg" inconsistent
>   WARNING: Inconsistent metadata found for VG myvol_vg - updating to use version 148
> 0 logical volume(s) in volume group "myvol_vg" now active
> [ OK ]
> Stopping clvm: [ OK ]
> [root@server1 ~]# /etc/init.d/clvmd start
> Starting clvmd: [ OK ]
> Activating VGs:   2 logical volume(s) in volume group "VolGroup00" now active
>   Volume group "myvol_vg" inconsistent
>   WARNING: Inconsistent metadata found for VG myvol_vg - updating to use version 151
> Error locking on node server1: Volume group for uuid not found:
> tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtuFZZe8QKoX8sVA0XRTNoDQVWVftk8cSa
> Error locking on node server1: Volume group for uuid not found:
> tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtqDfFtrJTFTGuju8nNjwtCdPGnzP3hh8k
> Error locking on node server1: Volume group for uuid not found:
> tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtc22hBY40phdVvVdFBFX28PvfF7JrlIYz
> Error locking on node server1: Volume group for uuid not found:
> tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtWfJ1EqXJ309gO3Gx0ZvpNekrmHFo9u2V
> Error locking on node server1: Volume group for uuid not found:
> tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBtCP6czghnQFEjNdv9DF6bsUmnK3eJ5vKp
> Error locking on node server1: Volume group for uuid not found:
> tGRfaK5aW00pFRXcLtrdHAw5a4GNDVBt0KNlnblpwOfcnqIjk4GJ662dxOsL70GF
> 0 logical volume(s) in volume group "myvol_vg" now active
> [ OK ]
>
> Looking at it more closely, these 6 UUIDs are exactly the LVs that
> should be found, and they are where all of our data is stored.
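[The long strings in those errors look like the VG UUID with an LV UUID
appended (the first 32 characters are identical in all six lines), so one
way to confirm that they really are the six data LVs might be:

    # UUIDs as LVM sees them on this node (reported with dashes,
    # while the error string shows them without)
    vgs -o vg_name,vg_uuid myvol_vg
    lvs -o lv_name,lv_uuid myvol_vg

    # and what device-mapper currently has loaded
    dmsetup info -c -o name,uuid
]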
>
> What we did next was at first careful, step by step, and in the end
> plain trial and error.
> This was one of the first actions:
>
> [root@hpserver1 ~]# vgreduce --removemissing myvol_vg
> Logging initialised at Tue Jul 1 10:00:52 2008
> Set umask to 0077
> Finding volume group "myvol_vg"
> Wiping cache of LVM-capable devices
>   WARNING: Inconsistent metadata found for VG myvol_vg - updating to use version 229
> Volume group "myvol_vg" is already consistent
>
> We tried to deactivate the volume group via vgchange -a n myvol_vg, we
> tried to "removemissing", and after a few rather haphazard attempts
> (dmsetup info -c, dmsetup mknodes and vgchange -a y myvol_vg) we can
> access our LVs again, but we still get this message and we don't know
> why:
>
> Volume group "myvol_vg" inconsistent
> WARNING: Inconsistent metadata found for VG myvol_vg - updating to use version 228
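[Given that the seqno keeps jumping (148, 151, 228/229...), it might be
worth checking on every node what it believes the current version is and
whether anything is left over from the aborted pvmove; a rough sketch:

    # metadata version this node considers current
    vgs -o vg_name,vg_seqno,vg_attr myvol_vg

    # leftover temporary pvmove device or stale mirror table still loaded?
    dmsetup ls | grep pvmove
    dmsetup table | grep ' mirror '    # pvmove uses the dm mirror target underneath
]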
>
> I’m a little bit worried about our data,
>
> Regards
> Andreas
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster