[linux-lvm] pvmove obliterates filesystem (Opensuse 10.2, x86-64)
Brian Strand
bstrand at switchmanagement.com
Tue Oct 16 23:27:42 UTC 2007
(Apologies in advance if this is the wrong place for this.) Yesterday I
ran a pvmove on the LV backing a mounted filesystem; the move failed
partway through and the filesystem was very badly damaged. The box is a
2x quad-core machine with 16 GB of RAM running openSUSE 10.2 x86-64; it
is under heavy load 24x7 (typical load average 15-20). The storage is
connected to a SAN via a QLogic 2462 dual-port FC HBA, using qla2400
(no dm-multipath). Note: I had completed a successful pvmove of another
LV about 30 minutes before this incident.
# pvmove --version
LVM version: 2.02.13 (2006-10-27)
Library version: 1.02.12 (2006-10-13)
Driver version: 4.7.0
# uname -a
Linux somebox 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
Here is the output from attempting to pvmove a 100 GB LV:
# (time pvmove --verbose -n archlogs /dev/sdc @sata) >> pvmove-archlogs.log-20071009 2>&1 </dev/null &
# cat pvmove-archlogs.log-20071009
Wiping cache of LVM-capable devices
Finding volume group "switch"
Archiving volume group "switch" metadata (seqno 248).
Creating logical volume pvmove0
Moving 800 extents of logical volume switch/archlogs
Found volume group "switch"
Updating volume group metadata
Creating volume group backup "/etc/lvm/backup/switch" (seqno 249).
Found volume group "switch"
Found volume group "switch"
Suspending switch-archlogs (253:13)
Found volume group "switch"
Found volume group "switch"
Creating switch-pvmove0
device-mapper: create ioctl failed: Device or resource busy
Loading switch-archlogs table
device-mapper: reload ioctl failed: Invalid argument
Checking progress every 15 seconds
WARNING: dev_open(/dev/sdc) called while suspended
WARNING: dev_open(/dev/sdc) called while suspended
WARNING: dev_open(/dev/sdc) called while suspended
WARNING: dev_open(/dev/sda2) called while suspended
WARNING: dev_open(/dev/sdb) called while suspended
WARNING: dev_open(/dev/sdc) called while suspended
WARNING: dev_open(/dev/sda2) called while suspended
WARNING: dev_open(/dev/sdb) called while suspended
WARNING: dev_open(/dev/sda2) called while suspended
WARNING: dev_open(/dev/sdb) called while suspended
WARNING: dev_open(/dev/sdc) called while suspended
WARNING: dev_open(/dev/sdb) called while suspended
WARNING: dev_open(/dev/sda2) called while suspended
WARNING: dev_open(/dev/sdc) called while suspended
WARNING: dev_open(/dev/sdc) called while suspended
WARNING: dev_open(/dev/sda2) called while suspended
WARNING: dev_open(/dev/sdb) called while suspended
Updating volume group metadata
Creating volume group backup "/etc/lvm/backup/switch" (seqno 250).
Found volume group "switch"
Found volume group "switch"
Found volume group "switch"
Found volume group "switch"
Suspending switch-pvmove0 (253:14)
Found volume group "switch"
Creating switch-pvmove0
device-mapper: create ioctl failed: Device or resource busy
Unable to reactivate logical volume "pvmove0"
Found volume group "switch"
Creating switch-pvmove0
device-mapper: create ioctl failed: Device or resource busy
Loading switch-archlogs table
device-mapper: reload ioctl failed: Invalid argument
ABORTING: Segment progression failed.
Found volume group "switch"
Found volume group "switch"
Found volume group "switch"
Found volume group "switch"
Found volume group "switch"
Creating switch-pvmove0
device-mapper: create ioctl failed: Device or resource busy
Unable to reactivate logical volume "pvmove0"
Found volume group "switch"
Loading switch-archlogs table
Resuming switch-archlogs (253:13)
Found volume group "switch"
Removing switch-pvmove0 (253:14)
Found volume group "switch"
Removing temporary pvmove LV
Writing out final volume group after pvmove
Creating volume group backup "/etc/lvm/backup/switch" (seqno 252).
/dev/sdc: Moved: 60.0%
real 0m21.789s
user 0m0.108s
sys 0m0.052s
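The failures in this log (the "ioctl failed" lines and the ABORTING line) are easy to miss when pvmove runs in the background with its output redirected to a file. As a minimal sketch, a check like the following could flag them afterwards. The function name pvmove_log_failed is hypothetical and not part of LVM, and the patterns are taken only from the messages seen in the log above, so other failure messages would not be caught.

```shell
#!/bin/sh
# Hypothetical helper (not part of LVM): succeed (exit 0) if the given
# pvmove log contains a device-mapper ioctl failure or an ABORTING line,
# the two kinds of error message that appear in the log above.
pvmove_log_failed() {
    grep -E -q 'device-mapper: .* ioctl failed|^ *ABORTING:' "$1"
}

# Example use, with the log file name from this post:
#   if pvmove_log_failed pvmove-archlogs.log-20071009; then
#       echo "pvmove reported errors; check the LV before trusting it" >&2
#   fi
```

The check only reads the log file; it does not query LVM or device-mapper state.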
Kernel messages from /var/log/messages:
Oct 9 22:33:21 somebox kernel: device-mapper: table: 253:13: linear: dm-linear: Device lookup failed
Oct 9 22:33:21 somebox kernel: device-mapper: ioctl: error adding target to table
Oct 9 22:33:21 somebox kernel: klogd 1.4.1, ---------- state change ----------
Oct 9 22:33:36 somebox kernel: device-mapper: table: 253:13: linear: dm-linear: Device lookup failed
Oct 9 22:33:36 somebox kernel: device-mapper: ioctl: error adding target to table
Oct 9 22:40:01 somebox kernel: ReiserFS: dm-13: warning: vs-4080: reiserfs_free_block: free_block (dm-13:13061735)[dev:blocknr]: bit already cleared
Oct 9 22:40:01 somebox kernel: ReiserFS: dm-13: warning: vs-4080: reiserfs_free_block: free_block (dm-13:13061734)[dev:blocknr]: bit already cleared
...and many thousands more complaints from ReiserFS. Given the error
messages (especially "/dev/sdc: Moved: 60.0%") and the speed with which
the destruction occurred, my working hypothesis is that the first 60% of
the LV was repointed to the destination PV, but that the data was left
behind on the source PV.
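As a rough sanity check of that hypothesis, the figures from the log can be worked through. This is only a sketch under stated assumptions: that "100gb" means exactly 100 GiB, that the reported percentage applies to the 800 extents being moved, and that the "real" time of about 22 seconds covers the whole operation.

```shell
#!/bin/sh
# Figures from the pvmove log; lv_size_gib=100 is an assumption
# (the post only says "100gb").
lv_size_gib=100
total_extents=800      # "Moving 800 extents of logical volume switch/archlogs"
moved_percent=60       # "/dev/sdc: Moved: 60.0%"
elapsed_s=22           # "real 0m21.789s", rounded up

# Extent size in MiB: LV size divided by the number of extents.
extent_mib=$(( lv_size_gib * 1024 / total_extents ))

# Extents and data covered by the reported percentage.
moved_extents=$(( total_extents * moved_percent / 100 ))
moved_mib=$(( moved_extents * extent_mib ))
moved_gib=$(( moved_mib / 1024 ))

# Copy rate that would be needed if that data had really been moved.
rate_mib_s=$(( moved_mib / elapsed_s ))

echo "extent size:            ${extent_mib} MiB"
echo "extents reported moved: ${moved_extents} of ${total_extents}"
echo "data reported moved:    ${moved_gib} GiB"
echo "implied copy rate:      ${rate_mib_s} MiB/s"
```

Under these assumptions the reported 60% corresponds to tens of GiB that would have had to be copied in about 22 seconds, which looks consistent with the mapping having changed without the data actually being copied.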
Are there any known issues with pvmove? Is pvmove a supported
operation? I had many pvmove-induced kernel oopses under SUSE 9.3, but
until this incident it had worked fine under openSUSE 10.2 for at
least 10 pvmoves on various boxes, all under load.
Thanks,
Brian