[lvm-devel] Snapshotted device gets slow

MINOURA Makoto / 箕浦 真 minoura at valinux.co.jp
Wed Sep 18 02:44:23 UTC 2013


(I'm not on this list; please Cc me)

We are observing a problem where writes to a (thick) snapshot origin
device get dramatically slower on RHEL 6.3 than on EL 6.2 as the cow
device fills up.  I tracked the problem down and found that the cause
is commit f1aabd5c60ad721609e5d781e45c234ba62fe355 ("Set
delay_resume_if_new on deptree snapshot origin.").

Here is a detailed description.

On 6.3 the exception hash tables are far too small: both the completed
and the pending tables end up clipped at the lower bound of 64 buckets.

# lvcreate -n test -L 240g tank
  Logical volume "test" created
# lvcreate -s -n snap -l 100%FREE tank/test
  Logical volume "snap" created
# lvs /dev/tank
  LV   VG   Attr     LSize   Pool Origin Data%  Move Log Copy%  Convert
  snap tank swi-a-s- 207.13g      test     0.00                        
  test tank owi-a-s- 240.00g
# dmsetup ls | grep tank
tank-snap-cow	(253:6)
tank-snap	(253:4)
tank-test	(253:3)
tank-test-real	(253:5)
# crash
....
crash> print _origins[5]
$3 = {
  next = 0xffff880297977c08, 
  prev = 0xffff880297977c08
}
crash> struct -ox origin.hash_list
struct origin {
   [0x8] struct list_head hash_list;
}
crash> print *(struct origin*)0xffff880297977c00
$4 = {
  bdev = 0xffff880297519c80, 
  hash_list = {
    next = 0xffff880293c1b050, 
    prev = 0xffff880293c1b050
  }, 
  snapshots = {
    next = 0xffff880293456a40, 
    prev = 0xffff880293456a40
  }
}
crash> struct -ox dm_snapshot.list
struct dm_snapshot {
   [0x40] struct list_head list;
}
crash> print ((struct dm_snapshot*)0xffff880293456a00)->origin->name
$5 = "253:5\000\000\000\002\000\000\000st-a"
crash> print ((struct dm_snapshot*)0xffff880293456a00)->complete
$6 = {
  hash_mask = 63,				<=== !!
  hash_shift = 8, 
  table = 0xffffc900033ea000
}
crash> print ((struct dm_snapshot*)0xffff880293456a00)->pending
$7 = {
  hash_mask = 63,				<=== !!
  hash_shift = 0, 
  table = 0xffffc900033ed000
}

hash_mask+1 is the number of buckets in the hash table.  On EL6.2 they
are 131072 and 16384 respectively.
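
To see why 64 buckets hurts so much, recall how a bucket is chosen.
As far as I can tell from the dumped fields, the exception hash picks
index = (chunk >> hash_shift) & hash_mask.  Below is a minimal
user-space sketch of mine (not kernel code) that plugs in the numbers
from this report; the 4KiB default chunk size is my assumption:

#include <stdio.h>
#include <stdint.h>

/* Bucket index as implied by the dumped fields. */
static uint32_t bucket(uint64_t chunk, unsigned shift, uint32_t mask)
{
	return (uint32_t)((chunk >> shift) & mask);
}

int main(void)
{
	/* 207g cow, 4KiB chunks (assumed) => ~54M exceptions when full */
	uint64_t nr_exceptions = (207ULL << 30) / 4096;

	printf("avg chain, 131072 buckets: %llu\n",
	       (unsigned long long)(nr_exceptions / 131072));  /* ~414 */
	printf("avg chain, 64 buckets:     %llu\n",
	       (unsigned long long)(nr_exceptions / 64));      /* ~848k */
	printf("chunk 1000000 -> bucket %u\n", bucket(1000000, 8, 63));
	return 0;
}

Every write to the origin has to look up (and possibly insert)
exceptions in these chains, so chains growing toward hundreds of
thousands of entries as the cow fills would explain the slowdown.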

They are initialized via dm_ctl_ioctl() => table_load() =>
snapshot_ctr() => init_hash_tables(), based on the smaller of the cow
device size and the origin device size:

static int init_hash_tables(struct dm_snapshot *s)
{
	[snip]

	hash_size = min(origin_dev_size, cow_dev_size) >> s->store->chunk_shift;
						/* <=== min(0, cow) == 0 */
	hash_size = min(hash_size, max_buckets);

	if (hash_size < 64)
		hash_size = 64;			/* <=== clipped up to 64 */
	hash_size = rounddown_pow_of_two(hash_size);

Here, because of that commit, the origin device has not been resumed
yet, so its size is reported as 0.  min(0, cow_dev_size) is therefore
0, and hash_size is clipped up to the lower bound of 64.
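
Replaying that arithmetic with concrete numbers shows both behaviours.
The sketch below is a plain user-space program (my illustration, not
kernel code); max_buckets = 131072 is inferred from the EL6.2 figure
above, and 4KiB chunks (chunk_shift = 3 in sectors) are assumed:

#include <stdio.h>
#include <stdint.h>

static uint64_t rounddown_pow_of_two(uint64_t n)
{
	uint64_t p = 1;
	while (p * 2 <= n)
		p *= 2;
	return p;
}

/* Replay of the init_hash_tables() sizing logic. */
static uint64_t nr_buckets(uint64_t origin_sectors, uint64_t cow_sectors)
{
	const uint64_t max_buckets = 131072;	/* inferred from EL6.2 */
	const unsigned chunk_shift = 3;		/* 4KiB chunks assumed */
	uint64_t hs;

	hs = (origin_sectors < cow_sectors ? origin_sectors
					   : cow_sectors) >> chunk_shift;
	if (hs > max_buckets)
		hs = max_buckets;
	if (hs < 64)
		hs = 64;
	return rounddown_pow_of_two(hs);
}

int main(void)
{
	uint64_t origin = 240ULL << 21;		/* 240g in 512B sectors */
	uint64_t cow    = 207ULL << 21;		/* ~207g in sectors */

	/* EL6.2: origin already resumed, real size visible => 131072 */
	printf("6.2: %llu buckets\n",
	       (unsigned long long)nr_buckets(origin, cow));

	/* EL6.3: origin not resumed yet, size reads as 0 => 64 */
	printf("6.3: %llu buckets\n",
	       (unsigned long long)nr_buckets(0, cow));
	return 0;
}

If I read the code right, the pending table is then sized from the
completed one (131072 >> 3 = 16384 on EL6.2, and again clipped to 64
on 6.3), which matches both dumps above.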


I'm not familiar with the device-mapper code and am not sure whether
reverting that commit is the right fix, but I don't think the current
behaviour is intended.
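
For example, one possible direction (an untested sketch on my side,
not a reviewed patch) would be to stop looking at the origin size here
and size the tables from the cow device alone; the hash only affects
lookup performance, and the number of exceptions can never exceed what
the cow can hold anyway:

static int init_hash_tables(struct dm_snapshot *s)
{
	sector_t hash_size, cow_dev_size, max_buckets;

	/* The cow device is always visible at ctr time, unlike the
	 * origin, which may not have been resumed yet. */
	cow_dev_size = get_dev_size(s->cow->bdev);
	max_buckets = calc_max_buckets();

	hash_size = cow_dev_size >> s->store->chunk_shift;
	hash_size = min(hash_size, max_buckets);
	if (hash_size < 64)
		hash_size = 64;
	hash_size = rounddown_pow_of_two(hash_size);
	[snip]

But maybe somebody who knows the code better can say whether that
breaks some other assumption.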

Thanks,

-- 
Minoura Makoto <minoura at valinux.co.jp>



