[linux-lvm] Reserve space for specific thin logical volumes

Zdenek Kabelac zkabelac at redhat.com
Wed Sep 13 08:15:44 UTC 2017


On 13.9.2017 at 09:53, Gionatan Danti wrote:
> On 13-09-2017 01:22, matthew patton wrote:
>>> Step-by-step example:
>>  > - create a 40 GB thin volume and subtract its size from the thin
>> pool (USED 40 GB, FREE 60 GB, REFER 0 GB);
>>  > - overwrite the entire volume (USED 40 GB, FREE 60 GB, REFER 40 GB);
>>  > - snapshot the volume (USED 40 GB, FREE 60 GB, REFER 40 GB);
>>
>> And 3 other threads also take snapshots against the same volume, or
>> frankly any other volume in the pool.
>> Since the next step (overwrite) hasn't happened yet or has written
>> less than 20GB, all succeed.
>>
>>  > - completely overwrite the original volume (USED 80 GB, FREE 20 GB,
>> REFER 40 GB);
>>
>> 4 threads all try to write their respective 40GB. After all, they got
>> the green light since their snapshot was allowed to be taken.
>> Your thinLV blows up spectacularly.
>>
>>  > - a new snapshot creation will fail (REFER is higher than FREE).
>> Nobody cares about new snapshot creation attempts at this point.
>>
>>
>>> When do you decide it? (you need to see this is total race-land)
>>
>> exactly!
> 
> In all the examples I gave, the snapshots are supposed to be read-only, or at 
> least never written. I thought that was implicitly clear, since ZFS (used as 
> the example) makes snapshots read-only by default. Sorry for not stating that 
> explicitly.
> 

Ohh, this is a pretty major constraint ;)

But as pointed out multiple times - with scripting around the various fullness 
levels of a thin-pool - several different actions can be programmed, starting 
from fstrim and ending with a plain erase of an unneeded snapshot
(maybe even erasing unneeded files....) - see the sketch below.
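
A minimal sketch of such a scriptlet, assuming a hypothetical VG 'vg' with a
thin-pool 'pool', a filesystem mounted at /mnt/thinlv and a disposable
snapshot 'snap1' (the thresholds and the cleanup policy are illustrative only):

    #!/bin/sh
    # React to thin-pool fullness; could be run from cron or
    # hooked into dmeventd monitoring.
    FULLNESS=$(lvs --noheadings -o data_percent vg/pool | tr -d ' ')
    FULLNESS=${FULLNESS%%.*}   # drop the decimal part

    # Above 70% used: return free blocks from the filesystem.
    [ "$FULLNESS" -ge 70 ] && fstrim /mnt/thinlv

    # Above 90% used: sacrifice the least important snapshot.
    [ "$FULLNESS" -ge 90 ] && lvremove -f vg/snap1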

To get the most robust application - such an app should actually avoid using 
the page-cache (i.e. use direct-io). In that case you are always guaranteed
to get the exact error at the exact time (even without the journaled mount 
option for ext4....)
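
For illustration, a direct-io write that bypasses the page-cache can be done
with plain dd (the device name is just an example):

    # With oflag=direct the writer gets the provisioning error
    # immediately, instead of a deferred writeback failure.
    dd if=/dev/zero of=/dev/vg/thinlv bs=1M count=100 oflag=direct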


> After the last write, the cloned cvol1 is clearly corrupted, but the original 
> volume has no problem at all.

Surely there is a good reason we still keep the 'old snapshots' with us - 
although everyone knows their implementation has aged :)

There are cases where this copying into separate COW areas simply works better 
- especially for short-lived objects with a low number of 'small' changes.

We even support old-style snapshots of thin volumes for this reason - so you 
can use 'bigger' thin-pool chunks, while for a temporary snapshot taken for a
backup you can take an old-style snapshot of the thin volume (example below)...
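
For example - passing an explicit COW size to 'lvcreate -s' on a thin volume
produces an old-style snapshot instead of a thin one (the names and the 2G
size are illustrative):

    # Old-style (COW) snapshot of a thin LV: an explicit -L size
    # allocates a classic COW area rather than a thin snapshot.
    lvcreate -s -L 2G -n backup_snap vg/thinlv

    # ... take the backup from /dev/vg/backup_snap ...

    lvremove -f vg/backup_snap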


> 
> This was more or less the case with classical, fat LVM: a snapshot running out 
> of space *will* fail, but the original volume remains unaffected.

Partially this might get solved in 'some' cases with fully provisioned thinLVs 
within the thin-pool...
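
There is no dedicated 'fully provision' option, but a thinLV can be
pre-allocated simply by writing it end-to-end once - later overwrites then
never need new chunks from the pool, as long as no snapshot shares its chunks
(the device name is hypothetical):

    # Touch every chunk once, so the pool allocation is complete.
    dd if=/dev/zero of=/dev/vg/thinlv bs=1M oflag=direct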

What comes to my mind as a possible supporting solution is an enhancement on 
the LVM2 side: 'forcible' removal of running volumes (i.e. an lvm2 equivalent 
of 'dmsetup remove --force').

ATM lvm2 prevents you from removing 'running/mounted' volumes.

I can well imagine LVM letting you forcibly replace such an LV with an error 
target - so instead of a thinLV you will have a single 'error' target 
snapshot - which could possibly even be auto-cleaned once the volume 
use-count drops to 0 (lvmpolld/dmeventd monitoring, whatever...).

(Of course - we are not solving what happens to an application using/running 
on top of such an error target - hopefully nothing too bad....)
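
What 'dmsetup remove --force' does can also be sketched by hand - swapping the
live table for an 'error' target (the device-mapper name 'vg-thinlv' is
hypothetical):

    # Replace the live mapping with an 'error' target, as
    # 'dmsetup remove --force' does before deferred removal.
    SECTORS=$(blockdev --getsz /dev/mapper/vg-thinlv)
    dmsetup suspend vg-thinlv
    dmsetup reload vg-thinlv --table "0 $SECTORS error"
    dmsetup resume vg-thinlv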

This way you get a very 'powerful' weapon to be used in those 'scriptlets',
so you can drop unneeded volumes ANYTIME you need and reclaim their resources...

Regards

Zdenek



