[linux-lvm] Snapshot behavior on classic LVM vs ThinLVM

Wed Feb 28 19:07:08 UTC 2018

Hi all,

On 28-02-2018 10:26 Zdenek Kabelac wrote:
> Overprovisioning on DEVICE level simply IS NOT equivalent to full
> filesystem like you would like to see all the time here and you've
> been already many times explained that filesystems are simply not
> there ready - fixes are on going but it will take its time and it's
> really pointless to exercise this on 2-3 year old kernels...

This was really beaten to death in the threads of the past months. I
generally agree with Zdenek.

To recap (Zdenek, correct me if I am wrong): the main problem is that,
on a full pool, async writes will more-or-less silently fail (with errors
shown in dmesg, but nothing more). Another possible source of problems is
that, even on a full pool, *some* writes will complete correctly (the
ones to already allocated chunks).
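
To make the second point concrete, here is a toy model in Python. It is
only a sketch of the allocation idea, not how dm-thin is really
implemented; the class and its names are made up for the example.

```python
class ToyThinPool:
    """Toy model: a pool with a fixed number of physical chunks."""

    def __init__(self, physical_chunks):
        self.free = physical_chunks
        # (volume, virtual_chunk) pairs that already own a physical chunk
        self.mapped = set()

    def write(self, volume, chunk):
        """Return True if the write can complete, False if it fails."""
        key = (volume, chunk)
        if key in self.mapped:
            return True       # overwrite in place: no allocation needed
        if self.free == 0:
            return False      # a new chunk is needed, but the pool is full
        self.free -= 1
        self.mapped.add(key)
        return True


pool = ToyThinPool(physical_chunks=2)
results = [
    pool.write("data", 0),    # allocates the first physical chunk
    pool.write("data", 7),    # allocates the second one: pool is now full
    pool.write("data", 0),    # chunk already mapped
    pool.write("data", 9),    # would need a third chunk
]
print(results)                # [True, True, True, False]
```

So on the same full pool one write succeeds and the next one fails,
depending only on whether the target chunk was already allocated.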

In the past it was argued that putting the entire pool in read-only mode
(where *all* writes fail, but reads are permitted to complete) would be a
better fail-safe mechanism; however, it was stated that no current
dm target permits that.

Two (good) solutions were given, both relying on scripting (see the
"thin_command" option in lvm.conf):
- fsfreeze on a nearly full pool (i.e. >=98%);
- replace the dm-thin target with the error target (using dmsetup).
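
As a rough sketch of the decision such a script has to take, something
like the following Python could work. It only builds the command lines
and runs nothing. The function name, the mount point and the device
name are hypothetical, and using "dmsetup wipe_table" for the second
option is my assumption of how the error target would be put in place.
How the script gets the pool usage (for example from lvs) is left out.

```python
FREEZE_AT = 98.0  # percent; the ">=98%" figure from the list above


def plan_action(data_percent, mountpoints, thin_devices,
                use_error_target=False):
    """Return the commands (as argv lists) to run for a given pool usage.

    mountpoints:  filesystems living on thin volumes of the pool
    thin_devices: device-mapper names of those thin volumes
    """
    if data_percent < FREEZE_AT:
        return []  # pool not critically full: nothing to do
    if use_error_target:
        # Option 2: swap each thin volume's table for one failing all I/O
        return [["dmsetup", "wipe_table", dev] for dev in thin_devices]
    # Option 1: freeze the filesystems so no new writes reach the pool
    return [["fsfreeze", "--freeze", mp] for mp in mountpoints]


# Example with made-up names:
cmds = plan_action(98.5, ["/srv/data"], ["vg0-data"])
print(cmds)
```

A real script would pass each returned list to subprocess.run() and log
the outcome; a frozen filesystem has to be released later with
"fsfreeze --unfreeze" once space has been added to the pool.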

I really think that, with the good scripting infrastructure currently
built into lvm, this is a more-or-less solved problem.

> Do NOT take thin snapshot of your root filesystem so you will avoid
> thin-pool overprovisioning problem.

But is someone *really* pushing thinp for the root filesystem? I have
always used it for data partitions only... Sure, rollback capability on
root is nice, but it is the data that is *really* important.

> Thin-pool was never targeted for 'regular' usage of full thin-pool.
> Full thin-pool is serious ERROR condition with bad/ill effects on 
> systems.
> Thin-pool was designed to 'delay/postpone' real space usage - aka you
> can use more 'virtual' space with the promise you deliver real storage
> later.

In stress testing, I never saw a system crash on a full thin pool, but I
was not using it for the root filesystem. Are there any ill effects on
system stability that I need to know about?

>> When my root snapshot fills up and gets dropped, I lose my undo 
>> history, but at least my root filesystem won't lock up.

We discussed that in the past too, but as snapshot volumes really are
*regular*, writable volumes (with a 'k' flag to skip activation by
default), the LVM team takes the "safe" stance of not automatically
dropping any volume.

The solution is to use scripting/thin_command with lvm tags. For
example:
- tag all snapshots with a "snap" tag;
- when usage is dangerously high, drop all volumes with the "snap" tag.
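
A minimal sketch of the two steps, again in Python and again only
building the commands rather than running them. The 95% threshold is a
placeholder for "dangerously high", and the function names and the
VG/LV names in the example are hypothetical.

```python
def snapshot_cmd(vg, origin, name, tag="snap"):
    """lvcreate command for a thin snapshot that carries the tag."""
    return ["lvcreate", "--snapshot", "--addtag", tag,
            "--name", name, f"{vg}/{origin}"]


def drop_cmd(data_percent, drop_at=95.0, tag="snap"):
    """lvremove command for all tagged volumes, or None below threshold."""
    if data_percent < drop_at:
        return None
    # "@tag" on the command line selects every LV carrying that tag
    return ["lvremove", "--yes", f"@{tag}"]


print(snapshot_cmd("vg0", "data", "data_snap1"))
print(drop_cmd(90.0))
print(drop_cmd(96.0))
```

Note that "@snap" matches tagged volumes in every volume group, so the
tag should be reserved for snapshots that really are disposable.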

>> However, I don't have the space for a full copy of every filesystem, 
>> so if I snapshot, I will automatically overprovision.
> 
> Back to rule #1 - thin-p is about 'delaying' deliverance of real space.
> If you already have plan to never deliver promised space - you need to
> live with consequences....

I am not sure I agree 100% on that. Thinp is not only about "delaying"
space provisioning; it clearly is also (mostly?) about fast, modern,
usable snapshots. Docker, snapper, stratis, etc. all use thinp mainly
for its fast, efficient snapshot capability. Denying that is not so
useful and leads to "overwarning" (i.e. when snapshotting a volume on a
virtually-fillable thin pool).

> 
> !SNAPSHOTS ARE NOT BACKUPS!
> 
> This is the key problem with your thinking here (unfortunately you are
> not 'alone' with this thinking)

Snapshots are not backups, as they do not protect from hardware problems
(and denying that would be lame); however, they are an invaluable *part*
of a successful backup strategy. Having multiple rollback targets, even
on the same machine, is a very useful tool.

> We do provide quite good 'scripting' support for this case - but again 
> if
> the system can't crash - you can't use thin-pool for your root LV or
> you can't use over-provisioning.

Again, I don't understand why we are speaking about system crashes. With
root *not* on thinp, I never saw a system crash due to a full data
pool.

Oh, and I use thinp on RHEL/CentOS only (Debian/Ubuntu backports are way 
too limited).

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8