[linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
zkabelac at redhat.com
Wed Feb 28 21:43:26 UTC 2018
On 28 Feb 2018 at 20:07, Gionatan Danti wrote:
> Hi all,
> On 28-02-2018 10:26 Zdenek Kabelac wrote:
>> Overprovisioning on DEVICE level simply IS NOT equivalent to a full
>> filesystem like you would like to see all the time here, and it has
>> already been explained to you many times that filesystems are simply
>> not there yet - fixes are ongoing, but it will take time, and it's
>> really pointless to exercise this on 2-3 year old kernels...
> this was really beaten to death in the past months/threads. I generally agree
> with Zdenek.
> To recap (Zdenek, correct me if I am wrong): the main problem is that, on a
> full pool, async writes will more-or-less silently fail (with errors shown in
> dmesg, but nothing more). Another possible source of problems is that, even on
> a full pool, *some* writes will complete correctly (the ones to already
> allocated chunks).
By default, a full pool starts to 'error' all 'writes' after 60 seconds.
> In the past it was argued that putting the entire pool in read-only mode (where
> *all* writes fail, but reads are permitted to complete) would be a better
> fail-safe mechanism; however, it was stated that no current dm target permits that.
Yep - I'd probably like to see a slightly different mechanism - where all
ongoing writes would fail. So far, some 'writes' will pass (those to already
provisioned areas) and some will fail (those to unprovisioned ones).
The main problem is that, after a reboot, this 'missing/unprovisioned' space
may present some old data...
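For reference, the 60-second window mentioned above is dm-thin's queueing
timeout; a pool can also be switched to fail writes immediately when it fills
up. A minimal command sketch (vg00/tpool is an assumed pool name, not anything
from this thread):

```shell
# Fail writes immediately once the pool is full, instead of queueing them
# for the 60s timeout (vg00/tpool is a placeholder pool name):
lvchange --errorwhenfull y vg00/tpool

# Check the current behaviour ("error" vs "queue"):
lvs -o lv_name,lv_when_full vg00/tpool
```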
> Two (good) solutions were given, both relying on scripting (see the "thin_command"
> option in lvm.conf):
> - fsfreeze on a nearly full pool (ie: >=98%);
> - replace the dmthinp target with the error target (using dmsetup).
Yep - this can all happen via 'monitoring'.
The key is to do it early, before disaster happens.
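To make the monitoring idea concrete, here is a hedged sketch of a
"thin_command"-style hook. The decision logic is kept separate from LVM so it
can run anywhere; the pool name vg00/tpool and the 98% threshold are
assumptions, not anything the thread specifies:

```shell
#!/bin/sh
# Sketch of a thin-pool monitoring hook (names and threshold are assumed).

# should_act USAGE THRESHOLD -> exit 0 when the (possibly fractional)
# usage percent has reached the integer threshold.
should_act() {
    usage="$1"; threshold="$2"
    # Strip any fractional part before the integer comparison.
    [ "${usage%%.*}" -ge "$threshold" ]
}

# In a real hook the usage would come from something like:
#   usage=$(lvs --noheadings -o data_percent vg00/tpool | tr -d ' ')
usage="${1:-0}"
if should_act "$usage" 98; then
    # Here one would fsfreeze the mounted filesystems or swap in the
    # error target - this sketch only reports the decision.
    echo "ACT"
else
    echo "OK"
fi
```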
> I really think that with the good scripting infrastructure currently built in
> lvm this is a more-or-less solved problem.
It still depends - there is always some sort of 'race' - unless you are
willing to 'give up' very early so as to always be safe, considering there are
technologies that may write many GB/s...
>> Do NOT take thin snapshot of your root filesystem so you will avoid
>> thin-pool overprovisioning problem.
> But is someone *really* pushing thinp for root filesystem? I always used it
You can use rootfs with thinp - it's very handy for testing e.g. upgrades
and quickly reverting back - there just has to be enough free space.
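The upgrade-and-revert workflow can be sketched as follows (vg00/root is an
assumed thin LV name; this is an illustration, not a recipe from the thread):

```shell
# Take a thin snapshot of the root LV before the upgrade
# (vg00/root is an assumed thin LV):
lvcreate -s --name root_pre_upgrade vg00/root

# Upgrade went wrong: schedule a merge back into the origin; for a
# mounted rootfs the merge is deferred until the next activation (reboot):
lvconvert --merge vg00/root_pre_upgrade

# Upgrade went fine: drop the snapshot to free its chunks:
lvremove -y vg00/root_pre_upgrade
```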
> In stress testing, I never saw a system crash on a full thin pool, but I was
> not using it on the root filesystem. Are there any ill effects on system
> stability which I need to know about?
Depends on the version of the kernel and the filesystem in use.
Note the RHEL/CentOS kernel has lots of backports even when it looks quite old.
> The solution is to use scripting/thin_command with lvm tags. For example:
> - tag all snapshot with a "snap" tag;
> - when usage is dangerously high, drop all volumes with "snap" tag.
Yep - every user has different plans in mind - scripting gives the user the
freedom to adapt this logic to local needs...
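A hedged sketch of the tag-based cleanup described above. The parsing is
factored into a function so the selection logic can be exercised without LVM
present; vg00 and the "snap" tag are assumed names:

```shell
#!/bin/sh
# Sketch: drop every LV carrying the "snap" tag once the pool is nearly full.

# Read "lv_full_name lv_tags" pairs on stdin and print the LVs whose
# tag list contains "snap" (tags are comma-separated in lvs output).
snaps_to_drop() {
    awk '$2 ~ /(^|,)snap(,|$)/ { print $1 }'
}

# A real hook would feed it live data and remove the matches, e.g.:
#   lvs --noheadings -o lv_full_name,lv_tags vg00 \
#       | snaps_to_drop | xargs -r -n1 lvremove -fy
```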
>>> However, I don't have the space for a full copy of every filesystem, so if
>>> I snapshot, I will automatically overprovision.
As long as the responsible admin controls space in the thin-pool and takes action
long before the thin-pool runs out of space, all is fine.
If the admin hopes for some kind of magic to happen - we have a problem....
>> Back to rule #1 - thin-p is about 'delaying' delivery of real space.
>> If you already have a plan to never deliver the promised space - you need to
>> live with the consequences....
> I am not sure I 100% agree on that. Thinp is not only about "delaying" space
> provisioning; it clearly is also (mostly?) about fast, modern, usable
> snapshots. Docker, snapper, stratis, etc. all use thinp mainly for its fast,
> efficient snapshot capability. Denying that is not so useful and has led to
> "overwarning" (i.e. when snapshotting a volume on a virtually-fillable thin pool).
Snapshots use space - with the hope that if you 'really' need that space,
you either add the space to your system - or you drop the snapshots.
Still the same logic applies....
>> !SNAPSHOTS ARE NOT BACKUPS!
>> This is the key problem with your thinking here (unfortunately you are
>> not 'alone' with this thinking).
> Snapshots are not backups, as they do not protect from hardware problems (and
> denying that would be lame); however, they are an invaluable *part* of a
> successful backup strategy. Having multiple rollback targets, even on the
> same machine, is a very useful tool.
Backups primarily sit on completely different storage.
If you keep backups of data in the same pool:
an error in a single chunk shared by all your backups + origin means
total data loss - especially in the case where the filesystem uses 'B-trees'
and some 'root node' is lost, which can easily render your origin + all
backups unusable; problems in thin-pool metadata can make all your
origin + backups just an unordered mess of chunks.
> Again, I don't understand why we are speaking about system crashes. On roots
> *not* using thinp, I never saw a system crash due to a full data pool.
> Oh, and I use thinp on RHEL/CentOS only (Debian/Ubuntu backports are way too
Yep - this case is known to be pretty stable.
But as said - with today's 'rush' of development and load of updates - users do
want to try a 'new distro upgrade' - if it works - all is fine - if it doesn't,
let's have a quick road back - so using a thin volume for rootfs is pretty handy.
Trouble is, there are quite a lot of issues that are non-trivial to solve.
There are also some ongoing ideas/projects - one of them was to have thinLVs
with priority to be always fully provisioned - so such a thinLV could never be
the one to have unprovisioned chunks....
Another was better integration of filesystems with 'provisioned' volumes.