[linux-lvm] Snapshot behavior on classic LVM vs ThinLVM

Tue Feb 27 18:39:44 UTC 2018

Zdenek Kabelac wrote on 24-04-2017 23:59:

>>> I'm just curious - what do you think will happen when you have
>>> root_LV as thin LV and thin pool runs out of space - so 'root_LV'
>>> is replaced with 'error' target.
>> 
>> Why do you suppose Root LV is on thin?
>> 
>> Why not just stick to the common scenario when thin is used for extra 
>> volumes or data?
>> 
>> I mean to say that you are raising an exceptional situation as an 
>> argument against something that I would consider quite common, which 
>> doesn't quite work that way: you can't prove that most people would 
>> not want something by raising something most people wouldn't use.
>> 
>> I mean to say let's just look at the most common denominator here.
>> 
>> Root LV on thin is not that.
> 
> Well then you might be surprised - there are users using exactly this.

I am sorry, this was a long time ago.

I was concerned with thin full behaviour and I guess I was concerned 
with being able to limit thin snapshot sizes.

I said that application failure was acceptable, but system failure was 
not.

Then you brought up root on thin as a way of "upping the ante".

I contended that this is a bigger problem to tackle, but it shouldn't 
mean you shouldn't tackle the smaller problems.

(The smaller problem being data volumes).

Even if root is on thin and you are using it for snapshotting, it would 
be extremely unwise to overprovision such a thing or to depend on 
"additional space" being added by the admin; root filesystems are not 
meant to be expandable.

If on the other hand you do count on overprovisioning (due to snapshots) 
then being able to limit snapshot size becomes even more important.
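To make "overprovisioning due to snapshots" concrete, here is a rough 
sketch of the arithmetic. It assumes the worst case in which every thin 
volume and every snapshot ends up fully diverged, so that the sum of the 
virtual sizes is what the pool would have to hold. It ignores pool 
metadata, and the sizes are made-up example numbers.

```python
def overprovision_ratio(pool_size_gib, virtual_sizes_gib):
    """Return the sum of all virtual sizes divided by the pool size."""
    if pool_size_gib <= 0:
        raise ValueError("pool size must be positive")
    return sum(virtual_sizes_gib) / pool_size_gib


def worst_case_shortfall_gib(pool_size_gib, virtual_sizes_gib):
    """GiB missing if every thin volume and snapshot fully diverged."""
    return max(0, sum(virtual_sizes_gib) - pool_size_gib)


# Hypothetical layout: a 100 GiB pool holding a 60 GiB data volume
# and two thin snapshots of it (each with the same virtual size).
sizes = [60, 60, 60]
ratio = overprovision_ratio(100, sizes)
shortfall = worst_case_shortfall_gib(100, sizes)
print(f"ratio={ratio:.2f} shortfall={shortfall} GiB")
```

A ratio above 1 means the pool can fill up without any single volume 
misbehaving, which is why a per-snapshot limit would matter here.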

> When you have rootLV on thinLV - you could easily snapshot it before
> doing any upgrade and revert back in case something fails on upgrade.
> See also projects like snapper...

True enough, but if you risk filling your pool because you don't have 
full room for a full snapshot, that would be extremely unwise. I'm also 
not sure write performance for a single snapshot is very much different 
between thin and non-thin?

They are both CoW: when you write to an existing block that is shared 
with the snapshot, it has to be duplicated first. Thin is only faster 
for writes to blocks that were not allocated yet, right?

I simply cannot reconcile an attitude that thin-full-risk is acceptable 
and the admin's job while at the same time advocating it for root 
filesystems.

Now most of this thread I was under the impression that "SYSTEM HANGS" 
were the norm because that's the only thing I ever experienced (kernel 
3.x and kernel 4.4 back then), however you said that this was fixed in 
later kernels.

So given that, some of the disagreement here was void as apparently no 
one advocated that these hangs were acceptable ;-).

:).

>> I have tried it, yes. Gives troubles with Grub and requires thin 
>> package to be installed on all systems and makes it harder to install 
>> a system too.
> 
> lvm2 is cooking some better boot support atm....

Grub-probe couldn't find the root volume so I had to maintain my own 
grub.cfg.

Regardless, if I ever used this again, I would take care to never 
overprovision, or to only overprovision at low risk with respect to 
snapshots.

I.e. you could thin provision root + var or something similar, but I 
would always put data volumes (home etc.) elsewhere.

I.e. not share the same pool.

Until now I was using a regular snapshot, but I allocated it too small 
and it always got dropped much faster than I anticipated.

(A 1GB snapshot constantly filling up with even minor upgrade 
operations).
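A rough sketch of why such a snapshot fills faster than the amount of 
"changed data" suggests: a classic snapshot stores whole chunks, so each 
distinct chunk touched on the origin costs one full chunk in the 
snapshot. The 4 KiB default chunk size below is an assumption, the write 
list is invented, and the snapshot's own bookkeeping overhead is 
ignored.

```python
def chunks_touched(writes, chunk_size):
    """Return the set of chunk indexes hit by (offset, length) writes."""
    touched = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // chunk_size
        last = (offset + length - 1) // chunk_size
        touched.update(range(first, last + 1))
    return touched


def cow_bytes(writes, chunk_size=4096):
    """Snapshot space used: one full chunk per distinct chunk touched."""
    return len(chunks_touched(writes, chunk_size)) * chunk_size


# Hypothetical writes to the origin: (byte offset, byte length).
writes = [(0, 512), (100, 512), (8192, 1)]
print(cow_bytes(writes))         # 1025 bytes written, 2 chunks used
print(cow_bytes(writes, 65536))  # same writes with 64 KiB chunks
```

Small scattered writes (package databases, logs, caches) are therefore 
much more expensive than their byte count implies.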

>> Thin root LV is not the idea for most people.
>> 
>> So again, don't you think having data volumes produce errors is 
>> preferable to having the entire system hang?
> 
> Not sure why you insist the system hangs.
> 
> If the system hangs - and you have a recent kernel & lvm2 - you should 
> file a bug.
> 
> If you set  '--errorwhenfull y'  - it should instantly fail.
> 
> There should not be any hanging..

Right well Debian Jessie and Ubuntu Xenial just experienced that.
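For reference, a minimal sketch of how the setting mentioned above would 
be applied to an existing pool. It only builds the lvchange command by 
default (dry run); "vg0" and "pool0" are placeholder names, and I am not 
claiming this cures the hangs on the older kernels.

```python
import subprocess


def errorwhenfull_command(vg, pool, enable=True):
    """Build the lvchange call that toggles --errorwhenfull on a pool."""
    flag = "y" if enable else "n"
    return ["lvchange", "--errorwhenfull", flag, f"{vg}/{pool}"]


def apply_errorwhenfull(vg, pool, enable=True, dry_run=True):
    """Return the command; run it only when dry_run is False (needs root)."""
    cmd = errorwhenfull_command(vg, pool, enable)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


# Placeholder names; prints the command instead of executing it.
print(" ".join(apply_errorwhenfull("vg0", "pool0")))
```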

>> That's irrelevant; if the thin pool is full you need to mitigate it, 
>> rebooting won't help with that.
> 
> well it's really the admin's task to solve the problem after panic 
> call (adding new space).

That's a lot easier if your root filesystem doesn't lock up.

;-).

Good luck booting to some rescue environment on a VPS or with some boot 
stick on a PC; the Ubuntu rescue environment for instance has been 
abysmal since systemd.

You can't actually use the rescue environment because there is some 
weird interaction with systemd spewing messages and causing weird 
behaviour on the TTY you are supposed to work on.

The initrd shell works, but the "full rescue" systemd target does not.

My point with this thread was.....

When my root snapshot fills up and gets dropped, I lose my undo history, 
but at least my root filesystem won't lock up.

I just calculated the size too small and I am sure I can also put a 
snapshot IN a thin pool for a non-thin root volume?

Haven't tried.

However, I don't have the space for a full copy of every filesystem, so 
if I snapshot, I will automatically overprovision.

My snapshots of data volumes are indeed meant for backups, not for 
rollback; rollback is only for the root filesystem.

So: my thin snapshots are meant for backup,
     my root snapshot (non-thin) is meant for rollback.

But, if any application really misbehaved... previously the entire 
system would crash (kernel 3.x).

So, because I cannot limit my (backup) snapshots in size, the only 
defense is constant monitoring and emails or even tty/pty broadcasts, 
because sometimes it is just human error where you copy the wrong thing 
to the wrong place.

With sufficient monitoring I guess that is not much of an issue.
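A minimal sketch of what I mean by monitoring. It assumes the text comes 
from something like "lvs --noheadings --separator , -o 
lv_name,data_percent,metadata_percent" run in the C locale; the sample 
output and the 80% threshold are invented for illustration.

```python
def parse_lvs(output, separator=","):
    """Parse 'name,data%,metadata%' lines into (name, data, meta) tuples."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        name, data, meta = (field.strip() for field in line.split(separator))
        rows.append((name,
                     float(data) if data else None,
                     float(meta) if meta else None))
    return rows


def over_threshold(rows, threshold=80.0):
    """Return (name, worst_percent) for pools at or above the threshold."""
    alerts = []
    for name, data, meta in rows:
        worst = max((v for v in (data, meta) if v is not None), default=None)
        if worst is not None and worst >= threshold:
            alerts.append((name, worst))
    return alerts


# Hypothetical lvs output for two thin pools.
sample = "  pool0,91.50,12.30\n  pool1,40.00,3.10\n"
for name, worst in over_threshold(parse_lvs(sample)):
    print(f"WARNING: {name} is {worst:.1f}% full")
```

The print could be replaced by a mail or a terminal broadcast; the 
point is only that it has to run often enough to beat a runaway copy.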

> Thin users can't expect to overload system in crazy way and expect the
> system will easily do something magical to restore all data.

That was never asked.

My problem was system hangs, but my question was about limiting snapshot 
size on thin.

However, userspace response scripts were obviously possible, including 
those that would prioritize dropping thin snapshots over other measures.
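A sketch of the selection step such a script might use: drop the oldest 
thin snapshots first until the pool is estimated to be back under a 
target. The "exclusive_mib" figure (space freed by removing a snapshot) 
is an estimate the caller would have to supply; all names and numbers 
here are hypothetical, and nothing is actually removed.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    name: str
    created: int        # Unix timestamp of snapshot creation
    exclusive_mib: int  # caller's estimate of space freed on removal


def snapshots_to_drop(snapshots, pool_size_mib, used_mib, target_percent):
    """Pick oldest snapshots until estimated usage is at or below target."""
    to_free = used_mib - pool_size_mib * target_percent / 100
    chosen = []
    for snap in sorted(snapshots, key=lambda s: s.created):
        if to_free <= 0:
            break
        chosen.append(snap.name)
        to_free -= snap.exclusive_mib
    return chosen


# Hypothetical pool: 1000 MiB, 950 MiB used, aim for 80% or less.
snaps = [
    Snapshot("backup-c", created=3, exclusive_mib=100),
    Snapshot("backup-a", created=1, exclusive_mib=100),
    Snapshot("backup-b", created=2, exclusive_mib=100),
]
# A real script would then remove each chosen snapshot with lvremove.
print(snapshots_to_drop(snaps, 1000, 950, 80))
```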