[linux-lvm] Reserve space for specific thin logical volumes

Thu Sep 21 09:49:18 UTC 2017

On 20.9.2017 at 15:05, Xen wrote:
> Gionatan Danti wrote on 18-09-2017 21:20:
> 
>> Xen, I really think that the combination of hard-threshold obtained by
>> setting thin_pool_autoextend_threshold and thin_command hook for
>> user-defined script should be sufficient to prevent and/or react to
>> full thin pools.
> 
> I will hopefully respond to Zdenek's message later (and the one before that 
> that I haven't responded to),
> 
>> I'm all for the "keep it simple" on the kernel side.
> 
> But I don't mind if you focus on this,
> 
>> That said, I would like to see some pre-defined scripts to easily
>> manage pool fullness. (...) but I would really
>> like the standardisation such predefined scripts imply.
> 
> And only provide scripts instead of kernel features.
> 
> Again, the reason I am also focussing on the kernel is because:
> 
> a) I am not convinced it cannot be done in the kernel
> b) A kernel feature would make space reservation very 'standardized'.

Hi

Let me shed some more light on the existing state, as this is really not about
what can and what cannot be done in the kernel - clearly you can do
'everything' in the kernel, if you have the code for it...

I'm explaining here the position of lvm2, which is a user-space project (since
we are on the lvm2 list). lvm2 uses the 'existing' dm kernel target, which
provides thin-provisioning (and has its own configurables). So this is the
kernel piece, and it differs from its user-space lvm2 counterpart.

Surely there is cooperation between these two - but anyone else can write some
other 'dm' target - and lvm2 can extend its support to a given target/segment
type if such a target is used by users.

In practice your 'proposal' is quite different from the existing target -
essentially a major rework, if not a whole new re-implementation - as it's not
the 'few-line' patch extension you might believe or hope it to be.

I can explain (and effectively I've already spent a lot of time explaining) the
existing logic and why this is really hardly doable with the current design,
but we cannot work on support for a 'hypothetical' non-existing kernel target
from the lvm2 side - so you need to start from the 'ground-zero' level of dm
target design... or you need to 'reevaluate' your vision to be more in touch
with what the existing kernel target offers...

However, we believe our existing 'user-space' solution can cover the most
common use-cases, and we might just have 'big holes' in our documentation,
which should better explain the reasoning and guide users to use the existing
technology in a more optimal way.
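
As a rough illustration of what such user-space handling can look like, here
is a minimal Python sketch of a policy function that a custom
dmeventd/thin_command handler could call. Everything in it is an assumption
made for the example: the names choose_action and usage_from_env, the 70%/95%
limits and the three action names are made up, and the
DMEVENTD_THIN_POOL_DATA / DMEVENTD_THIN_POOL_METADATA environment variables
are, as far as I know, what lvmthin(7) describes for custom handlers, so
check the man page of your lvm2 version. The sketch only decides what to do;
the actual reaction is left to the admin.

```python
import os


def choose_action(data_pct, metadata_pct, warn_at=70.0, critical_at=95.0):
    """Map pool usage (in percent) to a hypothetical action name.

    The fuller of the two pool devices (data or metadata) decides.
    """
    worst = max(data_pct, metadata_pct)
    if worst >= critical_at:
        return "emergency"   # e.g. stop writers, remount read-only
    if worst >= warn_at:
        return "extend"      # e.g. try to grow the pool
    return "ok"


def usage_from_env(env=None):
    """Read pool usage from the environment (variable names are assumed)."""
    env = os.environ if env is None else env
    data = float(env.get("DMEVENTD_THIN_POOL_DATA", "0"))
    meta = float(env.get("DMEVENTD_THIN_POOL_METADATA", "0"))
    return data, meta


if __name__ == "__main__":
    # Print the decision; a real handler would act on it instead.
    print(choose_action(*usage_from_env()))
```

Keeping the decision in a small pure function means the policy can be tested
without a real thin-pool, and different sites can plug in different limits.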

> 
> The point is that kernel features make it much easier to standardize and to 
> put some space reservation metric in userland code (it becomes a default 
> feature) and scripts remain a little bit off to the side.

Maintenance/development/support of kernel code is more expensive - it's
usually very easy to upgrade a small, encapsulated 'user-space' package -
compared with major changes on the kernel side.

So that's where the dm/lvm2 design comes from - do the 'minimum necessary'
inside the kernel and maximize the usage of user-space.

Of course this decision makes some tasks harder (i.e. there are surely
problems which would not even exist if it were all done in the kernel) - but
lots of other things are way easier - you really can't compare those....

Yeah - standards are always a problem :)  i.e. Xorg & Wayland....
but it's way better to play with user-space than to play with the kernel....

> However if we *can* standardize on some tag or way of _reserving_ this space, 
> I'm all for it.

The problems of a desktop user with a 0.5TB SSD are often different from those
of servers using 10PB across multiple network-connected nodes.

I see you call for one standard - but it's very very difficult...

> I think a 'critical' tag in combination with the standard autoextend_threshold 
> (or something similar) is too loose and ill-defined and not very meaningful.

We aim to deliver rock-solid bricks to admins.

Whether you build a small house or a Southfork out of them is then the admin's
choice.

We have spent really a lot of time thinking about whether there is some sort
of 'one-ring-to-rule-them-all' solution - but we can't see it yet - possibly
because we know a wider range of use-cases, compared with an individual
user-focused problem.

> And I would prefer to set individual space reservation for each volume even if 
> it can only be compared to 5% threshold values.

Which needs a 'different' kernel target driver (and possibly some way to
kill/split the page-cache so that it works on a 'per-device' basis....)

And just as an illustration of the problems you need to start solving for this
design:

You have an origin and 2 snaps.
You set different 'thresholds' for these volumes.
You then overwrite the 'origin', and you have to maintain 'data' for the OTHER
LVs.
So you get into a position where a 'WRITE' to the origin will invalidate a
volume that is NOT even active (without lvm2 even being aware of it).
So suddenly the rather simple individual thinLV targets will have to maintain
the whole 'data set' and cooperate with all other active thin targets in case
they share some data... - so in effect the WHOLE data tree needs to be
permanently accessed. This could be OK when you focus on the use of 3 volumes
with at most a couple hundred GiB of addressable space - but it does not 'fit'
well for 1000 LVs and PBs of addressable data.
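
The accounting side of this example can be sketched with a toy model in
Python. This is NOT how the dm thin-pool target is implemented; ToyThinPool
and its methods are hypothetical names, and the model only counts how many
volumes map each pool block. It is just meant to show that the pool space
consumed by a write to the origin depends on volumes that are not being
written to at all.

```python
class ToyThinPool:
    """Toy model of block sharing between an origin and its snapshots."""

    def __init__(self, total_blocks):
        self.total = total_blocks
        self.refs = {}     # pool block id -> number of volumes mapping it
        self.maps = {}     # volume name -> {logical block: pool block id}
        self.next_id = 0

    def used(self):
        return len(self.refs)

    def _alloc(self):
        if self.used() >= self.total:
            raise RuntimeError("pool full")
        pid = self.next_id
        self.next_id += 1
        self.refs[pid] = 1
        return pid

    def create(self, name):
        self.maps[name] = {}

    def snapshot(self, origin, name):
        # A snapshot copies only the mapping; every block becomes shared.
        self.maps[name] = dict(self.maps[origin])
        for pid in self.maps[name].values():
            self.refs[pid] += 1

    def write(self, name, lblock):
        mapping = self.maps[name]
        old = mapping.get(lblock)
        if old is not None and self.refs[old] == 1:
            return             # exclusively owned: overwrite in place
        new = self._alloc()    # unmapped or shared: needs a fresh block
        if old is not None:
            self.refs[old] -= 1   # old data stays alive for the other LVs
        mapping[lblock] = new


pool = ToyThinPool(total_blocks=8)
pool.create("origin")
for block in range(4):
    pool.write("origin", block)
pool.snapshot("origin", "snap1")
pool.snapshot("origin", "snap2")
before = pool.used()
for block in range(4):
    pool.write("origin", block)   # overwriting the origin breaks sharing
print(before, pool.used())
```

In this model the snapshots cost nothing when they are taken, and the bill
arrives later, on the origin's writes. Any per-volume reservation would have
to decide which volume those extra blocks are charged to, for every shared
block in the tree.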

Regards

Zdenek