[linux-lvm] Reserve space for specific thin logical volumes
list at xenhideout.nl
Tue Sep 12 12:37:38 UTC 2017
Zdenek Kabelac wrote on 12-09-2017 13:46:
> What's wrong with BTRFS....
I don't think you are a fan of it yourself.
> Either you want fs & block layer tied together - that the btrfs/zfs
Gionatan's responses used only Block layer mechanics.
> or you want
> layered approach with separate 'fs' and block layer (dm approach)
Of course that's what I want or I wouldn't be here.
> If you are advocating here to start mixing 'dm' with 'fs' layer, just
> because you do not want to use 'btrfs' you'll probably not gain main
> traction here...
You know Zdenek, it often appears to me your job here is to dissuade
people from having any wishes or wanting anything new.
But if you look a little bit further, you will see that there is a lot
more possible within the space you define than you allow for in a
black & white vision.
"There are more things in Heaven and Earth, Horatio, than is dreamt of
in your philosophy" ;-).
I am pretty sure many of the impossibilities you cite spring from a
misunderstanding of what people want: you think they want something
extreme, but it is often much more modest than that.
Although personally I would not mind communication between layers, in
which the providing layer (DM) communicates some things to the using
layer (FS), 90% of the time that is not even needed to implement what
people want.
Also, we see ext4 being optimized around 4MB block sizes, right?
So that's an example of "interoperation" without mixing layers.
I think Gionatan has demonstrated that with pure block-layer
functionality it is possible to have more advanced protection that does
not need any knowledge about filesystems.
> We need to see EXACTLY which kind of crash do you mean.
> If you are using some older kernel - then please upgrade first and
> provide proper BZ case with reproducer.
Yes, apologies here: I responded to this earlier (perhaps a year ago),
and the systems I was testing on ran a 4.4 kernel. So I cannot currently
confirm it, and it has probably been solved already (you could be right).
Back then the crash was kernel messages on the TTY and then, after some
20-30 seconds, a total freeze, after I copied too much data to a (test)
thin pool.
Probably irrelevant now if already fixed.
> BTW you can imagine an out-of-space thin-pool with thin volume and
> filesystem as a FS, where some writes ends with 'write-error'.
> If you think there is OS system which keeps running uninterrupted,
> while number of writes ends with 'error' - show them :) - maybe we
> should stop working on Linux and switch to that (supposedly much
> better) different OS....
I don't see why you seem to think that devices cannot be logically
separated from each other in terms of their error behaviour.
If I had a system crashing because I wrote to some USB device that was
malfunctioning, that would not be a good thing either.
I have said repeatedly that the thin volumes are data volumes. The
entire system should not come crashing down.
I am sorry if I was basing myself on older kernels in those messages,
but my experience dates from a year ago ;-).
The Linux kernel has had other issues, with USB for example, that are
unacceptable; even Linus Torvalds himself complained about it:
queues filling up because of pending writes to a USB device, and the
entire system grinds to a halt.
> You can have different pools and you can use rootfs with thins to
> easily test i.e. system upgrades....
Sure, but in the past GRUB2 would not work well with thin; I was basing
myself on that...
I do not see a real issue with using a thin rootfs myself, but
grub-probe didn't work back then, and an OpenSUSE/GRUB developer
attested that GRUB did not have thin support for that.
> Most thin-pool users are AWARE how to properly use it ;) lvm2 tries
> to minimize (data-lost) impact for misused thin-pools - but we can't
> spend too much effort there....
Everyone would benefit from more effort being spent there, because it
reduces the problem space and hence the burden on all those maintainers
to provide all types of safety all the time.
EVERYONE would benefit.
> But if you advocate for continuing system use of out-of-space
> thin-pool - that I'd probably recommend start sending patches... as
> an lvm2 developer I'm not seeing this as best time investment but
Not necessarily that the system continues in full operation;
applications are allowed to crash or whatever. Just that the system
does not go down with it.
But you say these are old problems and now fixed...
I am fine if filesystem is told "write error".
Then filesystem tells application "write error". That's fine.
But it might be helpful if "critical volumes" could reserve space in
the thin pool.
That is what Gionatan was saying...?
The filesystem could also do this itself, but not knowing about the
thin layer, it would have to write random blocks to achieve it.
I.e. the filesystem could guess at the thin layout underneath and just
write 1 byte to each block it wants to allocate.
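To illustrate that guessing trick: a minimal sketch, assuming a 4 MiB chunk size (a real filesystem would have to guess or probe the actual chunk size; the function name is invented):

```python
import os

# Assumed thin-pool chunk size; the filesystem cannot know the real value.
CHUNK = 4 * 1024 * 1024

def touch_every_chunk(path, size):
    """Rewrite one byte at the start of every chunk-sized stride of the
    file, forcing the thin layer to allocate a physical block for each
    chunk, without clobbering existing data."""
    with open(path, "r+b") as f:
        for off in range(0, size, CHUNK):
            f.seek(off)
            b = f.read(1) or b"\0"  # preserve whatever byte is there
            f.seek(off)
            f.write(b)
        f.flush()
        os.fsync(f.fileno())  # make sure the writes actually hit the pool
```

The point being: this works from inside the filesystem, but only by brute force, whereas the thin layer already knows its own allocation map.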
But the feature could be implemented much more easily by LVM -- no
mixing of layers required.
So a number of (unallocated) blocks is reserved for the critical volume.
When the free-block count drops below what those volumes "need", the
system starts returning errors for the other volumes, but not for the
critical volume.
I don't see why that would be such a disturbing feature.
You just cause the allocator to error out earlier for non-critical
volumes, and to proceed as long as possible for critical volumes.
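That policy is simple enough to sketch as a toy model (invented names and numbers; this has nothing to do with actual lvm2 internals):

```python
class ToyThinPool:
    """Toy model of the reservation policy: non-critical volumes start
    getting allocation errors while 'reserve' free blocks are still
    left for the critical volumes."""

    def __init__(self, total_blocks, reserve):
        self.free = total_blocks
        self.reserve = reserve  # blocks held back for critical volumes

    def alloc(self, n, critical=False):
        # Non-critical allocations may not eat into the reserve;
        # critical ones may run the pool all the way down to zero.
        floor = 0 if critical else self.reserve
        if self.free - n < floor:
            raise OSError("ENOSPC (reserve protects critical volumes)")
        self.free -= n
        return True
```

So a non-critical volume hits ENOSPC early, while the critical volume keeps allocating until the pool is truly empty.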
The only thing you need is runtime awareness of the number of available
free blocks.
You said before this is not efficiently possible.
Yet such awareness would be required, even if only approximate, to
implement any such feature.
But Gionatan was only talking about volume creation in his latest
messages.
>> However, from both a theoretical and practical standpoint being able
>> to just shut down whatever services use those data volumes -- which is
>> only possible
> Are you aware there is just one single page cache shared for all
> in your system ?
Well, I know the kernel is badly designed in that area; I mean, this
was the source of the USB problems. Torvalds advocated lowering the
size of the write buffer.
Which distributions then didn't do, and his patch didn't even make it
in.
He said "50 MB of write cache should be enough for everyone", and not
10% of total memory ;-).
> Again do you have use-case where you see a crash of data mounted volume
> on overfilled thin-pool ?
Yes, again, old experiences.
> On my system - I could easily umount such volume after all 'write'
> are timeouted (eventually use thin-pool with --errorwhenfull y for
> instant error reaction.
That's good; I didn't have that back then (and still don't).
These are Debian 8 / Kubuntu 16.04 systems.
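For reference, on lvm2 versions that have them, the options Zdenek mentions look roughly like this (the pool name vg/pool is made up):

```shell
# Fail writes immediately instead of queueing them when the pool is full
lvchange --errorwhenfull y vg/pool

# Watch from userspace how full the pool's data and metadata are
lvs -o lv_name,data_percent,metadata_percent vg/pool
```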
> So please can you stop repeating overfilled thin-pool with thin LV
> data volume kills/crashes machine - unless you open BZ and prove
> otherwise - you will surely get 'fs' corruption but nothing like
> crashing OS can be observed on my boxes....
But when I talked about this a year ago, you didn't seem to comprehend
that I was talking about an older system (back then not so old), or
acknowledge that these problems had (once) existed, so I also didn't
know they would already be solved by now.
Sometimes just acknowledging that problems were there before, but not
anymore, makes things a lot easier.
We spoke about this topic a year ago as well, and perhaps you didn't
understand me because for you the problems were already fixed (in your
version).
> We are here really interested in upstream issues - not about missing
> bug fixes backports into every distribution and its every released
> version.
I understand. But it's hard for me to know which is which.
These versions are in widespread use.
Compiling your own packages is also a system maintenance burden, etc.
So maybe our disagreement back then came from me experiencing something
that was already solved upstream (or in later kernels).
>> He might be able to recover his system if his system is still allowed
>> to be logged into.
> There is no problem with that as long as /rootfs has consistently
> working fs!
Well I guess it was my Debian 8 / kernel 4.4 problem then...