[Cluster-devel] [Lsf-pc] [LSF/MM ATTEND] [TOPIC] fs/block interface discussions

Fri Dec 12 20:38:29 UTC 2014

Jan Kara wrote:

>   Hi,
> 
> On Fri 12-12-14 11:46:34, Steven Whitehouse wrote:
>> On 11/12/14 00:52, Alasdair G Kergon wrote:
>> >On Wed, Dec 10, 2014 at 07:46:51PM +0100, Jan Kara wrote:
>> >>   But still you first need to stop all writes to the filesystem, then do a
>> >> sync, and then allow writing again - which is exactly what freeze does.
>> >And with device-mapper, we were asked to support the taking of snapshots
>> >of multiple volumes simultaneously (e.g. where the application data is
>> >stored across more than one filesystem). Thin dm snapshots can handle
>> >this (the original non-thin ones can't).
>> >
>> That's good to know, and a useful feature. One of the issues I can
>> see is that because there are a number of different layers involved
>> (application/fs/storage) coordination of requirements between those
>> is not easy. To try to answer Jan's question earlier in the thread,
>> no I don't know any specific application developers, but I can
>> certainly help to propose some kind of solution, and then get some
>> feedback. I think it is probably going to be easier to start with a
>> specific proposal, albeit tentative, and then ask for feedback than
>> to just say "how should we do this?" which is a lot more open ended.
>> 
>> Going back to the other point above regarding freeze, it is not
>> necessarily a requirement to stop all writes in order to do a
>> snapshot; what is needed is in effect a barrier between operations
>> which should be represented in the snapshot and those which should
>> not, because they happen "after" the snapshot has been taken. Not
>> that I'm particularly attached to that proposal as it stands, but I
>> hope it demonstrates the kind of thing I had in mind for discussion.
>> I hope also that it will be possible to come up with a better
>> solution during and/or following the discussion.
>   I think I understand your idea with a 'barrier'. It's just that I have
> trouble seeing how it would actually get implemented - how do you make
> sure that e.g. after writing back block allocation bitmap and while
> writing back other metadata, no one can allocate new blocks to file 'foo'
> and still write back the file's inode before you submit the barrier?

Actually, I suspect something could be (relatively) trivially implemented 
using a similar strategy to dm-era. Snapshots increment the era; blocks from 
previous eras cannot be overwritten or removed, and the target could be 
mapped to view a past era. With that, you have essentially instantaneous 
snapshots (increment a counter) with only a barrier constraint, not 
freezing.
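
To make the idea a bit more concrete, here is a rough sketch in Python. It
is a toy in-memory model only, not dm-era code (as far as I know dm-era
itself just records which blocks were written during each era), and the
names EraStore, write, snapshot and read are made up for illustration. The
point it tries to show is that taking a snapshot is a counter increment,
and that versions written in earlier eras are never overwritten.

```python
class EraStore:
    """Toy block store with one era counter and per-block version lists."""

    def __init__(self):
        self.era = 0
        # block number -> list of [era, data], in ascending era order
        self.versions = {}

    def write(self, block, data):
        history = self.versions.setdefault(block, [])
        if history and history[-1][0] == self.era:
            # Latest version belongs to the current era: overwrite in place.
            history[-1][1] = data
        else:
            # Latest version belongs to an earlier era (or none exists):
            # keep it untouched and append a new version.
            history.append([self.era, data])

    def snapshot(self):
        """Close the current era and return its number as the snapshot id."""
        snap = self.era
        self.era += 1
        return snap

    def read(self, block, era=None):
        """Return the newest data for block written at or before era."""
        if era is None:
            era = self.era
        for version_era, data in reversed(self.versions.get(block, [])):
            if version_era <= era:
                return data
        return None  # block did not exist in that era


store = EraStore()
store.write(0, b"old")
snap = store.snapshot()      # only increments the counter
store.write(0, b"new")       # lands in a new era, b"old" is preserved
current = store.read(0)      # sees the current era
past = store.read(0, snap)   # sees the store as of the snapshot
```

Everything written before the increment is in the snapshot and everything
after is not, so the increment itself plays the role of the barrier. The
sketch does not address the ordering problem inside the filesystem (which
dirty metadata has reached the device when the counter changes); that part
still needs the barrier semantics discussed above.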

>> The goal would really be to figure out which bits we already have,
>> which bits are missing, where the problems are, what can be done
>> better, and so forth, while we have at least two of the three layers
>> represented and in the same room. This is very much something for
>> the long term rather than a quick discussion followed by a few
>> patches kind of thing, I think.
>   Sure, if you have some proposal (not necessarily patches) then it's
> probably worth talking about.
> 
> Honza