[Cluster-devel] [Lsf-pc] [LSF/MM ATTEND] [TOPIC] fs/block interface discussions
Steven Whitehouse
swhiteho at redhat.com
Wed Dec 10 14:13:02 UTC 2014
Hi,
On 10/12/14 12:48, Jan Kara wrote:
> Hi,
>
> On Wed 10-12-14 11:49:48, Steven Whitehouse wrote:
>> I'm interested generally in topics related to integration between
>> components, one example being snapshots. We have snapshots at
>> various different layers (can be done at array level or dm/lvm level
>> and also we have filesystem support in the form of fs freezing).
> Well, usually snapshots at LVM layer are using fs freezing to get a
> consistent image of a filesystem. So these two are integrated AFAICS.
>
>> There are a few thoughts that spring to mind - one being how this
>> should integrate with applications - in order to make it easier to
>> use, and another being whether we could introduce snapshots which do
>> not require freezing the fs (as per btrfs) for other filesystems too
>> - possibly by passing down a special kind of flush from the
>> filesystem layer.
> So btrfs is special in its COW nature. For filesystems which do updates
> in place you can do COW in the block layer (after all that's what
> dm snapshotting does) but you still have to get fs into consistent state
> (that's fsfreeze), then take snapshot of the device (by setting up proper
> COW structures), and only then you can allow further modifications of the
> filesystem by unfreezing it. I don't see a way around that...
Well, I think it should be possible to get the fs into a consistent state
without needing the freeze/snapshot/unfreeze procedure. Instead we
might have (there are no doubt other solutions too, so this is just an
example to get discussion started) an extra flag on a bio, which would
only be valid with some combination of flush flags. Then it is just a
case of telling the block layer that we want to do a snapshot; it
would then spot the marked bio when the fs sends it down, and know that
everything up to and including that bio should be in the snapshot, and
everything after it should not. So the fs would basically do a special
form of sync, setting the flag on the bio at the point where it is
consistent - the question being how that should then be triggered. Since
nothing is ever frozen, there is no longer any possibility of a problem
if the unfreeze does not happen for any reason.
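To make the idea a bit more concrete, here is a toy model in Python. It is only a sketch under stated assumptions: no such bio flag exists today, and `Bio`, `SnapshotDevice`, `snapshot_marker` and `request_snapshot` are made-up names for illustration, not kernel interfaces. It also assumes that a marker seen when no snapshot has been requested is simply treated as an ordinary flush.

```python
from dataclasses import dataclass, field


@dataclass
class Bio:
    """Toy stand-in for a block I/O request (not the kernel's struct bio)."""
    sector: int
    data: str
    flush: bool = False
    snapshot_marker: bool = False  # hypothetical flag, only valid with flush


@dataclass
class SnapshotDevice:
    """Toy block device that cuts a snapshot at a marked flush bio."""
    blocks: dict = field(default_factory=dict)  # current contents
    armed: bool = False    # a snapshot was requested, marker not yet seen
    active: bool = False   # marker seen, later writes are copy-on-write
    cow: dict = field(default_factory=dict)     # sector -> pre-write data

    def request_snapshot(self) -> None:
        """Tell the block layer that a snapshot is wanted."""
        self.armed = True

    def submit(self, bio: Bio) -> None:
        if bio.snapshot_marker and not bio.flush:
            raise ValueError("snapshot marker is only valid on a flush bio")
        # After the cut point, save the old contents before the first overwrite.
        if self.active and bio.sector not in self.cow:
            self.cow[bio.sector] = self.blocks.get(bio.sector)
        # The marked bio is applied before the cut, so it is in the snapshot.
        self.blocks[bio.sector] = bio.data
        if bio.snapshot_marker and self.armed:
            self.armed = False
            self.active = True
            self.cow = {}

    def read_snapshot(self, sector: int):
        """Read a sector as it was at the cut point (None if unwritten)."""
        if sector in self.cow:
            return self.cow[sector]
        return self.blocks.get(sector)


dev = SnapshotDevice()
dev.submit(Bio(0, "a1"))
dev.request_snapshot()
dev.submit(Bio(1, "b1"))                                    # before the marker
dev.submit(Bio(0, "a2", flush=True, snapshot_marker=True))  # the cut point
dev.submit(Bio(1, "b2"))                                    # after the marker
print(dev.read_snapshot(0), dev.read_snapshot(1), dev.blocks[1])
```

The point of the model is that the fs never stops accepting writes: the cut is defined by the position of the marked bio in the stream, so there is no frozen state to get stuck in.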
Perhaps the more important question, though, is how it would/could be
integrated with applications. The ultimate goal that I had in mind is
a tool which creates a snapshot with a single command, dealing with
all three (application/fs/block) layers. It should not matter whether
the snapshot is done via any particular fs or block device; it should
work in the same way. So how could we send a message to a process to say
that a snapshot is about to be taken, get a message back when the app
has produced a consistent set of data, and coordinate between multiple
applications using the same block device, or even across multiple block
devices being used by a single app?
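As a rough illustration of the kind of handshake I mean, here is a toy Python model. Everything in it is an assumption made up for the example: `SnapshotCoordinator`, `register` and the device names are hypothetical, and a real mechanism would also need timeouts, a "snapshot done" notification and some actual IPC between the processes.

```python
from typing import Callable, Dict, List


class SnapshotCoordinator:
    """Toy model of the application/fs/block handshake (names hypothetical)."""

    def __init__(self, take_snapshot: Callable[[str], None]):
        # take_snapshot stands in for whatever the fs/block layers provide.
        self._take_snapshot = take_snapshot
        # device -> callbacks of the applications using that device
        self._apps: Dict[str, List[Callable[[], bool]]] = {}

    def register(self, device: str, quiesce: Callable[[], bool]) -> None:
        """An application asks to be told before `device` is snapshotted."""
        self._apps.setdefault(device, []).append(quiesce)

    def snapshot(self, devices: List[str]) -> bool:
        # Collect each interested application once, even if it uses
        # several of the devices being snapshotted.
        to_ask: List[Callable[[], bool]] = []
        for dev in devices:
            for cb in self._apps.get(dev, []):
                if cb not in to_ask:
                    to_ask.append(cb)
        # Phase 1: announce the snapshot and wait for every reply.
        for cb in to_ask:
            if not cb():
                return False  # an app could not reach a consistent state
        # Phase 2: every app is consistent, so take the snapshot(s).
        for dev in devices:
            self._take_snapshot(dev)
        return True


taken: List[str] = []
coord = SnapshotCoordinator(taken.append)


def database_quiesce() -> bool:
    # A real application would flush its own state here.
    return True


coord.register("/dev/vg0/data", database_quiesce)
coord.register("/dev/vg0/log", database_quiesce)
ok = coord.snapshot(["/dev/vg0/data", "/dev/vg0/log"])
print(ok, taken)
```

The same two-phase shape should work regardless of whether the snapshot itself is then done by the fs or by the block device underneath it.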
>> A more general topic is proposed changes to the fs/block interface,
>> of which the above may possibly be one example. There are a number
>> of proposals for new classes of block device, and new features which
>> will potentially require a different (or extended) interface at the
>> fs/block layer. These have largely been discussed to date as
>> individual features, and I wonder whether it might be useful to try
>> and bring together the various proposals to see if there is
>> commonality between at least some of them at the fs/block interface
>> level. I know that there have been discussions going on relating to
>> the individual proposals, so the idea I had was to try and look at
>> them from a slightly different angle by bringing as many of them as
>> possible together and concentrating on how they would be used from a
>> filesystem perspective,
> Could you elaborate on which combination of features you'd like to
> discuss?
> Honza
Well, there are a number that I'm aware of that are currently in
development, but I suspect that this list is not complete:
- SMR drives
- persistent memory (various different types)
- Hinting from fs to block layer for various different reasons
  (layout, compression, snapshots, anything else?)
- Better i/o error reporting/recovery
- copy offload
- anything I forgot?
Steve.