[Cluster-devel] [Lsf-pc] [LSF/MM ATTEND] [TOPIC] fs/block interface discussions

Wed Dec 10 18:46:51 UTC 2014

  Hi,

On Wed 10-12-14 14:13:02, Steven Whitehouse wrote:
> On 10/12/14 12:48, Jan Kara wrote:
> >On Wed 10-12-14 11:49:48, Steven Whitehouse wrote:
> >>I'm interested generally in topics related to integration between
> >>components, one example being snapshots. We have snapshots at
> >>various different layers (can be done at array level or dm/lvm level
> >>and also we have filesystem support in the form of fs freezing).
> >   Well, usually snapshots at the LVM layer use fs freezing to get a
> >consistent image of a filesystem. So these two are integrated AFAICS.
> >
> >>There are a few thoughts that spring to mind - one being how this
> >>should integrate with applications - in order to make it easier to
> >>use, and another being whether we could introduce snapshots which do
> >>not require freezing the fs (as per btrfs) for other filesystems too
> >>- possibly by passing down a special kind of flush from the
> >>filesystem layer.
> >   So btrfs is special in its COW nature. For filesystems which do updates
> >in place you can do COW in the block layer (after all that's what
> >dm snapshotting does) but you still have to get the fs into a consistent
> >state (that's fsfreeze), then take a snapshot of the device (by setting up
> >proper COW structures), and only then can you allow further modifications
> >of the filesystem by unfreezing it. I don't see a way around that...
> Well I think it should be possible to get the fs into a consistent
> state without needing to do the freeze/snapshot/unfreeze procedure.
> Instead we might have (there are no doubt other solutions too, so
> this is just an example to get discussion started) an extra flag on
> a bio, which would only be valid with some combination of flush
> flags. Then it is just a case of telling the block layer that we
> want to do a snapshot, and it would then spot the marked bio when
> the fs sends it down, and know that everything before and including
> that bio should be in the snapshot, and everything after that is
> not. So the fs would do basically a special form of sync, setting
> the flag on the bio when it is consistent - the question being how
> should that then be triggered? It means that there is no longer any
> possibility of having a problem if the unfreeze does not happen for
> any reason.
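
[Illustration] The marked-bio proposal quoted above can be modelled in
userspace as a toy. This is only a sketch under stated assumptions: the
BIO_SNAP_POINT flag, struct toy_dev and every toy_* function are hypothetical
names invented for this example and are not part of any kernel interface. The
"device" is an in-memory array, and the model says nothing about how the
filesystem reaches a consistent state before sending the marked write.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 8

/* Hypothetical flag: "this write completes a consistent fs state". */
#define BIO_SNAP_POINT 0x1u

struct toy_dev {
    int  disk[NBLOCKS];    /* live device contents */
    int  snap[NBLOCKS];    /* preserved pre-overwrite contents */
    bool copied[NBLOCKS];  /* block already preserved in snap[] */
    bool snap_requested;   /* a snapshot has been asked for */
    bool snap_active;      /* the marked write has been seen */
};

static void toy_init(struct toy_dev *d)
{
    memset(d, 0, sizeof(*d));
}

/* Tell the "block layer" to watch for the next marked write. */
static void toy_request_snapshot(struct toy_dev *d)
{
    d->snap_requested = true;
}

/* Apply one write. Returns 0 on success, -1 if block is out of range. */
static int toy_write(struct toy_dev *d, unsigned block, int value,
                     unsigned flags)
{
    if (block >= NBLOCKS)
        return -1;

    /* After the snapshot point, preserve the old contents once (COW). */
    if (d->snap_active && !d->copied[block]) {
        d->snap[block] = d->disk[block];
        d->copied[block] = true;
    }
    d->disk[block] = value;

    /* The marked write itself belongs to the snapshot, so the snapshot
     * only becomes active after the write has been applied. */
    if ((flags & BIO_SNAP_POINT) && d->snap_requested && !d->snap_active)
        d->snap_active = true;
    return 0;
}

/* Read a block as the snapshot sees it. */
static int toy_read_snapshot(const struct toy_dev *d, unsigned block)
{
    assert(block < NBLOCKS);
    return d->copied[block] ? d->snap[block] : d->disk[block];
}
```

Writes applied up to and including the marked one are visible through
toy_read_snapshot(); later writes change only disk[]. The model assumes that
whatever reaches the device before the marked write is mutually consistent,
which is the point the reply below questions.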
  But you still first need to stop all writes to the filesystem, then do a
sync, and then allow writing again - which is exactly what freeze does.
Without stopping writers, you cannot be sure you don't have a mix of old
and new files in the snapshot, and guaranteeing some finite completion
time is also difficult (although doable)... So it seems to me that what
you describe is a freeze-snapshot-unfreeze cycle, just one that is fully
controlled by the kernel.
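
[Illustration] For reference, the userspace form of the
freeze-snapshot-unfreeze cycle can be sketched with the existing FIFREEZE and
FITHAW ioctls from <linux/fs.h>. This is a minimal sketch, not how any
particular tool does it: the snapshot step is left as a caller-supplied
callback, and snapshot_fn, snapshot_while_frozen() and count_call() are
hypothetical names used only here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */

/* Hypothetical callback that sets up the snapshot (e.g. COW structures). */
typedef int (*snapshot_fn)(void *arg);

/*
 * Freeze the filesystem behind mnt_fd, run take_snapshot(), then thaw.
 * Returns the callback's result, or -1 if the freeze or the thaw fails.
 * The callback is not run if the freeze fails, and the thaw is attempted
 * whatever the callback returns.
 */
static int snapshot_while_frozen(int mnt_fd, snapshot_fn take_snapshot,
                                 void *arg)
{
    int ret, saved_errno;

    assert(take_snapshot != NULL);

    if (ioctl(mnt_fd, FIFREEZE, 0) < 0)
        return -1;              /* not frozen, nothing to undo */

    ret = take_snapshot(arg);
    saved_errno = errno;

    if (ioctl(mnt_fd, FITHAW, 0) < 0)
        return -1;              /* fs may still be frozen: the bad case */

    errno = saved_errno;
    return ret;
}

/* Trivial example callback: counts how often it was invoked. */
static int count_call(void *arg)
{
    ++*(int *)arg;
    return 0;
}
```

Here mnt_fd would be a descriptor obtained by opening the mount point, and
the caller needs CAP_SYS_ADMIN. The second early return is the failure mode
raised in the quoted text: if the thaw does not happen, the filesystem stays
frozen.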

> Perhaps the more important question though, is how it would/could be
> integrated with applications? The ultimate goal that I had in mind
> is that we could have a tool which is run to create a snapshot which
> will with a single command deal with all three
> (application/fs/block) layers, and it should not matter whether the
> snapshot is done via any particular fs or block device, it should
> work in the same way. So how could we send a message to a process to
> say that a snapshot is about to be taken, and to get a message back
> when the app has produced a consistent set of data, and to
> coordinate between multiple applications using the same block
> device, or even across multiple block devices, being used by a
> single app?
  Yeah, this would be nice. But it requires buy-in from the applications,
which is always difficult. Do you know of any application whose developers
would be interested in something like this?

> >>A more general topic is proposed changes to the fs/block interface,
> >>of which the above may possibly be one example. There are a number
> >>of proposals for new classes of block device, and new features which
> >>will potentially require a different (or extended) interface at the
> >>fs/block layer. These have largely been discussed to date as
> >>individual features, and I wonder whether it might be useful to try
> >>and bring together the various proposals to see if there is
> >>commonality between at least some of them at the fs/block interface
> >>level. I know that there have been discussions going on relating to
> >>the individual proposals, so the idea I had was to try and look at
> >>them from a slightly different angle by bringing as many of them as
> >>possible together and concentrating on how they would be used from a
> >>filesystem perspective,
> >   Could you elaborate on which combination of features you'd like to
> >discuss?
> >								Honza
> Well there are a number that I'm aware of that are currently in
> development, but I suspect that this list is not complete:
>  - SMR drives
>  - persistent memory (various different types)
>  - Hinting from fs to block layer for various different reasons
> (layout, compression, snapshots, anything else?)
>  - Better i/o error reporting/recovery
>  - copy offload
>  - anything I forgot?
  I see. OK.

								Honza
-- 
Jan Kara <jack at suse.cz>
SUSE Labs, CR