[dm-devel] [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock

ruansy.fnst at fujitsu.com ruansy.fnst at fujitsu.com
Thu Jun 17 06:51:01 UTC 2021


> -----Original Message-----
> From: Dan Williams <dan.j.williams at intel.com>
> Subject: Re: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock
> 
> [ drop old linux-nvdimm at lists.01.org, add nvdimm at lists.linux.dev ]
> 
> On Thu, Jun 3, 2021 at 6:19 PM Shiyang Ruan <ruansy.fnst at fujitsu.com> wrote:
> >
> > Memory failures that occur in fsdax mode will ultimately be handled by
> > the filesystem.  We introduce this interface to find the files or
> > metadata affected by the corrupted range, and to try to recover the
> > corrupted data if possible.
> >
> > Signed-off-by: Shiyang Ruan <ruansy.fnst at fujitsu.com>
> > ---
> >  include/linux/fs.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index c3c88fdb9b2a..92af36c4225f 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2176,6 +2176,8 @@ struct super_operations {
> >                                   struct shrink_control *);
> >         long (*free_cached_objects)(struct super_block *,
> >                                     struct shrink_control *);
> > +       int (*corrupted_range)(struct super_block *sb, struct block_device *bdev,
> > +                              loff_t offset, size_t len, void *data);
> 
> Why does the superblock need a new operation? Wouldn't whatever function is
> specified here just be specified to the dax_dev as the
> ->notify_failure() holder callback?

Because we need to find out which file is affected by the given poisoned page, so that the memory-failure code can do its collect_procs() and kill_procs() work.  That requires the filesystem to use its rmap feature to look up the file from a given offset.  So this has to be implemented by the specific filesystem and called by the dax_device's holder.

This is the call trace I described in the cover letter:
memory_failure()
 * fsdax case
 pgmap->ops->memory_failure()      => pmem_pgmap_memory_failure()
  dax_device->holder_ops->corrupted_range() =>
                                      - fs_dax_corrupted_range()
                                      - md_dax_corrupted_range()
   sb->s_op->corrupted_range()     => xfs_fs_corrupted_range()  <== **HERE**
    xfs_rmap_query_range()
     xfs_corrupt_helper()
      * corrupted on metadata
          try to recover data, call xfs_force_shutdown()
      * corrupted on file data
          try to recover data, call mf_dax_kill_procs()
 * normal case
 mf_generic_kill_procs()

As you can see, this newly added operation is an important part of the whole process.
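
To make the layering concrete, here is a minimal userspace sketch of the dispatch chain in the trace.  It is not kernel code: every mock_* function, struct rmap_rec and struct mf_result are hypothetical stand-ins, the bdev argument is dropped for brevity, and a flat array stands in for the filesystem's rmap.  It only illustrates why the filesystem has to supply the last hop: only it can map an offset back to an owner.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum owner_kind { OWNER_METADATA, OWNER_FILE };

/* Hypothetical reverse-mapping record: who owns [start, start + len). */
struct rmap_rec {
	long long start;
	size_t len;
	enum owner_kind kind;
	unsigned long ino;		/* meaningful only for OWNER_FILE */
};

/* Hypothetical result buffer: inodes whose users would be killed. */
struct mf_result {
	unsigned long ino[8];
	size_t nr;
};

struct super_block;
struct dax_device;

struct super_operations {
	int (*corrupted_range)(struct super_block *sb, long long offset,
			       size_t len, void *data);
};

struct super_block {
	const struct super_operations *s_op;
	const struct rmap_rec *rmap;	/* stand-in for the fs rmap */
	size_t nr_rmap;
	int shutdown;			/* set when metadata is hit */
};

struct dax_holder_operations {
	int (*corrupted_range)(struct dax_device *dax, long long offset,
			       size_t len, void *data);
};

struct dax_device {
	void *holder;			/* here: the super_block */
	const struct dax_holder_operations *holder_ops;
};

/* Filesystem layer: query the rmap and decide per owner. */
int mock_fs_corrupted_range(struct super_block *sb, long long offset,
			    size_t len, void *data)
{
	struct mf_result *res = data;
	size_t i;

	assert(res != NULL);
	for (i = 0; i < sb->nr_rmap; i++) {
		const struct rmap_rec *r = &sb->rmap[i];

		/* Skip records that do not overlap the corrupted range. */
		if (offset >= r->start + (long long)r->len ||
		    r->start >= offset + (long long)len)
			continue;
		if (r->kind == OWNER_METADATA) {
			sb->shutdown = 1;	/* cf. xfs_force_shutdown() */
			return -EIO;
		}
		if (res->nr < 8)		/* cf. mf_dax_kill_procs() */
			res->ino[res->nr++] = r->ino;
	}
	return 0;
}

/* Holder layer: forward from the dax_device to the superblock op. */
int mock_fs_dax_corrupted_range(struct dax_device *dax, long long offset,
				size_t len, void *data)
{
	struct super_block *sb = dax->holder;

	if (!sb || !sb->s_op || !sb->s_op->corrupted_range)
		return -EINVAL;
	return sb->s_op->corrupted_range(sb, offset, len, data);
}

/* Driver layer: entry point reached from memory_failure(). */
int mock_pmem_memory_failure(struct dax_device *dax, long long offset,
			     size_t len, struct mf_result *res)
{
	if (!dax->holder_ops || !dax->holder_ops->corrupted_range)
		return -EINVAL;
	return dax->holder_ops->corrupted_range(dax, offset, len, res);
}
```

In this sketch the driver and holder layers are pure forwarding; the only step that knows whether a range belongs to metadata or to a file is the superblock operation, which is the role ->corrupted_range() plays in the trace.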


--
Thanks,
Ruan.



