[Cluster-devel] [bug report] GFS2: Fix bug-trap in ail flush code

Bob Peterson rpeterso at redhat.com
Fri Jul 30 18:32:42 UTC 2021


On 7/28/21 3:31 AM, Dan Carpenter wrote:
> 
> Hi GFS2 devs,
> 
> This is 10 year old code, but it looks suspicious and hopefully the
> recovery code doesn't get testing very often in runtime.
> 
> The patch 75549186edf1: "GFS2: Fix bug-trap in ail flush code" from
> Aug 2, 2011, leads to the following static checker warning:
> 
> 	fs/gfs2/glock.c:1487 gfs2_glock_dq()
> 	warn: sleeping in atomic context
> 
> fs/gfs2/glops.c
>      57  static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync,
>      58                               unsigned int nr_revokes)
>      59  {
>      60          struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
>      61          struct list_head *head = &gl->gl_ail_list;
>      62          struct gfs2_bufdata *bd, *tmp;
>      63          struct buffer_head *bh;
>      64          const unsigned long b_state = (1UL << BH_Dirty)|(1UL << BH_Pinned)|(1UL << BH_Lock);
>      65
>      66          gfs2_log_lock(sdp);
>                  ^^^^^^^^^^^^^^^^^^
>      67          spin_lock(&sdp->sd_ail_lock);
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> We're holding a spinlock here
> 
>      68          list_for_each_entry_safe_reverse(bd, tmp, head, bd_ail_gl_list) {
>      69                  if (nr_revokes == 0)
>      70                          break;
>      71                  bh = bd->bd_bh;
>      72                  if (bh->b_state & b_state) {
>      73                          if (fsync)
>      74                                  continue;
>      75                          gfs2_ail_error(gl, bh);
>                                  ^^^^^^^^^^^^^^^^^^^^^^
> The gfs2_ail_error() function calls gfs2_withdraw() which can sleep or
> the call tree that this is complains about is:
> 
> --> gfs2_ail_error()
>     --> gfs2_withdraw()
>      --> signal_our_withdraw()
>          -->gfs2_glock_dq()
> 
> It's also very possible that this is a false positive...  Smatch doesn't
> understand bit tests very well and especially across function
> boundaries.
> 
>      76                  }
>      77                  gfs2_trans_add_revoke(sdp, bd);
>      78                  nr_revokes--;
>      79          }
>      80          GLOCK_BUG_ON(gl, !fsync && atomic_read(&gl->gl_ail_count));
>      81          spin_unlock(&sdp->sd_ail_lock);
>      82          gfs2_log_unlock(sdp);
>      83  }
> 
> regards,
> dan carpenter
> 
Hi Dan,

Thanks. I don't think it's a false positive. I think we should be using 
a delayed withdraw rather than an actual withdraw. I'll send a patch out 
to fix it shortly.

Regards,

Bob Peterson




More information about the Cluster-devel mailing list