[Cluster-devel] Assertion failure: sdp->sd_log_blks_free <= sdp->sd_jdesc->jd_blocks

Wed Mar 27 14:40:04 UTC 2019

On 3/25/19 3:29 PM, Bob Peterson wrote:
> ----- Original Message -----
>> I think I've found the cause of the assertion I was hitting. Recovery
>> sets sd_log_flush_head but does not take locks which means a concurrent
>> call to gfs2_log_flush() can result in sd_log_head being set to
>> sd_log_flush_head. A later call to gfs2_log_flush() will then hit an
>> assertion failure in log_pull_tail() because the mismatch between
>> sd_log_head and sd_log_tail means too many blocks are freed.
>>
>> I've worked around it by taking the log_flush lock in the patch below
>> and it seems to avoid the problem. However, tracing the recovery process
>> I see that it sets sd_log_flush_head and then calls clean_journal() ->
>> gfs2_write_log_header() -> gfs2_log_bmap() -> gfs2_log_incr_head(). This
>> has:
>>
>> 	BUG_ON((sdp->sd_log_flush_head == sdp->sd_log_tail) &&
>> 	       (sdp->sd_log_flush_head != sdp->sd_log_head));
>>
>> ... but sd_log_tail and sd_log_head have not been set by
>> gfs2_recover_func() so it might still BUG_ON() during recovery if you're
>> particularly unlucky.
>>
>> I had a look at your "GFS2: Withdraw corruption patches [V2]" series but
>> I didn't see anything that might fix this.
>>
>> If you think this patch is useful then I can send it as a proper patch
>> to the list.
>>
>> Thanks,
>> Ross
>>
>> --------------
>>
>> gfs2: Take log_flush lock during recovery
>>
>> Recovery sets sd_log_flush_head but does not take any locks which means
>> a concurrent call to gfs2_log_flush can result in sd_log_head being set
>> to sd_log_flush_head. A later call to gfs2_log_flush will then hit an
>> assertion failure in log_pull_tail because the mismatch between
>> sd_log_head and sd_log_tail means too many blocks are freed.
>>
>> gfs2: fsid=xapi-clusterd:88a31b8e-4072-b0.1: fatal: assertion
>> "atomic_read(&sdp->sd_log_blks_free) <= sdp->sd_jdesc->jd_blocks" failed
>> function = log_pull_tail, file = fs/gfs2/log.c, line = 510
> 
> Hi Ross,
> 
> I think you found a valid bug, but that's not the proper solution.
> The reason is: I think journal replay and journal flushing should both be
> protected by the exclusive (EX) glock taken on the journal itself.
> 
> I think the problem may actually be a regression with patch 588bff95c94ef.
> Because of that patch, function clean_journal now sets sdp->sd_log_flush_head
> but its caller, gfs2_recover_func, is used to recover any node's journal, not
> just its own. The bug is that clean_journal should only set sd_log_flush_head
> if (and only if) it's replaying its own journal, not someone else's.
> If it sets sd_log_flush_head while replaying another node's journal, that
> will only lead to a problem like this.
> 
> I'll try and whip up another patch and perhaps you can test it for me.
> 
> FWIW, I've never seen this problem manifest on my recovery tests, but it
> still might be causing some of the weird problems I'm seeing.
> 

(Replying on this thread since I didn't get CCed on the patch.)

I still hit the assertion with this patch. gfs2_log_write_header() is 
unconditionally called and it calls gfs2_log_bmap() which changes 
sd_log_flush_head in the success path.

FWIW we hit this assertion periodically during testing but I found that 
I could get it to trigger almost every time during recovery with some 
well-placed sleeps and a kthread that continuously calls gfs2_log_flush().

-----8<----------

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index ee20ea42e7b5..8bd87082a110 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -820,6 +820,10 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct 
gfs2_glock *gl, u32 flags)
  	}
  	lops_after_commit(sdp, tr);

+       if (flags & GFS2_LFC_LOGD_TEST) {
+               msleep(1000);
+       }
+
  	gfs2_log_lock(sdp);
  	sdp->sd_log_head = sdp->sd_log_flush_head;
  	sdp->sd_log_blks_reserved = 0;
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 20241436126d..9d658efdb57a 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -79,6 +79,7 @@ extern void gfs2_ail1_flush(struct gfs2_sbd *sdp, 
struct writeback_control *wbc)

  extern void gfs2_log_shutdown(struct gfs2_sbd *sdp);
  extern int gfs2_logd(void *data);
+extern int gfs2_logd2(void *data);
  extern void gfs2_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata 
*bd);
  extern void gfs2_write_revokes(struct gfs2_sbd *sdp);

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index a971862b186e..1d7f32c1960a 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -378,6 +378,7 @@ static int init_threads(struct gfs2_sbd *sdp)
  	int error = 0;

  	p = kthread_run(gfs2_logd, sdp, "gfs2_logd");
+	p = kthread_run(gfs2_logd2, sdp, "gfs2_logd2");
  	if (IS_ERR(p)) {
  		error = PTR_ERR(p);
  		fs_err(sdp, "can't start logd thread: %d\n", error);
@@ -1763,3 +1764,17 @@ const struct super_operations gfs2_super_ops = {
  	.show_options		= gfs2_show_options,
  };

+int gfs2_logd2(void *data)
+{
+	struct gfs2_sbd *sdp = data;
+
+	msleep(30000);
+
+	for (;;) {
+		msleep(100);
+		gfs2_log_flush(sdp, NULL, GFS2_LOG_HEAD_FLUSH_NORMAL |
+			       GFS2_LFC_LOGD_TEST);
+	}
+
+	return 0;
+}
diff --git a/include/uapi/linux/gfs2_ondisk.h 
b/include/uapi/linux/gfs2_ondisk.h
index 2dc10a034de1..ebae3a21c98e 100644
--- a/include/uapi/linux/gfs2_ondisk.h
+++ b/include/uapi/linux/gfs2_ondisk.h
@@ -431,6 +431,7 @@ struct gfs2_ea_header {
  #define GFS2_LFC_TRANS_END		0x01000000
  #define GFS2_LFC_LOGD_JFLUSH_REQD	0x02000000
  #define GFS2_LFC_LOGD_AIL_FLUSH_REQD	0x04000000
+#define GFS2_LFC_LOGD_TEST     0x08000000

  #define LH_V1_SIZE (offsetofend(struct gfs2_log_header, lh_hash))


Thanks,
-- 
Ross Lagerwall