[Linux-cluster] GFS2 filesystem consistency error

Daniel Dehennin daniel.dehennin at baby-gnu.org
Wed Feb 24 09:05:06 UTC 2016

Bob Peterson <rpeterso at redhat.com> writes:

> Hi Daniel,


> I took a look at that metadata you sent me, but I didn't find any evidence
> relating to the problem you posted. Either the corruption happened a long
> time prior to your saving of the metadata, or else the metadata was saved
> after an fsck.gfs2 fixed (or attempted to fix) the problem?

The metadata was saved after two fsck runs. The timeline was:

- When I first encountered the problem, I ran fsck on the filesystem
  with version 3.1.6 from Ubuntu.

- Several days later, the same “dirty_inode: glock -5” messages
  started showing up on the same node as the first time.

- I ran fsck again with 3.1.8 built from git.

- A few days later, the same node showed the “dirty_inode” messages
  again; I shut down that node and then ran “gfs2_edit savemeta”.
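
To track when these messages show up on each node, the kernel log can
be scanned for them. A rough sketch in Python: it only assumes that the
message text quoted above appears verbatim in the log lines, and the
function name is made up for illustration. The -5 looks like a negated
errno, which on Linux would be EIO.

```python
import errno

# Message text as quoted in this thread; the rest of the log line format
# is not assumed, only that this substring appears verbatim.
MESSAGE = "dirty_inode: glock -5"


def count_dirty_inode_errors(log_lines):
    """Count log lines that contain the dirty_inode glock error message."""
    return sum(1 for line in log_lines if MESSAGE in line)


def errno_name(code):
    """Translate a negated errno such as -5 into its symbolic name."""
    return errno.errorcode.get(abs(code), "unknown")
```

Feeding it the lines of a saved kernel log from each node gives a quick
per-node count to compare.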

All nodes have the same hardware and the same OS, kernel and pacemaker
versions.

> One thing's for sure: I don't see any evidence of wild file system corruption;
> certainly nothing that can account for those errors.
> You said the problem seemed to revolve around a gfs2_grow operation,
> right?

Not exactly. I grew the fs live 6 months ago and ran into some
trouble; I ran fsck at that time and the fs then ran fine for months.

The “dirty_inode” trouble only started on Feb 9.

> Can you make sure the lvm2 volume group has the clustered bit set?
> Please do the "vgs" command and see if that volume has "c" listed in its
> flags. If not, it could have caused problems for the gfs2_grow.

Yes, it has the clustered flag.
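
For the record, this check can be scripted. A minimal sketch in Python,
assuming the input is the output of `vgs --noheadings -o vg_name,vg_attr`
and that the sixth character of vg_attr is "c" for a clustered volume
group. The function names are hypothetical and this is not taken from
any existing tool.

```python
def vg_is_clustered(vg_attr):
    """Return True if an LVM2 vg_attr string has the clustered bit set."""
    attr = vg_attr.strip()
    # Assumption: vg_attr has six positions and the sixth is 'c'
    # when the volume group is clustered.
    return len(attr) >= 6 and attr[5] == "c"


def parse_vgs_output(text):
    """Map VG name -> clustered flag from 'vgs --noheadings -o vg_name,vg_attr'."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            result[fields[0]] = vg_is_clustered(fields[1])
    return result
```

Any volume group that maps to False here and holds a GFS2 filesystem
would be worth a closer look before a gfs2_grow.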

> I've seen problems like this very rarely. Once was a legitimate bug in
> GFS2 that we fixed in RHEL5, but I assume your kernel is newer than
> that.

We run kernel 3.13.0-78-generic from Ubuntu.


> My only working theory is this:
> This might be related to the transition between "unlinked" dinodes and
> "free". After a file is deleted, it goes to "unlinked" and has to be
> transitioned to "free". This sometimes goes wrong because of the way
> it needs to check what other nodes in the cluster are doing.
> Maybe: If you have three nodes, and a file was unlinked on node 1, then
> maybe the internode communication got confused and nodes 2 and 3 both
> tried to transition it from Unlinked to Free. That is only a theory, and
> there is absolutely no proof. However, I have a set of patches that are
> experimental, and not even in the upstream kernel yet (hopefully soon!)
> that try to tighten up and fix problems like this. It's much more common
> for multiple nodes to try to transition from Unlinked to Free, and they
> all fail, leaving the file in an "Unlinked" state.

Thanks for the explanation. I will try to re-add the down node to the
cluster and see what happens.

Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF