[Linux-cluster] Possible cman init script race condition

Mon Oct 1 14:33:49 UTC 2007

-----Original Message-----
From: David Teigland [mailto:teigland at redhat.com] 
Sent: 28 September 2007 19:03
To: Borgström Jonas
Cc: linux clustering
Subject: Re: [Linux-cluster] Possible cman init script race condition

> On Fri, Sep 28, 2007 at 11:45:47AM -0500, David Teigland wrote:
> > On Fri, Sep 28, 2007 at 09:58:18AM -0500, David Teigland wrote:
> > > On Fri, Sep 28, 2007 at 04:48:18PM +0200, Borgström Jonas wrote:
> > > > I must have misunderstood you or something, but didn't I already include
> > > > that info in the message I sent a few days ago?
> > > > 
> > > > http://permalink.gmane.org/gmane.linux.redhat.cluster/9999
> > > > 
> > > > (The archive inlines the "group_tool dump" output making it a bit hard
> > > > to read, but hopefully your email client shows them as attachments).
> > > 
> > > I missed that, I'll take a look, thanks.
> > 
> > You've hit a known bug that's been fixed:
> >   https://bugzilla.redhat.com/show_bug.cgi?id=251966
> > 
> > We may have to move up the release of that fix since people are seeing the
> > problem.  Be careful when reading that bz because there's a lot of
> > incorrect diagnosis that was recorded before we figured out what the real
> > bug was.  Here's the problem, it's very complex:
> > 
> > 1. when the nodes start up, they each form a 1-node openais cluster
> >    independent of the other
> > 
> >    [This shouldn't really happen, but in reality we can't prevent it
> >     100% of the time.  We try to make it rare, and then deal with it
> >     sensibly on the rare occasion when it does happen.  You've hit
> >     the "rare" occasion -- if you're actually seeing this regularly
> >     then we probably need to fix or adjust something at the openais
> >     level to make it less common.]
>
> I'd try to use some sleeps here, before running fence_tool join on either
> node, as a work-around.  We're trying to get both nodes merged together
> before they do anything else.

Strangely enough, adding a "sleep 30" line directly below the 'echo "Starting cluster: "' line in the cman init script seems to make this problem go away every time. Note that this is before any daemon is started. It works, but I'm not sure why.
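
For the variant suggested in the quoted text (pausing before fence_tool join rather than before any daemon starts), a fixed sleep could in principle be replaced by polling until both nodes see each other. The following is only a sketch under stated assumptions, not a tested fix for the bug discussed here: wait_for_members, count_members and NODES_CMD are hypothetical names, and it assumes that "cman_tool nodes" prints one line per node with the status in the second column, where "M" marks a member.

```shell
#!/bin/sh
# Sketch only: poll for cluster membership instead of a fixed sleep.
# Assumption: "cman_tool nodes" prints one line per node with the status
# in the second column, and "M" means the node is a cluster member.

# Command used to list nodes; overridable so the logic can be tried offline.
NODES_CMD=${NODES_CMD:-"cman_tool nodes"}

# Print the number of nodes whose status column is "M".
count_members() {
    $NODES_CMD 2>/dev/null | awk '$2 == "M" { n++ } END { print n + 0 }'
}

# wait_for_members EXPECTED TIMEOUT_SECONDS
# Returns 0 once at least EXPECTED members are visible,
# or 1 if TIMEOUT_SECONDS pass without that happening.
wait_for_members() {
    expected=$1
    timeout=$2
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if [ "$(count_members)" -ge "$expected" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

In a two-node setup this might be called as "wait_for_members 2 60" just before fence_tool join, with the init script giving up or logging a warning when it returns 1. It cannot stand in for the "sleep 30" placed before any daemon is started, since cman has to be running for the query to return anything.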

>
> Also, how often are you seeing the nodes not merge together right away?
> If it's frequent, then we need to fix that.

This happens every time on this hardware (two Dell 1955 blades). I never got fenced to work correctly until I figured out that I needed to add a "sleep 30" to the cman init script. So I'm obviously very interested in seeing this fixed in a 5.0 erratum, or in 5.1 at the very latest. I can't really wait until 5.2 is out...

And as I mentioned before, the really scary part is that I am able to mount gfs filesystems during this kind of cluster split. And if one node is shot, the other node replays the gfs journal and makes the filesystem writable again without first fencing the shot/missing node.

Here is some "group_tool -v" output with a mounted filesystem:

[root@prod-db2 pgsql]# group_tool -v
type             level name     id       state node id local_done
fence            0     default  00010002 JOIN_START_WAIT 1 100020001 1
[1 2]
dlm              1     clvmd    00020001 JOIN_START_WAIT 1 100020001 1
[1 2]
dlm              1     pg_fs    00060001 JOIN_START_WAIT 1 100020001 1
[1 2]
gfs              2     pg_fs    00050001 JOIN_START_WAIT 1 100020001 1
[1 2]
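
To spot this state without reading the whole table, output like the above can be filtered for groups still in JOIN_START_WAIT. This is a minimal sketch: stuck_groups is a hypothetical helper, and it assumes the column layout shown in the output above (type, level, name, id, state, ...).

```shell
#!/bin/sh
# Sketch: list groups that "group_tool -v" reports in JOIN_START_WAIT.
# Assumes the columns are: type, level, name, id, state, ...
# The header line and the "[1 2]" member lines do not match and are skipped.
stuck_groups() {
    awk '$5 == "JOIN_START_WAIT" { print $1, $3 }'
}
```

Usage would be "group_tool -v | stuck_groups", which prints one "type name" pair per line for every group in that state.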

Regards,
Jonas