[Linux-cluster] cman bad generation number
Daniel McNeil
daniel at osdl.org
Wed Jan 12 17:44:22 UTC 2005
On Wed, 2005-01-12 at 00:58, Patrick Caulfield wrote:
> On Tue, Jan 11, 2005 at 05:00:46PM -0800, Daniel McNeil wrote:
> > On Tue, 2005-01-11 at 00:56, Patrick Caulfield wrote:
> > > On Wed, Dec 22, 2004 at 09:33:39AM -0800, Daniel McNeil wrote:
> > > > How long does cman stay up in your testing?
> > >
> > > With the higher priority on the heartbeat thread I got 5 days before iSCSI died
> > > on me again... This isn't quite the same load as yours but it is on 8 busy nodes.
> >
> > I have not seen 5 days yet on my set. See my email from yesterday.
> > Is the code to have higher priority for the heartbeat thread
> > already checked in? I restarted my test yesterday and it is
> > still going, but it usually has trouble after 50 hours or so.
> >
>
> It's rev 1.45 of membership.c checked in on the 7th Jan. If that hasn't fixed it
> I'll have to dabble with realtime things as it does seem now that the threads
> are not being woken up, even though the timer is firing.
I'm running from code as of Jan 4th, so I do not have that change.
I'll update my code.
2 nodes died last night running my tests with
echo "9" > /proc/cluster/config/cman/max_retries
echo "1" > /proc/cluster/config/cman/hello_timer
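For anyone repeating this, a minimal sketch of applying those two settings with a read-back check. The helper name set_tunable and the CMAN_CONFIG_DIR override are hypothetical, added only so the function can be dry-run against a scratch directory; the /proc path and the two tunable names are the ones used above.

```shell
# Hypothetical helper: write a value to a cman tunable and read it back.
# CMAN_CONFIG_DIR is an illustrative override; by default the path used
# in the commands above is assumed.
set_tunable() {
    dir="${CMAN_CONFIG_DIR:-/proc/cluster/config/cman}"
    name="$1"
    value="$2"
    # Refuse to continue if the tunable is missing or not writable.
    if [ ! -w "$dir/$name" ]; then
        echo "cannot write $dir/$name" >&2
        return 1
    fi
    echo "$value" > "$dir/$name"
    # Print the stored value so the caller can confirm it took effect.
    cat "$dir/$name"
}
```

Usage would be "set_tunable max_retries 9" followed by "set_tunable hello_timer 1", run as root on each node.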
Here's the console output from the 3 nodes:
cl030:
CMAN: no HELLO from cl031a, removing from the cluster
CMAN: node cl032a is not responding - removing from the cluster
CMAN: quorum lost, blocking activity
cl031:
CMAN: node cl030a is not responding - removing from the cluster
CMAN: node cl032a is not responding - removing from the cluster
SM: Assertion failed on line 67 of file
/Views/redhat-cluster/cluster/cman-kernel/src/sm_membership.c
SM: assertion: "node"
SM: time = 115176056
Kernel panic - not syncing: SM: Record message above and reboot.
Message from syslogd at cl031 at Wed Jan 12 01:17:57 2005 ...
Record message above and reboot. syncing: SM:
cl032:
CMAN: too many transition restarts - will die
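The "quorum lost" message on cl030 is what one would expect once it had dropped both peers. A rough sketch of the arithmetic, assuming one vote per node and a simple-majority rule of expected_votes / 2 + 1 (both are assumptions, not read from this cluster's configuration; the function names are made up for illustration):

```shell
# Assumption: each node has one vote and quorum is
# expected_votes / 2 + 1 using integer division.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# $1 = votes still visible to this node, $2 = expected votes
has_quorum() {
    [ "$1" -ge "$(quorum_for "$2")" ]
}

# cl030 after removing cl031a and cl032a: only its own vote remains.
if has_quorum 1 3; then
    echo "quorate"
else
    echo "quorum lost"
fi
```

Under those assumptions a 3-node cluster needs 2 votes, so a node that sees only itself blocks activity.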
Daniel