[Linux-cluster] Re: Question about GFS
Oved Ourfali
ovedwish at gmail.com
Thu Mar 24 18:13:09 UTC 2005
Hey,
First I used the default values (rate - 0.3, misses - 1).
When I saw the problem I tried increasing the misses value to 10,
and then it sometimes works and sometimes doesn't.
How do I add LoginLoops?
Thank you!
Oved
On Thu, 24 Mar 2005 10:06:18 -0600, Michael Conrad Tadpol Tilstra
<mtilstra at redhat.com> wrote:
> On Wed, Mar 23, 2005 at 02:11:00PM +0200, Oved Ourfali wrote:
> > I have GFS version 6 installed on RHEL ES 3 Update 3.
> > The GFS includes 3 nodes, a, b and c.
> >
> > The three nodes run the lock_gulm daemon, and thus it runs in RLM mode.
> >
> > I have done some tests to check that GFS works correctly, and I
> > ran into something very weird:
> > Let's assume the master is A, and B and C are slaves.
> > Disconnecting B or C from the network works fine.
> >
> > Disconnecting A causes a problem. Let's assume B tries to become the
> > new master. B detects that A is down, but for some reason it also
> > thinks that C is down, so it waits for enough slaves to contact it,
> > and that never happens. I tried increasing the timeout, and now it
> > sometimes works and sometimes doesn't.
> >
> > Does anyone have a clue why this is happening?
>
> For some reason C isn't finding B in time to let it know that it is
> still alive. So, first question, what values are you using for
> heartbeat_rate and allowed_misses? Are you seeing this with the
> defaults, or were you using something else (before increasing it)?
>
> Also, you can add LoginLoops to the verbosity setting to have gulm print
> out much more detail when it is trying to connect and find the master
> server.
>
> --
> Michael Conrad Tadpol Tilstra
> BE ALERT!!!! (The world needs more lerts ...)
>
>