[Linux-cluster] Re: Question about GFS

Oved Ourfali ovedwish at gmail.com
Thu Mar 24 18:13:09 UTC 2005


Hey,

At first I used the default values (heartbeat_rate = 0.3, allowed_misses = 1).

When I saw the problem, I increased allowed_misses to 10; with that
setting it sometimes works and sometimes does not.
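
To show why these two values matter, here is a rough sketch in Python. It
assumes, and this is not confirmed anywhere in this thread, that a node is
presumed dead after about heartbeat_rate * allowed_misses seconds of
silence. The function name detection_window is hypothetical and is not part
of lock_gulm.

```python
def detection_window(heartbeat_rate: float, allowed_misses: int) -> float:
    """Approximate seconds of silence tolerated before a node is presumed dead.

    Assumption: window = heartbeat_rate * allowed_misses. The real lock_gulm
    logic may differ.
    """
    if heartbeat_rate <= 0 or allowed_misses < 1:
        raise ValueError("heartbeat_rate must be > 0 and allowed_misses >= 1")
    return heartbeat_rate * allowed_misses


# The two settings mentioned in this message.
for misses in (1, 10):
    window = detection_window(0.3, misses)
    print(f"heartbeat_rate=0.3 allowed_misses={misses}: ~{window:.1f} s")
```

Under that assumption the window grows from about 0.3 seconds to about
3 seconds, which may still be too short for C to find B after A disappears.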

How do I add LoginLoops to the verbosity setting?

Thank you!
Oved


On Thu, 24 Mar 2005 10:06:18 -0600, Michael Conrad Tadpol Tilstra
<mtilstra at redhat.com> wrote:
> On Wed, Mar 23, 2005 at 02:11:00PM +0200, Oved Ourfali wrote:
> > I have GFS version 6 installed on RHEL ES 3 Update 3.
> > The GFS includes 3 nodes, a, b and c.
> > 
> > The three nodes run the lock_gulm daemon, and thus it runs in RLM mode.
> > 
> > I have done some tests to check that GFS works correctly, and I
> > ran into something very weird:
> > Let's assume the master is A, and B and C are slaves.
> > Disconnecting B or C from the network works fine.
> > 
> > Disconnecting A causes a problem. Let's assume B tries to become the new
> > master. B detects that A is down, but for some reason it also thinks
> > that C is down, so it waits for enough slaves to contact it, which
> > never happens. I tried increasing the timeout, and now it sometimes
> > works and sometimes doesn't.
> > 
> > Does anyone have a clue why this is happening?
> 
> For some reason C isn't finding B in time to let it know that it is
> still alive.  So, first question, what values are you using for
> heartbeat_rate and allowed_misses? Were you seeing this with the
> defaults, or were you using something else before increasing it?
> 
> Also, you can add LoginLoops to the verbosity setting to have gulm print
> out much more detail when it is trying to connect and find the master
> server. 
> 
> -- 
> Michael Conrad Tadpol Tilstra
> BE ALERT!!!!  (The world needs more lerts ...)
> 
>



