[Linux-cluster] client doesnt start when lock master is not ready

Raj Kumar rajkum2002 at rediffmail.com
Mon Feb 28 15:14:47 UTC 2005


Thanks Adam! This explains the cause of the problem.

Yes, I started experiencing this problem after the upgrade from GFS-6.0.0-15 to GFS-6.0.2-24. 

Thank you,
Raj


On Thu, 24 Feb 2005 Adam Manthei wrote :
>On Thu, Feb 24, 2005 at 04:45:27PM -0000, Raj  Kumar wrote:
> > Hi All,
> >
> > We have a two-node system using GFS. One node is the lock server and the
> > other is just a client. We restarted our servers recently and brought up
> > the lock client before the lock master. lock_gulmd is set to start at
> > runlevels 3, 4 and 5. The lock client simply hung at the message
> > "Starting lock_gulmd..." during boot. Clearly this happened because the
> > lock master server wasn't available at that point. Once the lock master
> > server started, the lock client started successfully.
>
>This is the desired behavior.  Adjust the following value in
>/etc/sysconfig/gfs if you don't like its behavior.
>
># GULM_QUORUM_TIMEOUT -- amount of time to wait for there to be a master
>#     before giving up.  If GULM_QUORUM_TIMEOUT is positive, then we will
>#     wait GULM_QUORUM_TIMEOUT seconds before giving up and failing when
>#     a master server is not found.  If GULM_QUORUM_TIMEOUT is zero, then
>#     wait indefinitely for a master server.  If GULM_QUORUM_TIMEOUT is
>#     negative, just start lock_gulmd and not worry about whether it is
>#     quorate.
>GULM_QUORUM_TIMEOUT=300
>
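Read as shell logic, the three cases described in that comment might look
like the sketch below. This is only an illustration of the documented
semantics, not the code in init.d/lock_gulmd. The function name
wait_for_master, the check command passed as its second argument, and the
polling interval are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch of the three GULM_QUORUM_TIMEOUT cases (positive, zero, negative).
# wait_for_master and its arguments are hypothetical; the real init script
# may be structured quite differently.

wait_for_master() {
    timeout=$1          # value of GULM_QUORUM_TIMEOUT
    check_cmd=$2        # hypothetical command: exits 0 once a master is found
    interval=${3:-1}    # assumed polling interval in seconds

    # Negative: start lock_gulmd and do not worry about quorum.
    if [ "$timeout" -lt 0 ]; then
        return 0
    fi

    waited=0
    while ! $check_cmd; do
        # Positive: give up and fail after $timeout seconds.
        # Zero: this test never fires, so the loop waits indefinitely.
        if [ "$timeout" -gt 0 ] && [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

With the value 300 quoted above, a client whose master never appears would
fail after roughly five minutes; with -1 it would carry on immediately.
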
> > Previously, the client system started even when the lock master was not
> > available, and the status of lock_gulmd on the client was set to "pending".
> > But now the system doesn't start until the master server is also started.
>
>Did you have the system mounting GFS automatically?  Apparently not since it
>would have "hung" there too.  The client node should have eventually timed
>out after 5 minutes without a master server to log into.
>
> > Has this changed recently?
>
>Define recently... sort of need the version information you are using :)
>
>My guess is that since you are complaining about this behavior, you just
>upgraded from GFS-6.0.0-15 to GFS-6.0.2-24.  From the rpm change log:
>
>* Mon Nov 15 2004 Chris Feist <cfeist at redhat.com> 6.0.2-0
>- init.d/lock_gulmd will not start if quorum is not established after
>   a specified time (rbz135732).
>- init.d/lock_gulmd will not stop if GFS is mounted (rbz135730).
>- pool init.d scripts no longer hang on startup until console input
>   is provided (rbz137382).
>
>
> > It is possible that other administrators in the
> > group may have to restart the system at times. If they start the client
> > before the master (or worse, don't start the master at all), then the
> > system will not complete its boot process and other services remain
> > unavailable.
>
>Your nodes won't be able to mount GFS if the cluster's gulm servers
>aren't quorate, so what's the problem?
>
> > I would like
> > the system to complete its boot process and have lock_gulmd stay in the
> > pending state until the master comes back. Is there any trick to achieve
> > this behavior?
>
>GULM_QUORUM_TIMEOUT=-1
>
>One other suggestion.  I usually start sshd immediately after networking on
>my machines so that I can get into them as soon as possible.  This often
>helps when dealing with complaints of this nature.
>--
>Adam Manthei  <amanthei at redhat.com>