[Linux-cluster] running clurgmgr directly causes clustat malfunction

Martin Waite Martin.Waite at datacash.com
Wed Jun 23 08:59:31 UTC 2010



> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of Tom Lanyon
> Sent: 23 June 2010 01:40
> To: linux clustering
> Subject: Re: [Linux-cluster] running clurgmgr directly causes clustat malfunction
> 
> On 23/06/2010, at 1:48 AM, Martin Waite wrote:
> 
> > Hi,
> >
> > RHEL 5.4: cluster2.
> >
> > Following Tom's advice from earlier today, in order to work around a
> > problem with starting rgmanager causing frozen services to stop, I
> > started /usr/sbin/clurgmgrd directly rather than through an init.d
> > script. This enables the "-N" flag to be passed in on the command line.
> >
> > However, starting rgmanager this way (with or without the -N flag)
> > causes problems with local invocations of clustat, i.e. rgmanager
> > cannot be seen in its output. (clustat run on other cluster nodes DOES
> > see rgmanager on this node, however.)
> >
> > I have waited for minutes after invoking /usr/sbin/clurgmgrd for it to
> > show up in clustat output, but with no joy.
> >
> > I have traced through the init.d script and cannot see that very much
> > happens in there to affect how clurgmgrd is run.
> >
> > Any ideas, anyone?
> 
> When you run "clurgmgrd -N" manually, have you checked /var/log/messages
> to see whether it is indeed starting correctly?
>
> You could also try running clurgmgrd with the -f and -d flags to run in
> the foreground and enable debugging, so you can see what's going on.
>
> FYI it works for me on the following - perhaps you've just found a
> cman/rgmanager incompatibility?
> 	cman-2.0.98-1.el5_3.4
> 	openais-0.80.3-22.el5_3.8
> 	rgmanager-2.0.52-6.el5
> 
> 
> > [martin at cp1edidbm001 ~]$ sudo /sbin/service rgmanager stop
> > [martin at cp1edidbm001 ~]$ sudo /usr/sbin/clurgmgrd start
> 
> Are you actually running this verbatim? If so, you have the wrong
> command :) - it should be:
> 	$ sudo /usr/sbin/clurgmgrd -N
> 
> >  [martin at cp1edidbm001 ~]$ sudo /usr/sbin/clustat
> 
> 

Hi Tom,

I was running that verbatim. I have now re-run the sequence with the
flags you suggested:

[martin at cp1edidbm001 ~]$ sudo /etc/init.d/rgmanager stop
Shutting down Cluster Service Manager...
Waiting for services to stop:                              [  OK  ]
Cluster Service Manager is stopped.
[martin at cp1edidbm001 ~]$ sudo /usr/sbin/clurgmgrd -f -d -N
[20839] info: I am node #1
[20839] debug: Fence domain already joined or no fencing configured
[20839] notice: Resource Group Manager Starting
[20839] info: Loading Service Data
[20839] debug: Loading Resource Rules
[20839] debug: 0 rules loaded
[20839] debug: Building Resource Trees
[20839] debug: 0 resources defined
[20839] debug: Loading Failover Domains
[20839] debug: 2 domains defined
[20839] debug: 1 events defined
[20839] info: Skipping stop-before-start: overridden by administrator
[20839] debug: Event: Port Opened
[20839] info: State change: Local UP
[20839] info: State change: svXprdclu002 UP
[20839] info: State change: svXprdclu003 UP
[20839] info: State change: svXprdclu004 UP
[20839] info: State change: svXprdclu005 UP
[20882] debug: Event (1:1:1) Processed
[20882] debug: Event (0:2:1) Processed
[20882] debug: Event (0:3:1) Processed
[20882] debug: Event (0:4:1) Processed
[20882] debug: Event (0:5:1) Processed
[20882] debug: 5 events processed

Meanwhile, in another window:

[martin at cp1edidbm001 ~]$ sudo /usr/sbin/clustat
Cluster Status for EDISV1DBM @ Wed Jun 23 09:53:06 2010
Member Status: Quorate

 Member Name                                  ID   Status
 ------ ----                                  ---- ------
 svXprdclu001                                     1 Online, Local
 svXprdclu002                                     2 Online
 svXprdclu003                                     3 Online
 svXprdclu004                                     4 Online
 svXprdclu005                                     5 Online

So clustat still shows no sign of rgmanager on the local node.
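
For repeated checks, a small helper along these lines could save some
squinting. It is only a sketch: it assumes that clustat marks a member
running rgmanager with an extra "rgmanager" flag in the Status column
(as in "Online, Local, rgmanager"), and the function name
local_has_rgmanager is made up for illustration.

```shell
#!/bin/sh
# Read clustat output on stdin and succeed only if the local member's
# line carries the "rgmanager" flag. Assumes the local member is marked
# "Online, Local" in the Status column, as in the output above.
local_has_rgmanager() {
    grep 'Online, Local' | grep -q 'rgmanager'
}

# Example:
#   sudo /usr/sbin/clustat | local_has_rgmanager \
#       && echo "rgmanager visible locally" \
#       || echo "rgmanager NOT visible locally"
```

Filtering on the local line first means that rgmanager flags reported
for the other nodes do not produce a false positive.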

My package versions differ from yours:

[martin at cp1edidbm001 ~]$ rpm -qa | egrep "rgmanager|cman|openais"
openais-0.80.6-16.el5_5.1
cman-2.0.115-34.el5
rgmanager-2.0.52-1.el5_4.3

There must be something happening in the init.d script that enables this
to work.  I'll explore the environment variables later.
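
A rough sketch of how that comparison might be done, assuming the
difference really is in the environment (which is not yet confirmed):
dump the environment of the running daemon from /proc once per start
method and diff the results. The function name dump_env and the
/tmp/env.* file names are placeholders.

```shell
#!/bin/sh
# Print the environment of a running process, one VAR=value per line,
# sorted so that two dumps can be compared with diff.
# /proc/<pid>/environ is NUL-separated, hence the tr.
dump_env() {
    tr '\0' '\n' < "/proc/$1/environ" | sort
}

# Example (as root; pidof -s returns a single PID even though
# clurgmgrd runs as more than one process):
#   /etc/init.d/rgmanager start
#   dump_env "$(pidof -s clurgmgrd)" > /tmp/env.initd
#   /etc/init.d/rgmanager stop
#   /usr/sbin/clurgmgrd -N
#   dump_env "$(pidof -s clurgmgrd)" > /tmp/env.manual
#   diff /tmp/env.initd /tmp/env.manual
```

Note that this only shows the environment the process was started with,
so anything else the init script sets up (working directory, ulimits,
lock files) would have to be checked separately.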

regards,
Martin
