[Linux-cluster] running clurgmgr directly causes clustat malfunction

Martin Waite Martin.Waite at datacash.com
Wed Jun 23 13:45:06 UTC 2010



> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of Martin Waite
> Sent: 23 June 2010 10:00
> To: linux clustering
> Subject: Re: [Linux-cluster] running clurgmgr directly causes clustat malfunction
> 
> 
> 
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com]
> > On Behalf Of Tom Lanyon
> > Sent: 23 June 2010 01:40
> > To: linux clustering
> > Subject: Re: [Linux-cluster] running clurgmgr directly causes clustat malfunction
> >
> > On 23/06/2010, at 1:48 AM, Martin Waite wrote:
> >
> > > Hi,
> > >
> > > RHEL 5.4: cluster2.
> > >
> > > Following Tom's advice from earlier today, in order to work around a
> > > problem with starting rgmanager causing frozen services to stop, I
> > > started /usr/sbin/clurgmgrd directly rather than through an init.d
> > > script.  This enables the "-N" flag to be passed in on the command line.
> > >
> > > However, starting rgmanager this way (with or without the -N flag)
> > > causes problems with local invocations of clustat, i.e. rgmanager
> > > cannot be seen in its output.  (clustat run on other cluster nodes DOES
> > > see rgmanager on this node, however.)
> > >
> > > I have waited for minutes after invoking /usr/sbin/clurgmgrd for it to
> > > show up in clustat output, but with no joy.
> > >
> > > I have traced through the init.d script and cannot see that very much
> > > happens in there to affect how clurgmgrd is run.
> > >
> > > Any ideas anyone ?
> >
> > When you run "clurgmgrd -N" manually, have you checked /var/log/messages
> > to see whether it is indeed starting correctly?
> >
> > You could also try running clurgmgrd with the -f and -d flags to run in
> > the foreground and enable debugging, so you can see what's going on.
> >
> > FYI it works for me on the following - perhaps you've just found a
> > cman/rgmanager incompatibility?
> > 	cman-2.0.98-1.el5_3.4
> > 	openais-0.80.3-22.el5_3.8
> > 	rgmanager-2.0.52-6.el5
> >
> >
> > > [martin at cp1edidbm001 ~]$ sudo /sbin/service rgmanager stop
> > > [martin at cp1edidbm001 ~]$ sudo /usr/sbin/clurgmgrd start
> >
> > Are you actually running this verbatim? If so, you have the wrong
> > command :) - it should be:
> > 	$ sudo /usr/sbin/clurgmgrd -N
> >
> > >  [martin at cp1edidbm001 ~]$ sudo /usr/sbin/clustat
> >
> >
> 
> Hi Tom,
> 
> I was running that verbatim.   I re-ran the sequence:
> 
> [martin at cp1edidbm001 ~]$ sudo /etc/init.d/rgmanager stop
> Shutting down Cluster Service Manager...
> Waiting for services to stop:                              [  OK  ]
> Cluster Service Manager is stopped.
> [martin at cp1edidbm001 ~]$ sudo /usr/sbin/clurgmgrd -f -d -N
> [20839] info: I am node #1
> [20839] debug: Fence domain already joined or no fencing configured
> [20839] notice: Resource Group Manager Starting
> [20839] info: Loading Service Data
> [20839] debug: Loading Resource Rules
> [20839] debug: 0 rules loaded
> [20839] debug: Building Resource Trees
> [20839] debug: 0 resources defined
> [20839] debug: Loading Failover Domains
> [20839] debug: 2 domains defined
> [20839] debug: 1 events defined
> [20839] info: Skipping stop-before-start: overridden by administrator
> [20839] debug: Event: Port Opened
> [20839] info: State change: Local UP
> [20839] info: State change: svXprdclu002 UP
> [20839] info: State change: svXprdclu003 UP
> [20839] info: State change: svXprdclu004 UP
> [20839] info: State change: svXprdclu005 UP
> [20882] debug: Event (1:1:1) Processed
> [20882] debug: Event (0:2:1) Processed
> [20882] debug: Event (0:3:1) Processed
> [20882] debug: Event (0:4:1) Processed
> [20882] debug: Event (0:5:1) Processed
> [20882] debug: 5 events processed
> 
> Meanwhile, in another window:
> 
> [martin at cp1edidbm001 ~]$ sudo /usr/sbin/clustat
> Cluster Status for EDISV1DBM @ Wed Jun 23 09:53:06 2010
> Member Status: Quorate
> 
>  Member Name                                  ID   Status
>  ------ ----                                  ---- ------
>  svXprdclu001                                     1 Online, Local
>  svXprdclu002                                     2 Online
>  svXprdclu003                                     3 Online
>  svXprdclu004                                     4 Online
>  svXprdclu005                                     5 Online
> 
> So still no rgmanager output.
> 
> My versions of the packages are different:
> 
> [martin at cp1edidbm001 ~]$ rpm -qa | egrep "rgmanager|cman|openais"
> openais-0.80.6-16.el5_5.1
> cman-2.0.115-34.el5
> rgmanager-2.0.52-1.el5_4.3
> 
> There must be something happening in the init.d script that enables this
> to work.  I'll explore the environment variables later.
> 
> regards,
> Martin
> 


Hi,

I found the cause of the problem:  we run a sudo environment that
restricts exec permission to a specified list of programs.  

The init.d script was covered by this - and so worked fine - but
/usr/sbin/clurgmgrd was not.

The problem was solved by adding /usr/sbin/clurgmgrd to the list of
programs allowed to exec under sudo.
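
As an illustration only, the change amounts to something like the sudoers
fragment below. The user name, the host/runas fields and the plain
command-list layout are all assumptions made for the sketch; the real
sudoers file is not shown in this thread and may well use aliases or tags
instead.

```
# /etc/sudoers fragment (edit with visudo) - hypothetical entry, not the
# actual configuration from this cluster.

# Before: only the init script appears in the permitted list.
# martin  ALL = (root) /etc/init.d/rgmanager

# After: the daemon binary is listed as well, so it can be started directly.
martin  ALL = (root) /etc/init.d/rgmanager, /usr/sbin/clurgmgrd
```

Running "sudo -l" as the affected user lists the commands that user is
currently permitted to run, which is a quick way to check whether a given
binary is covered.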

regards,
Martin