[Linux-cluster] ricci is very unstable in one nodes

fosiul alam expertalert at gmail.com
Mon Sep 27 17:37:44 UTC 2010


Hi, in addition to my previous email, have a look at this one.

From http1 (where I am trying to relocate a service):

[root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
Member http1.xxxx.local trying to enable service:httpd1...Success
Warning: service:httpd1 is now running on mail01.xxxx.local

So it says Success, but that is not what actually happened: the service
did not start on http1 and is now running on mail01.
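
Since the exit message alone is misleading here, one way to confirm where a
service really ended up is to read the owner column from clustat. The
following is only a minimal sketch under assumptions: the helper names
(service_owner, relocated_as_requested) are hypothetical, the sample text is
copied from the clustat output quoted later in this thread, and the parsing
assumes the whitespace-separated "Service Name / Owner (Last) / State"
layout shown there.

```python
# Sample service rows, taken from the clustat output quoted in this thread.
SAMPLE = """\
 Service Name            Owner (Last)              State
 ------- ----            ----- ------              -----
 service:httpd1          mail01.xxxx.local         started
 service:mysql-server    http1.xxxx.local          started
 service:public-dns      publicdns1.xxxxxx.local   started
"""


def service_owner(clustat_text, service):
    """Return (owner, state) for `service`, or None if it is not listed.

    Assumes each service row is: name, owner, state, separated by whitespace.
    """
    for line in clustat_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == service:
            return fields[1], fields[-1]
    return None


def relocated_as_requested(clustat_text, service, wanted_member):
    """True only if `service` is started and owned by `wanted_member`."""
    found = service_owner(clustat_text, service)
    return (
        found is not None
        and found[0] == wanted_member
        and found[1] == "started"
    )


if __name__ == "__main__":
    # The enable was requested for http1, but clustat lists mail01 as owner.
    print(service_owner(SAMPLE, "service:httpd1"))
    print(relocated_as_requested(SAMPLE, "service:httpd1", "http1.xxxx.local"))
```

On a live node the same functions could be fed the text printed by clustat
instead of SAMPLE; that part is left out because the exact output may differ
between versions.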

Thanks again



On 27 September 2010 18:31, fosiul alam <expertalert at gmail.com> wrote:

> Hi
> Thanks for your advice.
> Currently I have these installed:
>
>
> luci-0.12.2-12.el5.centos.1
> ricci-0.12.2-12.el5.centos.1
>
> Are these the same RPMs as
>
> luci-0.12.2-12.el5_5.4.i386.rpm  ?
> ricci-0.12.2-12.el5_5.4.i386.rpm  ?
>
> Thanks
>
>
>
> On 27 September 2010 17:55, Paul M. Dyer <pmdyer at ctgcentral2.com> wrote:
>
>> http://rhn.redhat.com/errata/RHBA-2010-0716.html
>>
>> It appears that this problem has been fixed in this errata.
>>
>> I installed the luci and ricci updates and did some light testing. So
>> far, the timeout 11111 error has not shown up.
>>
>> Paul
>>
>> ----- Original Message -----
>> From: "fosiul alam" <expertalert at gmail.com>
>> To: "linux clustering" <linux-cluster at redhat.com>
>> Sent: Monday, September 27, 2010 10:48:27 AM
>> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
>>
>> Hi,
>> I am trying to patch ricci; let's see how it goes.
>>
>> But clusvcadm is failing as well:
>>
>> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
>> Member http1.xxxx.local trying to enable service:httpd1...Invalid
>> operation for resource
>>
>> Here http1 is the node where I was trying to run the service from luci.
>>
>> What could be the problem?
>> Is there any way to find out whether there is a problem with the config?
>>
>> On 27 September 2010 16:26, Ben Turner < bturner at redhat.com > wrote:
>>
>>
>> RHEL 5.6 hasn't been released yet so your package probably contains the
>> problem. I'm not sure how in sync Centos is with RHEL or if they patch
>> earlier so I cannot give you a time frame when it will be in Centos or
>> if they have already patched it. The problem in that BZ is more of an
>> annoyance, you usually just have to retry a time or two and it works. If
>> you can't get Luci working properly with your service at all you should
>> try enabling the service through the command line with clusvcadm -e. If
>> it is not working from the command line either then there is a problem
>> with the service config.
>>
>>
>>
>>
>> -Ben
>>
>>
>>
>>
>> ----- "fosiul alam" < expertalert at gmail.com > wrote:
>>
>> > Hi Ben
>> > Thanks
>> >
>> > I named this cluster mysql-server, but I have not installed the MySQL
>> > database there yet.
>> >
>> > Both luci and ricci, on the luci server and on node1, are running
>> > these versions:
>> >
>> > luci-0.12.2-12.el5.centos.1
>> > ricci-0.12.2-12.el5.centos.1
>> >
>> >
>> > Do you think this version has the problem as well?
>> >
>> > Thanks for your help.
>> >
>> >
>> >
>> >
>> > On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
>> >
>> >
>> > There is an issue with ricci timeouts that was fixed recently:
>> >
>> > https://bugzilla.redhat.com/show_bug.cgi?id=564490
>> >
>> > I'm not sure but you may be hitting that bug. Symptoms include: luci
>> > isn't able to get the status from the node, timeouts when querying
>> > ricci, etc. The fix should be released with 5.6
>> >
>> > On the mysql service there are some options that you need to set. Here
>> > are all the options available to that agent:
>> >
>> > mysql
>> > Defines a MySQL database server
>> >
>> > Attribute               Description
>> > config_file             Define configuration file
>> > listen_address          Define an IP address for MySQL server. If the
>> >                         address is not given then first IP address from
>> >                         the service is taken.
>> > mysqld_options          Other command-line options for mysqld
>> > name                    Name
>> > ref                     Reference to existing mysql resource in the
>> >                         resources section.
>> > service_name            Inherit the service name.
>> > shutdown_wait           Wait X seconds for correct end of service shutdown
>> > startup_wait            Wait X seconds for correct end of service startup
>> > __enforce_timeouts      Consider a timeout for operations as fatal.
>> > __failure_expire_time   Amount of time before a failure is forgotten.
>> > __independent_subtree   Treat this and all children as an independent
>> >                         subtree.
>> > __max_failures          Maximum number of failures before returning a
>> >                         failure to a status check.
>> >
>> > If I recall correctly you may need to tweak:
>> >
>> > shutdown_wait Wait X seconds for correct end of service shutdown
>> > startup_wait Wait X seconds for correct end of service startup
>> >
>> > There can be problems relocating the DB if it takes too long to
>> > start/shutdown. If you are having problems relocating with luci it may
>> > be a good idea to test with:
>> >
>> > # clusvcadm -r <service name> -m <cluster node>
>> >
>> > -Ben
>> >
>> >
>> >
>> >
>> >
>> >
>> > ----- "fosiul alam" < expertalert at gmail.com > wrote:
>> >
>> > > Hi,
>> > > I have a 4-node cluster.
>> > > It was running fine, but today one node is giving trouble.
>> > >
>> > > When I try to relocate a service onto this node, or from this node
>> > > to another node, the luci GUI shows:
>> > >
>> > > Unable to retrieve batch 1908047789 status from
>> > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
>> > > Starting cluster service "httpd1" on node "http1.domain.local" -- You
>> > > will be redirected in 5 seconds.
>> > >
>> > > and also:
>> > >
>> > > The ricci agent for this node is unresponsive. Node-specific
>> > > information is not available at this time. :
>> > >
>> > > But ricci is running on the problematic node:
>> > > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
>> > >
>> > > There is no firewall running:
>> > >
>> > > iptables -L
>> > > Chain INPUT (policy ACCEPT)
>> > > target prot opt source destination
>> > >
>> > > Chain FORWARD (policy ACCEPT)
>> > > target prot opt source destination
>> > >
>> > > Chain OUTPUT (policy ACCEPT)
>> > > target prot opt source destination
>> > >
>> > > Chain RH-Firewall-1-INPUT (0 references)
>> > > target prot opt source destination
>> > >
>> > > Port 11111 is listening:
>> > >
>> > > netstat -an | grep 11111
>> > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
>> > >
>> > >
>> > > But ricci is still very unstable, and I can't relocate any service
>> > > onto this node or away from it.
>> > >
>> > > On the problematic node, clustat shows this:
>> > >
>> > > clustat
>> > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
>> > > Member Status: Quorate
>> > >
>> > > Member Name                ID   Status
>> > > ------ ----                ---- ------
>> > > beaver.xxx.local           1    Online, rgmanager   ::: luci is running from this server
>> > > publicdns1.xxxx.local      2    Online, rgmanager
>> > > http1.xxxx.local           3    Online, Local, rgmanager
>> > > mail01.xxxxx.local         4    Online, rgmanager
>> > >
>> > > Service Name               Owner (Last)              State
>> > > ------- ----               ----- ------              -----
>> > > service:httpd1             mail01.xxxx.local         started
>> > > service:mysql-server       http1.xxxx.local          started   ------------------- this is the problematic node
>> > > service:public-dns         publicdns1.xxxxxx.local   started
>> > >
>> > > I can't move the mysql-server service off this node, and I can't
>> > > relocate any service onto this node.
>> > > I am very confused.
>> > >
>> > > What should I do to fix this issue?
>> > >
>> > > Thanks for your advice.
>> > >
>> > >
>> > >
>> > >
>> > > -- Linux-cluster mailing list
>> > > Linux-cluster at redhat.com
>> > > https://www.redhat.com/mailman/listinfo/linux-cluster
>> >
>>
>
>