[Linux-cluster] Corosync goes cpu to 95-99%

carlopmart carlopmart at gmail.com
Thu Jun 2 09:21:06 UTC 2011


On 06/02/2011 01:27 AM, Nicolas Ross wrote:
>
>>>
>>> cman_tool join is called in /etc/rc.d/init.d/cman I believe. Add a -P
>>> option to it.
>>>
>>> Regards
>>> -steve
>>
>> Where is "-P" option under cman_tool manpage?? I didn't see it. Appears
>> "-S", "-X", "-A", "-D" ... but not -P ...
>>
>> Is it correct to put this option under /etc/sysconfig/cman config file
>> on RHEL6??
>
> I had to modify my /etc/rc.d/init.d/cman script on each node and add -P
> (undocumented) at line 500, after $cman_join_opts
>
> And it did not solve the problem, but it helped a very little bit to
> alleviate it. While a node is experiencing it, it's still not usable by
> ssh, but response time to service seems very slightly better, barely
> noticeable.
>
> GSS asked me today to produce a core dump of corosync while it's eating
> up CPU.
>
> Regards,
>

Oops .. Bad, bad, very bad news, at least for me. Nicolas, I have found
a way to pass "-p" to corosync without modifying the cman startup
script. In the /etc/sysconfig/cman config file, I added this line:

CMAN_JOIN_OPTS="-P"

  .. and it works OK.

[root@rhelnode01 sysconfig]# ps xa |grep corosync
  1033 ?        SLsl   0:00 corosync -f -p
  1494 pts/1    S+     0:00 grep corosync
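
For anyone who wants to script this on several nodes, here is a rough sketch, not a tested procedure. The helper names set_cman_join_opts and corosync_has_p_flag are made up for illustration and are not part of cman or corosync. The sketch assumes /etc/sysconfig/cman is a plain shell-style file and that "ps -C" from procps is available. Note that it overwrites any existing CMAN_JOIN_OPTS value instead of merging options.

```shell
#!/bin/sh
# Hypothetical helper: make sure FILE contains exactly one
# CMAN_JOIN_OPTS="OPTS" line (replaces an existing one, else appends).
set_cman_join_opts() {
    file=$1
    opts=$2
    if grep -q '^CMAN_JOIN_OPTS=' "$file" 2>/dev/null; then
        # Rewrite through a temp file so the original keeps its permissions.
        tmp_file=$(mktemp) || return 1
        sed "s|^CMAN_JOIN_OPTS=.*|CMAN_JOIN_OPTS=\"$opts\"|" "$file" > "$tmp_file" \
            && cat "$tmp_file" > "$file"
        rm -f "$tmp_file"
    else
        printf 'CMAN_JOIN_OPTS="%s"\n' "$opts" >> "$file"
    fi
}

# Hypothetical helper: succeed if a running corosync shows "-p" in its
# command line, as in the ps output above.
corosync_has_p_flag() {
    ps -C corosync -o args= | grep -Eq -- ' -p( |$)'
}

# Example use on a node (run as root, then restart cman):
#   set_cman_join_opts /etc/sysconfig/cman -P
#   corosync_has_p_flag && echo "corosync is running with -p"
```

Nothing runs until the functions are called, so the file can be sourced safely before being used on a real node.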

I will do some tests with two nodes, but I think RHEL 6.x is not yet
ready for production environments, at least not RHCS.


-- 
CL Martinez
carlopmart {at} gmail {d0t} com



