[Linux-cluster] RH Cluster / Pacemaker / Veritas Cluster Server & SF

Fri Nov 19 12:15:05 UTC 2010

On 11/19/2010 12:46 PM, Radu Rendec wrote:
> On Fri, 2010-11-19 at 12:04 +0100, Fabio M. Di Nitto wrote:
>> On 11/19/2010 11:41 AM, Radu Rendec wrote: 
>>> However, I can tell you for sure that RH6 comes with RH cluster 3.x.x.
>>> My attempts to migrate the setup I have on RH5 (based on RH cluster
>>> 2.x.x) to version 3.x.x have lamentably failed.
>>
>>>
>>> It's also true that I ran the tests on Fedora 14 (because Centos 6 is
>>> not out yet) but on the other hand it's RH cluster that didn't work
>>> properly, not the distribution.
>>
>> did you file any bugzillas?
> 
> I didn't file any. First I tried posting to this list hoping that
> someone would shout "hey, you messed up your setup! you were not
> supposed to do this and that" etc etc. But I actually haven't got any
> reply at all.

Understood, but you need to file bugzillas for issues. We simply don't
have enough resources to track bugs on mailing lists too.

> 
>> as upstream, I don't really care if it's based on Fedora or Centos.
>>
>> what problems did you hit?
> 
> The problem is described in more detail in an older post that I made a
> few days ago:
> 
> https://www.redhat.com/archives/linux-cluster/2010-November/msg00076.html
> 
> Basically I've got a "braindead" rgmanager after a few hours of cluster
> uptime. I've been keeping the machines running like that since the
> failure, hoping that someone would ask me to look at various things
> while the processes are still in this state.

> 
> There's another problem also related to rgmanager that I didn't describe
> in the other post. At a certain point, I added a new resource to the
> config file and updated the cluster config. However, the rgmanager on
> the very node that I used for updating failed to "see" the new resource,
> while the other node "saw" it immediately and started it. Cman reported
> the same config version (the new one) on both nodes.

OK, please file a separate bugzilla for each problem, and collect
/var/log/cluster/* and cluster.conf.
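
A minimal sketch of gathering that data into one attachment per node.
The function name and the archive layout are illustrative only, and the
default cluster.conf location (/etc/cluster/cluster.conf) is assumed to
be the usual one; adjust the paths if your setup differs.

```python
import os
import tarfile


def collect_cluster_data(out_path,
                         log_dir="/var/log/cluster",
                         conf_path="/etc/cluster/cluster.conf"):
    """Bundle the cluster logs and cluster.conf into a gzip'd tarball.

    Returns the list of source paths that were actually found and added.
    The default conf_path is an assumption about the standard location.
    """
    added = []
    with tarfile.open(out_path, "w:gz") as tar:
        for path in (log_dir, conf_path):
            if os.path.exists(path):
                # Store entries under a short relative name rather than
                # the absolute path.
                tar.add(path, arcname=os.path.basename(path.rstrip("/")))
                added.append(path)
    return added
```

Something like collect_cluster_data("node1-cluster-data.tar.gz"), run
on each node, would give one file per node to attach to each bugzilla.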

> 
> Looking at your email address (the domain part actually), I'm kindly
> asking you for any suggestions on where and how to report these issues
> properly :)

I'd be very happy to help, but two warnings: the rgmanager maintainer is
temporarily unavailable and I don't have his expertise. Failure to
provide the requested data is going to make it complex to debug.

If, for any reason, you need to hide passwords in cluster.conf, please
attach the "mangled" version in bugzilla and send a pristine copy to
me and Lon <lhh at redhat.com>. We don't care about your passwords or real
ip addresses, but we have seen way too many people breaking cluster.conf
only to mask a password. We need to make sure everything is in the right
place.

Another option is to add <logging debug="on"/> in cluster.conf and
repeat the tests.
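
As a rough sketch of that change, assuming <logging> sits directly
under the top-level <cluster> element and that config_version has to be
bumped whenever the file changes, the edit could be scripted as below.
The function name is made up for the example, and the position of the
new element among its siblings is an assumption. Note that ElementTree
drops XML comments when it rewrites the document, so editing the file
by hand may be preferable.

```python
import xml.etree.ElementTree as ET


def enable_debug_logging(xml_text):
    """Return cluster.conf text with <logging debug="on"/> set.

    Assumes the root element is <cluster> and that <logging> is one of
    its direct children. Also increments config_version by one.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "cluster":
        raise ValueError("expected <cluster> as the root element")

    logging_el = root.find("logging")
    if logging_el is None:
        # No <logging> element yet: append one to <cluster>.
        logging_el = ET.SubElement(root, "logging")
    logging_el.set("debug", "on")

    # Bump the version so the change counts as a new configuration.
    version = int(root.get("config_version", "0"))
    root.set("config_version", str(version + 1))

    return ET.tostring(root, encoding="unicode")
```

The result still has to be distributed to the other nodes in whatever
way you normally update the cluster config.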

> 
> I would really like to help debugging this because I consider RH cluster
> to be great software.

thanks for the help!

Fabio