[Cluster-devel] [Linux-HA] Error running corosync

Andrew Beekhof andrew at beekhof.net
Fri Nov 11 04:17:32 UTC 2011


On Tue, Nov 8, 2011 at 1:08 PM, Tim Serong <tserong at suse.com> wrote:
> On 11/07/2011 11:34 PM, Nick Khamis wrote:
>> Hello Everyone,
>>
>> After being unsuccessful trying to get cman+pacemaker working,
>> I decided to try the latest committed version of pacemaker ("git clone
>> https://github.com/ClusterLabs/pacemaker.git"), and I am receiving
>> the following error from ocfs2_controld.pcmk:
>>
>>
>>   ocfs2_controld.pcmk -D
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional service options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'corosync_quorum' for option: name
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional service options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'corosync_cman' for option: name
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional service options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'openais_clm' for option: name
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional service options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'openais_evt' for option: name
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional service options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'openais_ckpt' for option: name
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional service options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'openais_msg' for option: name
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional service options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'openais_lck' for option: name
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional service options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'openais_tmr' for option: name
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next: No
>> additional configuration supplied for: service
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: config_find_next:
>> Processing additional quorum options...
>> ocfs2_controld[6883]: 2011/11/03_16:34:19 info: get_config_opt: Found
>> 'quorum_cman' for option: provider
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 info: get_cluster_type:
>> Detected an active 'cman' cluster
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 info: get_local_node_name:
>> Using CMAN node name: astdrbd1
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 info:
>> init_ais_connection_once: Connection to 'cman': established
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 info: crm_new_peer: Node
>> astdrbd1 now has id: 1
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 info: crm_new_peer: Node 1
>> is now known as astdrbd1
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
>> send_ais_text: Triggered assert at corosync.c:352 : dest !=
>> crm_msg_ais
>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
>> send_ais_text: Triggered assert at corosync.c:352 : dest !=
>> crm_msg_ais
>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>> 1320352460 setup_stack at 170: Cluster connection established.  Local node id: 1
>> 1320352460 setup_stack at 174: Added Pacemaker as client 1 with fd -1
>>
>
> I still believe these errors are the result of pacemaker (apparently)
> not knowing/thinking it's running on/with openais (for some reason).  See:
>
> http://oss.clusterlabs.org/pipermail/pacemaker/2011-November/011978.html
>
> I also don't see how the patch Andrew mentioned at
> http://oss.clusterlabs.org/pipermail/pacemaker/2011-November/011992.html
> could fix this (but would be delighted to be proved wrong).

The RA, IIRC, was looking for HA_quorum_type, which was unset before
that patch when starting pacemaker from the daemon.

However, if he's getting this problem while using cman (I've
completely lost track at this point), then the problem is that the RA
is selecting ocfs2_controld.pcmk instead of the "normal"
ocfs2_controld.
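
To make the selection issue concrete, here is a minimal sketch of the kind of
decision described above. It is not the actual RA code: the function name
pick_ocfs2_daemon is hypothetical, and the assumption that HA_quorum_type
holds the literal string "cman" on a cman stack is mine. The only things
taken from this thread are the variable name and the two daemon names.

```python
import os


def pick_ocfs2_daemon(env=None):
    """Return the control daemon name for the detected cluster stack.

    Hypothetical helper: the real resource agent may decide differently.
    """
    env = os.environ if env is None else env
    stack = env.get("HA_quorum_type", "")

    if not stack:
        # Before the patch mentioned above, the variable could be unset,
        # so the stack could not be identified at all.
        raise ValueError("HA_quorum_type is unset; cannot identify the stack")

    if stack == "cman":  # assumed value for a cman-based cluster
        # Under cman, the "normal" daemon is the one to run.
        return "ocfs2_controld"

    # Any other stack: fall back to the Pacemaker-specific daemon.
    return "ocfs2_controld.pcmk"
```

With this sketch, an environment containing HA_quorum_type=cman selects
ocfs2_controld, while an unset variable fails loudly instead of silently
picking the .pcmk variant.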

>
>> Setup:
>>
>> PCMK 1.1.6-2d8fad5
>> CMAN 3.1.7
>> Corosync 1.4.2
>> OpenAIS Latest version
>>
>> I just want to mention that I never start OpenAIS, just corosync. Is
>> this OK for dlm and configfs? Or should I be using openais?
>
> ocfs2_controld with Pacemaker needs openais, but openais isn't something
> you "start" separately, it's a bunch of plugins that corosync is meant
> to load.  What this means in a CMAN environment, I do not know.

Cman starts corosync+openais.

>
> IMO (and as Florian alluded to in another message), you'd probably save
> yourself a lot of trouble taking prebuilt packages from a distro where
> the pieces you need are known to work together.

Indeed.

>
> Not to say I think what you're doing won't ultimately be worthwhile, but
> it could be the case that you are the first person in the world to try
> to combine these versions of these specific components in exactly the
> way you are doing so.
>
> Regards,
>
> Tim
> --
> Tim Serong
> Senior Clustering Engineer
> SUSE
> tserong at suse.com