[Linux-cluster] How to integrate a custom resource agent into RHCS?

Ralph.Grothe at itdz-berlin.de
Mon May 30 12:28:34 UTC 2011


Hi,

I hope this is the right forum, so bear with me, Pacemaker
aficionados et alii, when I talk about Red Hat Cluster Suite
(RHCS).
That's the clusterware product I have been given to set up the
cluster, and I'm not free to choose other software of my liking.

Though this may sound ridiculous, I have been labouring for days
to get a fairly simple custom resource agent (henceforth RA)
acknowledged by RHCS and correctly executed through its
rgmanager.

When scripting my RA I mostly adhered to
http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html, except
where RHCS RAs differ from general OCF.

I put my RA in /usr/share/cluster and afterwards restarted
rgmanager on all nodes.

When I try to start the service that contains my RA's managed
resource, the service itself gets started but not my resource,
as if it weren't part of the service at all.


When I try to start my resource via rg_test nothing happens apart
from this obscure log entry


[root@aruba:~]
# rg_test test /etc/cluster/cluster.conf start aDIStn_sec
Running in test mode.
Entity: line 2: parser error : Char 0x0 out of allowed range

^
Entity: line 2: parser error : Premature end of data in tag error line 1

^
[root@aruba:~]
# echo $?
0

[root@aruba:~]
# grep rg_test /var/log/cluster.log|tail -1
May 30 13:54:55 aruba rg_test: [28643]: <err> Cannot dump meta-data because '/usr/share/cluster/default.metadata' is missing


Though this is true

[root@aruba:~]
# ls -l /usr/share/cluster/default.metadata
ls: /usr/share/cluster/default.metadata: No such file or directory

no such file is shipped with the installed clusterware at all
either

[root@aruba:~]
# yum groupinfo Clustering|tail -10|xargs rpm -ql|grep -c default\\.metadata
0

Besides, I don't understand this error: since I wrote my RA
according to the above-mentioned RA Developer's Guide, it does
of course dump its metadata


[root@aruba:~]
# /usr/share/cluster/aDIStn_sec.sh meta-data|grep action
    <actions>
        <action name="start" timeout="0"/>
        <action name="stop" timeout="0"/>
        <action name="status" timeout="5"/>
        <action name="monitor" timeout="5"/>
        <action name="meta-data" timeout="0"/>
        <action name="verify-all" timeout="5"/>
        <action name="validate-all" timeout="5"/>
    </actions>

(Note: RHCS deviates from OCF here in naming its actions
verify-all instead of validate-all and status instead of monitor.
Each pair of names maps to the same case branch in my RA.)
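
To make the note above concrete, here is a minimal sketch of such a
dispatcher. It is not the actual aDIStn_sec.sh code: the function names
ra_start, ra_stop, ra_status, ra_validate and ra_metadata are
placeholders invented for illustration, and their bodies only echo a
label.

```shell
# Placeholder action handlers (hypothetical names, illustration only).
ra_start()    { echo "start"; }
ra_stop()     { echo "stop"; }
ra_status()   { echo "status"; }
ra_validate() { echo "validate"; }
ra_metadata() { echo "meta-data"; }

# Map both the RHCS and the OCF action names onto the same handlers.
ra_dispatch() {
    case "$1" in
        start)                   ra_start ;;
        stop)                    ra_stop ;;
        status|monitor)          ra_status ;;      # RHCS name | OCF name
        verify-all|validate-all) ra_validate ;;    # RHCS name | OCF name
        meta-data)               ra_metadata ;;
        *)
            echo "unsupported action: $1" >&2
            return 3                               # OCF_ERR_UNIMPLEMENTED
            ;;
    esac
}
```

In a real agent the script would presumably end with something like
`ra_dispatch "$1"; exit $?`.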


I also don't understand the "Char 0x0 out of allowed range" error
from the XML parser.

If it really refers to line 2 of my cluster.conf, that line looks
fine to me


[root@aruba:~]
# sed -n 2p /etc/cluster/cluster.conf
<cluster alias="rhcs_mock" config_version="43" name="rhcs_mock">
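
The parser message says "Entity: line 2" rather than naming
cluster.conf, so it may refer to some other XML document that rg_test
parses, for example an agent's meta-data output. That is only an
assumption; nothing in the output above proves it. One cheap thing to
rule out is a literal NUL byte in the XML. A small sketch, where
count_nul_bytes is a helper name made up for this example:

```shell
# count_nul_bytes: read stdin and print how many NUL (0x00) bytes it contains.
# tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
count_nul_bytes() {
    tr -cd '\000' | wc -c | tr -d ' '
}
```

It could be applied to both candidates, e.g.
`/usr/share/cluster/aDIStn_sec.sh meta-data | count_nul_bytes` and
`count_nul_bytes < /etc/cluster/cluster.conf`; anything other than 0
would point at the offending document.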


If I run a validity check of the XML of my cluster.conf against
RHCS's RNG schema, I also get an error I cannot make sense of,
about an extra element in an interleave.

Nevertheless, all other resources of my cluster which rely on
RHCS's standard RAs are managed ok by the clusterware.



[root@aruba:~]
# declare -f cluconf_valid
cluconf_valid () 
{ 
    xmllint --noout --relaxng /usr/share/system-config-cluster/misc/cluster.ng ${1:-/etc/cluster/cluster.conf}
}
[root@aruba:~]
# cluconf_valid 
Relax-NG validity error : Extra element cman in interleave
/etc/cluster/cluster.conf:2: element cluster: Relax-NG validity error : Element cluster failed to validate content
/etc/cluster/cluster.conf fails to validate


By the way, is there a schema file available for checking an RA's
metadata for validity?



Of course I tested my RA script for correct functionality when
invoked like an init script (for which I supply the required
OCF_RESKEY_<parameter> environment variables),
and it starts, stops and monitors my resource as intended.
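
For reference, the kind of manual invocation meant here can be sketched
as a small wrapper. run_ra is a helper name invented for this example,
and the parameter names passed to it are whatever the agent declares in
its metadata; none of this is taken from the actual agent.

```shell
# run_ra AGENT ACTION [name=value ...]
# Hypothetical helper: exports each name=value pair as OCF_RESKEY_name=value
# inside a subshell, then runs the agent with the given action.
# The exit status is that of the agent; the caller's environment is untouched.
run_ra() {
    local agent=$1 action=$2 kv
    shift 2
    (
        for kv in "$@"; do
            export "OCF_RESKEY_$kv"
        done
        "$agent" "$action"
    )
}
```

A call might look like
`run_ra /usr/share/cluster/aDIStn_sec.sh status name=demo; echo $?`,
where `name=demo` is a placeholder parameter, not one from my RA.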


Can anyone help?


Regards
Ralph




