[Linux-cluster] Using GULM with CLVM
Ugo PARSI
ugo.parsi at gmail.com
Thu Jun 29 22:53:41 UTC 2006
>
> clvmd should log errors to syslog. Check that the gulm cluster is quorate as
> clvmd won't do anything without a quorate cluster. You might also like to run
> it with -d and see if any errors appear on stderr.
>
Yes, you're right, thanks: clvmd and lvm are stuck because gulm is
inquorate and simply doesn't work at all.
But I still can't figure out how to make it work. I've spent many
hours on it now, and all of my problems seem to be
IPv4/IPv6/hostname related (I guess)...
To simplify the configuration and the trial process, I've reduced my
cluster.conf to the simplest case (I guess): just 2 client nodes and
1 gulm master. All of them are on the 10.x.x.x private
IPv4 subnet.
Again, I don't know whether I can trust the 'documentation': it says
that gulm works on IPv6 sockets only, while the man page
(man lock_gulmd) suggests that both IPv4 and IPv6 sockets are
handled by GULM.
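At least the dual-stack behaviour itself is easy to reproduce outside gulm: a listener bound to the IPv6 wildcard address accepts IPv4 clients, but reports their addresses in IPv4-mapped form, which would explain the ::ffff: addresses below. A quick Python sketch (plain sockets, nothing gulm-specific; assumes the usual Linux dual-stack behaviour):

```python
import socket

# A dual-stack IPv6 listener: IPv4 peers show up as ::ffff:a.b.c.d
srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)  # accept IPv4 too
srv.bind(("::", 0))             # wildcard address, ephemeral port
srv.listen(1)
port = srv.getsockname()[1]

# Connect with a plain IPv4 socket to the same port
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", port))
conn, peer = srv.accept()
print(peer[0])                  # -> ::ffff:127.0.0.1
conn.close(); cli.close(); srv.close()
```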
My first problem is this:
venus:/etc/init.d# lock_gulmd --use_ccs
Warning! You didn't specify a cluster name before --use_ccs
Letting ccsd choose which cluster we belong to.
I cannot find the name for ip "::ffff:10.1.1.5". Stopping.
Gulm requires 1,3,4, or 5 nodes to be specified in the servers list.
You specified 0
I cannot find the name for ip "::ffff:10.1.1.5". Stopping.
Gulm requires 1,3,4, or 5 nodes to be specified in the servers list.
You specified 0
venus:/etc/init.d# I cannot find the name for ip "::ffff:10.1.1.5". Stopping.
Gulm requires 1,3,4, or 5 nodes to be specified in the servers list.
You specified 0
Apparently, GULM forces some kind of IPv4-mapped IPv6 address
that it can't find anywhere on the system.
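For what it's worth, ::ffff:10.1.1.5 is just the IPv4-mapped IPv6 form of 10.1.1.5 (RFC 4291); Python's ipaddress module shows the equivalence:

```python
import ipaddress

# The ::ffff:0:0/96 prefix embeds a plain IPv4 address in IPv6 form
addr = ipaddress.IPv6Address("::ffff:10.1.1.5")
print(addr.ipv4_mapped)   # -> 10.1.1.5
```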
Here's my /etc/hosts:
----------------------------------------
venus:/etc/init.d# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
10.1.1.5 venus
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
------------------------------------------
And here's my cluster.conf
-----------------------------------------------------------
venus:/etc/init.d# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster name="iliona" config_version="1">
  <gulm>
    <lockserver name="10.1.1.5"/>
  </gulm>
  <clusternodes>
    <clusternode name="mars">
      <fence>
        <method name="single">
          <device name="human" ipaddr="10.1.1.3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="venus">
      <fence>
        <method name="single">
          <device name="human" ipaddr="10.1.1.5"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="triton">
      <fence>
        <method name="single">
          <device name="human" ipaddr="10.1.1.6"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="human" agent="fence_manual"/>
  </fencedevices>
</cluster>
---------------------------------------------------------
So, in order to force its host-matching process, I've modified my
/etc/hosts as follows:
------------------------------------------
venus:/etc/init.d# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
10.1.1.5 venus
::ffff:10.1.1.5 venus
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
-------------------------------------------
With that one, it *SEEMED* to work (it no longer prints messages at
startup and silently forks the daemon), but the logs show it still
cannot find its IP or hostname:
Jun 30 00:19:03 venus lock_gulmd_LTPX[26223]: I am (venus) with ip (::) :(
Here's the whole log excerpt:
Jun 30 00:19:01 venus lock_gulmd_main[26211]: Forked lock_gulmd_core.
Jun 30 00:19:01 venus lock_gulmd_core[26215]: Starting lock_gulmd_core
1.02.00. (built Jun 23 2006 18:56:19) Copyright (C) 2004 Red Hat, Inc.
All rights reserved.
Jun 30 00:19:01 venus lock_gulmd_core[26215]: I am running in Standard mode.
Jun 30 00:19:01 venus lock_gulmd_core[26215]: I am (venus) with ip (::)
Jun 30 00:19:01 venus lock_gulmd_core[26215]: This is cluster iliona
Jun 30 00:19:01 venus lock_gulmd_core[26215]: EOF on xdr (Magma::26198
::1 idx:1 fd:6)
Jun 30 00:19:02 venus lock_gulmd_main[26211]: Forked lock_gulmd_LT.
Jun 30 00:19:02 venus lock_gulmd_LT[26219]: Starting lock_gulmd_LT
1.02.00. (built Jun 23 2006 18:56:19) Copyright (C) 2004 Red Hat, Inc.
All rights reserved.
Jun 30 00:19:02 venus lock_gulmd_LT[26219]: I am running in Standard mode.
Jun 30 00:19:02 venus lock_gulmd_LT[26219]: I am (venus) with ip (::)
Jun 30 00:19:02 venus lock_gulmd_LT[26219]: This is cluster iliona
Jun 30 00:19:02 venus lock_gulmd_LT000[26219]: Not serving locks from
this node.
Jun 30 00:19:02 venus lock_gulmd_core[26215]: EOF on xdr (Magma::26198
::1 idx:1 fd:6)
Jun 30 00:19:03 venus lock_gulmd_main[26211]: Forked lock_gulmd_LTPX.
Jun 30 00:19:03 venus lock_gulmd_LTPX[26223]: Starting lock_gulmd_LTPX
1.02.00. (built Jun 23 2006 18:56:19) Copyright (C) 2004 Red Hat, Inc.
All rights reserved.
Jun 30 00:19:03 venus lock_gulmd_LTPX[26223]: I am running in Standard mode.
Jun 30 00:19:03 venus lock_gulmd_LTPX[26223]: I am (venus) with ip (::)
Jun 30 00:19:03 venus lock_gulmd_LTPX[26223]: This is cluster iliona
Jun 30 00:19:03 venus ccsd[26197]: Connected to cluster infrastruture
via: GuLM Plugin v1.0.4
Jun 30 00:19:03 venus ccsd[26197]: Initial status:: Inquorate
And indeed it's acting as a 'Client' rather than a 'Server/Master':
venus:/etc/init.d# gulm_tool getstats venus
I_am = Client
quorum_has = 1
quorum_needs = 1
rank = -1
quorate = false
GenerationID = 0
run time = 128
pid = 27456
verbosity = Default
failover = disabled
venus:/etc/init.d#
Of course the other 2 nodes behave the same way, and with no
master the cluster stays inquorate and unusable, hence my
problems with clvmd/lvm.
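As far as I can tell from the "Gulm requires 1,3,4, or 5 nodes" message, gulm quorum is a majority of the configured lock servers, which would match the quorum_needs = 1 shown above for my single lockserver. A tiny sketch of that arithmetic (gulm_quorum is my own hypothetical helper, not a gulm API):

```python
# Hypothetical helper: gulm quorum as a majority of the servers list
def gulm_quorum(n_servers):
    assert n_servers in (1, 3, 4, 5)   # the only list sizes gulm accepts
    return n_servers // 2 + 1

print(gulm_quorum(1), gulm_quorum(3), gulm_quorum(5))  # -> 1 2 3
```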
I've tried many other things, like putting the node names in
cluster.conf instead of IPs (with the names resolved via /etc/hosts
or DNS only, etc.), but I still get the same error.
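For example, one of the name-based variants I tried (with venus resolvable via /etc/hosts) was:

```xml
<gulm>
  <lockserver name="venus"/>
</gulm>
```

but it fails with the same "cannot find the name" error.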
I'm getting really confused by the whole system, and as a
cluster-suite beginner the lack of documentation makes it really
painful to find my mistakes :/
If you have any ideas :),
Thanks a lot,
Ugo PARSI
--
An apple a day, keeps the doctor away