[Linux-cluster] clvmd problems with centos 6.3 or normal clvmd behaviour?

Corey Kovacs corey.kovacs at gmail.com
Thu Aug 2 13:07:25 UTC 2012


I might be reading this wrong but just in case, I thought I'd point this
out.

Your quorum config:

nodes: 3 x 2 votes = 6 votes
qdisk: 3 votes
total: 9 votes, so quorum = floor(9/2) + 1 = 5

A single node plus the qdisk can maintain quorum, since 2 + 3 = 5 meets
that threshold.
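
You can check what cman actually computed at runtime with:

# cman_tool status | grep -E 'Expected votes|Total votes|Quorum'

(the same fields show up in the full status output quoted below).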

In a split-brain condition, where a single node cannot talk to the other
nodes but can still reach the qdisk, this could be disastrous: that node
stays quorate and keeps running on its own.

Now, all that said, with the qdisk volume set up the way yours appears to
be, qdiskd won't be able to start until the cluster is quorate.

Also, you might be running into a chicken-and-egg situation. Is your qdisk
volume marked as clustered? I believe once you set the locking type to 3,
all LVM activity requires clvmd to be running. If it's not marked as
clustered, I don't think that's going to work either, since qdisk requires
concurrent access across nodes. Either way, you end up waiting for clvmd.
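
Something like this should show both conditions (the VG name "VG_QUORUM"
here is just a placeholder for whatever VG holds your qdisk LV):

# grep locking_type /etc/lvm/lvm.conf
# vgs -o vg_name,vg_attr

locking_type 3 means every LVM operation goes through clvmd, and a "c" in
the VG attr column means the VG is marked clustered; "vgchange -cy
VG_QUORUM" / "vgchange -cn VG_QUORUM" toggle that flag.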

It's unclear why you actually need a qdisk. If it's to keep the cluster up
in single-node mode, then I'd have the qdisk start up with a minority vote
(votes="1") and only raise it in a controlled situation where you are sure
the other nodes are shut down completely. Remember, the purpose of quorum
is to ensure that a majority rules, and your config violates that premise.
Just sayin' :)
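
For example, a sketch of the cluster.conf change (your config version is
51 per the status output below, so the next push would be 52):

<quorumd device="/dev/mapper/mpathquorum" interval="5"
         label="clrhevquorum" tko="24" votes="1"/>

# ccs_config_validate
# cman_tool version -r 52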

Does your cluster run without qdiskd configured?
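
As for debugging clvmd itself (you ask about that below), you could stop
the init script and run the daemon in the foreground with debug output; a
rough sketch:

# service clvmd stop
# clvmd -d1 -T30

-d1 sends debug output to stderr and keeps clvmd in the foreground (-d2
would send it to syslog instead). Kicking off the activation by hand from
another terminal, e.g. "vgchange -aly -vvvv", should then show which step
it hangs on.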

Anyway, I hope this helps at least a little. If I am way off base, I
apologize and will crawl back into my cave :)

Good luck


Corey

On Thu, Aug 2, 2012 at 4:50 AM, emmanuel segura <emi2fast at gmail.com> wrote:

> If you think the problem is in LVM, turn on its debug logging; see the
> log section in man lvm.conf.
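>
> For example, something like this in the log section of /etc/lvm/lvm.conf
> (a minimal sketch; the file path and levels are just suggestions):
>
>     log {
>         verbose = 3                    # most verbose command output
>         file = "/var/log/lvm2.log"     # write debug output to a file
>         level = 7                      # 7 = most verbose file logging
>         activation = 1                 # log the activation steps too
>     }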
>
>
> 2012/8/2 Gianluca Cecchi <gianluca.cecchi at gmail.com>
>
>> On Wed, Aug 1, 2012 at 6:15 PM, Gianluca Cecchi wrote:
>> > On Wed, 1 Aug 2012 16:26:38 +0200 emmanuel segura wrote:
>> >> Why don't you remove expected_votes=3 and let the cluster
>> >> automatically calculate that?
>> >
>> > Thanks for your answer Emmanuel, but cman starts correctly; the
>> > problem seems to be the
>> > vgchange -aly
>> > command hanging.
>> > I tried that option too, and the cluster hangs at the same point as
>> > before.
>>
>> Further testing shows that the cluster is indeed quorate and the
>> problems are related to LVM...
>>
>> I also tried a cleaner, more commonly used configuration, as seen in
>> examples for 3 nodes + a quorum daemon:
>>
>> 2 votes for each node
>> <clusternode name="nodeX" nodeid="X" votes="2">
>>
>> 3 votes for quorum disk
>> <quorumd device="/dev/mapper/mpathquorum" interval="5"
>> label="clrhevquorum" tko="24" votes="3">
>>
>> with and without expected_votes="9" in <cman ... /> part
>>
>> A single node plus its quorum disk alone should be quorate (2+3 = 5 votes)
>>
>> After cman starts and quorumd is not master yet:
>>
>> # cman_tool status
>> Version: 6.2.0
>> Config Version: 51
>> Cluster Name: clrhev
>> Cluster Id: 43203
>> Cluster Member: Yes
>> Cluster Generation: 1428
>> Membership state: Cluster-Member
>> Nodes: 1
>> Expected votes: 9
>> Total votes: 2
>> Node votes: 2
>> Quorum: 5 Activity blocked
>> Active subsystems: 4
>> Flags:
>> Ports Bound: 0 178
>> Node name: intrarhev3
>> Node ID: 3
>> Multicast addresses: 239.192.168.108
>> Node addresses: 192.168.16.30
>>
>> Then
>> # cman_tool status
>> Version: 6.2.0
>> Config Version: 51
>> Cluster Name: clrhev
>> Cluster Id: 43203
>> Cluster Member: Yes
>> Cluster Generation: 1428
>> Membership state: Cluster-Member
>> Nodes: 1
>> Expected votes: 9
>> Quorum device votes: 3
>> Total votes: 5
>> Node votes: 2
>> Quorum: 5
>> Active subsystems: 4
>> Flags:
>> Ports Bound: 0 178
>> Node name: intrarhev3
>> Node ID: 3
>> Multicast addresses: 239.192.168.108
>> Node addresses: 192.168.16.30
>>
>> And startup continues up to the clvmd step.
>> In this phase, while clvmd startup hangs forever, I have:
>>
>> # dlm_tool ls
>> dlm lockspaces
>> name          clvmd
>> id            0x4104eefa
>> flags         0x00000000
>> change        member 1 joined 1 remove 0 failed 0 seq 1,1
>> members       3
>>
>> # ps -ef|grep lv
>> root      3573  2593  0 01:05 ?        00:00:00 /bin/bash
>> /etc/rc3.d/S24clvmd start
>> root      3578     1  0 01:05 ?        00:00:00 clvmd -T30
>> root      3620     1  0 01:05 ?        00:00:00 /sbin/lvm pvscan
>> --cache --major 253 --minor 13
>> root      3804  3322  0 01:09 pts/0    00:00:00 grep lv
>>
>> # ps -ef|grep vg
>> root      3601  3573  0 01:05 ?        00:00:00 /sbin/vgchange -ayl
>> root      3808  3322  0 01:09 pts/0    00:00:00 grep vg
>>
>> # ps -ef|grep lv
>> root      3573  2593  0 01:05 ?        00:00:00 /bin/bash
>> /etc/rc3.d/S24clvmd start
>> root      3578     1  0 01:05 ?        00:00:00 clvmd -T30
>> root      4008  3322  0 01:13 pts/0    00:00:00 grep lv
>>
>> # ps -ef|grep 3578
>> root      3578     1  0 01:05 ?        00:00:00 clvmd -T30
>> root      4017  3322  0 01:13 pts/0    00:00:00 grep 3578
>>
>> It remains at
>> # service clvmd start
>> Starting clvmd:
>> Activating VG(s):   3 logical volume(s) in volume group "VG_VIRT02" now
>> active
>>
>> Is there any way to debug clvmd?
>> I suppose it communicates over the intracluster network, correct?
>> Would tcpdump output be of any help?
>>
>> Has anyone already moved to 6.3 (on RHEL and/or CentOS) with
>> everything working fine with clvmd?
>>
>> BTW: I also tried lvmetad, which is a tech preview in 6.3, enabling its
>> service and setting "use_lvmetad = 1" in lvm.conf, but without luck...
>>
>> Thanks in advance
>>
>
>
>
> --
> this is my life and I live it for as long as God wills
>
>