[Linux-cluster] clvmd problems with centos 6.3 or normal clvmd behaviour?
emmanuel segura
emi2fast at gmail.com
Wed Aug 1 14:26:38 UTC 2012
Hello Gianluca,
why don't you remove expected_votes="3" and let the cluster calculate
the expected votes automatically?
I suggest this because I have had many problems with that setting.
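For example (just a sketch with made-up node names and vote counts; your real cluster.conf will differ), cman can derive expected_votes from the per-node and quorum-disk votes instead of having it pinned:

```xml
<!-- Sketch only: drop expected_votes and let cman compute it from the
     votes declared below (node names and vote values are examples) -->
<cman quorum_dev_poll="240000" two_node="0"/>
<clusternodes>
  <clusternode name="nodeA" nodeid="1" votes="1"/>
  <clusternode name="nodeB" nodeid="2" votes="1"/>
  <clusternode name="nodeC" nodeid="3" votes="1"/>
</clusternodes>
<quorumd label="qdisk" votes="2"/>
```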
2012/8/1 Gianluca Cecchi <gianluca.cecchi at gmail.com>
> Hello,
> testing a three node cluster + quorum disk and clvmd.
> I was on CentOS 6.2 and I seem to remember being able to start a
> single node. Correct?
> Then I upgraded to CentOS 6.3 and had a working environment.
> My config has
> <cman expected_votes="3" quorum_dev_poll="240000" two_node="0"/>
>
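With expected_votes="3", cman needs quorum = 3/2 + 1 = 2 votes, so a lone node with one vote only becomes quorate once the quorum disk registers. A quick sketch of the arithmetic (the qdisk vote count is an assumption on my part; it is not shown in the posted config, but nodes-1 = 2 is the common choice):

```shell
# Quorum arithmetic sketch -- qdisk_votes=2 is assumed, not taken from
# the poster's full cluster.conf.
expected_votes=3
quorum=$(( expected_votes / 2 + 1 ))   # integer division: 3/2 + 1 = 2
echo "quorum = $quorum"

node_votes=1        # the single surviving node
qdisk_votes=2       # assumed
total=$(( node_votes + qdisk_votes ))
if [ "$total" -ge "$quorum" ]; then
    echo "quorate once qdiskd registers ($total >= $quorum)"
fi
```

This matches the timeline in the logs below: the node waits until qdiskd assumes the master role, and only then does corosync report "quorum regained".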
> At the moment two nodes are in another site that is powered down and I
> need to start a single node config.
>
> When the node starts it gets waiting for quorum and when quorum disk
> becomes master it goes ahead:
>
> # cman_tool nodes
> Node Sts Inc Joined Name
> 0 M 0 2012-08-01 15:41:58 /dev/block/253:4
> 1 X 0 intrarhev1
> 2 X 0 intrarhev2
> 3 M 1420 2012-08-01 15:39:58 intrarhev3
>
> But the process hangs at clvmd startup, in particular at the step
> vgchange -aly
> Pid of "service clvmd start" command is 9335
>
> # pstree -alp 9335
> S24clvmd,9335 /etc/rc3.d/S24clvmd start
> └─vgchange,9363 -ayl
>
>
> # ll /proc/9363/fd/
> total 0
> lrwx------ 1 root root 64 Aug 1 15:44 0 -> /dev/console
> lrwx------ 1 root root 64 Aug 1 15:44 1 -> /dev/console
> lrwx------ 1 root root 64 Aug 1 15:44 2 -> /dev/console
> lrwx------ 1 root root 64 Aug 1 15:44 3 -> /dev/mapper/control
> lrwx------ 1 root root 64 Aug 1 15:44 4 -> socket:[1348167]
> lr-x------ 1 root root 64 Aug 1 15:44 5 -> /dev/dm-3
>
> # lsof -p 9363
> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
> vgchange 9363 root cwd DIR 104,3 4096 2 /
> vgchange 9363 root rtd DIR 104,3 4096 2 /
> vgchange 9363 root txt REG 104,3 971464 132238
> /sbin/lvm
> vgchange 9363 root mem REG 104,3 156872 210
> /lib64/ld-2.12.so
> vgchange 9363 root mem REG 104,3 1918016 569
> /lib64/libc-2.12.so
> vgchange 9363 root mem REG 104,3 22536 593
> /lib64/libdl-2.12.so
> vgchange 9363 root mem REG 104,3 24000 832
> /lib64/libdevmapper-event.so.1.02
> vgchange 9363 root mem REG 104,3 124624 750
> /lib64/libselinux.so.1
> vgchange 9363 root mem REG 104,3 272008 2060
> /lib64/libreadline.so.6.0
> vgchange 9363 root mem REG 104,3 138280 2469
> /lib64/libtinfo.so.5.7
> vgchange 9363 root mem REG 104,3 61648 1694
> /lib64/libudev.so.0.5.1
> vgchange 9363 root mem REG 104,3 251112 1489
> /lib64/libsepol.so.1
> vgchange 9363 root mem REG 104,3 229024 1726
> /lib64/libdevmapper.so.1.02
> vgchange 9363 root mem REG 253,7 99158576 17029
> /usr/lib/locale/locale-archive
> vgchange 9363 root mem REG 253,7 26060 134467
> /usr/lib64/gconv/gconv-modules.cache
> vgchange 9363 root 0u CHR 5,1 0t0 5218
> /dev/console
> vgchange 9363 root 1u CHR 5,1 0t0 5218
> /dev/console
> vgchange 9363 root 2u CHR 5,1 0t0 5218
> /dev/console
> vgchange 9363 root 3u CHR 10,58 0t0 5486
> /dev/mapper/control
> vgchange 9363 root 4u unix 0xffff880879b309c0 0t0 1348167 socket
> vgchange 9363 root 5r BLK 253,3 0t143360 10773
> /dev/dm-3
>
>
> # strace -p 9363
> Process 9363 attached - interrupt to quit
> read(4,
>
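The strace above shows vgchange blocked in read(4, ...), and the fd listing shows fd 4 is socket:[1348167] — almost certainly the LVM command waiting for a reply from the clvmd daemon on its local socket (typically /var/run/lvm/clvmd.sock). A hedged diagnostic sketch, demonstrated on the current shell with a throwaway fd since pid 9363 only exists on the affected node:

```shell
# Sketch: resolve what a process's fd points at, as "ll /proc/9363/fd"
# did in the post. We use our own shell and a stand-in fd here.
pid=$$
exec 5</dev/null                      # stand-in for the hung process's fd
ls -l /proc/$pid/fd                   # list all open fds
target=$(readlink /proc/$pid/fd/5)
echo "fd 5 -> $target"
# On the real node, map the socket inode to its path and owner:
#   grep 1348167 /proc/net/unix       # shows the bound socket path
#   fuser -v /var/run/lvm/clvmd.sock  # which daemon holds it (clvmd)
```

If the peer turns out to be clvmd itself stuck waiting on the DLM, the vgchange will block indefinitely, which fits the symptoms here.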
> multipath seems OK in general, and for dm-3 in particular
> # multipath -l /dev/mapper/mpathd
> mpathd (3600507630efe0b0c0000000000001181) dm-3 IBM,1750500
> size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=0 status=active
> | |- 0:0:0:3 sdd 8:48 active undef running
> | `- 1:0:0:3 sdl 8:176 active undef running
> `-+- policy='round-robin 0' prio=0 status=enabled
> |- 0:0:1:3 sdq 65:0 active undef running
> `- 1:0:1:3 sdy 65:128 active undef running
>
> Currently I have
> lvm2-2.02.95-10.el6.x86_64
> lvm2-cluster-2.02.95-10.el6.x86_64
>
> Startup is stuck as shown in the attached image.
>
> Logs
> messages:
> Aug 1 15:46:14 udevd[663]: worker [9379] unexpectedly returned with
> status 0x0100
> Aug 1 15:46:14 udevd[663]: worker [9379] failed while handling
> '/devices/virtual/block/dm-15'
>
> dmesg
> DLM (built Jul 20 2012 01:56:50) installed
> dlm: Using TCP for communications
>
>
> qdiskd
> Aug 01 15:41:58 qdiskd Score sufficient for master operation (1/1;
> required=1); upgrading
> Aug 01 15:43:03 qdiskd Assuming master role
>
> corosync.log
> Aug 01 15:41:58 corosync [CMAN ] quorum device registered
> Aug 01 15:43:08 corosync [CMAN ] quorum regained, resuming activity
> Aug 01 15:43:08 corosync [QUORUM] This node is within the primary
> component and will provide service.
> Aug 01 15:43:08 corosync [QUORUM] Members[1]: 3
>
> fenced.log
> Aug 01 15:43:09 fenced fenced 3.0.12.1 started
> Aug 01 15:43:09 fenced failed to get dbus connection
>
> dlm_controld.log
> Aug 01 15:43:10 dlm_controld dlm_controld 3.0.12.1 started
>
> gfs_controld.log
> Aug 01 15:43:11 gfs_controld gfs_controld 3.0.12.1 started
>
>
> Am I missing anything simple?
> Is it correct to say that clvmd can start even when only one node is
> active, provided that node has quorum under the configured cluster
> rules?
>
> Or am I hitting a known bug or problem?
>
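On the question itself: yes — with locking_type=3 the LVM commands go through clvmd, clvmd uses the DLM, and the DLM only operates once the node is quorate, so vgchange blocks until quorum is reached. As a last-resort workaround when the rest of the cluster really is powered down, some admins temporarily fall back to local locking (a sketch, not something from your config; only safe while every other node stays down, and it must be reverted afterwards):

```
# /etc/lvm/lvm.conf -- emergency single-node sketch; revert when the
# cluster is whole again
global {
    # locking_type = 3   # clustered locking via clvmd (normal setting)
    locking_type = 1     # local file-based locking; vgchange -ay then
                         # proceeds without quorum, but is unsafe if any
                         # other node activates the same VGs concurrently
}
```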
> Thanks in advance,
> Gianluca
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
--
this is my life and I live it as long as God wills