[Linux-cluster] how to handle fence for a simple apache active/passive cluster with virtual ip on 2 virtual machines

Digimer lists at alteeve.ca
Sat Feb 1 18:43:35 UTC 2014


On 01/02/14 01:35 PM, nik600 wrote:
> Dear all
>
> I need some clarification about clustering with RHEL 6.4.
>
> I have a cluster with 2 nodes in an active/passive configuration; I simply
> want to have a virtual IP and migrate it between the 2 nodes.
>
> I've noticed that if I reboot or manually shut down a node, the failover
> works correctly, but if I power off one node, the cluster doesn't fail
> over to the other node.
>
> Another strange situation is that if I power off all the nodes and then
> switch on only one, the cluster doesn't start on the active node.
>
> I've read the manual and documentation at
>
> https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/index.html
>
> and I've understood that the problem is related to fencing, but my 2
> nodes are 2 virtual machines: I can't control the hardware and can't
> issue any custom command on the host side.
>
> I've tried to use fence_xvm but I'm not sure about it, because if my VM
> is powered off, how can it reply to fence_xvm messages?
>
> Here are my logs when I power off the VM:
>
> ==> /var/log/cluster/fenced.log <==
> Feb 01 18:50:22 fenced fencing node mynode02
> Feb 01 18:50:53 fenced fence mynode02 dev 0.0 agent fence_xvm result:
> error from agent
> Feb 01 18:50:53 fenced fence mynode02 failed
>
> I've tried to force the manual fence with:
>
> fence_ack_manual mynode02
>
> and in this case the failover works properly.
>
> The point is: as I'm not using any shared filesystem but am only serving
> Apache with a virtual IP, I won't have any split-brain scenario, so I
> don't need fencing, right?
>
> So, is it possible to have a simple "dummy" fencing method?
>
> Here is my config.xml:
>
> <?xml version="1.0"?>
> <cluster config_version="20" name="hacluster">
>     <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="0"/>
>     <cman expected_votes="1" two_node="1"/>
>     <clusternodes>
>         <clusternode name="mynode01" nodeid="1" votes="1">
>             <fence>
>                 <method name="mynode01">
>                     <device domain="mynode01" name="mynode01"/>
>                 </method>
>             </fence>
>         </clusternode>
>         <clusternode name="mynode02" nodeid="2" votes="1">
>             <fence>
>                 <method name="mynode02">
>                     <device domain="mynode02" name="mynode02"/>
>                 </method>
>             </fence>
>         </clusternode>
>     </clusternodes>
>     <fencedevices>
>         <fencedevice agent="fence_xvm" name="mynode01"/>
>         <fencedevice agent="fence_xvm" name="mynode02"/>
>     </fencedevices>
>     <rm log_level="7">
>         <failoverdomains>
>             <failoverdomain name="MYSERVICE" nofailback="0" ordered="0" restricted="0">
>                 <failoverdomainnode name="mynode01" priority="1"/>
>                 <failoverdomainnode name="mynode02" priority="2"/>
>             </failoverdomain>
>         </failoverdomains>
>         <resources/>
>         <service autostart="1" exclusive="0" name="MYSERVICE" recovery="relocate">
>             <ip address="192.168.1.239" monitor_link="on" sleeptime="2"/>
>             <apache config_file="conf/httpd.conf" name="apache" server_root="/etc/httpd" shutdown_wait="0"/>
>         </service>
>     </rm>
> </cluster>
>
> Thanks to all in advance.

The fence_virtd/fence_xvm agent works by using multicast to talk to 
fence_virtd on the VM host, so the "off" confirmation comes from the 
hypervisor, not from the target VM itself.

Depending on your setup, you might have better luck with fence_virsh (I 
have to use it myself, as there is a known multicast issue with Fedora 
hosts). Can you try, as a test if nothing else, whether fence_virsh works 
for you?

fence_virsh -a <host ip> -l root -p <host root pw> -n <virsh name for target vm> -o status

If this works, it should be trivial to add to cluster.conf (a rough sketch 
follows below), and you will have a working fence method. However, I would 
recommend switching back to fence_xvm if you can; the fence_virsh agent 
depends on libvirtd running, which some consider a risk.
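For reference, a minimal, untested sketch of how that could look in 
cluster.conf, mapping the flags above to device attributes (-a -> ipaddr, 
-l -> login, -p -> passwd, -n -> port). The method/device name "virsh" and 
the HOST_IP / HOST_ROOT_PASSWORD placeholders are just examples; substitute 
your hypervisor's address and root password:

    <clusternode name="mynode01" nodeid="1" votes="1">
        <fence>
            <method name="virsh">
                <!-- "port" is the virsh domain name of this VM on the host;
                     repeat for mynode02 with port="mynode02" -->
                <device name="virsh" port="mynode01"/>
            </method>
        </fence>
    </clusternode>

    <fencedevices>
        <fencedevice agent="fence_virsh" name="virsh"
                     ipaddr="HOST_IP" login="root" passwd="HOST_ROOT_PASSWORD"/>
    </fencedevices>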

hth

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



