[Linux-cluster] XEN and Cluster Questions

Marc Grimme grimme at atix.de
Fri Oct 19 06:42:02 UTC 2007


On Thursday 18 October 2007 22:00:33 Lon Hohberger wrote:
> On Wed, 2007-10-17 at 11:41 +0200, Marc Grimme wrote:
> > Hello,
> > we are currently discussing Xen with clustering support.
> > Some questions came up that we are not sure how to answer. Perhaps you
> > can help ;-) .
> >
> > Background: We are discussing a group of Xen Dom0 hosts sharing all
> > devices and files via GFS. These hosts in turn run a number of DomU
> > guests that are themselves Red Hat-clustered, with or without GFS.
> >
> > 1. Live Migration of cluster DomU nodes:
> > When I live-migrate a virtual DomU cluster node to another Dom0 Xen host,
> > the migration works ;-) , but the virtual cluster node is thrown out of
> > the cluster. Is this "works as designed"? I think the problem is that the
> > heartbeats do not arrive in time.
> > Does that lead to the conclusion that one cannot live-migrate cluster
> > nodes?
>
> Depends.  If you're using rgmanager to do migration, the migration is
> actually not live.  In order to do live migration,
> change /usr/share/cluster/vm.sh...
>
>   - where it says 'xm migrate ...'
>   - change it to 'xm migrate -l ...'
OK, got it.
Still, did you try to live-migrate a cluster node?
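For reference, I read the change as something like this (a sketch; the exact
line and variable names in /usr/share/cluster/vm.sh are my guess and may
differ per release):

# in the migrate path of /usr/share/cluster/vm.sh:
#   before:  xm migrate "$OCF_RESKEY_name" "$target"
#   after:   xm migrate -l "$OCF_RESKEY_name" "$target"
# the same thing done by hand, outside rgmanager:
xm migrate -l axqa03_1 axqa01_2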
>
> That should enable live migration.
>
> > 2. Fencing:
> > How about fencing of the virtual DomU cluster nodes? You are never sure
> > which Dom0 host a given DomU cluster node is currently running on. Is
> > fencing via fence_xvm[d] supported in such an environment? In other
> > words: how does a virtual DomU cluster node X, running on Dom0 Xen host
> > x, know which Dom0 host is currently running DomU cluster node Y when it
> > receives the request to fence Y?
>
> Yes.  Fence_xvmd is designed (specifically) to handle the case where the
> dom0 hosting a particular domU is not known.  Note that this only works
> on RHEL5 with openais and such; fence_xvmd uses AIS checkpoints to store
> virtual machine locations.
>
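Just so I have the intended flow straight, I assume it is roughly this (a
sketch only; I am using the key path from your note below and the hostnames
from my test setup):

# on every dom0 of the host cluster (same key within one dom0 cluster):
fence_xvmd -f -k /etc/xen/fence_xvm.key
# on any domU cluster node: only the Xen domain name is passed, and
# fence_xvmd resolves via its AIS checkpoint which dom0 currently hosts it:
fence_xvm -k /etc/xen/fence_xvm.key -H axqa03_2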
> Notes:
> * the parent dom0 cluster still needs fencing, too :)
Yes. That's in place. Check.
> * do not mix domU and dom0 in the same cluster,
I didn't. Check.
> * all domUs within a dom0 cluster must have different domain names,
Oops. Do hostname -d on dom0 and hostname -d on the domU need to be different?
What if they are empty?
Or do you mean some other domain name?
Dom0: 
[root at axqa01_2 ~]# hostname -d
[root at axqa01_2 ~]#
DomU:
[root at axqa03_1 ~]# hostname -d
cc.atix

> * do *not* reuse /etc/xen/fence_xvm.key between multiple dom0 clusters
I just did not use it.
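For reference, if I do use a key later, I assume one distinct key per dom0
cluster would be generated and distributed roughly like this (the dd
parameters are my assumption):

# one key per dom0 cluster, e.g. generated on axqa01_1:
dd if=/dev/urandom of=/etc/xen/fence_xvm.key bs=4096 count=1
# copy the same key only to the dom0s and domUs belonging to this cluster:
scp /etc/xen/fence_xvm.key axqa01_2:/etc/xen/fence_xvm.key
scp /etc/xen/fence_xvm.key axqa03_1:/etc/xen/fence_xvm.key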
Dom0:
[root at axqa01_2 ~]# ps ax | grep [f]ence_xvmd
 1932 pts/1    S+     0:00 fence_xvmd -ddddd -f -c none -C none
So axqa03_2 runs on axqa01_2, and axqa03_1 runs on axqa01_1.
Then, when I run
./fence_xvm -ddddd -C none -c none -H axqa03_2
on axqa03_1, I get the following:
Waiting for response
Received 264 bytes
Adding IP 127.0.0.1 to list (family 2)
Adding IP 10.1.2.1 to list (family 2)
Adding IP 192.168.10.40 to list (family 2)
Adding IP 192.168.122.1 to list (family 2)
Closing Netlink connection
ipv4_listen: Setting up ipv4 listen socket
ipv4_listen: Success; fd = 3
Setting up ipv4 multicast send (225.0.0.12:1229)
Joining IP Multicast group (pass 1)
Joining IP Multicast group (pass 2)
Setting TTL to 2 for fd4
ipv4_send_sk: success, fd = 4
sign_request: no-op (HASH_NONE)
Sending to 225.0.0.12 via 127.0.0.1
Setting up ipv4 multicast send (225.0.0.12:1229)
Joining IP Multicast group (pass 1)
Joining IP Multicast group (pass 2)
Setting TTL to 2 for fd4
ipv4_send_sk: success, fd = 4
sign_request: no-op (HASH_NONE)
Sending to 225.0.0.12 via 10.1.2.1
Setting up ipv4 multicast send (225.0.0.12:1229)
Joining IP Multicast group (pass 1)
Joining IP Multicast group (pass 2)
Setting TTL to 2 for fd4
ipv4_send_sk: success, fd = 4
sign_request: no-op (HASH_NONE)
Sending to 225.0.0.12 via 192.168.10.40
Setting up ipv4 multicast send (225.0.0.12:1229)
Joining IP Multicast group (pass 1)
Joining IP Multicast group (pass 2)
Setting TTL to 2 for fd4
ipv4_send_sk: success, fd = 4
sign_request: no-op (HASH_NONE)
Sending to 225.0.0.12 via 192.168.122.1
Waiting for connection from XVM host daemon.
Issuing TCP challenge
tcp_challenge: no-op (AUTH_NONE)
Responding to TCP challenge
tcp_response: no-op (AUTH_NONE)
TCP Exchange + Authentication done...
Waiting for return value from XVM host
Remote: Operation failed

on axqa01_2:
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
axqa03_2                 cb165cce-1798-daf9-1252-12a2347a9fc7 00002 00002
Domain-0                 00000000-0000-0000-0000-000000000000 00002 00001
Storing axqa03_2
libvir: Xen Daemon error : GET operation failed:
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
axqa03_2                 cb165cce-1798-daf9-1252-12a2347a9fc7 00002 00002
Domain-0                 00000000-0000-0000-0000-000000000000 00002 00001
Storing axqa03_2
Request to fence: axqa03_2
axqa03_2 is running locally
Plain TCP request
libvir: Xen Daemon error : GET operation failed:
libvir: error : invalid argument in __virGetDomain
libvir: Xen Store error : out of memory
tcp_response: no-op (AUTH_NONE)
tcp_challenge: no-op (AUTH_NONE)
Rebooting domain axqa03_2...
[[ XML Domain Info ]]
<domain type='xen'>
  <name>axqa03_2</name>
  <uuid>1732aae45a110676113df9e7da458b61</uuid>
  <os>
    <type>linux</type>
    <kernel>/var/lib/xen/boot/vmlinuz-2.6.18-52.el5xen</kernel>
    <initrd>/var/lib/xen/boot/initrd_sr-2.6.18-52.el5xen.img</initrd>
  </os>
  <currentMemory>366592</currentMemory>
  <memory>366592</memory>
  <vcpu>2</vcpu>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <disk type='block' device='disk'>
      <driver name='phy'/>
      <source dev='sds'/>
      <target dev='sds'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='file'/>
      <source file='/var/lib/xen/images/axqa03_2.localdisk.dd'/>
      <target dev='sda'/>
    </disk>
    <interface type='bridge'>
      <mac address='aa:00:00:00:00:12'/>
      <source bridge='xenbr0'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:43:90:d2'/>
      <source bridge='xenbr1'/>
    </interface>
    <console/>
  </devices>
</domain>

[[ XML END ]]
Virtual machine is Linux
Unlinkiking os block
[[ XML Domain Info (modified) ]]
<?xml version="1.0"?>
<domain type="xen">
  <name>axqa03_2</name>
  <uuid>1732aae45a110676113df9e7da458b61</uuid>
  <currentMemory>366592</currentMemory>
  <memory>366592</memory>
  <vcpu>2</vcpu>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <disk type="block" device="disk">
      <driver name="phy"/>
      <source dev="sds"/>
      <target dev="sds"/>
    </disk>
    <disk type="file" device="disk">
      <driver name="file"/>
      <source file="/var/lib/xen/images/axqa03_2.localdisk.dd"/>
      <target dev="sda"/>
    </disk>
    <interface type="bridge">
      <mac address="aa:00:00:00:00:12"/>
      <source bridge="xenbr0"/>
    </interface>
    <interface type="bridge">
      <mac address="00:16:3e:43:90:d2"/>
      <source bridge="xenbr1"/>
    </interface>
    <console/>
  </devices>
</domain>

[[ XML END ]]
[REBOOT] Calling virDomainDestroy
virDomainDestroy() failed: -1
Sending response to caller...

libvir: Xen Daemon error : GET operation failed:
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
axqa03_2                 cb165cce-1798-daf9-1252-12a2347a9fc7 00002 00002
Domain-0                 00000000-0000-0000-0000-000000000000 00002 00001
Storing axqa03_2

on axqa01_1:

Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
axqa03_1                 8f89affa-4330-d281-9622-98665e4816c2 00001 00002
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
Storing axqa03_1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
axqa03_1                 8f89affa-4330-d281-9622-98665e4816c2 00001 00002
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
Storing axqa03_1
Request to fence: axqa03_2
Evaluating Domain: axqa03_2   Last Owner: 2   State 2
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
axqa03_1                 8f89affa-4330-d281-9622-98665e4816c2 00001 00002
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
Storing axqa03_1
Domain                   UUID                                 Owner State

Any ideas?

Marc.

>
> -- Lon
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-- 
Gruss / Regards,

Marc Grimme
http://www.atix.de/               http://www.open-sharedroot.org/
