[Linux-cluster] openais issue

Daniela Anzellotti daniela.anzellotti at roma1.infn.it
Tue Oct 6 15:48:10 UTC 2009


Hi Paras,

Yes. At least it looks that way...

We have a cluster of two nodes + a quorum disk (it's not configured as a 
"two-node cluster")

They are running Scientific Linux 5.x, kernel 2.6.18-128.7.1.el5xen and

  openais-0.80.6-8.el5.x86_64
  cman-2.0.115-1.el5.x86_64
  rgmanager-2.0.52-1.el5.x86_64

The XEN VMs access the disk as simple block devices.
Disks are on a SAN, configured with Clustered LVM.
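(A quick way to double-check that the volume group is really clustered, in
case it is useful: the sixth character of the VG attributes is "c" for a
clustered VG. A sketch, using our VG name:)

   # clvmd must be running on every node
   service clvmd status
   # the 6th attribute character should be "c" (clustered)
   vgs -o vg_name,vg_attr vg_cluster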

  xen-3.0.3-94.el5_4.1.x86_64
  xen-libs-3.0.3-94.el5_4.1.x86_64

VM configuration files look like the following:

   name = "www1"
   uuid = "3bd3e910-23c0-97ee-55ab-086260ef1e53"
   memory = 1024
   maxmem = 1024
   vcpus = 1
   bootloader = "/usr/bin/pygrub"
   vfb = [ "type=vnc,vncunused=1,keymap=en-us" ]
   disk = [ "phy:/dev/vg_cluster/www1.disk,xvda,w", \
   "phy:/dev/vg_cluster/www1.swap,xvdb,w" ]
   vif = [ "mac=00:16:3e:da:00:07,bridge=xenbr1" ]
   on_poweroff = "destroy"
   on_reboot = "restart"
   on_crash = "restart"
   extra = "xencons=tty0 console=tty0"


In /etc/cluster/cluster.conf I changed all the VM directives from

  <vm autostart="1" domain="rhcs1_dom" exclusive="0" \
   migrate="live" name="www1" path="/etc/xen" recovery="restart"/>

to

  <vm autostart="1" use_virsh="0" domain="rhcs1_dom" exclusive="0" \
   migrate="live" name="www1" path="/etc/xen" recovery="restart"/>
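(For anyone who wants to push such a change without rebooting: as far as I
know, the usual RHEL 5 procedure is to bump config_version in cluster.conf
and propagate it, roughly as sketched below -- not what we did, since we
simply rebooted:)

   # after editing cluster.conf and increasing config_version
   ccs_tool update /etc/cluster/cluster.conf   # push the new file to the other nodes
   cman_tool version -r <new_version>          # tell cman about the new config version
   clustat                                     # sanity check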


I rebooted the cluster nodes and it started working again...

As I said, I hope I won't have any other bad surprises (I tested a VM 
migration and it is working too), but at least the cluster is working now 
(before, it was not able to start a VM)!
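(For reference, a live migration under rgmanager is requested with clusvcadm; 
a sketch, with the target node name left as a placeholder:)

   clusvcadm -M vm:www1 -m <other-node>   # live-migrate the VM service
   clustat                                # check the new owner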

Ciao
Daniela


Paras pradhan wrote:
> So you mean your cluster is running fine with the CMAN
> cman-2.0.115-1.el5.x86_64 ?
> 
> Which version of openais are you running?
> 
> Thanks
> Paras.
> 
> 
> On Mon, Oct 5, 2009 at 7:19 AM, Daniela Anzellotti
> <daniela.anzellotti at roma1.infn.it> wrote:
>> Hi all,
>>
>> I had a problem similar to Paras's today: yum updated the following RPMs
>> last week, and today (when I had to restart the cluster) the cluster was not
>> able to start vm: services.
>>
>> Oct 02 05:31:05 Updated: openais-0.80.6-8.el5.x86_64
>> Oct 02 05:31:07 Updated: cman-2.0.115-1.el5.x86_64
>> Oct 02 05:31:10 Updated: rgmanager-2.0.52-1.el5.x86_64
>>
>> Oct 03 04:03:12 Updated: xen-libs-3.0.3-94.el5_4.1.x86_64
>> Oct 03 04:03:12 Updated: xen-libs-3.0.3-94.el5_4.1.i386
>> Oct 03 04:03:16 Updated: xen-3.0.3-94.el5_4.1.x86_64
>>
>>
>> So, after checking the vm.sh script, I added the declaration use_virsh="0"
>> to the VM definition in cluster.conf (as suggested by Brem, thanks!) and
>> everything is now working again.
>>
>>
>> BTW, I couldn't tell whether the problem was caused by the new Xen version
>> or the new openais one, so I disabled automatic updates for both.
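>> (In case it is useful: one way to disable automatic updates for just those
>> packages is an exclude line in /etc/yum.conf; a sketch of what it could
>> look like:)
>>
>>   # /etc/yum.conf
>>   [main]
>>   exclude=xen* openais*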
>>
>> I hope I won't have any other bad surprises...
>>
>> Thank you,
>> cheers,
>> Daniela
>>
>>
>> Paras pradhan wrote:
>>> Yes, this is very strange. I don't know what to do now. Maybe re-create
>>> the cluster? But that's not really a good solution.
>>>
>>> Packages :
>>>
>>> Kernel: kernel-xen-2.6.18-164.el5
>>> OS: Fully updated CentOS 5.3, except CMAN downgraded to cman-2.0.98-1.el5
>>>
>>> Other packages related to cluster suite:
>>>
>>> rgmanager-2.0.52-1.el5.centos
>>> cman-2.0.98-1.el5
>>> xen-3.0.3-80.el5_3.3
>>> xen-libs-3.0.3-80.el5_3.3
>>> kmod-gfs-xen-0.1.31-3.el5_3.1
>>> kmod-gfs-xen-0.1.31-3.el5_3.1
>>> kmod-gfs-0.1.31-3.el5_3.1
>>> gfs-utils-0.1.18-1.el5
>>> gfs2-utils-0.1.62-1.el5
>>> lvm2-2.02.40-6.el5
>>> lvm2-cluster-2.02.40-7.el5
>>> openais-0.80.3-22.el5_3.9
>>>
>>> Thanks!
>>> Paras.
>>>
>>>
>>>
>>>
>>> On Wed, Sep 30, 2009 at 10:02 AM, brem belguebli
>>> <brem.belguebli at gmail.com> wrote:
>>>> Hi Paras,
>>>>
>>>> Your cluster.conf file seems correct. If it is not an NTP issue, I
>>>> don't see anything that could cause this except a bug, or some
>>>> prerequisite that is not being met.
>>>>
>>>> Maybe you could post the versions (OS, kernel, packages, etc.) you are
>>>> using; someone may have hit the same issue with your versions.
>>>>
>>>> Brem
>>>>
>>>> 2009/9/30, Paras pradhan <pradhanparas at gmail.com>:
>>>>> All of the nodes are synced with ntp server. So this is not the case
>>>>> with me.
>>>>>
>>>>> Thanks
>>>>> Paras.
>>>>>
>>>>> On Tue, Sep 29, 2009 at 6:29 PM, Johannes Rußek
>>>>> <johannes.russek at io-consulting.net> wrote:
>>>>>> Make sure the time on the nodes is in sync. Apparently, when a node has
>>>>>> too much of an offset, you won't see rgmanager (even though the process
>>>>>> is running). This happened to me today and setting the time fixed it.
>>>>>> AFAICR there was no sign of this in the logs, though.
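>>>>>> (A generic way to check and fix the offset, assuming ntpd and ntpdate
>>>>>> are installed -- the server name is a placeholder:)
>>>>>>
>>>>>>   ntpq -p                    # check the offset against the NTP peers
>>>>>>   service ntpd stop
>>>>>>   ntpdate <your-ntp-server>  # force a one-off sync
>>>>>>   service ntpd start
>>>>>>   service rgmanager restart  # restart rgmanager once the clocks agree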
>>>>>> johannes
>>>>>>
>>>>>> Paras pradhan schrieb:
>>>>>>> I don't see rgmanager.
>>>>>>>
>>>>>>> Here is the output from clustat:
>>>>>>>
>>>>>>> [root at cvtst1 cluster]# clustat
>>>>>>> Cluster Status for test @ Tue Sep 29 15:53:33 2009
>>>>>>> Member Status: Quorate
>>>>>>>
>>>>>>>  Member Name                          ID   Status
>>>>>>>  ------ ----                          ---- ------
>>>>>>>  cvtst2                               1    Online
>>>>>>>  cvtst1                               2    Online, Local
>>>>>>>  cvtst3                               3    Online
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> Paras.
>>>>>>>
>>>>>>> On Tue, Sep 29, 2009 at 3:44 PM, brem belguebli
>>>>>>> <brem.belguebli at gmail.com> wrote:
>>>>>>>
>>>>>>>> It looks correct; rgmanager seems to start on all nodes.
>>>>>>>>
>>>>>>>> What does clustat give you?
>>>>>>>>
>>>>>>>> If rgmanager doesn't show up, check the logs; something may have gone
>>>>>>>> wrong.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2009/9/29 Paras pradhan <pradhanparas at gmail.com>:
>>>>>>>>
>>>>>>>>> I changed it to 7 and got this log:
>>>>>>>>>
>>>>>>>>> Sep 29 15:33:50 cvtst1 rgmanager: [23295]: <notice> Shutting down Cluster Service Manager...
>>>>>>>>> Sep 29 15:33:50 cvtst1 clurgmgrd[22869]: <notice> Shutting down
>>>>>>>>> Sep 29 15:33:50 cvtst1 clurgmgrd[22869]: <notice> Shutting down
>>>>>>>>> Sep 29 15:33:50 cvtst1 clurgmgrd[22869]: <notice> Shutdown complete, exiting
>>>>>>>>> Sep 29 15:33:50 cvtst1 rgmanager: [23295]: <notice> Cluster Service Manager is stopped.
>>>>>>>>> Sep 29 15:33:51 cvtst1 clurgmgrd[23324]: <notice> Resource Group Manager Starting
>>>>>>>>> Sep 29 15:33:51 cvtst1 clurgmgrd[23324]: <info> Loading Service Data
>>>>>>>>> Sep 29 15:33:51 cvtst1 clurgmgrd[23324]: <debug> Loading Resource Rules
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <debug> 21 rules loaded
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <debug> Building Resource Trees
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <debug> 0 resources defined
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <debug> Loading Failover Domains
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <debug> 1 domains defined
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <debug> 1 events defined
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <info> Initializing Services
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <info> Services Initialized
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <debug> Event: Port Opened
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <info> State change: Local UP
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <info> State change: cvtst2 UP
>>>>>>>>> Sep 29 15:33:52 cvtst1 clurgmgrd[23324]: <info> State change: cvtst3 UP
>>>>>>>>> Sep 29 15:33:57 cvtst1 clurgmgrd[23324]: <debug> Event (1:2:1) Processed
>>>>>>>>> Sep 29 15:33:57 cvtst1 clurgmgrd[23324]: <debug> Event (0:1:1) Processed
>>>>>>>>> Sep 29 15:33:57 cvtst1 clurgmgrd[23324]: <debug> Event (0:3:1) Processed
>>>>>>>>> Sep 29 15:34:02 cvtst1 clurgmgrd[23324]: <debug> 3 events processed
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Anything unusual here?
>>>>>>>>>
>>>>>>>>> Paras.
>>>>>>>>>
>>>>>>>>> On Tue, Sep 29, 2009 at 11:51 AM, brem belguebli
>>>>>>>>> <brem.belguebli at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I use log_level=7 to have more debugging info.
>>>>>>>>>>
>>>>>>>>>> It seems 4 is not enough.
>>>>>>>>>>
>>>>>>>>>> Brem
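>>>>>>>>>> (For reference, log_level is an attribute of the <rm> tag in
>>>>>>>>>> cluster.conf, roughly like this -- remember to bump config_version
>>>>>>>>>> when changing it:)
>>>>>>>>>>
>>>>>>>>>>   <rm log_level="7">
>>>>>>>>>>       <!-- failoverdomains, resources, vm definitions ... -->
>>>>>>>>>>   </rm>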
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2009/9/29, Paras pradhan <pradhanparas at gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> With log_level 3 I got only this:
>>>>>>>>>>>
>>>>>>>>>>> Sep 29 10:31:31 cvtst1 rgmanager: [7170]: <notice> Shutting down Cluster Service Manager...
>>>>>>>>>>> Sep 29 10:31:31 cvtst1 clurgmgrd[6673]: <notice> Shutting down
>>>>>>>>>>> Sep 29 10:31:41 cvtst1 clurgmgrd[6673]: <notice> Shutdown complete, exiting
>>>>>>>>>>> Sep 29 10:31:41 cvtst1 rgmanager: [7170]: <notice> Cluster Service Manager is stopped.
>>>>>>>>>>> Sep 29 10:31:42 cvtst1 clurgmgrd[7224]: <notice> Resource Group Manager Starting
>>>>>>>>>>> Sep 29 10:39:06 cvtst1 rgmanager: [10327]: <notice> Shutting down Cluster Service Manager...
>>>>>>>>>>> Sep 29 10:39:16 cvtst1 rgmanager: [10327]: <notice> Cluster Service Manager is stopped.
>>>>>>>>>>> Sep 29 10:39:16 cvtst1 clurgmgrd[10380]: <notice> Resource Group Manager Starting
>>>>>>>>>>> Sep 29 10:39:52 cvtst1 clurgmgrd[10380]: <notice> Member 1 shutting down
>>>>>>>>>>>
>>>>>>>>>>> I do not know what the last line means.
>>>>>>>>>>>
>>>>>>>>>>> rgmanager version I am running is:
>>>>>>>>>>> rgmanager-2.0.52-1.el5.centos
>>>>>>>>>>>
>>>>>>>>>>> I don't know what has gone wrong.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Paras.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 28, 2009 at 6:41 PM, brem belguebli
>>>>>>>>>>> <brem.belguebli at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You mean it stopped successfully on all the nodes, but it is
>>>>>>>>>>>> failing to start only on node cvtst1?
>>>>>>>>>>>>
>>>>>>>>>>>> Look at the following page to make rgmanager more verbose. It'll
>>>>>>>>>>>> help with debugging:
>>>>>>>>>>>>
>>>>>>>>>>>> http://sources.redhat.com/cluster/wiki/RGManager
>>>>>>>>>>>>
>>>>>>>>>>>> (see the Logging Configuration section)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2009/9/29 Paras pradhan <pradhanparas at gmail.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> Brem,
>>>>>>>>>>>>>
>>>>>>>>>>>>> When I try to restart rgmanager on all the nodes, this time I do
>>>>>>>>>>>>> not see rgmanager running on the first node, but I do see it on
>>>>>>>>>>>>> the other two nodes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Log on the first node:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sep 28 18:13:58 cvtst1 clurgmgrd[24099]: <notice> Resource Group Manager Starting
>>>>>>>>>>>>> Sep 28 18:17:29 cvtst1 rgmanager: [24627]: <notice> Shutting down Cluster Service Manager...
>>>>>>>>>>>>> Sep 28 18:17:29 cvtst1 clurgmgrd[24099]: <notice> Shutting down
>>>>>>>>>>>>> Sep 28 18:17:39 cvtst1 clurgmgrd[24099]: <notice> Shutdown complete, exiting
>>>>>>>>>>>>> Sep 28 18:17:39 cvtst1 rgmanager: [24627]: <notice> Cluster Service Manager is stopped.
>>>>>>>>>>>>> Sep 28 18:17:40 cvtst1 clurgmgrd[24679]: <notice> Resource Group Manager Starting
>>>>>>>>>>>>>
>>>>>>>>>>>>> -
>>>>>>>>>>>>> It seems the service is running, but I do not see rgmanager
>>>>>>>>>>>>> running in clustat.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don't know what is going on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Paras.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Sep 28, 2009 at 5:46 PM, brem belguebli
>>>>>>>>>>>>> <brem.belguebli at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Paras,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another thing: it would have been more interesting to have a
>>>>>>>>>>>>>> DEBUG trace of a start, not of a stop.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's why I was asking you to first stop the VM manually on all
>>>>>>>>>>>>>> your nodes, then also stop rgmanager on all the nodes to clear
>>>>>>>>>>>>>> any potential wrong states you may have, and restart rgmanager.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If your VM is configured to autostart, this will make it start.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It should normally fail (as it does now). Send out your newly
>>>>>>>>>>>>>> created DEBUG file.
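>>>>>>>>>>>>>> (Roughly, assuming the stock init scripts, something like this
>>>>>>>>>>>>>> on every node -- guest1 being your VM:)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   xm shutdown guest1        # or "xm destroy guest1" if it hangs
>>>>>>>>>>>>>>   xm list                   # confirm guest1 is gone
>>>>>>>>>>>>>>   service rgmanager stop    # do this on all nodes ...
>>>>>>>>>>>>>>   service rgmanager start   # ... then start it again; autostart will retry the VM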
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2009/9/29 brem belguebli <brem.belguebli at gmail.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Paras,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I don't know the Xen/cluster combination well, but if I remember
>>>>>>>>>>>>>>> correctly, I think I've read somewhere that when using Xen you
>>>>>>>>>>>>>>> have to declare the use_virsh=0 key in the VM definition in
>>>>>>>>>>>>>>> cluster.conf.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This would make rgmanager use xm commands instead of virsh.
>>>>>>>>>>>>>>> The DEBUG output clearly shows that you are using virsh to
>>>>>>>>>>>>>>> manage your VM instead of xm commands.
>>>>>>>>>>>>>>> Check out the RH docs about virtualization.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm not 100% sure about that; I may be completely wrong.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Brem
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2009/9/28 Paras pradhan <pradhanparas at gmail.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The only thing I noticed, after stopping the VM using xm on all
>>>>>>>>>>>>>>>> nodes and starting it using clusvcadm, is the message
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "Virtual machine guest1 is blocked"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The whole DEBUG file is attached.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Paras.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Sep 25, 2009 at 5:53 PM, brem belguebli
>>>>>>>>>>>>>>>> <brem.belguebli at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> There's a problem with the script that rgmanager calls to
>>>>>>>>>>>>>>>>> start the VM; I don't know what causes it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Maybe you should try something like this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1) stop the VM on all nodes with xm commands
>>>>>>>>>>>>>>>>> 2) edit the /usr/share/cluster/vm.sh script and add the
>>>>>>>>>>>>>>>>>    following lines (after the #!/bin/bash):
>>>>>>>>>>>>>>>>>  exec >/tmp/DEBUG 2>&1
>>>>>>>>>>>>>>>>>  set -x
>>>>>>>>>>>>>>>>> 3) start the VM with clusvcadm -e vm:guest1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It should fail as it did before.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> edit the /tmp/DEBUG file and you will be able to see where it
>>>>>>>>>>>>>>>>> fails (it may generate a lot of debug output)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 4) remove the debug lines from /usr/share/cluster/vm.sh
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Post the DEBUG file if you're not able to see where it fails.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Brem
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2009/9/26 Paras pradhan <pradhanparas at gmail.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> No, I am not starting it manually and I am not using
>>>>>>>>>>>>>>>>>> automatic init scripts.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I started the vm using: clusvcadm -e vm:guest1
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have just stopped it using clusvcadm -s vm:guest1. For a
>>>>>>>>>>>>>>>>>> few seconds it says guest1 is started, but after a while I
>>>>>>>>>>>>>>>>>> can see guest1 on all three nodes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> clustat says:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>  Service Name          Owner (Last)          State
>>>>>>>>>>>>>>>>>>  ------- ----          ----- ------          -----
>>>>>>>>>>>>>>>>>>  vm:guest1             (none)                stopped
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> But I can see the vm from xm li.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is what I can see from the log:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sep 25 17:19:01 cvtst1 clurgmgrd[4298]: <notice> start on vm "guest1" returned 1 (generic error)
>>>>>>>>>>>>>>>>>> Sep 25 17:19:01 cvtst1 clurgmgrd[4298]: <warning> #68: Failed to start vm:guest1; return value: 1
>>>>>>>>>>>>>>>>>> Sep 25 17:19:01 cvtst1 clurgmgrd[4298]: <notice> Stopping service vm:guest1
>>>>>>>>>>>>>>>>>> Sep 25 17:19:02 cvtst1 clurgmgrd[4298]: <notice> Service vm:guest1 is recovering
>>>>>>>>>>>>>>>>>> Sep 25 17:19:15 cvtst1 clurgmgrd[4298]: <notice> Recovering failed service vm:guest1
>>>>>>>>>>>>>>>>>> Sep 25 17:19:16 cvtst1 clurgmgrd[4298]: <notice> start on vm "guest1" returned 1 (generic error)
>>>>>>>>>>>>>>>>>> Sep 25 17:19:16 cvtst1 clurgmgrd[4298]: <warning> #68: Failed to start vm:guest1; return value: 1
>>>>>>>>>>>>>>>>>> Sep 25 17:19:16 cvtst1 clurgmgrd[4298]: <notice> Stopping service vm:guest1
>>>>>>>>>>>>>>>>>> Sep 25 17:19:17 cvtst1 clurgmgrd[4298]: <notice> Service vm:guest1 is recovering
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Paras.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Sep 25, 2009 at 5:07 PM, brem belguebli
>>>>>>>>>>>>>>>>>> <brem.belguebli at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Have you started your VM via rgmanager (clusvcadm -e
>>>>>>>>>>>>>>>>>>> vm:guest1), or using xm commands outside of cluster control
>>>>>>>>>>>>>>>>>>> (or maybe through an automatic init script)?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> When clustered, you should never start services (manually or
>>>>>>>>>>>>>>>>>>> through an automatic init script) outside of cluster control.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The thing to do would be to stop your VM on all the nodes
>>>>>>>>>>>>>>>>>>> with the appropriate xm command (I'm not using Xen myself)
>>>>>>>>>>>>>>>>>>> and try to start it with clusvcadm.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Then see if it is started on all nodes (send the clustat
>>>>>>>>>>>>>>>>>>> output).
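>>>>>>>>>>>>>>>>>>> (A sketch of that sequence with your names; the -m flag just
>>>>>>>>>>>>>>>>>>> asks rgmanager to start the service on a specific member:)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>   xm shutdown guest1                 # on every node; xm destroy if it hangs
>>>>>>>>>>>>>>>>>>>   clusvcadm -e vm:guest1 -m cvtst1   # then start it through rgmanager
>>>>>>>>>>>>>>>>>>>   clustat                            # exactly one owner should be listed
>>>>>>>>>>>>>>>>>>>   xm list                            # on each node; guest1 should show on one only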
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2009/9/25 Paras pradhan <pradhanparas at gmail.com>:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> OK, please see below. My VM is running on all nodes even
>>>>>>>>>>>>>>>>>>>> though clustat says it is stopped.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> [root at cvtst1 ~]# clustat
>>>>>>>>>>>>>>>>>>>> Cluster Status for test @ Fri Sep 25 16:52:34 2009
>>>>>>>>>>>>>>>>>>>> Member Status: Quorate
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>  Member Name              ID   Status
>>>>>>>>>>>>>>>>>>>>  ------ ----              ---- ------
>>>>>>>>>>>>>>>>>>>>  cvtst2                   1    Online, rgmanager
>>>>>>>>>>>>>>>>>>>>  cvtst1                   2    Online, Local, rgmanager
>>>>>>>>>>>>>>>>>>>>  cvtst3                   3    Online, rgmanager
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>  Service Name             Owner (Last)             State
>>>>>>>>>>>>>>>>>>>>  ------- ----             ----- ------             -----
>>>>>>>>>>>>>>>>>>>>  vm:guest1                (none)                   stopped
>>>>>>>>>>>>>>>>>>>> [root at cvtst1 ~]#
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>> o/p of xm li on cvtst1
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> [root at cvtst1 ~]# xm li
>>>>>>>>>>>>>>>>>>>> Name                ID  Mem(MiB)  VCPUs  State   Time(s)
>>>>>>>>>>>>>>>>>>>> Domain-0             0      3470      2  r-----  28939.4
>>>>>>>>>>>>>>>>>>>> guest1               7       511      1  -b----   7727.8
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> o/p of xm li on cvtst2
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> [root at cvtst2 ~]# xm li
>>>>>>>>>>>>>>>>>>>> Name                ID  Mem(MiB)  VCPUs  State   Time(s)
>>>>>>>>>>>>>>>>>>>> Domain-0             0      3470      2  r-----  31558.9
>>>>>>>>>>>>>>>>>>>> guest1              21       511      1  -b----   7558.2
>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>> Paras.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Sep 25, 2009 at 4:22 PM, brem belguebli
>>>>>>>>>>>>>>>>>>>> <brem.belguebli at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> It looks like it is not.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can you send the output of clustat from when the VM is
>>>>>>>>>>>>>>>>>>>>> running on multiple nodes at the same time?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> And, by the way, another one after having stopped it
>>>>>>>>>>>>>>>>>>>>> (clusvcadm -s vm:guest1)?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 2009/9/25 Paras pradhan <pradhanparas at gmail.com>:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Is anyone else having an issue like mine? The virtual
>>>>>>>>>>>>>>>>>>>>>> machine service is not being handled properly by the
>>>>>>>>>>>>>>>>>>>>>> cluster.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>> Paras.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 21, 2009 at 9:55 AM, Paras pradhan
>>>>>>>>>>>>>>>>>>>>>> <pradhanparas at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Ok.. here is my cluster.conf file
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>> [root at cvtst1 cluster]# more cluster.conf
>>>>>>>>>>>>>>>>>>>>>>> <?xml version="1.0"?>
>>>>>>>>>>>>>>>>>>>>>>> <cluster alias="test" config_version="9" name="test">
>>>>>>>>>>>>>>>>>>>>>>>     <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>>>>>>>>>>>>>>>>>>>>>>>     <clusternodes>
>>>>>>>>>>>>>>>>>>>>>>>         <clusternode name="cvtst2" nodeid="1" votes="1">
>>>>>>>>>>>>>>>>>>>>>>>             <fence/>
>>>>>>>>>>>>>>>>>>>>>>>         </clusternode>
>>>>>>>>>>>>>>>>>>>>>>>         <clusternode name="cvtst1" nodeid="2" votes="1">
>>>>>>>>>>>>>>>>>>>>>>>             <fence/>
>>>>>>>>>>>>>>>>>>>>>>>         </clusternode>
>>>>>>>>>>>>>>>>>>>>>>>         <clusternode name="cvtst3" nodeid="3" votes="1">
>>>>>>>>>>>>>>>>>>>>>>>             <fence/>
>>>>>>>>>>>>>>>>>>>>>>>         </clusternode>
>>>>>>>>>>>>>>>>>>>>>>>     </clusternodes>
>>>>>>>>>>>>>>>>>>>>>>>     <cman/>
>>>>>>>>>>>>>>>>>>>>>>>     <fencedevices/>
>>>>>>>>>>>>>>>>>>>>>>>     <rm>
>>>>>>>>>>>>>>>>>>>>>>>         <failoverdomains>
>>>>>>>>>>>>>>>>>>>>>>>             <failoverdomain name="myfd1" nofailback="0" ordered="1" restricted="0">
>>>>>>>>>>>>>>>>>>>>>>>                 <failoverdomainnode name="cvtst2" priority="3"/>
>>>>>>>>>>>>>>>>>>>>>>>                 <failoverdomainnode name="cvtst1" priority="1"/>
>>>>>>>>>>>>>>>>>>>>>>>                 <failoverdomainnode name="cvtst3" priority="2"/>
>>>>>>>>>>>>>>>>>>>>>>>             </failoverdomain>
>>>>>>>>>>>>>>>>>>>>>>>         </failoverdomains>
>>>>>>>>>>>>>>>>>>>>>>>         <resources/>
>>>>>>>>>>>>>>>>>>>>>>>         <vm autostart="1" domain="myfd1" exclusive="0" max_restarts="0"
>>>>>>>>>>>>>>>>>>>>>>>             name="guest1" path="/vms" recovery="restart" restart_expire_time="0"/>
>>>>>>>>>>>>>>>>>>>>>>>     </rm>
>>>>>>>>>>>>>>>>>>>>>>> </cluster>
>>>>>>>>>>>>>>>>>>>>>>> [root at cvtst1 cluster]#
>>>>>>>>>>>>>>>>>>>>>>> ------
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>>>> Paras.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sun, Sep 20, 2009 at 9:44 AM, Volker Dormeyer
>>>>>>>>>>>>>>>>>>>>>>> <volker at ixolution.de> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Sep 18, 2009 at 05:08:57PM -0500,
>>>>>>>>>>>>>>>>>>>>>>>> Paras pradhan <pradhanparas at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I am using Cluster Suite for HA of Xen virtual
>>>>>>>>>>>>>>>>>>>>>>>>> machines. Now I am having another problem: when I
>>>>>>>>>>>>>>>>>>>>>>>>> start my Xen VM on one node, it also starts on the
>>>>>>>>>>>>>>>>>>>>>>>>> other nodes. Which daemon controls this?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> This is usually done by clurgmgrd (which is part of the
>>>>>>>>>>>>>>>>>>>>>>>> rgmanager package). To me, this sounds like a
>>>>>>>>>>>>>>>>>>>>>>>> configuration problem. Maybe you can post your
>>>>>>>>>>>>>>>>>>>>>>>> cluster.conf?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>> Volker
>>>>>>>>>>>>>>>>>>>>>>>>
>> --
>> - Daniela Anzellotti ------------------------------------
>>  INFN Roma - tel.: +39.06.49914282 - fax: +39.06.490354
>>  e-mail: daniela.anzellotti at roma1.infn.it
>> ---------------------------------------------------------
>>
> 

-- 
- Daniela Anzellotti ------------------------------------
  INFN Roma - tel.: +39.06.49914282 - fax: +39.06.490354
  e-mail: daniela.anzellotti at roma1.infn.it
---------------------------------------------------------



