From amjadcsu at gmail.com  Tue Jul  1 08:27:26 2014
From: amjadcsu at gmail.com (Amjad Syed)
Date: Tue, 1 Jul 2014 11:27:26 +0300
Subject: [Linux-cluster] Virtual IP service
Message-ID: <CAJWdRQj+ifO9hvO+d6vKmZnj9Kn+cVCS+6wXGWWCECxR3n3sXw@mail.gmail.com>

Hello

I am trying to start a virtual ip service on my 2 node cluster.

Here are the details of the network setting and configuration.

1 : Bond(heartbeat) . This is a private network with no switch involved.
Not available to public
  node1: 192.168.10.11
  node2  : 192.168.10.10

2. Fencing (Ilo) .This one goes through a switch
   node 1: 10.10.63.92
   node2 :  10.10.63.93

3) Public ip addresses
   10.10.5.100 : node1
   10.10.5.20    node2 .

I  have set Virtual IP  as  10.10.5.23 in cluster.conf
  <service autostart="1" exclusive="0" name="IP" recovery="relocate">
                <ip address="10.10.5.23" monitor_link="on" sleeptime="10"/>

However, this Virtual IP does not work since the cman communication is on
192 network. When i try to set cman to 10.10.5.X network, the nodes go into
fence loop, i,e they fence each other

So i am asking, is there a "network-preference option" etc in cluster.conf
that can map virtual IP to private network addresses.

Thank you
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20140701/a0a2e131/attachment.htm>

From mgrac at redhat.com  Tue Jul  1 11:46:01 2014
From: mgrac at redhat.com (Marek Grac)
Date: Tue, 01 Jul 2014 13:46:01 +0200
Subject: [Linux-cluster] fence-agents-4.0.10 release
Message-ID: <53B29F79.9020902@redhat.com>

Welcome to the fence-agents 4.0.10 release.

This release includes new fence agent for Docker (thanks to Ondrej 
Mular) and several bugfixes:

* fence_scsi is reimplemented on top of fencing library
* fence_zvm support distributed z/VM systems
* support for --delay was added to fence_zvm
* unmaintained fence agents were removed:
     * fence_baytech, fence_bullpap, fence_cpint, fence_mcdata,
     * fence_rackswitch, fence_vixel, fence_xcat
     * we do not plan to remove other agents
* update fence_rsb to work with new firmware

The new source tarball can be downloaded here:

https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-4.0.10.tar.xz 


To report bugs or issues:

https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

    Join us on IRC (irc.freenode.net #linux-cluster) and share your
    experience  with other sysadministrators or power users.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

m,



From ekuric at redhat.com  Fri Jul  4 09:20:25 2014
From: ekuric at redhat.com (Elvir Kuric)
Date: Fri, 04 Jul 2014 11:20:25 +0200
Subject: [Linux-cluster] Error in Cluster.conf
In-Reply-To: <CAJWdRQgVWXvVpycXtbzOi58ytrkRfFx4mp727pTO03g5Mv9i4A@mail.gmail.com>
References: <CAJWdRQi0NTuvB8+L=X9qCLOkNmOG=DTfbbVLdsY7HSFs1qP+PA@mail.gmail.com>	<53A96784.3030009@redhat.com>
	<20140624125500.GA1425@redhat.com>	<53A99D5E.2080508@alteeve.ca>
	<CAJWdRQgVWXvVpycXtbzOi58ytrkRfFx4mp727pTO03g5Mv9i4A@mail.gmail.com>
Message-ID: <53B671D9.9020501@redhat.com>

On 06/24/2014 08:44 PM, Amjad Syed wrote:
> I have updated the config file ,  validated by ccs_config_validate
>
> Added the fence_daemon and post_join_delay. I am using bonding using
> ethernet coaxial cable.
ok, check bonding modes supported ( depends what OS used - different for
rhel ( Centos ) 5 / RHEL ( CentoOS ) 6
> But for some reason whenever i start CMAN on node, it fences (kicks
> the other node). As a result at a time only one node is online . Do i
> need to use multicast to get both nodes online at same instance ?.
it would be good to see logs at from surviving node before it decide to
fence its peer. That said, boot machines at same time and see what is
happening in logs, there will be reason ( on surviving node logs ) why
it thinks that its peer is not in good state so it needs to be fenced.

mutlicast is used by default and that traffic needs to be allowed in
cluster network. You can rule out issue with muticast if you for test
purposes in cluster.conf change
        <cman expected_votes="1" two_node="1"/>
to
        <cman expected_votes="1" broadcast="yes" two_node="1"/>

if issue is not visible with broadcast="yes" then you can say that
multicast could be  issue  ( and then you can work to fix that ). If you
have RHEL 6 / CentOS you can also try with unicast udp ( udpu, by adding
transport="udpu" in above cman stanza , more in doc :
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s1-unicast-traffic-CA.html
)

Also you must ensure that fencing is working properly, I recommend to
take time and to read :
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Fence_Configuration_Guide/index.html

If still there is issue after all your tests with cluster and if you
have valid Red Hat subscription ( with proper support level : Standard
or Premium ) then you can visit Red Hat Customer portal
https://access.redhat.com and open case with Red Hat Support where we
can work to fix issue.

Kind regards,

Elvir Kuric

> or i am missing something here ?
>
> Now the file looks like this :
>
>
> ?xml version="1.0"?>
> <cluster config_version="2" name="oracleha">
>         <cman expected_votes="1" two_node="1"/>
>         <fencedevices>
>            <fencedevice agent= "fence_ipmilan" ipaddr="10.10.63.93"
> login="ADMIN" name="inspuripmi"  passwd="xxxx"/>
>            <fencedevice agent = "fence_ilo2" ipaddr="10.10.63.92"
> login="test" name="hpipmi"  passwd="xxxx"/>
>           </fencedevices>
>           <fence_daemon post_fail_delay="0" post_join_delay="60"/>
>         <clusternodes>
>            <clusternode name= "krplporcl001"  nodeid="1" votes= "1">
>            <fence>
>                <method name  = "1">
>                  <device lanplus = "" name="inspuripmi"  action
> ="reboot"/>
>                  </method>
>             </fence>
>            </clusternode>
>             <clusternode name = "krplporcl002" nodeid="2" votes ="1">
>                  <fence>
>                  <method name = "1">
>                   <device lanplus = "" name="hpipmi" action ="reboot"/>
>                    </method>
>               </fence>
>             </clusternode>
>          </clusternodes>
>
>
>         <rm>
>
>           <failoverdomains/>
>         <resources/>
>         <service autostart="1" exclusive="0" name="IP"
> recovery="relocate">
>                 <ip address="10.10.5.23" monitor_link="on"
> sleeptime="10"/>
>         </service>
> </rm>
> </cluster>
>
> Thanks
>
>
> On Tue, Jun 24, 2014 at 6:46 PM, Digimer <lists at alteeve.ca
> <mailto:lists at alteeve.ca>> wrote:
>
>     On 24/06/14 08:55 AM, Jan Pokorn? wrote:
>
>         On 24/06/14 13:56 +0200, Fabio M. Di Nitto wrote:
>
>             On 6/24/2014 12:32 PM, Amjad Syed wrote:
>
>                 Hello
>
>                 I am getting the following error when i run
>                 ccs_config_Validate
>
>                 ccs_config_validate
>                 Relax-NG validity error : Extra element clusternodes
>                 in interleave
>
>
>             You defined <clusternodes.. twice.
>
>
>         That + the are more issues discoverable by more powerful validator
>         jing (packaged in Fedora and RHEL 7, for instance, admittedly not
>         for RHEL 6/EPEL):
>
>         $ jing cluster.rng cluster.conf
>
>             cluster.conf:13:47: error:
>                element "fencedvice" not allowed anywhere; expected the
>             element
>                end-tag or element "fencedevice"
>             cluster.conf:15:23: error:
>                element "clusternodes" not allowed here; expected the
>             element
>                end-tag or element "clvmd", "dlm", "fence_daemon",
>             "fence_xvmd",
>                "gfs_controld", "group", "logging", "quorumd", "rm",
>             "totem" or
>                "uidgid"
>             cluster.conf:26:76: error:
>                IDREF "fence_node2" without matching ID
>             cluster.conf:19:77: error:
>                IDREF "fence_node1" without matching ID
>
>
>         So it spotted also:
>         - a typo in "fencedvice"
>         - broken referential integrity; it is prescribed "name" attribute
>            of "device" tag should match a "name" of a defined
>         "fencedevice"
>
>         Hope this helps.
>
>         -- Jan
>
>
>     Also, without fence methods defined for the nodes, rgmanager will
>     block the first time there is an issue.
>
>     -- 
>     Digimer
>     Papers and Projects: https://alteeve.ca/w/
>     What if the cure for cancer is trapped in the mind of a person
>     without access to education?
>
>
>     -- 
>     Linux-cluster mailing list
>     Linux-cluster at redhat.com <mailto:Linux-cluster at redhat.com>
>     https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
>
>


-- 
Elvir Kuric,TSE / Red Hat / GSS EMEA / 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20140704/ea7e51bb/attachment.htm>

From laszlo.budai at acceleris.ro  Thu Jul 10 11:49:09 2014
From: laszlo.budai at acceleris.ro (Laszlo Budai)
Date: Thu, 10 Jul 2014 14:49:09 +0300
Subject: [Linux-cluster] Cman not start when quorum disk is not available
Message-ID: <53BE7DB5.3080203@acceleris.ro>

Dear all,

we have a RHEL 6.3 cluster of two nodes and a quorum disk.
We are testing the cluster against different failures. We have a problem 
when the shared storage is disconnected from one of the nodes. The node 
that has lost contact with the storage is fenced, but when restarting 
the machine cman will not start up (it will try to start but it will stop):


Jul  9 17:55:54 clnode1p kdump: started up
Jul  9 17:55:54 clnode1p kernel: bond0: no IPv6 routers present
Jul  9 17:55:54 clnode1p kernel: DLM (built Jun 13 2012 18:26:45) installed
Jul  9 17:55:55 clnode1p corosync[2514]:   [MAIN  ] Corosync Cluster 
Engine ('1.4.1'): started and ready to provide service.
Jul  9 17:55:55 clnode1p corosync[2514]:   [MAIN  ] Corosync built-in 
features: nss dbus rdma snmp
Jul  9 17:55:55 clnode1p corosync[2514]:   [MAIN  ] Successfully read 
config from /etc/cluster/cluster.conf
Jul  9 17:55:55 clnode1p corosync[2514]:   [MAIN  ] Successfully parsed 
cman config
Jul  9 17:55:55 clnode1p corosync[2514]:   [TOTEM ] Initializing 
transport (UDP/IP Multicast).
Jul  9 17:55:55 clnode1p corosync[2514]:   [TOTEM ] Initializing 
transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jul  9 17:55:55 clnode1p corosync[2514]:   [TOTEM ] The network 
interface [172.16.255.1] is now up.
Jul  9 17:55:55 clnode1p corosync[2514]:   [QUORUM] Using quorum 
provider quorum_cman
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: corosync cluster quorum service v0.1
Jul  9 17:55:55 clnode1p corosync[2514]:   [CMAN  ] CMAN 3.0.12.1 (built 
May  8 2012 12:22:26) started
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: corosync CMAN membership service 2.90
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: openais checkpoint service B.01.01
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: corosync extended virtual synchrony service
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: corosync configuration service
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: corosync cluster closed process group service v1.01
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: corosync cluster config database access v1.01
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: corosync profile loading service
Jul  9 17:55:55 clnode1p corosync[2514]:   [QUORUM] Using quorum 
provider quorum_cman
Jul  9 17:55:55 clnode1p corosync[2514]:   [SERV  ] Service engine 
loaded: corosync cluster quorum service v0.1
Jul  9 17:55:55 clnode1p corosync[2514]:   [MAIN  ] Compatibility mode 
set to whitetank.  Using V1 and V2 of the synchronization engine.
Jul  9 17:55:55 clnode1p corosync[2514]:   [TOTEM ] A processor joined 
or left the membership and a new membership was formed.
Jul  9 17:55:55 clnode1p corosync[2514]:   [QUORUM] Members[1]: 1
Jul  9 17:55:55 clnode1p corosync[2514]:   [QUORUM] Members[1]: 1
Jul  9 17:55:55 clnode1p corosync[2514]:   [CPG   ] chosen downlist: 
sender r(0) ip(172.16.255.1) ; members(old:0 left:0)
Jul  9 17:55:55 clnode1p corosync[2514]:   [MAIN  ] Completed service 
synchronization, ready to provide service.
Jul  9 17:55:55 clnode1p corosync[2514]:   [TOTEM ] A processor joined 
or left the membership and a new membership was formed.
Jul  9 17:55:55 clnode1p corosync[2514]:   [CMAN  ] quorum regained, 
resuming activity
Jul  9 17:55:55 clnode1p corosync[2514]:   [QUORUM] This node is within 
the primary component and will provide service.
Jul  9 17:55:55 clnode1p corosync[2514]:   [QUORUM] Members[2]: 1 2
Jul  9 17:55:55 clnode1p corosync[2514]:   [QUORUM] Members[2]: 1 2
Jul  9 17:55:55 clnode1p corosync[2514]:   [CPG   ] chosen downlist: 
sender r(0) ip(172.16.255.1) ; members(old:1 left:0)
Jul  9 17:55:55 clnode1p corosync[2514]:   [MAIN  ] Completed service 
synchronization, ready to provide service.
Jul  9 17:55:59 clnode1p kernel: bond1: no IPv6 routers present
Jul  9 17:55:59 clnode1p qdiskd[2564]: Loading dynamic configuration
Jul  9 17:55:59 clnode1p qdiskd[2564]: Setting votes to 1
Jul  9 17:55:59 clnode1p qdiskd[2564]: Loading static configuration
Jul  9 17:55:59 clnode1p qdiskd[2564]: Timings: 8 tko, 1 interval
Jul  9 17:55:59 clnode1p qdiskd[2564]: Timings: 2 tko_up, 4 master_wait, 
2 upgrade_wait
Jul  9 17:55:59 clnode1p qdiskd[2564]: Heuristic: '/bin/ping -c1 -w1 
clswitch1m' score=1 interval=2 tko=4
Jul  9 17:55:59 clnode1p qdiskd[2564]: Heuristic: '/bin/ping -c1 -w1 
clswitch2m' score=1 interval=2 tko=4
Jul  9 17:55:59 clnode1p qdiskd[2564]: 2 heuristics loaded
Jul  9 17:55:59 clnode1p qdiskd[2564]: Quorum Daemon: 2 heuristics, 1 
interval, 8 tko, 1 votes
Jul  9 17:55:59 clnode1p qdiskd[2564]: Run Flags: 00000271
Jul  9 17:55:59 clnode1p qdiskd[2564]: stat
Jul  9 17:55:59 clnode1p qdiskd[2564]: qdisk_validate: No such file or 
directory
Jul  9 17:55:59 clnode1p qdiskd[2564]: Specified partition 
/dev/mapper/apsto1-vd01-v001 does not have a qdisk label
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Unloading all 
Corosync service engines.
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Service engine 
unloaded: corosync extended virtual synchrony service
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Service engine 
unloaded: corosync configuration service
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Service engine 
unloaded: corosync cluster closed process group service v1.01
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Service engine 
unloaded: corosync cluster config database access v1.01
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Service engine 
unloaded: corosync profile loading service
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Service engine 
unloaded: openais checkpoint service B.01.01
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Service engine 
unloaded: corosync CMAN membership service 2.90
Jul  9 17:56:01 clnode1p corosync[2514]:   [SERV  ] Service engine 
unloaded: corosync cluster quorum service v0.1
Jul  9 17:56:01 clnode1p corosync[2514]:   [MAIN  ] Corosync Cluster 
Engine exiting with status 0 at main.c:1864.


And it will remain in this state even if the storage is reattached later 
on.  So now I have only one functioning node.
What can be done to fix this (to have the cluster framework started)?

Thank you,
Laszlo



-- 
Acceleris System Integration | and IT works
  
Laszlo Budai | Technical Consultant
Bvd. Barbu Vacarescu 80 | RO-020282 Bucuresti
t +40 21 23 11 538
laszlo.budai at acceleris.ro | www.acceleris.ro
                         
Acceleris Offices are in:
Basel | Bucharest | Zollikofen | Renens | Kloten



From amujeebs at gmail.com  Tue Jul 15 14:18:30 2014
From: amujeebs at gmail.com (abdul mujeeb Siddiqui)
Date: Tue, 15 Jul 2014 17:18:30 +0300
Subject: [Linux-cluster] Basename mismatch
Message-ID: <CAG7dDZeRuyq8SMTf8+tHE1BBDmrfU7Wf0XSF3CmSvfBOEtES-A@mail.gmail.com>

Hello, I have to implemented  red hat linux 6.4 cluster suite and trying to
use Oracle11gr2 on it.But oracle service is unable to start.
Listener isnot starting.
Anyone have implemented oracle11gr2 so please
Send me cluster.conf and oracledb.sh and also listener.ora and tnsnames.ora
files pls.
Thanks in advanced
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20140715/c7058cef/attachment.htm>

From devin.bougie at cornell.edu  Tue Jul 15 15:36:54 2014
From: devin.bougie at cornell.edu (Devin A. Bougie)
Date: Tue, 15 Jul 2014 15:36:54 +0000
Subject: [Linux-cluster] mixed 6.4 and 6.5 cluster - delays accessing mpath
 devices and clustered lvm's
Message-ID: <86A230DF-E57C-45C4-8949-E2E15BB75411@cornell.edu>

We have a cluster of EL6.4 servers, with one server at fully updated EL6.5.  After upgrading to 6.5, we see unreasonably long delays accessing some mpath devices and clustered lvm's on the 6.5 member.  There are no problems with the 6.4 members.

This can be seen by strace'ing lvscan.  In the following example, syscall time is at the end of the line,
reads with ascii text are mpath devices, the rest are volumes:

------
16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.467385>
16241 read(5, "\17u\21^ LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 4096) = 4096 <1.760943>
16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.164032>
16241 read(5, "gment1 {\nstart_extent = 0\nextent"..., 4096) = 4096 <2.859972>
16241 read(5, "\353H\220\20\216\320\274\0\260\270\0\0\216\330\216\300\373\276\0|\277\0\6\271\0\2\363\244\352!\6\0"..., 4096) = 4096 <1.717222>
16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.476014>
16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.800225>
16241 read(5, "3\300\216\320\274\0|\216\300\216\330\276\0|\277\0\6\271\0\2\374\363\244Ph\34\6\313\373\271\4\0"..., 4096) = 4096 <2.008620>
16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <2.021734>
16241 read(5, "3\300\216\320\274\0|\216\300\216\330\276\0|\277\0\6\271\0\2\374\363\244Ph\34\6\313\373\271\4\0"..., 4096) = 4096 <2.126359>
16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <2.036027>
16241 read(5, "\1\4\0\0\21\4\0\0!\4\0\0\331[\362\37\2\0\4\0\0\0\0\0\0\0\0\0\356\37U\23"..., 4096) = 4096 <1.330302>
16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.381982>
16241 read(5, "vgift3 {\nid = \"spdYGc-5hqc-ejzd-"..., 8192) = 8192 <0.922098>
16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <2.440282>
16241 read(6, "vgift3 {\nid = \"spdYGc-5hqc-ejzd-"..., 8192) = 8192 <1.158817>
16241 read(5, "gment1 {\nstart_extent = 0\nextent"..., 4096) = 4096 <0.941814>
16241 read(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.518448>
16241 read(6, "gment1 {\nstart_extent = 0\nextent"..., 20480) = 20480 <2.006777>
------

The delay can also be seen in the syslog  messages we receive after restarting clvmd with debugging enabled.

------
Jul 14 11:47:58 lnx05 lvm[13423]: Got new connection on fd 5
Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28
Jul 14 11:48:03 lnx05 lvm[13423]: creating pipe, [11, 12]
Jul 14 11:48:03 lnx05 lvm[13423]: Creating pre&post thread
Jul 14 11:48:03 lnx05 lvm[13423]: Created pre&post thread, state = 0
Jul 14 11:48:03 lnx05 lvm[13423]: in sub thread: client = 0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift5' at 1 (client=0x13e7460)
Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: 'V_vgift5' mode:3 flags=0
Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: returning lkid 24c0008
Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3443, flags=0x1 (LOCAL)
Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3443
Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local
Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift5', cmd = 0x1 LCK_VG (READ|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0
Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift5
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1
Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply
Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 31
Jul 14 11:48:03 lnx05 lvm[13423]: check_all_clvmds_running
Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3444, flags=0x0 ()
Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=31, csid=(nil), xid=3444
Jul 14 11:48:03 lnx05 lvm[13423]: Sending message to all cluster nodes
Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local
Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: SYNC_NAMES (0x2d) msg=0x13e7110, msglen =31, client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Syncing device names
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx01-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 2 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx02-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 3 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx04-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 4 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx07-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 5 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx06-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 6 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx08-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 7 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx09-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 8 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx03-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 9 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply
Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28
Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift5' at 6 (client=0x13e7460)
Jul 14 11:48:03 lnx05 lvm[13423]: sync_unlock: 'V_vgift5' lkid:24c0008
Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3445, flags=0x1 (LOCAL)
Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3445
Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local
Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift5', cmd = 0x6 LCK_VG (UNLOCK|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0
Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift5
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1
Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply
Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28
Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift3' at 1 (client=0x13e7460)
Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: 'V_vgift3' mode:3 flags=0
Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: returning lkid 166000b
Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3446, flags=0x1 (LOCAL)
Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3446
Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local
Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift3', cmd = 0x1 LCK_VG (READ|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0
Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift3
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1
Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply
Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 31
Jul 14 11:48:03 lnx05 lvm[13423]: check_all_clvmds_running
Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3447, flags=0x0 ()
Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=31, csid=(nil), xid=3447
Jul 14 11:48:03 lnx05 lvm[13423]: Sending message to all cluster nodes
Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local
Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: SYNC_NAMES (0x2d) msg=0x13e7110, msglen =31, client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Syncing device names
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx01-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 2 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx02-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 3 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx04-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 4 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx07-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 5 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx06-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 6 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx08-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 7 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx09-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 8 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx03-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 9 replies, expecting: 9
Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply
Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28
Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift3' at 6 (client=0x13e7460)
Jul 14 11:48:03 lnx05 lvm[13423]: sync_unlock: 'V_vgift3' lkid:166000b
Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3448, flags=0x1 (LOCAL)
Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3448
Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local
Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift3', cmd = 0x6 LCK_VG (UNLOCK|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0
Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift3
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1
Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply
Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28
Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift2' at 1 (client=0x13e7460)
Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: 'V_vgift2' mode:3 flags=0
Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: returning lkid 3b20007
Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3449, flags=0x1 (LOCAL)
Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3449
Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local
Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift2', cmd = 0x1 LCK_VG (READ|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0
Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift2
Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes
Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1
Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition...
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command
Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply
Jul 14 11:48:04 lnx05 lvm[13423]: Read on local socket 5, len = 31
Jul 14 11:48:04 lnx05 lvm[13423]: check_all_clvmds_running
Jul 14 11:48:04 lnx05 lvm[13423]: Got pre command condition...
Jul 14 11:48:04 lnx05 lvm[13423]: Writing status 0 down pipe 12
Jul 14 11:48:04 lnx05 lvm[13423]: Waiting to do post command - state = 0
Jul 14 11:48:04 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0
Jul 14 11:48:04 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460
Jul 14 11:48:04 lnx05 lvm[13423]: distribute command: XID = 3450, flags=0x0 ()
Jul 14 11:48:04 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=31, csid=(nil), xid=3450
Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0
Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote
Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25821 on node lnx04-p12
Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names
Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0
Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote
Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25832 on node lnx04-p12
Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names
Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0
Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote
Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25844 on node lnx04-p12
Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names
Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0
Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote
Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25857 on node lnx04-p12
Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names
Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0
Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote
Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25905 on node lnx04-p12
Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names
Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work
Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0
Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote
Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25914 on node lnx04-p12
Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names
Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work
------

Before we upgrade all cluster members to 6.5, we'd like to be reasonably certain that it will fix the problem rather than spread it to the entire cluster.  Any help would be greatly appreciated.

Many thanks,
Devin





From emi2fast at gmail.com  Tue Jul 15 15:50:41 2014
From: emi2fast at gmail.com (emmanuel segura)
Date: Tue, 15 Jul 2014 17:50:41 +0200
Subject: [Linux-cluster] Basename mismatch
In-Reply-To: <CAG7dDZeRuyq8SMTf8+tHE1BBDmrfU7Wf0XSF3CmSvfBOEtES-A@mail.gmail.com>
References: <CAG7dDZeRuyq8SMTf8+tHE1BBDmrfU7Wf0XSF3CmSvfBOEtES-A@mail.gmail.com>
Message-ID: <CAE7pJ3DjmmpO0ZFzcwAzY=WNuNu9=xPbvwumsTBQqJKtFBiPXQ@mail.gmail.com>

1: show your config
2: show what kind of problem you find, doing what you want to archive
3: show your cluster
4: without any error and some more information, is realy hard to help you

2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
> Hello, I have to implemented  red hat linux 6.4 cluster suite and trying to
> use Oracle11gr2 on it.But oracle service is unable to start.
> Listener isnot starting.
> Anyone have implemented oracle11gr2 so please
> Send me cluster.conf and oracledb.sh and also listener.ora and tnsnames.ora
> files pls.
> Thanks in advanced
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



-- 
esta es mi vida e me la vivo hasta que dios quiera



From amujeebs at gmail.com  Wed Jul 16 12:31:12 2014
From: amujeebs at gmail.com (abdul mujeeb Siddiqui)
Date: Wed, 16 Jul 2014 15:31:12 +0300
Subject: [Linux-cluster] Basename mismatch
In-Reply-To: <CAE7pJ3DjmmpO0ZFzcwAzY=WNuNu9=xPbvwumsTBQqJKtFBiPXQ@mail.gmail.com>
References: <CAG7dDZeRuyq8SMTf8+tHE1BBDmrfU7Wf0XSF3CmSvfBOEtES-A@mail.gmail.com>
	<CAE7pJ3DjmmpO0ZFzcwAzY=WNuNu9=xPbvwumsTBQqJKtFBiPXQ@mail.gmail.com>
Message-ID: <CAG7dDZd_ZF3XebDdQBFCeOQQaELppUip1+MLBWHQNgPuJT5vQg@mail.gmail.com>

Dear segura ,

Thanks for response
1: show your config
<?xml version="1.0"?>
<cluster config_version="10" name="oracleha">
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
           <fencedevice agent= "fence_ipmilan" ipaddr="10.10.63.93"
login="ADMIN" name="inspuripmi"  passwd="xxx"/>
           <fencedevice agent = "fence_ilo2" ipaddr="10.10.63.92"
login="test" name="hpipmi"  passwd="xxxx"/>
          </fencedevices>
      <fence_daemon post_fail_delay="0" post_join_delay="60"/>
        <clusternodes>
           <clusternode name= "192.168.10.10"  nodeid="1" >
           <fence>
               <method name  = "1">
                 <device lanplus = "" name="inspuripmi"  action ="reboot"/>
                 </method>
            </fence>
           </clusternode>
            <clusternode name = "192.168.10.11" nodeid="2">
                 <fence>
                 <method name = "1">
                  <device lanplus = "" name="hpipmi" action ="reboot"/>
                  </method>
               </fence>
            </clusternode>
         </clusternodes>

        <rm>
          <failoverdomains/>
        <resources/>
        <service autostart="1" exclusive="0" name="IP" recovery="relocate">
                <ip address="10.10.5.23" monitor_link="on" sleeptime="10"/>
                <fs device="/dev/mapper/datavg-lv_orau02" force_unmount="1"
fstype="ext4" mountpoint="/u02" name="/u02"/>
                <fs device="/dev/mapper/datavg-lv_orau03" force_unmount="1"
fstype="ext4" mountpoint="/u03" name="/u03"/>
                <fs device="/dev/mapper/datavg-lv_orau04" force_unmount="1"
fstype="ext4" mountpoint="/u04" name="/u04"/>
                <fs device="/dev/mapper/datavg-lv_orau05" force_unmount="1"
fstype="ext4" mountpoint="/u05" name="/u05"/>
        <oracledb home="/u01/app/oracle/product/11.2.0.4/dbhome_1"
name="testvip" type="base" user="oracle" vhost="10.10.5.23"/>
        </service>
</rm>
</cluster>

 2: show what kind of problem you find, doing what you want to archive
[root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP
Running in test mode.
<debug>  Validating configuration for testvip
[oracledb] Validating configuration for testvip
basename: missing operand
Try `basename --help' for more information.
Failed to start IP


[root@/usr/share/cluster]# vi oracledb.sh
# Customize these to match your Oracle installation. #
######################################################
#
# 1. Oracle user.  Must be the same across all cluster members.  In the
event
#    that this script is run by the super-user, it will automatically switch
#    to the Oracle user and restart.  Oracle needs to run as the Oracle
#    user, not as root.
#[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
#
# 2. Oracle home.  This is set up during the installation phase of Oracle.
#    From the perspective of the cluster, this is generally the mount point
#    you intend to use as the mount point for your Oracle Infrastructure
#    service.
#
#[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home
[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/u01/app/oracle/product/
11.2.0.4/dbhome_1
#
# 3. This is your SID.  This is set up during oracle installation as well.
#
#[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl
[ -n "$ORACLE_SID" ] || ORACLE_SID=testvip
#
# 4. The oracle user probably doesn't have the permission to write to
# /var/lock/subsys, so use the user's home directory.
#
#[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock"
[ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock"
#[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch
privileges
#
# 5. Type of Oracle Database.  Currently supported: 10g 10g-iAS(untested!)
#[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em"
[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base"
#
# 6. Oracle virtual hostname.  This is the hostname you gave Oracle during
#    installation.
#
#[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com


 3: show your cluster
root at xxxx]# clustat
Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
Member Status: Quorate
 Member Name                             ID   Status
 ------ ----                             ---- ------
 192.168.10.10                               1 Online, Local, rgmanager
 192.168.10.11                               2 Online, rgmanager
 Service Name                   Owner (Last)                   State
 ------- ----                   ----- ------                   -----
 service:IP                     192.168.10.10                  started

[root at xxx]# clustat
Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
Member Status: Quorate
 Member Name                             ID   Status
 ------ ----                             ---- ------
 192.168.10.10                               1 Online,rgmanager
 192.168.10.11                               2 Online,Local,rgmanager
 Service Name                   Owner (Last)                   State
 ------- ----                   ----- ------                   -----
 service:IP                     192.168.10.11                  started

 4: without any error and some more information, is realy hard to help you

I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5  .
In normal working fine .
When I am using Oracle Service in Cluster.conf as above I am getting ERROR
Try `basename --help' for more information.
Failed to start IP
Database Service unable to start.

I have try all the options given in oracledb.sh  but same ERROR
Please help me .

Thanks in advance.


On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura <emi2fast at gmail.com> wrote:

> 1: show your config
> 2: show what kind of problem you find, doing what you want to archive
> 3: show your cluster
> 4: without any error and some more information, is realy hard to help you
>
> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
> > Hello, I have to implemented  red hat linux 6.4 cluster suite and trying
> to
> > use Oracle11gr2 on it.But oracle service is unable to start.
> > Listener isnot starting.
> > Anyone have implemented oracle11gr2 so please
> > Send me cluster.conf and oracledb.sh and also listener.ora and
> tnsnames.ora
> > files pls.
> > Thanks in advanced
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
>
> --
> esta es mi vida e me la vivo hasta que dios quiera
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20140716/b7d0fc44/attachment.htm>

From emi2fast at gmail.com  Wed Jul 16 14:31:35 2014
From: emi2fast at gmail.com (emmanuel segura)
Date: Wed, 16 Jul 2014 16:31:35 +0200
Subject: [Linux-cluster] Basename mismatch
In-Reply-To: <CAG7dDZd_ZF3XebDdQBFCeOQQaELppUip1+MLBWHQNgPuJT5vQg@mail.gmail.com>
References: <CAG7dDZeRuyq8SMTf8+tHE1BBDmrfU7Wf0XSF3CmSvfBOEtES-A@mail.gmail.com>
	<CAE7pJ3DjmmpO0ZFzcwAzY=WNuNu9=xPbvwumsTBQqJKtFBiPXQ@mail.gmail.com>
	<CAG7dDZd_ZF3XebDdQBFCeOQQaELppUip1+MLBWHQNgPuJT5vQg@mail.gmail.com>
Message-ID: <CAE7pJ3A6cH4c_6qi-ys6hXng=AOoGtgNYL9n8b=_+z1dC5J0Yg@mail.gmail.com>

can you try to use orainstance agent ?

2014-07-16 14:31 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
> Dear segura ,
>
> Thanks for response
> 1: show your config
> <?xml version="1.0"?>
> <cluster config_version="10" name="oracleha">
>         <cman expected_votes="1" two_node="1"/>
>         <fencedevices>
>            <fencedevice agent= "fence_ipmilan" ipaddr="10.10.63.93"
> login="ADMIN" name="inspuripmi"  passwd="xxx"/>
>            <fencedevice agent = "fence_ilo2" ipaddr="10.10.63.92"
> login="test" name="hpipmi"  passwd="xxxx"/>
>           </fencedevices>
>       <fence_daemon post_fail_delay="0" post_join_delay="60"/>
>         <clusternodes>
>            <clusternode name= "192.168.10.10"  nodeid="1" >
>            <fence>
>                <method name  = "1">
>                  <device lanplus = "" name="inspuripmi"  action ="reboot"/>
>                  </method>
>             </fence>
>            </clusternode>
>             <clusternode name = "192.168.10.11" nodeid="2">
>                  <fence>
>                  <method name = "1">
>                   <device lanplus = "" name="hpipmi" action ="reboot"/>
>                   </method>
>                </fence>
>             </clusternode>
>          </clusternodes>
>
>         <rm>
>           <failoverdomains/>
>         <resources/>
>         <service autostart="1" exclusive="0" name="IP" recovery="relocate">
>                 <ip address="10.10.5.23" monitor_link="on" sleeptime="10"/>
>                 <fs device="/dev/mapper/datavg-lv_orau02" force_unmount="1"
> fstype="ext4" mountpoint="/u02" name="/u02"/>
>                 <fs device="/dev/mapper/datavg-lv_orau03" force_unmount="1"
> fstype="ext4" mountpoint="/u03" name="/u03"/>
>                 <fs device="/dev/mapper/datavg-lv_orau04" force_unmount="1"
> fstype="ext4" mountpoint="/u04" name="/u04"/>
>                 <fs device="/dev/mapper/datavg-lv_orau05" force_unmount="1"
> fstype="ext4" mountpoint="/u05" name="/u05"/>
>         <oracledb home="/u01/app/oracle/product/11.2.0.4/dbhome_1"
> name="testvip" type="base" user="oracle" vhost="10.10.5.23"/>
>         </service>
> </rm>
> </cluster>
>
>  2: show what kind of problem you find, doing what you want to archive
> [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP
> Running in test mode.
> <debug>  Validating configuration for testvip
> [oracledb] Validating configuration for testvip
> basename: missing operand
> Try `basename --help' for more information.
> Failed to start IP
>
>
> [root@/usr/share/cluster]# vi oracledb.sh
> # Customize these to match your Oracle installation. #
> ######################################################
> #
> # 1. Oracle user.  Must be the same across all cluster members.  In the
> event
> #    that this script is run by the super-user, it will automatically switch
> #    to the Oracle user and restart.  Oracle needs to run as the Oracle
> #    user, not as root.
> #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
> [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
> #
> # 2. Oracle home.  This is set up during the installation phase of Oracle.
> #    From the perspective of the cluster, this is generally the mount point
> #    you intend to use as the mount point for your Oracle Infrastructure
> #    service.
> #
> #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home
> [ -n "$ORACLE_HOME" ] ||
> ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1
> #
> # 3. This is your SID.  This is set up during oracle installation as well.
> #
> #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl
> [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip
> #
> # 4. The oracle user probably doesn't have the permission to write to
> # /var/lock/subsys, so use the user's home directory.
> #
> #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock"
> [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock"
> #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch
> privileges
> #
> # 5. Type of Oracle Database.  Currently supported: 10g 10g-iAS(untested!)
> #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em"
> [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base"
> #
> # 6. Oracle virtual hostname.  This is the hostname you gave Oracle during
> #    installation.
> #
> #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com
>
>
>  3: show your cluster
> root at xxxx]# clustat
> Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
> Member Status: Quorate
>  Member Name                             ID   Status
>  ------ ----                             ---- ------
>  192.168.10.10                               1 Online, Local, rgmanager
>  192.168.10.11                               2 Online, rgmanager
>  Service Name                   Owner (Last)                   State
>  ------- ----                   ----- ------                   -----
>  service:IP                     192.168.10.10                  started
>
> [root at xxx]# clustat
> Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
> Member Status: Quorate
>  Member Name                             ID   Status
>  ------ ----                             ---- ------
>  192.168.10.10                               1 Online,rgmanager
>  192.168.10.11                               2 Online,Local,rgmanager
>  Service Name                   Owner (Last)                   State
>  ------- ----                   ----- ------                   -----
>  service:IP                     192.168.10.11                  started
>
>
>  4: without any error and some more information, is realy hard to help you
>
> I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5  .
> In normal working fine .
> When I am using Oracle Service in Cluster.conf as above I am getting ERROR
> Try `basename --help' for more information.
> Failed to start IP
> Database Service unable to start.
>
> I have try all the options given in oracledb.sh  but same ERROR
> Please help me .
>
> Thanks in advance.
>
>
> On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura <emi2fast at gmail.com> wrote:
>>
>> 1: show your config
>> 2: show what kind of problem you find, doing what you want to archive
>> 3: show your cluster
>> 4: without any error and some more information, is realy hard to help you
>>
>> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
>> > Hello, I have to implemented  red hat linux 6.4 cluster suite and trying
>> > to
>> > use Oracle11gr2 on it.But oracle service is unable to start.
>> > Listener isnot starting.
>> > Anyone have implemented oracle11gr2 so please
>> > Send me cluster.conf and oracledb.sh and also listener.ora and
>> > tnsnames.ora
>> > files pls.
>> > Thanks in advanced
>> >
>> >
>> > --
>> > Linux-cluster mailing list
>> > Linux-cluster at redhat.com
>> > https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>
>>
>> --
>> esta es mi vida e me la vivo hasta que dios quiera
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



-- 
esta es mi vida e me la vivo hasta que dios quiera



From amujeebs at gmail.com  Wed Jul 16 16:36:36 2014
From: amujeebs at gmail.com (abdul mujeeb Siddiqui)
Date: Wed, 16 Jul 2014 19:36:36 +0300
Subject: [Linux-cluster] Basename mismatch
In-Reply-To: <CAE7pJ3A6cH4c_6qi-ys6hXng=AOoGtgNYL9n8b=_+z1dC5J0Yg@mail.gmail.com>
References: <CAG7dDZeRuyq8SMTf8+tHE1BBDmrfU7Wf0XSF3CmSvfBOEtES-A@mail.gmail.com>
	<CAE7pJ3DjmmpO0ZFzcwAzY=WNuNu9=xPbvwumsTBQqJKtFBiPXQ@mail.gmail.com>
	<CAG7dDZd_ZF3XebDdQBFCeOQQaELppUip1+MLBWHQNgPuJT5vQg@mail.gmail.com>
	<CAE7pJ3A6cH4c_6qi-ys6hXng=AOoGtgNYL9n8b=_+z1dC5J0Yg@mail.gmail.com>
Message-ID: <CAG7dDZfO_M7hv9f2BsLk=s2ctHaXF=7ztu5bnc7en_tGubMK1A@mail.gmail.com>

You mean say orainstance.sh script from /usr/share/cluster.
 On 16 Jul 2014 17:36, "emmanuel segura" <emi2fast at gmail.com> wrote:

> can you try to use orainstance agent ?
>
> 2014-07-16 14:31 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
> > Dear segura ,
> >
> > Thanks for response
> > 1: show your config
> > <?xml version="1.0"?>
> > <cluster config_version="10" name="oracleha">
> >         <cman expected_votes="1" two_node="1"/>
> >         <fencedevices>
> >            <fencedevice agent= "fence_ipmilan" ipaddr="10.10.63.93"
> > login="ADMIN" name="inspuripmi"  passwd="xxx"/>
> >            <fencedevice agent = "fence_ilo2" ipaddr="10.10.63.92"
> > login="test" name="hpipmi"  passwd="xxxx"/>
> >           </fencedevices>
> >       <fence_daemon post_fail_delay="0" post_join_delay="60"/>
> >         <clusternodes>
> >            <clusternode name= "192.168.10.10"  nodeid="1" >
> >            <fence>
> >                <method name  = "1">
> >                  <device lanplus = "" name="inspuripmi"  action
> ="reboot"/>
> >                  </method>
> >             </fence>
> >            </clusternode>
> >             <clusternode name = "192.168.10.11" nodeid="2">
> >                  <fence>
> >                  <method name = "1">
> >                   <device lanplus = "" name="hpipmi" action ="reboot"/>
> >                   </method>
> >                </fence>
> >             </clusternode>
> >          </clusternodes>
> >
> >         <rm>
> >           <failoverdomains/>
> >         <resources/>
> >         <service autostart="1" exclusive="0" name="IP"
> recovery="relocate">
> >                 <ip address="10.10.5.23" monitor_link="on"
> sleeptime="10"/>
> >                 <fs device="/dev/mapper/datavg-lv_orau02"
> force_unmount="1"
> > fstype="ext4" mountpoint="/u02" name="/u02"/>
> >                 <fs device="/dev/mapper/datavg-lv_orau03"
> force_unmount="1"
> > fstype="ext4" mountpoint="/u03" name="/u03"/>
> >                 <fs device="/dev/mapper/datavg-lv_orau04"
> force_unmount="1"
> > fstype="ext4" mountpoint="/u04" name="/u04"/>
> >                 <fs device="/dev/mapper/datavg-lv_orau05"
> force_unmount="1"
> > fstype="ext4" mountpoint="/u05" name="/u05"/>
> >         <oracledb home="/u01/app/oracle/product/11.2.0.4/dbhome_1"
> > name="testvip" type="base" user="oracle" vhost="10.10.5.23"/>
> >         </service>
> > </rm>
> > </cluster>
> >
> >  2: show what kind of problem you find, doing what you want to archive
> > [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP
> > Running in test mode.
> > <debug>  Validating configuration for testvip
> > [oracledb] Validating configuration for testvip
> > basename: missing operand
> > Try `basename --help' for more information.
> > Failed to start IP
> >
> >
> > [root@/usr/share/cluster]# vi oracledb.sh
> > # Customize these to match your Oracle installation. #
> > ######################################################
> > #
> > # 1. Oracle user.  Must be the same across all cluster members.  In the
> > event
> > #    that this script is run by the super-user, it will automatically
> switch
> > #    to the Oracle user and restart.  Oracle needs to run as the Oracle
> > #    user, not as root.
> > #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
> > [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
> > #
> > # 2. Oracle home.  This is set up during the installation phase of
> Oracle.
> > #    From the perspective of the cluster, this is generally the mount
> point
> > #    you intend to use as the mount point for your Oracle Infrastructure
> > #    service.
> > #
> > #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home
> > [ -n "$ORACLE_HOME" ] ||
> > ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1
> > #
> > # 3. This is your SID.  This is set up during oracle installation as
> well.
> > #
> > #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl
> > [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip
> > #
> > # 4. The oracle user probably doesn't have the permission to write to
> > # /var/lock/subsys, so use the user's home directory.
> > #
> > #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock"
> > [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock"
> > #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch
> > privileges
> > #
> > # 5. Type of Oracle Database.  Currently supported: 10g
> 10g-iAS(untested!)
> > #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em"
> > [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base"
> > #
> > # 6. Oracle virtual hostname.  This is the hostname you gave Oracle
> during
> > #    installation.
> > #
> > #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com
> >
> >
> >  3: show your cluster
> > root at xxxx]# clustat
> > Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
> > Member Status: Quorate
> >  Member Name                             ID   Status
> >  ------ ----                             ---- ------
> >  192.168.10.10                               1 Online, Local, rgmanager
> >  192.168.10.11                               2 Online, rgmanager
> >  Service Name                   Owner (Last)                   State
> >  ------- ----                   ----- ------                   -----
> >  service:IP                     192.168.10.10                  started
> >
> > [root at xxx]# clustat
> > Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
> > Member Status: Quorate
> >  Member Name                             ID   Status
> >  ------ ----                             ---- ------
> >  192.168.10.10                               1 Online,rgmanager
> >  192.168.10.11                               2 Online,Local,rgmanager
> >  Service Name                   Owner (Last)                   State
> >  ------- ----                   ----- ------                   -----
> >  service:IP                     192.168.10.11                  started
> >
> >
> >  4: without any error and some more information, is realy hard to help
> you
> >
> > I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5  .
> > In normal working fine .
> > When I am using Oracle Service in Cluster.conf as above I am getting
> ERROR
> > Try `basename --help' for more information.
> > Failed to start IP
> > Database Service unable to start.
> >
> > I have try all the options given in oracledb.sh  but same ERROR
> > Please help me .
> >
> > Thanks in advance.
> >
> >
> > On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura <emi2fast at gmail.com>
> wrote:
> >>
> >> 1: show your config
> >> 2: show what kind of problem you find, doing what you want to archive
> >> 3: show your cluster
> >> 4: without any error and some more information, is realy hard to help
> you
> >>
> >> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
> >> > Hello, I have to implemented  red hat linux 6.4 cluster suite and
> trying
> >> > to
> >> > use Oracle11gr2 on it.But oracle service is unable to start.
> >> > Listener isnot starting.
> >> > Anyone have implemented oracle11gr2 so please
> >> > Send me cluster.conf and oracledb.sh and also listener.ora and
> >> > tnsnames.ora
> >> > files pls.
> >> > Thanks in advanced
> >> >
> >> >
> >> > --
> >> > Linux-cluster mailing list
> >> > Linux-cluster at redhat.com
> >> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >>
> >>
> >>
> >> --
> >> esta es mi vida e me la vivo hasta que dios quiera
> >>
> >> --
> >> Linux-cluster mailing list
> >> Linux-cluster at redhat.com
> >> https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
>
> --
> esta es mi vida e me la vivo hasta que dios quiera
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20140716/fba134a7/attachment.htm>

From emi2fast at gmail.com  Wed Jul 16 17:12:12 2014
From: emi2fast at gmail.com (emmanuel segura)
Date: Wed, 16 Jul 2014 19:12:12 +0200
Subject: [Linux-cluster] Basename mismatch
In-Reply-To: <CAG7dDZfO_M7hv9f2BsLk=s2ctHaXF=7ztu5bnc7en_tGubMK1A@mail.gmail.com>
References: <CAG7dDZeRuyq8SMTf8+tHE1BBDmrfU7Wf0XSF3CmSvfBOEtES-A@mail.gmail.com>
	<CAE7pJ3DjmmpO0ZFzcwAzY=WNuNu9=xPbvwumsTBQqJKtFBiPXQ@mail.gmail.com>
	<CAG7dDZd_ZF3XebDdQBFCeOQQaELppUip1+MLBWHQNgPuJT5vQg@mail.gmail.com>
	<CAE7pJ3A6cH4c_6qi-ys6hXng=AOoGtgNYL9n8b=_+z1dC5J0Yg@mail.gmail.com>
	<CAG7dDZfO_M7hv9f2BsLk=s2ctHaXF=7ztu5bnc7en_tGubMK1A@mail.gmail.com>
Message-ID: <CAE7pJ3ALKFpVcv37K2a9PQVLHNvimo6ZLQgOenvBuqdW1rCZ3g@mail.gmail.com>

yes.

2014-07-16 18:36 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
> You mean say orainstance.sh script from /usr/share/cluster.
>
> On 16 Jul 2014 17:36, "emmanuel segura" <emi2fast at gmail.com> wrote:
>>
>> can you try to use orainstance agent ?
>>
>> 2014-07-16 14:31 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
>> > Dear segura ,
>> >
>> > Thanks for response
>> > 1: show your config
>> > <?xml version="1.0"?>
>> > <cluster config_version="10" name="oracleha">
>> >         <cman expected_votes="1" two_node="1"/>
>> >         <fencedevices>
>> >            <fencedevice agent= "fence_ipmilan" ipaddr="10.10.63.93"
>> > login="ADMIN" name="inspuripmi"  passwd="xxx"/>
>> >            <fencedevice agent = "fence_ilo2" ipaddr="10.10.63.92"
>> > login="test" name="hpipmi"  passwd="xxxx"/>
>> >           </fencedevices>
>> >       <fence_daemon post_fail_delay="0" post_join_delay="60"/>
>> >         <clusternodes>
>> >            <clusternode name= "192.168.10.10"  nodeid="1" >
>> >            <fence>
>> >                <method name  = "1">
>> >                  <device lanplus = "" name="inspuripmi"  action
>> > ="reboot"/>
>> >                  </method>
>> >             </fence>
>> >            </clusternode>
>> >             <clusternode name = "192.168.10.11" nodeid="2">
>> >                  <fence>
>> >                  <method name = "1">
>> >                   <device lanplus = "" name="hpipmi" action ="reboot"/>
>> >                   </method>
>> >                </fence>
>> >             </clusternode>
>> >          </clusternodes>
>> >
>> >         <rm>
>> >           <failoverdomains/>
>> >         <resources/>
>> >         <service autostart="1" exclusive="0" name="IP"
>> > recovery="relocate">
>> >                 <ip address="10.10.5.23" monitor_link="on"
>> > sleeptime="10"/>
>> >                 <fs device="/dev/mapper/datavg-lv_orau02"
>> > force_unmount="1"
>> > fstype="ext4" mountpoint="/u02" name="/u02"/>
>> >                 <fs device="/dev/mapper/datavg-lv_orau03"
>> > force_unmount="1"
>> > fstype="ext4" mountpoint="/u03" name="/u03"/>
>> >                 <fs device="/dev/mapper/datavg-lv_orau04"
>> > force_unmount="1"
>> > fstype="ext4" mountpoint="/u04" name="/u04"/>
>> >                 <fs device="/dev/mapper/datavg-lv_orau05"
>> > force_unmount="1"
>> > fstype="ext4" mountpoint="/u05" name="/u05"/>
>> >         <oracledb home="/u01/app/oracle/product/11.2.0.4/dbhome_1"
>> > name="testvip" type="base" user="oracle" vhost="10.10.5.23"/>
>> >         </service>
>> > </rm>
>> > </cluster>
>> >
>> >  2: show what kind of problem you find, doing what you want to archive
>> > [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP
>> > Running in test mode.
>> > <debug>  Validating configuration for testvip
>> > [oracledb] Validating configuration for testvip
>> > basename: missing operand
>> > Try `basename --help' for more information.
>> > Failed to start IP
>> >
>> >
>> > [root@/usr/share/cluster]# vi oracledb.sh
>> > # Customize these to match your Oracle installation. #
>> > ######################################################
>> > #
>> > # 1. Oracle user.  Must be the same across all cluster members.  In the
>> > event
>> > #    that this script is run by the super-user, it will automatically
>> > switch
>> > #    to the Oracle user and restart.  Oracle needs to run as the Oracle
>> > #    user, not as root.
>> > #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
>> > [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
>> > #
>> > # 2. Oracle home.  This is set up during the installation phase of
>> > Oracle.
>> > #    From the perspective of the cluster, this is generally the mount
>> > point
>> > #    you intend to use as the mount point for your Oracle Infrastructure
>> > #    service.
>> > #
>> > #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home
>> > [ -n "$ORACLE_HOME" ] ||
>> > ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1
>> > #
>> > # 3. This is your SID.  This is set up during oracle installation as
>> > well.
>> > #
>> > #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl
>> > [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip
>> > #
>> > # 4. The oracle user probably doesn't have the permission to write to
>> > # /var/lock/subsys, so use the user's home directory.
>> > #
>> > #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock"
>> > [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock"
>> > #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch
>> > privileges
>> > #
>> > # 5. Type of Oracle Database.  Currently supported: 10g
>> > 10g-iAS(untested!)
>> > #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em"
>> > [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base"
>> > #
>> > # 6. Oracle virtual hostname.  This is the hostname you gave Oracle
>> > during
>> > #    installation.
>> > #
>> > #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com
>> >
>> >
>> >  3: show your cluster
>> > root at xxxx]# clustat
>> > Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
>> > Member Status: Quorate
>> >  Member Name                             ID   Status
>> >  ------ ----                             ---- ------
>> >  192.168.10.10                               1 Online, Local, rgmanager
>> >  192.168.10.11                               2 Online, rgmanager
>> >  Service Name                   Owner (Last)                   State
>> >  ------- ----                   ----- ------                   -----
>> >  service:IP                     192.168.10.10                  started
>> >
>> > [root at xxx]# clustat
>> > Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
>> > Member Status: Quorate
>> >  Member Name                             ID   Status
>> >  ------ ----                             ---- ------
>> >  192.168.10.10                               1 Online,rgmanager
>> >  192.168.10.11                               2 Online,Local,rgmanager
>> >  Service Name                   Owner (Last)                   State
>> >  ------- ----                   ----- ------                   -----
>> >  service:IP                     192.168.10.11                  started
>> >
>> >
>> >  4: without any error and some more information, is realy hard to help
>> > you
>> >
>> > I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5  .
>> > In normal working fine .
>> > When I am using Oracle Service in Cluster.conf as above I am getting
>> > ERROR
>> > Try `basename --help' for more information.
>> > Failed to start IP
>> > Database Service unable to start.
>> >
>> > I have try all the options given in oracledb.sh  but same ERROR
>> > Please help me .
>> >
>> > Thanks in advance.
>> >
>> >
>> > On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura <emi2fast at gmail.com>
>> > wrote:
>> >>
>> >> 1: show your config
>> >> 2: show what kind of problem you find, doing what you want to archive
>> >> 3: show your cluster
>> >> 4: without any error and some more information, is realy hard to help
>> >> you
>> >>
>> >> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
>> >> > Hello, I have to implemented  red hat linux 6.4 cluster suite and
>> >> > trying
>> >> > to
>> >> > use Oracle11gr2 on it.But oracle service is unable to start.
>> >> > Listener isnot starting.
>> >> > Anyone have implemented oracle11gr2 so please
>> >> > Send me cluster.conf and oracledb.sh and also listener.ora and
>> >> > tnsnames.ora
>> >> > files pls.
>> >> > Thanks in advanced
>> >> >
>> >> >
>> >> > --
>> >> > Linux-cluster mailing list
>> >> > Linux-cluster at redhat.com
>> >> > https://www.redhat.com/mailman/listinfo/linux-cluster
>> >>
>> >>
>> >>
>> >> --
>> >> esta es mi vida e me la vivo hasta que dios quiera
>> >>
>> >> --
>> >> Linux-cluster mailing list
>> >> Linux-cluster at redhat.com
>> >> https://www.redhat.com/mailman/listinfo/linux-cluster
>> >
>> >
>> >
>> > --
>> > Linux-cluster mailing list
>> > Linux-cluster at redhat.com
>> > https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>
>>
>> --
>> esta es mi vida e me la vivo hasta que dios quiera
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



-- 
esta es mi vida e me la vivo hasta que dios quiera



From amujeebs at gmail.com  Wed Jul 16 17:38:21 2014
From: amujeebs at gmail.com (abdul mujeeb Siddiqui)
Date: Wed, 16 Jul 2014 20:38:21 +0300
Subject: [Linux-cluster] Basename mismatch
In-Reply-To: <CAE7pJ3ALKFpVcv37K2a9PQVLHNvimo6ZLQgOenvBuqdW1rCZ3g@mail.gmail.com>
References: <CAG7dDZeRuyq8SMTf8+tHE1BBDmrfU7Wf0XSF3CmSvfBOEtES-A@mail.gmail.com>
	<CAE7pJ3DjmmpO0ZFzcwAzY=WNuNu9=xPbvwumsTBQqJKtFBiPXQ@mail.gmail.com>
	<CAG7dDZd_ZF3XebDdQBFCeOQQaELppUip1+MLBWHQNgPuJT5vQg@mail.gmail.com>
	<CAE7pJ3A6cH4c_6qi-ys6hXng=AOoGtgNYL9n8b=_+z1dC5J0Yg@mail.gmail.com>
	<CAG7dDZfO_M7hv9f2BsLk=s2ctHaXF=7ztu5bnc7en_tGubMK1A@mail.gmail.com>
	<CAE7pJ3ALKFpVcv37K2a9PQVLHNvimo6ZLQgOenvBuqdW1rCZ3g@mail.gmail.com>
Message-ID: <CAG7dDZdSxk8o+PbV6zk8PbWjRX+dMUVtHq2cz-jvs0SR2pMKTQ@mail.gmail.com>

Can send me the syntax to be used in cluster.conf file ? Type=?
Aa I am using Oracle 11.2.0.4
On 16 Jul 2014 20:16, "emmanuel segura" <emi2fast at gmail.com> wrote:

> yes.
>
> 2014-07-16 18:36 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
> > You mean say orainstance.sh script from /usr/share/cluster.
> >
> > On 16 Jul 2014 17:36, "emmanuel segura" <emi2fast at gmail.com> wrote:
> >>
> >> can you try to use orainstance agent ?
> >>
> >> 2014-07-16 14:31 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com>:
> >> > Dear segura ,
> >> >
> >> > Thanks for response
> >> > 1: show your config
> >> > <?xml version="1.0"?>
> >> > <cluster config_version="10" name="oracleha">
> >> >         <cman expected_votes="1" two_node="1"/>
> >> >         <fencedevices>
> >> >            <fencedevice agent= "fence_ipmilan" ipaddr="10.10.63.93"
> >> > login="ADMIN" name="inspuripmi"  passwd="xxx"/>
> >> >            <fencedevice agent = "fence_ilo2" ipaddr="10.10.63.92"
> >> > login="test" name="hpipmi"  passwd="xxxx"/>
> >> >           </fencedevices>
> >> >       <fence_daemon post_fail_delay="0" post_join_delay="60"/>
> >> >         <clusternodes>
> >> >            <clusternode name= "192.168.10.10"  nodeid="1" >
> >> >            <fence>
> >> >                <method name  = "1">
> >> >                  <device lanplus = "" name="inspuripmi"  action
> >> > ="reboot"/>
> >> >                  </method>
> >> >             </fence>
> >> >            </clusternode>
> >> >             <clusternode name = "192.168.10.11" nodeid="2">
> >> >                  <fence>
> >> >                  <method name = "1">
> >> >                   <device lanplus = "" name="hpipmi" action
> ="reboot"/>
> >> >                   </method>
> >> >                </fence>
> >> >             </clusternode>
> >> >          </clusternodes>
> >> >
> >> >         <rm>
> >> >           <failoverdomains/>
> >> >         <resources/>
> >> >         <service autostart="1" exclusive="0" name="IP"
> >> > recovery="relocate">
> >> >                 <ip address="10.10.5.23" monitor_link="on"
> >> > sleeptime="10"/>
> >> >                 <fs device="/dev/mapper/datavg-lv_orau02"
> >> > force_unmount="1"
> >> > fstype="ext4" mountpoint="/u02" name="/u02"/>
> >> >                 <fs device="/dev/mapper/datavg-lv_orau03"
> >> > force_unmount="1"
> >> > fstype="ext4" mountpoint="/u03" name="/u03"/>
> >> >                 <fs device="/dev/mapper/datavg-lv_orau04"
> >> > force_unmount="1"
> >> > fstype="ext4" mountpoint="/u04" name="/u04"/>
> >> >                 <fs device="/dev/mapper/datavg-lv_orau05"
> >> > force_unmount="1"
> >> > fstype="ext4" mountpoint="/u05" name="/u05"/>
> >> >         <oracledb home="/u01/app/oracle/product/11.2.0.4/dbhome_1"
> >> > name="testvip" type="base" user="oracle" vhost="10.10.5.23"/>
> >> >         </service>
> >> > </rm>
> >> > </cluster>
> >> >
> >> >  2: show what kind of problem you find, doing what you want to archive
> >> > [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP
> >> > Running in test mode.
> >> > <debug>  Validating configuration for testvip
> >> > [oracledb] Validating configuration for testvip
> >> > basename: missing operand
> >> > Try `basename --help' for more information.
> >> > Failed to start IP
> >> >
> >> >
> >> > [root@/usr/share/cluster]# vi oracledb.sh
> >> > # Customize these to match your Oracle installation. #
> >> > ######################################################
> >> > #
> >> > # 1. Oracle user.  Must be the same across all cluster members.  In
> the
> >> > event
> >> > #    that this script is run by the super-user, it will automatically
> >> > switch
> >> > #    to the Oracle user and restart.  Oracle needs to run as the
> Oracle
> >> > #    user, not as root.
> >> > #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
> >> > [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle
> >> > #
> >> > # 2. Oracle home.  This is set up during the installation phase of
> >> > Oracle.
> >> > #    From the perspective of the cluster, this is generally the mount
> >> > point
> >> > #    you intend to use as the mount point for your Oracle
> Infrastructure
> >> > #    service.
> >> > #
> >> > #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home
> >> > [ -n "$ORACLE_HOME" ] ||
> >> > ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1
> >> > #
> >> > # 3. This is your SID.  This is set up during oracle installation as
> >> > well.
> >> > #
> >> > #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl
> >> > [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip
> >> > #
> >> > # 4. The oracle user probably doesn't have the permission to write to
> >> > # /var/lock/subsys, so use the user's home directory.
> >> > #
> >> > #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock"
> >> > [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock"
> >> > #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch
> >> > privileges
> >> > #
> >> > # 5. Type of Oracle Database.  Currently supported: 10g
> >> > 10g-iAS(untested!)
> >> > #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em"
> >> > [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base"
> >> > #
> >> > # 6. Oracle virtual hostname.  This is the hostname you gave Oracle
> >> > during
> >> > #    installation.
> >> > #
> >> > #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com
> >> >
> >> >
> >> >  3: show your cluster
> >> > root at xxxx]# clustat
> >> > Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
> >> > Member Status: Quorate
> >> >  Member Name                             ID   Status
> >> >  ------ ----                             ---- ------
> >> >  192.168.10.10                               1 Online, Local,
> rgmanager
> >> >  192.168.10.11                               2 Online, rgmanager
> >> >  Service Name                   Owner (Last)                   State
> >> >  ------- ----                   ----- ------                   -----
> >> >  service:IP                     192.168.10.10                  started
> >> >
> >> > [root at xxx]# clustat
> >> > Cluster Status for oracleha @ Tue Apr  8 11:10:30 2014
> >> > Member Status: Quorate
> >> >  Member Name                             ID   Status
> >> >  ------ ----                             ---- ------
> >> >  192.168.10.10                               1 Online,rgmanager
> >> >  192.168.10.11                               2 Online,Local,rgmanager
> >> >  Service Name                   Owner (Last)                   State
> >> >  ------- ----                   ----- ------                   -----
> >> >  service:IP                     192.168.10.11                  started
> >> >
> >> >
> >> >  4: without any error and some more information, is realy hard to help
> >> > you
> >> >
> >> > I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5  .
> >> > In normal working fine .
> >> > When I am using Oracle Service in Cluster.conf as above I am getting
> >> > ERROR
> >> > Try `basename --help' for more information.
> >> > Failed to start IP
> >> > Database Service unable to start.
> >> >
> >> > I have try all the options given in oracledb.sh  but same ERROR
> >> > Please help me .
> >> >
> >> > Thanks in advance.
> >> >
> >> >
> >> > On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura <emi2fast at gmail.com>
> >> > wrote:
> >> >>
> >> >> 1: show your config
> >> >> 2: show what kind of problem you find, doing what you want to archive
> >> >> 3: show your cluster
> >> >> 4: without any error and some more information, is realy hard to help
> >> >> you
> >> >>
> >> >> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui <amujeebs at gmail.com
> >:
> >> >> > Hello, I have to implemented  red hat linux 6.4 cluster suite and
> >> >> > trying
> >> >> > to
> >> >> > use Oracle11gr2 on it.But oracle service is unable to start.
> >> >> > Listener isnot starting.
> >> >> > Anyone have implemented oracle11gr2 so please
> >> >> > Send me cluster.conf and oracledb.sh and also listener.ora and
> >> >> > tnsnames.ora
> >> >> > files pls.
> >> >> > Thanks in advanced
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Linux-cluster mailing list
> >> >> > Linux-cluster at redhat.com
> >> >> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> esta es mi vida e me la vivo hasta que dios quiera
> >> >>
> >> >> --
> >> >> Linux-cluster mailing list
> >> >> Linux-cluster at redhat.com
> >> >> https://www.redhat.com/mailman/listinfo/linux-cluster
> >> >
> >> >
> >> >
> >> > --
> >> > Linux-cluster mailing list
> >> > Linux-cluster at redhat.com
> >> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >>
> >>
> >>
> >> --
> >> esta es mi vida e me la vivo hasta que dios quiera
> >>
> >> --
> >> Linux-cluster mailing list
> >> Linux-cluster at redhat.com
> >> https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
>
> --
> esta es mi vida e me la vivo hasta que dios quiera
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20140716/9c1bf715/attachment.htm>

From christoph at macht-blau.org  Wed Jul 23 15:53:56 2014
From: christoph at macht-blau.org (C. Handel)
Date: Wed, 23 Jul 2014 17:53:56 +0200
Subject: [Linux-cluster] corosync ring failure
Message-ID: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>

hi,

i run a cluster with two corosync rings. One of the rings is marked
faulty every fourty seconds, to immediately recover a second later.
the other ring is stable

i have no idea how i should debug this.


we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1
cluster consists of three machines. Ring1 is running on 10gigbit
interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their
respective switch.

corosync communication is udpu, rrp_mode is passive

cluster.conf:

<cluster config_version="30" name="aslfile">

<cman transport="udpu">
</cman>

<fence_daemon post_join_delay="120" post_fail_delay="30"/>

<fencedevices>
        <fencedevice name="pcmk" agent="fence_pcmk" action="off"/>
</fencedevices>

<quorumd
   cman_label="qdisk"
   device="/dev/mapper/mpath-091quorump1"
   min_score="1"
   votes="2"
   >
</quorumd>

<clusternodes>
<clusternode name="asl430m90" nodeid="430">
        <altname name="asl430"/>
        <fence>
                <method name="pcmk-redirect">
                        <device name="pcmk" port="asl430m90"/>
                </method>
        </fence>
</clusternode>
<clusternode name="asl431m90" nodeid="431">
        <altname name="asl431"/>
        <fence>
                <method name="pcmk-redirect">
                        <device name="pcmk" port="asl431m90"/>
                </method>
        </fence>
</clusternode>
<clusternode name="asl432m90" nodeid="432">
        <altname name="asl432"/>
        <fence>
                <method name="pcmk-redirect">
                        <device name="pcmk" port="asl432m90"/>
                </method>
        </fence>
</clusternode>
</clusternodes>
</cluster>


syslog


Jul 23 17:48:34 asl431 corosync[3254]:   [TOTEM ] Marking ringid 1
interface 140.181.134.212 FAULTY
Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
Jul 23 17:49:14 asl431 corosync[3254]:   [TOTEM ] Marking ringid 1
interface 140.181.134.212 FAULTY
Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1



Greetings
   Christoph



From lists at alteeve.ca  Wed Jul 23 16:16:48 2014
From: lists at alteeve.ca (Digimer)
Date: Wed, 23 Jul 2014 12:16:48 -0400
Subject: [Linux-cluster] corosync ring failure
In-Reply-To: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>
References: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>
Message-ID: <53CFDFF0.3010504@alteeve.ca>

Any logs in the switch? Is the multicast group being deleted/recreated?

On 23/07/14 11:53 AM, C. Handel wrote:
> hi,
>
> i run a cluster with two corosync rings. One of the rings is marked
> faulty every fourty seconds, to immediately recover a second later.
> the other ring is stable
>
> i have no idea how i should debug this.
>
>
> we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1
> cluster consists of three machines. Ring1 is running on 10gigbit
> interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their
> respective switch.
>
> corosync communication is udpu, rrp_mode is passive
>
> cluster.conf:
>
> <cluster config_version="30" name="aslfile">
>
> <cman transport="udpu">
> </cman>
>
> <fence_daemon post_join_delay="120" post_fail_delay="30"/>
>
> <fencedevices>
>          <fencedevice name="pcmk" agent="fence_pcmk" action="off"/>
> </fencedevices>
>
> <quorumd
>     cman_label="qdisk"
>     device="/dev/mapper/mpath-091quorump1"
>     min_score="1"
>     votes="2"
>     >
> </quorumd>
>
> <clusternodes>
> <clusternode name="asl430m90" nodeid="430">
>          <altname name="asl430"/>
>          <fence>
>                  <method name="pcmk-redirect">
>                          <device name="pcmk" port="asl430m90"/>
>                  </method>
>          </fence>
> </clusternode>
> <clusternode name="asl431m90" nodeid="431">
>          <altname name="asl431"/>
>          <fence>
>                  <method name="pcmk-redirect">
>                          <device name="pcmk" port="asl431m90"/>
>                  </method>
>          </fence>
> </clusternode>
> <clusternode name="asl432m90" nodeid="432">
>          <altname name="asl432"/>
>          <fence>
>                  <method name="pcmk-redirect">
>                          <device name="pcmk" port="asl432m90"/>
>                  </method>
>          </fence>
> </clusternode>
> </clusternodes>
> </cluster>
>
>
> syslog
>
>
> Jul 23 17:48:34 asl431 corosync[3254]:   [TOTEM ] Marking ringid 1
> interface 140.181.134.212 FAULTY
> Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
> Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
> Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
> Jul 23 17:49:14 asl431 corosync[3254]:   [TOTEM ] Marking ringid 1
> interface 140.181.134.212 FAULTY
> Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
> Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
> Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ] Automatically recovered ring 1
>
>
>
> Greetings
>     Christoph
>


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



From morpheus.ibis at gmail.com  Wed Jul 23 20:00:09 2014
From: morpheus.ibis at gmail.com (Pavel Herrmann)
Date: Wed, 23 Jul 2014 22:00:09 +0200
Subject: [Linux-cluster] corosync ring failure
In-Reply-To: <53CFDFF0.3010504@alteeve.ca>
References: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>
	<53CFDFF0.3010504@alteeve.ca>
Message-ID: <24252688.qLsjkavrKW@bloomfield>

Hi,

On Wednesday 23 of July 2014 12:16:48 Digimer wrote:
> Any logs in the switch? Is the multicast group being deleted/recreated?

I believe there would be no multicast for UDPU transport

Can you check to see if any of the devices (servers and switches) is dropping 
UDP packets, be it for congestion or damage?

regards,
Pavel

> 
> On 23/07/14 11:53 AM, C. Handel wrote:
> > hi,
> > 
> > i run a cluster with two corosync rings. One of the rings is marked
> > faulty every fourty seconds, to immediately recover a second later.
> > the other ring is stable
> > 
> > i have no idea how i should debug this.
> > 
> > 
> > we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1
> > cluster consists of three machines. Ring1 is running on 10gigbit
> > interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their
> > respective switch.
> > 
> > corosync communication is udpu, rrp_mode is passive
> > 
> > cluster.conf:
> > 
> > <cluster config_version="30" name="aslfile">
> > 
> > <cman transport="udpu">
> > </cman>
> > 
> > <fence_daemon post_join_delay="120" post_fail_delay="30"/>
> > 
> > <fencedevices>
> > 
> >          <fencedevice name="pcmk" agent="fence_pcmk" action="off"/>
> > 
> > </fencedevices>
> > 
> > <quorumd
> > 
> >     cman_label="qdisk"
> >     device="/dev/mapper/mpath-091quorump1"
> >     min_score="1"
> >     votes="2"
> > 
> > </quorumd>
> > 
> > <clusternodes>
> > <clusternode name="asl430m90" nodeid="430">
> > 
> >          <altname name="asl430"/>
> >          <fence>
> >          
> >                  <method name="pcmk-redirect">
> >                  
> >                          <device name="pcmk" port="asl430m90"/>
> >                  
> >                  </method>
> >          
> >          </fence>
> > 
> > </clusternode>
> > <clusternode name="asl431m90" nodeid="431">
> > 
> >          <altname name="asl431"/>
> >          <fence>
> >          
> >                  <method name="pcmk-redirect">
> >                  
> >                          <device name="pcmk" port="asl431m90"/>
> >                  
> >                  </method>
> >          
> >          </fence>
> > 
> > </clusternode>
> > <clusternode name="asl432m90" nodeid="432">
> > 
> >          <altname name="asl432"/>
> >          <fence>
> >          
> >                  <method name="pcmk-redirect">
> >                  
> >                          <device name="pcmk" port="asl432m90"/>
> >                  
> >                  </method>
> >          
> >          </fence>
> > 
> > </clusternode>
> > </clusternodes>
> > </cluster>
> > 
> > 
> > syslog
> > 
> > 
> > Jul 23 17:48:34 asl431 corosync[3254]:   [TOTEM ] Marking ringid 1
> > interface 140.181.134.212 FAULTY
> > Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ] Automatically recovered
> > ring 1 Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ] Automatically
> > recovered ring 1 Jul 23 17:48:35 asl431 corosync[3254]:   [TOTEM ]
> > Automatically recovered ring 1 Jul 23 17:49:14 asl431 corosync[3254]:  
> > [TOTEM ] Marking ringid 1 interface 140.181.134.212 FAULTY
> > Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ] Automatically recovered
> > ring 1 Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ] Automatically
> > recovered ring 1 Jul 23 17:49:15 asl431 corosync[3254]:   [TOTEM ]
> > Automatically recovered ring 1
> > 
> > 
> > 
> > Greetings
> > 
> >     Christoph



From christoph at macht-blau.org  Thu Jul 24 07:30:01 2014
From: christoph at macht-blau.org (C. Handel)
Date: Thu, 24 Jul 2014 09:30:01 +0200
Subject: [Linux-cluster] corosync ring failure
In-Reply-To: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>
References: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>
Message-ID: <CADjFtTk1aPVxSuvir6Dm6GxCpGwP8gyQDR5bzP6dk2eqEL-NxQ@mail.gmail.com>

>>> i run a cluster with two corosync rings. One of the rings is marked
>>> faulty every fourty seconds, to immediately recover a second later.
>>> the other ring is stable
>>>
>>> i have no idea how i should debug this.
>>>
>>>
>>> we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1
>>> cluster consists of three machines. Ring1 is running on 10gigbit
>>> interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their
>>> respective switch.

>> Any logs in the switch? Is the multicast group being deleted/recreated?

> believe there would be no multicast for UDPU transport

>Can you check to see if any of the devices (servers and switches) is >dropping
>UDP packets, be it for congestion or damage?

the switch has no load, interface utilization is below 10%, no crc
errors on the ports and no errors in the log. On the same switch a
second cluster (four machines, similiar config) is running fine.

Greetings
   Christoph



From misch at schwartzkopff.org  Thu Jul 24 07:42:08 2014
From: misch at schwartzkopff.org (Michael Schwartzkopff)
Date: Thu, 24 Jul 2014 09:42:08 +0200
Subject: [Linux-cluster] corosync ring failure
In-Reply-To: <CADjFtTk1aPVxSuvir6Dm6GxCpGwP8gyQDR5bzP6dk2eqEL-NxQ@mail.gmail.com>
References: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>
	<CADjFtTk1aPVxSuvir6Dm6GxCpGwP8gyQDR5bzP6dk2eqEL-NxQ@mail.gmail.com>
Message-ID: <3854552.a8rUyGa0b9@nb003>

Am Donnerstag, 24. Juli 2014, 09:30:01 schrieb C. Handel:
> >>> i run a cluster with two corosync rings. One of the rings is marked
> >>> faulty every fourty seconds, to immediately recover a second later.
> >>> the other ring is stable
> >>> 
> >>> i have no idea how i should debug this.
> >>> 
> >>> 
> >>> we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1
> >>> cluster consists of three machines. Ring1 is running on 10gigbit
> >>> interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their
> >>> respective switch.
> >> 
> >> Any logs in the switch? Is the multicast group being deleted/recreated?
> > 
> > believe there would be no multicast for UDPU transport
> >
> >Can you check to see if any of the devices (servers and switches) is
> >>dropping UDP packets, be it for congestion or damage?
> 
> the switch has no load, interface utilization is below 10%, no crc
> errors on the ports and no errors in the log. On the same switch a
> second cluster (four machines, similiar config) is running fine.

Any Spanning Tree Problems? Dou you have any bridges (i.e. for virtual 
machines) configured in your setup?

did you do some debug on your switch?

Greetings,

-- 
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 M?nchen

Tel: (0162) 1650044
Fax: (089) 620 304 13
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: This is a digitally signed message part.
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20140724/e2d333af/attachment.sig>

From yvette at dbtgroup.com  Thu Jul 24 16:57:41 2014
From: yvette at dbtgroup.com (Yvette S. Hirth, CCP, CDP)
Date: Thu, 24 Jul 2014 09:57:41 -0700
Subject: [Linux-cluster] corosync ring failure
In-Reply-To: <CADjFtTk1aPVxSuvir6Dm6GxCpGwP8gyQDR5bzP6dk2eqEL-NxQ@mail.gmail.com>
References: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>
	<CADjFtTk1aPVxSuvir6Dm6GxCpGwP8gyQDR5bzP6dk2eqEL-NxQ@mail.gmail.com>
Message-ID: <53D13B05.7050505@dbtgroup.com>

On 07/24/2014 12:30 AM, C. Handel wrote:

> the switch has no load, interface utilization is below 10%, no crc
> errors on the ports and no errors in the log. On the same switch a
> second cluster (four machines, similiar config) is running fine.

hi,

did you vlan the switches so the two clusters are "logically separate"? 
  if they're on the same VLAN they might interfere with each other...

also i second Michael Schwartzkopff's suggestion of looking into 
Spanning Tree Protocol (STP).  if your switches (i'm assUming you're 
using two) are not stacked(1), you may be running into that, as well.

hth
yvette

(1) - stacking switches allows for double bandwidth operations, and can 
eliminate the need / requirement for STP - tho STP may still be in use.



From christoph at macht-blau.org  Fri Jul 25 06:22:09 2014
From: christoph at macht-blau.org (C. Handel)
Date: Fri, 25 Jul 2014 08:22:09 +0200
Subject: [Linux-cluster] corosync ring failure
In-Reply-To: <53D13B05.7050505@dbtgroup.com>
References: <CADjFtTmrQ2_M2EFzxwCEgLMf++akcV=-f9BMQp-YDSYD1H_=aQ@mail.gmail.com>
	<CADjFtTk1aPVxSuvir6Dm6GxCpGwP8gyQDR5bzP6dk2eqEL-NxQ@mail.gmail.com>
	<53D13B05.7050505@dbtgroup.com>
Message-ID: <CADjFtTnaXspDAp=YTT62cCuDZPV3Xwu5WutK__8S9fzmrZzMVA@mail.gmail.com>

>> the switch has no load, interface utilization is below 10%, no crc
>> errors on the ports and no errors in the log. On the same switch a
>> second cluster (four machines, similiar config) is running fine.
>
> did you vlan the switches so the two clusters are "logically separate"?  if
> they're on the same VLAN they might interfere with each other...
>
> also i second Michael Schwartzkopff's suggestion of looking into Spanning
> Tree Protocol (STP).  if your switches (i'm assUming you're using two) are
> not stacked(1), you may be running into that, as well.

There are vlans and spanning trees and stacking.

Both (actualy three) cluster are in the same vlans. One vlan for
cluster internals and one for external. One internal ring, one
external ring. On the internal ring each Cluster uses it's own IP
subnet. Communication is udpu and should not interfere.

Spanning tree has no events. All ports are always in forwarding mode.
As ring failure happens every 40 seconds i think it is unlikely for
spanning tree to be the reason.

i pulled a wiredump on one of the nodes (432). But i can't really make
sense of it.

orf packets from node 431 arive and i send out orf packets to the node
430. So the ring looks fine.

I modified the cluster config to include

<dlm protocol="sctp"/> <!-- missed the note in "man cman" about this -->
<totem rrp_mode="active" /> <!-- missed this note also -->

rebooted all nodes (for another reason) and everything looks fine now.
No idea if it is the config change or the reboot. I will pull another
wiredump and compare.


Greetings
   Christoph



From laszlo.budai at acceleris.ro  Tue Jul 29 16:58:59 2014
From: laszlo.budai at acceleris.ro (Laszlo Budai)
Date: Tue, 29 Jul 2014 19:58:59 +0300
Subject: [Linux-cluster] RHEL 6.3  - qdisk lost
Message-ID: <53D7D2D3.10603@acceleris.ro>

Dear All,

I have a two node cluster (rgmanager-3.0.12.1-17.el6.x86_64) with a 
shared storage. The storage contains the quorum disk also.
there are some services, and there are some dependencies set.
We are testing what is happening if the storage is disconnected from one 
node (to see the cluster response for such a failure).
So we start from a good cluster (all is OK) and we disconnect the 
storage from the first node.

What I have observed:
1. the cluster is fencing node 1
2. node 2 is trying to start the services, but even if we have 3 
services (let's say B,C,D) which are depending on a fourth one (say A) 
the cluster is trying to start the services in this order: B,C,D,A. 
Obviously it fails for B,C,D and gives us the following messages:

Jul 29 15:49:54 node1 rgmanager[5135]: service:B is not runnable; 
dependency not met
Jul 29 15:49:54 node2 rgmanager[5135]: Not stopping service:B: service 
is failed
Jul 29 15:49:54 node2 rgmanager[5135]: Unable to stop RG service:B in 
recoverable state

it will leave them in "recoverable" state even if service A will start 
successfully (so the dependency would be met now). Why is this happening?

I would expect the rgmanager to start the services in an order that 
would satisfy the dependency relationships. Or if it is not doing that , 
then at least to react to the service state change event (service A has 
started, so dependencies should be evaluated again).

What can be done about it?

Thank you in advance,
Laszlo