From amjadcsu at gmail.com Tue Jul 1 08:27:26 2014
From: amjadcsu at gmail.com (Amjad Syed)
Date: Tue, 1 Jul 2014 11:27:26 +0300
Subject: [Linux-cluster] Virtual IP service
Message-ID:

Hello,

I am trying to start a virtual IP service on my 2-node cluster. Here are the details of the network settings and configuration.

1. Bond (heartbeat). This is a private network with no switch involved, not available to the public. node1: 192.168.10.11, node2: 192.168.10.10
2. Fencing (iLO). This one goes through a switch. node1: 10.10.63.92, node2: 10.10.63.93
3. Public IP addresses. node1: 10.10.5.100, node2: 10.10.5.20

I have set the virtual IP to 10.10.5.23 in cluster.conf. However, this virtual IP does not work, since the cman communication is on the 192.168.10.x network. When I try to set cman to the 10.10.5.x network, the nodes go into a fence loop, i.e. they fence each other. So I am asking: is there a "network preference" option or similar in cluster.conf that can map the virtual IP to the private network addresses?

Thank you
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mgrac at redhat.com Tue Jul 1 11:46:01 2014
From: mgrac at redhat.com (Marek Grac)
Date: Tue, 01 Jul 2014 13:46:01 +0200
Subject: [Linux-cluster] fence-agents-4.0.10 release
Message-ID: <53B29F79.9020902@redhat.com>

Welcome to the fence-agents 4.0.10 release. This release includes a new fence agent for Docker (thanks to Ondrej Mular) and several bugfixes:

* fence_scsi is reimplemented on top of the fencing library
* fence_zvm supports distributed z/VM systems
* support for --delay was added to fence_zvm
* unmaintained fence agents were removed:
  * fence_baytech, fence_bullpap, fence_cpint, fence_mcdata,
  * fence_rackswitch, fence_vixel, fence_xcat
  * we do not plan to remove other agents
* update fence_rsb to work with new firmware

The new source tarball can be downloaded here:
https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-4.0.10.tar.xz

To report bugs or issues: https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other system administrators or power users.

Thanks and congratulations to all the people who contributed to achieve this great milestone.

m,

From ekuric at redhat.com Fri Jul 4 09:20:25 2014
From: ekuric at redhat.com (Elvir Kuric)
Date: Fri, 04 Jul 2014 11:20:25 +0200
Subject: [Linux-cluster] Error in Cluster.conf
In-Reply-To:
References: <53A96784.3030009@redhat.com> <20140624125500.GA1425@redhat.com> <53A99D5E.2080508@alteeve.ca>
Message-ID: <53B671D9.9020501@redhat.com>

On 06/24/2014 08:44 PM, Amjad Syed wrote:
> I have updated the config file, validated by ccs_config_validate.
> Added the fence_daemon and post_join_delay. I am using bonding using
> ethernet coaxial cable.

OK, check which bonding modes are supported (this depends on the OS used; they differ between RHEL (CentOS) 5 and RHEL (CentOS) 6).

> But for some reason whenever I start CMAN on a node, it fences (kicks
> the other node). As a result only one node is online at a time. Do I
> need to use multicast to get both nodes online at the same instant?

It would be good to see the logs from the surviving node from before it decides to fence its peer. That said, boot the machines at the same time and watch the logs: there will be a reason (in the surviving node's logs) why it thinks its peer is not in a good state and needs to be fenced. Multicast is used by default, and that traffic needs to be allowed on the cluster network.
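For reference, a minimal sketch of the cman stanza variants discussed in the rest of this reply; the two_node/expected_votes attributes are just the usual settings for a two-node cman cluster and are not taken from the poster's cluster.conf:

   <cman two_node="1" expected_votes="1"/>                    (default: multicast)
   <cman two_node="1" expected_votes="1" broadcast="yes"/>    (test with broadcast)
   <cman two_node="1" expected_votes="1" transport="udpu"/>   (RHEL 6 / CentOS 6: unicast UDP)

Only one such <cman/> element is present in cluster.conf at a time, directly under the <cluster> element.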
You can rule out issue with muticast if you for test purposes in cluster.conf change to if issue is not visible with broadcast="yes" then you can say that multicast could be issue ( and then you can work to fix that ). If you have RHEL 6 / CentOS you can also try with unicast udp ( udpu, by adding transport="udpu" in above cman stanza , more in doc : https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s1-unicast-traffic-CA.html ) Also you must ensure that fencing is working properly, I recommend to take time and to read : https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Fence_Configuration_Guide/index.html If still there is issue after all your tests with cluster and if you have valid Red Hat subscription ( with proper support level : Standard or Premium ) then you can visit Red Hat Customer portal https://access.redhat.com and open case with Red Hat Support where we can work to fix issue. Kind regards, Elvir Kuric > or i am missing something here ? > > Now the file looks like this : > > > ?xml version="1.0"?> > > > > login="ADMIN" name="inspuripmi" passwd="xxxx"/> > login="test" name="hpipmi" passwd="xxxx"/> > > > > > > > ="reboot"/> > > > > > > > > > > > > > > > > > > recovery="relocate"> > sleeptime="10"/> > > > > > Thanks > > > On Tue, Jun 24, 2014 at 6:46 PM, Digimer > wrote: > > On 24/06/14 08:55 AM, Jan Pokorn? wrote: > > On 24/06/14 13:56 +0200, Fabio M. Di Nitto wrote: > > On 6/24/2014 12:32 PM, Amjad Syed wrote: > > Hello > > I am getting the following error when i run > ccs_config_Validate > > ccs_config_validate > Relax-NG validity error : Extra element clusternodes > in interleave > > > You defined > > That + the are more issues discoverable by more powerful validator > jing (packaged in Fedora and RHEL 7, for instance, admittedly not > for RHEL 6/EPEL): > > $ jing cluster.rng cluster.conf > > cluster.conf:13:47: error: > element "fencedvice" not allowed anywhere; expected the > element > end-tag or element "fencedevice" > cluster.conf:15:23: error: > element "clusternodes" not allowed here; expected the > element > end-tag or element "clvmd", "dlm", "fence_daemon", > "fence_xvmd", > "gfs_controld", "group", "logging", "quorumd", "rm", > "totem" or > "uidgid" > cluster.conf:26:76: error: > IDREF "fence_node2" without matching ID > cluster.conf:19:77: error: > IDREF "fence_node1" without matching ID > > > So it spotted also: > - a typo in "fencedvice" > - broken referential integrity; it is prescribed "name" attribute > of "device" tag should match a "name" of a defined > "fencedevice" > > Hope this helps. > > -- Jan > > > Also, without fence methods defined for the nodes, rgmanager will > block the first time there is an issue. > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person > without access to education? > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- Elvir Kuric,TSE / Red Hat / GSS EMEA / -------------- next part -------------- An HTML attachment was scrubbed... URL: From laszlo.budai at acceleris.ro Thu Jul 10 11:49:09 2014 From: laszlo.budai at acceleris.ro (Laszlo Budai) Date: Thu, 10 Jul 2014 14:49:09 +0300 Subject: [Linux-cluster] Cman not start when quorum disk is not available Message-ID: <53BE7DB5.3080203@acceleris.ro> Dear all, we have a RHEL 6.3 cluster of two nodes and a quorum disk. 
We are testing the cluster against different failures. We have a problem when the shared storage is disconnected from one of the nodes. The node that has lost contact with the storage is fenced, but when restarting the machine cman will not start up (it will try to start but it will stop): Jul 9 17:55:54 clnode1p kdump: started up Jul 9 17:55:54 clnode1p kernel: bond0: no IPv6 routers present Jul 9 17:55:54 clnode1p kernel: DLM (built Jun 13 2012 18:26:45) installed Jul 9 17:55:55 clnode1p corosync[2514]: [MAIN ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service. Jul 9 17:55:55 clnode1p corosync[2514]: [MAIN ] Corosync built-in features: nss dbus rdma snmp Jul 9 17:55:55 clnode1p corosync[2514]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf Jul 9 17:55:55 clnode1p corosync[2514]: [MAIN ] Successfully parsed cman config Jul 9 17:55:55 clnode1p corosync[2514]: [TOTEM ] Initializing transport (UDP/IP Multicast). Jul 9 17:55:55 clnode1p corosync[2514]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). Jul 9 17:55:55 clnode1p corosync[2514]: [TOTEM ] The network interface [172.16.255.1] is now up. Jul 9 17:55:55 clnode1p corosync[2514]: [QUORUM] Using quorum provider quorum_cman Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 Jul 9 17:55:55 clnode1p corosync[2514]: [CMAN ] CMAN 3.0.12.1 (built May 8 2012 12:22:26) started Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90 Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: openais checkpoint service B.01.01 Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: corosync extended virtual synchrony service Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: corosync configuration service Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: corosync cluster config database access v1.01 Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: corosync profile loading service Jul 9 17:55:55 clnode1p corosync[2514]: [QUORUM] Using quorum provider quorum_cman Jul 9 17:55:55 clnode1p corosync[2514]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 Jul 9 17:55:55 clnode1p corosync[2514]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine. Jul 9 17:55:55 clnode1p corosync[2514]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Jul 9 17:55:55 clnode1p corosync[2514]: [QUORUM] Members[1]: 1 Jul 9 17:55:55 clnode1p corosync[2514]: [QUORUM] Members[1]: 1 Jul 9 17:55:55 clnode1p corosync[2514]: [CPG ] chosen downlist: sender r(0) ip(172.16.255.1) ; members(old:0 left:0) Jul 9 17:55:55 clnode1p corosync[2514]: [MAIN ] Completed service synchronization, ready to provide service. Jul 9 17:55:55 clnode1p corosync[2514]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Jul 9 17:55:55 clnode1p corosync[2514]: [CMAN ] quorum regained, resuming activity Jul 9 17:55:55 clnode1p corosync[2514]: [QUORUM] This node is within the primary component and will provide service. 
Jul 9 17:55:55 clnode1p corosync[2514]: [QUORUM] Members[2]: 1 2 Jul 9 17:55:55 clnode1p corosync[2514]: [QUORUM] Members[2]: 1 2 Jul 9 17:55:55 clnode1p corosync[2514]: [CPG ] chosen downlist: sender r(0) ip(172.16.255.1) ; members(old:1 left:0) Jul 9 17:55:55 clnode1p corosync[2514]: [MAIN ] Completed service synchronization, ready to provide service. Jul 9 17:55:59 clnode1p kernel: bond1: no IPv6 routers present Jul 9 17:55:59 clnode1p qdiskd[2564]: Loading dynamic configuration Jul 9 17:55:59 clnode1p qdiskd[2564]: Setting votes to 1 Jul 9 17:55:59 clnode1p qdiskd[2564]: Loading static configuration Jul 9 17:55:59 clnode1p qdiskd[2564]: Timings: 8 tko, 1 interval Jul 9 17:55:59 clnode1p qdiskd[2564]: Timings: 2 tko_up, 4 master_wait, 2 upgrade_wait Jul 9 17:55:59 clnode1p qdiskd[2564]: Heuristic: '/bin/ping -c1 -w1 clswitch1m' score=1 interval=2 tko=4 Jul 9 17:55:59 clnode1p qdiskd[2564]: Heuristic: '/bin/ping -c1 -w1 clswitch2m' score=1 interval=2 tko=4 Jul 9 17:55:59 clnode1p qdiskd[2564]: 2 heuristics loaded Jul 9 17:55:59 clnode1p qdiskd[2564]: Quorum Daemon: 2 heuristics, 1 interval, 8 tko, 1 votes Jul 9 17:55:59 clnode1p qdiskd[2564]: Run Flags: 00000271 Jul 9 17:55:59 clnode1p qdiskd[2564]: stat Jul 9 17:55:59 clnode1p qdiskd[2564]: qdisk_validate: No such file or directory Jul 9 17:55:59 clnode1p qdiskd[2564]: Specified partition /dev/mapper/apsto1-vd01-v001 does not have a qdisk label Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Unloading all Corosync service engines. Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Service engine unloaded: corosync extended virtual synchrony service Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Service engine unloaded: corosync configuration service Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01 Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Service engine unloaded: corosync cluster config database access v1.01 Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Service engine unloaded: corosync profile loading service Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Service engine unloaded: openais checkpoint service B.01.01 Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Service engine unloaded: corosync CMAN membership service 2.90 Jul 9 17:56:01 clnode1p corosync[2514]: [SERV ] Service engine unloaded: corosync cluster quorum service v0.1 Jul 9 17:56:01 clnode1p corosync[2514]: [MAIN ] Corosync Cluster Engine exiting with status 0 at main.c:1864. And it will remain in this state even if the storage is reattached later on. So now I have only one functioning node. What can be done to fix this (to have the cluster framework started)? Thank you, Laszlo -- Acceleris System Integration | and IT works Laszlo Budai | Technical Consultant Bvd. Barbu Vacarescu 80 | RO-020282 Bucuresti t +40 21 23 11 538 laszlo.budai at acceleris.ro | www.acceleris.ro Acceleris Offices are in: Basel | Bucharest | Zollikofen | Renens | Kloten From amujeebs at gmail.com Tue Jul 15 14:18:30 2014 From: amujeebs at gmail.com (abdul mujeeb Siddiqui) Date: Tue, 15 Jul 2014 17:18:30 +0300 Subject: [Linux-cluster] Basename mismatch Message-ID: Hello, I have to implemented red hat linux 6.4 cluster suite and trying to use Oracle11gr2 on it.But oracle service is unable to start. Listener isnot starting. Anyone have implemented oracle11gr2 so please Send me cluster.conf and oracledb.sh and also listener.ora and tnsnames.ora files pls. 
Thanks in advanced -------------- next part -------------- An HTML attachment was scrubbed... URL: From devin.bougie at cornell.edu Tue Jul 15 15:36:54 2014 From: devin.bougie at cornell.edu (Devin A. Bougie) Date: Tue, 15 Jul 2014 15:36:54 +0000 Subject: [Linux-cluster] mixed 6.4 and 6.5 cluster - delays accessing mpath devices and clustered lvm's Message-ID: <86A230DF-E57C-45C4-8949-E2E15BB75411@cornell.edu> We have a cluster of EL6.4 servers, with one server at fully updated EL6.5. After upgrading to 6.5, we see unreasonably long delays accessing some mpath devices and clustered lvm's on the 6.5 member. There are no problems with the 6.4 members. This can be seen by strace'ing lvscan. In the following example, syscall time is at the end of the line, reads with ascii text are mpath devices, the rest are volumes: ------ 16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.467385> 16241 read(5, "\17u\21^ LVM2 x[5A%r0N*>\1\0\0\0\0\20\0\0\0\0\0\0"..., 4096) = 4096 <1.760943> 16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.164032> 16241 read(5, "gment1 {\nstart_extent = 0\nextent"..., 4096) = 4096 <2.859972> 16241 read(5, "\353H\220\20\216\320\274\0\260\270\0\0\216\330\216\300\373\276\0|\277\0\6\271\0\2\363\244\352!\6\0"..., 4096) = 4096 <1.717222> 16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.476014> 16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.800225> 16241 read(5, "3\300\216\320\274\0|\216\300\216\330\276\0|\277\0\6\271\0\2\374\363\244Ph\34\6\313\373\271\4\0"..., 4096) = 4096 <2.008620> 16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <2.021734> 16241 read(5, "3\300\216\320\274\0|\216\300\216\330\276\0|\277\0\6\271\0\2\374\363\244Ph\34\6\313\373\271\4\0"..., 4096) = 4096 <2.126359> 16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <2.036027> 16241 read(5, "\1\4\0\0\21\4\0\0!\4\0\0\331[\362\37\2\0\4\0\0\0\0\0\0\0\0\0\356\37U\23"..., 4096) = 4096 <1.330302> 16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.381982> 16241 read(5, "vgift3 {\nid = \"spdYGc-5hqc-ejzd-"..., 8192) = 8192 <0.922098> 16241 read(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <2.440282> 16241 read(6, "vgift3 {\nid = \"spdYGc-5hqc-ejzd-"..., 8192) = 8192 <1.158817> 16241 read(5, "gment1 {\nstart_extent = 0\nextent"..., 4096) = 4096 <0.941814> 16241 read(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096 <1.518448> 16241 read(6, "gment1 {\nstart_extent = 0\nextent"..., 20480) = 20480 <2.006777> ------ The delay can also be seen in the syslog messages we receive after restarting clvmd with debugging enabled. 
------ Jul 14 11:47:58 lnx05 lvm[13423]: Got new connection on fd 5 Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28 Jul 14 11:48:03 lnx05 lvm[13423]: creating pipe, [11, 12] Jul 14 11:48:03 lnx05 lvm[13423]: Creating pre&post thread Jul 14 11:48:03 lnx05 lvm[13423]: Created pre&post thread, state = 0 Jul 14 11:48:03 lnx05 lvm[13423]: in sub thread: client = 0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift5' at 1 (client=0x13e7460) Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: 'V_vgift5' mode:3 flags=0 Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: returning lkid 24c0008 Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12 Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0 Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3443, flags=0x1 (LOCAL) Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3443 Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift5', cmd = 0x1 LCK_VG (READ|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0 Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift5 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1 Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition... Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 31 Jul 14 11:48:03 lnx05 lvm[13423]: check_all_clvmds_running Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition... Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12 Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0 Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3444, flags=0x0 () Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. 
client=0x13e7460, msg=0x13e27f0, len=31, csid=(nil), xid=3444 Jul 14 11:48:03 lnx05 lvm[13423]: Sending message to all cluster nodes Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: SYNC_NAMES (0x2d) msg=0x13e7110, msglen =31, client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Syncing device names Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx01-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 2 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx02-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 3 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx04-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 4 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx07-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 5 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx06-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 6 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx08-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 7 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx09-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 8 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx03-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 9 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition... Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28 Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition... Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift5' at 6 (client=0x13e7460) Jul 14 11:48:03 lnx05 lvm[13423]: sync_unlock: 'V_vgift5' lkid:24c0008 Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12 Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0 Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3445, flags=0x1 (LOCAL) Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3445 Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift5', cmd = 0x6 LCK_VG (UNLOCK|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0 Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift5 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1 Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition... 
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28 Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition... Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift3' at 1 (client=0x13e7460) Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: 'V_vgift3' mode:3 flags=0 Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: returning lkid 166000b Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12 Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0 Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3446, flags=0x1 (LOCAL) Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3446 Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift3', cmd = 0x1 LCK_VG (READ|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0 Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift3 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1 Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition... Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 31 Jul 14 11:48:03 lnx05 lvm[13423]: check_all_clvmds_running Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition... Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12 Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0 Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3447, flags=0x0 () Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. 
client=0x13e7460, msg=0x13e27f0, len=31, csid=(nil), xid=3447 Jul 14 11:48:03 lnx05 lvm[13423]: Sending message to all cluster nodes Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: SYNC_NAMES (0x2d) msg=0x13e7110, msglen =31, client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Syncing device names Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx01-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 2 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx02-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 3 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx04-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 4 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx07-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 5 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx06-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 6 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx08-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 7 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx09-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 8 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx03-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 9 replies, expecting: 9 Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition... Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28 Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition... Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift3' at 6 (client=0x13e7460) Jul 14 11:48:03 lnx05 lvm[13423]: sync_unlock: 'V_vgift3' lkid:166000b Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12 Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0 Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3448, flags=0x1 (LOCAL) Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3448 Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift3', cmd = 0x6 LCK_VG (UNLOCK|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0 Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift3 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1 Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition... 
Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply Jul 14 11:48:03 lnx05 lvm[13423]: Read on local socket 5, len = 28 Jul 14 11:48:03 lnx05 lvm[13423]: Got pre command condition... Jul 14 11:48:03 lnx05 lvm[13423]: doing PRE command LOCK_VG 'V_vgift2' at 1 (client=0x13e7460) Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: 'V_vgift2' mode:3 flags=0 Jul 14 11:48:03 lnx05 lvm[13423]: sync_lock: returning lkid 3b20007 Jul 14 11:48:03 lnx05 lvm[13423]: Writing status 0 down pipe 12 Jul 14 11:48:03 lnx05 lvm[13423]: Waiting to do post command - state = 0 Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: distribute command: XID = 3449, flags=0x1 (LOCAL) Jul 14 11:48:03 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=28, csid=(nil), xid=3449 Jul 14 11:48:03 lnx05 lvm[13423]: process_work_item: local Jul 14 11:48:03 lnx05 lvm[13423]: process_local_command: LOCK_VG (0x33) msg=0x13e7110, msglen =28, client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: do_lock_vg: resource 'V_vgift2', cmd = 0x1 LCK_VG (READ|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0 Jul 14 11:48:03 lnx05 lvm[13423]: Invalidating cached metadata for VG vgift2 Jul 14 11:48:03 lnx05 lvm[13423]: Reply from node lnx05-p12: 0 bytes Jul 14 11:48:03 lnx05 lvm[13423]: Got 1 replies, expecting: 1 Jul 14 11:48:03 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:03 lnx05 lvm[13423]: Got post command condition... Jul 14 11:48:03 lnx05 lvm[13423]: Waiting for next pre command Jul 14 11:48:03 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:03 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:03 lnx05 lvm[13423]: Send local reply Jul 14 11:48:04 lnx05 lvm[13423]: Read on local socket 5, len = 31 Jul 14 11:48:04 lnx05 lvm[13423]: check_all_clvmds_running Jul 14 11:48:04 lnx05 lvm[13423]: Got pre command condition... Jul 14 11:48:04 lnx05 lvm[13423]: Writing status 0 down pipe 12 Jul 14 11:48:04 lnx05 lvm[13423]: Waiting to do post command - state = 0 Jul 14 11:48:04 lnx05 lvm[13423]: read on PIPE 11: 4 bytes: status: 0 Jul 14 11:48:04 lnx05 lvm[13423]: background routine status was 0, sock_client=0x13e7460 Jul 14 11:48:04 lnx05 lvm[13423]: distribute command: XID = 3450, flags=0x0 () Jul 14 11:48:04 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e2820. client=0x13e7460, msg=0x13e27f0, len=31, csid=(nil), xid=3450 Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0 Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25821 on node lnx04-p12 Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. 
client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0 Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25832 on node lnx04-p12 Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0 Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25844 on node lnx04-p12 Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0 Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25857 on node lnx04-p12 Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0 Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25905 on node lnx04-p12 Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work Jul 14 11:48:14 lnx05 lvm[13423]: add_to_lvmqueue: cmd=0x13e27f0. client=0x6c60c0, msg=0x7fffd749a7dc, len=31, csid=0x7fffd749a75c, xid=0 Jul 14 11:48:14 lnx05 lvm[13423]: process_work_item: remote Jul 14 11:48:14 lnx05 lvm[13423]: process_remote_command SYNC_NAMES (0x2d) for clientid 0x5000000 XID 25914 on node lnx04-p12 Jul 14 11:48:14 lnx05 lvm[13423]: Syncing device names Jul 14 11:48:14 lnx05 lvm[13423]: LVM thread waiting for work ------ Before we upgrade all cluster members to 6.5, we'd like to be reasonably certain that it will fix the problem rather than spread it to the entire cluster. Any help would be greatly appreciated. Many thanks, Devin From emi2fast at gmail.com Tue Jul 15 15:50:41 2014 From: emi2fast at gmail.com (emmanuel segura) Date: Tue, 15 Jul 2014 17:50:41 +0200 Subject: [Linux-cluster] Basename mismatch In-Reply-To: References: Message-ID: 1: show your config 2: show what kind of problem you find, doing what you want to archive 3: show your cluster 4: without any error and some more information, is realy hard to help you 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui : > Hello, I have to implemented red hat linux 6.4 cluster suite and trying to > use Oracle11gr2 on it.But oracle service is unable to start. > Listener isnot starting. > Anyone have implemented oracle11gr2 so please > Send me cluster.conf and oracledb.sh and also listener.ora and tnsnames.ora > files pls. 
> Thanks in advanced > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- esta es mi vida e me la vivo hasta que dios quiera From amujeebs at gmail.com Wed Jul 16 12:31:12 2014 From: amujeebs at gmail.com (abdul mujeeb Siddiqui) Date: Wed, 16 Jul 2014 15:31:12 +0300 Subject: [Linux-cluster] Basename mismatch In-Reply-To: References: Message-ID: Dear segura , Thanks for response 1: show your config 2: show what kind of problem you find, doing what you want to archive [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP Running in test mode. Validating configuration for testvip [oracledb] Validating configuration for testvip basename: missing operand Try `basename --help' for more information. Failed to start IP [root@/usr/share/cluster]# vi oracledb.sh # Customize these to match your Oracle installation. # ###################################################### # # 1. Oracle user. Must be the same across all cluster members. In the event # that this script is run by the super-user, it will automatically switch # to the Oracle user and restart. Oracle needs to run as the Oracle # user, not as root. #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle # # 2. Oracle home. This is set up during the installation phase of Oracle. # From the perspective of the cluster, this is generally the mount point # you intend to use as the mount point for your Oracle Infrastructure # service. # #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home [ -n "$ORACLE_HOME" ] || ORACLE_HOME=/u01/app/oracle/product/ 11.2.0.4/dbhome_1 # # 3. This is your SID. This is set up during oracle installation as well. # #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip # # 4. The oracle user probably doesn't have the permission to write to # /var/lock/subsys, so use the user's home directory. # #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock" [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock" #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch privileges # # 5. Type of Oracle Database. Currently supported: 10g 10g-iAS(untested!) #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em" [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base" # # 6. Oracle virtual hostname. This is the hostname you gave Oracle during # installation. # #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com 3: show your cluster root at xxxx]# clustat Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ 192.168.10.10 1 Online, Local, rgmanager 192.168.10.11 2 Online, rgmanager Service Name Owner (Last) State ------- ---- ----- ------ ----- service:IP 192.168.10.10 started [root at xxx]# clustat Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ 192.168.10.10 1 Online,rgmanager 192.168.10.11 2 Online,Local,rgmanager Service Name Owner (Last) State ------- ---- ----- ------ ----- service:IP 192.168.10.11 started 4: without any error and some more information, is realy hard to help you I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5 . In normal working fine . When I am using Oracle Service in Cluster.conf as above I am getting ERROR Try `basename --help' for more information. Failed to start IP Database Service unable to start. 
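For what it is worth, the stock oracledb.sh agent can also be handed its Oracle environment directly from cluster.conf instead of editing the defaults inside the script. A sketch, using the SID, user, home and vhost from the excerpt above, would look like the line below; the home attribute name is believed to match the metadata of the stock oracledb.sh (worth verifying against the agent's own meta-data output), and the guess that an empty ORACLE_HOME is what leaves basename without an operand is only a guess, not confirmed in this thread:

   <oracledb name="testvip" type="base" user="oracle"
             home="/u01/app/oracle/product/11.2.0.4/dbhome_1"
             vhost="10.10.5.23"/>

Re-running the rg_test command above should show quickly whether that clears the basename error; if not, the orainstance agent suggested later in the thread is the other option.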
I have try all the options given in oracledb.sh but same ERROR Please help me . Thanks in advance. On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura wrote: > 1: show your config > 2: show what kind of problem you find, doing what you want to archive > 3: show your cluster > 4: without any error and some more information, is realy hard to help you > > 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui : > > Hello, I have to implemented red hat linux 6.4 cluster suite and trying > to > > use Oracle11gr2 on it.But oracle service is unable to start. > > Listener isnot starting. > > Anyone have implemented oracle11gr2 so please > > Send me cluster.conf and oracledb.sh and also listener.ora and > tnsnames.ora > > files pls. > > Thanks in advanced > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > esta es mi vida e me la vivo hasta que dios quiera > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Wed Jul 16 14:31:35 2014 From: emi2fast at gmail.com (emmanuel segura) Date: Wed, 16 Jul 2014 16:31:35 +0200 Subject: [Linux-cluster] Basename mismatch In-Reply-To: References: Message-ID: can you try to use orainstance agent ? 2014-07-16 14:31 GMT+02:00 abdul mujeeb Siddiqui : > Dear segura , > > Thanks for response > 1: show your config > > > > > login="ADMIN" name="inspuripmi" passwd="xxx"/> > login="test" name="hpipmi" passwd="xxxx"/> > > > > > > > > > > > > > > > > > > > > > > > > > fstype="ext4" mountpoint="/u02" name="/u02"/> > fstype="ext4" mountpoint="/u03" name="/u03"/> > fstype="ext4" mountpoint="/u04" name="/u04"/> > fstype="ext4" mountpoint="/u05" name="/u05"/> > name="testvip" type="base" user="oracle" vhost="10.10.5.23"/> > > > > > 2: show what kind of problem you find, doing what you want to archive > [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP > Running in test mode. > Validating configuration for testvip > [oracledb] Validating configuration for testvip > basename: missing operand > Try `basename --help' for more information. > Failed to start IP > > > [root@/usr/share/cluster]# vi oracledb.sh > # Customize these to match your Oracle installation. # > ###################################################### > # > # 1. Oracle user. Must be the same across all cluster members. In the > event > # that this script is run by the super-user, it will automatically switch > # to the Oracle user and restart. Oracle needs to run as the Oracle > # user, not as root. > #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle > [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle > # > # 2. Oracle home. This is set up during the installation phase of Oracle. > # From the perspective of the cluster, this is generally the mount point > # you intend to use as the mount point for your Oracle Infrastructure > # service. > # > #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home > [ -n "$ORACLE_HOME" ] || > ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1 > # > # 3. This is your SID. This is set up during oracle installation as well. > # > #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl > [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip > # > # 4. The oracle user probably doesn't have the permission to write to > # /var/lock/subsys, so use the user's home directory. 
> # > #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock" > [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock" > #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch > privileges > # > # 5. Type of Oracle Database. Currently supported: 10g 10g-iAS(untested!) > #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em" > [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base" > # > # 6. Oracle virtual hostname. This is the hostname you gave Oracle during > # installation. > # > #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com > > > 3: show your cluster > root at xxxx]# clustat > Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 > Member Status: Quorate > Member Name ID Status > ------ ---- ---- ------ > 192.168.10.10 1 Online, Local, rgmanager > 192.168.10.11 2 Online, rgmanager > Service Name Owner (Last) State > ------- ---- ----- ------ ----- > service:IP 192.168.10.10 started > > [root at xxx]# clustat > Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 > Member Status: Quorate > Member Name ID Status > ------ ---- ---- ------ > 192.168.10.10 1 Online,rgmanager > 192.168.10.11 2 Online,Local,rgmanager > Service Name Owner (Last) State > ------- ---- ----- ------ ----- > service:IP 192.168.10.11 started > > > 4: without any error and some more information, is realy hard to help you > > I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5 . > In normal working fine . > When I am using Oracle Service in Cluster.conf as above I am getting ERROR > Try `basename --help' for more information. > Failed to start IP > Database Service unable to start. > > I have try all the options given in oracledb.sh but same ERROR > Please help me . > > Thanks in advance. > > > On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura wrote: >> >> 1: show your config >> 2: show what kind of problem you find, doing what you want to archive >> 3: show your cluster >> 4: without any error and some more information, is realy hard to help you >> >> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui : >> > Hello, I have to implemented red hat linux 6.4 cluster suite and trying >> > to >> > use Oracle11gr2 on it.But oracle service is unable to start. >> > Listener isnot starting. >> > Anyone have implemented oracle11gr2 so please >> > Send me cluster.conf and oracledb.sh and also listener.ora and >> > tnsnames.ora >> > files pls. >> > Thanks in advanced >> > >> > >> > -- >> > Linux-cluster mailing list >> > Linux-cluster at redhat.com >> > https://www.redhat.com/mailman/listinfo/linux-cluster >> >> >> >> -- >> esta es mi vida e me la vivo hasta que dios quiera >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- esta es mi vida e me la vivo hasta que dios quiera From amujeebs at gmail.com Wed Jul 16 16:36:36 2014 From: amujeebs at gmail.com (abdul mujeeb Siddiqui) Date: Wed, 16 Jul 2014 19:36:36 +0300 Subject: [Linux-cluster] Basename mismatch In-Reply-To: References: Message-ID: You mean say orainstance.sh script from /usr/share/cluster. On 16 Jul 2014 17:36, "emmanuel segura" wrote: > can you try to use orainstance agent ? 
> > 2014-07-16 14:31 GMT+02:00 abdul mujeeb Siddiqui : > > Dear segura , > > > > Thanks for response > > 1: show your config > > > > > > > > > > > login="ADMIN" name="inspuripmi" passwd="xxx"/> > > > login="test" name="hpipmi" passwd="xxxx"/> > > > > > > > > > > > > > > ="reboot"/> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > recovery="relocate"> > > sleeptime="10"/> > > force_unmount="1" > > fstype="ext4" mountpoint="/u02" name="/u02"/> > > force_unmount="1" > > fstype="ext4" mountpoint="/u03" name="/u03"/> > > force_unmount="1" > > fstype="ext4" mountpoint="/u04" name="/u04"/> > > force_unmount="1" > > fstype="ext4" mountpoint="/u05" name="/u05"/> > > > name="testvip" type="base" user="oracle" vhost="10.10.5.23"/> > > > > > > > > > > 2: show what kind of problem you find, doing what you want to archive > > [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP > > Running in test mode. > > Validating configuration for testvip > > [oracledb] Validating configuration for testvip > > basename: missing operand > > Try `basename --help' for more information. > > Failed to start IP > > > > > > [root@/usr/share/cluster]# vi oracledb.sh > > # Customize these to match your Oracle installation. # > > ###################################################### > > # > > # 1. Oracle user. Must be the same across all cluster members. In the > > event > > # that this script is run by the super-user, it will automatically > switch > > # to the Oracle user and restart. Oracle needs to run as the Oracle > > # user, not as root. > > #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle > > [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle > > # > > # 2. Oracle home. This is set up during the installation phase of > Oracle. > > # From the perspective of the cluster, this is generally the mount > point > > # you intend to use as the mount point for your Oracle Infrastructure > > # service. > > # > > #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home > > [ -n "$ORACLE_HOME" ] || > > ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1 > > # > > # 3. This is your SID. This is set up during oracle installation as > well. > > # > > #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl > > [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip > > # > > # 4. The oracle user probably doesn't have the permission to write to > > # /var/lock/subsys, so use the user's home directory. > > # > > #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock" > > [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock" > > #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch > > privileges > > # > > # 5. Type of Oracle Database. Currently supported: 10g > 10g-iAS(untested!) > > #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em" > > [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base" > > # > > # 6. Oracle virtual hostname. This is the hostname you gave Oracle > during > > # installation. 
> > # > > #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com > > > > > > 3: show your cluster > > root at xxxx]# clustat > > Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 > > Member Status: Quorate > > Member Name ID Status > > ------ ---- ---- ------ > > 192.168.10.10 1 Online, Local, rgmanager > > 192.168.10.11 2 Online, rgmanager > > Service Name Owner (Last) State > > ------- ---- ----- ------ ----- > > service:IP 192.168.10.10 started > > > > [root at xxx]# clustat > > Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 > > Member Status: Quorate > > Member Name ID Status > > ------ ---- ---- ------ > > 192.168.10.10 1 Online,rgmanager > > 192.168.10.11 2 Online,Local,rgmanager > > Service Name Owner (Last) State > > ------- ---- ----- ------ ----- > > service:IP 192.168.10.11 started > > > > > > 4: without any error and some more information, is realy hard to help > you > > > > I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5 . > > In normal working fine . > > When I am using Oracle Service in Cluster.conf as above I am getting > ERROR > > Try `basename --help' for more information. > > Failed to start IP > > Database Service unable to start. > > > > I have try all the options given in oracledb.sh but same ERROR > > Please help me . > > > > Thanks in advance. > > > > > > On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura > wrote: > >> > >> 1: show your config > >> 2: show what kind of problem you find, doing what you want to archive > >> 3: show your cluster > >> 4: without any error and some more information, is realy hard to help > you > >> > >> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui : > >> > Hello, I have to implemented red hat linux 6.4 cluster suite and > trying > >> > to > >> > use Oracle11gr2 on it.But oracle service is unable to start. > >> > Listener isnot starting. > >> > Anyone have implemented oracle11gr2 so please > >> > Send me cluster.conf and oracledb.sh and also listener.ora and > >> > tnsnames.ora > >> > files pls. > >> > Thanks in advanced > >> > > >> > > >> > -- > >> > Linux-cluster mailing list > >> > Linux-cluster at redhat.com > >> > https://www.redhat.com/mailman/listinfo/linux-cluster > >> > >> > >> > >> -- > >> esta es mi vida e me la vivo hasta que dios quiera > >> > >> -- > >> Linux-cluster mailing list > >> Linux-cluster at redhat.com > >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > esta es mi vida e me la vivo hasta que dios quiera > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Wed Jul 16 17:12:12 2014 From: emi2fast at gmail.com (emmanuel segura) Date: Wed, 16 Jul 2014 19:12:12 +0200 Subject: [Linux-cluster] Basename mismatch In-Reply-To: References: Message-ID: yes. 2014-07-16 18:36 GMT+02:00 abdul mujeeb Siddiqui : > You mean say orainstance.sh script from /usr/share/cluster. > > On 16 Jul 2014 17:36, "emmanuel segura" wrote: >> >> can you try to use orainstance agent ? 
>> >> 2014-07-16 14:31 GMT+02:00 abdul mujeeb Siddiqui : >> > Dear segura , >> > >> > Thanks for response >> > 1: show your config >> > >> > >> > >> > >> > > > login="ADMIN" name="inspuripmi" passwd="xxx"/> >> > > > login="test" name="hpipmi" passwd="xxxx"/> >> > >> > >> > >> > >> > >> > >> > > > ="reboot"/> >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > > > recovery="relocate"> >> > > > sleeptime="10"/> >> > > > force_unmount="1" >> > fstype="ext4" mountpoint="/u02" name="/u02"/> >> > > > force_unmount="1" >> > fstype="ext4" mountpoint="/u03" name="/u03"/> >> > > > force_unmount="1" >> > fstype="ext4" mountpoint="/u04" name="/u04"/> >> > > > force_unmount="1" >> > fstype="ext4" mountpoint="/u05" name="/u05"/> >> > > > name="testvip" type="base" user="oracle" vhost="10.10.5.23"/> >> > >> > >> > >> > >> > 2: show what kind of problem you find, doing what you want to archive >> > [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP >> > Running in test mode. >> > Validating configuration for testvip >> > [oracledb] Validating configuration for testvip >> > basename: missing operand >> > Try `basename --help' for more information. >> > Failed to start IP >> > >> > >> > [root@/usr/share/cluster]# vi oracledb.sh >> > # Customize these to match your Oracle installation. # >> > ###################################################### >> > # >> > # 1. Oracle user. Must be the same across all cluster members. In the >> > event >> > # that this script is run by the super-user, it will automatically >> > switch >> > # to the Oracle user and restart. Oracle needs to run as the Oracle >> > # user, not as root. >> > #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle >> > [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle >> > # >> > # 2. Oracle home. This is set up during the installation phase of >> > Oracle. >> > # From the perspective of the cluster, this is generally the mount >> > point >> > # you intend to use as the mount point for your Oracle Infrastructure >> > # service. >> > # >> > #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home >> > [ -n "$ORACLE_HOME" ] || >> > ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1 >> > # >> > # 3. This is your SID. This is set up during oracle installation as >> > well. >> > # >> > #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl >> > [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip >> > # >> > # 4. The oracle user probably doesn't have the permission to write to >> > # /var/lock/subsys, so use the user's home directory. >> > # >> > #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock" >> > [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock" >> > #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch >> > privileges >> > # >> > # 5. Type of Oracle Database. Currently supported: 10g >> > 10g-iAS(untested!) >> > #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em" >> > [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base" >> > # >> > # 6. Oracle virtual hostname. This is the hostname you gave Oracle >> > during >> > # installation. 
>> > # >> > #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com >> > >> > >> > 3: show your cluster >> > root at xxxx]# clustat >> > Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 >> > Member Status: Quorate >> > Member Name ID Status >> > ------ ---- ---- ------ >> > 192.168.10.10 1 Online, Local, rgmanager >> > 192.168.10.11 2 Online, rgmanager >> > Service Name Owner (Last) State >> > ------- ---- ----- ------ ----- >> > service:IP 192.168.10.10 started >> > >> > [root at xxx]# clustat >> > Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 >> > Member Status: Quorate >> > Member Name ID Status >> > ------ ---- ---- ------ >> > 192.168.10.10 1 Online,rgmanager >> > 192.168.10.11 2 Online,Local,rgmanager >> > Service Name Owner (Last) State >> > ------- ---- ----- ------ ----- >> > service:IP 192.168.10.11 started >> > >> > >> > 4: without any error and some more information, is realy hard to help >> > you >> > >> > I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5 . >> > In normal working fine . >> > When I am using Oracle Service in Cluster.conf as above I am getting >> > ERROR >> > Try `basename --help' for more information. >> > Failed to start IP >> > Database Service unable to start. >> > >> > I have try all the options given in oracledb.sh but same ERROR >> > Please help me . >> > >> > Thanks in advance. >> > >> > >> > On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura >> > wrote: >> >> >> >> 1: show your config >> >> 2: show what kind of problem you find, doing what you want to archive >> >> 3: show your cluster >> >> 4: without any error and some more information, is realy hard to help >> >> you >> >> >> >> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui : >> >> > Hello, I have to implemented red hat linux 6.4 cluster suite and >> >> > trying >> >> > to >> >> > use Oracle11gr2 on it.But oracle service is unable to start. >> >> > Listener isnot starting. >> >> > Anyone have implemented oracle11gr2 so please >> >> > Send me cluster.conf and oracledb.sh and also listener.ora and >> >> > tnsnames.ora >> >> > files pls. >> >> > Thanks in advanced >> >> > >> >> > >> >> > -- >> >> > Linux-cluster mailing list >> >> > Linux-cluster at redhat.com >> >> > https://www.redhat.com/mailman/listinfo/linux-cluster >> >> >> >> >> >> >> >> -- >> >> esta es mi vida e me la vivo hasta que dios quiera >> >> >> >> -- >> >> Linux-cluster mailing list >> >> Linux-cluster at redhat.com >> >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > >> > >> > >> > -- >> > Linux-cluster mailing list >> > Linux-cluster at redhat.com >> > https://www.redhat.com/mailman/listinfo/linux-cluster >> >> >> >> -- >> esta es mi vida e me la vivo hasta que dios quiera >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- esta es mi vida e me la vivo hasta que dios quiera From amujeebs at gmail.com Wed Jul 16 17:38:21 2014 From: amujeebs at gmail.com (abdul mujeeb Siddiqui) Date: Wed, 16 Jul 2014 20:38:21 +0300 Subject: [Linux-cluster] Basename mismatch In-Reply-To: References: Message-ID: Can send me the syntax to be used in cluster.conf file ? Type=? Aa I am using Oracle 11.2.0.4 On 16 Jul 2014 20:16, "emmanuel segura" wrote: > yes. > > 2014-07-16 18:36 GMT+02:00 abdul mujeeb Siddiqui : > > You mean say orainstance.sh script from /usr/share/cluster. 
> > > > On 16 Jul 2014 17:36, "emmanuel segura" wrote: > >> > >> can you try to use orainstance agent ? > >> > >> 2014-07-16 14:31 GMT+02:00 abdul mujeeb Siddiqui : > >> > Dear segura , > >> > > >> > Thanks for response > >> > 1: show your config > >> > > >> > > >> > > >> > > >> > >> > login="ADMIN" name="inspuripmi" passwd="xxx"/> > >> > >> > login="test" name="hpipmi" passwd="xxxx"/> > >> > > >> > > >> > > >> > > >> > > >> > > >> > >> > ="reboot"/> > >> > > >> > > >> > > >> > > >> > > >> > > >> > ="reboot"/> > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > >> > recovery="relocate"> > >> > >> > sleeptime="10"/> > >> > >> > force_unmount="1" > >> > fstype="ext4" mountpoint="/u02" name="/u02"/> > >> > >> > force_unmount="1" > >> > fstype="ext4" mountpoint="/u03" name="/u03"/> > >> > >> > force_unmount="1" > >> > fstype="ext4" mountpoint="/u04" name="/u04"/> > >> > >> > force_unmount="1" > >> > fstype="ext4" mountpoint="/u05" name="/u05"/> > >> > >> > name="testvip" type="base" user="oracle" vhost="10.10.5.23"/> > >> > > >> > > >> > > >> > > >> > 2: show what kind of problem you find, doing what you want to archive > >> > [root at xxxx]# rg_test test /etc/cluster/cluster.conf stop service IP > >> > Running in test mode. > >> > Validating configuration for testvip > >> > [oracledb] Validating configuration for testvip > >> > basename: missing operand > >> > Try `basename --help' for more information. > >> > Failed to start IP > >> > > >> > > >> > [root@/usr/share/cluster]# vi oracledb.sh > >> > # Customize these to match your Oracle installation. # > >> > ###################################################### > >> > # > >> > # 1. Oracle user. Must be the same across all cluster members. In > the > >> > event > >> > # that this script is run by the super-user, it will automatically > >> > switch > >> > # to the Oracle user and restart. Oracle needs to run as the > Oracle > >> > # user, not as root. > >> > #[ -n "$ORACLE_USER" ] || ORACLE_USER=oracle > >> > [ -n "$ORACLE_USER" ] || ORACLE_USER=oracle > >> > # > >> > # 2. Oracle home. This is set up during the installation phase of > >> > Oracle. > >> > # From the perspective of the cluster, this is generally the mount > >> > point > >> > # you intend to use as the mount point for your Oracle > Infrastructure > >> > # service. > >> > # > >> > #[ -n "$ORACLE_HOME" ] || ORACLE_HOME=/mnt/oracle/home > >> > [ -n "$ORACLE_HOME" ] || > >> > ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1 > >> > # > >> > # 3. This is your SID. This is set up during oracle installation as > >> > well. > >> > # > >> > #[ -n "$ORACLE_SID" ] || ORACLE_SID=orcl > >> > [ -n "$ORACLE_SID" ] || ORACLE_SID=testvip > >> > # > >> > # 4. The oracle user probably doesn't have the permission to write to > >> > # /var/lock/subsys, so use the user's home directory. > >> > # > >> > #[ -n "$LOCKFILE" ] || LOCKFILE="/home/$ORACLE_USER/.oracle-ias.lock" > >> > [ -n "$LOCKFILE" ] || LOCKFILE="$ORACLE_HOME/.oracle-ias.lock" > >> > #[ -n "$LOCKFILE" ] || LOCKFILE="/var/lock/subsys/oracle-ias" # Watch > >> > privileges > >> > # > >> > # 5. Type of Oracle Database. Currently supported: 10g > >> > 10g-iAS(untested!) > >> > #[ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base-em" > >> > [ -n "$ORACLE_TYPE" ] || ORACLE_TYPE="base" > >> > # > >> > # 6. Oracle virtual hostname. This is the hostname you gave Oracle > >> > during > >> > # installation. 
> >> > # > >> > #[ -n "$ORACLE_HOSTNAME" ] || ORACLE_HOSTNAME=svc0.foo.test.com > >> > > >> > > >> > 3: show your cluster > >> > root at xxxx]# clustat > >> > Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 > >> > Member Status: Quorate > >> > Member Name ID Status > >> > ------ ---- ---- ------ > >> > 192.168.10.10 1 Online, Local, > rgmanager > >> > 192.168.10.11 2 Online, rgmanager > >> > Service Name Owner (Last) State > >> > ------- ---- ----- ------ ----- > >> > service:IP 192.168.10.10 started > >> > > >> > [root at xxx]# clustat > >> > Cluster Status for oracleha @ Tue Apr 8 11:10:30 2014 > >> > Member Status: Quorate > >> > Member Name ID Status > >> > ------ ---- ---- ------ > >> > 192.168.10.10 1 Online,rgmanager > >> > 192.168.10.11 2 Online,Local,rgmanager > >> > Service Name Owner (Last) State > >> > ------- ---- ----- ------ ----- > >> > service:IP 192.168.10.11 started > >> > > >> > > >> > 4: without any error and some more information, is realy hard to help > >> > you > >> > > >> > I am using Oracle 11.2.0.4 Database on Red Hat Cluster Suite 6.5 . > >> > In normal working fine . > >> > When I am using Oracle Service in Cluster.conf as above I am getting > >> > ERROR > >> > Try `basename --help' for more information. > >> > Failed to start IP > >> > Database Service unable to start. > >> > > >> > I have try all the options given in oracledb.sh but same ERROR > >> > Please help me . > >> > > >> > Thanks in advance. > >> > > >> > > >> > On Tue, Jul 15, 2014 at 6:50 PM, emmanuel segura > >> > wrote: > >> >> > >> >> 1: show your config > >> >> 2: show what kind of problem you find, doing what you want to archive > >> >> 3: show your cluster > >> >> 4: without any error and some more information, is realy hard to help > >> >> you > >> >> > >> >> 2014-07-15 16:18 GMT+02:00 abdul mujeeb Siddiqui >: > >> >> > Hello, I have to implemented red hat linux 6.4 cluster suite and > >> >> > trying > >> >> > to > >> >> > use Oracle11gr2 on it.But oracle service is unable to start. > >> >> > Listener isnot starting. > >> >> > Anyone have implemented oracle11gr2 so please > >> >> > Send me cluster.conf and oracledb.sh and also listener.ora and > >> >> > tnsnames.ora > >> >> > files pls. > >> >> > Thanks in advanced > >> >> > > >> >> > > >> >> > -- > >> >> > Linux-cluster mailing list > >> >> > Linux-cluster at redhat.com > >> >> > https://www.redhat.com/mailman/listinfo/linux-cluster > >> >> > >> >> > >> >> > >> >> -- > >> >> esta es mi vida e me la vivo hasta que dios quiera > >> >> > >> >> -- > >> >> Linux-cluster mailing list > >> >> Linux-cluster at redhat.com > >> >> https://www.redhat.com/mailman/listinfo/linux-cluster > >> > > >> > > >> > > >> > -- > >> > Linux-cluster mailing list > >> > Linux-cluster at redhat.com > >> > https://www.redhat.com/mailman/listinfo/linux-cluster > >> > >> > >> > >> -- > >> esta es mi vida e me la vivo hasta que dios quiera > >> > >> -- > >> Linux-cluster mailing list > >> Linux-cluster at redhat.com > >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > esta es mi vida e me la vivo hasta que dios quiera > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christoph at macht-blau.org Wed Jul 23 15:53:56 2014 From: christoph at macht-blau.org (C. 
Handel) Date: Wed, 23 Jul 2014 17:53:56 +0200 Subject: [Linux-cluster] corosync ring failure Message-ID: hi, i run a cluster with two corosync rings. One of the rings is marked faulty every fourty seconds, to immediately recover a second later. the other ring is stable i have no idea how i should debug this. we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1 cluster consists of three machines. Ring1 is running on 10gigbit interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their respective switch. corosync communication is udpu, rrp_mode is passive cluster.conf: syslog Jul 23 17:48:34 asl431 corosync[3254]: [TOTEM ] Marking ringid 1 interface 140.181.134.212 FAULTY Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 Jul 23 17:49:14 asl431 corosync[3254]: [TOTEM ] Marking ringid 1 interface 140.181.134.212 FAULTY Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 Greetings Christoph From lists at alteeve.ca Wed Jul 23 16:16:48 2014 From: lists at alteeve.ca (Digimer) Date: Wed, 23 Jul 2014 12:16:48 -0400 Subject: [Linux-cluster] corosync ring failure In-Reply-To: References: Message-ID: <53CFDFF0.3010504@alteeve.ca> Any logs in the switch? Is the multicast group being deleted/recreated? On 23/07/14 11:53 AM, C. Handel wrote: > hi, > > i run a cluster with two corosync rings. One of the rings is marked > faulty every fourty seconds, to immediately recover a second later. > the other ring is stable > > i have no idea how i should debug this. > > > we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1 > cluster consists of three machines. Ring1 is running on 10gigbit > interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their > respective switch. > > corosync communication is udpu, rrp_mode is passive > > cluster.conf: > > > > > > > > > > > > > cman_label="qdisk" > device="/dev/mapper/mpath-091quorump1" > min_score="1" > votes="2" > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > syslog > > > Jul 23 17:48:34 asl431 corosync[3254]: [TOTEM ] Marking ringid 1 > interface 140.181.134.212 FAULTY > Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 > Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 > Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 > Jul 23 17:49:14 asl431 corosync[3254]: [TOTEM ] Marking ringid 1 > interface 140.181.134.212 FAULTY > Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 > Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 > Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] Automatically recovered ring 1 > > > > Greetings > Christoph > -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? 
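A note on the FAULTY/recovered cycle in the log above: with rrp_mode="passive", corosync keeps a problem counter per ring and marks a ring FAULTY once that counter crosses rrp_problem_count_threshold; the automatic recovery check then re-enables it, which is the repeating pattern shown. The relevant knobs are the rrp_* totem options described in corosync.conf(5). The fragment below is only a sketch of where such tuning would sit in a cman-based cluster.conf; it assumes your cman release passes these totem attributes through to corosync unchanged (worth verifying for your versions), and the values are illustrative, not recommendations:

  <!-- top level of /etc/cluster/cluster.conf; rrp_* attribute pass-through to corosync assumed -->
  <totem rrp_mode="passive"
         rrp_problem_count_threshold="20"
         rrp_problem_count_timeout="3000"/>

While the fault is reproducing, the per-ring state can be watched on each node with:

  corosync-cfgtool -s

which prints the status of every configured ring on the local node and shows how long ring 1 actually stays FAULTY.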
From morpheus.ibis at gmail.com Wed Jul 23 20:00:09 2014 From: morpheus.ibis at gmail.com (Pavel Herrmann) Date: Wed, 23 Jul 2014 22:00:09 +0200 Subject: [Linux-cluster] corosync ring failure In-Reply-To: <53CFDFF0.3010504@alteeve.ca> References: <53CFDFF0.3010504@alteeve.ca> Message-ID: <24252688.qLsjkavrKW@bloomfield> Hi, On Wednesday 23 of July 2014 12:16:48 Digimer wrote: > Any logs in the switch? Is the multicast group being deleted/recreated? I believe there would be no multicast for UDPU transport Can you check to see if any of the devices (servers and switches) is dropping UDP packets, be it for congestion or damage? regards, Pavel > > On 23/07/14 11:53 AM, C. Handel wrote: > > hi, > > > > i run a cluster with two corosync rings. One of the rings is marked > > faulty every fourty seconds, to immediately recover a second later. > > the other ring is stable > > > > i have no idea how i should debug this. > > > > > > we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1 > > cluster consists of three machines. Ring1 is running on 10gigbit > > interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their > > respective switch. > > > > corosync communication is udpu, rrp_mode is passive > > > > cluster.conf: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > cman_label="qdisk" > > device="/dev/mapper/mpath-091quorump1" > > min_score="1" > > votes="2" > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > syslog > > > > > > Jul 23 17:48:34 asl431 corosync[3254]: [TOTEM ] Marking ringid 1 > > interface 140.181.134.212 FAULTY > > Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] Automatically recovered > > ring 1 Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] Automatically > > recovered ring 1 Jul 23 17:48:35 asl431 corosync[3254]: [TOTEM ] > > Automatically recovered ring 1 Jul 23 17:49:14 asl431 corosync[3254]: > > [TOTEM ] Marking ringid 1 interface 140.181.134.212 FAULTY > > Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] Automatically recovered > > ring 1 Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] Automatically > > recovered ring 1 Jul 23 17:49:15 asl431 corosync[3254]: [TOTEM ] > > Automatically recovered ring 1 > > > > > > > > Greetings > > > > Christoph From christoph at macht-blau.org Thu Jul 24 07:30:01 2014 From: christoph at macht-blau.org (C. Handel) Date: Thu, 24 Jul 2014 09:30:01 +0200 Subject: [Linux-cluster] corosync ring failure In-Reply-To: References: Message-ID: >>> i run a cluster with two corosync rings. One of the rings is marked >>> faulty every fourty seconds, to immediately recover a second later. >>> the other ring is stable >>> >>> i have no idea how i should debug this. >>> >>> >>> we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1 >>> cluster consists of three machines. Ring1 is running on 10gigbit >>> interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their >>> respective switch. >> Any logs in the switch? Is the multicast group being deleted/recreated? > believe there would be no multicast for UDPU transport >Can you check to see if any of the devices (servers and switches) is >dropping >UDP packets, be it for congestion or damage? the switch has no load, interface utilization is below 10%, no crc errors on the ports and no errors in the log. On the same switch a second cluster (four machines, similiar config) is running fine. 
Greetings Christoph
From misch at schwartzkopff.org Thu Jul 24 07:42:08 2014 From: misch at schwartzkopff.org (Michael Schwartzkopff) Date: Thu, 24 Jul 2014 09:42:08 +0200 Subject: [Linux-cluster] corosync ring failure In-Reply-To: References: Message-ID: <3854552.a8rUyGa0b9@nb003> On Thursday, 24 July 2014, 09:30:01, C. Handel wrote: > >>> i run a cluster with two corosync rings. One of the rings is marked > >>> faulty every fourty seconds, to immediately recover a second later. > >>> the other ring is stable > >>> > >>> i have no idea how i should debug this. > >>> > >>> > >>> we are running sl6.5 with pacemaker 1.1.10, cman 3.0.12, corosync 1.4.1 > >>> cluster consists of three machines. Ring1 is running on 10gigbit > >>> interfaces, Ring0 on 1gigibit interfaces. Both rings don't leave their > >>> respective switch. > >> > >> Any logs in the switch? Is the multicast group being deleted/recreated? > > > > believe there would be no multicast for UDPU transport > > > >Can you check to see if any of the devices (servers and switches) is > >>dropping UDP packets, be it for congestion or damage? > > the switch has no load, interface utilization is below 10%, no crc > errors on the ports and no errors in the log. On the same switch a > second cluster (four machines, similiar config) is running fine. Any Spanning Tree problems? Do you have any bridges (e.g. for virtual machines) configured in your setup? Did you do any debugging on your switch? Greetings, -- Dr. Michael Schwartzkopff Guardinistr. 63 81375 München Tel: (0162) 1650044 Fax: (089) 620 304 13 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 230 bytes Desc: This is a digitally signed message part. URL:
From yvette at dbtgroup.com Thu Jul 24 16:57:41 2014 From: yvette at dbtgroup.com (Yvette S. Hirth, CCP, CDP) Date: Thu, 24 Jul 2014 09:57:41 -0700 Subject: [Linux-cluster] corosync ring failure In-Reply-To: References: Message-ID: <53D13B05.7050505@dbtgroup.com> On 07/24/2014 12:30 AM, C. Handel wrote: > the switch has no load, interface utilization is below 10%, no crc > errors on the ports and no errors in the log. On the same switch a > second cluster (four machines, similiar config) is running fine. Hi, did you VLAN the switches so the two clusters are "logically separate"? If they're on the same VLAN they might interfere with each other... Also, I second Michael Schwartzkopff's suggestion of looking into Spanning Tree Protocol (STP).
If your switches (I'm assuming you're using two) are not stacked(1), you may be running into that, as well. hth yvette (1) - stacking switches allows for double bandwidth operations, and can eliminate the need / requirement for STP - though STP may still be in use.
From christoph at macht-blau.org Fri Jul 25 06:22:09 2014 From: christoph at macht-blau.org (C. Handel) Date: Fri, 25 Jul 2014 08:22:09 +0200 Subject: [Linux-cluster] corosync ring failure In-Reply-To: <53D13B05.7050505@dbtgroup.com> References: <53D13B05.7050505@dbtgroup.com> Message-ID: >> the switch has no load, interface utilization is below 10%, no crc >> errors on the ports and no errors in the log. On the same switch a >> second cluster (four machines, similiar config) is running fine. > > did you vlan the switches so the two clusters are "logically separate"? if > they're on the same VLAN they might interfere with each other... > > also i second Michael Schwartzkopff's suggestion of looking into Spanning > Tree Protocol (STP). if your switches (i'm assUming you're using two) are > not stacked(1), you may be running into that, as well. There are VLANs and spanning trees and stacking. Both (actually three) clusters are in the same VLANs: one VLAN for cluster internals and one for external traffic, one internal ring, one external ring. On the internal ring each cluster uses its own IP subnet. Communication is udpu and should not interfere. Spanning tree has no events; all ports are always in forwarding mode. As the ring failure happens every 40 seconds, I think it is unlikely that spanning tree is the reason. I pulled a wiredump on one of the nodes (432), but I can't really make sense of it. orf packets from node 431 arrive and I send out orf packets to node 430, so the ring looks fine. I modified the cluster config to include rebooted all nodes (for another reason) and everything looks fine now. No idea if it is the config change or the reboot. I will pull another wiredump and compare. Greetings Christoph
From laszlo.budai at acceleris.ro Tue Jul 29 16:58:59 2014 From: laszlo.budai at acceleris.ro (Laszlo Budai) Date: Tue, 29 Jul 2014 19:58:59 +0300 Subject: [Linux-cluster] RHEL 6.3 - qdisk lost Message-ID: <53D7D2D3.10603@acceleris.ro> Dear All, I have a two-node cluster (rgmanager-3.0.12.1-17.el6.x86_64) with shared storage. The storage also contains the quorum disk. There are some services, and some dependencies are set between them. We are testing what happens when the storage is disconnected from one node (to see the cluster's response to such a failure). So we start from a healthy cluster (all is OK) and disconnect the storage from the first node. What I have observed: 1. the cluster fences node 1; 2. node 2 tries to start the services, but even though we have 3 services (let's say B, C, D) which depend on a fourth one (say A), the cluster tries to start the services in this order: B, C, D, A. Obviously it fails for B, C, D and gives us the following messages: Jul 29 15:49:54 node1 rgmanager[5135]: service:B is not runnable; dependency not met Jul 29 15:49:54 node2 rgmanager[5135]: Not stopping service:B: service is failed Jul 29 15:49:54 node2 rgmanager[5135]: Unable to stop RG service:B in recoverable state It leaves them in the "recoverable" state even after service A starts successfully (so the dependency would be met now). Why is this happening? I would expect rgmanager to start the services in an order that satisfies the dependency relationships, or, if it is not doing that, at least to react to the service state change event (service A has started, so dependencies should be evaluated again). What can be done about it? Thank you in advance, Laszlo
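On the start-ordering question above: if the dependencies are declared with the depend attribute on <service>, the "service:B is not runnable; dependency not met" message is what that check prints when B is started while A is still down, which matches the behaviour described (the dependency is evaluated at start time rather than used to reorder the start sequence or to retry B later). The fragment below is only a sketch of such a declaration, using the placeholder service names from the post; the depend/depend_mode attributes should be checked against your release's schema with ccs_config_validate, and the resource contents are omitted on purpose:

  <rm>
    <!-- A provides the prerequisite; B is only considered runnable while A is running -->
    <service name="A" autostart="1" recovery="relocate">
      <!-- A's fs/ip/script resources go here -->
    </service>
    <service name="B" autostart="1" recovery="relocate" depend="service:A" depend_mode="hard">
      <!-- B's resources go here -->
    </service>
  </rm>

Once A is up, a dependent service that was left in the failed/recoverable state can be restarted by hand, for example:

  clusvcadm -d B          # mark the failed service disabled, clearing its state
  clusvcadm -e B -m node2 # enable it again on the surviving node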