From Fredrik.Hudner at evry.com Fri May 3 13:33:38 2013
From: Fredrik.Hudner at evry.com (Fredrik Hudner)
Date: Fri, 3 May 2013 13:33:38 +0000
Subject: [Linux-cluster] fence_vmware_soap sslv3 alert certificate unknown
Message-ID: <64275F83AE588D4EBF165BF5D8783D41042D3B@ccdex003>

Dear all,

I have a pacemaker cluster and need to set up a stonith fencing agent, in this case fence_vmware_soap.

Environment:
CentOS 6.3
fence-agents.x86_64

I'm running the command manually with different options:

# fence_vmware_soap -o off -a vcenter-address -l drift\vcenter_tdtestclu -p password -n tdtestclu02 -u 443
Unable to connect/login to fencing device

# fence_vmware_soap -o off -a 192.168.231.31 -l drift\vcenter_tdtestclu -p password -z -n tdtestclu02 -u 443
No handlers could be found for logger "suds.client"
Unable to connect/login to fencing device

In vCenter's (5.1) system logs I can see the following errors:

2013-05-03T14:00:07.031+02:00 [07800 error 'Default'] [0] error:14094416:SSL routines:SSL3_READ_BYTES:sslv3 alert certificate unknown
2013-05-03T14:00:07.031+02:00 [07800 error 'Default'] SSLStreamImpl::DoServerHandshake (000000005d11ce30) SSL_accept failed. Dumping SSL error queue:
2013-05-03T14:00:07.031+02:00 [07800 warning 'ProxySvc'] SSL Handshake failed for stream TCPStreamWin32(socket=TCP(fd=31640) local=vcenter-address:443, peer=vcenter-address:53876), error: class Vmacore::Ssl::SSLException(SSL Exception: error:14094416:SSL routines:SSL3_READ_BYTES:sslv3 alert certificate unknown)

The question is: is the unknown certificate the real problem here? And if so, on which host is it actually missing (the source host, vCenter, or the target host)?

Any other clues on how to get this to work are much appreciated (and if you need more information, please let me know).

Kind regards
/Fred

From mgrac at redhat.com Fri May 3 14:13:56 2013
From: mgrac at redhat.com (Marek Grac)
Date: Fri, 03 May 2013 16:13:56 +0200
Subject: [Linux-cluster] fence_vmware_soap sslv3 alert certificate unknown
In-Reply-To: <64275F83AE588D4EBF165BF5D8783D41042D3B@ccdex003>
References: <64275F83AE588D4EBF165BF5D8783D41042D3B@ccdex003>
Message-ID: <5183C624.1000807@redhat.com>

On 05/03/2013 03:33 PM, Fredrik Hudner wrote:
> Dear all,
>
> I have a pacemaker cluster and need to set up a stonith fencing agent, in this case fence_vmware_soap.
>
> Environment:
> CentOS 6.3
> fence-agents.x86_64
>
> I'm running the command manually with different options:
>
> # fence_vmware_soap -o off -a vcenter-address -l drift\vcenter_tdtestclu -p password -n tdtestclu02 -u 443
> Unable to connect/login to fencing device
>
> # fence_vmware_soap -o off -a 192.168.231.31 -l drift\vcenter_tdtestclu -p password -z -n tdtestclu02 -u 443
> No handlers could be found for logger "suds.client"
> Unable to connect/login to fencing device

It looks like you are trying to connect to the API on port 443 without using SSL (on the command line you can use --ssl; there is no need to use -u).

m,
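For reference, a sketch of the invocation Marek's advice points to -- the address, login, and node name are the placeholders used in this thread, not verified values. Note also that an unquoted backslash in drift\vcenter_tdtestclu is consumed by the shell, so the login should be quoted:

# check status first; safer than powering the node off
# (-z enables SSL; with it set, the agent uses the default SSL port, so -u can be dropped)
fence_vmware_soap -a vcenter-address -l 'drift\vcenter_tdtestclu' -p password -z -n tdtestclu02 -o status

Running with -o status (or -o list, to enumerate VMs the credentials can see) confirms connectivity before trying a real fencing action like -o off.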
From Fredrik.Hudner at evry.com Fri May 3 14:25:55 2013
From: Fredrik.Hudner at evry.com (Fredrik Hudner)
Date: Fri, 3 May 2013 14:25:55 +0000
Subject: [Linux-cluster] fence_vmware_soap sslv3 alert certificate unknown
In-Reply-To: <5183C624.1000807@redhat.com>
References: <64275F83AE588D4EBF165BF5D8783D41042D3B@ccdex003> <5183C624.1000807@redhat.com>
Message-ID: <64275F83AE588D4EBF165BF5D8783D41042DC4@ccdex003>

On 05/03/2013 03:33 PM, Fredrik Hudner wrote:
> Dear all,
>
> I have a pacemaker cluster and need to set up a stonith fencing agent, in this case fence_vmware_soap.
>
> Environment:
> CentOS 6.3
> fence-agents.x86_64
>
> I'm running the command manually with different options:
>
> # fence_vmware_soap -o off -a vcenter-address -l drift\vcenter_tdtestclu -p password -n tdtestclu02 -u 443
> Unable to connect/login to fencing device
>
> # fence_vmware_soap -o off -a 192.168.231.31 -l drift\vcenter_tdtestclu -p password -z -n tdtestclu02 -u 443
> No handlers could be found for logger "suds.client"
> Unable to connect/login to fencing device

It looks like you are trying to connect to the API on port 443 without using SSL (on the command line you can use --ssl; there is no need to use -u).

m,

I thought the -z option did that? Besides, if I don't use -u it defaults to port 23 (telnet), and they will never open that in the firewall for me :)

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

From Fredrik.Hudner at evry.com Sat May 4 13:50:33 2013
From: Fredrik.Hudner at evry.com (Fredrik Hudner)
Date: Sat, 4 May 2013 13:50:33 +0000
Subject: [Linux-cluster] fence_vmware_soap sslv3 alert certificate unknown
In-Reply-To: <64275F83AE588D4EBF165BF5D8783D41042DC4@ccdex003>
References: <64275F83AE588D4EBF165BF5D8783D41042D3B@ccdex003> <5183C624.1000807@redhat.com> <64275F83AE588D4EBF165BF5D8783D41042DC4@ccdex003>
Message-ID: <64275F83AE588D4EBF165BF5D8783D41042FCA@ccdex003>

On 05/03/2013 03:33 PM, Fredrik Hudner wrote:
> [...]

It looks like you are trying to connect to the API on port 443 without using SSL (on the command line you can use --ssl; there is no need to use -u).

m,

I thought the -z option did that? Besides, if I don't use -u it defaults to port 23 (telnet), and they will never open that in the firewall for me :)

I tried with the --ssl option and still get the same result.
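One way to look at the handshake outside the fence agent is openssl's s_client -- a quick check only, using the placeholder address from this thread:

# show the certificate chain vCenter presents and the verify result
openssl s_client -connect vcenter-address:443

The "sslv3 alert certificate unknown" seen in the vCenter log during SSL_accept is an alert sent by the connecting peer, which suggests the client side rejected vCenter's certificate during the handshake rather than vCenter rejecting the client.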
From mgrac at redhat.com Thu May 9 07:20:54 2013
From: mgrac at redhat.com (Marek Grac)
Date: Thu, 09 May 2013 09:20:54 +0200
Subject: [Linux-cluster] fence_vmware_soap sslv3 alert certificate unknown
In-Reply-To: <64275F83AE588D4EBF165BF5D8783D41042DC4@ccdex003>
References: <64275F83AE588D4EBF165BF5D8783D41042D3B@ccdex003> <5183C624.1000807@redhat.com> <64275F83AE588D4EBF165BF5D8783D41042DC4@ccdex003>
Message-ID: <518B4E56.9030809@redhat.com>

On 05/03/2013 04:25 PM, Fredrik Hudner wrote:
> [...]
> I thought the -z option did that?
> Besides, if I don't use -u it defaults to port 23 (telnet), and they will never open that in the firewall for me :)

Yes, you are right (there is no need to set a port if it is the default one and -z/--ssl is set). Which version of VMware do you have? Take a look at https://access.redhat.com/site/articles/2860 to check that it is supported and that no work-arounds are needed.

m,

From ssloh at singnet.com.sg Fri May 10 06:35:04 2013
From: ssloh at singnet.com.sg (ssloh)
Date: Fri, 10 May 2013 14:35:04 +0800
Subject: [Linux-cluster] cluster issue
References:
Message-ID: <288507E9FAC14412BC43756C95DF9EB3@vince>

----- Original Message -----
From: "santosh lohar"
To:
Sent: Tuesday, September 28, 2010 2:44 PM
Subject: [Linux-cluster] cluster issue

Hi all,

I am facing a problem with SGE and FlexLM licensing; details are below:

*Hardware:* IBM 3650, 2 quad-core CPUs, 16 GB RAM; two nodes plus one master node, connected through an IB switch.

*Software:* ROCKS 5.1 / OS: RHEL 4 (mars hill) / Fluent / MSC Mentat.

Problem:

1. When I submit jobs with SGE, "qhost -F MDAdv" shows the updated status of licenses issued and available, but when I submit jobs outside SGE it does not recognize the latest status of the license tokens.

2. When jobs are submitted beyond 4 CPUs, cluster computation slows down.

Kindly suggest what to do in this case; thanks in advance.

Regards
Santosh

On Mon, Sep 27, 2010 at 11:07 PM, wrote:

> Send Linux-cluster mailing list submissions to
> linux-cluster at redhat.com
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://www.redhat.com/mailman/listinfo/linux-cluster
> or, via email, send a message with subject or body 'help' to
> linux-cluster-request at redhat.com
>
> You can reach the person managing the list at
> linux-cluster-owner at redhat.com
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Linux-cluster digest..."
>
> Today's Topics:
>
> 1. Unable to patch conga (fosiul alam)
> 2. Re: ricci is very unstable in one nodes (Paul M. Dyer)
> 3. Re: problem with quorum at cluster boot (brem belguebli)
> 4. Re: ricci is very unstable in one nodes (fosiul alam)
> 5. Re: ricci is very unstable in one nodes (fosiul alam)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 27 Sep 2010 17:02:20 +0100
> From: fosiul alam
> To: linux clustering
> Subject: [Linux-cluster] Unable to patch conga
> Message-ID:
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi,
> Due to the same issue (I see the exact same problem in my luci interface),
> I am trying to patch conga.
>
> I downloaded
> http://mirrors.kernel.org/centos/5/os/SRPMS/conga-0.12.2-12.el5.centos.1.src.rpm
> rpm -i conga-0.12.2-12.el5.centos.1.src.rpm
> cd /usr/src/redhat/SOURCES
> tar -xvzf conga-0.12.2.tar.gz
> patch -p0 < /path/to/where_the_patch/ricci.patch
>
> [root at beaver SOURCES]# cd conga-0.12.2
>
> Now I am facing a problem with the install:
>
> ./autogen.sh --include_zope_and_plone=yes
> Zope-2.9.8-final.tgz passed sha512sum test
> Plone-2.5.5.tar.gz passed sha512sum test
> cat: clustermon.spec.in.in: No such file or directory
>
> Run `./configure` to configure conga build,
> or `make srpms` to build conga and clustermon srpms
> or `make rpms` to build all rpms
>
> [root at beaver conga-0.12.2]# ./configure --include_zope_and_plone=yes
> D-BUS version 1.1.2 detected -> major 1, minor 1
> missing zope directory, extract zope source-code into it and try again
>
> Now, how do I tell ./configure where zope and plone are?
> Do I need zope and plone at all?
>
> Please give me some advice.
>
> Fosiul
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <https://www.redhat.com/archives/linux-cluster/attachments/20100927/21959f19/attachment.html>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 27 Sep 2010 11:55:28 -0500 (CDT)
> From: "Paul M. Dyer"
> To: linux clustering
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
> Message-ID: <1480320.10.1285606528829.JavaMail.root at athena>
> Content-Type: text/plain; charset=utf-8
>
> http://rhn.redhat.com/errata/RHBA-2010-0716.html
>
> It appears that this problem has been fixed in this errata.
>
> I installed the luci and ricci updates and did some light testing. So far,
> the timeout 11111 error has not shown up.
>
> Paul
>
> ----- Original Message -----
> From: "fosiul alam"
> To: "linux clustering"
> Sent: Monday, September 27, 2010 10:48:27 AM
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
>
> Hi,
> I am trying to patch ricci; let's see how it goes.
>
> But clusvcadm is failing as well:
>
> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
> Member http1.xxxx.local trying to enable service:httpd1...Invalid
> operation for resource
>
> Here http1 is where I was trying to run the service from luci.
>
> What could be the problem?
> Is there any way to find out if there is a problem with the config?
>
> On 27 September 2010 16:26, Ben Turner <bturner at redhat.com> wrote:
>
> RHEL 5.6 hasn't been released yet, so your package probably contains the
> problem. I'm not sure how in sync CentOS is with RHEL, or whether they
> patch earlier, so I cannot give you a time frame for when it will be in
> CentOS or whether they have already patched it. The problem in that BZ is
> more of an annoyance; you usually just have to retry a time or two and it
> works. If you can't get luci working properly with your service at all,
> you should try enabling the service through the command line with
> clusvcadm -e. If it is not working from the command line either, then
> there is a problem with the service config.
>
> -Ben
>
> ----- "fosiul alam" <expertalert at gmail.com> wrote:
>
> > Hi Ben,
> > Thanks.
> >
> > I named this cluster mysql-server, but I have not installed the mysql
> > database in there yet.
> >
> > Both luci and ricci on the luci server and node1 are running this
> > version:
> >
> > luci-0.12.2-12.el5.centos.1
> > ricci-0.12.2-12.el5.centos.1
> >
> > Do you think this version has the problem as well?
> > Thanks for your help.
> >
> > On 24 September 2010 15:33, Ben Turner <bturner at redhat.com> wrote:
> >
> > There is an issue with ricci timeouts that was fixed recently:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=564490
> >
> > I'm not sure, but you may be hitting that bug. Symptoms include: luci
> > isn't able to get the status from the node, timeouts when querying
> > ricci, etc. The fix should be released with 5.6.
> >
> > On the mysql service there are some options that you need to set. Here
> > are all the options available to that agent:
> >
> > mysql -- Defines a MySQL database server
> >
> > Attribute: Description
> > config_file: Define configuration file
> > listen_address: Define an IP address for MySQL server. If the address
> >   is not given, then the first IP address from the service is taken.
> > mysqld_options: Other command-line options for mysqld
> > name: Name
> > ref: Reference to existing mysql resource in the resources section
> > service_name: Inherit the service name
> > shutdown_wait: Wait X seconds for correct end of service shutdown
> > startup_wait: Wait X seconds for correct end of service startup
> > __enforce_timeouts: Consider a timeout for operations as fatal
> > __failure_expire_time: Amount of time before a failure is forgotten
> > __independent_subtree: Treat this and all children as an independent subtree
> > __max_failures: Maximum number of failures before returning a failure to a status check
> >
> > If I recall correctly you may need to tweak:
> >
> > shutdown_wait: Wait X seconds for correct end of service shutdown
> > startup_wait: Wait X seconds for correct end of service startup
> >
> > There can be problems relocating the DB if it takes too long to
> > start/shutdown. If you are having problems relocating with luci it may
> > be a good idea to test with:
> >
> > # clusvcadm -r <service> -m <member>
> >
> > -Ben
> >
> > ----- "fosiul alam" <expertalert at gmail.com> wrote:
> >
> > > Hi,
> > > I have a 4-node cluster. It was running fine, but today one node is
> > > giving trouble.
> > >
> > > From the luci GUI interface, when I try to relocate a service onto
> > > this node, or relocate from this node to another node, the GUI shows:
> > >
> > > Unable to retrieve batch 1908047789 status from
> > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> > > Starting cluster service "httpd1" on node "http1.domain.local" -- You
> > > will be redirected in 5 seconds.
> > > also:
> > >
> > > The ricci agent for this node is unresponsive. Node-specific
> > > information is not available at this time.
> > >
> > > But ricci is running on the problematic node:
> > > ricci 7324 0.0 0.1 58876 2932 ? S
> > >
> > > There is no firewall running:
> > >
> > > iptables -L
> > > Chain INPUT (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain FORWARD (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain OUTPUT (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain RH-Firewall-1-INPUT (0 references)
> > > target prot opt source destination
> > >
> > > Port 11111 is listening:
> > >
> > > netstat -an | grep 11111
> > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> > >
> > > But still ricci is very unstable, and I can't relocate any service
> > > onto this node or away from this node.
> > > From the problematic node, if I type clustat:
> > >
> > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> > > Member Status: Quorate
> > >
> > > Member Name / ID / Status
> > > beaver.xxx.local / 1 / Online, rgmanager  (luci is running from this server)
> > > publicdns1.xxxx.local / 2 / Online, rgmanager
> > > http1.xxxx.local / 3 / Online, Local, rgmanager
> > > mail01.xxxxx.local / 4 / Online, rgmanager
> > >
> > > Service Name / Owner (Last) / State
> > > service:httpd1 / mail01.xxxx.local / started
> > > service:mysql-server / http1.xxxx.local / started  <-- this is the problematic node
> > > service:public-dns / publicdns1.xxxxxx.local / started
> > >
> > > I can't move the service mysql-server from this node, or relocate any
> > > service onto this node. I am very confused.
> > >
> > > What shall I do to fix this issue?
> > >
> > > Thanks for your advice.
> > >
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
>
> ------------------------------
>
> Message: 3
> Date: Mon, 27 Sep 2010 19:05:06 +0200
> From: brem belguebli
> To: linux clustering
> Subject: Re: [Linux-cluster] problem with quorum at cluster boot
> Message-ID:
> Content-Type: text/plain; charset="iso-8859-1"
>
> The configuration you are trying to build -- 2 cluster nodes (1 vote each)
> plus a quorum disk with 1 vote, making expected_votes = 3 -- must remain up
> if you lose 1 of the members, as long as the remaining node still accesses
> the quorum disk, because there are still 2 active votes (1 remaining node
> + 1 quorum disk), and 2 > expected_votes/2.
>
> The quorum (majority) must be strictly greater than expected_votes/2
> (51% or more) in order for service to continue. (A configuration sketch
> follows below this message excerpt.)
>
> 2010/9/27 Bennie R Thomas
>
> > Try setting your expected votes to 2 or 1.
> >
> > Your cluster is hanging with one node because it wants 3 votes.
> >
> > From: Brem Belguebli / To: linux clustering / Date: 09/25/2010 10:30 AM
> > Subject: Re: [Linux-cluster] problem with quorum at cluster boot
> >
> > On Fri, 2010-09-24 at 12:52 -0400, Jason_Henderson at Mitel.com wrote:
> >
> > > I think you still need two_node="1" in your conf file if you want a
> > > single node to become quorate.
> >
> > two_node="1" is only valid if you do not have a quorum disk.
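For illustration, a minimal cluster.conf vote layout matching the two-node-plus-qdisk setup discussed above -- a sketch only; node names, the qdisk label, and all values are invented, not taken from any poster's configuration:

<!-- expected_votes = 2 node votes + 1 qdisk vote = 3; quorum is then 2 -->
<cman expected_votes="3"/>
<clusternodes>
  <clusternode name="node1.example.com" nodeid="1" votes="1"/>
  <clusternode name="node2.example.com" nodeid="2" votes="1"/>
</clusternodes>
<!-- the quorum disk contributes its vote as long as one node can still reach it -->
<quorumd label="myqdisk" votes="1"/>

With expected_votes="3", quorum is 2, so one surviving node plus the quorum disk stays quorate; and as noted above, two_node="1" should not be combined with a quorum disk.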
> > > linux-cluster-bounces at redhat.com wrote on 09/24/2010 12:38:17 PM:
> > >
> > > > Hello,
> > > >
> > > > I have a 2-node cluster with a qdisk quorum partition;
> > > > each node has 1 vote and the qdisk has 1 vote too; in cluster.conf
> > > > I have this explicit declaration:
> > > >
> > > > When I have both nodes active, cman_tool status tells me this:
> > > >
> > > > Version: 6.1.0
> > > > Nodes: 2
> > > > Expected votes: 3
> > > > Quorum device votes: 1
> > > > Total votes: 3
> > > > Node votes: 1
> > > > Quorum: 2
> > > >
> > > > Then, if I power off a node these values, as expected, change this way:
> > > > Nodes: 1
> > > > Total votes: 2
> > > >
> > > > and the cluster is still quorate and functional.
> > > >
> > > > The problem is if I power off both nodes and then power on only one
> > > > of them: in this case the single node does not become quorate and
> > > > the cluster does not start. I have to power on both nodes to have
> > > > the cluster (and the services on the cluster) working.
> > > >
> > > > I'd like the cluster to work (and boot) even with a single node
> > > > (i.e., if one of the nodes has a hardware failure and is down, I
> > > > still want to be able to reboot the working node and have it bring
> > > > up the cluster correctly).
> > > >
> > > > Any hints? (Thanks for reading all this.)
> > > >
> > > > --
> > > > bye,
> > > > emilio
>
> ------------------------------
>
> Message: 4
> Date: Mon, 27 Sep 2010 18:31:31 +0100
> From: fosiul alam
> To: linux clustering
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi,
> Thanks for your advice. Currently I have:
>
> luci-0.12.2-12.el5.centos.1
> ricci-0.12.2-12.el5.centos.1
>
> Are these the same RPMs as:
>
> luci-0.12.2-12.el5_5.4.i386.rpm ?
> ricci-0.12.2-12.el5_5.4.i386.rpm ?
>
> Thanks
>
> On 27 September 2010 17:55, Paul M. Dyer wrote:
> [...]
>
> ------------------------------
>
> Message: 5
> Date: Mon, 27 Sep 2010 18:37:44 +0100
> From: fosiul alam
> To: linux clustering
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi, in addition to my previous email, have a look at this one.
>
> From http1 (where I am trying to relocate a service):
>
> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
> Member http1.xxxx.local trying to enable service:httpd1...Success
> Warning: service:httpd1 is now running on mail01.xxxx.local
>
> So it reports Success, but it actually is not: the service ended up on
> mail01 instead of http1.
>
> Thanks again
>
> On 27 September 2010 18:31, fosiul alam wrote:
> [...]
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <https://www.redhat.com/archives/linux-cluster/attachments/20100927/4101fdf9/attachment.html>
>
> ------------------------------
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> End of Linux-cluster Digest, Vol 77, Issue 23
> *********************************************

--
Santosh

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

From Fredrik.Hudner at evry.com Fri May 10 06:58:08 2013
From: Fredrik.Hudner at evry.com (Fredrik Hudner)
Date: Fri, 10 May 2013 06:58:08 +0000
Subject: [Linux-cluster] fence_vmware_soap sslv3 alert certificate unknown
In-Reply-To: <518B4E56.9030809@redhat.com>
References: <64275F83AE588D4EBF165BF5D8783D41042D3B@ccdex003> <5183C624.1000807@redhat.com> <64275F83AE588D4EBF165BF5D8783D41042DC4@ccdex003> <518B4E56.9030809@redhat.com>
Message-ID: <64275F83AE588D4EBF165BF5D8783D41043F4F@ccdex003>

-----Original message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On behalf of Marek Grac
Sent: 9 May 2013 09:21
To: linux-cluster at redhat.com
Subject: Re: [Linux-cluster] fence_vmware_soap sslv3 alert certificate unknown

On 05/03/2013 04:25 PM, Fredrik Hudner wrote:
> [...]

Yes, you are right (there is no need to set a port if it is the default one and -z/--ssl is set). Which version of VMware do you have? Take a look at https://access.redhat.com/site/articles/2860 to check that it is supported and that no work-arounds are needed.

m,

Thanks for the reply, Marek. I think I have got around the login problem with vCenter: I investigated its log files, and it seems I have to provide an SSL certificate to get access. So now I have to learn how to convert a Windows certificate into one usable on Linux, but I think I understand that part.

Many thanks
/Fredrik

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
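On the "convert a Windows cert" step above: certificates exported from Windows are typically PKCS#12 bundles (.pfx/.p12) or DER-encoded (.cer), while most Linux tools expect PEM. A sketch with openssl -- the file names are invented, and which form vCenter actually hands out depends on the installation:

# PKCS#12 bundle (certificate + private key) to PEM; -nodes leaves the key unencrypted
openssl pkcs12 -in exported.pfx -out cert.pem -nodes

# DER-encoded certificate to PEM
openssl x509 -inform der -in exported.cer -out cert.pem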
From delphine.ramalingom at univ-reunion.fr Mon May 13 06:28:53 2013
From: delphine.ramalingom at univ-reunion.fr (Delphine Ramalingom)
Date: Mon, 13 May 2013 10:28:53 +0400
Subject: [Linux-cluster] error clusvcadm
Message-ID: <51908825.2010902@univ-reunion.fr>

Hello,

I have a problem and I need some help.

Our Linux cluster was stopped for maintenance in the server room, but an error occurred during the shutdown procedure:

Local machine disabling service:HA_MGMT...Failure

The cluster was then powered off electrically. Since the restart, I have not succeeded in restarting the services with clusvcadm. I get this message:

clusvcadm -e HA_MGMT
Local machine trying to enable service:HA_MGMT...Aborted; service failed
and
startFilesystem: Could not match LABEL=postfix with a real device

Do you have a solution for me?

Thanks a lot in advance.

Regards
Delphine

From torajveersingh at gmail.com Mon May 13 06:37:20 2013
From: torajveersingh at gmail.com (Rajveer Singh)
Date: Mon, 13 May 2013 12:07:20 +0530
Subject: [Linux-cluster] error clusvcadm
In-Reply-To: <51908825.2010902@univ-reunion.fr>
References: <51908825.2010902@univ-reunion.fr>
Message-ID:

Hi Delphine,

It seems there is some filesystem crash. Please share your /var/log/messages and /etc/cluster/cluster.conf to help you further.

Regards,
Rajveer Singh

On Mon, May 13, 2013 at 11:58 AM, Delphine Ramalingom <delphine.ramalingom at univ-reunion.fr> wrote:
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From delphine.ramalingom at univ-reunion.fr Mon May 13 07:32:27 2013
From: delphine.ramalingom at univ-reunion.fr (Delphine Ramalingom)
Date: Mon, 13 May 2013 11:32:27 +0400
Subject: [Linux-cluster] error clusvcadm
In-Reply-To:
References: <51908825.2010902@univ-reunion.fr>
Message-ID: <5190970B.7030908@univ-reunion.fr>

Hi,

This is the cluster.conf:

[root at titan0 11:29:14 ~]# cat /etc/cluster/cluster.conf
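For the "Could not match LABEL=postfix with a real device" error above: the filesystem resource looks the label up at mount time, so a first check is whether any block device still carries that label after the power cycle. A sketch only -- device names are invented, and this assumes an ext2/3/4 volume:

# list known filesystem labels and see whether "postfix" is among them
blkid | grep -i postfix

# if the device exists but has lost its label, restore it (ext2/3/4)
e2label /dev/mapper/vg_ha-lv_postfix postfix

If the underlying device is missing entirely (for example, SAN or multipath paths that did not come back after the electrical shutdown), the label cannot be matched until the device itself reappears, which is consistent with Rajveer's request to check /var/log/messages.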