From andrew at beekhof.net Tue Sep 1 06:33:57 2009 From: andrew at beekhof.net (Andrew Beekhof) Date: Tue, 1 Sep 2009 08:33:57 +0200 Subject: [Linux-cluster] Problem with Pacemaker and Corosync In-Reply-To: <6e4c20e70908310612o120933cema2609513f13be78c@mail.gmail.com> References: <6e4c20e70908310612o120933cema2609513f13be78c@mail.gmail.com> Message-ID: try turning on debug, there's nothing in the logs that indicate why the lrmd is having a problem On Mon, Aug 31, 2009 at 3:12 PM, Thomas Georgiou wrote: > Hi, > > I have installed Pacemaker 1.0.5, Corosync 1.0.0, and Openais 1.0.1 > from source according to the Clusterlabs docs. ?However, when I go to > start corosync/pacemaker, I get error messages pertaining to lrm and > cibadmin -Q hangs and complains that the remote node is not available. > ?Attached is the corosync log. > > Any ideas? > > Thomas Georgiou > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From ccaulfie at redhat.com Tue Sep 1 07:00:53 2009 From: ccaulfie at redhat.com (Christine Caulfield) Date: Tue, 01 Sep 2009 08:00:53 +0100 Subject: [Linux-cluster] Use alternate network interfaces for heartbeat in RHCS In-Reply-To: <29ae894c0908281038i20f408f7gf14483ad6f73ca5e@mail.gmail.com> References: <29ae894c0908280710w3f999b0as1438451bf5869a8e@mail.gmail.com> <4A97E67A.1030506@redhat.com> <29ae894c0908280724p4fc8cbe4g6943a3138f278c1b@mail.gmail.com> <4A97EF74.6090904@redhat.com> <29ae894c0908281038i20f408f7gf14483ad6f73ca5e@mail.gmail.com> Message-ID: <4A9CC6A5.3090502@redhat.com> On 28/08/09 18:38, brem belguebli wrote: > Hi > the clusternodes defined in cluster.conf are : > > node1.mydomain > node2.mydomain > > which correpond to the bond0 interfaces on both nodes. > > I expect to use node1-hb and node2-hb as heartbeat interfaces. (bond1) > > I may have misunderstood something, but are you telling me that I have > to use the nodeX-hb as clusternodes in cluster.conf ? Yes, that's exactly what you need to do. Chrissie > Brem > > 2009/8/28 Christine Caulfield > > > On 28/08/09 15:24, brem belguebli wrote: > > Hi Chrissie, > Are you pointing me to the paragraph "what's the right way to > ....eth0 ?" > I've tried this at first adding adding a suffix to the > interfaces but > nothing happened. Suffix -p may be hardcoded (I've used -hb) > Here's an outputof my /etc/hosts (identical on both nodes): > 10.146.15.184 node1 node1.mydomain > 10.146.15.175 node2 node2.mydomain > 192.168.84.50 node1-hb > 192.168.84.51 node2-hb > Still using bond0 .... > Brem > > > > The suffix isn't hard-coded or anything to do with cman really, it's > just a way of distinguishing interfaces. > > You need to edit cluster.conf to tell it to use the different name. > > If you get despertate then you can always put the IP address in > cluster.conf, but the output from cman_tool nodes doesn't look as nice! 
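For illustration, the change being described amounts to a clusternodes section along these lines, using the -hb names from this thread; the nodeids and votes are placeholders and this is not a complete, tested cluster.conf:

  <clusternodes>
    <!-- cman resolves these names, so cluster membership and heartbeat
         traffic uses the 192.168.84.x bond1 addresses from /etc/hosts -->
    <clusternode name="node1-hb" nodeid="1" votes="1"/>
    <clusternode name="node2-hb" nodeid="2" votes="1"/>
  </clusternodes>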
> > > Chrissie > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > ------------------------------------------------------------------------ > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From ccaulfie at redhat.com Tue Sep 1 07:01:38 2009 From: ccaulfie at redhat.com (Christine Caulfield) Date: Tue, 01 Sep 2009 08:01:38 +0100 Subject: [Linux-cluster] Use alternate network interfaces for heartbeat in RHCS In-Reply-To: <29ae894c0908310056u68c1272dud8babe5ac3542f9d@mail.gmail.com> References: <29ae894c0908280710w3f999b0as1438451bf5869a8e@mail.gmail.com> <4A97E67A.1030506@redhat.com> <29ae894c0908280724p4fc8cbe4g6943a3138f278c1b@mail.gmail.com> <4A97EF74.6090904@redhat.com> <4A97F090.9080508@redhat.com> <29ae894c0908281045l6c93c7dbo2d0e4f27c5bab14e@mail.gmail.com> <29ae894c0908310056u68c1272dud8babe5ac3542f9d@mail.gmail.com> Message-ID: <4A9CC6D2.1000408@redhat.com> On 31/08/09 08:56, brem belguebli wrote: > Hi, > I was wondering if there is a way with cman to configure 2 heartbeat > channels (let's say my prod bond0 and my outband bond1) as it seems to > be possible with openais and their redundant ring interfaces configuration. > Brem > Yes, there's an item about it on the FAQ page I mentioned earlier. I hope it's more complete the the last one! Chrissie > 2009/8/28, brem belguebli >: > > Ok, > > It answers my last question. > > I have been confused by some mention somewhere in a post or doc > saying that cman has a built-in algorithm to determine, just by > adding entries in /etc/hosts, the right interfaces to use . > > Brem > > > > 2009/8/28 Christine Caulfield > > > On 28/08/09 15:53, Christine Caulfield wrote: > > On 28/08/09 15:24, brem belguebli wrote: > > Hi Chrissie, > Are you pointing me to the paragraph "what's the right > way to ....eth0 ?" > I've tried this at first adding adding a suffix to the > interfaces but > nothing happened. Suffix -p may be hardcoded (I've used -hb) > Here's an outputof my /etc/hosts (identical on both nodes): > 10.146.15.184 node1 node1.mydomain > 10.146.15.175 node2 node2.mydomain > 192.168.84.50 node1-hb > 192.168.84.51 node2-hb > Still using bond0 .... > Brem > > > > The suffix isn't hard-coded or anything to do with cman > really, it's > just a way of distinguishing interfaces. > > You need to edit cluster.conf to tell it to use the > different name. > > If you get despertate then you can always put the IP address in > cluster.conf, but the output from cman_tool nodes doesn't > look as nice! > > > I haven't read that article in detail before but you're right, > it makes no mention of changing cluster.conf! > > I've fixed that now, thank you. 
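For reference, the FAQ item in question describes a second ring roughly like the sketch below, one <altname> per node in cluster.conf. The names, port and multicast address here are placeholders, and since this feature was not fully supported everywhere at the time, the wiki page and the installed cman version should be checked before relying on it:

  <clusternode name="node1" nodeid="1" votes="1">
    <!-- second, redundant ring over the out-of-band network -->
    <altname name="node1-hb" port="6809" mcast="239.192.99.2"/>
  </clusternode>
  <clusternode name="node2" nodeid="2" votes="1">
    <altname name="node2-hb" port="6809" mcast="239.192.99.2"/>
  </clusternode>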
> > > Chrissie > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > ------------------------------------------------------------------------ > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From brem.belguebli at gmail.com Tue Sep 1 07:53:17 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Tue, 1 Sep 2009 09:53:17 +0200 Subject: [Linux-cluster] Use alternate network interfaces for heartbeat in RHCS In-Reply-To: <4A9CC6D2.1000408@redhat.com> References: <29ae894c0908280710w3f999b0as1438451bf5869a8e@mail.gmail.com> <4A97E67A.1030506@redhat.com> <29ae894c0908280724p4fc8cbe4g6943a3138f278c1b@mail.gmail.com> <4A97EF74.6090904@redhat.com> <4A97F090.9080508@redhat.com> <29ae894c0908281045l6c93c7dbo2d0e4f27c5bab14e@mail.gmail.com> <29ae894c0908310056u68c1272dud8babe5ac3542f9d@mail.gmail.com> <4A9CC6D2.1000408@redhat.com> Message-ID: <29ae894c0909010053g15ef0c30y513d5501b776bbf2@mail.gmail.com> Hello Chrissie, I couldn't find the item in the doc (the CMAN FAQ). Brem 2009/9/1, Christine Caulfield : > > On 31/08/09 08:56, brem belguebli wrote: > >> Hi, >> I was wondering if there is a way with cman to configure 2 heartbeat >> channels (let's say my prod bond0 and my outband bond1) as it seems to >> be possible with openais and their redundant ring interfaces >> configuration. >> Brem >> >> > Yes, there's an item about it on the FAQ page I mentioned earlier. I hope > it's more complete the the last one! > > Chrissie > > 2009/8/28, brem belguebli > >: >> >> Ok, >> >> It answers my last question. >> >> I have been confused by some mention somewhere in a post or doc >> saying that cman has a built-in algorithm to determine, just by >> adding entries in /etc/hosts, the right interfaces to use . >> >> Brem >> >> >> >> 2009/8/28 Christine Caulfield > > >> >> On 28/08/09 15:53, Christine Caulfield wrote: >> >> On 28/08/09 15:24, brem belguebli wrote: >> >> Hi Chrissie, >> Are you pointing me to the paragraph "what's the right >> way to ....eth0 ?" >> I've tried this at first adding adding a suffix to the >> interfaces but >> nothing happened. Suffix -p may be hardcoded (I've used >> -hb) >> Here's an outputof my /etc/hosts (identical on both nodes): >> 10.146.15.184 node1 node1.mydomain >> 10.146.15.175 node2 node2.mydomain >> 192.168.84.50 node1-hb >> 192.168.84.51 node2-hb >> Still using bond0 .... >> Brem >> >> >> >> The suffix isn't hard-coded or anything to do with cman >> really, it's >> just a way of distinguishing interfaces. >> >> You need to edit cluster.conf to tell it to use the >> different name. >> >> If you get despertate then you can always put the IP address in >> cluster.conf, but the output from cman_tool nodes doesn't >> look as nice! >> >> >> I haven't read that article in detail before but you're right, >> it makes no mention of changing cluster.conf! >> >> I've fixed that now, thank you. 
>> >> >> Chrissie >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> >> >> >> >> ------------------------------------------------------------------------ >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Alain.Moulle at bull.net Tue Sep 1 09:08:45 2009 From: Alain.Moulle at bull.net (Alain.Moulle) Date: Tue, 01 Sep 2009 11:08:45 +0200 Subject: [Linux-cluster] cluster.conf in another place ? Message-ID: <4A9CE49D.8020504@bull.net> Hi, I have this cman version : cman-3.0.0-15.rc1.fc11.x86_64 is it possible to put the cluster.conf in another place than /etc/cluster/. and if so, how can I tell it to cman ? Thanks Alain -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdinitto at redhat.com Tue Sep 1 09:15:01 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Tue, 01 Sep 2009 11:15:01 +0200 Subject: [Linux-cluster] cluster.conf in another place ? In-Reply-To: <4A9CE49D.8020504@bull.net> References: <4A9CE49D.8020504@bull.net> Message-ID: <1251796501.339.42.camel@cerberus.int.fabbione.net> On Tue, 2009-09-01 at 11:08 +0200, Alain.Moulle wrote: > Hi, > I have this cman version : > cman-3.0.0-15.rc1.fc11.x86_64 > is it possible to put the cluster.conf in another place > than /etc/cluster/. > and if so, how can I tell it to cman ? > Thanks > Alain The exact same way I already explained to you before: http://www.redhat.com/archives/linux-cluster/2009-August/msg00260.html Cheers Fabio From jakov.sosic at srce.hr Tue Sep 1 09:19:54 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Tue, 1 Sep 2009 11:19:54 +0200 Subject: [Linux-cluster] How to disable node? In-Reply-To: <4A9C3EFF.1000006@nerd.com> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3EFF.1000006@nerd.com> Message-ID: <20090901111954.57a7a8ac@nb-jsosic> On Mon, 31 Aug 2009 14:22:07 -0700 Rick Stevens wrote: > I don't see that there's anything to fix. You had a three-node > cluster so you needed a majority of nodes up to maintain a quorum. > One node died, killing quorum and thus stopping the cluster Nope. Quorum is still there. I have 3 nodes with qdisk, and two nodes remained in quorum. Then, I had to reboot the nodes because of some multipath/scsi changes, and after that, they only try to fence the missing node, and they can't get to it's fencing device, and rgmanager is not showing in my output. Quorum is regained after both nodes restarted. So, bassically what I mean is that you cannot start cluster with one node and it's fence device missing, although you have gained quorum. 2 nodes and qdisk is much more than I need - I need only one node + qdisk for cluster to function properly. > As a three-node cluster, it's dead. > It can't be run as a three-node cluster until the third node is > fixed. Those are the rules. Well this is the part that I don't like :) Why can't I for example put 10 missing nodes in my cluster.conf - if other nodes don't gain quorum, they shouldn't start services and that's it, but if they do gain quorum, what's the point of constantly trying to fence missing fence device of missing node?! 
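For reference, the manual fallback suggested further down in this thread looks roughly like the sketch below in cluster.conf: a second fence method per node that is only tried when the primary (iLO) method fails because the whole machine, including its management processor, is gone. All names and credentials are placeholders:

  <clusternode name="node3" nodeid="3" votes="1">
    <fence>
      <method name="1">
        <device name="node3-ilo"/>
      </method>
      <!-- last resort: blocks until an operator acknowledges -->
      <method name="2">
        <device name="manual" nodename="node3"/>
      </method>
    </fence>
  </clusternode>

  <fencedevices>
    <fencedevice agent="fence_ilo" name="node3-ilo" hostname="node3-ilo.mydomain" login="admin" passwd="secret"/>
    <fencedevice agent="fence_manual" name="manual"/>
  </fencedevices>

With fence_manual the fence operation waits until an operator confirms with fence_ack_manual, so it should never be the only method, just the one that lets a quorate cluster be brought back by hand when a node and its iLO have physically left the building.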
> A two node cluster requires special handling of things to prevent the > dread split-brain situation, which is what two_node does. Running the > surviving nodes as a two-node cluster is, by definition, a > reconfiguration. I'd say simply requiring you to set two_node is > pretty damned innocuous to let you run a dead (ok, mortally wounded) > cluster. > > If you pulled a drive out of a RAID6--thus degrading it to a RAID5-- > would you complain because it didn't remain a RAID6? First of all, RAID6 without one disk _IS NOT_ RAID5. In terms of redundancy they are the same, but on disk data is not the same, so that two are not equal. And yes - I would complain if I had to _REBUILD_ degraded array to RAID5. And until it's rebuilded, if the array was unavailable - that would be a major issue - what's the point of redundancy then if I loose whole array/cluster when one unit fails? But with RAID6 I don't have to. As a matter of fact, I can loose one more drive, and leave it in that state until I buy new two drives and hotplug them into the chassis. EG.: until quorum is maintained, array and data in it are not jeopardized. With RHCS that should be the same, shouldn't it? I'm just asking, why can't I leave the missing node in the configuration, which will be active once it returns from dealer? Why do I have to reconfig the cluster? That is not a good behaviour IMHO - there should be some command to mark node as missing, and the cluster should work fine with two nodes + qdisk because it has quorum. Isn't that the point of quorum? What's the point of cluster, if one node cannot malfunction, and be taken away to repairs, without the need of setting up a new cluster? In your RAID6 configuration, it's like taking away one disk breaks the array until you rebuild it... -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From jakov.sosic at srce.hr Tue Sep 1 09:21:57 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Tue, 1 Sep 2009 11:21:57 +0200 Subject: [Linux-cluster] How to disable node? In-Reply-To: <4A9C421A.8020003@nerd.com> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3EFF.1000006@nerd.com> <4A9C421A.8020003@nerd.com> Message-ID: <20090901112157.1698207c@nb-jsosic> On Mon, 31 Aug 2009 14:35:22 -0700 Rick Stevens wrote: > On re-reading my response, it seemed unintentionally harsh. I didn't > mean any disrespect, sir. I was simply questioning the concept that a > reconfiguration of a cluster shouldn't be required when, indeed the > cluster was being reconfigured. The other response I saw to this > thread regarding planning, and things such as last-man-standing was > much better worded. > > My apologies if it seemed I was jumping down your throat. I wasn't. Come on, no problem at all. We are just discussing, we are not shooting each other with rifles :) If I deserve harsh words, be free to land them on me :) PS.: There is quorum, there is qdisk, so last-man-standing issue is solved, planned whatever. Maybe I wasn't makin' myself clear... -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From jakov.sosic at srce.hr Tue Sep 1 09:26:48 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Tue, 1 Sep 2009 11:26:48 +0200 Subject: [Linux-cluster] How to disable node? 
In-Reply-To: <4A9C3FEE.3000309@wol.de> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3FEE.3000309@wol.de> Message-ID: <20090901112648.3e32c88d@nb-jsosic> On Mon, 31 Aug 2009 23:26:06 +0200 "Marc - A. Dahlhaus" wrote: > I think your so called 'limitation' is more related to mistakes that > was made during the planing phase of your cluster setup than to > missing functionality. Yeah, and what can be that mistake? I'll feel free to quote John: > The best course of action to take would be to remove that missing > node from your cluster configuration using conga, > system-config-cluster, or by hand > editing /etc/cluster/cluster.conf. As long as it exists in the > configuration then the other nodes will expect it to join the > cluster, and they will attempt to fence it when they try to join the > cluster and see it is not present. Where's the issue with my config there? It seems to be an issue with RHCS misbehaving with one fence device missing. > Please take a look at the qdisk manpage and aditionaly to the cman > faq sections about tiebraker, qdisks and especially the last man > standing setup... qdisk already set up. I never said I lost quorum. I have quorum. But without one node missing completely, with it's fence device, rgmanager just doesn't start up the services, and is not listed in clustat. I repeat, I HAVE GAINED QUORUM, and I have qdisk for the case two out of three are out. -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From kkovachev at varna.net Tue Sep 1 09:44:06 2009 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Tue, 1 Sep 2009 12:44:06 +0300 Subject: [Linux-cluster] How to disable node? In-Reply-To: <20090901112648.3e32c88d@nb-jsosic> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3FEE.3000309@wol.de> <20090901112648.3e32c88d@nb-jsosic> Message-ID: <20090901093910.M73501@varna.net> Hi, On Tue, 1 Sep 2009 11:26:48 +0200, Jakov Sosic wrote > On Mon, 31 Aug 2009 23:26:06 +0200 > "Marc - A. Dahlhaus" wrote: > > > I think your so called 'limitation' is more related to mistakes that > > was made during the planing phase of your cluster setup than to > > missing functionality. > > Yeah, and what can be that mistake? I'll feel free to quote John: > > > The best course of action to take would be to remove that missing > > node from your cluster configuration using conga, > > system-config-cluster, or by hand > > editing /etc/cluster/cluster.conf. As long as it exists in the > > configuration then the other nodes will expect it to join the > > cluster, and they will attempt to fence it when they try to join the > > cluster and see it is not present. > > Where's the issue with my config there? It seems to be an issue with > RHCS misbehaving with one fence device missing. it is not just 'one fence device missing' it is the only fence device that could fence that node, so if you add fence manual as a last resort, you will be able to bring back your cluster to live in such cases > > > Please take a look at the qdisk manpage and aditionaly to the cman > > faq sections about tiebraker, qdisks and especially the last man > > standing setup... > > qdisk already set up. I never said I lost quorum. I have quorum. But > without one node missing completely, with it's fence device, rgmanager > just doesn't start up the services, and is not listed in clustat. 
I > repeat, I HAVE GAINED QUORUM, and I have qdisk for the case two out of > three are out. > > -- > | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | > ================================================================= > | start fighting cancer -> http://www.worldcommunitygrid.org/ | > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From mad at wol.de Tue Sep 1 10:29:36 2009 From: mad at wol.de (Marc - A. Dahlhaus [ Administration | Westermann GmbH ]) Date: Tue, 01 Sep 2009 12:29:36 +0200 Subject: [Linux-cluster] How to disable node? In-Reply-To: <20090901112648.3e32c88d@nb-jsosic> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3FEE.3000309@wol.de> <20090901112648.3e32c88d@nb-jsosic> Message-ID: <1251800976.10463.43.camel@marc> Am Dienstag, den 01.09.2009, 11:26 +0200 schrieb Jakov Sosic: > On Mon, 31 Aug 2009 23:26:06 +0200 > "Marc - A. Dahlhaus" wrote: > > > I think your so called 'limitation' is more related to mistakes that > > was made during the planing phase of your cluster setup than to > > missing functionality. > > Yeah, and what can be that mistake? I'll feel free to quote John: > > > The best course of action to take would be to remove that missing > > node from your cluster configuration using conga, > > system-config-cluster, or by hand > > editing /etc/cluster/cluster.conf. As long as it exists in the > > configuration then the other nodes will expect it to join the > > cluster, and they will attempt to fence it when they try to join the > > cluster and see it is not present. > > Where's the issue with my config there? It seems to be an issue with > RHCS misbehaving with one fence device missing. It isn't misbehaving at all here. The job of RHCS in this case is to save your data against failure. If fenced can't fence a node successfully, RHCS will wait in stalled mode (because it doesn't get a successful response from the fence-agent) until someone who knows what he is doing comes around to fix up the problem. If it wouldn't do it that way a separated node could eat up your data. It is the job of fenced to stop all activities until fencing is in a working shape again. This behaviour is perfectly fine IMO... The mistakes in the planing phase of your cluster setup are: - You use system dependent fencing like "HP iLO" wich will be missing if your system is missing and no independent fencing like an APC PowerSwitch... Think about a power purge which kills booth of your PSU on a system, a system dependent management device would be missing from your network in this case leading to exactly the problem you're faced with. - You haven't read through the related documentation (read on and you spot to what i am referring to). > > Please take a look at the qdisk manpage and aditionaly to the cman > > faq sections about tiebraker, qdisks and especially the last man > > standing setup... > > qdisk already set up. I never said I lost quorum. I have quorum. But > without one node missing completely, with it's fence device, rgmanager > just doesn't start up the services, and is not listed in clustat. I > repeat, I HAVE GAINED QUORUM, and I have qdisk for the case two out of > three are out. Your mistake is that you started fenced in normal mode in which it will fence all nodes that it can't reach to get around a possible split-brain scenario. 
You need to start fenced in "clean start" without fencing mode (read the fenced manpage as it is documented there) because you know everything is right. RHCS can't on it's own know anything about, for it the missing node is separated on network/link layer and could be eating up all your data just fine until it gets fenced. As long as the missing node isn't joining it will not get fenced by the other nodes in clean start node of fenced so it will be your way out of this problem. Marc From esggrupos at gmail.com Tue Sep 1 10:38:54 2009 From: esggrupos at gmail.com (ESGLinux) Date: Tue, 1 Sep 2009 12:38:54 +0200 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. Message-ID: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> Hi All, First, sorry if this can be considered Off Topic but my first aproach was using clustering to my problem so I suposse you could have the same problem. I have 2 computers running JBoss and I need to share a directory for the cache (I use OSCache). First I try to use a NFS service on a Red hat Cluster ( I use this reference http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Configuration_Example_-_NFS_Over_GFS/index.html ) My problem is that the performance with this approach is too much low for my application. So I decided to make each machine use its own cache dir and with rsync keep this dirs synchronized. I don?t know if what I have done is a stupidity or It?s a good solution, so what do you think about it?, Do you know any way to do what I need Thanks in advance. ESG -------------- next part -------------- An HTML attachment was scrubbed... URL: From jakov.sosic at srce.hr Tue Sep 1 10:48:51 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Tue, 1 Sep 2009 12:48:51 +0200 Subject: [Linux-cluster] How to disable node? In-Reply-To: <1251800976.10463.43.camel@marc> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3FEE.3000309@wol.de> <20090901112648.3e32c88d@nb-jsosic> <1251800976.10463.43.camel@marc> Message-ID: <20090901124851.7daf2c75@pc-jsosic.srce.hr> On Tue, 01 Sep 2009 12:29:36 +0200 "Marc - A. Dahlhaus [ Administration | Westermann GmbH ]" wrote: > It isn't misbehaving at all here. > > The job of RHCS in this case is to save your data against failure. > > If fenced can't fence a node successfully, RHCS will wait in stalled > mode (because it doesn't get a successful response from the > fence-agent) until someone who knows what he is doing comes around to > fix up the problem. If it wouldn't do it that way a separated node > could eat up your data. It is the job of fenced to stop all > activities until fencing is in a working shape again. > > This behaviour is perfectly fine IMO... Isn't that the mission of quorum? For example - if you have qourum you will run services, if you don't have quorum you won't. If there is a qdisk and single of three nodes is missing, it can't have quorum - so it can't run services? OK I understand that this is the safer way... But that's why I was asking in the first place for a command to flag node as missing completely, so that I can avoid all reconfigurations. Reconfiguration while a node missing will trigger odd behavior when node comes back - node will be fenced constantly because it has wrong config version. > - You use system dependent fencing like "HP iLO" wich will be missing > if your system is missing and no independent fencing like an > APC PowerSwitch... Yes but that are the only devices I have available for fencing. 
So that is the limitation of hardware, on which I don't have any influence in this case. I already know that fence devices are my only SPOF currently... But I can't help myself. > Think about a power purge which kills booth of your PSU on a system, > a system dependent management device would be missing from your > network in this case leading to exactly the problem you're faced > with. I will take a look if APC UPS-es have something like killpower for certain ports, if not I will set up false manual fencing to get around this problem. Thank you. > Your mistake is that you started fenced in normal mode in which it > will fence all nodes that it can't reach to get around a possible > split-brain scenario. You need to start fenced in "clean start" > without fencing mode (read the fenced manpage as it is documented > there) because you know everything is right. Adding clean_start again presumes reconfiguring just like removing a node and declaring cluster a two_node, and I wanted to avoid reconfigurations... Thank you very much. -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From robejrm at gmail.com Tue Sep 1 10:59:36 2009 From: robejrm at gmail.com (Juan Ramon Martin Blanco) Date: Tue, 1 Sep 2009 12:59:36 +0200 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. In-Reply-To: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> References: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> Message-ID: <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> On Tue, Sep 1, 2009 at 12:38 PM, ESGLinux wrote: > Hi All, > First, sorry if this can be considered Off Topic but my first aproach was > using clustering to my problem so I suposse you could have the same problem. > > I have 2 computers running JBoss and I need to share a directory for the > cache (I use OSCache). > > First I try to use a NFS service on a Red hat Cluster ( I use this > reference > http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Configuration_Example_-_NFS_Over_GFS/index.html > ) > Do you have a shared storage? If the answer is yes, just use gfs and mount the filesystem on both machines. Greetings, Juanra > > My problem is that the performance with this approach is too much low for > my application. So I decided to make each machine use its own cache dir and > with rsync keep this dirs synchronized. > > I don?t know if what I have done is a stupidity or It?s a good solution, so > what do you think about it?, Do you know any way to do what I need > > Thanks in advance. > > ESG > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From esggrupos at gmail.com Tue Sep 1 11:05:19 2009 From: esggrupos at gmail.com (ESGLinux) Date: Tue, 1 Sep 2009 13:05:19 +0200 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. 
In-Reply-To: <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> References: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> Message-ID: <3128ba140909010405q387b858bk2d73d723f1ecca79@mail.gmail.com> 2009/9/1 Juan Ramon Martin Blanco > > > On Tue, Sep 1, 2009 at 12:38 PM, ESGLinux wrote: > >> Hi All, >> First, sorry if this can be considered Off Topic but my first aproach was >> using clustering to my problem so I suposse you could have the same problem. >> >> I have 2 computers running JBoss and I need to share a directory for the >> cache (I use OSCache). >> >> First I try to use a NFS service on a Red hat Cluster ( I use this >> reference >> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Configuration_Example_-_NFS_Over_GFS/index.html >> ) >> > > Do you have a shared storage? If the answer is yes, just use gfs and mount > the filesystem on both machines. > > Nop, I haven?t but your answer makes me a new question. Can I use GFS directly without making a cluster? I mean can I attach the iSCSI devices for example, and mount a GFS filesystem on it without creating a cluster, and a service asociated to this GFS filesystem? Thanks ESG -------------- next part -------------- An HTML attachment was scrubbed... URL: From mad at wol.de Tue Sep 1 11:11:21 2009 From: mad at wol.de (Marc - A. Dahlhaus [ Administration | Westermann GmbH ]) Date: Tue, 01 Sep 2009 13:11:21 +0200 Subject: [Linux-cluster] How to disable node? In-Reply-To: <20090901124851.7daf2c75@pc-jsosic.srce.hr> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3FEE.3000309@wol.de> <20090901112648.3e32c88d@nb-jsosic> <1251800976.10463.43.camel@marc> <20090901124851.7daf2c75@pc-jsosic.srce.hr> Message-ID: <1251803481.12201.10.camel@marc> Am Dienstag, den 01.09.2009, 12:48 +0200 schrieb Jakov Sosic: > On Tue, 01 Sep 2009 12:29:36 +0200 > "Marc - A. Dahlhaus [ Administration | Westermann GmbH ]" > wrote: > > > It isn't misbehaving at all here. > > > > The job of RHCS in this case is to save your data against failure. > > > > If fenced can't fence a node successfully, RHCS will wait in stalled > > mode (because it doesn't get a successful response from the > > fence-agent) until someone who knows what he is doing comes around to > > fix up the problem. If it wouldn't do it that way a separated node > > could eat up your data. It is the job of fenced to stop all > > activities until fencing is in a working shape again. > > > > This behaviour is perfectly fine IMO... > > Isn't that the mission of quorum? For example - if you have qourum you > will run services, if you don't have quorum you won't. If there is a > qdisk and single of three nodes is missing, it can't have quorum - so > it can't run services? > > OK I understand that this is the safer way... But that's why I was > asking in the first place for a command to flag node as missing > completely, so that I can avoid all reconfigurations. Reconfiguration > while a node missing will trigger odd behavior when node comes back - > node will be fenced constantly because it has wrong config version. > > > > - You use system dependent fencing like "HP iLO" wich will be missing > > if your system is missing and no independent fencing like an > > APC PowerSwitch... > > Yes but that are the only devices I have available for fencing. So that > is the limitation of hardware, on which I don't have any influence in > this case. 
I already know that fence devices are my only SPOF > currently... But I can't help myself. > > > > Think about a power purge which kills booth of your PSU on a system, > > a system dependent management device would be missing from your > > network in this case leading to exactly the problem you're faced > > with. > > I will take a look if APC UPS-es have something like killpower for > certain ports, if not I will set up false manual fencing to get around > this problem. Thank you. Its actually the "APC Switched Rack PDUs" that you should look after. You can get an 8 port device for a small budget... > > Your mistake is that you started fenced in normal mode in which it > > will fence all nodes that it can't reach to get around a possible > > split-brain scenario. You need to start fenced in "clean start" > > without fencing mode (read the fenced manpage as it is documented > > there) because you know everything is right. > > Adding clean_start again presumes reconfiguring just like removing a > node and declaring cluster a two_node, and I wanted to avoid > reconfigurations... It's just a matter of starting fenced with "fenced -c" on your two nodes. No cluster.conf fiddling needed at all... Search for "start_daemon fenced" in /etc/init.d/cman and add " -c" behind it. You should remove that after your third node gets back. > Thank you very much. You're welcome. Marc From robejrm at gmail.com Tue Sep 1 11:20:33 2009 From: robejrm at gmail.com (Juan Ramon Martin Blanco) Date: Tue, 1 Sep 2009 13:20:33 +0200 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. In-Reply-To: <3128ba140909010405q387b858bk2d73d723f1ecca79@mail.gmail.com> References: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> <3128ba140909010405q387b858bk2d73d723f1ecca79@mail.gmail.com> Message-ID: <8a5668960909010420u1ae1d0d8g46023d3e86c246e2@mail.gmail.com> On Tue, Sep 1, 2009 at 1:05 PM, ESGLinux wrote: > > > 2009/9/1 Juan Ramon Martin Blanco > >> >> >> On Tue, Sep 1, 2009 at 12:38 PM, ESGLinux wrote: >> >>> Hi All, >>> First, sorry if this can be considered Off Topic but my first aproach was >>> using clustering to my problem so I suposse you could have the same problem. >>> >>> I have 2 computers running JBoss and I need to share a directory for the >>> cache (I use OSCache). >>> >>> First I try to use a NFS service on a Red hat Cluster ( I use this >>> reference >>> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Configuration_Example_-_NFS_Over_GFS/index.html >>> ) >>> >> >> Do you have a shared storage? If the answer is yes, just use gfs and mount >> the filesystem on both machines. >> >> > Nop, I haven?t but your answer makes me a new question. Can I use GFS > directly without making a cluster? > I mean can I attach the iSCSI devices for example, and mount a GFS > filesystem on it without creating a cluster, and a service asociated to this > GFS filesystem? > You should use one iscsi lun shared by both cluster nodes. You can mount a GFS filesystem without locking (lock=nolock) with (correct me if I am wrong) the node not being part of a cluster, but only in one node at a time. You can mount a GFS filesystem created for a certain cluster without having the filesystem configured as a resource, the only requisite is that the nodes mounting the filesystem have to be part of that certain cluster. 
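Concretely, with a made-up device and mount point, that amounts to nothing more than:

  # on each node, once cman is running and the node has joined the cluster
  mount -t gfs /dev/sdb1 /var/cache/oscache

  # the lock_nolock case mentioned above: a single machine outside any
  # cluster, and never more than one node at a time
  mount -t gfs -o lockproto=lock_nolock /dev/sdb1 /mnt/tmp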
Regards, Juanra > > > Thanks > > ESG > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From esggrupos at gmail.com Tue Sep 1 12:21:47 2009 From: esggrupos at gmail.com (ESGLinux) Date: Tue, 1 Sep 2009 14:21:47 +0200 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. In-Reply-To: <8a5668960909010420u1ae1d0d8g46023d3e86c246e2@mail.gmail.com> References: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> <3128ba140909010405q387b858bk2d73d723f1ecca79@mail.gmail.com> <8a5668960909010420u1ae1d0d8g46023d3e86c246e2@mail.gmail.com> Message-ID: <3128ba140909010521x3dc2771dn3aa627c62612ffa@mail.gmail.com> > You should use one iscsi lun shared by both cluster nodes. You can mount a > GFS filesystem without locking (lock=nolock) with (correct me if I am wrong) > the node not being part of a cluster, but only in one node at a time. > You can mount a GFS filesystem created for a certain cluster without > having the filesystem configured as a resource, the only requisite is that > the nodes mounting the filesystem have to be part of that certain cluster. > If I have understand you ok, I need to create a cluster, for example, MYCLUSTER, then create a resource of type GFS filesystem. After that I must create 2 nodes in the cluster, access de iscsi lun from this nodes and finally mount the gfs filesystem. With these I can share this directory between the nodes without the risk of file corruption? Well, in the case I can?t use this approach, is there any way to do this? Thanks for your time, ESG > > Regards, > Juanra > >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkovachev at varna.net Tue Sep 1 12:36:35 2009 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Tue, 1 Sep 2009 15:36:35 +0300 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. In-Reply-To: <3128ba140909010521x3dc2771dn3aa627c62612ffa@mail.gmail.com> References: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> <3128ba140909010405q387b858bk2d73d723f1ecca79@mail.gmail.com> <8a5668960909010420u1ae1d0d8g46023d3e86c246e2@mail.gmail.com> <3128ba140909010521x3dc2771dn3aa627c62612ffa@mail.gmail.com> Message-ID: <20090901123347.M11629@varna.net> On Tue, 1 Sep 2009 14:21:47 +0200, ESGLinux wrote > > > > > > You should use one iscsi lun shared by both cluster nodes. You can mount a GFS filesystem without locking (lock=nolock) with (correct me if I am wrong) the node not being part of a cluster, but only in one node at a time. > You can mount a GFS filesystem created for a certain cluster without having the filesystem configured as a resource, the only requisite is that the nodes mounting the filesystem have to be part of that certain cluster. > > > If I have understand you ok, I need to create a cluster, for example, MYCLUSTER, then create a resource of type GFS filesystem. After that I must create 2 nodes in the cluster, access de iscsi lun from this nodes and finally mount the gfs filesystem. > > With these I can share this directory between the nodes without the risk of file corruption? > > Well, in the case I can?t use this approach, is there any way to do this? > if you don't have shared storage, but you have local disks - you may use DRBD instead of iSCSI. 
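A rough sketch of the DRBD side of that, in DRBD 8.x drbd.conf syntax, is shown below. Hostnames, IP addresses and the backing device are purely illustrative, the resource must run dual-primary for GFS, and the replication link should sit on the same network as the cluster communication as noted further down; this is not a complete or tuned configuration:

  resource r0 {
    protocol C;                    # synchronous replication
    startup {
      become-primary-on both;      # primary/primary, needed for GFS
    }
    net {
      allow-two-primaries;
      after-sb-0pri discard-zero-changes;
      after-sb-1pri discard-secondary;
      after-sb-2pri disconnect;
    }
    on node1 {                     # must match `uname -n` on the node
      device    /dev/drbd0;
      disk      /dev/sda1;         # local backing disk
      address   192.168.1.1:7788;
      meta-disk internal;
    }
    on node2 {
      device    /dev/drbd0;
      disk      /dev/sda1;
      address   192.168.1.2:7788;
      meta-disk internal;
    }
  }

The GFS filesystem is then created on and mounted from /dev/drbd0, not from the backing /dev/sda1 directly.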
About the cluster - you don't need to define any resources - just have a cluster which is quorate to avoid data corruption while accessing the GFS on DRBD > Thanks for your time, > > ESG > > > > Regards, > Juanra > > > From esggrupos at gmail.com Tue Sep 1 12:48:13 2009 From: esggrupos at gmail.com (ESGLinux) Date: Tue, 1 Sep 2009 14:48:13 +0200 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. In-Reply-To: <20090901123347.M11629@varna.net> References: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> <3128ba140909010405q387b858bk2d73d723f1ecca79@mail.gmail.com> <8a5668960909010420u1ae1d0d8g46023d3e86c246e2@mail.gmail.com> <3128ba140909010521x3dc2771dn3aa627c62612ffa@mail.gmail.com> <20090901123347.M11629@varna.net> Message-ID: <3128ba140909010548k5d112bd4t4be25ba2deccffee@mail.gmail.com> 2009/9/1 Kaloyan Kovachev > On Tue, 1 Sep 2009 14:21:47 +0200, ESGLinux wrote > > > > > > > > > > > > You should use one iscsi lun shared by both cluster nodes. You can mount > a > GFS filesystem without locking (lock=nolock) with (correct me if I am > wrong) > the node not being part of a cluster, but only in one node at a time. > > You can mount a GFS filesystem created for a certain cluster without > having > the filesystem configured as a resource, the only requisite is that the > nodes > mounting the filesystem have to be part of that certain cluster. > > > > > > If I have understand you ok, I need to create a cluster, for example, > MYCLUSTER, then create a resource of type GFS filesystem. After that I must > create 2 nodes in the cluster, access de iscsi lun from this nodes and > finally > mount the gfs filesystem. > > > > With these I can share this directory between the nodes without the risk > of > file corruption? > > > > Well, in the case I can?t use this approach, is there any way to do this? > > > > if you don't have shared storage, but you have local disks - you may use > DRBD > instead of iSCSI. this looks interesting, any good manual about using DRBD? > About the cluster - you don't need to define any resources - > just have a cluster which is quorate to avoid data corruption while > accessing > the GFS on DRBD > > ok, so I only need the cluster with the 2 nodes and the gfs filesystem formated, for example like this: gfs_mkfs -p lock_dlm -t MyCLUSTER:mydata -j 8 /dev/sda1 When I have done this I can mount /dev/sda1 in both nodes as use it isn?t it? Thanks, ESG > Thanks for your time, > > > > ESG > > > > > > > > Regards, > > Juanra > > > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ccaulfie at redhat.com Tue Sep 1 12:57:32 2009 From: ccaulfie at redhat.com (Christine Caulfield) Date: Tue, 01 Sep 2009 13:57:32 +0100 Subject: [Linux-cluster] Use alternate network interfaces for heartbeat in RHCS In-Reply-To: <29ae894c0909010053g15ef0c30y513d5501b776bbf2@mail.gmail.com> References: <29ae894c0908280710w3f999b0as1438451bf5869a8e@mail.gmail.com> <4A97E67A.1030506@redhat.com> <29ae894c0908280724p4fc8cbe4g6943a3138f278c1b@mail.gmail.com> <4A97EF74.6090904@redhat.com> <4A97F090.9080508@redhat.com> <29ae894c0908281045l6c93c7dbo2d0e4f27c5bab14e@mail.gmail.com> <29ae894c0908310056u68c1272dud8babe5ac3542f9d@mail.gmail.com> <4A9CC6D2.1000408@redhat.com> <29ae894c0909010053g15ef0c30y513d5501b776bbf2@mail.gmail.com> Message-ID: <4A9D1A3C.60907@redhat.com> On 01/09/09 08:53, brem belguebli wrote: > Hello Chrissie, > I couldn't find the item in the doc (the CMAN FAQ). > Brem > > It's here: http://sources.redhat.com/cluster/wiki/MultiHome Chrissie From kkovachev at varna.net Tue Sep 1 12:58:28 2009 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Tue, 1 Sep 2009 15:58:28 +0300 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. In-Reply-To: <3128ba140909010548k5d112bd4t4be25ba2deccffee@mail.gmail.com> References: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> <3128ba140909010405q387b858bk2d73d723f1ecca79@mail.gmail.com> <8a5668960909010420u1ae1d0d8g46023d3e86c246e2@mail.gmail.com> <3128ba140909010521x3dc2771dn3aa627c62612ffa@mail.gmail.com> <20090901123347.M11629@varna.net> <3128ba140909010548k5d112bd4t4be25ba2deccffee@mail.gmail.com> Message-ID: <20090901125151.M56559@varna.net> On Tue, 1 Sep 2009 14:48:13 +0200, ESGLinux wrote > 2009/9/1 Kaloyan Kovachev > On Tue, 1 Sep 2009 14:21:47 +0200, ESGLinux wrote > > > > > > > > > > > > > You should use one iscsi lun shared by both cluster nodes. You can mount a > GFS filesystem without locking (lock=nolock) with (correct me if I am wrong) > the node not being part of a cluster, but only in one node at a time. > > You can mount a GFS filesystem created for a certain cluster without having > the filesystem configured as a resource, the only requisite is that the nodes > mounting the filesystem have to be part of that certain cluster. > > > > > > If I have understand you ok, I need to create a cluster, for example, > MYCLUSTER, then create a resource of type GFS filesystem. After that I must > create 2 nodes in the cluster, access de iscsi lun from this nodes and finally > mount the gfs filesystem. > > > > With these I can share this directory between the nodes without the risk of > file corruption? > > > > Well, in the case I can?t use this approach, is there any way to do this? > > > > if you don't have shared storage, but you have local disks - you may use DRBD > instead of iSCSI. > > this looks interesting, any good manual about using DRBD? 
> There is a good documentation at http://www.drbd.org/ search for primary-primary mode and make sure the replication channels is the same as for the cluster communication to avoid split-brain and data corruption > About the cluster - you don't need to define any resources - > just have a cluster which is quorate to avoid data corruption while accessing > the GFS on DRBD > > > > ok, so I only need the cluster with the 2 nodes and the gfs filesystem formated, for example like this: > > gfs_mkfs -p lock_dlm -t MyCLUSTER:mydata -j 8 /dev/sda1 > When I have done this I can mount /dev/sda1 in both nodes as use it > [UTF-8?]isn??t it? you should format and mount /dev/drbd0 which is made on top of /dev/sda1, not /dev/sda1 itself > Thanks, > ESG > > > > Thanks for your time, > > > > ESG > > > > > > > > Regards, > > Juanra > > > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From esggrupos at gmail.com Tue Sep 1 13:02:07 2009 From: esggrupos at gmail.com (ESGLinux) Date: Tue, 1 Sep 2009 15:02:07 +0200 Subject: [Linux-cluster] SEMI OT. Synchronizing jboss cache dir. In-Reply-To: <20090901125151.M56559@varna.net> References: <3128ba140909010338q4bd4c805t27f5c29791970d13@mail.gmail.com> <8a5668960909010359h7ff92fbaj7ba700bc1c742c4c@mail.gmail.com> <3128ba140909010405q387b858bk2d73d723f1ecca79@mail.gmail.com> <8a5668960909010420u1ae1d0d8g46023d3e86c246e2@mail.gmail.com> <3128ba140909010521x3dc2771dn3aa627c62612ffa@mail.gmail.com> <20090901123347.M11629@varna.net> <3128ba140909010548k5d112bd4t4be25ba2deccffee@mail.gmail.com> <20090901125151.M56559@varna.net> Message-ID: <3128ba140909010602s798134b6s327791ba456fd62c@mail.gmail.com> > > > > There is a good documentation at http://www.drbd.org/ search for > primary-primary mode and make sure the replication channels is the same as > for > the cluster communication to avoid split-brain and data corruption > I?ll check it, thanks > > > About the cluster - you don't need to define any resources - > > just have a cluster which is quorate to avoid data corruption while > accessing > > the GFS on DRBD > > > > > > > > ok, so I only need the cluster with the 2 nodes and the gfs filesystem > formated, for example like this: > > > > gfs_mkfs -p lock_dlm -t MyCLUSTER:mydata -j 8 /dev/sda1 > > When I have done this I can mount /dev/sda1 in both nodes as use it > > [UTF-8?]isn??t it? > > you should format and mount /dev/drbd0 which is made on top of /dev/sda1, > not > /dev/sda1 itself > for now this kind of device drbd0 is totally strange for me ;-). I?m going to read about drbd and I suposse I?ll finally understand it, Thanks again, ESG > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tageorgiou at gmail.com Tue Sep 1 13:21:34 2009 From: tageorgiou at gmail.com (Thomas Georgiou) Date: Tue, 1 Sep 2009 09:21:34 -0400 Subject: [Linux-cluster] Problem with Pacemaker and Corosync In-Reply-To: References: <6e4c20e70908310612o120933cema2609513f13be78c@mail.gmail.com> Message-ID: <6e4c20e70909010621s6341695et5785634db84324f3@mail.gmail.com> Attached are the logs with debug enabled. 
Here is corosync.conf: #compatibility: none aisexec { user: root group: root } totem { version: 2 secauth: off threads: 0 token: 1000 join: 60 consenus: 4800 vsftype: none max_messages: 20 clear:node_high_bit: yes interface { ringnumber: 0 bindnetaddr: 198.38.17.40 mcastaddr: 226.94.1.1 mcastport: 5405 } } service { name: pacemaker ver: 0 } logging { fileline: off to_stderr: yes to_syslog: yes to_file: yes logfile: /var/log/corosync.log debug: on timestamp: on } amf { mode: disabled } On Tue, Sep 1, 2009 at 2:33 AM, Andrew Beekhof wrote: > try turning on debug, there's nothing in the logs that indicate why > the lrmd is having a problem > > On Mon, Aug 31, 2009 at 3:12 PM, Thomas Georgiou wrote: >> Hi, >> >> I have installed Pacemaker 1.0.5, Corosync 1.0.0, and Openais 1.0.1 >> from source according to the Clusterlabs docs. ?However, when I go to >> start corosync/pacemaker, I get error messages pertaining to lrm and >> cibadmin -Q hangs and complains that the remote node is not available. >> ?Attached is the corosync log. >> >> Any ideas? >> >> Thomas Georgiou >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- A non-text attachment was scrubbed... Name: corosync.log Type: application/octet-stream Size: 76914 bytes Desc: not available URL: From brem.belguebli at gmail.com Tue Sep 1 13:51:35 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Tue, 1 Sep 2009 15:51:35 +0200 Subject: [Linux-cluster] Use alternate network interfaces for heartbeat in RHCS In-Reply-To: <4A9D1A3C.60907@redhat.com> References: <29ae894c0908280710w3f999b0as1438451bf5869a8e@mail.gmail.com> <4A97E67A.1030506@redhat.com> <29ae894c0908280724p4fc8cbe4g6943a3138f278c1b@mail.gmail.com> <4A97EF74.6090904@redhat.com> <4A97F090.9080508@redhat.com> <29ae894c0908281045l6c93c7dbo2d0e4f27c5bab14e@mail.gmail.com> <29ae894c0908310056u68c1272dud8babe5ac3542f9d@mail.gmail.com> <4A9CC6D2.1000408@redhat.com> <29ae894c0909010053g15ef0c30y513d5501b776bbf2@mail.gmail.com> <4A9D1A3C.60907@redhat.com> Message-ID: <29ae894c0909010651y1fd4688cyc52726572d65cf81@mail.gmail.com> Thanks Will it be supported in the future ? 2009/9/1, Christine Caulfield : > > On 01/09/09 08:53, brem belguebli wrote: > >> Hello Chrissie, >> I couldn't find the item in the doc (the CMAN FAQ). >> Brem >> >> >> > It's here: > > http://sources.redhat.com/cluster/wiki/MultiHome > > Chrissie > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ccaulfie at redhat.com Tue Sep 1 14:00:16 2009 From: ccaulfie at redhat.com (Christine Caulfield) Date: Tue, 01 Sep 2009 15:00:16 +0100 Subject: [Linux-cluster] Use alternate network interfaces for heartbeat in RHCS In-Reply-To: <29ae894c0909010651y1fd4688cyc52726572d65cf81@mail.gmail.com> References: <29ae894c0908280710w3f999b0as1438451bf5869a8e@mail.gmail.com> <4A97E67A.1030506@redhat.com> <29ae894c0908280724p4fc8cbe4g6943a3138f278c1b@mail.gmail.com> <4A97EF74.6090904@redhat.com> <4A97F090.9080508@redhat.com> <29ae894c0908281045l6c93c7dbo2d0e4f27c5bab14e@mail.gmail.com> <29ae894c0908310056u68c1272dud8babe5ac3542f9d@mail.gmail.com> <4A9CC6D2.1000408@redhat.com> <29ae894c0909010053g15ef0c30y513d5501b776bbf2@mail.gmail.com> <4A9D1A3C.60907@redhat.com> <29ae894c0909010651y1fd4688cyc52726572d65cf81@mail.gmail.com> Message-ID: <4A9D28F0.70301@redhat.com> On 01/09/09 14:51, brem belguebli wrote: > Thanks > Will it be supported in the future ? Yes it will. But I can't be sure about just when "the future" is in this case, sorry! Chrissie > 2009/9/1, Christine Caulfield >: > > On 01/09/09 08:53, brem belguebli wrote: > > Hello Chrissie, > I couldn't find the item in the doc (the CMAN FAQ). > Brem > > > > It's here: > > http://sources.redhat.com/cluster/wiki/MultiHome > > Chrissie > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > ------------------------------------------------------------------------ > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From carlopmart at gmail.com Tue Sep 1 16:41:30 2009 From: carlopmart at gmail.com (carlopmart) Date: Tue, 01 Sep 2009 18:41:30 +0200 Subject: [Linux-cluster] fence vmware for vsphere esxi 4 Message-ID: <4A9D4EBA.60907@gmail.com> Hi all, When will be possible to use fence_vmware or fence_vmware_ng with vsphere esxi 4?? Maybe on RHEL/CentOS 5.4?? Thanks. -- CL Martinez carlopmart {at} gmail {d0t} com From pradhanparas at gmail.com Tue Sep 1 17:21:50 2009 From: pradhanparas at gmail.com (Paras pradhan) Date: Tue, 1 Sep 2009 12:21:50 -0500 Subject: [Linux-cluster] Book Message-ID: <8b711df40909011021p7d06155ch15b5e083fee1c8a3@mail.gmail.com> Is there any book that covers virtualization using Xen and clustering using Red hat Cluster suite in a single book that covers running a HA cluster for virtual machines ? Thanks Paras. From lhh at redhat.com Tue Sep 1 18:41:40 2009 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 01 Sep 2009 14:41:40 -0400 Subject: [Linux-cluster] NFS client failover In-Reply-To: <4A94F87C.7030708@lists.grepular.com> References: <4A93C8CE.7010202@lists.grepular.com> <4A943E31.3080209@lists.grepular.com> <4A94F87C.7030708@lists.grepular.com> Message-ID: <1251830500.3209.544.camel@localhost.localdomain> On Wed, 2009-08-26 at 09:55 +0100, Mike Cardwell wrote: > On 25/08/2009 20:40, Mike Cardwell wrote: > > > I figured that failover would happen more smoothly if the client was > > aware of and in control of what was going on. If the IP suddenly moves > > to another NFS server I don't know how the NFS client will cope with that. > > Well, it seems to cope quite well. The nfs mount "hangs" for a few > seconds whilst the IP moves from one server to another (unavoidable > obviously), but it then picks up from where it was. 
I suspect there will > be file corruption issues with files that are partially written when the > failover happens, but I guess that can't be avoided without a client > side solution. I don't think we've had reports of corruption in the past. When using TCP, the client can hang for a very long time before recovering; using UDP seems to resolve this. -- Lon From lhh at redhat.com Tue Sep 1 19:19:41 2009 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 01 Sep 2009 15:19:41 -0400 Subject: [Linux-cluster] 3 node cluster and quorum disk? In-Reply-To: <20090826161128.1e32721c@pc-jsosic.srce.hr> References: <20090826161128.1e32721c@pc-jsosic.srce.hr> Message-ID: <1251832781.3209.548.camel@localhost.localdomain> On Wed, 2009-08-26 at 16:11 +0200, Jakov Sosic wrote: > Hi. > > I have a situation - when two nodes are up in 3 node cluster, and one > node goes down, cluster looses quorate - although I'm using qdiskd... > > > > > label="SAS-qdisk" status_file="/tmp/qdisk"/> If that doesn't fix it entirely, get rid of status_file, decrease interval, and increase tko. Try: interval=2 tko=12 ? -- Lon From jfriesse at redhat.com Wed Sep 2 07:30:48 2009 From: jfriesse at redhat.com (Jan Friesse) Date: Wed, 02 Sep 2009 09:30:48 +0200 Subject: [Linux-cluster] fence vmware for vsphere esxi 4 In-Reply-To: <4A9D4EBA.60907@gmail.com> References: <4A9D4EBA.60907@gmail.com> Message-ID: <4A9E1F28.3030506@redhat.com> Hi, I'm pretty sure, that old fence_vmware will don't work on ESXi, because ESXi (at least ESXi 3.5) doesn't have support for ssh in DOM-0, and we are using it. Fence_vmware_ng should work correctly (because using VI Perl API), but it's not officially supported. In case it doesn't work, please let me know. Regards, Honza carlopmart wrote: > Hi all, > > When will be possible to use fence_vmware or fence_vmware_ng with > vsphere esxi 4?? Maybe on RHEL/CentOS 5.4?? > > Thanks. > From jakov.sosic at srce.hr Wed Sep 2 09:47:09 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Wed, 2 Sep 2009 11:47:09 +0200 Subject: [Linux-cluster] How to disable node? In-Reply-To: <1251803481.12201.10.camel@marc> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3FEE.3000309@wol.de> <20090901112648.3e32c88d@nb-jsosic> <1251800976.10463.43.camel@marc> <20090901124851.7daf2c75@pc-jsosic.srce.hr> <1251803481.12201.10.camel@marc> Message-ID: <20090902114709.66f6c5f3@nb-jsosic> On Tue, 01 Sep 2009 13:11:21 +0200 "Marc - A. Dahlhaus [ Administration | Westermann GmbH ]" wrote: > Its actually the "APC Switched Rack PDUs" that you should look after. > You can get an 8 port device for a small budget... Is this it: http://www.apc.com/products/family/index.cfm?id=70 It's still too expensive - AP7920 is around 800-900$ in my country... I was hoping to get two for that price. -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From mad at wol.de Wed Sep 2 10:15:27 2009 From: mad at wol.de (Marc - A. Dahlhaus [ Administration | Westermann GmbH ]) Date: Wed, 02 Sep 2009 12:15:27 +0200 Subject: [Linux-cluster] How to disable node? 
In-Reply-To: <20090902114709.66f6c5f3@nb-jsosic> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3FEE.3000309@wol.de> <20090901112648.3e32c88d@nb-jsosic> <1251800976.10463.43.camel@marc> <20090901124851.7daf2c75@pc-jsosic.srce.hr> <1251803481.12201.10.camel@marc> <20090902114709.66f6c5f3@nb-jsosic> Message-ID: <1251886527.10505.9.camel@marc> Am Mittwoch, den 02.09.2009, 11:47 +0200 schrieb Jakov Sosic: > On Tue, 01 Sep 2009 13:11:21 +0200 > "Marc - A. Dahlhaus [ Administration | Westermann GmbH ]" > wrote: > > > Its actually the "APC Switched Rack PDUs" that you should look after. > > You can get an 8 port device for a small budget... > > Is this it: > > http://www.apc.com/products/family/index.cfm?id=70 > > It's still too expensive - AP7920 is around 800-900$ in my country... I > was hoping to get two for that price. > That's the one, the street price here is around 350? per device. From corey.kovacs at gmail.com Wed Sep 2 10:33:51 2009 From: corey.kovacs at gmail.com (Corey Kovacs) Date: Wed, 2 Sep 2009 06:33:51 -0400 Subject: [Linux-cluster] dealing with oom-killer.... Message-ID: <7d6e8da40909020333h405f3f82w928f65b44afffc51@mail.gmail.com> A colleague has a 5 node cluster with 4GB ram in each node. It's not enough for the cluster and more ram is on the way. The problem though is that until the ram arrives, there is risk of oom-killer (which he found out the other day) firing up and putting the node into a state which made it utterly useless but still looked good to the cluster. We could of course disable oom-killer but that's a workaround, not a fix. I am wondering if the cluster responding to oom-killer firing up and fencing the offending node is possible and if so, how others might have done it. Seems like it should just be handled by the cluster tho. Maybe have cman put a message across the openais "bus" like, "Hey, losing my brain here, someone whak me"... Thanks Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From ccaulfie at redhat.com Wed Sep 2 10:47:39 2009 From: ccaulfie at redhat.com (Christine Caulfield) Date: Wed, 02 Sep 2009 11:47:39 +0100 Subject: [Linux-cluster] dealing with oom-killer.... In-Reply-To: <7d6e8da40909020333h405f3f82w928f65b44afffc51@mail.gmail.com> References: <7d6e8da40909020333h405f3f82w928f65b44afffc51@mail.gmail.com> Message-ID: <4A9E4D4B.5000003@redhat.com> On 02/09/09 11:33, Corey Kovacs wrote: > A colleague has a 5 node cluster with 4GB ram in each node. It's not > enough for the cluster and more ram is on the way. The problem though is > that until the ram arrives, there is risk of oom-killer (which he found > out the other day) firing up and putting the node into a state which > made it utterly useless but still looked good to the cluster. We could > of course disable oom-killer but that's a workaround, not a fix. > > I am wondering if the cluster responding to oom-killer firing up and > fencing the offending node is possible and if so, how others might have > done it. Seems like it should just be handled by the cluster tho. Maybe > have cman put a message across the openais "bus" like, "Hey, losing my > brain here, someone whak me"... > I suppose you could give cman a large value for /proc//oom_score so that it is the first thing to be killed if the system runs out of memory. That should guarantee that it will be fenced by the other nodes ... provided they have enough memory to remain quorate! 
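As a rough sketch of that idea (assuming a RHEL 5 era kernel, where the writable knob is actually /proc/<pid>/oom_adj -- oom_score itself is read-only -- and assuming the cman stack shows up as aisexec/fenced):

    # make the cluster daemons the OOM killer's first victims, so the node
    # stops answering the cluster and gets fenced rather than limping on
    for p in $(pidof aisexec fenced); do
        echo 15 > /proc/$p/oom_adj    # 15 = most likely to be killed
    done
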
Chrissie From brem.belguebli at gmail.com Wed Sep 2 11:14:04 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Wed, 2 Sep 2009 13:14:04 +0200 Subject: [Linux-cluster] Re: Fencing question in geo cluster (dual sites clustering) In-Reply-To: <29ae894c0908210227r85df80fm173af6452d22a5b2@mail.gmail.com> References: <29ae894c0908210227r85df80fm173af6452d22a5b2@mail.gmail.com> Message-ID: <29ae894c0909020414s518a5530n538ca3f5e21377f1@mail.gmail.com> Hi, Any idea or comment on this. Thanks Brem CF link attached to diagram that describes the setup. http://1.bp.blogspot.com/_mz9iIrpv_qo/Si1NmQ2QNmI/AAAAAAAADP4/fV8j_ZsGlBw/s1600-h/Drawing1.png 2009/8/21, brem belguebli : > > Hi, > > I'm trying to find out what best fencing solution could fit a dual sites > cluster. > > Cluster is equally sized on each site (2 nodes/site), each site hosting a > SAN array so that each node from any site can see the 2 arrays. > > Quorum disk (iscsi LUN) is hosted on a 3rd site. > > SAN and LAN using the same telco infrastructure (2 redundant DWDM loops). > > In case something happens at Telco level (both DWDM loops are broken) that > makes 1 of the 2 sites completely isolated from the rest of the world, > the nodes at the good site (the one still operationnal) won't be able to > fence any node from the wrong site (the one that is isolated) as there is no > way for them to reach their ILO's or do any SAN fencing as the switches at > the wrong site are no more reachable. > > As qdiskd is not reachable from the wrong nodes, they end up being rebooted > by qdisk, but there is a short time (a few seconds) during which the wrong > nodes are still seing their local SAN array storage and may potentially have > written data on it. > > Any ideas or comments on how to ensure data integrity in such setup ? > > Regards > > Brem > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ccaulfie at redhat.com Wed Sep 2 11:57:54 2009 From: ccaulfie at redhat.com (Christine Caulfield) Date: Wed, 02 Sep 2009 12:57:54 +0100 Subject: [Linux-cluster] dealing with oom-killer.... In-Reply-To: <7d6e8da40909020333h405f3f82w928f65b44afffc51@mail.gmail.com> References: <7d6e8da40909020333h405f3f82w928f65b44afffc51@mail.gmail.com> Message-ID: <4A9E5DC2.4030107@redhat.com> On 02/09/09 11:33, Corey Kovacs wrote: > A colleague has a 5 node cluster with 4GB ram in each node. It's not > enough for the cluster and more ram is on the way. The problem though is > that until the ram arrives, there is risk of oom-killer (which he found > out the other day) firing up and putting the node into a state which > made it utterly useless but still looked good to the cluster. We could > of course disable oom-killer but that's a workaround, not a fix. > > I am wondering if the cluster responding to oom-killer firing up and > fencing the offending node is possible and if so, how others might have > done it. Seems like it should just be handled by the cluster tho. Maybe > have cman put a message across the openais "bus" like, "Hey, losing my > brain here, someone whak me"... > I suppose you could give cman a large value for /proc//oom_score so that it is the first thing to be killed if the system runs out of memory. That should guarantee that it will be fenced by the other nodes ... provided they have enough memory to remain quorate! Chrissie From jakov.sosic at srce.hr Wed Sep 2 15:59:38 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Wed, 2 Sep 2009 17:59:38 +0200 Subject: [Linux-cluster] How to disable node? 
In-Reply-To: <1251886527.10505.9.camel@marc> References: <20090831223053.5461ad55@nb-jsosic> <20090831225918.33c892eb@nb-jsosic> <4A9C3FEE.3000309@wol.de> <20090901112648.3e32c88d@nb-jsosic> <1251800976.10463.43.camel@marc> <20090901124851.7daf2c75@pc-jsosic.srce.hr> <1251803481.12201.10.camel@marc> <20090902114709.66f6c5f3@nb-jsosic> <1251886527.10505.9.camel@marc> Message-ID: <20090902175938.3d186aa5@pc-jsosic.srce.hr> On Wed, 02 Sep 2009 12:15:27 +0200 "Marc - A. Dahlhaus [ Administration | Westermann GmbH ]" wrote: > That's the one, the street price here is around 350? per device. Well, then it's time for me to call some of german emigrants :) -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From alfredo.moralejo at roche.com Wed Sep 2 16:48:50 2009 From: alfredo.moralejo at roche.com (Moralejo, Alfredo) Date: Wed, 2 Sep 2009 18:48:50 +0200 Subject: [Linux-cluster] Re: Fencing question in geo cluster (dual sites clustering) In-Reply-To: <29ae894c0909020414s518a5530n538ca3f5e21377f1@mail.gmail.com> References: <29ae894c0908210227r85df80fm173af6452d22a5b2@mail.gmail.com> <29ae894c0909020414s518a5530n538ca3f5e21377f1@mail.gmail.com> Message-ID: What kind of data replication will be used? Regards, Alfredo ________________________________ From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of brem belguebli Sent: Wednesday, September 02, 2009 1:14 PM To: linux clustering Subject: [Linux-cluster] Re: Fencing question in geo cluster (dual sites clustering) Hi, Any idea or comment on this. Thanks Brem CF link attached to diagram that describes the setup. http://1.bp.blogspot.com/_mz9iIrpv_qo/Si1NmQ2QNmI/AAAAAAAADP4/fV8j_ZsGlBw/s1600-h/Drawing1.png 2009/8/21, brem belguebli >: Hi, I'm trying to find out what best fencing solution could fit a dual sites cluster. Cluster is equally sized on each site (2 nodes/site), each site hosting a SAN array so that each node from any site can see the 2 arrays. Quorum disk (iscsi LUN) is hosted on a 3rd site. SAN and LAN using the same telco infrastructure (2 redundant DWDM loops). In case something happens at Telco level (both DWDM loops are broken) that makes 1 of the 2 sites completely isolated from the rest of the world, the nodes at the good site (the one still operationnal) won't be able to fence any node from the wrong site (the one that is isolated) as there is no way for them to reach their ILO's or do any SAN fencing as the switches at the wrong site are no more reachable. As qdiskd is not reachable from the wrong nodes, they end up being rebooted by qdisk, but there is a short time (a few seconds) during which the wrong nodes are still seing their local SAN array storage and may potentially have written data on it. Any ideas or comments on how to ensure data integrity in such setup ? Regards Brem -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brem.belguebli at gmail.com Wed Sep 2 18:11:23 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Wed, 2 Sep 2009 20:11:23 +0200 Subject: [Linux-cluster] Re: Fencing question in geo cluster (dual sites clustering) In-Reply-To: References: <29ae894c0908210227r85df80fm173af6452d22a5b2@mail.gmail.com> <29ae894c0909020414s518a5530n538ca3f5e21377f1@mail.gmail.com> Message-ID: <29ae894c0909021111q6ebf0113k97a7107f2e5c416b@mail.gmail.com> Hi Alfredo, For the moment, it is a POC, and I'm basing the whole thing on the RAID1 mdadm resource script I have submitted. I'm also considering the possibility of using a Continuous Access (HP arrays like EMC's SRDF functionnality) but still need raid manager binaries etc ... and the time and inspiration to write the scripts. Ideally, I would tend to privilege LVM mirror, but it still has some points to be addressed as SPOF on mirrorlog etc... Brem 2009/9/2 Moralejo, Alfredo > What kind of data replication will be used? > > > > Regards, > > > > Alfredo > > > ------------------------------ > > *From:* linux-cluster-bounces at redhat.com [mailto: > linux-cluster-bounces at redhat.com] *On Behalf Of *brem belguebli > *Sent:* Wednesday, September 02, 2009 1:14 PM > *To:* linux clustering > *Subject:* [Linux-cluster] Re: Fencing question in geo cluster (dual sites > clustering) > > > > Hi, > > > > Any idea or comment on this. > > > > Thanks > > > > Brem > > > > > > > > CF link attached to diagram that describes the setup. > > http://1.bp.blogspot.com/_mz9iIrpv_qo/Si1NmQ2QNmI/AAAAAAAADP4/fV8j_ZsGlBw/s1600-h/Drawing1.png > > > 2009/8/21, brem belguebli : > > Hi, > > > > I'm trying to find out what best fencing solution could fit a dual sites > cluster. > > > > Cluster is equally sized on each site (2 nodes/site), each site hosting a > SAN array so that each node from any site can see the 2 arrays. > > > > Quorum disk (iscsi LUN) is hosted on a 3rd site. > > > > SAN and LAN using the same telco infrastructure (2 redundant DWDM loops). > > > > In case something happens at Telco level (both DWDM loops are broken) that > makes 1 of the 2 sites completely isolated from the rest of the world, > > the nodes at the good site (the one still operationnal) won't be able to > fence any node from the wrong site (the one that is isolated) as there is no > way for them to reach their ILO's or do any SAN fencing as the switches at > the wrong site are no more reachable. > > > > As qdiskd is not reachable from the wrong nodes, they end up being rebooted > by qdisk, but there is a short time (a few seconds) during which the wrong > nodes are still seing their local SAN array storage and may potentially have > written data on it. > > > > Any ideas or comments on how to ensure data integrity in such setup ? > > > > Regards > > > > Brem > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From plebeuz at ig.com.br Thu Sep 3 14:06:47 2009 From: plebeuz at ig.com.br (Daniel Viana Auler(Plebeuz)) Date: Thu, 03 Sep 2009 11:06:47 -0300 Subject: [Linux-cluster] GFS - Cluster Message-ID: <4A9FCD77.1020806@ig.com.br> Hello, People, can i use gfs without a storage? I want to use a local device and then make a cluster in 2 other machines to use gfs for test. Att, Plebeuz -- From bmr at redhat.com Thu Sep 3 14:24:16 2009 From: bmr at redhat.com (Bryn M. 
Reeves) Date: Thu, 03 Sep 2009 15:24:16 +0100 Subject: [Linux-cluster] GFS - Cluster In-Reply-To: <4A9FCD77.1020806@ig.com.br> References: <4A9FCD77.1020806@ig.com.br> Message-ID: <1251987856.25346.168.camel@breeves.fab.redhat.com> On Thu, 2009-09-03 at 11:06 -0300, Daniel Viana Auler(Plebeuz) wrote: > Hello, > People, can i use gfs without a storage? I want to use a local > device and then make a cluster in 2 other machines to use gfs for test. Checkout the software iscsi target or the gnbd package. Regards, Bryn. From gordan at bobich.net Thu Sep 3 14:53:07 2009 From: gordan at bobich.net (Gordan Bobic) Date: Thu, 3 Sep 2009 15:53:07 +0100 Subject: [Linux-cluster] GFS - Cluster Message-ID: <4A4B4FF016C5FEC3@> (added by '') Or DRBD. -----Original Message----- From: "Bryn M. Reeves" To: "linux clustering" Sent: 03/09/09 15:24 Subject: Re: [Linux-cluster] GFS - Cluster On Thu, 2009-09-03 at 11:06 -0300, Daniel Viana Auler(Plebeuz) wrote: > Hello, > People, can i use gfs without a storage? I want to use a local > device and then make a cluster in 2 other machines to use gfs for test. Checkout the software iscsi target or the gnbd package. Regards, Bryn. -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From maciej.grela at nsn.com Fri Sep 4 07:50:31 2009 From: maciej.grela at nsn.com (Maciej Grela) Date: Fri, 04 Sep 2009 09:49:31 +0159 Subject: [Linux-cluster] GFS - Cluster In-Reply-To: <4A9FCD77.1020806@ig.com.br> References: <4A9FCD77.1020806@ig.com.br> Message-ID: <4AA0C6A3.2050607@nsn.com> ext Daniel Viana Auler(Plebeuz) pisze: > Hello, > People, can i use gfs without a storage? I want to use a > local device and then make a cluster in 2 other machines to use gfs > for test. > > Att, > > Plebeuz You could use nbd to export the blockdevice to the second node. In case of gfs you need some way for *both* the nodes to see the same block device. Haven't tried the nbd approach myself though. Best regards, Maciej Grela From Alain.Moulle at bull.net Fri Sep 4 09:46:38 2009 From: Alain.Moulle at bull.net (Alain.Moulle) Date: Fri, 04 Sep 2009 11:46:38 +0200 Subject: [Linux-cluster] Question about "ccs_tool update" Message-ID: <4AA0E1FE.70903@bull.net> Hi, With this release : cman-3.0.2-1.fc11.x86_64 it seems that we can't do ccs_tool update anymore : ccs_tool update /etc/cluster/cluster.conf Unknown command, update. Try 'ccs_tool help' for help. and effectively the help does not list anymore options update (neither upgrade). Therefore, what is the new way to make it dynamically update the configuration ? (in former releases, we used to do ccs_tool update ... and then cman_tool version -r ...) Thanks for your help Alain From fdinitto at redhat.com Fri Sep 4 10:33:40 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Fri, 04 Sep 2009 12:33:40 +0200 Subject: [Linux-cluster] Re: [Cluster-devel] Can't manage virtual servers after upgrade In-Reply-To: References: Message-ID: <1252060420.6387.0.camel@cerberus.int.fabbione.net> hi, in future please use linux-cluster at redhat.com mailing list or file a bugzilla. cluster-devel is meant for development only topics. Thanks Fabio On Fri, 2009-09-04 at 14:26 +0400, Alexander wrote: > Hello! > > I have upgrade 3 servers in cluster to RHEL 5.4 and now i can't start virtual machine service via luci. 
In /var/log/messages i see errors: > > Sep 4 11:42:55 hwcl-n1 clurgmgrd[5374]: start on vm "vps-nagios" returned 1 (generic error) > Sep 4 11:42:55 hwcl-n1 clurgmgrd[5374]: #68: Failed to start vm:vps-nagios; return value: 1 > Sep 4 11:42:55 hwcl-n1 clurgmgrd[5374]: Stopping service vm:vps-nagios > > After upgrade, via luci web-interface i can't add new virtual machine service. Looks like cluster soft don't know, that server booted with xen kernel and xen is started. > When i boot servers with kernel without xen and install KVM hypervisor, then luci can create new service for virtual machines, but i need use xen hypervisor. > > Can anybody help - where is problem with xen hypervisor? Probably, some rpm packet is missing? (but i update server via command "yum update"). > > Thank You. > > With best regards, Alexander. From fdinitto at redhat.com Fri Sep 4 10:34:19 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Fri, 04 Sep 2009 12:34:19 +0200 Subject: [Linux-cluster] Re: [Cluster-devel] luci SSL error: SSL_ERROR_ZERO_RETURN In-Reply-To: <1251964843.31750.17.camel@leodolter.obvsg.at> References: <1251964843.31750.17.camel@leodolter.obvsg.at> Message-ID: <1252060459.6387.2.camel@cerberus.int.fabbione.net> hi, in future please use linux-cluster at redhat.com mailing list or file a bugzilla. cluster-devel is meant for development only topics. Thanks Fabio On Thu, 2009-09-03 at 10:00 +0200, Ulrich Leodolter wrote: > Hello, > > luci is unable to get ssl certs from ricci. > i have setup luci/ricci as described in redhat manual. > > i tried this on RHEL5.3 x86_64 and today after upgrade > to RHEL5.4 x86_64. > > there is no problem on RHEL5.3 i386 machine, > looks like it is a x86_64 ssl problem???? > > > after click on "View SSL cert fingerprints" is see this message: > > The following errors occurred: > > Error reading from myhost.mydomain:11111: SSL error: SSL_ERROR_ZERO_RETURN > > > syslog messages: > > Sep 3 11:14:47 myhost luci: Luci startup succeeded > Sep 3 11:14:47 myhost luci: Listening on port 8084; accessible via URL https://myhost.mydomain:8084 > Sep 3 11:16:25 myhost luci[7987]: Error reading from myhost.mydomain:11111: SSL error: SSL_ERROR_ZERO_RETURN > Sep 3 11:16:33 myhost luci[7987]: Error reading from myhost.mydomain:11111: SSL error: SSL_ERROR_ZERO_RETURN > Sep 3 11:16:33 myhost luci[7987]: Unable to establish an SSL connection to myhost.mydomain:11111: unable to open temp file > > > > > any tips ???? > thx > ulrich > > > > From fdinitto at redhat.com Fri Sep 4 10:42:08 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Fri, 04 Sep 2009 12:42:08 +0200 Subject: [Linux-cluster] Question about "ccs_tool update" In-Reply-To: <4AA0E1FE.70903@bull.net> References: <4AA0E1FE.70903@bull.net> Message-ID: <1252060928.6387.8.camel@cerberus.int.fabbione.net> On Fri, 2009-09-04 at 11:46 +0200, Alain.Moulle wrote: > Hi, > > With this release : cman-3.0.2-1.fc11.x86_64 > it seems that we can't do ccs_tool update anymore : > > ccs_tool update /etc/cluster/cluster.conf > Unknown command, update. > Try 'ccs_tool help' for help. > > and effectively the help does not list anymore options update (neither > upgrade). > > Therefore, what is the new way to make it dynamically update the > configuration ? The configuration distribution across nodes is now delegate to luci/ricci via ccs_sync command. The old ccsd ccs_tool bits are gone. Assuming your configuration is identical on all nodes you can issue, on one node only, cman_tool version -r $newversion. 
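As a sketch, the whole cycle might then look like this (assuming the stock /etc/cluster/cluster.conf location, and ricci running on every node so ccs_sync can push the file -- plain scp of the file works too):

    vi /etc/cluster/cluster.conf     # bump config_version= in the <cluster> tag
    ccs_sync                         # distribute the new file to the other nodes
    cman_tool version -r 0           # tell the running cluster to load it
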
$newversion is either 0 (autodetect the version from cluster.conf and check that is newer/higher than the runtime config) or the exact version you want to load. Note that we are still working on smoothing a few corners in the new configuration system and that a bad config could be problematic for the cluster. Fabio From Alain.Moulle at bull.net Fri Sep 4 10:51:56 2009 From: Alain.Moulle at bull.net (Alain.Moulle) Date: Fri, 04 Sep 2009 12:51:56 +0200 Subject: [Linux-cluster] Question about "ccs_tool update" In-Reply-To: <1252060928.6387.8.camel@cerberus.int.fabbione.net> References: <4AA0E1FE.70903@bull.net> <1252060928.6387.8.camel@cerberus.int.fabbione.net> Message-ID: <4AA0F14C.2050104@bull.net> Hi Fabio, and many thanks. But just another precision : you mean that ccs_sync is making the job now , in a hidden way when cman_tool -r version is executed , right ? but does the fact that cluster.conf is in another place than /etc/cluster matter for ccs_sync to work fine ? because I just tried : [root at oberon3 ~]# ccs_sync help Unable to parse /etc/cluster/cluster.conf: No such file or directory Does that mean that ccs_sync does not take in account the /etc/sysconfig/cman file ? Thanks again Alain Fabio M. Di Nitto a ?crit : > On Fri, 2009-09-04 at 11:46 +0200, Alain.Moulle wrote: > >> Hi, >> >> With this release : cman-3.0.2-1.fc11.x86_64 >> it seems that we can't do ccs_tool update anymore : >> >> ccs_tool update /etc/cluster/cluster.conf >> Unknown command, update. >> Try 'ccs_tool help' for help. >> >> and effectively the help does not list anymore options update (neither >> upgrade). >> >> Therefore, what is the new way to make it dynamically update the >> configuration ? >> > > The configuration distribution across nodes is now delegate to > luci/ricci via ccs_sync command. The old ccsd ccs_tool bits are gone. > > Assuming your configuration is identical on all nodes you can issue, on > one node only, cman_tool version -r $newversion. > > $newversion is either 0 (autodetect the version from cluster.conf and > check that is newer/higher than the runtime config) or the exact version > you want to load. > > Note that we are still working on smoothing a few corners in the new > configuration system and that a bad config could be problematic for the > cluster. > > Fabio > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ntadmin at fi.upm.es Fri Sep 4 12:08:11 2009 From: ntadmin at fi.upm.es (Miguel Sanchez) Date: Fri, 04 Sep 2009 14:08:11 +0200 Subject: [Linux-cluster] How backup domU partition from dom0? Message-ID: <4AA1032B.2090306@fi.upm.es> Hi. I have two hosts forming a cluster for run xen vm's. Each vm has a disk which is corresponding to a clvm logic volume within dom0. I pretended to backup the lv's from dom0 doing a 'kpart -a' and a readonly mount (with the vm running). Probably it is not very correct (but the alternative snapshot was not possible with clvm). Most of times, the operation is ok, and the backup finishes with problems, but in other cases 'mount -r /dev/mapper/device /path' does not return and it stays consuming time indefinitely. I cannot kill the process y have to fence the node. How could I make the domU backups within dom0 without these problems? Thanks. Miguel. From fajar at fajar.net Fri Sep 4 13:15:17 2009 From: fajar at fajar.net (Fajar A. Nugraha) Date: Fri, 4 Sep 2009 20:15:17 +0700 Subject: [Linux-cluster] How backup domU partition from dom0? 
In-Reply-To: <4AA1032B.2090306@fi.upm.es> References: <4AA1032B.2090306@fi.upm.es> Message-ID: <7207d96f0909040615r26a36041p7ed978c73ad43753@mail.gmail.com> On Fri, Sep 4, 2009 at 7:08 PM, Miguel Sanchez wrote: > Hi. I have two hosts forming a cluster for run xen vm's. Each vm has a disk > which is corresponding to a clvm logic volume within dom0. > I pretended to backup the lv's from dom0 doing a 'kpart -a' and a readonly > mount (with the vm running). Probably it is not very correct (but the > alternative snapshot was not possible with clvm). Did you know that if you mount an ext3 partition READ ONLY you could actually do a WRITE to that partition to replay the journal, and so cause possible data corruption? > Most of times, the operation is ok, and the backup finishes with problems, > but in other cases 'mount -r /dev/mapper/device /path' does not return and > it stays consuming time indefinitely. I cannot kill the process y have to > fence the node. > > How could I make the domU backups within dom0 without these problems? You can't. Not without clvm snapshot. What you could probably do : - do backup from within domU - do backup from the SAN, if it supports snapshot. -- Fajar From fdinitto at redhat.com Fri Sep 4 13:47:11 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Fri, 04 Sep 2009 15:47:11 +0200 Subject: [Linux-cluster] Question about "ccs_tool update" In-Reply-To: <4AA0F14C.2050104@bull.net> References: <4AA0E1FE.70903@bull.net> <1252060928.6387.8.camel@cerberus.int.fabbione.net> <4AA0F14C.2050104@bull.net> Message-ID: <1252072031.6387.14.camel@cerberus.int.fabbione.net> On Fri, 2009-09-04 at 12:51 +0200, Alain.Moulle wrote: > Hi Fabio, > and many thanks. But just another precision : > you mean that ccs_sync is making the job > now , in a hidden way when cman_tool -r version is > executed , right ? No, cman_tool doesn't invoke ccs_sync. > but does the fact that cluster.conf is in another place > than /etc/cluster matter for ccs_sync to work fine ? > because I just tried : > [root at oberon3 ~]# ccs_sync help > Unable to parse /etc/cluster/cluster.conf: No such file or directory > Does that mean that ccs_sync does not take in account the > /etc/sysconfig/cman file ? I have CC'ed Ryan that wrote ccs_sync. I really have no idea as I do scp manually my cluster.conf around. Fabio PS Pretty please, can you stop sending html colored messages? It's really hard to read black on blue. From rmccabe at redhat.com Fri Sep 4 15:16:36 2009 From: rmccabe at redhat.com (Ryan McCabe) Date: Fri, 4 Sep 2009 11:16:36 -0400 Subject: [Linux-cluster] Question about "ccs_tool update" In-Reply-To: <1252072031.6387.14.camel@cerberus.int.fabbione.net> References: <4AA0E1FE.70903@bull.net> <1252060928.6387.8.camel@cerberus.int.fabbione.net> <4AA0F14C.2050104@bull.net> <1252072031.6387.14.camel@cerberus.int.fabbione.net> Message-ID: <20090904151636.GB30811@redhat.com> On Fri, Sep 04, 2009 at 03:47:11PM +0200, Fabio M. Di Nitto wrote: > > because I just tried : > > [root at oberon3 ~]# ccs_sync help > > Unable to parse /etc/cluster/cluster.conf: No such file or directory > > Does that mean that ccs_sync does not take in account the > > /etc/sysconfig/cman file ? > > I have CC'ed Ryan that wrote ccs_sync. I really have no idea as I do scp > manually my cluster.conf around. Hi, It doesn't take /etc/sysconfig/cman into account currently. Could you please open a bz ticket about this, and I'll try to get it fixed as soon as possible. 
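In the meantime, one possible workaround (a sketch only, the non-default path below is purely illustrative) is to leave a symlink where ccs_sync does look:

    # ccs_sync currently reads /etc/cluster/cluster.conf unconditionally,
    # so point that path at the file kept in the non-default location
    ln -s /some/other/path/cluster.conf /etc/cluster/cluster.conf
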
Thanks, Ryan From fdinitto at redhat.com Fri Sep 4 15:22:22 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Fri, 04 Sep 2009 17:22:22 +0200 Subject: [Linux-cluster] Question about "ccs_tool update" In-Reply-To: <20090904151636.GB30811@redhat.com> References: <4AA0E1FE.70903@bull.net> <1252060928.6387.8.camel@cerberus.int.fabbione.net> <4AA0F14C.2050104@bull.net> <1252072031.6387.14.camel@cerberus.int.fabbione.net> <20090904151636.GB30811@redhat.com> Message-ID: <1252077742.6387.18.camel@cerberus.int.fabbione.net> On Fri, 2009-09-04 at 11:16 -0400, Ryan McCabe wrote: > On Fri, Sep 04, 2009 at 03:47:11PM +0200, Fabio M. Di Nitto wrote: > > > because I just tried : > > > [root at oberon3 ~]# ccs_sync help > > > Unable to parse /etc/cluster/cluster.conf: No such file or directory > > > Does that mean that ccs_sync does not take in account the > > > /etc/sysconfig/cman file ? > > > > I have CC'ed Ryan that wrote ccs_sync. I really have no idea as I do scp > > manually my cluster.conf around. > > Hi, > > It doesn't take /etc/sysconfig/cman into account currently. Could you > please open a bz ticket about this, and I'll try to get it fixed as soon > as possible. Ryan, it needs to take into account COROSYNC_CLUSTER_CONFIG_FILE env var either from the running environment or loaded via either /etc/sysconfig/cluster or /etc/sysconfig/cman for rpm based distros and /etc/default/cluster or /etc/default/cman on deb based distros. cman is always preferred over cluster. Fabio From ntadmin at fi.upm.es Fri Sep 4 19:42:24 2009 From: ntadmin at fi.upm.es (Miguel Sanchez) Date: Fri, 04 Sep 2009 21:42:24 +0200 Subject: [Linux-cluster] How backup domU partition from dom0? In-Reply-To: <7207d96f0909040615r26a36041p7ed978c73ad43753@mail.gmail.com> References: <4AA1032B.2090306@fi.upm.es> <7207d96f0909040615r26a36041p7ed978c73ad43753@mail.gmail.com> Message-ID: <4AA16DA0.6030602@fi.upm.es> Fajar A. Nugraha escribi?: > On Fri, Sep 4, 2009 at 7:08 PM, Miguel Sanchez wrote: > >> Hi. I have two hosts forming a cluster for run xen vm's. Each vm has a disk >> which is corresponding to a clvm logic volume within dom0. >> I pretended to backup the lv's from dom0 doing a 'kpart -a' and a readonly >> mount (with the vm running). Probably it is not very correct (but the >> alternative snapshot was not possible with clvm). >> > > Did you know that if you mount an ext3 partition READ ONLY you could > actually do a WRITE to that partition to replay the journal, and so > cause possible data corruption? > No, I don't. Then could the partition within dom0 be defined readonly with 'blockdev --setro' and avoid any write, direcly in the data as well as possible replaying the journal? -- Miguel. From fajar at fajar.net Sat Sep 5 02:04:16 2009 From: fajar at fajar.net (Fajar A. Nugraha) Date: Sat, 5 Sep 2009 09:04:16 +0700 Subject: [Linux-cluster] How backup domU partition from dom0? In-Reply-To: <4AA16DA0.6030602@fi.upm.es> References: <4AA1032B.2090306@fi.upm.es> <7207d96f0909040615r26a36041p7ed978c73ad43753@mail.gmail.com> <4AA16DA0.6030602@fi.upm.es> Message-ID: <7207d96f0909041904r6b1a16a0n1664d8941ee8e2d1@mail.gmail.com> On Sat, Sep 5, 2009 at 2:42 AM, Miguel Sanchez wrote: > Fajar A. Nugraha escribi?: >> >> Did you know that if you mount an ext3 partition READ ONLY you could >> actually do a WRITE to that partition to replay the journal, and so >> cause possible data corruption? >> > > No, I don't. 
Then could the partition within dom0 be defined readonly with > 'blockdev --setro' and avoid any write, direcly in the data as well as > possible replaying the journal? AFAIK if you do that you won't be able to replay the journal, and kernel will refuse to mount it :P -- Fajar From Luis.Cerezo at pgs.com Tue Sep 8 20:40:36 2009 From: Luis.Cerezo at pgs.com (Luis Cerezo) Date: Tue, 8 Sep 2009 15:40:36 -0500 Subject: [Linux-cluster] 3 node cluster and quorum disk? In-Reply-To: <1251832781.3209.548.camel@localhost.localdomain> References: <20090826161128.1e32721c@pc-jsosic.srce.hr> <1251832781.3209.548.camel@localhost.localdomain> Message-ID: <41AFBA96-80A5-4342-9A80-5178FF7E0C1A@pgs.com> how many votes do the other nodes have? Luis E. Cerezo Global IT GV: +1 412 223 7396 On Sep 1, 2009, at 2:19 PM, Lon Hohberger wrote: > On Wed, 2009-08-26 at 16:11 +0200, Jakov Sosic wrote: >> Hi. >> >> I have a situation - when two nodes are up in 3 node cluster, and one >> node goes down, cluster looses quorate - although I'm using qdiskd... > > >> >> >> >> >> > label="SAS-qdisk" status_file="/tmp/qdisk"/> > > > > If that doesn't fix it entirely, get rid of status_file, decrease > interval, and increase tko. Try: > > interval=2 tko=12 ? > > -- Lon > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster This e-mail, including any attachments and response string, may contain proprietary information which is confidential and may be legally privileged. It is for the intended recipient only. If you are not the intended recipient or transmission error has misdirected this e-mail, please notify the author by return e-mail and delete this message and any attachment immediately. If you are not the intended recipient you must not use, disclose, distribute, forward, copy, print or rely on this e-mail in any way except as permitted by the author. From pradhanparas at gmail.com Tue Sep 8 20:57:00 2009 From: pradhanparas at gmail.com (Paras pradhan) Date: Tue, 8 Sep 2009 15:57:00 -0500 Subject: [Linux-cluster] 3 node cluster and quorum disk? In-Reply-To: <20090826161128.1e32721c@pc-jsosic.srce.hr> References: <20090826161128.1e32721c@pc-jsosic.srce.hr> Message-ID: <8b711df40909081357p35d14345kaff6199b742efb76@mail.gmail.com> On Wed, Aug 26, 2009 at 9:11 AM, Jakov Sosic wrote: > Hi. > > I have a situation - when two nodes are up in 3 node cluster, and one > node goes down, cluster looses quorate - although I'm using qdiskd... > > I think that problem is in switching qdisk master from one node to > another. In that case, rgmanager disables all running services, which is > not acceptable situation. Services are currently set to > autostart="0" because cluster is in evaluation phase. > > Here is my config: > > > > ? ? ? ? > ? ? ? ? > ? ? ? ? > ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? > ? ? ? ? > > ? ? ? ? > ? ? ? ? > > > ? ? ? ? > ? ? ? ? > > ? ? ? ? > ? ? ? ? ? ? ? ?label="SAS-qdisk" status_file="/tmp/qdisk"/> > > ? 
? ? ? > ? ? ? ? > ? ? ? ? ? ? ? ? ? ? ? ?ipaddr="" login="" passwd="" name="node01-ipmi"/> > ? ? ? ? ? ? ? ? ? ? ? ?ipaddr="" login="" passwd="" name="node02-ipmi"/> > ? ? ? ? ? ? ? ? ? ? ? ?ipaddr="" login="" passwd="" name="node03-ipmi"/> > ? ? ? ? > > > > Should I change any of the timeouts? > > > > > > > -- > | ? ?Jakov Sosic ? ?| ? ?ICQ: 28410271 ? ?| ? PGP: 0x965CAE2D ? | > ================================================================= > | start fighting cancer -> http://www.worldcommunitygrid.org/ ? | > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > I ran into the same problem. I am also running a 3 nodes cluster with qdisk. Before, my node1 , node2 and nod3 has 1, 1, 2 votes and qdisk had 2. I ran into the same problem as u are having now. Then I change the votes from 2 to 1 to node 3 and added a vote to qdisk . Now it is running fine. I don't know what happed before. I have tested a lot but didnot succeeded. Now my, interval = 1 and tko=10 in my case. Paras. From alan.zg at gmail.com Tue Sep 8 22:34:11 2009 From: alan.zg at gmail.com (Alan A) Date: Tue, 8 Sep 2009 17:34:11 -0500 Subject: [Linux-cluster] Multicasting problems Message-ID: It has come to the point where our cluster production configuration has halted due to the unexpected issues with multicasting on LAN/WAN. The problem is that the firewall enabled on the switch ports does not support multicasting, and between cluster nodes and the routers lays firewall. Nodes -> Switch with integrated Firewall devices -> Router We are aware of problems encountered with Cisco switches and are trying to clear some things. For instance in RHEL Knowledgebase article 5933 it states: *The recommended method is to enable multicast routing for a given vlan so that the Catalyst will act as the IGMP querier. This consists of the following steps:* * * 1. *Enabling multicast on the switch globally* 2. *Choosing the vlan the cluster nodes are using* 3. *Turning on PIM routing for that subnet* My Questions: Can we enable PIM routing on the Server NIC itself without using dedicated network device? Meaning IGMP multicast would be managed by the NIC's itself from each node, can the nodes awarnes function this way? Any suggestions on how to get around firewall issue without purchesing firewalls with routing tables? Cisco switch model is: switch 6509 running 12.2(18) SXF and IGMP v2. -- Alan A. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sdake at redhat.com Tue Sep 8 22:38:55 2009 From: sdake at redhat.com (Steven Dake) Date: Tue, 08 Sep 2009 15:38:55 -0700 Subject: [Linux-cluster] Multicasting problems In-Reply-To: References: Message-ID: <1252449535.18865.7.camel@localhost.localdomain> On Tue, 2009-09-08 at 17:34 -0500, Alan A wrote: > It has come to the point where our cluster production configuration > has halted due to the unexpected issues with multicasting on LAN/WAN. > > The problem is that the firewall enabled on the switch ports does not > support multicasting, and between cluster nodes and the routers lays > firewall. > > Nodes -> Switch with integrated Firewall devices -> Router > > We are aware of problems encountered with Cisco switches and are > trying to clear some things. For instance in RHEL Knowledgebase > article 5933 it states: > > > The recommended method is to enable multicast routing for a given vlan > so that the Catalyst will act as the IGMP querier. This consists of > the following steps: > > > > 1. 
Enabling multicast on the switch globally > > 2. Choosing the vlan the cluster nodes are using > > 3. Turning on PIM routing for that subnet > > > My Questions: > > Can we enable PIM routing on the Server NIC itself without using > dedicated network device? Meaning IGMP multicast would be managed by > the NIC's itself from each node, can the nodes awarnes function this > way? > > Any suggestions on how to get around firewall issue without purchesing > firewalls with routing tables? > > Cisco switch model is: switch 6509 running 12.2(18) SXF and IGMP v2. > I'm afraid only Cisco (and maybe some Cisco experts on this list) knows the answers to your questions. I suggest you contact your Cisco TAC for advice on configuring their products. They can help you achieve best results. Regards -steve > > > -- > Alan A. > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From alan.zg at gmail.com Wed Sep 9 03:30:17 2009 From: alan.zg at gmail.com (Alan A) Date: Tue, 8 Sep 2009 22:30:17 -0500 Subject: [Linux-cluster] Multicasting problems In-Reply-To: <1252449535.18865.7.camel@localhost.localdomain> References: <1252449535.18865.7.camel@localhost.localdomain> Message-ID: Thank you for your input. We are contacting Cisco to get their input on this, but we have to explore RH options if any as well. Would there be a way to enable NIC (network device on the server) and make it IGMP aware somehow. In essence how can I make NIC's manage IGMP and PIM, is there a way? I know I can make a NIC in Linux become a router, but how do I make it IGMP and PIM aware on each node? On Tue, Sep 8, 2009 at 5:38 PM, Steven Dake wrote: > On Tue, 2009-09-08 at 17:34 -0500, Alan A wrote: > > It has come to the point where our cluster production configuration > > has halted due to the unexpected issues with multicasting on LAN/WAN. > > > > The problem is that the firewall enabled on the switch ports does not > > support multicasting, and between cluster nodes and the routers lays > > firewall. > > > > Nodes -> Switch with integrated Firewall devices -> Router > > > > We are aware of problems encountered with Cisco switches and are > > trying to clear some things. For instance in RHEL Knowledgebase > > article 5933 it states: > > > > > > The recommended method is to enable multicast routing for a given vlan > > so that the Catalyst will act as the IGMP querier. This consists of > > the following steps: > > > > > > > > 1. Enabling multicast on the switch globally > > > > 2. Choosing the vlan the cluster nodes are using > > > > 3. Turning on PIM routing for that subnet > > > > > > My Questions: > > > > Can we enable PIM routing on the Server NIC itself without using > > dedicated network device? Meaning IGMP multicast would be managed by > > the NIC's itself from each node, can the nodes awarnes function this > > way? > > > > Any suggestions on how to get around firewall issue without purchesing > > firewalls with routing tables? > > > > Cisco switch model is: switch 6509 running 12.2(18) SXF and IGMP v2. > > > > I'm afraid only Cisco (and maybe some Cisco experts on this list) knows > the answers to your questions. I suggest you contact your Cisco TAC for > advice on configuring their products. They can help you achieve best > results. > > Regards > -steve > > > > > > > > -- > > Alan A. 
> > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Alan A. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jakov.sosic at srce.hr Wed Sep 9 09:08:02 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Wed, 9 Sep 2009 11:08:02 +0200 Subject: [Linux-cluster] Multicasting problems In-Reply-To: References: Message-ID: <20090909110802.5a8c5113@nb-jsosic> On Tue, 8 Sep 2009 17:34:11 -0500 Alan A wrote: > It has come to the point where our cluster production configuration > has halted due to the unexpected issues with multicasting on LAN/WAN. > > The problem is that the firewall enabled on the switch ports does not > support multicasting, and between cluster nodes and the routers lays > firewall. > > Nodes -> Switch with integrated Firewall devices -> Router > > We are aware of problems encountered with Cisco switches and are > trying to clear some things. For instance in RHEL Knowledgebase > article 5933 it states: > > *The recommended method is to enable multicast routing for a given > vlan so that the Catalyst will act as the IGMP querier. This consists > of the following steps:* > > * * > > 1. > > *Enabling multicast on the switch globally* > 2. > > *Choosing the vlan the cluster nodes are using* > 3. > > *Turning on PIM routing for that subnet* > > > My Questions: > > Can we enable PIM routing on the Server NIC itself without using > dedicated network device? Meaning IGMP multicast would be managed by > the NIC's itself from each node, can the nodes awarnes function this > way? > > Any suggestions on how to get around firewall issue without purchesing > firewalls with routing tables? > > Cisco switch model is: switch 6509 running 12.2(18) SXF and IGMP v2. It seems that I was right with my diagnostics :D Why don't you create VLAN with private subnet addresses, in for example 10.0.0.0/8, and allow PIM on that VLAN, and trunk it with regular wlan that you use now. And then configure RHCS to heartbeat over this new private VLAN with enabled PIM? You wouldn't need the firewall because the VLAN would be used only for cluster communication, and it could be fully isolated. It does not need to be routed at all - because heartbeat packages go only between nodes. So no external access to that VLAN would be enabled. It's perfectly safe. If you need help on configuring either Cisco 6500 or RHEL for VLAN trunking please ask. Take a look at 802.1Q standard to understand the issue: http://en.wikipedia.org/wiki/IEEE_802.1Q -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From alan.zg at gmail.com Wed Sep 9 12:32:57 2009 From: alan.zg at gmail.com (Alan A) Date: Wed, 9 Sep 2009 07:32:57 -0500 Subject: [Linux-cluster] Multicasting problems In-Reply-To: <20090909110802.5a8c5113@nb-jsosic> References: <20090909110802.5a8c5113@nb-jsosic> Message-ID: The problem lays in creating the VLAN that allows PIM. Firewall and the switch are one physical device, and once the firewall is on it manages directly ports on the switch, and firewall is not capable (according to our LAN /WAN engineers) at least not on this Cisco model of managing or allowing PIM. 
For PIM we need other dedicated device that would handle Sparse/Dense mode before the firewall, which is a major problem. That is why I am interested in what can be done on the server side, what options can we enable on the NIC's directly to mimic PIM. Switch will allow IGMPv2 communication, but in our tests without Router like device with PIM enabled, we were unable to form the cluster. Each node woud send IGMP messages and it would be totally unaware of other nodes sending their messages. On Wed, Sep 9, 2009 at 4:08 AM, Jakov Sosic wrote: > On Tue, 8 Sep 2009 17:34:11 -0500 > Alan A wrote: > > > It has come to the point where our cluster production configuration > > has halted due to the unexpected issues with multicasting on LAN/WAN. > > > > The problem is that the firewall enabled on the switch ports does not > > support multicasting, and between cluster nodes and the routers lays > > firewall. > > > > Nodes -> Switch with integrated Firewall devices -> Router > > > > We are aware of problems encountered with Cisco switches and are > > trying to clear some things. For instance in RHEL Knowledgebase > > article 5933 it states: > > > > *The recommended method is to enable multicast routing for a given > > vlan so that the Catalyst will act as the IGMP querier. This consists > > of the following steps:* > > > > * * > > > > 1. > > > > *Enabling multicast on the switch globally* > > 2. > > > > *Choosing the vlan the cluster nodes are using* > > 3. > > > > *Turning on PIM routing for that subnet* > > > > > > My Questions: > > > > Can we enable PIM routing on the Server NIC itself without using > > dedicated network device? Meaning IGMP multicast would be managed by > > the NIC's itself from each node, can the nodes awarnes function this > > way? > > > > Any suggestions on how to get around firewall issue without purchesing > > firewalls with routing tables? > > > > Cisco switch model is: switch 6509 running 12.2(18) SXF and IGMP v2. > > It seems that I was right with my diagnostics :D > > > Why don't you create VLAN with private subnet addresses, in for example > 10.0.0.0/8, and allow PIM on that VLAN, and trunk it with regular > wlan that you use now. And then configure RHCS to heartbeat over > this new private VLAN with enabled PIM? You wouldn't need the firewall > because the VLAN would be used only for cluster communication, and it > could be fully isolated. It does not need to be routed at all - because > heartbeat packages go only between nodes. So no external access to that > VLAN would be enabled. It's perfectly safe. > > If you need help on configuring either Cisco 6500 or RHEL for VLAN > trunking please ask. Take a look at 802.1Q standard to understand the > issue: > > http://en.wikipedia.org/wiki/IEEE_802.1Q > > > > -- > | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | > ================================================================= > | start fighting cancer -> http://www.worldcommunitygrid.org/ | > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Alan A. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luis.Cerezo at pgs.com Wed Sep 9 18:22:19 2009 From: Luis.Cerezo at pgs.com (Luis Cerezo) Date: Wed, 9 Sep 2009 13:22:19 -0500 Subject: [Linux-cluster] Multicasting problems In-Reply-To: References: <20090909110802.5a8c5113@nb-jsosic> Message-ID: this may be completely unhelpful... have you tried changing the mcast address of the cluster? Luis E. 
Cerezo Global IT GV: +1 412 223 7396 On Sep 9, 2009, at 7:32 AM, Alan A wrote: The problem lays in creating the VLAN that allows PIM. Firewall and the switch are one physical device, and once the firewall is on it manages directly ports on the switch, and firewall is not capable (according to our LAN /WAN engineers) at least not on this Cisco model of managing or allowing PIM. For PIM we need other dedicated device that would handle Sparse/Dense mode before the firewall, which is a major problem. That is why I am interested in what can be done on the server side, what options can we enable on the NIC's directly to mimic PIM. Switch will allow IGMPv2 communication, but in our tests without Router like device with PIM enabled, we were unable to form the cluster. Each node woud send IGMP messages and it would be totally unaware of other nodes sending their messages. On Wed, Sep 9, 2009 at 4:08 AM, Jakov Sosic > wrote: On Tue, 8 Sep 2009 17:34:11 -0500 Alan A > wrote: > It has come to the point where our cluster production configuration > has halted due to the unexpected issues with multicasting on LAN/WAN. > > The problem is that the firewall enabled on the switch ports does not > support multicasting, and between cluster nodes and the routers lays > firewall. > > Nodes -> Switch with integrated Firewall devices -> Router > > We are aware of problems encountered with Cisco switches and are > trying to clear some things. For instance in RHEL Knowledgebase > article 5933 it states: > > *The recommended method is to enable multicast routing for a given > vlan so that the Catalyst will act as the IGMP querier. This consists > of the following steps:* > > * * > > 1. > > *Enabling multicast on the switch globally* > 2. > > *Choosing the vlan the cluster nodes are using* > 3. > > *Turning on PIM routing for that subnet* > > > My Questions: > > Can we enable PIM routing on the Server NIC itself without using > dedicated network device? Meaning IGMP multicast would be managed by > the NIC's itself from each node, can the nodes awarnes function this > way? > > Any suggestions on how to get around firewall issue without purchesing > firewalls with routing tables? > > Cisco switch model is: switch 6509 running 12.2(18) SXF and IGMP v2. It seems that I was right with my diagnostics :D Why don't you create VLAN with private subnet addresses, in for example 10.0.0.0/8, and allow PIM on that VLAN, and trunk it with regular wlan that you use now. And then configure RHCS to heartbeat over this new private VLAN with enabled PIM? You wouldn't need the firewall because the VLAN would be used only for cluster communication, and it could be fully isolated. It does not need to be routed at all - because heartbeat packages go only between nodes. So no external access to that VLAN would be enabled. It's perfectly safe. If you need help on configuring either Cisco 6500 or RHEL for VLAN trunking please ask. Take a look at 802.1Q standard to understand the issue: http://en.wikipedia.org/wiki/IEEE_802.1Q -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -- Alan A. 
-- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster This e-mail, including any attachments and response string, may contain proprietary information which is confidential and may be legally privileged. It is for the intended recipient only. If you are not the intended recipient or transmission error has misdirected this e-mail, please notify the author by return e-mail and delete this message and any attachment immediately. If you are not the intended recipient you must not use, disclose, distribute, forward, copy, print or rely on this e-mail in any way except as permitted by the author. From alan.zg at gmail.com Wed Sep 9 18:38:03 2009 From: alan.zg at gmail.com (Alan A) Date: Wed, 9 Sep 2009 13:38:03 -0500 Subject: [Linux-cluster] Multicasting problems In-Reply-To: References: <20090909110802.5a8c5113@nb-jsosic> Message-ID: Haven't done that but I am not positive that it would help in our setting. My tests were to establish private VLAN with 3 private addresses for 3 node cluster. I hade node1 on 192.168.10.21, node2 192.168.10.22, and node3 on 192.168.10.23. I could ping each node from each node, so node1 would see node2 and node3, node2 would see node1 and node3, and node3 would see node1 and node2. I made /etc/host entries and checked with the 'route' command that the device eth2 on each node was dedicated to access private network on 192.168.10.2x, as it showed. There was no additional network devise on the Cisco switch, just the 3 cluster nodes. I issued cman_tool status command and got the multicast address - checked that it is the same on all three nodes and when I pinged the address I just got the dead air..... Nothing... I tried this by forcing cluster via sysclt command to use IGMPv1 v2 and v3... None worked. On Wed, Sep 9, 2009 at 1:22 PM, Luis Cerezo wrote: > this may be completely unhelpful... > > have you tried changing the mcast address of the cluster? > > Luis E. Cerezo > Global IT > GV: +1 412 223 7396 > > On Sep 9, 2009, at 7:32 AM, Alan A wrote: > > The problem lays in creating the VLAN that allows PIM. Firewall and the > switch are one physical device, and once the firewall is on it manages > directly ports on the switch, and firewall is not capable (according to our > LAN /WAN engineers) at least not on this Cisco model of managing or allowing > PIM. For PIM we need other dedicated device that would handle Sparse/Dense > mode before the firewall, which is a major problem. That is why I am > interested in what can be done on the server side, what options can we > enable on the NIC's directly to mimic PIM. Switch will allow IGMPv2 > communication, but in our tests without Router like device with PIM enabled, > we were unable to form the cluster. Each node woud send IGMP messages and it > would be totally unaware of other nodes sending their messages. > > On Wed, Sep 9, 2009 at 4:08 AM, Jakov Sosic jakov.sosic at srce.hr>> wrote: > On Tue, 8 Sep 2009 17:34:11 -0500 > Alan A > wrote: > > > It has come to the point where our cluster production configuration > > has halted due to the unexpected issues with multicasting on LAN/WAN. > > > > The problem is that the firewall enabled on the switch ports does not > > support multicasting, and between cluster nodes and the routers lays > > firewall. > > > > Nodes -> Switch with integrated Firewall devices -> Router > > > > We are aware of problems encountered with Cisco switches and are > > trying to clear some things. 
For instance in RHEL Knowledgebase > > article 5933 it states: > > > > *The recommended method is to enable multicast routing for a given > > vlan so that the Catalyst will act as the IGMP querier. This consists > > of the following steps:* > > > > * * > > > > 1. > > > > *Enabling multicast on the switch globally* > > 2. > > > > *Choosing the vlan the cluster nodes are using* > > 3. > > > > *Turning on PIM routing for that subnet* > > > > > > My Questions: > > > > Can we enable PIM routing on the Server NIC itself without using > > dedicated network device? Meaning IGMP multicast would be managed by > > the NIC's itself from each node, can the nodes awarnes function this > > way? > > > > Any suggestions on how to get around firewall issue without purchesing > > firewalls with routing tables? > > > > Cisco switch model is: switch 6509 running 12.2(18) SXF and IGMP v2. > > It seems that I was right with my diagnostics :D > > > Why don't you create VLAN with private subnet addresses, in for example > 10.0.0.0/8, and allow PIM on that VLAN, and trunk it > with regular > wlan that you use now. And then configure RHCS to heartbeat over > this new private VLAN with enabled PIM? You wouldn't need the firewall > because the VLAN would be used only for cluster communication, and it > could be fully isolated. It does not need to be routed at all - because > heartbeat packages go only between nodes. So no external access to that > VLAN would be enabled. It's perfectly safe. > > If you need help on configuring either Cisco 6500 or RHEL for VLAN > trunking please ask. Take a look at 802.1Q standard to understand the > issue: > > http://en.wikipedia.org/wiki/IEEE_802.1Q > > > > -- > | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | > ================================================================= > | start fighting cancer -> http://www.worldcommunitygrid.org/ | > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Alan A. > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > This e-mail, including any attachments and response string, may contain > proprietary information which is confidential and may be legally privileged. > It is for the intended recipient only. If you are not the intended recipient > or transmission error has misdirected this e-mail, please notify the author > by return e-mail and delete this message and any attachment immediately. If > you are not the intended recipient you must not use, disclose, distribute, > forward, copy, print or rely on this e-mail in any way except as permitted > by the author. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Alan A. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luis.Cerezo at pgs.com Wed Sep 9 20:47:54 2009 From: Luis.Cerezo at pgs.com (Luis Cerezo) Date: Wed, 9 Sep 2009 15:47:54 -0500 Subject: [Linux-cluster] Multicasting problems In-Reply-To: References: <20090909110802.5a8c5113@nb-jsosic> Message-ID: <04FEBDD4-54DA-4457-A68D-BD4BE379D023@pgs.com> try adding something like to your cluster.conf (of course uptick the rev, ccs_tool update..) -luis Luis E. Cerezo Global IT GV: +1 412 223 7396 On Sep 9, 2009, at 1:38 PM, Alan A wrote: Haven't done that but I am not positive that it would help in our setting. 
From Luis.Cerezo at pgs.com  Wed Sep  9 20:47:54 2009
From: Luis.Cerezo at pgs.com (Luis Cerezo)
Date: Wed, 9 Sep 2009 15:47:54 -0500
Subject: [Linux-cluster] Multicasting problems
In-Reply-To: 
References: <20090909110802.5a8c5113@nb-jsosic>
Message-ID: <04FEBDD4-54DA-4457-A68D-BD4BE379D023@pgs.com>

try adding something like to your cluster.conf (of course uptick the rev, ccs_tool update..)

-luis

Luis E. Cerezo
Global IT
GV: +1 412 223 7396

On Sep 9, 2009, at 1:38 PM, Alan A wrote:

Haven't done that, but I am not positive that it would help in our setting.

My tests were to establish a private VLAN with 3 private addresses for a 3 node cluster. I had node1 on 192.168.10.21, node2 on 192.168.10.22, and node3 on 192.168.10.23. I could ping each node from each node, so node1 would see node2 and node3, node2 would see node1 and node3, and node3 would see node1 and node2. I made /etc/hosts entries and checked with the 'route' command that the device eth2 on each node was dedicated to the private network on 192.168.10.2x, as it showed. There was no additional network device on the Cisco switch, just the 3 cluster nodes.

I issued the cman_tool status command and got the multicast address - checked that it is the same on all three nodes - and when I pinged that address I just got dead air... Nothing... I also tried forcing the cluster, via sysctl, to use IGMPv1, v2 and v3... None worked.

On Wed, Sep 9, 2009 at 1:22 PM, Luis Cerezo <Luis.Cerezo at pgs.com> wrote:

this may be completely unhelpful...

have you tried changing the mcast address of the cluster?

Luis E. Cerezo
Global IT
GV: +1 412 223 7396

On Sep 9, 2009, at 7:32 AM, Alan A wrote:

The problem lies in creating the VLAN that allows PIM. The firewall and the switch are one physical device; once the firewall is on, it directly manages the ports on the switch, and the firewall is not capable (according to our LAN/WAN engineers), at least not on this Cisco model, of managing or allowing PIM. For PIM we would need another dedicated device to handle Sparse/Dense mode in front of the firewall, which is a major problem. That is why I am interested in what can be done on the server side - what options we can enable on the NICs directly to mimic PIM. The switch will allow IGMPv2 communication, but in our tests, without a router-like device with PIM enabled, we were unable to form the cluster. Each node would send IGMP messages and be totally unaware of the other nodes sending theirs.

On Wed, Sep 9, 2009 at 4:08 AM, Jakov Sosic <jakov.sosic at srce.hr> wrote:
On Tue, 8 Sep 2009 17:34:11 -0500, Alan A <alan.zg at gmail.com> wrote:

> It has come to the point where our cluster production configuration
> has halted due to unexpected issues with multicasting on the LAN/WAN.
>
> The problem is that the firewall enabled on the switch ports does not
> support multicasting, and between the cluster nodes and the routers lies
> the firewall.
>
> Nodes -> Switch with integrated firewall devices -> Router
>
> We are aware of the problems encountered with Cisco switches and are
> trying to clear some things up. For instance, RHEL Knowledgebase article
> 5933 states:
>
> "The recommended method is to enable multicast routing for a given
> vlan so that the Catalyst will act as the IGMP querier. This consists
> of the following steps:
>
>   1. Enabling multicast on the switch globally
>   2. Choosing the vlan the cluster nodes are using
>   3. Turning on PIM routing for that subnet"
>
> My questions:
>
> Can we enable PIM routing on the server NIC itself, without using a
> dedicated network device? Meaning, IGMP multicast would be managed by
> the NICs themselves on each node - can the node awareness function this
> way?
>
> Any suggestions on how to get around the firewall issue without
> purchasing firewalls with routing tables?
>
> The Cisco switch model is: 6509 running 12.2(18)SXF and IGMP v2.

It seems that I was right with my diagnostics :D

Why don't you create a VLAN with private subnet addresses, for example in 10.0.0.0/8, allow PIM on that VLAN, and trunk it with the regular vlan that you use now? Then configure RHCS to heartbeat over this new private VLAN with PIM enabled. You wouldn't need the firewall, because the VLAN would be used only for cluster communication and it could be fully isolated. It does not need to be routed at all - heartbeat packets go only between nodes - so no external access to that VLAN would be enabled. It's perfectly safe.

If you need help configuring either the Cisco 6500 or RHEL for VLAN trunking, please ask. Take a look at the 802.1Q standard to understand the issue:

http://en.wikipedia.org/wiki/IEEE_802.1Q

--
| Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

--
Alan A.

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
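The example element Luis refers to was lost when his HTML attachment was scrubbed, so the snippet below is only a reconstruction of the kind of change he appears to be describing: pinning cman to an explicit multicast address in cluster.conf, bumping the revision, and pushing it out with ccs_tool. The cluster name, config_version and address are placeholders, not the contents of the original attachment.

    <cluster name="mycluster" config_version="3">
      <cman>
        <!-- override the automatically derived multicast group -->
        <multicast addr="239.192.100.1"/>
      </cman>
      <!-- clusternodes, fencedevices, rm sections left as they were -->
    </cluster>

    # "uptick the rev" (config_version above), then propagate to all members:
    ccs_tool update /etc/cluster/cluster.conf

Whether changing the address actually helps depends on why the switch is not forwarding the traffic; if IGMP snooping is dropping unregistered groups at the firewall/switch, a different 239.x.x.x address alone may not change the outcome.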
From esggrupos at gmail.com  Thu Sep 10 10:51:38 2009
From: esggrupos at gmail.com (ESGLinux)
Date: Thu, 10 Sep 2009 12:51:38 +0200
Subject: [Linux-cluster] do I have a fence DRAC device?
In-Reply-To: <3128ba140908180535o4f62b011vc41e5ec6517ac388@mail.gmail.com>
References: <3128ba140908100324l6cdb4c34ra5f5edb39c6903e9@mail.gmail.com> <8b711df40908101134t69b8e12cof6cc551809421e45@mail.gmail.com> <3128ba140908170350ge619930w5c17368ff0d3cf42@mail.gmail.com> <8b711df40908171128j1fc18525nfd7df01d7604cda0@mail.gmail.com> <3128ba140908180535o4f62b011vc41e5ec6517ac388@mail.gmail.com>
Message-ID: <3128ba140909100351s544df2e5k8878324907ae09b5@mail.gmail.com>

Hi all,

After a long time without the opportunity to check the boot process of my server to see the message, I have finally done it.

I can see the following message:

    BMC Revision 2.05
    Remote Access Configuration Utility 1.25

I enter the utility by pressing F2, and I have configured the IP to 192.168.1.250.

Now I can ping that IP, but nothing more:

    ping 192.168.1.250
    PING 192.168.1.250 (192.168.1.250) 56(84) bytes of data.
    64 bytes from 192.168.1.250: icmp_seq=1 ttl=128 time=60.3 ms

Does anybody know what I need to do to be able to manage the server?

Thanks,

ESG
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
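What ESG describes sounds like a baseboard BMC reachable over IPMI-on-LAN rather than a full DRAC card, and Juanra's reply below points to ipmitool. Assuming IPMI over LAN has been enabled in that F2 utility with a user and password, the kind of commands involved would look like the following sketch; the address is the one from the post, while the credentials are placeholders.

    # from another machine, query the BMC and check the power state
    # (use -I lanplus instead of -I lan if the BMC requires IPMI 2.0)
    ipmitool -I lan -H 192.168.1.250 -U admin -P secret chassis power status

    # the same interface is what the RHCS IPMI fence agent drives, e.g.:
    fence_ipmilan -a 192.168.1.250 -l admin -p secret -o status

If these respond, the BMC can then be configured as a fence device in cluster.conf using fence_ipmilan.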
From robejrm at gmail.com  Thu Sep 10 11:04:48 2009
From: robejrm at gmail.com (Juan Ramon Martin Blanco)
Date: Thu, 10 Sep 2009 13:04:48 +0200
Subject: [Linux-cluster] do I have a fence DRAC device?
In-Reply-To: <3128ba140909100351s544df2e5k8878324907ae09b5@mail.gmail.com>
References: <3128ba140908100324l6cdb4c34ra5f5edb39c6903e9@mail.gmail.com> <8b711df40908101134t69b8e12cof6cc551809421e45@mail.gmail.com> <3128ba140908170350ge619930w5c17368ff0d3cf42@mail.gmail.com> <8b711df40908171128j1fc18525nfd7df01d7604cda0@mail.gmail.com> <3128ba140908180535o4f62b011vc41e5ec6517ac388@mail.gmail.com> <3128ba140909100351s544df2e5k8878324907ae09b5@mail.gmail.com>
Message-ID: <8a5668960909100404u3d86f7cbv6be51d4530527b23@mail.gmail.com>

On Thu, Sep 10, 2009 at 12:51 PM, ESGLinux <esggrupos at gmail.com> wrote:

> Hi all,
> after a long time without the opportunity to check the boot process of my
> server to see the message, I have finally done it.
>
> I can see the following message:
> BMC Revision 2.05
> Remote Access Configuration Utility 1.25
>
> I enter the utility by pressing F2, and I have configured the IP to
> 192.168.1.250.
>
> Now I can ping that IP, but nothing more:
> ping 192.168.1.250
> PING 192.168.1.250 (192.168.1.250) 56(84) bytes of data.
> 64 bytes from 192.168.1.250: icmp_seq=1 ttl=128 time=60.3 ms
>
> Does anybody know what I need to do to be able to manage the server?
>
From another machine, connect to the BMC IP using the ipmitool utility.

man ipmitool ;)

Greetings,
Juanra

> Thanks
>
> ESG
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gianluca.cecchi at gmail.com  Thu Sep 10 11:29:27 2009
From: gianluca.cecchi at gmail.com (Gianluca Cecchi)
Date: Thu, 10 Sep 2009 13:29:27 +0200
Subject: [Linux-cluster] where exactly are cluster services stopped during shutdown?
Message-ID: <561c252c0909100429q7671ad0cj3880792a603a24a5@mail.gmail.com>

Hello,
suppose I have a service srvname defined in chkconfig, and I would like to insert it as a resource/service in my cluster.conf (version 3 of the cluster stack as found in F11, but answers for version 2 as in RHEL 5 are also welcome if the behaviour differs).

So my cluster.conf is something like this: