From alex at joelly.net Sun May 2 14:33:17 2010 From: alex at joelly.net (Joelly Alexander) Date: Sun, 02 May 2010 16:33:17 +0200 Subject: [Linux-cluster] nfs mountpoints Message-ID: <4BDD8D2D.9070702@joelly.net> Hello, I have a working 2-node cluster which serves a gfs2 mountpoint via nfs; how do I configure it so that only subdirectories under the gfs2 mountpoint are exported via nfs, and not the entire gfs2 mountpoint? I do not want to create many gfs2 mountpoints and export them as different nfs exports - this should be possible with only one gfs2 mountpoint as well, shouldn't it? It is possible to configure nfs outside the rhcs, but if possible I want to have it inside... thx,
From alex at joelly.net Sun May 2 18:05:26 2010 From: alex at joelly.net (Joelly Alexander) Date: Sun, 02 May 2010 20:05:26 +0200 Subject: [Linux-cluster] nfs mountpoints In-Reply-To: <4BDD8D2D.9070702@joelly.net> References: <4BDD8D2D.9070702@joelly.net> Message-ID: <4BDDBEE6.90904@joelly.net> Found it by myself - path does the trick... thx On 02.05.2010 16:33, Joelly Alexander wrote: > Hello, > > I have a working 2-node cluster which serves a gfs2 mountpoint via nfs; > how do I configure it so that only subdirectories under the gfs2 mountpoint are > exported via nfs, and not the entire gfs2 mountpoint? > I do not want to create many gfs2 mountpoints and export them as > different nfs exports - this should be possible with only one gfs2 > mountpoint as well, shouldn't it? > It is possible to configure nfs outside the rhcs, but if possible I > want to have it inside... > > thx, > > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster
From swhiteho at redhat.com Tue May 4 15:50:31 2010 From: swhiteho at redhat.com (Steven Whitehouse) Date: Tue, 04 May 2010 16:50:31 +0100 Subject: [Linux-cluster] nfs mountpoints In-Reply-To: <4BDD8D2D.9070702@joelly.net> References: <4BDD8D2D.9070702@joelly.net> Message-ID: <1272988231.7196.372.camel@localhost.localdomain> Hi, On Sun, 2010-05-02 at 16:33 +0200, Joelly Alexander wrote: > Hello, > > I have a working 2-node cluster which serves a gfs2 mountpoint via nfs; > how do I configure it so that only subdirectories under the gfs2 mountpoint are exported via nfs, and not the entire gfs2 mountpoint? > I do not want to create many gfs2 mountpoints and export them as different nfs exports - this should be possible with only one gfs2 mountpoint as well, shouldn't it? > It is possible to configure nfs outside the rhcs, but if possible I want to have it inside... > > thx, > I'd suggest bind mounting the required subdirs somewhere and exporting that directory, Steve. > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster
From swhiteho at redhat.com Tue May 4 15:55:38 2010 From: swhiteho at redhat.com (Steven Whitehouse) Date: Tue, 04 May 2010 16:55:38 +0100 Subject: [Linux-cluster] Maximum number of nodes In-Reply-To: References: Message-ID: <1272988538.7196.376.camel@localhost.localdomain> Hi, On Fri, 2010-04-30 at 14:37 -0500, Dusty wrote: > Hello, > > Regarding the component versions of "Redhat Cluster Suite" as released > on the 5.4 and 5.5 ISOs...: > > What is the maximum number of nodes that will work within a single > cluster? > The limit is 16 nodes.
> From where do the limitations come? GFS2? Qdisk? What if not using > qdisk? What if not using GFS2? > > Thank you! The limit is down to what we can reasonably test, and thus what is supported. The theoretical limit is much higher and it may be possible to raise the supported node limits in future, Steve. > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From jeff.sturm at eprize.com Tue May 4 18:14:20 2010 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Tue, 4 May 2010 14:14:20 -0400 Subject: [Linux-cluster] Maximum number of nodes In-Reply-To: <1272988538.7196.376.camel@localhost.localdomain> References: <1272988538.7196.376.camel@localhost.localdomain> Message-ID: <64D0546C5EBBD147B75DE133D798665F055D917F@hugo.eprize.local> > -----Original Message----- > > What is the maximum number of nodes that will work within a single > > cluster? > > > The limit is 16 nodes. > > > From where do the limitations come? GFS2? Qdisk? What if not using > > qdisk? What if not using GFS2? > > > > Thank you! > The limit is down to what we can reasonably test, and thus what is > supported. The theoretical limit is much higher and it may be possible > to raise the supported node limits in future, As a data point, we run a production cluster with 24 nodes and haven't observed any adverse effects. Our cluster has grown slowly from a smaller initial deployment by adding nodes to meet capacity. (However we don't run a Red Hat OS nor are we a Red Hat support customer. Therefore we are relying on our own resources, community goodwill, and a little bit of luck to keep this thing running.) A few notes about our deployment... CMAN, based on OpenAIS (now CoroSync), is a remarkably efficient cluster monitor. Virtual Synchrony is an elegant protocol and appears to scale well in practice. We don't observe significant overhead on our network interfaces due to cluster traffic, and we don't see erratic behavior due to latency as the cluster grows (though totem parameters may have to be adjusted at some point). That said, the cluster is only as good as your network, and your network *must* handle IP multicast properly. (If you ever suspect a network is faulty somewhere, try running a cluster on it. You suspicions may be quickly confirmed!) DLM and GFS are not part of CMAN, but work alongside it. I don't know what limits they may have. I suspect we'd reach throughput limits on our SAN before anything else if we tried to grow our cluster significantly--we're already at several thousand iops sustained, and the SAN is a specialized component that doesn't scale just by adding cluster nodes. DLM scalabilities depends highly on the application--when we made our app locality-aware, locking problems went away. We have many GFS filesystems, not just one, and none of them span all nodes of our cluster. The largest one is mounted across 22 nodes. I don't plan to increase the node count of our cluster much further, if at all. With increasing multicore hardware available, we're more likely to scale up by replacing nodes with 8-way or 16-way units. (I'm curious to know what the practical limits are, but don't plan to really push the envelope in a production cluster.) 
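A minimal sketch of the totem tuning Jeff alludes to, assuming a cman-based cluster.conf; the token value below is purely illustrative and the right number depends on your node count and network:

    # /etc/cluster/cluster.conf: the <totem/> element sits directly under <cluster>.
    # Raising the token timeout gives a larger cluster more time to finish a
    # membership round before a node is declared dead (value in ms, illustrative):
    #   <totem token="30000"/>
    # After bumping config_version, propagate and sanity-check (RHEL 5 era tooling):
    ccs_tool update /etc/cluster/cluster.conf
    cman_tool status | grep -E 'Nodes|Expected votes|Total votes'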
Jeff From dxh at yahoo.com Wed May 5 18:03:21 2010 From: dxh at yahoo.com (Don Hoover) Date: Wed, 5 May 2010 11:03:21 -0700 (PDT) Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices instead of /dev/dm-x Message-ID: <140728.23705.qm@web65510.mail.ac4.yahoo.com> I have recently learned that its dangerous to use /dev/dm-x devices for LVM because they do not function correctly with multipath, and that you should always use the /dev/mapper/mpathx devices. Something about the fact that udev actually destroys and recreates the dm devices when paths change. But, qdisk seems to prefer using dm-x devices. Is there a way to get it to prefer using /dev/mapper/mpathx devices if they are available? I know you can spec the device instead of the label, but I would like to avoid that. From cmaiolino at redhat.com Wed May 5 20:37:41 2010 From: cmaiolino at redhat.com (Carlos Maiolino) Date: Wed, 5 May 2010 17:37:41 -0300 Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices instead of /dev/dm-x In-Reply-To: <140728.23705.qm@web65510.mail.ac4.yahoo.com> References: <140728.23705.qm@web65510.mail.ac4.yahoo.com> Message-ID: <20100505203741.GA22566@andromeda.usersys.redhat.com> On Wed, May 05, 2010 at 11:03:21AM -0700, Don Hoover wrote: > I have recently learned that its dangerous to use /dev/dm-x devices for LVM because they do not function correctly with multipath, and that you should always use the /dev/mapper/mpathx devices. Something about the fact that udev actually destroys and recreates the dm devices when paths change. > > But, qdisk seems to prefer using dm-x devices. Is there a way to get it to prefer using /dev/mapper/mpathx devices if they are available? > > I know you can spec the device instead of the label, but I would like to avoid that. > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster If I understant correctly your question and what you wrote, You are specifing qdisk label in cluster.conf. this is sufficient, independent of which device it will use to mount qdisk partition during qdiskd startup. -- --- Best Regards Carlos Eduardo Maiolino Software Maintenance Engineer Red Hat - Global Support Services From kitgerrits at gmail.com Thu May 6 00:01:57 2010 From: kitgerrits at gmail.com (Kit Gerrits) Date: Thu, 6 May 2010 02:01:57 +0200 Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices insteadof /dev/dm-x In-Reply-To: <140728.23705.qm@web65510.mail.ac4.yahoo.com> Message-ID: <4be206f7.0e67f10a.3af0.1600@mx.google.com> The danger in using /dev/dm-x is that there is no guarantee that the same device will show up at the same place after a reboot. Seeing as you use disklabels, this becomes a non-issue. Aside from that, I am not aware of any other reasons not to use /dev/dm-x -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Don Hoover Sent: woensdag 5 mei 2010 20:03 To: linux-cluster at redhat.com Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices insteadof /dev/dm-x I have recently learned that its dangerous to use /dev/dm-x devices for LVM because they do not function correctly with multipath, and that you should always use the /dev/mapper/mpathx devices. Something about the fact that udev actually destroys and recreates the dm devices when paths change. But, qdisk seems to prefer using dm-x devices. 
Is there a way to get it to prefer using /dev/mapper/mpathx devices if they are available? I know you can spec the device instead of the label, but I would like to avoid that. -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster No virus found in this incoming message. Checked by AVG - www.avg.com Version: 9.0.814 / Virus Database: 271.1.1/2853 - Release Date: 05/05/10 08:26:00 From celsowebber at yahoo.com Thu May 6 04:00:20 2010 From: celsowebber at yahoo.com (Celso K. Webber) Date: Wed, 5 May 2010 21:00:20 -0700 (PDT) Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices instead of /dev/dm-x In-Reply-To: <20100505203741.GA22566@andromeda.usersys.redhat.com> References: <140728.23705.qm@web65510.mail.ac4.yahoo.com> <20100505203741.GA22566@andromeda.usersys.redhat.com> Message-ID: <106818.2620.qm@web111712.mail.gq1.yahoo.com> Hello all, In my experience, I tend no to use the label in qdisk specification, because it leads qdisk to use the first device it encounters containing that label. Another case is when the device is not found at all, this happened to me before. I know your question is related to native DM Multipathing, but under EMC storages, using EMC's PowerPath multipath software, there is no way to force qdisk to use the "pseudo device" other than specifying the "mutipath device" instead of the label of the quorum device. Specifically under EMC PowerPath, you may end up with qdisk pointing to one of the devices that corresponds to a single path (/dev/sdXXX), instead of the pseudo device that passes the MP software for failover (/dev/emcpowerXXX under EMC PowerPath). So I always specify the device I want whenever the multipath software maintains multiple devices to the same LUN. Some MP software (for instance, Linux RDAC) "hide" the various devices for the same LUN, so you'll end with only one single device per LUN. In this case, you can use the qdisk label safely. Hope this helps. Regards, Celso. ----- Original Message ---- From: Carlos Maiolino To: linux clustering Sent: Wed, May 5, 2010 5:37:41 PM Subject: Re: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices instead of /dev/dm-x On Wed, May 05, 2010 at 11:03:21AM -0700, Don Hoover wrote: > I have recently learned that its dangerous to use /dev/dm-x devices for LVM because they do not function correctly with multipath, and that you should always use the /dev/mapper/mpathx devices. Something about the fact that udev actually destroys and recreates the dm devices when paths change. > > But, qdisk seems to prefer using dm-x devices. Is there a way to get it to prefer using /dev/mapper/mpathx devices if they are available? > > I know you can spec the device instead of the label, but I would like to avoid that. > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster If I understant correctly your question and what you wrote, You are specifing qdisk label in cluster.conf. this is sufficient, independent of which device it will use to mount qdisk partition during qdiskd startup. -- --- Best Regards Carlos Eduardo Maiolino Software Maintenance Engineer Red Hat - Global Support Services -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From celsowebber at yahoo.com Thu May 6 04:01:42 2010 From: celsowebber at yahoo.com (Celso K. 
Webber) Date: Wed, 5 May 2010 21:01:42 -0700 (PDT) Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices insteadof /dev/dm-x In-Reply-To: <4be206f7.0e67f10a.3af0.1600@mx.google.com> References: <4be206f7.0e67f10a.3af0.1600@mx.google.com> Message-ID: <640443.89841.qm@web111719.mail.gq1.yahoo.com> Or you can map a specific LUN device ID number with a specific name under /etc/multipath.conf, so you'll always have the same device name for a specific LUN designated for the quorum device. Regards, Celso ----- Original Message ---- From: Kit Gerrits To: linux clustering Sent: Wed, May 5, 2010 9:01:57 PM Subject: Re: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices insteadof /dev/dm-x The danger in using /dev/dm-x is that there is no guarantee that the same device will show up at the same place after a reboot. Seeing as you use disklabels, this becomes a non-issue. Aside from that, I am not aware of any other reasons not to use /dev/dm-x -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Don Hoover Sent: woensdag 5 mei 2010 20:03 To: linux-cluster at redhat.com Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices insteadof /dev/dm-x I have recently learned that its dangerous to use /dev/dm-x devices for LVM because they do not function correctly with multipath, and that you should always use the /dev/mapper/mpathx devices. Something about the fact that udev actually destroys and recreates the dm devices when paths change. But, qdisk seems to prefer using dm-x devices. Is there a way to get it to prefer using /dev/mapper/mpathx devices if they are available? I know you can spec the device instead of the label, but I would like to avoid that. -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster No virus found in this incoming message. Checked by AVG - www.avg.com Version: 9.0.814 / Virus Database: 271.1.1/2853 - Release Date: 05/05/10 08:26:00 -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From andrew at ntsg.umt.edu Thu May 6 15:21:05 2010 From: andrew at ntsg.umt.edu (Andrew A. Neuschwander) Date: Thu, 06 May 2010 09:21:05 -0600 Subject: [Linux-cluster] gfs rg and journal size Message-ID: <4BE2DE61.5000200@ntsg.umt.edu> Is there a ways to determine to rg size and the journal size and count of a mounted gfs filesystems? Thanks, -Andrew -- Andrew A. Neuschwander, RHCE Manager, Systems Engineer Science Compute Services College of Forestry and Conservation The University of Montana http://www.ntsg.umt.edu andrew at ntsg.umt.edu - 406.243.6310 From jds at techma.com Fri May 7 00:07:03 2010 From: jds at techma.com (Simmons, Dan A) Date: Fri, 7 May 2010 00:07:03 +0000 Subject: [Linux-cluster] cman not looking at all valid interfaces In-Reply-To: References: Message-ID: <78A76986A50E6A4089BD229281D1E2990C0DDB6D@FXMAIL.techma.com> Hi, cman appears to not scan for interfaces correctly on my RHEL 5.3 64bit cluster. When I change /etc/hosts and /etc/cluster/cluster.conf to use names "mynode1" and "mynode2" cman boots up correctly and I get a good cluster. 
If I change my configuration to use the interfaces named "mynode1-clu" and "mynode2-clu" which are defined in /etc/hosts I get an error: Cman not started: Can't find local node name in cluster.conf /usr/sbin/cman_tool: aisexec daemon didn't start However, if I do `hostname mynode1-clu` on the first node and `hostname mynode2-clu` on the second node and then restart cman I get a good cluster. I think this proves that my -clu interfaces are valid and properly defined in /etc/cluster/cluster.conf. This node was created from a disk clone of another cluster. Are there any cluster files that would retain system name or interface information? I have triple checked my /etc/hosts, /etc/sysconfig/network /etc/sysconfig/network-scripts/*eth* , /etc/nsswitch.conf, /etc/cluster/cluster.conf and dns files. Any suggestions would be appreciated. J.Dan Simmons From jumanjiman at gmail.com Fri May 7 00:23:01 2010 From: jumanjiman at gmail.com (Paul Morgan) Date: Thu, 6 May 2010 20:23:01 -0400 Subject: [Linux-cluster] cman not looking at all valid interfaces In-Reply-To: <78A76986A50E6A4089BD229281D1E2990C0DDB6D@FXMAIL.techma.com> References: <78A76986A50E6A4089BD229281D1E2990C0DDB6D@FXMAIL.techma.com> Message-ID: Can you post your configs? On May 6, 2010 8:16 PM, "Simmons, Dan A" wrote: Hi, cman appears to not scan for interfaces correctly on my RHEL 5.3 64bit cluster. When I change /etc/hosts and /etc/cluster/cluster.conf to use names "mynode1" and "mynode2" cman boots up correctly and I get a good cluster. If I change my configuration to use the interfaces named "mynode1-clu" and "mynode2-clu" which are defined in /etc/hosts I get an error: Cman not started: Can't find local node name in cluster.conf /usr/sbin/cman_tool: aisexec daemon didn't start However, if I do `hostname mynode1-clu` on the first node and `hostname mynode2-clu` on the second node and then restart cman I get a good cluster. I think this proves that my -clu interfaces are valid and properly defined in /etc/cluster/cluster.conf. This node was created from a disk clone of another cluster. Are there any cluster files that would retain system name or interface information? I have triple checked my /etc/hosts, /etc/sysconfig/network /etc/sysconfig/network-scripts/*eth* , /etc/nsswitch.conf, /etc/cluster/cluster.conf and dns files. Any suggestions would be appreciated. J.Dan Simmons -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From Martin.Waite at datacash.com Fri May 7 09:38:47 2010 From: Martin.Waite at datacash.com (Martin Waite) Date: Fri, 7 May 2010 10:38:47 +0100 Subject: [Linux-cluster] is there a master rgmanager ? Message-ID: Hi, Is there a master rgmanager instance that makes decisions for the whole cluster, or does each rgmanager arrive at exactly the same decision as all the other instances based on the totally-ordered sequence of cluster events that update their state machines ? If there is a master rgmanager instance, is it possible to identify which node it is running on ? regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cmaiolino at redhat.com Fri May 7 13:13:15 2010 From: cmaiolino at redhat.com (Carlos Maiolino) Date: Fri, 7 May 2010 10:13:15 -0300 Subject: [Linux-cluster] gfs rg and journal size In-Reply-To: <4BE2DE61.5000200@ntsg.umt.edu> References: <4BE2DE61.5000200@ntsg.umt.edu> Message-ID: <20100507131315.GA2474@andromeda.usersys.redhat.com> On Thu, May 06, 2010 at 09:21:05AM -0600, Andrew A. Neuschwander wrote: > Is there a ways to determine to rg size and the journal size and count of a mounted gfs filesystems? > > Thanks, > -Andrew > -- > Andrew A. Neuschwander, RHCE > Manager, Systems Engineer > Science Compute Services > College of Forestry and Conservation > The University of Montana > http://www.ntsg.umt.edu > andrew at ntsg.umt.edu - 406.243.6310 > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster I don't remember if there is a tool for it, but if you do not make a gfs FS with different parameters, the default is 128 MB -- --- Best Regards Carlos Eduardo Maiolino Red Hat - Global Support Services From charlieb-linux-cluster at budge.apana.org.au Fri May 7 13:26:38 2010 From: charlieb-linux-cluster at budge.apana.org.au (Charlie Brady) Date: Fri, 7 May 2010 09:26:38 -0400 (EDT) Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices instead of /dev/dm-x In-Reply-To: <106818.2620.qm@web111712.mail.gq1.yahoo.com> References: <140728.23705.qm@web65510.mail.ac4.yahoo.com> <20100505203741.GA22566@andromeda.usersys.redhat.com> <106818.2620.qm@web111712.mail.gq1.yahoo.com> Message-ID: On Wed, 5 May 2010, Celso K. Webber wrote: > In my experience, I tend no to use the label in qdisk specification, > because it leads qdisk to use the first device it encounters containing > that label. Your qdisk label should be unique, and you should be controlling visibility of qdisks. > Another case is when the device is not found at all, this > happened to me before. If that happens, you have a bigger problem than just using the qdisk label to identify the qdisk, don't you? From rpeterso at redhat.com Fri May 7 13:57:51 2010 From: rpeterso at redhat.com (Bob Peterson) Date: Fri, 7 May 2010 09:57:51 -0400 (EDT) Subject: [Linux-cluster] gfs rg and journal size In-Reply-To: <1550609619.244881273240560734.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> Message-ID: <569169375.245311273240671353.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> ----- "Andrew A. Neuschwander" wrote: | Is there a ways to determine to rg size and the journal size and count | of a mounted gfs filesystems? | | Thanks, | -Andrew In gfs2, there's an easy way: [root at roth-01 ~]# gfs2_tool journals /mnt/gfs2 journal2 - 128MB journal1 - 128MB journal0 - 128MB 3 journal(s) found. With gfs1, gfs_tool df tells you how many journals but it doesn't tell you their size, and there's no "journals" option in gfs_tool. You could use gfs_tool jindex, but the output is cryptic; it prints out the number of 64K segments, so 2048 corresponds to 128MB, assuming a default 4K block size. Regards, Bob Peterson Red Hat File Systems From jds at techma.com Fri May 7 18:27:59 2010 From: jds at techma.com (Simmons, Dan A) Date: Fri, 7 May 2010 18:27:59 +0000 Subject: [Linux-cluster] Linux-cluster Digest, Vol 73, Issue 7 In-Reply-To: References: Message-ID: <78A76986A50E6A4089BD229281D1E2990C0E0300@FXMAIL.techma.com> Paul, I will have to clear the release of my config files through our security folks. It will take a couple days. Thanks for taking an interest. 
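Until the real files can be posted, a generic sketch of the pieces involved, with made-up addresses, assuming the -clu names sit on a dedicated cluster interface:

    # /etc/hosts on every node (addresses are illustrative only):
    #   192.168.10.1   mynode1-clu
    #   192.168.10.2   mynode2-clu
    # cluster.conf must then use exactly those names, e.g.:
    #   <clusternode name="mynode1-clu" nodeid="1" votes="1"> ... </clusternode>
    # Sanity checks on each node:
    getent hosts mynode1-clu mynode2-clu    # names resolve the same way everywhere
    cman_tool status | grep 'Node name'     # shows the name cman actually bound to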
Dan
From kitgerrits at gmail.com Sat May 8 00:50:46 2010 From: kitgerrits at gmail.com (Kit Gerrits) Date: Sat, 8 May 2010 02:50:46 +0200 Subject: [Linux-cluster] cman not looking at all valid interfaces In-Reply-To: <78A76986A50E6A4089BD229281D1E2990C0DDB6D@FXMAIL.techma.com> Message-ID: <4be4b564.0f67f10a.67a9.ffffb074@mx.google.com> You might want to check your cman version. This is a known bug in an earlier version and has been fixed. Among others: https://bugzilla.redhat.com/show_bug.cgi?id=488565 (fixed in 2.0.115) 2008-01-21 22:00:00 Chris Feist - 2.0.79-1: - ccs lookup functions now recognize alternative hostnames better Also of interest: 2009-10-06 22:00:00 Christine Caulfield - 2.0.115-8: - fence: Allow IP addresses as node names -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Simmons, Dan A Sent: vrijdag 7 mei 2010 2:07 To: 'linux-cluster at redhat.com' Subject: [Linux-cluster] cman not looking at all valid interfaces Hi, cman appears to not scan for interfaces correctly on my RHEL 5.3 64bit cluster. When I change /etc/hosts and /etc/cluster/cluster.conf to use names "mynode1" and "mynode2" cman boots up correctly and I get a good cluster. If I change my configuration to use the interfaces named "mynode1-clu" and "mynode2-clu" which are defined in /etc/hosts I get an error: Cman not started: Can't find local node name in cluster.conf /usr/sbin/cman_tool: aisexec daemon didn't start However, if I do `hostname mynode1-clu` on the first node and `hostname mynode2-clu` on the second node and then restart cman I get a good cluster. I think this proves that my -clu interfaces are valid and properly defined in /etc/cluster/cluster.conf. This node was created from a disk clone of another cluster. Are there any cluster files that would retain system name or interface information? I have triple checked my /etc/hosts, /etc/sysconfig/network /etc/sysconfig/network-scripts/*eth* , /etc/nsswitch.conf, /etc/cluster/cluster.conf and dns files. Any suggestions would be appreciated. J.Dan Simmons -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster
From celsowebber at yahoo.com Sat May 8 02:43:34 2010 From: celsowebber at yahoo.com (Celso K.
Webber) Date: Fri, 7 May 2010 19:43:34 -0700 (PDT) Subject: [Linux-cluster] Get qdisk to use /dev/mapper/mpathx devices instead of /dev/dm-x In-Reply-To: References: <140728.23705.qm@web65510.mail.ac4.yahoo.com> <20100505203741.GA22566@andromeda.usersys.redhat.com> <106818.2620.qm@web111712.mail.gq1.yahoo.com> Message-ID: <218667.28100.qm@web111713.mail.gq1.yahoo.com> Hello, ----- Original Message ---- From: Charlie Brady > Your qdisk label should be unique, and you should be controlling > visibility of qdisks. I think you did not understand what I meant. If you have multiple devices pointing to the same LUN (this happens with many multipathing software), your OS will end up with various block devices in /dev pointing to the same LUN, thus making qdiskd to pick up the first one it founds, and ignoring the rest. If it not picks the right one, you'll have no reason to have multipath connections to your shared storage. So its obvious that my disk label is unique, the problem is when you have multiple devices pointing to it. > Another case is when the device is not found at all, this > happened to me before. >> If that happens, you have a bigger problem than just using the qdisk label >> to identify the qdisk, don't you? No, I don't agree. I had a very recent situation with a RHEL 5.3 AP installation, EMC CX series storage (Clariion), PowerPath multipathing software, where the qdiskd simply didn't recognize the /dev devices associated to the quorum disk LUN. It simply didn't initialize because didn't find the label on any device it scanned. Simply pointing it to the correct PowerPath pseudo device, /dev/emcpowerq in this case, solved the problem. Again, I'm not discussing specific situations using Linux DM multipathing software, but other solutions such as EMC's PowerPath. And I'm sure I have enough experience with that, working for more than 5 years on field implementations for Dell. My intention was to add experience, for me the best option is ALWAYS to use the device you want. Even with Linux DM Multipath devices. thank you. Regards, Celso. From celsowebber at yahoo.com Sat May 8 02:50:42 2010 From: celsowebber at yahoo.com (Celso K. Webber) Date: Fri, 7 May 2010 19:50:42 -0700 (PDT) Subject: [Linux-cluster] cman not looking at all valid interfaces In-Reply-To: <78A76986A50E6A4089BD229281D1E2990C0DDB6D@FXMAIL.techma.com> References: <78A76986A50E6A4089BD229281D1E2990C0DDB6D@FXMAIL.techma.com> Message-ID: <592539.4912.qm@web111708.mail.gq1.yahoo.com> Hi Simmons, While we wait for your configuration, please check if you did use a fully qulified domainname in cluster.conf. I had problems with that in the days of RHEL 5.0 and 5.1, so nowadays I always setup my cluster.conf using names like "nodeX.localdomain" and map these names to the IP addresses of my heartbeat network. Please try this and tell us if it works, ok? Regards, Celso. ----- Original Message ---- From: "Simmons, Dan A" To: "linux-cluster at redhat.com" Sent: Thu, May 6, 2010 9:07:03 PM Subject: [Linux-cluster] cman not looking at all valid interfaces Hi, cman appears to not scan for interfaces correctly on my RHEL 5.3 64bit cluster. When I change /etc/hosts and /etc/cluster/cluster.conf to use names "mynode1" and "mynode2" cman boots up correctly and I get a good cluster. 
If I change my configuration to use the interfaces named "mynode1-clu" and "mynode2-clu" which are defined in /etc/hosts I get an error: Cman not started: Can't find local node name in cluster.conf /usr/sbin/cman_tool: aisexec daemon didn't start However, if I do `hostname mynode1-clu` on the first node and `hostname mynode2-clu` on the second node and then restart cman I get a good cluster. I think this proves that my -clu interfaces are valid and properly defined in /etc/cluster/cluster.conf. This node was created from a disk clone of another cluster. Are there any cluster files that would retain system name or interface information? I have triple checked my /etc/hosts, /etc/sysconfig/network /etc/sysconfig/network-scripts/*eth* , /etc/nsswitch.conf, /etc/cluster/cluster.conf and dns files. Any suggestions would be appreciated. J.Dan Simmons From celsowebber at yahoo.com Sat May 8 02:53:41 2010 From: celsowebber at yahoo.com (Celso K. Webber) Date: Fri, 7 May 2010 19:53:41 -0700 (PDT) Subject: [Linux-cluster] is there a master rgmanager ? In-Reply-To: References: Message-ID: <587718.43004.qm@web111716.mail.gq1.yahoo.com> Hi, I have a similar question to the master qdiskd daemon: how can I say which node has the qdisk master role? Today I go to the logs of each node and find out which one has the mos recent message "assuming master role" message. Thank you. ________________________________ From: Martin Waite To: linux clustering Sent: Fri, May 7, 2010 6:38:47 AM Subject: [Linux-cluster] is there a master rgmanager ? Hi, Is there a master rgmanager instance that makes decisions for the whole cluster, or does each rgmanager arrive at exactly the same decision as all the other instances based on the totally-ordered sequence of cluster events that update their state machines ? If there is a master rgmanager instance, is it possible to identify which node it is running on ? regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From celsowebber at yahoo.com Sat May 8 03:00:05 2010 From: celsowebber at yahoo.com (Celso K. Webber) Date: Fri, 7 May 2010 20:00:05 -0700 (PDT) Subject: [Linux-cluster] Maximum number of nodes In-Reply-To: <1272988538.7196.376.camel@localhost.localdomain> References: <1272988538.7196.376.camel@localhost.localdomain> Message-ID: <154006.12408.qm@web111721.mail.gq1.yahoo.com> Hi Steve, I believe the 16 node is a limit of qdiskd only, isn't it? CMAN node limit can be much higher, I believe. I don't know about GFS/GFS2 node limits, on the other hand. >From the Cluster FAQ at http://sources.redhat.com/cluster/wiki/FAQ/CMAN: "Is quorum disk/partition reserved for two-node clusters, and if not, how many nodes can it support?Currently a quorum disk/partition may be used in clusters of up to 16 nodes." Regards, Celso. ----- Original Message ---- From: Steven Whitehouse To: linux clustering Sent: Tue, May 4, 2010 12:55:38 PM Subject: Re: [Linux-cluster] Maximum number of nodes Hi, On Fri, 2010-04-30 at 14:37 -0500, Dusty wrote: > Hello, > > Regarding the component versions of "Redhat Cluster Suite" as released > on the 5.4 and 5.5 ISOs...: > > What is the maximum number of nodes that will work within a single > cluster? > The limit is 16 nodes. > From where do the limitations come? GFS2? Qdisk? What if not using > qdisk? What if not using GFS2? > > Thank you! The limit is down to what we can reasonably test, and thus what is supported. 
The theoretical limit is much higher and it may be possible to raise the supported node limits in future, Steve. From celsowebber at yahoo.com Sat May 8 03:03:43 2010 From: celsowebber at yahoo.com (Celso K. Webber) Date: Fri, 7 May 2010 20:03:43 -0700 (PDT) Subject: [Linux-cluster] nfs mountpoints In-Reply-To: <4BDDBEE6.90904@joelly.net> References: <4BDD8D2D.9070702@joelly.net> <4BDDBEE6.90904@joelly.net> Message-ID: <901805.3143.qm@web111722.mail.gq1.yahoo.com> Hello Joelly, Does this really work in RHEL / CS 5.x? I agree that this was the way to go in Cluster Suite v3.x (don't remember if worked in 4.x), but I tried this recently with a RHEL 5.4 installation and it didn't work. It always exported the whole GFS filesystems. Thanks. ----- Original Message ---- From: Joelly Alexander found by myself - path does the trick... thx On 02.05.2010 16:33, Joelly Alexander wrote: > hello, > > i have an working 2node cluster which serves an gfs2 mountpoint via nfs; > how to configure that subdirectories under the gfs2 mountpoint are exported via nfs and not the entire gfs2 mountpoint is exported via nfs? > i do not want to create many gfs2 mountpoints and export them as different nfs exports - this should be possible with only one gfs2 mountpoint also, or? > it is possible to configure nfs outside the rhcs, but if possible i want to have it inside... > From celsowebber at yahoo.com Sat May 8 03:08:52 2010 From: celsowebber at yahoo.com (Celso K. Webber) Date: Fri, 7 May 2010 20:08:52 -0700 (PDT) Subject: [Linux-cluster] Cluster v2 online adding nodes? In-Reply-To: <64D0546C5EBBD147B75DE133D798665F055D90A8@hugo.eprize.local> References: <4BD4A06C.4090402@srce.hr> <64D0546C5EBBD147B75DE133D798665F055D90A8@hugo.eprize.local> Message-ID: <184478.11931.qm@web111704.mail.gq1.yahoo.com> I believe you can online add a third node on a 2-node cluster with you use qdiskd since the beginning, right? This worked for me in a project where we knew in advance that we'd have a third node later on, but starting only with 2 nodes. ----- Original Message ---- From: Jeff Sturm To: linux clustering Sent: Sun, April 25, 2010 8:33:45 PM Subject: Re: [Linux-cluster] Cluster v2 online adding nodes? > -----Original Message----- > Can I add or remove a node from cluster by just adding/removing it from > cluster.conf? Is that kind of cluster reconfiguration supported without > reboots? Provided your cluster has more than 2 nodes, this should work fine. (I believe two node clusters are special.) From jakov.sosic at srce.hr Sat May 8 09:18:45 2010 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Sat, 08 May 2010 11:18:45 +0200 Subject: [Linux-cluster] is there a master rgmanager ? In-Reply-To: <587718.43004.qm@web111716.mail.gq1.yahoo.com> References: <587718.43004.qm@web111716.mail.gq1.yahoo.com> Message-ID: <4BE52C75.5060608@srce.hr> On 05/08/2010 04:53 AM, Celso K. Webber wrote: > Hi, I have a similar question to the master qdiskd daemon: how can I say > which node has the qdisk master role? > > Today I go to the logs of each node and find out which one has the mos > recent message "assuming master role" message. > > Thank you. > > ------------------------------------------------------------------------ > *From:* Martin Waite > *To:* linux clustering > *Sent:* Fri, May 7, 2010 6:38:47 AM > *Subject:* [Linux-cluster] is there a master rgmanager ? 
> > Hi, > > > > Is there a master rgmanager instance that makes decisions for the whole > cluster, or does each rgmanager arrive at exactly the same decision as > all the other instances based on the totally-ordered sequence of cluster > events that update their state machines ? > > > > If there is a master rgmanager instance, is it possible to identify > which node it is running on ? Just add a "status_file" to your section: # cat /etc/cluster/cluster.conf | egrep "var.run.cluster" # cat /var/run/cluster/qdisk Time Stamp: Sat May 8 11:17:00 2010 Node ID: 1 Score: 1/1 (Minimum required = 1) Current state: Master Initializing Set: { } Visible Set: { 1 2 3 } Master Node ID: 1 Quorate Set: { 1 2 3 } -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From kitgerrits at gmail.com Sat May 8 10:13:03 2010 From: kitgerrits at gmail.com (Kit Gerrits) Date: Sat, 8 May 2010 12:13:03 +0200 Subject: [Linux-cluster] Cluster v2 online adding nodes? In-Reply-To: <184478.11931.qm@web111704.mail.gq1.yahoo.com> Message-ID: <4be5392c.0f67f10a.5abf.ffffb6be@mx.google.com> You can add a node, but remember that there are special variables you will need to un-override: Both of these vabiables will no longer be necessary. Keep in mind that your heartbeat network will need ot support a third node. (no cross-cables, check your network for multicast functionality) Regards, Kit -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Celso K. Webber Sent: zaterdag 8 mei 2010 5:09 To: linux clustering Subject: Re: [Linux-cluster] Cluster v2 online adding nodes? I believe you can online add a third node on a 2-node cluster with you use qdiskd since the beginning, right? This worked for me in a project where we knew in advance that we'd have a third node later on, but starting only with 2 nodes. ----- Original Message ---- From: Jeff Sturm To: linux clustering Sent: Sun, April 25, 2010 8:33:45 PM Subject: Re: [Linux-cluster] Cluster v2 online adding nodes? > -----Original Message----- > Can I add or remove a node from cluster by just adding/removing it from > cluster.conf? Is that kind of cluster reconfiguration supported without > reboots? Provided your cluster has more than 2 nodes, this should work fine. (I believe two node clusters are special.) -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster No virus found in this incoming message. Checked by AVG - www.avg.com Version: 9.0.819 / Virus Database: 271.1.1/2859 - Release Date: 05/07/10 08:26:00 From joao.miguel.c.ferreira at gmail.com Sat May 8 14:01:24 2010 From: joao.miguel.c.ferreira at gmail.com (Joao Ferreira gmail) Date: Sat, 08 May 2010 15:01:24 +0100 Subject: [Linux-cluster] mount root on gfs Message-ID: <1273327284.7547.6.camel@debj5n.critical.pt> Hello all, I'dd like to know if the Linux kernel can mount the root filesystem on a gfs partition. I can't find an answer to this question any where. I'dd apreciate any insight on this matter (how to do it, prerequisites ...) Thank you. 
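Going back to Jakov's qdisk reply and Kit's two-node note above: the cluster.conf fragments they refer to (which the archive stripped out) would look roughly like the following; attribute values are illustrative and reconstructed from context:

    # qdisk status file (Jakov's reply): status_file is an attribute of <quorumd>,
    # which sits directly under <cluster>:
    #   <quorumd interval="1" tko="10" votes="1" label="myqdisk" status_file="/var/run/cluster/qdisk"/>
    # two-node special case (Kit's reply): both attributes become unnecessary once a third node exists:
    #   <cman two_node="1" expected_votes="1"/>
    cat /var/run/cluster/qdisk    # prints the current qdisk master, as in the output shown above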
Joao From jeff.sturm at eprize.com Sat May 8 21:28:19 2010 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Sat, 8 May 2010 17:28:19 -0400 Subject: [Linux-cluster] mount root on gfs In-Reply-To: <1273327284.7547.6.camel@debj5n.critical.pt> References: <1273327284.7547.6.camel@debj5n.critical.pt> Message-ID: <64D0546C5EBBD147B75DE133D798665F055D91D1@hugo.eprize.local> > -----Original Message----- > From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] > On Behalf Of Joao Ferreira gmail > Sent: Saturday, May 08, 2010 10:01 AM > To: linux clustering > Subject: [Linux-cluster] mount root on gfs > > I'dd like to know if the Linux kernel can mount the root filesystem on a > gfs partition. Haven't tried it myself, but this site offers step-by-step procedures: http://www.open-sharedroot.org/documentation/rhel5-gfs-shared-root-mini- howto/ Jeff From sdake at redhat.com Sun May 9 00:44:25 2010 From: sdake at redhat.com (Steven Dake) Date: Sat, 08 May 2010 17:44:25 -0700 Subject: [Linux-cluster] Maximum number of nodes In-Reply-To: <154006.12408.qm@web111721.mail.gq1.yahoo.com> References: <1272988538.7196.376.camel@localhost.localdomain> <154006.12408.qm@web111721.mail.gq1.yahoo.com> Message-ID: <1273365865.2625.24.camel@localhost.localdomain> On Fri, 2010-05-07 at 20:00 -0700, Celso K. Webber wrote: > Hi Steve, > > I believe the 16 node is a limit of qdiskd only, isn't it? CMAN node limit can be much higher, I believe. I don't know about GFS/GFS2 node limits, on the other hand. > > >From the Cluster FAQ at http://sources.redhat.com/cluster/wiki/FAQ/CMAN: > > "Is quorum disk/partition reserved for two-node clusters, and if not, how many nodes can it support?Currently a quorum disk/partition may be used in clusters of up to 16 nodes." > > Regards, > > Celso. > > Celso, BOth Chrissie and I have tested the cluster3/gfs2/corosync combo with 48 physical nodes with success. However, at this point in time, we don't have a constant 48 nodes to test upstream development trees. 16 nodes is what we support at the moment because this is what most of the developers have available to test with regularly. We are working to expand this in the future. Regards -steve > ----- Original Message ---- > From: Steven Whitehouse > To: linux clustering > Sent: Tue, May 4, 2010 12:55:38 PM > Subject: Re: [Linux-cluster] Maximum number of nodes > > Hi, > > On Fri, 2010-04-30 at 14:37 -0500, Dusty wrote: > > Hello, > > > > Regarding the component versions of "Redhat Cluster Suite" as released > > on the 5.4 and 5.5 ISOs...: > > > > What is the maximum number of nodes that will work within a single > > cluster? > > > The limit is 16 nodes. > > > From where do the limitations come? GFS2? Qdisk? What if not using > > qdisk? What if not using GFS2? > > > > Thank you! > The limit is down to what we can reasonably test, and thus what is > supported. The theoretical limit is much higher and it may be possible > to raise the supported node limits in future, > > Steve. 
> > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From swhiteho at redhat.com Mon May 10 10:11:13 2010 From: swhiteho at redhat.com (Steven Whitehouse) Date: Mon, 10 May 2010 11:11:13 +0100 Subject: [Linux-cluster] mount root on gfs In-Reply-To: <64D0546C5EBBD147B75DE133D798665F055D91D1@hugo.eprize.local> References: <1273327284.7547.6.camel@debj5n.critical.pt> <64D0546C5EBBD147B75DE133D798665F055D91D1@hugo.eprize.local> Message-ID: <1273486273.2996.1.camel@localhost> Hi, On Sat, 2010-05-08 at 17:28 -0400, Jeff Sturm wrote: > > -----Original Message----- > > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] > > On Behalf Of Joao Ferreira gmail > > Sent: Saturday, May 08, 2010 10:01 AM > > To: linux clustering > > Subject: [Linux-cluster] mount root on gfs > > > > I'dd like to know if the Linux kernel can mount the root filesystem on > a > > gfs partition. > > Haven't tried it myself, but this site offers step-by-step procedures: > > http://www.open-sharedroot.org/documentation/rhel5-gfs-shared-root-mini- > howto/ > > Jeff > > It should work and there are people actually doing it (might need some tweeking of scripts) but it isn't actually supported so far as RHEL support goes, Steve. From swhiteho at redhat.com Mon May 10 10:14:45 2010 From: swhiteho at redhat.com (Steven Whitehouse) Date: Mon, 10 May 2010 11:14:45 +0100 Subject: [Linux-cluster] Maximum number of nodes In-Reply-To: <154006.12408.qm@web111721.mail.gq1.yahoo.com> References: <1272988538.7196.376.camel@localhost.localdomain> <154006.12408.qm@web111721.mail.gq1.yahoo.com> Message-ID: <1273486485.2996.4.camel@localhost> Hi, On Fri, 2010-05-07 at 20:00 -0700, Celso K. Webber wrote: > Hi Steve, > I believe the 16 node is a limit of qdiskd only, isn't it? CMAN node limit can be much higher, I believe. I don't know about GFS/GFS2 node limits, on the other hand. > > As per the original message, the limit is down to the number of nodes we can reasonably test and not any limit of the actual software. Much larger clusters are possible, but are not supported via RHEL. Having said that, I'm not a qdisk expert and other limits may apply to that specifically. Steve. > >From the Cluster FAQ at http://sources.redhat.com/cluster/wiki/FAQ/CMAN: > > "Is quorum disk/partition reserved for two-node clusters, and if not, how many nodes can it support?Currently a quorum disk/partition may be used in clusters of up to 16 nodes." > > Regards, > > Celso. > > > ----- Original Message ---- > From: Steven Whitehouse > To: linux clustering > Sent: Tue, May 4, 2010 12:55:38 PM > Subject: Re: [Linux-cluster] Maximum number of nodes > > Hi, > > On Fri, 2010-04-30 at 14:37 -0500, Dusty wrote: > > Hello, > > > > Regarding the component versions of "Redhat Cluster Suite" as released > > on the 5.4 and 5.5 ISOs...: > > > > What is the maximum number of nodes that will work within a single > > cluster? > > > The limit is 16 nodes. > > > From where do the limitations come? GFS2? Qdisk? What if not using > > qdisk? What if not using GFS2? > > > > Thank you! > The limit is down to what we can reasonably test, and thus what is > supported. The theoretical limit is much higher and it may be possible > to raise the supported node limits in future, > > Steve. 
> > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From joao.miguel.c.ferreira at gmail.com Mon May 10 10:46:19 2010 From: joao.miguel.c.ferreira at gmail.com (Joao Ferreira gmail) Date: Mon, 10 May 2010 11:46:19 +0100 Subject: [Linux-cluster] gfs limitations regarding some features Message-ID: <1273488379.5604.9.camel@debj5n.critical.pt> Hello all, a friend told me that I might run into problems if I try to use gfs/gfs2 in systems with the following requirements: - Samba - POSIX ACLs - users and group quotas - RAID (mdadm) Not being an expert in any of these areas, I'dd like to have some insight on any related limitations I might encounter when using those features on gfs/gfs2 filesystems. Thank you Joao From swhiteho at redhat.com Mon May 10 11:26:14 2010 From: swhiteho at redhat.com (Steven Whitehouse) Date: Mon, 10 May 2010 12:26:14 +0100 Subject: [Linux-cluster] gfs limitations regarding some features In-Reply-To: <1273488379.5604.9.camel@debj5n.critical.pt> References: <1273488379.5604.9.camel@debj5n.critical.pt> Message-ID: <1273490774.7196.395.camel@localhost.localdomain> On Mon, 2010-05-10 at 11:46 +0100, Joao Ferreira gmail wrote: > Hello all, > > a friend told me that I might run into problems if I try to use gfs/gfs2 > in systems with the following requirements: > > - Samba You need to be careful to ensure that (a) you don't share samba data with either local applications or via NFS at the same time and (b) that you use the version of samba which understands a clustered fs backend. Beyond that it should work and is supported. > - POSIX ACLs No issues that I'm aware of. These should just work. > - users and group quotas There are changes in this area. The old gfs2_quota tool still works. We intend to replace that with the generic quota-tools at some future point, but both will be available for the time being. > - RAID (mdadm) You can't run md over shared block devices. You can use hardware raid though, Steve. > > Not being an expert in any of these areas, I'dd like to have some > insight on any related limitations I might encounter when using those > features on gfs/gfs2 filesystems. > > Thank you > Joao > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From azzopardi at eib.org Mon May 10 15:41:30 2010 From: azzopardi at eib.org (AZZOPARDI Konrad) Date: Mon, 10 May 2010 17:41:30 +0200 Subject: [Linux-cluster] Sybase Cluster on RedHAt Message-ID: <0L2700FQYNLB9S40@comexp1-srv.lux.eib.org> Dear all, I have a working two node RedHAt cluster and need to configure two Sybase resources, I am using a script called ASEHAagent-eib which is basically a copy of the original with an extra kill so nothing new. I think there is a problem here because I am not allowed to have two ASEHAagent running at the same time and my /var/log/messages states something like this : May 10 17:40:32 dc1-x6270-a clurgmgrd[7692]: Reconfiguring May 10 17:40:32 dc1-x6270-a clurgmgrd[7692]: Loading Service Data May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: Unique attribute collision. type=ASEHAagent-eib attr=sybase_home value=/app/sybase May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: Error storing ASEHAagent-eib resource May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: Applying new configuration #30 May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: Stopping changed resources. May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: Restarting changed resources. 
May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: Starting changed resources. Is there a more elegant way to do it other than having multiple names for the same script ? Tnx konrad -------------------------------------------------------------------- Les informations contenues dans ce message et/ou ses annexes sont reservees a l'attention et a l'utilisation de leur destinataire et peuvent etre confidentielles. Si vous n'etes pas destinataire de ce message, vous etes informes que vous l'avez recu par erreur et que toute utilisation en est interdite. Dans ce cas, vous etes pries de le detruire et d'en informer la Banque Europeenne d'Investissement. The information in this message and/or attachments is intended solely for the attention and use of the named addressee and may be confidential. If you are not the intended recipient, you are hereby notified that you have received this transmittal in error and that any use of it is prohibited. In such a case please delete this message and kindly notify the European Investment Bank accordingly. -------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdinitto at redhat.com Tue May 11 02:56:50 2010 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Tue, 11 May 2010 04:56:50 +0200 Subject: [Linux-cluster] Cluster 3.0.12 stable release Message-ID: <4BE8C772.3010603@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 The cluster team and its community are proud to announce the 3.0.12 stable release from the STABLE3 branch. This release contains a few major bug fixes. We strongly recommend people to update their clusters. In order to build/run the 3.0.12 release you will need: - - corosync 1.2.1 - - openais 1.1.2 - - linux kernel 2.6.31 (only for GFS1 users) The new source tarball can be downloaded here: https://fedorahosted.org/releases/c/l/cluster/cluster-3.0.12.tar.bz2 To report bugs or issues: https://bugzilla.redhat.com/ Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other sysadministrators or power users. Thanks/congratulations to all people that contributed to achieve this great milestone. 
Happy clustering, Fabio Under the hood (from 3.0.11): Abhijith Das (3): gfs2_quota: Keep quota file length always sizeof(struct gfs2_quota) aligned Merge branch 'STABLE3' of ssh://git.fedoraproject.org/git/cluster into mySTABLE3 gfs2_convert: gfs2_convert doesn't convert quota files Andrew Beekhof (1): dlm_controld.pcmk: Prevent use-of-NULL by checking the node has a valid address before adding it to configfs Christine Caulfield (2): Make totem.fail_to_recv_const default to 2500 cman: fix name of fail_recv_const David Teigland (1): gfs_controld: fix do_leave arg Jonathan Brassow (1): HA LVM: Use CLVM with local machine kernel targets (bz 585217) Lon Hohberger (13): gfs2: Fix handling of mount points with spaces resource-agents: Clean up file system agents rgmanager: Fix resource start/check times rgmanager: Fix max_restart threshold handling rgmanager: Fix build warnings rgmanager: Add per-resource status check tolerances resource-agents: Add status_program attribute config: Update config schema config: Add fail_recv_const rgmanager: Do hard shut down if corosync dies rgmanager: Fix man pages qdiskd: Change votes instead of cman-visible status qdiskd: Change votes instead of cman-visible status Steven Whitehouse (1): Merge branch 'STABLE3' of git://git.fedorahosted.org/cluster into STABLE3 cman/daemon/cman-preconfig.c | 5 + cman/qdisk/main.c | 63 ++- config/plugins/ldap/99cluster.ldif | 74 ++- config/plugins/ldap/ldap-base.csv | 6 +- config/tools/xml/cluster.rng.in | 144 +++++ gfs2/convert/gfs2_convert.c | 49 ++- gfs2/mount/util.c | 55 ++- gfs2/quota/check.c | 80 ++- gfs2/quota/main.c | 2 +- group/dlm_controld/pacemaker.c | 2 +- group/gfs_controld/main.c | 11 +- rgmanager/include/groups.h | 1 + rgmanager/include/reslist.h | 2 +- rgmanager/man/Makefile | 2 +- rgmanager/man/clurgmgrd.8 | 42 +-- rgmanager/man/clurmtabd.8 | 37 -- rgmanager/man/rgmanager.8 | 43 ++ rgmanager/src/clulib/msg_cluster.c | 7 +- rgmanager/src/daemons/groups.c | 16 + rgmanager/src/daemons/restart_counter.c | 5 + rgmanager/src/daemons/restree.c | 50 ++- rgmanager/src/daemons/rg_state.c | 6 +- rgmanager/src/daemons/slang_event.c | 20 +- rgmanager/src/resources/Makefile | 2 +- rgmanager/src/resources/clusterfs.sh | 608 +------------------- rgmanager/src/resources/fs.sh.in | 797 +------------------------- rgmanager/src/resources/lvm.sh | 52 +-- rgmanager/src/resources/lvm_by_lv.sh | 75 +++- rgmanager/src/resources/lvm_by_vg.sh | 147 +++++- rgmanager/src/resources/netfs.sh | 368 ++----------- rgmanager/src/resources/ra2rng.xsl | 8 +- rgmanager/src/resources/utils/fs-lib.sh | 942 +++++++++++++++++++++++++++++++ rgmanager/src/resources/vm.sh | 33 +- 33 files changed, 1846 insertions(+), 1908 deletions(-) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBCAAGBQJL6MdwAAoJEFA6oBJjVJ+O4ckP/jF4QsGljlvKZhz0oPtkhjBT NEbjCllCOInb1wEKWY7mIYo7FEp7CXd9l6mqoXbqWbmSrr8fWM2coXWG5I7orXuG x3VUq/37HmykIKWdnSGPoFzJ0wtKBdAW7bnXNJaoC00HffIBwPyz7OEfwc8iMjX/ lAhGDt/83lRY2z/3h9ytPJaUvzfp1R/T7JCpHKEQuhcxa2CMORYhhiPlKf51c7cX FE0P/JlAv/V9Q1p9zzkUd670Ei+9H02ajkK+Y5CqPCKZgWrSQ7m3aG1WFBWYm3iS yHns9zCuH2ISXN+3rnNminDfOcEbAiOg0H9/0P7LfNXEEveQb+Ofk4VGtnA0mZDY yMAs+doL0M7t9LAw0KJ8KukLCuIEGoso9LQs0ZD3Hj35IuY8dG8vTN+cvoTZ9lVE /XTG01XeXtqqr38ifmgLR/sAcJSIXEgX3Qjc7KR+KRoQ4dLXkyNqgPgioh5VuPAg YZIMHDDSLCrYY4LdjeFest9pQnMXIQwDb0gQkHgz8V+06VITcQwO3hl2NvJPVnj4 nHb63ZzLomH6Pd56vduQOvv6IVD+D4OwyLilBkLaKv7TKnBQdwwwKNJhODX/RQOb 
mo2AGlwC6L5ndynm1UIhh6QILTnU9xgfJsI3A8hGQGlc2rNSUWzVyuNufxaNC6UZ D+jefxJ4UDRauJiwN9zX =kU4B -----END PGP SIGNATURE----- From cvermejo at softwarelibreandino.com Tue May 11 03:42:42 2010 From: cvermejo at softwarelibreandino.com (Carlos VERMEJO RUIZ) Date: Mon, 10 May 2010 22:42:42 -0500 (PET) Subject: [Linux-cluster] Problem with service migration with xen domU on diferent dom0 with redhat 5.4 In-Reply-To: <19071032.32418.1273549084065.JavaMail.root@zimbra.softwarelibreandino.com> Message-ID: <29374383.32421.1273549361054.JavaMail.root@zimbra.softwarelibreandino.com> I just come back from a trip and made some changes at my cluster.conf but now I am getting a more clear error: May 10 20:27:23 vmapache2 ccsd[1550]: Error while processing disconnect: Invalid request descriptor May 10 20:27:23 vmapache2 fenced[1620]: fence "vmapache1.foo.com" failed Also I got more information telling me that cluster services on node 1 are down, when I restart rgmanager it starts working. More details: [root at vmapache2 ~]# service rgmanager status Se est? ejecutando clurgmgrd (pid 1866)... [root at vmapache2 ~]# cman_tool status Version: 6.2.0 Config Version: 60 Cluster Name: clusterapache01 Cluster Id: 38965 Cluster Member: Yes Cluster Generation: 300 Membership state: Cluster-Member Nodes: 2 Expected votes: 3 Quorum device votes: 1 Total votes: 3 Quorum: 2 Active subsystems: 10 Flags: Dirty Ports Bound: 0 11 177 Node name: vmapache2.foo.com Node ID: 2 Multicast addresses: 225.0.0.1 Node addresses: 172.19.168.122 [root at vmapache2 ~]# /Var/log/messages May 10 20:27:07 vmapache2 openais[1562]: [CLM ] got nodejoin message 172.19.168.121 May 10 20:27:07 vmapache2 openais[1562]: [CLM ] got nodejoin message 172.19.168.122 May 10 20:27:07 vmapache2 openais[1562]: [CPG ] got joinlist message from node 2 May 10 20:27:23 vmapache2 fenced[1620]: agent "fence_xvm" reports: Timed out waiting for response May 10 20:27:23 vmapache2 ccsd[1550]: Attempt to close an unopened CCS descriptor (35940). May 10 20:27:23 vmapache2 ccsd[1550]: Error while processing disconnect: Invalid request descriptor May 10 20:27:23 vmapache2 fenced[1620]: fence "vmapache1.foo.com" failed May 10 20:27:29 vmapache2 kernel: dlm: connecting to 1 May 10 20:27:29 vmapache2 kernel: dlm: got connection from 1 May 10 20:27:41 vmapache2 clurgmgrd[1867]: State change: vmapache1.foo.com UP May 10 20:27:07 vmapache2 openais[1562]: [CLM ] got nodejoin message 172.19.168.121 May 10 20:27:07 vmapache2 openais[1562]: [CLM ] got nodejoin message 172.19.168.122 May 10 20:27:07 vmapache2 openais[1562]: [CPG ] got joinlist message from node 2 May 10 20:27:23 vmapache2 fenced[1620]: agent "fence_xvm" reports: Timed out waiting for response May 10 20:27:23 vmapache2 ccsd[1550]: Attempt to close an unopened CCS descriptor (35940). May 10 20:27:23 vmapache2 ccsd[1550]: Error while processing disconnect: Invalid request descriptor May 10 20:27:23 vmapache2 fenced[1620]: fence "vmapache1.foo.com" failed May 10 20:27:29 vmapache2 kernel: dlm: connecting to 1 May 10 20:27:29 vmapache2 kernel: dlm: got connection from 1 May 10 20:27:41 vmapache2 clurgmgrd[1867]: State change: vmapache1.foo.com UP [root at vmapache2 ~]# tail -n 100 /var/log/messages May 10 20:24:25 vmapache2 openais[1562]: [TOTEM] Receive multicast socket recv buffer size (288000 bytes). May 10 20:24:25 vmapache2 openais[1562]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes). May 10 20:24:25 vmapache2 openais[1562]: [TOTEM] entering GATHER state from 2. 
May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] entering GATHER state from 0. May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] Creating commit token because I am the rep. May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] Saving state aru 49 high seq received 49 May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] Storing new sequence id for ring 128 May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] entering COMMIT state. May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] entering RECOVERY state. May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] position [0] member 172.19.168.122: May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] previous ring seq 292 rep 172.19.168.121 May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] aru 49 high delivered 49 received flag 1 May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] Did not need to originate any messages in recovery. May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] Sending initial ORF token May 10 20:24:30 vmapache2 openais[1562]: [CLM ] CLM CONFIGURATION CHANGE May 10 20:24:30 vmapache2 openais[1562]: [CLM ] New Configuration: May 10 20:24:30 vmapache2 fenced[1620]: vmapache1.foo.com not a cluster member after 0 sec post_fail_delay May 10 20:24:30 vmapache2 kernel: dlm: closing connection to node 1 May 10 20:24:30 vmapache2 clurgmgrd[1867]: State change: vmapache1.foo.com DOWN May 10 20:24:30 vmapache2 openais[1562]: [CLM ] r(0) ip(172.19.168.122) May 10 20:24:30 vmapache2 fenced[1620]: fencing node "vmapache1.foo.com" May 10 20:24:30 vmapache2 openais[1562]: [CLM ] Members Left: May 10 20:24:30 vmapache2 openais[1562]: [CLM ] r(0) ip(172.19.168.121) May 10 20:24:30 vmapache2 openais[1562]: [CLM ] Members Joined: May 10 20:24:30 vmapache2 openais[1562]: [CLM ] CLM CONFIGURATION CHANGE May 10 20:24:30 vmapache2 openais[1562]: [CLM ] New Configuration: May 10 20:24:30 vmapache2 openais[1562]: [CLM ] r(0) ip(172.19.168.122) May 10 20:24:30 vmapache2 openais[1562]: [CLM ] Members Left: May 10 20:24:30 vmapache2 openais[1562]: [CLM ] Members Joined: May 10 20:24:30 vmapache2 openais[1562]: [SYNC ] This node is within the primary component and will provide service. May 10 20:24:30 vmapache2 openais[1562]: [TOTEM] entering OPERATIONAL state. 
May 10 20:24:30 vmapache2 openais[1562]: [CLM ] got nodejoin message 172.19.168.122 May 10 20:24:30 vmapache2 openais[1562]: [CPG ] got joinlist message from node 2 May 10 20:24:35 vmapache2 clurgmgrd[1867]: Waiting for node #1 to be fenced May 10 20:24:47 vmapache2 qdiskd[1604]: Assuming master role May 10 20:24:49 vmapache2 openais[1562]: [CMAN ] lost contact with quorum device May 10 20:24:49 vmapache2 openais[1562]: [CMAN ] quorum lost, blocking activity May 10 20:24:49 vmapache2 clurgmgrd[1867]: #1: Quorum Dissolved May 10 20:24:49 vmapache2 qdiskd[1604]: Writing eviction notice for node 1 May 10 20:24:49 vmapache2 openais[1562]: [CMAN ] quorum regained, resuming activity May 10 20:24:49 vmapache2 clurgmgrd: [1867]: Stopping Service apache:web1 May 10 20:24:49 vmapache2 clurgmgrd: [1867]: Checking Existence Of File /var/run/cluster/apache/apache:web1.pid [apache:web1] > Failed - File Doesn't Exist May 10 20:24:49 vmapache2 clurgmgrd: [1867]: Stopping Service apache:web1 > Succeed May 10 20:24:49 vmapache2 clurgmgrd[1867]: Quorum Regained May 10 20:24:49 vmapache2 clurgmgrd[1867]: State change: Local UP May 10 20:24:51 vmapache2 qdiskd[1604]: Node 1 evicted May 10 20:25:00 vmapache2 fenced[1620]: agent "fence_xvm" reports: Timed out waiting for response May 10 20:25:00 vmapache2 ccsd[1550]: Attempt to close an unopened CCS descriptor (32130). May 10 20:25:00 vmapache2 ccsd[1550]: Error while processing disconnect: Invalid request descriptor May 10 20:25:00 vmapache2 fenced[1620]: fence "vmapache1.foo.com" failed May 10 20:25:05 vmapache2 fenced[1620]: fencing node "vmapache1.foo.com" May 10 20:25:36 vmapache2 fenced[1620]: agent "fence_xvm" reports: Timed out waiting for response May 10 20:25:36 vmapache2 ccsd[1550]: Attempt to close an unopened CCS descriptor (33270). May 10 20:25:36 vmapache2 ccsd[1550]: Error while processing disconnect: Invalid request descriptor May 10 20:25:36 vmapache2 fenced[1620]: fence "vmapache1.foo.com" failed May 10 20:25:41 vmapache2 fenced[1620]: fencing node "vmapache1.foo.com" May 10 20:26:11 vmapache2 fenced[1620]: agent "fence_xvm" reports: Timed out waiting for response May 10 20:26:11 vmapache2 fenced[1620]: fence "vmapache1.foo.com" failed May 10 20:26:16 vmapache2 fenced[1620]: fencing node "vmapache1.foo.com" May 10 20:26:47 vmapache2 fenced[1620]: agent "fence_xvm" reports: Timed out waiting for response May 10 20:26:47 vmapache2 ccsd[1550]: Attempt to close an unopened CCS descriptor (35010). May 10 20:26:47 vmapache2 ccsd[1550]: Error while processing disconnect: Invalid request descriptor May 10 20:26:47 vmapache2 fenced[1620]: fence "vmapache1.foo.com" failed May 10 20:26:52 vmapache2 fenced[1620]: fencing node "vmapache1.foo.com" May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] entering GATHER state from 11. May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] Saving state aru 10 high seq received 10 May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] Storing new sequence id for ring 12c May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] entering COMMIT state. May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] entering RECOVERY state. 
May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] position [0] member 172.19.168.121: May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] previous ring seq 296 rep 172.19.168.121 May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] aru a high delivered a received flag 1 May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] position [1] member 172.19.168.122: May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] previous ring seq 296 rep 172.19.168.122 May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] aru 10 high delivered 10 received flag 1 May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] Did not need to originate any messages in recovery. May 10 20:27:07 vmapache2 openais[1562]: [CLM ] CLM CONFIGURATION CHANGE May 10 20:27:07 vmapache2 openais[1562]: [CLM ] New Configuration: May 10 20:27:07 vmapache2 openais[1562]: [CLM ] r(0) ip(172.19.168.122) May 10 20:27:07 vmapache2 openais[1562]: [CLM ] Members Left: May 10 20:27:07 vmapache2 openais[1562]: [CLM ] Members Joined: May 10 20:27:07 vmapache2 openais[1562]: [CLM ] CLM CONFIGURATION CHANGE May 10 20:27:07 vmapache2 openais[1562]: [CLM ] New Configuration: May 10 20:27:07 vmapache2 openais[1562]: [CLM ] r(0) ip(172.19.168.121) May 10 20:27:07 vmapache2 openais[1562]: [CLM ] r(0) ip(172.19.168.122) May 10 20:27:07 vmapache2 openais[1562]: [CLM ] Members Left: May 10 20:27:07 vmapache2 openais[1562]: [CLM ] Members Joined: May 10 20:27:07 vmapache2 openais[1562]: [CLM ] r(0) ip(172.19.168.121) May 10 20:27:07 vmapache2 openais[1562]: [SYNC ] This node is within the primary component and will provide service. May 10 20:27:07 vmapache2 openais[1562]: [TOTEM] entering OPERATIONAL state. May 10 20:27:07 vmapache2 openais[1562]: [CLM ] got nodejoin message 172.19.168.121 May 10 20:27:07 vmapache2 openais[1562]: [CLM ] got nodejoin message 172.19.168.122 May 10 20:27:07 vmapache2 openais[1562]: [CPG ] got joinlist message from node 2 May 10 20:27:23 vmapache2 fenced[1620]: agent "fence_xvm" reports: Timed out waiting for response May 10 20:27:23 vmapache2 ccsd[1550]: Attempt to close an unopened CCS descriptor (35940). May 10 20:27:23 vmapache2 ccsd[1550]: Error while processing disconnect: Invalid request descriptor May 10 20:27:23 vmapache2 fenced[1620]: fence "vmapache1.foo.com" failed May 10 20:27:29 vmapache2 kernel: dlm: connecting to 1 May 10 20:27:29 vmapache2 kernel: dlm: got connection from 1 May 10 20:27:41 vmapache2 clurgmgrd[1867]: State change: vmapache1.foo.com UP Here is my cluster.conf file: Best Regards, Carlos Vermejo Ruiz -------------- next part -------------- An HTML attachment was scrubbed... URL: From jose.neto at liber4e.com Wed May 12 10:38:06 2010 From: jose.neto at liber4e.com (jose nuno neto) Date: Wed, 12 May 2010 12:38:06 +0200 Subject: [Linux-cluster] Sybase Cluster on RedHAt In-Reply-To: <0L2700FQYNLB9S40@comexp1-srv.lux.eib.org> References: <0L2700FQYNLB9S40@comexp1-srv.lux.eib.org> Message-ID: <4BEA850E.603@liber4e.com> Hello Mr Azzopardi :-) did you created another sybase resource for this new service? its needed. Unique resource for each service. check that Cheers from Portugal Jose On 05/10/2010 05:41 PM, AZZOPARDI Konrad wrote: > Dear all, > I have a working two node RedHAt cluster and need to configure two > Sybase resources, I am using a script called ASEHAagent-eib which is > basically a copy of the original with an extra kill so nothing new. 
> I think there is a problem here because I am not allowed to have two > ASEHAagent running at the same time and my /var/log/messages states > something like this : > May 10 17:40:32 dc1-x6270-a clurgmgrd[7692]: > Reconfiguring > May 10 17:40:32 dc1-x6270-a clurgmgrd[7692]: > Loading Service Data > May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: Unique > attribute collision. type=ASEHAagent-eib attr=sybase_home > value=/app/sybase > May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: Error > storing ASEHAagent-eib resource > May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: > Applying new configuration #30 > May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: > Stopping changed resources. > May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: > Restarting changed resources. > May 10 17:40:33 dc1-x6270-a clurgmgrd[7692]: > Starting changed resources. > Is there a more elegant way to do it other than having multiple names > for the same script ? > Tnx > konrad > sybase_home="/app/sybase" sybase_ase="ASE-15_0" sybase_ocs="OCS-15_0" > server_name="SYB1_fkdbclm" > login_file="/app/sybase/fkdbclm/admin/login_file" > interfaces_file="/app/sybase/interfaces" shutdown_timeout="200" > start_timeout="300" deep_probe_timeout="300"/> > sybase_home="/app/sybase" sybase_ase="ASE-15_0" sybase_ocs="OCS-15_0" > server_name="SYB1_fkdbtrm" > login_file="/app/sybase/fkdbtrm/admin/login_file" > interfaces_file="/app/sybase/interfaces" shutdown_timeout="200" > start_timeout="300" deep_probe_timeout="300"/> > > name="fkdbclm" recovery="relocate"> > > > > > > > > > > > name="fkdbtrm" recovery="relocate"> > > > > > > > > > > > -------------------------------------------------------------------- > > Les informations contenues dans ce message et/ou ses annexes sont > reservees a l'attention et a l'utilisation de leur destinataire et peuvent etre > confidentielles. Si vous n'etes pas destinataire de ce message, vous etes > informes que vous l'avez recu par erreur et que toute utilisation en est > interdite. Dans ce cas, vous etes pries de le detruire et d'en informer la > Banque Europeenne d'Investissement. > > The information in this message and/or attachments is intended solely for > the attention and use of the named addressee and may be confidential. If > you are not the intended recipient, you are hereby notified that you have > received this transmittal in error and that any use of it is prohibited. In > such a case please delete this message and kindly notify the European > Investment Bank accordingly. > -------------------------------------------------------------------- > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.lense at convergys.com Wed May 12 14:40:50 2010 From: michael.lense at convergys.com (Michael F Lense) Date: Wed, 12 May 2010 10:40:50 -0400 Subject: [Linux-cluster] Issues installing Clustering Software Message-ID: Having issues installing Clustering Software on server: rhel-x86_64-as-4-cluster I am running the following kernel: # uname -r 2.6.9-89.0.25.ELsmp The server is up to date with updates, but I am having trouble installing the Clustering software and don't know what to do now... # up2date -uf Fetching Obsoletes list for channel: rhel-x86_64-as-4... Fetching Obsoletes list for channel: rhel-x86_64-as-4-cluster... 
Name Version Rel Arch ---------------------------------------------------------------------------------------- ccs 1.0.12 1 x86_64 cman 1.0.27 1.el4 x86_64 cman-kernel-smp 2.6.9 56.7.el4_8.13 x86_64 dlm 1.0.7 1 x86_64 dlm-kernel-smp 2.6.9 58.6.el4_8.15 x86_64 fence 1.32.67 1.el4_8.2 x86_64 magma 1.0.8 1 x86_64 magma-plugins 1.0.15 1.el4_8.1 x86_64 rgmanager 1.9.87 1.el4_8.1 x86_64 system-config-cluster 1.0.56 2.2 noarch Testing package set / solving RPM inter-dependencies... There was a package dependency problem. The message was: Unresolvable chain of dependencies: cman-kernel-smp 2.6.9-56.7.el4_8.13 requires /lib/modules/2.6.9-89.0.23.ELsmp cman-kernel-smp-2.6.9-56.7.el4_8.13 requires kernel-smp = 2.6.9-89.0.23.EL dlm-kernel-smp 2.6.9-58.6.el4_8.15 requires /lib/modules/2.6.9-89.0.23.ELsmp dlm-kernel-smp-2.6.9-58.6.el4_8.15 requires kernel-smp = 2.6.9-89.0.23.EL The following packages were added to your selection to satisfy dependencies: Package Required by ---------------------------------------------------------------------------- # Any ideas how to get this installed ???? Thanks Mike ________________________________ NOTICE: The information contained in this electronic mail transmission is intended by Convergys Corporation for the use of the named individual or entity to which it is directed and may contain information that is privileged or otherwise confidential. If you have received this electronic mail transmission in error, please delete it from your system without copying or forwarding it, and notify the sender of the error by reply email or by telephone (collect), so that the sender's address records can be corrected. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dxh at yahoo.com Wed May 12 14:56:04 2010 From: dxh at yahoo.com (Don Hoover) Date: Wed, 12 May 2010 07:56:04 -0700 (PDT) Subject: [Linux-cluster] clvm ignores lvm.conf filter? Message-ID: <734565.71894.qm@web65514.mail.ac4.yahoo.com> Just curious...I just noticed that clvm seems to ignore the filter configuration in /etc/lvm.conf. We are getting "Found duplicate PV" errors on the /dev/sd? devices on our systems that are using clvm even though we filter those out in the lvm.conf to use the /dev/mapper/mpath* devices. In the non-clvm systems, this makes these errors go away because lvm will only look at the devices that match the lvm.conf filter. Does clvm ignore this filter? Is there a way to stop these "duplicate pv" errors on clvm boxes like there is on standalone ones? From dxh at yahoo.com Wed May 12 15:06:59 2010 From: dxh at yahoo.com (Don Hoover) Date: Wed, 12 May 2010 08:06:59 -0700 (PDT) Subject: [Linux-cluster] clvm ignores lvm.conf filter? Message-ID: <773465.27687.qm@web65503.mail.ac4.yahoo.com> On further discovery, it seems that it's lvdisplay that is ignoring them.
For "lvscan -vvvvvvvvvvvvvvvvv" I see this in the debug output: #filters/filter-regex.c:172? ? ? ???/dev/sdgy: Skipping (regex) For "lvdisplay -vvvvvvvvvvvvvvvvv" I see this in the debug output: #cache/lvmcache.c:1224? ? ???Ignoring duplicate PV Gw7zqCRUj0mjfGltw3AaHhMfznSjmMTc on /dev/sdgy - using dm /dev/mapper/mpath91 And no filter/skipping lines at all for any /dev/sd* devices...it scans them anyway for some reason. I wonder if this is a bug that has been introdced in the latest lvm packages...I am seeing this on a RHEL55 box running: lvm2-2.02.56-8.el5.x86_64 lvm2-cluster-2.02.56-7.el5.x86_64 I think I will post to the linux-lvm list as well. From celsowebber at yahoo.com Wed May 12 16:10:40 2010 From: celsowebber at yahoo.com (Celso K. Webber) Date: Wed, 12 May 2010 09:10:40 -0700 (PDT) Subject: [Linux-cluster] gfs limitations regarding some features In-Reply-To: <1273488379.5604.9.camel@debj5n.critical.pt> References: <1273488379.5604.9.camel@debj5n.critical.pt> Message-ID: <104815.83521.qm@web111721.mail.gq1.yahoo.com> Hello Joao, Although you didn't ask about NFS, I had many issues under RHEL 5.4 when sharing GFS volumes using NFS, usually regarding file locks. I thought it would be interesting to share this with you. Regards, Celso. ----- Original Message ---- From: Joao Ferreira gmail To: linux clustering Sent: Mon, May 10, 2010 7:46:19 AM Subject: [Linux-cluster] gfs limitations regarding some features Hello all, a friend told me that I might run into problems if I try to use gfs/gfs2 in systems with the following requirements: - Samba - POSIX ACLs - users and group quotas - RAID (mdadm) Not being an expert in any of these areas, I'dd like to have some insight on any related limitations I might encounter when using those features on gfs/gfs2 filesystems. Thank you Joao -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From linux at alteeve.com Wed May 12 20:27:53 2010 From: linux at alteeve.com (Digimer) Date: Wed, 12 May 2010 16:27:53 -0400 Subject: [Linux-cluster] List admins - question Message-ID: <4BEB0F49.5070302@alteeve.com> I'd like to as the list admins a question before (possibly) posting something here. Who would be the best to speak to? Perhaps you/they could email me off-list (digimer at alteeve.com). Thanks! -- Digimer E-Mail: linux at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From jacob.ishak at gmail.com Thu May 13 05:40:42 2010 From: jacob.ishak at gmail.com (jacob ishak) Date: Thu, 13 May 2010 08:40:42 +0300 Subject: [Linux-cluster] dlm: lockspace 20002 from 2 type 1 not found ERROR ?? Message-ID: Dear all i would like to ask i question someone else asked in 2008 in this mailing group but had no answer back maybe someone has an answer or clue . after setting up a cluster using RHCS on two RHEL5 nodes , connected directly to SAN FC storage with gfs as fs on shared storage , i m getting this msg below on both nodes . any idea what this means ??any help is appreciated note that the cluster is an HA cluster and is working fine . May 10 17:34:08 RHEL-node2 kernel: dlm: lockspace 20002 from 2 type 1 not found May 10 17:34:08 RHEL-node2 kernel: dlm: lockspace 30002 from 2 type 1 not found BR, Yacoub Ishak systems engineer -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mailtoaneeshvs at gmail.com Thu May 13 06:30:25 2010 From: mailtoaneeshvs at gmail.com (aneesh vs) Date: Thu, 13 May 2010 12:00:25 +0530 Subject: [Linux-cluster] Issues installing Clustering Software In-Reply-To: References: Message-ID: Hello, You are in kernel 2.6.9-89.0.25.ELsmp but up2date shows cman-kernel-* packages require kernel 2.6.9-89.0.23 . This is because Red Hat releases cman-kernel-* packages after one or two weeks of kernel releases. So you may install the kernel 2.6.9-89.0.23 and boot using it, then run up2date. Once the latest cman-kernel package is available in RHN, you can go ahead and install it and boot using latest kernel. --Aneesh On Wed, May 12, 2010 at 8:10 PM, Michael F Lense < michael.lense at convergys.com> wrote: > *Having issues installing Clustering Software on server: > rhel-x86_64-as-4-cluster > > I am running the following kernel: > *# uname -r > > *2.6.9-89.0.25.ELsmp* > > * > The server is up to date with updates, but I am having trouble installing > the Clustering software and don?t know what to do now? > > > # up2date -uf > > Fetching Obsoletes list for channel: rhel-x86_64-as-4... > > Fetching Obsoletes list for channel: rhel-x86_64-as-4-cluster... > > Name Version Rel Arch > > > ---------------------------------------------------------------------------------------- > > All packages are currently up to date > # * > > * * > > *# up2date --show-channels > rhel-x86_64-as-4 > rhel-x86_64-as-4-cluster > # > > According to a Doc I found in Red Hat Knowledge base, attached: > DOC-3451.pdf, and DOC-4188.pdf > > I ran the following for my kernel, but received following errors? > > # up2date cman cman-kernel-smp dlm dlm-kernel-smp magma magma-pluginssystem- > config-cluster rgmanager ccs fence modcluster --force > > Fetching Obsoletes list for channel: rhel-x86_64-as-4... > > Fetching Obsoletes list for channel: rhel-x86_64-as-4-cluster... > > Name Version Rel Arch > ---------------------------------------------------------------------------------------- > > ccs 1.0.12 1 > x86_64 > cman 1.0.27 1.el4 > x86_64 > cman-kernel-smp 2.6.9 56.7.el4_8.13 > x86_64 > dlm 1.0.7 1 > x86_64 > dlm-kernel-smp 2.6.9 58.6.el4_8.15 > x86_64 > fence 1.32.67 1.el4_8.2 > x86_64 > magma 1.0.8 1 > x86_64 > magma-plugins 1.0.15 1.el4_8.1 > x86_64 > rgmanager 1.9.87 1.el4_8.1 > x86_64 > system-config-cluster 1.0.56 2.2 > noarch > > > Testing package set / solving RPM inter-dependencies... > There was a package dependency problem. The message was: > > Unresolvable chain of dependencies: > cman-kernel-smp 2.6.9-56.7.el4_8.13 requires > /lib/modules/2.6.9-89.0.23.ELsmp > cman-kernel-smp-2.6.9-56.7.el4_8.13 requires kernel-smp = > 2.6.9-89.0.23.EL > dlm-kernel-smp 2.6.9-58.6.el4_8.15 requires > /lib/modules/2.6.9-89.0.23.ELsmp > dlm-kernel-smp-2.6.9-58.6.el4_8.15 requires kernel-smp = > 2.6.9-89.0.23.EL > > > The following packages were added to your selection to satisfy > dependencies: > Package Required by > ---------------------------------------------------------------------------- > > > # > > Any ideas how to get this installed ????* > > * * > > *Thanks* > > *Mike* > > ------------------------------ > NOTICE: The information contained in this electronic mail transmission is > intended by Convergys Corporation for the use of the named individual or > entity to which it is directed and may contain information that is > privileged or otherwise confidential. 
If you have received this electronic > mail transmission in error, please delete it from your system without > copying or forwarding it, and notify the sender of the error by reply email > or by telephone (collect), so that the sender's address records can be > corrected. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yamato at redhat.com Thu May 13 07:33:13 2010 From: yamato at redhat.com (Masatake YAMATO) Date: Thu, 13 May 2010 16:33:13 +0900 (JST) Subject: [Linux-cluster] [PATCH] fsfreeze: suspend and resume access to an filesystem In-Reply-To: <1194896399.685821273733580088.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> References: <1194896399.685821273733580088.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> Message-ID: <20100513.163313.67205054690538917.yamato@redhat.com> Hi, (The discussion can be found at http://thread.gmane.org/gmane.linux.utilities.util-linux-ng/3181/focus=3193) > Hello. > > I understand the reason when it is used with device-mapper. > I think the fsfreeze command is needed for filesystems on physical block devices without device-mapper. > For example, for storage-controller-based LUN snapshots. > > # fsfreeze -f /data > # ssh root at 192.168.0.1 "take snapshot lun0" > # fsfreeze -u /data > > * /data is a mounted physical block device (/dev/sdb1) As Hajime wrote, taking snapshots at the physical storage level is a common situation. It seems that xfs_freeze can be used for the purpose, but the name `xfs_freeze' gives the impression that the command is only for xfs. My argument is applicable to gfs2_tool, too: "gfs2_tool freeze" also does ``ioctl(fd, FIFREEZE, 0)''. One of the solutions is to add an xxx_freeze command for each filesystem implementation which has freeze/unfreeze methods to each util-xxx, xxx-progs or xxx-utils, e.g. adding an ext4_freeze or ext3_freeze command to the e2fsprogs package. However, I think this is not a good idea. Linux already provides a filesystem-neutral interface, so it is better to have a filesystem-neutral command (fsfreeze), and that command is included in the filesystem-neutral package, util-linux-ng. Masatake YAMATO From linux at alteeve.com Thu May 13 14:55:39 2010 From: linux at alteeve.com (Digimer) Date: Thu, 13 May 2010 10:55:39 -0400 Subject: [Linux-cluster] Cluster Workshop - Build a 2-Node Cluster! - Toronto, Ontario, Canada Message-ID: <4BEC12EB.8040707@alteeve.com> Before I get to the root of this message, let me add the disclaimer I shared with the list admin: ===================================================================== I am trying to promote cluster use, or at least help lower the barrier to entry. To that end, I am looking to offer a free workshop on building a simple 2-node cluster in the greater Toronto area in Ontario, Canada. I'm certainly not a cluster guru, and have been learning a lot as I go. I've actually based a recent talk and an in-progress HowTo on what I've learned and where I've tripped along the way. I've also created an open-hardware, open-source fence device to make it cheaper to build clusters using commodity hardware (links below). ===================================================================== The announcement: --------------------------------------------------------------------- Cluster Workshop - Build a 2-Node Cluster! -=] What: You or your group would come with two bare servers and leave with a fully-functioning 2-Node cluster.
-=] When: Looking at the weekend of Jul. 31 - Aug. 1. The date may shift a weekend or two in either direction depending on the availability of space and the schedules of interested attendees. -=] Where: In the greater Toronto area of Ontario, Canada. Exact location to be determined based on interest levels. -=] Cost: I will donate my time and I may have access to a suitable location, depending on how much interest there is. If there are costs associated with the location, that price will be converted into a ticket price based on the costs incurred. Given these potential costs, please be clear on your interest level. -=] Requirements: You are responsible for your own hardware (capable machines, fence, monitor(s), peripherals). You will need to be familiar with: - CentOS/Red Hat OS install. - LVM (in concept, if not practice). - Basic networking. - No prior experience with clustering is required. If you are interested but aren't sure if you've got the skills yet, let me know and I will be happy to help bring people up to speed beforehand as best I can. -=] What else? If I can get a board laid up soon enough, I can provide Node Assassin fence devices for a modest cost. If you have access to servers with IPMI, you're set. If you need help with sourcing hardware, let me know and I will help. -=] Last thoughts? This whole exercise is contingent on there being enough interest. --------------------------------------------------------------------- Links: Fence device: - http://nodeassassin.org In progress How-To: - http://wiki.alteeve.com/index.php/2-Node_CentOS5_Cluster -- Digimer E-Mail: linux at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From brentgclarklist at gmail.com Fri May 14 18:26:46 2010 From: brentgclarklist at gmail.com (Brent Clark) Date: Fri, 14 May 2010 20:26:46 +0200 Subject: [Linux-cluster] GFS on Debian Lenny Message-ID: <4BED95E6.4040006@gmail.com> Hiya, I'm trying to get GFS working on Debian Lenny. Unfortunately, documentation seems to be non-existent, and the one site that Google recommends, gcharriere.com, is down. I used Google's cache to try to make heads or tails of what needs to be done, but unfortunately I'm unsuccessful. Would anyone have any documentation or any sites, or, if you have a heart, could you provide a howto to get GFS working? From my side, all I've done is: aptitude install gfs2-tools modprobe gfs2 gfs_mkfs -p lock_dlm -t lolcats:drbdtest /dev/drbd0 -j 2 That's all I've done. No editing of configs, etc. When I try mount -t gfs2 /dev/drbd0 /drbd/ I get the following message: /sbin/mount.gfs2: can't connect to gfs_controld: Connection refused /sbin/mount.gfs2: gfs_controld not running /sbin/mount.gfs2: error mounting lockproto lock_dlm If anyone can help, it would be appreciated. Kind Regards Brent Clark From dhoffutt at gmail.com Fri May 14 19:45:11 2010 From: dhoffutt at gmail.com (Dusty) Date: Fri, 14 May 2010 14:45:11 -0500 Subject: [Linux-cluster] pull plug on node, service never relocates Message-ID: Greetings, Using stock "clustering" and "cluster-storage" from the RHEL5 update 4 x86_64 ISO. As an example using my config below: node1 is running service1, node2 is running service2, etc., and node5 is spare and available for the relocation of any failover domain / cluster service. If I go into the APC PDU and turn off the electrical port to node1, node2 will fence node1 (going into the APC PDU and doing an off/on on node1's port), and this is fine. It works well.
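For reference, the same power cycle can also be driven by hand, which is a quick way to confirm the fencing path outside of a real failure; a minimal sketch, where the node name, PDU address, credentials and outlet number are placeholders rather than values from this configuration: # fence_node node1 # fences the named member using whatever agent cluster.conf defines for it # fence_apc -a 192.168.1.50 -l apc -p apc -n 1 -o reboot # placeholder PDU IP, login, password and outlet fence_node goes through the normal cluster fencing configuration, while calling fence_apc directly only tests the PDU agent itself.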
When node1 comes back up, it shuts down service1 and service1 relocates to node5. Now if I go into the lab and literally pull the plug on node5 running service1, another node fences node5 via the APC - I can check the APC PDU log and see that it has done an off/on on node5's electrical port just fine. But I pulled the plug on node5 - resetting the power doesn't matter. I want to simulate a completely dead node, and have the service relocate in this case of complete node failure. In this RHEL5.4 cluster, the service never relocates. I can simulate this on any node for any service. What if a node's motherboard fries? What can I set to have the remaining nodes stop waiting for the reboot of a failed node and just go ahead and relocate the cluster service that had been running on the now-failed node? Thank you! versions: cman-2.0.115-1.el5 openais-0.80.6-8.el5 modcluster-0.12.1-2.el5 lvm2-cluster-2.02.46-8.el5 rgmanager-2.0.52-1.el5 ricci-0.12.2-6.el5 cluster.conf (sanitized, real scripts removed, all gfs2 mounts gone for clarity):