From linux at alteeve.com Tue Nov 1 04:54:57 2011 From: linux at alteeve.com (Digimer) Date: Tue, 01 Nov 2011 00:54:57 -0400 Subject: [Linux-cluster] Forcing off KVM Windows guests from rgmanager Message-ID: <4EAF7BA1.1000500@alteeve.com> Hi all, I've run into something of a corner case; EL6 / cman 3 rgmanager KVM VMs Win2008 R2 guest I want to allow my UPS to shut down my cluster when the batteries are about to fail. The problem with this is that when I try to stop rgmanager (or even simply disabling the VM resource), an application on the windows KVM guest pops up a "Are you sure you want to close X?". This blocks the VMs shutdown, which leaves rgmanager sitting there indefinitely waiting for the guest VM to stop and nothing actually shuts down until the batteries drain. The application in question does not have a "don't prompt me" option, so I need one of; * A way to either tell the windows guest to forcibly stop to process. * A way to have rgmanager pause and write out to disk the state of a VM. * A way to 'virsh destory' a guest as a special kind of 'clusvcadm -d ...' call. I'm using the virtio drivers, which I believe (perhaps wrongly) provides the ACPI hook to start the guest VM. Any suggestions/ideas? Anything has to be better than waiting and letting the whole cluster hard power off. -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "omg my singularity battery is dead again. stupid hawking radiation." - epitron From Sagar.Shimpi at tieto.com Tue Nov 1 09:29:56 2011 From: Sagar.Shimpi at tieto.com (Sagar.Shimpi at tieto.com) Date: Tue, 1 Nov 2011 11:29:56 +0200 Subject: [Linux-cluster] Need help regarding Sared storage with GFS2 Message-ID: Hi, Following is my setup - Redhat -6.0 ==> 64-bit Cluster configuration using LUCI. I had setup 2 node cluster Load Balancing Cluster having Mysql service active on both the nodes using different Failover Domain. Node1 [Mysql-1 running with IP - 192.168.56.2 ] Node2 [Mysql-2 running with IP - 192.168.56.3 ] For both the above Mysql services I had used common storage using GFS2 file system. But I am facing the problem in syncing the storage. On both the nodes data is not in sync. Is it possible to sync the data using GFS2 file system while configuring MYSQL load Balancing Cluster??? Regards, Sagar Shimpi, Senior Technical Specialist, OSS Labs Tieto email sagar.shimpi at tieto.com, Wing 1, Cluster D, EON Free Zone, Plot No. 1, Survery # 77, MIDC Kharadi Knowledge Park, Pune 411014, India, www.tieto.com www.tieto.in TIETO. Knowledge. Passion. Results. -------------- next part -------------- An HTML attachment was scrubbed... URL: From swhiteho at redhat.com Tue Nov 1 11:43:07 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Tue, 01 Nov 2011 11:43:07 +0000 Subject: [Linux-cluster] Need help regarding Sared storage with GFS2 In-Reply-To: References: Message-ID: <1320147787.2707.8.camel@menhir> Hi, On Tue, 2011-11-01 at 11:29 +0200, Sagar.Shimpi at tieto.com wrote: > Hi, > > > > Following is my setup ? > > > > Redhat -6.0 ? 64-bit > > Cluster configuration using LUCI. > > > > I had setup 2 node cluster Load Balancing Cluster having Mysql service > active on both the nodes using different Failover Domain. > > Node1 [Mysql-1 running with IP ? 192.168.56.2 ] > > Node2 [Mysql-2 running with IP ? 192.168.56.3 ] > > > > For both the above Mysql services I had used common storage using GFS2 > file system. 
But I am facing the problem in syncing the storage. On > both the nodes data is not in sync. > > > > Is it possible to sync the data using GFS2 file system while > configuring MYSQL load Balancing Cluster??? > That is really a MySQL question rather than a cluster question. In general it is not likely that running multiple copies of MySQL across a set of nodes will work. At least, not with the standard set up, anyway. You'd need a cluster aware database in order to do that, Steve. > > > > > Regards, > > > > Sagar Shimpi, Senior Technical Specialist, OSS Labs > > > > Tieto > > email sagar.shimpi at tieto.com, > > Wing 1, Cluster D, EON Free Zone, Plot No. 1, Survery # 77, > > MIDC Kharadi Knowledge Park, Pune 411014, India, www.tieto.com > www.tieto.in > > > TIETO. Knowledge. Passion. Results. > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From kkovachev at varna.net Tue Nov 1 11:53:00 2011 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Tue, 01 Nov 2011 13:53:00 +0200 Subject: [Linux-cluster] Need help regarding Sared storage with GFS2 In-Reply-To: References: Message-ID: <85a8f0a04031dfef1d4e679a7526291c@mx.varna.net> On Tue, 1 Nov 2011 11:29:56 +0200, wrote: > Hi, > > Following is my setup - > > Redhat -6.0 ==> 64-bit > Cluster configuration using LUCI. > > I had setup 2 node cluster Load Balancing Cluster having Mysql service > active on both the nodes using different Failover Domain. > Node1 [Mysql-1 running with IP - 192.168.56.2 ] > Node2 [Mysql-2 running with IP - 192.168.56.3 ] > > For both the above Mysql services I had used common storage using GFS2 > file system. But I am facing the problem in syncing the storage. On both > the nodes data is not in sync. Which one is not true "I had used common storage" or "On both the nodes data is not in sync" - if it is a common storage the data is the same? if you are using GFS2 without a cluster and dlm locking i.e. local_locking then it is possible both to be true > > Is it possible to sync the data using GFS2 file system while configuring > MYSQL load Balancing Cluster??? > GFS2 has nothing to do with syncing the data between two storages - if that's what you are after, check DRBD if you are using improperly configured GFS2 on a shared storage i.e. without cluster and dlm it is no different than any other local filesystem and corruption is guaranteed on simultaneous use how did you create the GFS2 filesystem? Also please show your cluster.conf and relevant storage details From symack at gmail.com Tue Nov 1 12:27:04 2011 From: symack at gmail.com (Nick Khamis) Date: Tue, 1 Nov 2011 08:27:04 -0400 Subject: [Linux-cluster] Forcing off KVM Windows guests from rgmanager In-Reply-To: <4EAF7BA1.1000500@alteeve.com> References: <4EAF7BA1.1000500@alteeve.com> Message-ID: Hello Digmer, We are working on a similer project, only: EL6 / pcmk+cman (for dlm and fenced) No rgmanager since we will be using pacemaker KVM Could you kindly share some whitepapers that you have been using for your setup? Documentation for things like live migration, fenching the VMs etc..? PLEASE! Thanks in Advance, Nick. On Tue, Nov 1, 2011 at 12:54 AM, Digimer wrote: > Hi all, > > ?I've run into something of a corner case; > > EL6 / cman 3 > rgmanager > KVM VMs > Win2008 R2 guest > > I want to allow my UPS to shut down my cluster when the batteries are > about to fail. 
> > The problem with this is that when I try to stop rgmanager (or even > simply disabling the VM resource), an application on the windows KVM > guest pops up a "Are you sure you want to close X?". This blocks the VMs > shutdown, which leaves rgmanager sitting there indefinitely waiting for > the guest VM to stop and nothing actually shuts down until the batteries > drain. > > The application in question does not have a "don't prompt me" option, so > I need one of; > * A way to either tell the windows guest to forcibly stop to process. > * A way to have rgmanager pause and write out to disk the state of a VM. > * A way to 'virsh destory' a guest as a special kind of 'clusvcadm -d > ...' call. > > I'm using the virtio drivers, which I believe (perhaps wrongly) provides > the ACPI hook to start the guest VM. > > Any suggestions/ideas? Anything has to be better than waiting and > letting the whole cluster hard power off. > > -- > Digimer > E-Mail: ? ? ? ? ? ? ?digimer at alteeve.com > Freenode handle: ? ? digimer > Papers and Projects: http://alteeve.com > Node Assassin: ? ? ? http://nodeassassin.org > "omg my singularity battery is dead again. > stupid hawking radiation." - epitron > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From rpeterso at redhat.com Tue Nov 1 13:03:41 2011 From: rpeterso at redhat.com (Bob Peterson) Date: Tue, 01 Nov 2011 09:03:41 -0400 (EDT) Subject: [Linux-cluster] Need help regarding Sared storage with GFS2 In-Reply-To: <85a8f0a04031dfef1d4e679a7526291c@mx.varna.net> Message-ID: <0335948a-fbf7-4108-8786-3f7de0805a5f@zmail06.collab.prod.int.phx2.redhat.com> ----- Original Message ----- | Which one is not true "I had used common storage" or "On both the | nodes | data is not in sync" - if it is a common storage the data is the | same? | | if you are using GFS2 without a cluster and dlm locking i.e. | local_locking | then it is possible both to be true (snip) | GFS2 has nothing to do with syncing the data between two storages - | if | that's what you are after, check DRBD | | if you are using improperly configured GFS2 on a shared storage i.e. | without cluster and dlm it is no different than any other local | filesystem | and corruption is guaranteed on simultaneous use Hi, IMHO, the most important things to bear in mind here are: (1) The job of GFS2 is to keep the file system _metadata_ consistent between nodes in the cluster. (2) It does _not_ keep DATA within the files consistent within the cluster: that's the job of the application. (3) If the application is not cluster-aware (i.e. one instance of mysql doesn't know about another instance in the cluster) they will trounce each other's updates, making the data inconsistent. (4) The general rule is: If two instances of an app can run on the same computer, in general it will work properly without data corruption. But if one computer is not allowed to run two instances of the same app, in general it will not work properly. (5) With clustering you can essentially think of it this way: it makes multiple computers run an app as if they were running multiple instances on the same computer. Almost like forcing the app to run two instances on the same computer (although that's not at all what really happens). Multiple instances on the same machine will use some kind of locking mechanism, like posix locks, to maintain data integrity. (6) Many apps are written with clustering in mind and there may be special "clustered" versions of apps, like mysql. 
It's best to check with the app experts or clustering experts or the cluster FAQ before implementing this kind of thing. So bottom line: You can't run two copies of regular mysql on the same box (unless it's a special cluster-aware mysql) without conflicts so you can't run two copies of regular mysql in a cluster without data corruption, because they are not cluster-aware. Regards, Bob Peterson Red Hat File Systems From symack at gmail.com Tue Nov 1 13:17:36 2011 From: symack at gmail.com (Nick Khamis) Date: Tue, 1 Nov 2011 09:17:36 -0400 Subject: [Linux-cluster] Forcing off KVM Windows guests from rgmanager In-Reply-To: References: <4EAF7BA1.1000500@alteeve.com> Message-ID: You make it look so easy: https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial ;) Nick. From kkovachev at varna.net Tue Nov 1 14:13:50 2011 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Tue, 01 Nov 2011 16:13:50 +0200 Subject: [Linux-cluster] Need help regarding Sared storage with GFS2 In-Reply-To: <0335948a-fbf7-4108-8786-3f7de0805a5f@zmail06.collab.prod.int.phx2.redhat.com> References: <0335948a-fbf7-4108-8786-3f7de0805a5f@zmail06.collab.prod.int.phx2.redhat.com> Message-ID: <331c3d89b5cfd9dc4893b28d3961f667@mx.varna.net> On Tue, 01 Nov 2011 09:03:41 -0400 (EDT), Bob Peterson wrote: > ----- Original Message ----- > | Which one is not true "I had used common storage" or "On both the > | nodes > | data is not in sync" - if it is a common storage the data is the > | same? > | > | if you are using GFS2 without a cluster and dlm locking i.e. > | local_locking > | then it is possible both to be true > (snip) > | GFS2 has nothing to do with syncing the data between two storages - > | if > | that's what you are after, check DRBD > | > | if you are using improperly configured GFS2 on a shared storage i.e. > | without cluster and dlm it is no different than any other local > | filesystem > | and corruption is guaranteed on simultaneous use > > Hi, > > IMHO, the most important things to bear in mind here are: > > (1) The job of GFS2 is to keep the file system _metadata_ consistent > between nodes in the cluster. > (2) It does _not_ keep DATA within the files consistent within the > cluster: that's the job of the application. > (3) If the application is not cluster-aware (i.e. one instance of > mysql doesn't know about another instance in the cluster) they > will trounce each other's updates, making the data inconsistent. > (4) The general rule is: If two instances of an app can run on the > same computer, in general it will work properly without data > corruption. But if one computer is not allowed to run two > instances of the same app, in general it will not work properly. > (5) With clustering you can essentially think of it this way: it > makes multiple computers run an app as if they were running > multiple instances on the same computer. Almost like forcing > the app to run two instances on the same computer (although > that's not at all what really happens). Multiple instances > on the same machine will use some kind of locking mechanism, > like posix locks, to maintain data integrity. > (6) Many apps are written with clustering in mind and there > may be special "clustered" versions of apps, like mysql. > It's best to check with the app experts or clustering experts > or the cluster FAQ before implementing this kind of thing. 
> > So bottom line: You can't run two copies of regular mysql on the > same box (unless it's a special cluster-aware mysql) without conflicts > so you can't run two copies of regular mysql in a cluster without > data corruption, because they are not cluster-aware. > I agree with all said, but it is possible to run more than one instance of regular mysql on the same box. I run 3 (slave of master 1, slave of master 2 and combined RO export) instances (on the same machine), using the same data without problems, but you need to define 'external-locking' which slows them down running two instances in a cluster from shared storage is possible, but much slower and not a solution. > Regards, > > Bob Peterson > Red Hat File Systems > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From linux at alteeve.com Tue Nov 1 15:42:36 2011 From: linux at alteeve.com (Digimer) Date: Tue, 01 Nov 2011 11:42:36 -0400 Subject: [Linux-cluster] Forcing off KVM Windows guests from rgmanager In-Reply-To: References: <4EAF7BA1.1000500@alteeve.com> Message-ID: <4EB0136C.1080804@alteeve.com> On 11/01/2011 09:17 AM, Nick Khamis wrote: > You make it look so easy: > https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial ;) > > > Nick. Just a fair warning; That tutorial is well on it's way, but it is not complete and I've not yet gone over it looking for mistakes. Please feel free to read it and follow it, but be caustious of mistakes or omissions at this time. I plan to post an announcement when it's finished. In the mean time, if you run into problems, feel free to ask me at this address or stop by #linux-cluster on IRC. :) -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "omg my singularity battery is dead again. stupid hawking radiation." - epitron From mmorgan at dca.net Tue Nov 1 15:56:42 2011 From: mmorgan at dca.net (Michael Morgan) Date: Tue, 1 Nov 2011 11:56:42 -0400 Subject: [Linux-cluster] Forcing off KVM Windows guests from rgmanager In-Reply-To: <4EAF7BA1.1000500@alteeve.com> References: <4EAF7BA1.1000500@alteeve.com> Message-ID: <20111101155642.GA30573@staff.dca.net> > The problem with this is that when I try to stop rgmanager (or even > simply disabling the VM resource), an application on the windows KVM > guest pops up a "Are you sure you want to close X?". This blocks the VMs > shutdown, which leaves rgmanager sitting there indefinitely waiting for > the guest VM to stop and nothing actually shuts down until the batteries > drain. How much much lead time are you giving rgmanager? In my experience (and according to vm.sh) rgmanager will issue a virsh destroy roughly 2 minutes after a virsh shutdown. From vm.sh: 263 264 265 ... 467 for op in $*; do 468 echo virsh $op $OCF_RESKEY_name ... 469 virsh $op $OCF_RESKEY_name 470 471 timeout=$(get_timeout) 472 while [ $timeout -gt 0 ]; do 473 sleep 5 474 ((timeout -= 5)) 475 state=$(do_status) 476 [ $? -eq 0 ] || return 0 477 478 if [ "$state" = "paused" ]; then 479 virsh destroy $OCF_RESKEY_name 480 fi 481 done 482 done I don't know off hand if the action parameters can be adjusted in the rgmanager config, I've never had cause to change it personally. Your site is a very useful resource btw. Great job and many thanks! 
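For completeness, the same thing done by hand outside of rgmanager would just be a shutdown followed by a destroy after a grace period. This is only a rough sketch (untested here, with "win2008guest" standing in for the real domain name):

  virsh shutdown win2008guest
  timeout=120
  while [ $timeout -gt 0 ] && virsh domstate win2008guest | grep -q running; do
      sleep 5
      timeout=$((timeout - 5))
  done
  # if the guest is still running after the grace period, pull the plug
  virsh domstate win2008guest | grep -q running && virsh destroy win2008guest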
-Mike From symack at gmail.com Tue Nov 1 15:59:54 2011 From: symack at gmail.com (Nick Khamis) Date: Tue, 1 Nov 2011 11:59:54 -0400 Subject: [Linux-cluster] Forcing off KVM Windows guests from rgmanager In-Reply-To: <4EB0136C.1080804@alteeve.com> References: <4EAF7BA1.1000500@alteeve.com> <4EB0136C.1080804@alteeve.com> Message-ID: Hey Digmer, Most definitely. I've made some headway over the past month and a half, our project is using a pcmk+cman+corosync+ocfs2 stack, all nose bleed versions, and all built from source. We basically hit ALL the errors. Currently looking to integrate the fenced part of the equation using fence_xvm. Any input or heads up would be appreciated? Also, if I am not mistaken, I noticed that you lean towards the Cluster3 stack, any reason for staying away from pcmk+cman with corosync? Cheers, Nick from Toronto (And sometimes Montreal) From linux at alteeve.com Tue Nov 1 16:06:43 2011 From: linux at alteeve.com (Digimer) Date: Tue, 01 Nov 2011 12:06:43 -0400 Subject: [Linux-cluster] Forcing off KVM Windows guests from rgmanager In-Reply-To: References: <4EAF7BA1.1000500@alteeve.com> <4EB0136C.1080804@alteeve.com> Message-ID: <4EB01913.7010309@alteeve.com> On 11/01/2011 11:59 AM, Nick Khamis wrote: > Hey Digmer, > > Most definitely. I've made some headway over the past month and a half, our > project is using a pcmk+cman+corosync+ocfs2 stack, all nose bleed versions, > and all built from source. We basically hit ALL the errors. Currently looking > to integrate the fenced part of the equation using fence_xvm. Any input or > heads up would be appreciated? > Also, if I am not mistaken, I noticed that you lean towards the Cluster3 stack, > any reason for staying away from pcmk+cman with corosync? > > Cheers, > > Nick from Toronto (And sometimes Montreal) Hey, a fellow Torontonian. :) I stick with what is officially supported by Red Hat, quite simply. I am tracking Pacemaker's progress closely, and plan to start learning it more earnestly before too long. At the moment though, there are two major issues for me with Pacemaker; * It's in technology preview in EL6 at the moment, so no z-stream updates. * The implementation of fencing (aka stonith) are not the way I need them to be. Pacemaker is awesome, and it will be the future, but at the moment, cman+rgmanager is what is most stable, so that is where I work. :) Also, for anything aiming at production, I *strongly* recommend staying away from compiling your own apps. For learning/testing though, running from the latest versions and then filing bugs against the code is always appreciated. :) -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "omg my singularity battery is dead again. stupid hawking radiation." - epitron From linux at alteeve.com Tue Nov 1 16:08:38 2011 From: linux at alteeve.com (Digimer) Date: Tue, 01 Nov 2011 12:08:38 -0400 Subject: [Linux-cluster] Forcing off KVM Windows guests from rgmanager In-Reply-To: <20111101155642.GA30573@staff.dca.net> References: <4EAF7BA1.1000500@alteeve.com> <20111101155642.GA30573@staff.dca.net> Message-ID: <4EB01986.1070108@alteeve.com> On 11/01/2011 11:56 AM, Michael Morgan wrote: >> The problem with this is that when I try to stop rgmanager (or even >> simply disabling the VM resource), an application on the windows KVM >> guest pops up a "Are you sure you want to close X?". 
This blocks the VMs >> shutdown, which leaves rgmanager sitting there indefinitely waiting for >> the guest VM to stop and nothing actually shuts down until the batteries >> drain. > > How much much lead time are you giving rgmanager? In my experience (and > according to vm.sh) rgmanager will issue a virsh destroy roughly 2 > minutes after a virsh shutdown. From vm.sh: > > 263 > 264 > 265 > ... > 467 for op in $*; do > 468 echo virsh $op $OCF_RESKEY_name ... > 469 virsh $op $OCF_RESKEY_name > 470 > 471 timeout=$(get_timeout) > 472 while [ $timeout -gt 0 ]; do > 473 sleep 5 > 474 ((timeout -= 5)) > 475 state=$(do_status) > 476 [ $? -eq 0 ] || return 0 > 477 > 478 if [ "$state" = "paused" ]; then > 479 virsh destroy $OCF_RESKEY_name > 480 fi > 481 done > 482 done > > I don't know off hand if the action parameters can be adjusted in the > rgmanager config, I've never had cause to change it personally. > > Your site is a very useful resource btw. Great job and many thanks! > > -Mike Not two minutes. ;) I will try letting is wait on my test cluster shortly and will report back. If the destroy is indeed called, that would be fantastic. Glad to hear the site help! :D -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "omg my singularity battery is dead again. stupid hawking radiation." - epitron From wewrussell at gmail.com Tue Nov 1 20:48:55 2011 From: wewrussell at gmail.com (W. E. W. Russell) Date: Tue, 1 Nov 2011 16:48:55 -0400 Subject: [Linux-cluster] Issue with Conga on RHEL 5.7 Message-ID: My name is William Russell and I'm new on this list. I'm having an issue that I think is really simple, but I can't even figure out where the problem is located. I have installed 'luci' and 'ricci' on my main server and 'ricci' on my other two servers. All servers are running the latest RHEL 5.7 with all the yum updates (RHN registered with the Clustering entitlements). When I create a new cluster, I get to the progress screen and that's where everything falls apart. It just sits there for hours! It never creates the cluster. The dot for "Install" NEVER fills in. After much research, I understand what 'luci' is trying to do in terms of installing the packages necessary to configure, manage, and run the cluster, but I have done the 'yum groupinstall clustering' on another sever and it took 10 mins, if that. Communication between the servers has been verified - iptables is not running, selinux is disabled. If you need more information on the issue, feel free to ask. I looked at the syslog and see no failures or errors. If anyone can even point me in the direction of what might be causing the problem, it would be much appreciated. -- W. E. W. Russell Director, Systems Intergration at incNETWORKS, Inc. Work Phone # 732-508-2224 Active Alumni member of Sigma Lambda Beta International Fraternity, Inc. Cell Phone # 732-744-6483 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Sagar.Shimpi at tieto.com Wed Nov 2 04:56:02 2011 From: Sagar.Shimpi at tieto.com (Sagar.Shimpi at tieto.com) Date: Wed, 2 Nov 2011 06:56:02 +0200 Subject: [Linux-cluster] Need help regarding Sared storage with GFS2 In-Reply-To: <0335948a-fbf7-4108-8786-3f7de0805a5f@zmail06.collab.prod.int.phx2.redhat.com> References: <85a8f0a04031dfef1d4e679a7526291c@mx.varna.net> <0335948a-fbf7-4108-8786-3f7de0805a5f@zmail06.collab.prod.int.phx2.redhat.com> Message-ID: Thanks A lot for the detail explanation. 
Regards, Sagar Shimpi, Senior Technical Specialist, OSS Labs Tieto email sagar.shimpi at tieto.com, Wing 1, Cluster D, EON Free Zone, Plot No. 1, Survery # 77, MIDC Kharadi Knowledge Park, Pune 411014, India, www.tieto.com www.tieto.in TIETO. Knowledge. Passion. Results. -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bob Peterson Sent: Tuesday, November 01, 2011 6:34 PM To: linux clustering Subject: Re: [Linux-cluster] Need help regarding Sared storage with GFS2 ----- Original Message ----- | Which one is not true "I had used common storage" or "On both the | nodes | data is not in sync" - if it is a common storage the data is the | same? | | if you are using GFS2 without a cluster and dlm locking i.e. | local_locking | then it is possible both to be true (snip) | GFS2 has nothing to do with syncing the data between two storages - | if | that's what you are after, check DRBD | | if you are using improperly configured GFS2 on a shared storage i.e. | without cluster and dlm it is no different than any other local | filesystem | and corruption is guaranteed on simultaneous use Hi, IMHO, the most important things to bear in mind here are: (1) The job of GFS2 is to keep the file system _metadata_ consistent between nodes in the cluster. (2) It does _not_ keep DATA within the files consistent within the cluster: that's the job of the application. (3) If the application is not cluster-aware (i.e. one instance of mysql doesn't know about another instance in the cluster) they will trounce each other's updates, making the data inconsistent. (4) The general rule is: If two instances of an app can run on the same computer, in general it will work properly without data corruption. But if one computer is not allowed to run two instances of the same app, in general it will not work properly. (5) With clustering you can essentially think of it this way: it makes multiple computers run an app as if they were running multiple instances on the same computer. Almost like forcing the app to run two instances on the same computer (although that's not at all what really happens). Multiple instances on the same machine will use some kind of locking mechanism, like posix locks, to maintain data integrity. (6) Many apps are written with clustering in mind and there may be special "clustered" versions of apps, like mysql. It's best to check with the app experts or clustering experts or the cluster FAQ before implementing this kind of thing. So bottom line: You can't run two copies of regular mysql on the same box (unless it's a special cluster-aware mysql) without conflicts so you can't run two copies of regular mysql in a cluster without data corruption, because they are not cluster-aware. Regards, Bob Peterson Red Hat File Systems -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From ext.thales.jean-daniel.bonnetot at sncf.fr Wed Nov 2 08:42:29 2011 From: ext.thales.jean-daniel.bonnetot at sncf.fr (BONNETOT Jean-Daniel (EXT THALES)) Date: Wed, 2 Nov 2011 09:42:29 +0100 Subject: [Linux-cluster] Issue with Conga on RHEL 5.7 In-Reply-To: References: Message-ID: Hello, We are many with same problem. Since RHEL 5.7, packages installation don't work from ricci. For now, our solution is to install packages manually without using "groupinstall" command. yum install cman rgmanager qdiskd ... 
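In other words, roughly this on each node before creating the cluster from luci (the package list is only an example -- adjust it to the resource agents your cluster really needs -- and the exact luci wording may differ between Conga versions):

  yum install -y cman rgmanager ricci
  chkconfig ricci on
  service ricci start
  # then create the cluster in luci and tell it to use the locally
  # installed packages rather than downloading them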
Regards, Jean-Daniel BONNETOT Ing?nierie Syst?me Aix & Linux De?: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] De la part de W. E. W. Russell Envoy??: mardi 1 novembre 2011 21:49 ??: linux-cluster at redhat.com Cc?: Gerardo Laracuente Objet?: [Linux-cluster] Issue with Conga on RHEL 5.7 My name is William Russell and I'm new on this list. I'm having an issue that I think is really simple, but I can't even figure out where the problem is located. I have installed 'luci' and 'ricci' on my main server and 'ricci' on my other two servers. All servers are running the latest RHEL 5.7 with all the yum updates (RHN registered with the Clustering entitlements). When I create a new cluster, I get to the progress screen and that's where everything falls apart. It just sits there for hours! It never creates the cluster. The dot for "Install" NEVER fills in. After much research, I understand what 'luci' is trying to do in terms of installing the packages necessary to configure, manage, and run the cluster, but I have done the 'yum groupinstall clustering' on another sever and it took 10 mins, if that. Communication between the servers has been verified - iptables is not running, selinux is disabled.? If you need more information on the issue, feel free to ask. I looked at the syslog and see no failures or errors. If anyone can even point me in the direction of what might be causing the problem, it would be much appreciated. -- W. E. W. Russell Director, Systems Intergration at incNETWORKS, Inc. Work Phone # 732-508-2224? Active Alumni member of Sigma Lambda Beta International Fraternity, Inc. Cell Phone # 732-744-6483 ------- Ce message et toutes les pi?ces jointes sont ?tablis ? l'intention exclusive de ses destinataires et sont confidentiels. L'int?grit? de ce message n'?tant pas assur?e sur Internet, la SNCF ne peut ?tre tenue responsable des alt?rations qui pourraient se produire sur son contenu. Toute publication, utilisation, reproduction, ou diffusion, m?me partielle, non autoris?e pr?alablement par la SNCF, est strictement interdite. Si vous n'?tes pas le destinataire de ce message, merci d'en avertir imm?diatement l'exp?diteur et de le d?truire. ------- This message and any attachments are intended solely for the addressees and are confidential. SNCF may not be held responsible for their contents whose accuracy and completeness cannot be guaranteed over the Internet. Unauthorized use, disclosure, distribution, copying, or any part thereof is strictly prohibited. If you are not the intended recipient of this message, please notify the sender immediately and delete it. From szekelyi at niif.hu Wed Nov 2 20:00:18 2011 From: szekelyi at niif.hu (=?ISO-8859-1?Q?Sz=E9kelyi?= Szabolcs) Date: Wed, 02 Nov 2011 21:00:18 +0100 Subject: [Linux-cluster] Running a cluster on routed networks Message-ID: <19707654.WTfKzLJxNB@mranderson> Hello, how can I run a cluster on a network where nodes are on different subnets? Currently the main problem is that heartbeats are sent with their IP level TTL set to 1, which keeps them from reaching the other nodes. How can I change this? I'm using multicasting. Thanks, -- Szabolcs From szekelyi at niif.hu Thu Nov 3 17:12:46 2011 From: szekelyi at niif.hu (=?ISO-8859-1?Q?Sz=E9kelyi?= Szabolcs) Date: Thu, 03 Nov 2011 18:12:46 +0100 Subject: [Linux-cluster] Running a cluster on routed networks In-Reply-To: <19707654.WTfKzLJxNB@mranderson> References: <19707654.WTfKzLJxNB@mranderson> Message-ID: <9710609.JpD2iA48WT@mranderson> On 2011. 
November 2. 21:00:18 Sz?kelyi Szabolcs wrote: > how can I run a cluster on a network where nodes are on different subnets? > Currently the main problem is that heartbeats are sent with their IP level > TTL set to 1, which keeps them from reaching the other nodes. How can I > change this? I'm using multicasting. OK, I've found this: https://bugzilla.redhat.com/show_bug.cgi?id=640311 , saying that it's now possible to set the TTL for multicast. But I haven't found any info on *how* to set it. I've seen the following possible solutions: But whatever I do, ccs_config_validate always says that my cluster.conf is invalid, and the TTL (as reported by tcpdump) is still zero. Is it possible that my cman is out of date? I'm using version 3.0.12. Can you tell me which is the eariest version that has this feature? Thanks, -- Szabolcs From linux at alteeve.com Thu Nov 3 17:29:25 2011 From: linux at alteeve.com (Digimer) Date: Thu, 03 Nov 2011 13:29:25 -0400 Subject: [Linux-cluster] Running a cluster on routed networks In-Reply-To: <9710609.JpD2iA48WT@mranderson> References: <19707654.WTfKzLJxNB@mranderson> <9710609.JpD2iA48WT@mranderson> Message-ID: <4EB2CF75.60904@alteeve.com> On 11/03/2011 01:12 PM, Sz?kelyi Szabolcs wrote: > On 2011. November 2. 21:00:18 Sz?kelyi Szabolcs wrote: >> how can I run a cluster on a network where nodes are on different subnets? >> Currently the main problem is that heartbeats are sent with their IP level >> TTL set to 1, which keeps them from reaching the other nodes. How can I >> change this? I'm using multicasting. > > OK, I've found this: https://bugzilla.redhat.com/show_bug.cgi?id=640311 , > saying that it's now possible to set the TTL for multicast. But I haven't > found any info on *how* to set it. I've seen the following possible solutions: > > > > > > > > > > > > But whatever I do, ccs_config_validate always says that my cluster.conf is > invalid, and the TTL (as reported by tcpdump) is still zero. Is it possible > that my cman is out of date? I'm using version 3.0.12. Can you tell me which > is the eariest version that has this feature? > > Thanks, Looking in the cluster.rng file (the one used to validate cluster.conf), '' should be valid. What version of cman are you using? -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "omg my singularity battery is dead again. stupid hawking radiation." - epitron From szekelyi at niif.hu Thu Nov 3 19:17:46 2011 From: szekelyi at niif.hu (=?ISO-8859-1?Q?Sz=E9kelyi?= Szabolcs) Date: Thu, 03 Nov 2011 20:17:46 +0100 Subject: [Linux-cluster] Running a cluster on routed networks In-Reply-To: <4EB2CF75.60904@alteeve.com> References: <19707654.WTfKzLJxNB@mranderson> <9710609.JpD2iA48WT@mranderson> <4EB2CF75.60904@alteeve.com> Message-ID: <1864491.GxLbRM0kJ3@mranderson> On 2011. November 3. 13:29:25 Digimer wrote: > On 11/03/2011 01:12 PM, Sz?kelyi Szabolcs wrote: > > On 2011. November 2. 21:00:18 Sz?kelyi Szabolcs wrote: > >> how can I run a cluster on a network where nodes are on different > >> subnets? Currently the main problem is that heartbeats are sent with > >> their IP level TTL set to 1, which keeps them from reaching the other > >> nodes. How can I change this? I'm using multicasting. > > > > OK, I've found this: https://bugzilla.redhat.com/show_bug.cgi?id=640311 > > , > > saying that it's now possible to set the TTL for multicast. But I > > haven't > > found any info on *how* to set it. [...] 
> > But whatever I do, ccs_config_validate always says that my cluster.conf > > is invalid, and the TTL (as reported by tcpdump) is still zero. Is it > > possible that my cman is out of date? I'm using version 3.0.12. Can you > > tell me which is the eariest version that has this feature? > > > > Thanks, > > Looking in the cluster.rng file (the one used to validate cluster.conf), > '' should be valid. What version of cman are you using? If I add the ttl="8" attribute to in cluster.conf, it fails to validate according to ccs_config_validate. Without this attribute it validates. I've grepped cluster.rng for "ttl", but found nothing sensible. It looks like it's missing from my cluster.rng. My cman's version is 3.0.12: $ sudo cman_tool -V cman_tool 3.0.12 (built Jul 2 2010 09:55:13) The cluster starts with the attibute (it issues a warning), but the TTL is still 1. I've already upgraded corosync to support TTL adjustment, but it looks like I just have a problem to push it through cman. Thanks, -- cc From jpokorny at redhat.com Thu Nov 3 19:29:50 2011 From: jpokorny at redhat.com (Jan Pokorny) Date: Thu, 03 Nov 2011 20:29:50 +0100 Subject: [Linux-cluster] Issue with Conga on RHEL 5.7 In-Reply-To: References: Message-ID: <4EB2EBAE.9010801@redhat.com> Hello, On 11/02/2011 09:42 AM, BONNETOT Jean-Daniel (EXT THALES) wrote: > Hello, > > We are many with same problem. Since RHEL 5.7, packages installation don't work from ricci. > For now, our solution is to install packages manually without using "groupinstall" command. > > yum install cman rgmanager qdiskd ... yes, as I mentioned in the (late-coming, I admit) reply [1] to already posted observation [2], this is a known issue with the fixes (more releases affected) being almost out of the door -- updated packages for RHEL 5.7 should be already available [3]. > Regards, > > Jean-Daniel BONNETOT > Ing?nierie Syst?me Aix& Linux > > De : linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] De la part de W. E. W. Russell > Envoy? : mardi 1 novembre 2011 21:49 > ? : linux-cluster at redhat.com > Cc : Gerardo Laracuente > Objet : [Linux-cluster] Issue with Conga on RHEL 5.7 > > My name is William Russell and I'm new on this list. > > I'm having an issue that I think is really simple, but I can't even figure out where the problem is located. > > I have installed 'luci' and 'ricci' on my main server and 'ricci' on my other two servers. All servers are running the latest RHEL 5.7 with all the yum updates (RHN registered with the Clustering entitlements). > When I create a new cluster, I get to the progress screen and that's where everything falls apart. It just sits there for hours! It never creates the cluster. The dot for "Install" NEVER fills in. After much research, I understand what 'luci' is trying to do in terms of installing the packages necessary to configure, manage, and run the cluster, but I have done the 'yum groupinstall clustering' on another sever and it took 10 mins, if that. > > Communication between the servers has been verified - iptables is not running, selinux is disabled. > > If you need more information on the issue, feel free to ask. I looked at the syslog and see no failures or errors. If anyone can even point me in the direction of what might be causing the problem, it would be much appreciated. > Also, as I mentioned in [1], while RHEL 5.7 exhibits the issue reliably, we would appreciate details about reproducers with 5.6 and especially with 6.x. 
[1] https://www.redhat.com/archives/linux-cluster/2011-October/msg00058.html [2] https://www.redhat.com/archives/linux-cluster/2011-September/msg00020.html [3] http://rhn.redhat.com/errata/RHBA-2011-1421.html Thanks, Jan From jochen.schneider at gmail.com Fri Nov 4 09:03:52 2011 From: jochen.schneider at gmail.com (Jochen Schneider) Date: Fri, 4 Nov 2011 10:03:52 +0100 Subject: [Linux-cluster] Failover after partial failure because of SAN? Message-ID: Hi, We are setting up a cluster for a storage application with SAN disks managed through HA-LVM and connected through multipath. There are actually two applications which have to run on the same node, but only one of them needs the disk. Both of them have clients. The question I have is what should happen when the SAN fails: Should both applications failover to another machine (possibly after a retry) or should the application which doesn't need the disk keep running while the other is shut down? I'm not sure how much recovery can come out of a failover in case of a SAN failure, if it's not both network cards of the node which are defective or whatever. Thanks, Jochen -------------- next part -------------- An HTML attachment was scrubbed... URL: From list at fajar.net Fri Nov 4 10:04:57 2011 From: list at fajar.net (Fajar A. Nugraha) Date: Fri, 4 Nov 2011 17:04:57 +0700 Subject: [Linux-cluster] Failover after partial failure because of SAN? In-Reply-To: References: Message-ID: On Fri, Nov 4, 2011 at 4:03 PM, Jochen Schneider wrote: > Hi, > > We are setting up a cluster for a storage application with SAN disks managed > through HA-LVM and connected through multipath. There are actually two > applications which have to run on the same node, HAVE to run on the same node? Why? Can't they communicate via TCP/IP? > but only one of them needs > the disk. Both of them have clients. > > The question I have is what should happen when the SAN fails: Should both > applications failover to another machine (possibly after a retry) or should > the application which doesn't need the disk keep running while the other is > shut down? You're not giving yourself much option. Since you say both application HAVE to run on the same node, I assume both are related (e.g. one needs the other). In that case, the only viable option is to failover. Having said that, I'm curious what do you mean by "SAN fails". It's rare for a cluster node to be suddenly unable to access a node while the other can access it just fine. Usually it's either the SAN unaccessible completely (e.g. broken SAN or switches) or a server node fails. > I'm not sure how much recovery can come out of a failover in case > of a SAN failure, if it's not both network cards of the node which are > defective or whatever. Exactly :) If no node can access the SAN, then it can't failover anywhere. -- Fajar From jochen.schneider at gmail.com Fri Nov 4 11:11:31 2011 From: jochen.schneider at gmail.com (Jochen Schneider) Date: Fri, 4 Nov 2011 12:11:31 +0100 Subject: [Linux-cluster] Failover after partial failure because of SAN? In-Reply-To: References: Message-ID: On Fri, Nov 4, 2011 at 11:04 AM, Fajar A. Nugraha wrote: > > On Fri, Nov 4, 2011 at 4:03 PM, Jochen Schneider > wrote: > > Hi, > > > > We are setting up a cluster for a storage application with SAN disks managed > > through HA-LVM and connected through multipath. There are actually two > > applications which have to run on the same node, > > HAVE to run on the same node? Why? Can't they communicate via TCP/IP? 
They are already communicating via TCP/IP, so they could be running on different nodes, you are right. But they are working in pairs so they shouldn't be like randomly distributed over the nodes. Also, we would have to see what the performance impact would be to have them on different nodes. > > but only one of them needs > > the disk. Both of them have clients. > > > > The question I have is what should happen when the SAN fails: Should both > > applications failover to another machine (possibly after a retry) or should > > the application which doesn't need the disk keep running while the other is > > shut down? > > You're not giving yourself much option. Since you say both application > HAVE to run on the same node, I assume both are related (e.g. one > needs the other). In that case, the only viable option is to failover. The one application not needing disk access can run without the other so in case of SAN failure there could be a degraded mode where only the first is serving its clients and the other is down. > Having said that, I'm curious what do you mean by "SAN fails". It's > rare for a cluster node to be suddenly unable to access a node while > the other can access it just fine. Usually it's either the SAN > inaccessible completely (e.g. broken SAN or switches) or a server node > fails. I'm am not sure, actually. I don't have any practical data points of a "real" SAN failure, only one due to misconfiguration. That's why I find it hard to decide on our configuration, I'm not sure about possible failures, dependencies between them and (even rough) probability estimates. (Has anybody ever come across a document addressing that, maybe as failure assumptions behind a clustering package and its configuration?) > > I'm not sure how much recovery can come out of a failover in case > > of a SAN failure, if it's not both network cards of the node which are > > defective or whatever. > > Exactly :) > > If no node can access the SAN, then it can't failover anywhere. If it is more likely that SAN access fails on the SAN side than on the node side, I guess that would mean it would be better to keep the application not needing the SAN running, i.e., not failing over. Or maybe failover should be tried once and then my service should go in the degraded mode described above? I'm not sure whether that is possible. > -- > Fajar Thanks! Jochen From list at fajar.net Fri Nov 4 13:25:58 2011 From: list at fajar.net (Fajar A. Nugraha) Date: Fri, 4 Nov 2011 20:25:58 +0700 Subject: [Linux-cluster] Failover after partial failure because of SAN? In-Reply-To: References: Message-ID: On Fri, Nov 4, 2011 at 6:11 PM, Jochen Schneider wrote: >> > I'm not sure how much recovery can come out of a failover in case >> > of a SAN failure, if it's not both network cards of the node which are >> > defective or whatever. >> >> Exactly :) >> >> If no node can access the SAN, then it can't failover anywhere. > > If it is more likely that SAN access fails on the SAN side than on the > node side, I guess that would mean it would be better to keep the > application not needing the SAN running, i.e., not failing over. Or > maybe failover should be tried once and then my service should go in > the degraded mode described above? I'm not sure whether that is > possible. I recommend you just keep it simple: treat the two applications differently. Don't put any dependcy between them. Period. That way when a node dies, they will be migrated to other nodes. 
If the SAN dies, the one that doesn't need external disk will still work just fine, while the one that needs it will be marked as dead (I assume you have some kind of monitoring script for this already). Then the dead one will try to either restart or be moved to another node, and if the SAN is also not available there it will simply die. -- Fajar
From raju.rajsand at gmail.com Fri Nov 4 17:00:00 2011 From: raju.rajsand at gmail.com (Rajagopal Swaminathan) Date: Fri, 4 Nov 2011 22:30:00 +0530 Subject: [Linux-cluster] Failover after partial failure because of SAN? In-Reply-To: References: Message-ID: Greetings, On Fri, Nov 4, 2011 at 2:33 PM, Jochen Schneider wrote: > Hi, > > The question I have is what should happen when the SAN fails: You should be looking at SAN replication solutions if it fits your budget. If you want alternatives, have a look at DRBD for local storage redundancy. I can't see any redundancy in your setup for the storage, which remains a single point of failure. Red Hat and VMware have been shouting from the rooftops about storage virtualization. Have a look at that. And oh, don't forget offsite DR and BCP (business process continuity) if your application is mission critical. -- Regards, Rajagopal
From rossnick-lists at cybercat.ca Fri Nov 4 18:05:34 2011 From: rossnick-lists at cybercat.ca (Nicolas Ross) Date: Fri, 4 Nov 2011 14:05:34 -0400 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement Message-ID: <57026C2F7E1748F1A63F624ED8087C1B@versa> Hi ! We are currently using RHEL 6.1 with GFS2 file systems on top of fiber-channel storage for our cluster. All filesystems are in LVs, managed with clvmd. Our services are divided into directories. For example: /GFSVolume1/Service1, /GFSVolume1/Service2, and so forth. Almost everything the service needs to run is under those directories (apache, php executables, website data, java servers, etc). On some services, there are document directories that are huge, not that much in size (about 35 gigs), but in number of files, around one million. One service even has 3 data directories with that many files each. It works pretty well for now, but when it comes to data update (via rsync) and backup (also via rsync), the node doing the rsync crawls to a stop, all 16 logical cores are used at 100% system, and it sometimes freezes the file system for other services on other nodes. We've recently changed the way the rsync is done: we just do an rsync -nv to see what files would be transferred and then transfer those files manually. But it's still sometimes too much for the GFS. In our case, nfs is not an option; there are a lot of is_file calls that access this directory structure all the time, and the added latency of nfs is not viable. So, I'm thinking of putting each of those directories into its own ext4 filesystem of about 100 gigs to speed up all of those processes. Where those huge directories are used, they are used by one service and one service alone. So, I would define a file system in cluster.conf, something like the rough fs line sketched a bit further down in this message. So, first, is this doable ? Second, is this risky ? In the sense that, with force_unmount true, I assume that no other node would mount that filesystem before it is unmounted on the stopping service. I know that for some reason umount could hang, but it's not likely since this data is mostly read-only. In that case the service would be failed and need to be manually restarted. What would be the consequence if the filesystem happens to be mounted on 2 nodes ?
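Roughly, the fs line I have in mind is the following (the device path is only a placeholder, the quota=off option is simply carried over from our current GFS2 mounts and may not apply to ext4, and the attribute names should be double-checked against the fs.sh resource agent):

  <fs name="documentsA" device="/dev/vg_example/documentsA" fstype="ext4" mountpoint="/GFSVolume1/Service1/documentsA" force_unmount="1" options="noatime,quota=off"/>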
One could add self_fence="1" to the fs line, so that even if it fails, it will self-fence the node to force the umount. But I'm not there yet. Third, I've been told that it's not recommended to mount a file system like this "on top" of another clustered fs. Why is that ? I suppose I'll have to mount under /mnt/something and symlink to that. Thanks for any insights. From rossnick-lists at cybercat.ca Fri Nov 4 18:05:47 2011 From: rossnick-lists at cybercat.ca (Nicolas Ross) Date: Fri, 4 Nov 2011 14:05:47 -0400 Subject: [Linux-cluster] Corosync goes cpu to 95-99% References: <4DD29D03.9080901@gmail.com> <4DD2BAC3.50509@redhat.com> <4DD2BD7D.5070704@gmail.com> <4DD2CA90.6090802@redhat.com> <3B50BA7445114813AE429BEE51A2BA52@versa> <4DD78908.2030801@gmail.com> <0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com> <4DD873C7.8080402@cybercat.ca> <22E7D11CD5E64E338A66811F31F06238@versa> <4DE545D7.1080703@redhat.com> <4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca> <4DE75602.1000408@gmail.com><51BB988BCCF547E69BF222BDAF34C4DE@versa><4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com><4E2D7329.6050607@redhat.com> <4E2D7425.4070801@gmail.com><4E2D8ECB.6020305@redhat.com> <4E2D8F87.30508@gmail.com><4E2D940B.5020803@redhat.com> <4E73073D.8010209@gmail.com> Message-ID: <16366A53AA0D47A7A935FD7FE920D462@versa> >> get a support signoff. Also the corosync updates have not finished >> through our validation process. Only hot fixes (from support) are >> available >> >> Regards >> -steve >> > > Sorry to re-open this thread ... But exists any news about this problem?? In fact, there is ! It appears that this situation is within the microcode of some specific xeon "nahalem" (sorry for the spelling) processors... It has to do with switching cstate and the way rhel6.1 now switch state that was not done in 6.0. You can look at bugzilla # 710265 and kb docs # 61105. Our temporary fix for the moment was to disable cstate transition by adding : intel_idle.max_cstate=0 processor.max_cstate=1 to the kernel line in grub.conf, update and reboot. We hadn't had any cpu spikes on any of the 5 nodes we've updated yet. The 3 remaining still haven't been updated due to production downtime. Get a support signoff for this, I'm in no way endorsing this solution, as I can't know if you're in the same situation as mine. Have fun ! From rossnick-lists at cybercat.ca Fri Nov 4 18:11:45 2011 From: rossnick-lists at cybercat.ca (Nicolas Ross) Date: Fri, 4 Nov 2011 14:11:45 -0400 Subject: [Linux-cluster] When corosync-1.4.1-3.el6 will be released for rhel6.x? References: <4E804353.1040605@gmail.com> <4E80548D.1070904@redhat.com> <4E805928.3020009@gmail.com> <4E80633E.8020409@redhat.com><4E806843.6060202@gmail.com> <4E807B5B.5000606@redhat.com> Message-ID: <62E11F24A0C846B9A39FC4AC1E08B209@versa> >> But problem described in 709758 appears in my enviroment: One RHEL6.1 > > Please contact GSS (Global Support Service). They can help you to: > - Check if your configuration is valid > - Check if architecture is valid > - Give you "not yet" released package and/or hot fix > - Propose backport to Z-stream for given bug > > -> Basically everything what you are/will pay them for. You might read : https://www.redhat.com/archives/linux-cluster/2011-November/msg00027.html for a temp fix. 
Regards, From carlopmart at gmail.com Fri Nov 4 18:20:43 2011 From: carlopmart at gmail.com (carlopmart) Date: Fri, 04 Nov 2011 19:20:43 +0100 Subject: [Linux-cluster] Corosync goes cpu to 95-99% In-Reply-To: <16366A53AA0D47A7A935FD7FE920D462@versa> References: <4DD29D03.9080901@gmail.com> <4DD2BAC3.50509@redhat.com> <4DD2BD7D.5070704@gmail.com> <4DD2CA90.6090802@redhat.com> <3B50BA7445114813AE429BEE51A2BA52@versa> <4DD78908.2030801@gmail.com> <0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com> <4DD873C7.8080402@cybercat.ca> <22E7D11CD5E64E338A66811F31F06238@versa> <4DE545D7.1080703@redhat.com> <4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca> <4DE75602.1000408@gmail.com><51BB988BCCF547E69BF222BDAF34C4DE@versa><4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com><4E2D7329.6050607@redhat.com> <4E2D7425.4070801@gmail.com><4E2D8ECB.6020305@redhat.com> <4E2D8F87.30508@gmail.com><4E2D940B.5020803@redhat.com> <4E73073D.8010209@gmail.com> <16366A53AA0D47A7A935FD7FE920D462@versa> Message-ID: <4EB42CFB.7070908@gmail.com> On 11/04/2011 07:05 PM, Nicolas Ross wrote: >>> get a support signoff. Also the corosync updates have not finished >>> through our validation process. Only hot fixes (from support) are >>> available >>> >>> Regards >>> -steve >>> >> >> Sorry to re-open this thread ... But exists any news about this problem?? > > In fact, there is ! > > It appears that this situation is within the microcode of some specific > xeon "nahalem" (sorry for the spelling) processors... It has to do with > switching cstate and the way rhel6.1 now switch state that was not done > in 6.0. > > You can look at bugzilla # 710265 and kb docs # 61105. > > Our temporary fix for the moment was to disable cstate transition by > adding : > > intel_idle.max_cstate=0 processor.max_cstate=1 > > to the kernel line in grub.conf, update and reboot. We hadn't had any > cpu spikes on any of the 5 nodes we've updated yet. The 3 remaining > still haven't been updated due to production downtime. > > Get a support signoff for this, I'm in no way endorsing this solution, > as I can't know if you're in the same situation as mine. > good!! ... But this problem appears in AMD Opteron QuadCore, too ... At least in my installation .. -- CL Martinez carlopmart {at} gmail {d0t} com From Colin.Simpson at iongeo.com Fri Nov 4 18:50:17 2011 From: Colin.Simpson at iongeo.com (Colin Simpson) Date: Fri, 04 Nov 2011 18:50:17 +0000 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <57026C2F7E1748F1A63F624ED8087C1B@versa> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> Message-ID: <1320432617.20017.104.camel@bhac.iouk.ioroot.tld> On Fri, 2011-11-04 at 14:05 -0400, Nicolas Ross wrote: > Hi ! > > So, I'm thinking of putting each of thos directories into a single ext4 > filesystem of about 100 gigs to speed up all of those process. Where those > huge directories are used, they are used by one service and one service > alone. So, I would do a file system in cluster.conf, something like : > > mountpoint="/GFSVolume1/Service1/documentsA" name="documentsA" > options="noatime,quota=off"/> > > So, first, is this doable ? Yup, I have had some tasks that have needed to switch over to ext4 from GFS2 (I'm on RHEL6 in case any of this makes a difference). I'm using, I fully let cluster.conf manage ext4 and have no mention of it in fstab. > > Second, is this risky ? 
In the sens of that with force_unmont true, I assume > that no other node would mount that filesystem before it is unmounted on the > stopping service. I know that for some reason umount could hang, but it's > not likely since this data is mostly read-only. In that case the service > would be failed and need to be manually restarted. What would be the > consequence if the filesystem happens to be mounted on 2 nodes ? The failing to umount mostly happens to me because I NFS export this file system. Now in theory the cluster should take care of this by freeing the NFS lockd's, but doesn't always happen for me. But you are probably in a better position as it doesn't sound like you are doing NFS on this. I've never seen it fail when NFS isn't involved on a fs. If the filesystems fails to umount, the service gets marked as failed, so won't start on another node (and so won't mount on another node). However badness will happen if you manually disable the service and reenable it on another node. The other node will assume the filesystem isn't mounted anywhere else and mount it itself. The "solution", of course, is to check any failed service to see where it was last running on and make sure it's dependant fs's are umounted from that node, before disabling it and bringing it back up. There was a resource agent patch floating around somewhere (that didn't make it in so far) that would (as I remember) lock the clvmd to prevent double mounting of non-clustered fs's. But I guess most people are using GFS2 so not really a priority. As you say below a failing to umount can be tackled by a self_fence, but I haven't needed to go there yet. Also depending on how quickly you need the service back, quick_status and force_fsck will have to be set accordingly. I wanted the paranoia of checking for a good file system, others may want faster start times. > > One could add self_fence="1" to the fs line, so that even if it fails, it > will self-fence the node to force the umount. But I'm not there yet. > > Third, I've been told that it's not recommended to mount a file system like > this "on top" of another clustered fs. Why is that ? I suppose I'll have to > mount under /mnt/something and symlink to that. > Don't know on this. Maybe due to extra dependency issues that might effect operations (or maybe just not thoroughly tested in GFS2, as not a priority). > Thanks for any insights. Hopefully Thanks Colin > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed. If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original. 
From rossnick-lists at cybercat.ca Fri Nov 4 19:40:11 2011 From: rossnick-lists at cybercat.ca (Nicolas Ross) Date: Fri, 4 Nov 2011 15:40:11 -0400 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <1320432617.20017.104.camel@bhac.iouk.ioroot.tld> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> <1320432617.20017.104.camel@bhac.iouk.ioroot.tld> Message-ID: <0CF2B72C-5D44-4B80-B212-83F43C24A3BC@cybercat.ca> > Also depending on how quickly you need the service back, quick_status > and force_fsck will have to be set accordingly. I wanted the paranoia of > checking for a good file system, others may want faster start times. Thanks for the rsponse, I will go ahead and do some tests... What is the quick_status setting ? I haven't seen it in the doc ? From Colin.Simpson at iongeo.com Fri Nov 4 19:50:34 2011 From: Colin.Simpson at iongeo.com (Colin Simpson) Date: Fri, 04 Nov 2011 19:50:34 +0000 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <0CF2B72C-5D44-4B80-B212-83F43C24A3BC@cybercat.ca> References: <57026C2F7E1748F1A63F624ED8087C1B@versa><1320432617.20017.104.camel@bhac.iouk.ioroot.tld> <0CF2B72C-5D44-4B80-B212-83F43C24A3BC@cybercat.ca> Message-ID: <1320436234.20017.112.camel@bhac.iouk.ioroot.tld> On Fri, 2011-11-04 at 19:40 +0000, Nicolas Ross wrote: > > > Also depending on how quickly you need the service back, > quick_status > > and force_fsck will have to be set accordingly. I wanted the > paranoia of > > checking for a good file system, others may want faster start times. > > Thanks for the rsponse, I will go ahead and do some tests... > > What is the quick_status setting ? I haven't seen it in the doc ? > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > According to the comment in the resource script: Use quick status checks. When set to 0 (the default), this agent behaves normally. When set to 1, this agent will not log errors incurred or perform the file system accessibility check (e.g. it will not try to read from/write to the file system). You should only set this to 1 if you have lots of file systems on your cluster or you are seeing very high load spikes as a direct result of this agent. I'm guessing that if the checking of the filesystem is causing high load you can disable this checking (presumably probes the filesystem periodically). My reading is you really want 0 unless you are seeing an issue. Thanks Colin This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed. If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original. From bohdan at harazd.net Fri Nov 4 21:35:44 2011 From: bohdan at harazd.net (Bohdan Sydor) Date: Fri, 4 Nov 2011 22:35:44 +0100 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <57026C2F7E1748F1A63F624ED8087C1B@versa> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> Message-ID: Hi, On Fri, Nov 4, 2011 at 7:05 PM, Nicolas Ross wrote: > On some services, there are document directories that are huge, not that > much in size (about 35 gigs), but in number of files, around one million. 
> One service even has 3 data directories with that many files each. > > It works pretty well for now, but when it comes to data update (via rsync) > and backup (also via rsync), the node doing the rsync crawls to a stop, all > 16 logical cores are used at 100% system, and it sometimes freezes the file > system for other services on other nodes. I have a 600GB GFS2 FS, and I resolved the issue with rsync that I run ionice -c3 rsync -av ... That way rsync is given the CPU for IO, if all other processes don't require IO. Of course it takes a lot of time to compete the sync, but if the time is not an issue, it can be a solution. -- Regards, Bohdan From rossnick-lists at cybercat.ca Fri Nov 4 22:42:00 2011 From: rossnick-lists at cybercat.ca (Nicolas Ross) Date: Fri, 4 Nov 2011 18:42:00 -0400 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: References: <57026C2F7E1748F1A63F624ED8087C1B@versa> Message-ID: <3E26DA09-2E25-493E-B64B-956A9535C97F@cybercat.ca> > I have a 600GB GFS2 FS, and I resolved the issue with rsync that I run > ionice -c3 rsync -av ... > That way rsync is given the CPU for IO, if all other processes don't > require IO. Of course it takes a lot of time to compete the sync, but > if the time is not an issue, it can be a solution. Oh, greet, I will try that asap! How much more time? I don't mind taking 3 or 4 hours instead of 2.5, but if it goes up to 5 or 6, I'll consider an ext4 fs... From bohdan at harazd.net Fri Nov 4 22:58:05 2011 From: bohdan at harazd.net (Bohdan Sydor) Date: Fri, 4 Nov 2011 23:58:05 +0100 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <3E26DA09-2E25-493E-B64B-956A9535C97F@cybercat.ca> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> <3E26DA09-2E25-493E-B64B-956A9535C97F@cybercat.ca> Message-ID: On Fri, Nov 4, 2011 at 11:42 PM, Nicolas Ross wrote: >> ionice -c3 rsync -av ... > Oh, greet, I will try that asap! > > How much more time? I don't mind taking 3 or 4 hours instead of 2.5, but if it goes up to 5 or 6, I'll consider an ext4 fs... I can't answer your question because it all depends on other IO operations that are running on your system with higher priority. You can also consider setting the nice value eg 19 for rsync processes. -- regards, Bohdan From rossnick-lists at cybercat.ca Sat Nov 5 15:38:31 2011 From: rossnick-lists at cybercat.ca (rossnick-lists at cybercat.ca) Date: Sat, 05 Nov 2011 11:38:31 -0400 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: References: <57026C2F7E1748F1A63F624ED8087C1B@versa> Message-ID: > I have a 600GB GFS2 FS, and I resolved the issue with > rsync that I run > ionice -c3 rsync -av ... > That way rsync is given the CPU for IO, if all other > processes don't > require IO. Of course it takes a lot of time to compete > the sync, but > if the time is not an issue, it can be a solution. I tried it on some directories, it seems that the peeks in cpu are still present, but it seems not to affect the other nodes as before. I'm not that sure, since it was not 100% of the time I saw impact on other nodes. I will keep that trick in mind... From bergman at merctech.com Sat Nov 5 18:17:09 2011 From: bergman at merctech.com (bergman at merctech.com) Date: Sat, 05 Nov 2011 14:17:09 -0400 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: Your message of "Fri, 04 Nov 2011 14:05:34 EDT." 
<57026C2F7E1748F1A63F624ED8087C1B@versa> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> Message-ID: <5424.1320517029@localhost> In the message dated: Fri, 04 Nov 2011 14:05:34 EDT, The pithy ruminations from "Nicolas Ross" on <[Linux-cluster] Ext3/ext4 in a clustered environement> were: => Hi ! => [SNIP!] => => On some services, there are document directories that are huge, not that => much in size (about 35 gigs), but in number of files, around one million. => One service even has 3 data directories with that many files each. Ouch. I've seen significant a performance drop with ext3 (and other) filesystems with 10s to 100s of thousands of files per directory. Make sure that the "directory hash" option is enabled for ext3. With ~1M files per directory, I'd do some performance tests comparing rsync under ext3, ext4, and gfs befor changing filesystems...while ext3/4 do perform better than gfs, the directory size may be such an overwhelming factor that the filesystem choice is irrelevent. => => It works pretty well for now, but when it comes to data update (via rsync) => and backup (also via rsync), the node doing the rsync crawls to a stop, all => 16 logical cores are used at 100% system, and it sometimes freezes the file => system for other services on other nodes. Ouch! => => We've changed recently the way the rsync is done, we just to a rsync -nv to => see what files would be transfered and transfer thos files manually. But => it's still too much sometimes for the gfs. Is this a GFS issue strictly, or an issue with rsync. Have you set up a similar environment under ext3/4 to test jus the rsync part? Rsync is known for being a memory & resource hog, particularly at the initial stage of building the filesystem tree. I would strongly recommend benchmarking rsync on ext3/4 before making the switch. One option would be to do several 'rsync' operations (serially, not in parallel!), each operating on a subset of the filesystem, while continuing to use gfs. [SNIP!] => => mountpoint="/GFSVolume1/Service1/documentsA" name="documentsA" => options="noatime,quota=off"/> => => So, first, is this doable ? Yes. We have been doing something very similar for the past ~2 years, except not mounting the ext3/4 partition under a GFS mountpoint. => => Second, is this risky ? In the sens of that with force_unmont true, I assume => that no other node would mount that filesystem before it is unmounted on the => stopping service. I know that for some reason umount could hang, but it's => not likely since this data is mostly read-only. In that case the service We've experienced numerous cases where the filesystem hangs after a service migration due a node (or service) failover. These hangs all seem to be related to quota or NFS issues, so this may not be an issue in your environment. => would be failed and need to be manually restarted. What would be the => consequence if the filesystem happens to be mounted on 2 nodes ? Most likely, filesystem corruption. => => One could add self_fence="1" to the fs line, so that even if it fails, it => will self-fence the node to force the umount. But I'm not there yet. We don't do that...and haven't felt the need to. => => Third, I've been told that it's not recommended to mount a file system like => this "on top" of another clustered fs. Why is that ? I suppose I'll have to First of all, that's introducing another dependency. 
If you mount the ext3/4 partition under a local directory (ie., /export), then you could have nodes that provide your rsync data service, without requiring GFS. => mount under /mnt/something and symlink to that. => => Thanks for any insights. => Mark From kkovachev at varna.net Mon Nov 7 10:13:52 2011 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Mon, 07 Nov 2011 12:13:52 +0200 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <57026C2F7E1748F1A63F624ED8087C1B@versa> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> Message-ID: <67914ccd78ddcf19d72a3d2302bf9298@mx.varna.net> Hi, On Fri, 4 Nov 2011 14:05:34 -0400, "Nicolas Ross" wrote: > Hi ! > > We are curently using RHEL 6.1 with GFS2 file systems on top of > fiber-channel stoarage for our cluster. All fs' are in lv's, with clvmd. > As they are LV's, you may try to make a snapshot and then mount it with lock_nolock - faster and won't interfere with other services. If you are not mounting it on dedicated backup node where the fs is not mounted, you may need to change the UUID and lock table to be able to mount the snapshot on the same machine. From swhiteho at redhat.com Mon Nov 7 10:54:30 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Mon, 07 Nov 2011 10:54:30 +0000 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <5424.1320517029@localhost> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> <5424.1320517029@localhost> Message-ID: <1320663270.2762.16.camel@menhir> Hi, On Sat, 2011-11-05 at 14:17 -0400, bergman at merctech.com wrote: > In the message dated: Fri, 04 Nov 2011 14:05:34 EDT, > The pithy ruminations from "Nicolas Ross" on > <[Linux-cluster] Ext3/ext4 in a clustered environement> were: > => Hi ! > => > > [SNIP!] > > => > => On some services, there are document directories that are huge, not that > => much in size (about 35 gigs), but in number of files, around one million. > => One service even has 3 data directories with that many files each. > > Ouch. > > I've seen significant a performance drop with ext3 (and other) filesystems > with 10s to 100s of thousands of files per directory. Make sure that the > "directory hash" option is enabled for ext3. With ~1M files per directory, I'd > do some performance tests comparing rsync under ext3, ext4, and gfs befor > changing filesystems...while ext3/4 do perform better than gfs, the directory > size may be such an overwhelming factor that the filesystem choice is > irrelevent. > There are really two issues here, one is the performance of readdir and listing the directory and the other is the performance of look ups of individual inodes. Turning on the hashing option for ext3 will improve the look up performance, but make next to no different to the readdir performance. GFS2 has had hashed directories, inherited from GFS, so on the look up side of things, both should be fairly similar. One issue though is that GFS2 will return the directory entries from readdir in hash order. That is due to a restriction imposed by the unfortunate combination of the Linux VFS readdir code and the GFS2 algorithm for expanding the directory hash table when it fills up. Ideally, one would sort the returned entries into inode number order before beginning the look ups of the individual inodes. I don't know if rsync does this, or whether it is an option that can be turned on. It should make a difference though. Also, being able to look up multiple inodes in parallel should also dramatically improve the speed, if this is possible. 
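As a rough illustration of the ordering idea above (the path is a placeholder, and whether rsync itself can be driven this way is exactly the open question):

   # Emit the directory in its natural (hash) order together with the
   # inode numbers, then sort by inode number before any per-file work:
   ls -fi /GFSVolume1/Service1/documentsA | sort -n | awk '{print $2}' \
       > /tmp/documentsA.inode-order

   # Any per-file pass (stat, checksum, copy) driven from this list now
   # visits the inodes in inode-number order rather than hash order:
   while read -r name; do
       stat -- "/GFSVolume1/Service1/documentsA/$name" > /dev/null
   done < /tmp/documentsA.inode-order

Note that "ls -f" also lists "." and "..", and the awk field split breaks on names containing spaces, so this only shows the principle of decoupling readdir order from lookup order.
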
> => > => It works pretty well for now, but when it comes to data update (via rsync) > => and backup (also via rsync), the node doing the rsync crawls to a stop, all > => 16 logical cores are used at 100% system, and it sometimes freezes the file > => system for other services on other nodes. > > Ouch! > So the question is what is using all this cpu time? Is this being used by rsync, or by some of the gfs2/dlm system daemons or even by some other threads? > => > => We've changed recently the way the rsync is done, we just to a rsync -nv to > => see what files would be transfered and transfer thos files manually. But > => it's still too much sometimes for the gfs. > > Is this a GFS issue strictly, or an issue with rsync. Have you set up a > similar environment under ext3/4 to test jus the rsync part? Rsync is > known for being a memory & resource hog, particularly at the initial > stage of building the filesystem tree. > > I would strongly recommend benchmarking rsync on ext3/4 before making the > switch. > > One option would be to do several 'rsync' operations (serially, not in > parallel!), each operating on a subset of the filesystem, while continuing > to use gfs. > > I agree that we don't have enough information yet to make a judgement on where the problem lies. It may well be something that can be resolved by making some alterations in the way that rsync is done, Steve. From ajb2 at mssl.ucl.ac.uk Mon Nov 7 11:43:29 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Mon, 07 Nov 2011 11:43:29 +0000 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <57026C2F7E1748F1A63F624ED8087C1B@versa> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> Message-ID: <4EB7C461.1010608@mssl.ucl.ac.uk> Nicolas Ross wrote: > On some services, there are document directories that are huge, not that > much in size (about 35 gigs), but in number of files, around one > million. One service even has 3 data directories with that many files each. You are utterly mad. Apart from the human readability aspects if someone attempts a directory listing, you're putting a substantial load on your system each time you attempt to go into those directories, even with dentry/inode caching tweaked out to maximums. Directory inode hashing helps, but not for filesystem abuse on this scale. Be glad you're using ext3/4 and not GFS, the problems are several orders of magnitude worse there (it can take 10 minutes to list a directory with 10,000 files in it, let alone 1,000,000) > It works pretty well for now, but when it comes to data update (via > rsync) and backup (also via rsync), the node doing the rsync crawls to a > stop, all 16 logical cores are used at 100% system, and it sometimes > freezes the file system for other services on other nodes. That's not particularly surprising - and a fairly solid hint you should be revisiting the way you lay out your files. If you go for a hierarchical layout you'll see several orders of magnitude speedup in access time without any real effort at all. If you absolutely must put that many files in a directory, then use a filesystem able to cope with such activities. Ext3/4 aren't it. 
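A hierarchical layout of the sort suggested here usually just means fanning the documents out over one or two levels of subdirectories keyed on a hash or prefix of the file name. A minimal sketch, with invented paths, for splitting an existing flat directory into 256 buckets:

   # Move each file into a subdirectory named after the first two hex
   # digits of an md5 of its name:
   cd /GFSVolume1/Service1/documentsA
   for f in *; do
       [ -f "$f" ] || continue
       d=$(printf '%s' "$f" | md5sum | cut -c1-2)
       mkdir -p "$d"
       mv -- "$f" "$d/"
   done

The application then derives the same two characters when it opens a document, so no single directory ever has to hold more than a few thousand entries.
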
From xavier.montagutelli at unilim.fr Mon Nov 7 12:57:35 2011 From: xavier.montagutelli at unilim.fr (Xavier Montagutelli) Date: Mon, 7 Nov 2011 13:57:35 +0100 Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <67914ccd78ddcf19d72a3d2302bf9298@mx.varna.net> References: <57026C2F7E1748F1A63F624ED8087C1B@versa> <67914ccd78ddcf19d72a3d2302bf9298@mx.varna.net> Message-ID: <201111071357.35844.xavier.montagutelli@unilim.fr> On Monday 07 November 2011 11:13:52 Kaloyan Kovachev wrote: > Hi, > > On Fri, 4 Nov 2011 14:05:34 -0400, "Nicolas Ross" > > wrote: > > Hi ! > > > > We are curently using RHEL 6.1 with GFS2 file systems on top of > > fiber-channel stoarage for our cluster. All fs' are in lv's, with clvmd. > > As they are LV's, you may try to make a snapshot Is it possible to make snapshots in a *cluster* LVM environment ? Last time I read the manual it was not possible. Oops, possible starting with RH 6.1, okay ... http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html- single/Logical_Volume_Manager_Administration/index.html#snapshot_command > and then mount it with > lock_nolock - faster and won't interfere with other services. If you can make snapshots, I agree, it is a good solution to mount a snapshot on a dedicated node. But perhaps this can also be done at the storage level, one layer deeper ? > If you are > not mounting it on dedicated backup node where the fs is not mounted, you > may need to change the UUID and lock table to be able to mount the snapshot > on the same machine. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- Xavier Montagutelli http://twitter.com/#!/XMontagutelli Service Commun Informatique - Universite de Limoges 123, avenue Albert Thomas - 87060 Limoges cedex Tel : +33 (0)5 55 45 77 20 / Fax : +33 (0)5 55 45 75 95 From rpeterso at redhat.com Mon Nov 7 13:35:47 2011 From: rpeterso at redhat.com (Bob Peterson) Date: Mon, 07 Nov 2011 08:35:47 -0500 (EST) Subject: [Linux-cluster] Ext3/ext4 in a clustered environement In-Reply-To: <201111071357.35844.xavier.montagutelli@unilim.fr> Message-ID: ----- Original Message ----- | On Monday 07 November 2011 11:13:52 Kaloyan Kovachev wrote: | > Hi, | > | > On Fri, 4 Nov 2011 14:05:34 -0400, "Nicolas Ross" | > | > wrote: | > > Hi ! | > > | > > We are curently using RHEL 6.1 with GFS2 file systems on top of | > > fiber-channel stoarage for our cluster. All fs' are in lv's, with | > > clvmd. | > | > As they are LV's, you may try to make a snapshot | | Is it possible to make snapshots in a *cluster* LVM environment ? | Last time I | read the manual it was not possible. I highly suspect Nick was talking about _hardware_ snapshotting that is supported by some SANs, _not_ our clustered snapshot software. Regards, Bob Peterson Red Hat File Systems From Nicholas.Geovanis at uscellular.com Mon Nov 7 17:30:48 2011 From: Nicholas.Geovanis at uscellular.com (Geovanis, Nicholas) Date: Mon, 7 Nov 2011 11:30:48 -0600 Subject: [Linux-cluster] NTP sync cause CNAM shutdown Message-ID: I can't find from where I leaned this "trick", but if you look at the stock RH 5.6 startup script for ntpd you'll see it: If you put an IP address (not hostname but numeric address) in the file /etc/ntp/step-tickers, the startup script takes that to mean the following: "Run ntpdate against the server(s) in /etc/ntp/step-tickers before you establish yourself as ntp client". 
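With the stock init script described here, that just means something like the following (the address is a placeholder for your own NTP server):

   echo "192.0.2.10" > /etc/ntp/step-tickers

   # per the behaviour described above, "service ntpd start" will now run
   # ntpdate against that address to step the clock before ntpd comes up
   # as a normal client
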
I point it at my very same ntp server (just by address, name resolution isn't necessarily up yet). This way the local clock gets "normalised" before it really tries to properly sync via ntpd and that subsequent time sync isn't problematic. More importantly, in one datacenter I have clusters serving GFS2 which take so long to establish client-server with the NTP servers that they'll startup inquorate almost every time _without_ doing this. Nick Geovanis US Cellular/Kforce Inc v. 708-674-4924 e. Nicholas.Geovanis at uscellular.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ufimtseva at gmail.com Mon Nov 7 19:05:52 2011 From: ufimtseva at gmail.com (Elena Ufimtseva) Date: Mon, 7 Nov 2011 14:05:52 -0500 Subject: [Linux-cluster] fence_ilo question Message-ID: Hello All Anyone knows what is the latest version of fence_ilo or if fence_ilo (ILo3) should support timeout parameter? I try connecting to ILO (its hp ilo v3) manually and it works fine. But fencing does not work in cluster. Checking fence_ilo -l admin -p password -o status -a 172.28.84.33 Unable to connect/login to fencing device fence_ilo -V 2.0.115 (built Wed Aug 5 08:25:06 EDT 2009) Copyright (C) Red Hat, Inc. 2004 All rights reserved. in strace output it looks like a timeout: ioctl(3, TIOCGPTN, [6]) = 0 stat("/dev/pts/6", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 6), ...}) = 0 statfs("/dev/pts/6", {f_type="DEVPTS_SUPER_MAGIC", f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, ioctl(3, TIOCSPTLCK, [0]) = 0 ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0 ioctl(3, TIOCGPTN, [6]) = 0 stat("/dev/pts/6", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 6), ...}) = 0 open("/dev/pts/6", O_RDWR|O_NOCTTY) = 4 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2acc82a54020) = 3120 close(4) = 0 select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout) write(3, "\r\n", 23) = 23 wait4(3120, 0x7fffd7c58474, WNOHANG, NULL) = 0 wait4(3120, 0x7fffd7c58474, WNOHANG, NULL) = 0 select(4, [3], [], [], {10, 0}) = 1 (in [3], left {10, 0}) read(3, "\r\n\r\n", 2000) = 25 select(0, NULL, NULL, NULL, {0, 100}) = 0 (Timeout) wait4(3120, 0x7fffd7c58474, WNOHANG, NULL) = 0 wait4(3120, 0x7fffd7c58474, WNOHANG, NULL) = 0 select(4, [3], [], [], {9, 997862}) = 1 (in [3], left {6, 413000}) read(3, "HTTP/1.1 405 Method Not Allowed\r"..., 2000) = 132 select(0, NULL, NULL, NULL, {0, 100}) = 0 (Timeout) wait4(3120, 0x7fffd7c58474, WNOHANG, NULL) = 0 wait4(3120, 0x7fffd7c58474, WNOHANG, NULL) = 0 select(4, [3], [], [], {6, 410183}) = 1 (in [3], left {6, 365000}) --- SIGCHLD (Child exited) @ 0 (0) --- read(3, 0x1108faa4, 2000) = -1 EIO (Input/output error) write(2, "Unable to connect/login to fenci"..., 42Unable to connect/login to fencing device ) = 42 close(3) = 0 select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout) wait4(3120, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3120 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x39ec40e7c0}, {0x39fdebc330, [], SA_RESTORER, 0x39ec40e7c0}, 8) = 0 That makes me think, that the default time out should be modified, but this version of fence_ilo doesn't have timeout option. Does anyone knows if there is another version and if there is, where to get it. Thanks. -- Elena -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From linux at alteeve.com Mon Nov 7 19:12:03 2011 From: linux at alteeve.com (Digimer) Date: Mon, 07 Nov 2011 14:12:03 -0500 Subject: [Linux-cluster] fence_ilo question In-Reply-To: References: Message-ID: <4EB82D83.4020301@alteeve.com> On 11/07/2011 02:05 PM, Elena Ufimtseva wrote: > Hello All > > Anyone knows what is the latest version of fence_ilo or if fence_ilo > (ILo3) should support timeout parameter? I try connecting to > ILO (its hp ilo v3) manually and it works fine. But fencing does not > work in cluster. > > Checking > > fence_ilo -l admin -p password -o status -a 172.28.84.33 > Unable to connect/login to fencing device > > fence_ilo -V > 2.0.115 (built Wed Aug 5 08:25:06 EDT 2009) Copyright (C) Red Hat, Inc. > 2004 All rights reserved. > > in strace output it looks like a timeout: > > ioctl(3, TIOCGPTN, [6]) = 0 stat("/dev/pts/6", {st_mode=S_IFCHR|0620, > st_rdev=makedev(136, 6), ...}) = 0 statfs("/dev/pts/6", > {f_type="DEVPTS_SUPER_MAGIC", f_bsize=4096, f_blocks=0, f_bfree=0, > f_bavail=0, f_files=0, f_ffree=0, ioctl(3, TIOCSPTLCK, [0]) = 0 ioctl(3, > SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0 > ioctl(3, TIOCGPTN, [6]) = 0 stat("/dev/pts/6", {st_mode=S_IFCHR|0620, > st_rdev=makedev(136, 6), ...}) = 0 open("/dev/pts/6", O_RDWR|O_NOCTTY) = > 4 clone(child_stack=0, > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, > child_tidptr=0x2acc82a54020) = 3120 close(4) = 0 select(0, NULL, NULL, > NULL, {0, 50000}) = 0 (Timeout) write(3, "\r\n", > 23) = 23 wait4(3120, 0x7fffd7c58474, WNOHANG, NULL) = 0 wait4(3120, > 0x7fffd7c58474, WNOHANG, NULL) = 0 select(4, [3], [], [], {10, 0}) = 1 > (in [3], left {10, 0}) read(3, "\r\n\r\n", 2000) > = 25 select(0, NULL, NULL, NULL, {0, 100}) = 0 (Timeout) wait4(3120, > 0x7fffd7c58474, WNOHANG, NULL) = 0 wait4(3120, 0x7fffd7c58474, WNOHANG, > NULL) = 0 select(4, [3], [], [], {9, 997862}) = 1 (in [3], left {6, > 413000}) read(3, "HTTP/1.1 405 Method Not Allowed\r"..., 2000) = 132 > select(0, NULL, NULL, NULL, {0, 100}) = 0 (Timeout) wait4(3120, > 0x7fffd7c58474, WNOHANG, NULL) = 0 wait4(3120, 0x7fffd7c58474, WNOHANG, > NULL) = 0 select(4, [3], [], [], {6, 410183}) = 1 (in [3], left {6, > 365000}) --- SIGCHLD (Child exited) @ 0 (0) --- read(3, 0x1108faa4, > 2000) = -1 EIO (Input/output error) write(2, "Unable to connect/login to > fenci"..., 42Unable to connect/login to fencing device ) = 42 close(3) = > 0 select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout) wait4(3120, > [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3120 > rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x39ec40e7c0}, > {0x39fdebc330, [], SA_RESTORER, 0x39ec40e7c0}, 8) = 0 > > That makes me think, that the default time out should be modified, but > this version of fence_ilo > doesn't have timeout option. > > Does anyone knows if there is another version and if there is, where to > get it. > > > Thanks. > > -- > Elena > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster Looking at the cluster.rng, I see the follow options as being valid; To use these, try, for example, If this doesn't help, can you paste your cluster.conf file and the shell call that works? -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "omg my singularity battery is dead again. stupid hawking radiation." 
- epitron From linux at alteeve.com Mon Nov 7 19:36:24 2011 From: linux at alteeve.com (Digimer) Date: Mon, 07 Nov 2011 14:36:24 -0500 Subject: [Linux-cluster] fence_ilo question In-Reply-To: References: <4EB82D83.4020301@alteeve.com> Message-ID: <4EB83338.2050502@alteeve.com> On 11/07/2011 02:18 PM, Elena Ufimtseva wrote: > Thanks Digimer > > > But look what confuses me here. > Cluster will run fence_node, right? It will read cluster.conf, get all > these parameters, timeouts, etc., and will run an agent which is in > my case fence_ilo, correct? > > Ok, looking at fence_ilo: > > fence_ilo [options] > Options: > -o Action: status, reboot (default), off or on > -a IP address or hostname of fencing device > -l Login name > -p Login password or passphrase > -S