From hitesh1907nayyar at gmail.com Mon Oct 1 03:16:31 2012 From: hitesh1907nayyar at gmail.com (hitesh nayyar) Date: Mon, 1 Oct 2012 08:46:31 +0530 Subject: [Linux-cluster] linux-cluster Message-ID: Hi, Hi, I am facing issuing in setting up Linux cluster. Here is the issue that i am facing. I have 2 Linux desktop and have following ip's and name: hitesh12-192.168.1.23 saanvi12-192.168.1.30 i enabled ricci service and have setup passwod as well.Enabled luci service as well. When cluster using GUI by activating luci GUI i see error logs in my /var/log/messages from hitesh12 -192.168.1.23 Sep 30 22:31:57 localhost dlm_controld[2945]: dlm_controld 3.0.12 started Sep 30 22:32:18 localhost gfs_controld[3010]: gfs_controld 3.0.12 started Sep 30 22:33:39 localhost kernel: dlm: Using TCP for communications Sep 30 22:33:41 localhost fenced[2930]: fencing node saanvi12 Sep 30 22:33:44 localhost fenced[2930]: fence saanvi12 dev 0.0 agent none result: error no method *Sep 30 22:33:44 localhost fenced[2930]: fence saanvi12 failed Sep 30 22:33:47 localhost fenced[2930]: fencing node saanvi12 Sep 30 22:33:49 localhost fenced[2930]: fence saanvi12 dev 0.0 agent none result: error no method Sep 30 22:33:49 localhost fenced[2930]: fence saanvi12 failed Sep 30 22:33:52 localhost fenced[2930]: fencing node saanvi12* With the above error the result in by issuing clustat -i 1 command is : *Cluster Status for dhoni @ Sun Sep 30 23:04:08 2012 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ hitesh12 1 Online, Local saanvi12 2 Offline* I have disabled by firewall on both my linux servers and is able to telnet each other. Can somebdy please help me out as how can i remove my fence error ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Mon Oct 1 03:22:14 2012 From: lists at alteeve.ca (Digimer) Date: Sun, 30 Sep 2012 23:22:14 -0400 Subject: [Linux-cluster] linux-cluster In-Reply-To: References: Message-ID: <50690C66.2070603@alteeve.ca> Did you setup fencing? Can you send your cluster.conf file please? digimer On 09/30/2012 11:16 PM, hitesh nayyar wrote: > Hi, > > Hi, > > I am facing issuing in setting up Linux cluster. Here is the issue that > i am facing. > > I have 2 Linux desktop and have following ip's and name: > > hitesh12-192.168.1.23 > saanvi12-192.168.1.30 > > i enabled ricci service and have setup passwod as well.Enabled luci > service as well. 
> > When cluster using GUI by activating luci GUI i see error logs in my > /var/log/messages from hitesh12 -192.168.1.23 > > > Sep 30 22:31:57 localhost dlm_controld[2945]: dlm_controld 3.0.12 started > Sep 30 22:32:18 localhost gfs_controld[3010]: gfs_controld 3.0.12 started > Sep 30 22:33:39 localhost kernel: dlm: Using TCP for communications > Sep 30 22:33:41 localhost fenced[2930]: fencing node saanvi12 > Sep 30 22:33:44 localhost fenced[2930]: fence saanvi12 dev 0.0 agent > none result: error no method > *Sep 30 22:33:44 localhost fenced[2930]: fence saanvi12 failed > Sep 30 22:33:47 localhost fenced[2930]: fencing node saanvi12 > Sep 30 22:33:49 localhost fenced[2930]: fence saanvi12 dev 0.0 agent > none result: error no method > Sep 30 22:33:49 localhost fenced[2930]: fence saanvi12 failed > Sep 30 22:33:52 localhost fenced[2930]: fencing node saanvi12* > > With the above error the result in by issuing clustat -i 1 command is : > > *Cluster Status for dhoni @ Sun Sep 30 23:04:08 2012 > Member Status: Quorate > > Member Name ID Status > ------ ---- ---- ------ > hitesh12 1 Online, Local > saanvi12 2 Offline* > > > I have disabled by firewall on both my linux servers and is able to > telnet each other. > > > Can somebdy please help me out as how can i remove my fence error ? > > -- Digimer Papers and Projects: https://alteeve.ca "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From raju.rajsand at gmail.com Mon Oct 1 03:43:39 2012 From: raju.rajsand at gmail.com (Rajagopal Swaminathan) Date: Mon, 1 Oct 2012 09:13:39 +0530 Subject: [Linux-cluster] linux-cluster In-Reply-To: References: Message-ID: Greetings, On Mon, Oct 1, 2012 at 8:46 AM, hitesh nayyar wrote: > > I have 2 Linux desktop and have following ip's and name: > >From what I can gather, This seems to be a desktop class machines. Hence they may not have IPMI/ILO etc. I am 99.99% certain that fencing has not been configured. I also doubt if it has external storage. (I can identify as I tried my experiments with clusters first time with such desktop class machines) The only solution for this is Power Fencing. -- Regards, Rajagopal From hitesh1907nayyar at gmail.com Mon Oct 1 04:53:48 2012 From: hitesh1907nayyar at gmail.com (hitesh nayyar) Date: Mon, 1 Oct 2012 10:23:48 +0530 Subject: [Linux-cluster] linux-cluster In-Reply-To: References: Message-ID: Hello, I am not aware of fencing.Yes you are correct i have not configured anything related to fencing. Can you please let me know as how can i proceed...Do i have to purchase some sort of hardware? How can i implement Power Fencing? I am using virtual box machines on my desktop in which linux are installed and have been connected through swtich. [root at hitesh12 ~]# cat /etc/cluster/cluster.conf On Mon, Oct 1, 2012 at 9:13 AM, Rajagopal Swaminathan < raju.rajsand at gmail.com> wrote: > Greetings, > > On Mon, Oct 1, 2012 at 8:46 AM, hitesh nayyar > wrote: > > > > I have 2 Linux desktop and have following ip's and name: > > > > >From what I can gather, This seems to be a desktop class machines. > Hence they may not have IPMI/ILO etc. > > I am 99.99% certain that fencing has not been configured. > > I also doubt if it has external storage. > > (I can identify as I tried my experiments with clusters first time > with such desktop class machines) > > The only solution for this is Power Fencing. 
> > -- > Regards, > > Rajagopal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Mon Oct 1 05:14:02 2012 From: lists at alteeve.ca (Digimer) Date: Mon, 01 Oct 2012 01:14:02 -0400 Subject: [Linux-cluster] linux-cluster In-Reply-To: References: Message-ID: <5069269A.2050405@alteeve.ca> First up, give this section a read. It will explain what fencing does and why you're seeing what you are; https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Concept.3B_Fencing As for being virtualbox guests, you're in a bit of a spot as there doesn't seem to be a native agent. Google shows this though; https://forums.virtualbox.org/viewtopic.php?f=7&t=35372 If you can switch to KVM or Xen, then you can use the fence_virsh or fence_xvm agents. cheers On 10/01/2012 12:53 AM, hitesh nayyar wrote: > Hello, > > I am not aware of fencing.Yes you are correct i have not configured > anything related to fencing. > > Can you please let me know as how can i proceed...Do i have to purchase > some sort of hardware? > > How can i implement Power Fencing? > > I am using virtual box machines on my desktop in which linux are > installed and have been connected through swtich. > > > [root at hitesh12 ~]# cat /etc/cluster/cluster.conf > > > > > > > > > > > > > > > > > On Mon, Oct 1, 2012 at 9:13 AM, Rajagopal Swaminathan > > wrote: > > Greetings, > > On Mon, Oct 1, 2012 at 8:46 AM, hitesh nayyar > > wrote: > > > > I have 2 Linux desktop and have following ip's and name: > > > > >From what I can gather, This seems to be a desktop class machines. > Hence they may not have IPMI/ILO etc. > > I am 99.99% certain that fencing has not been configured. > > I also doubt if it has external storage. > > (I can identify as I tried my experiments with clusters first time > with such desktop class machines) > > The only solution for this is Power Fencing. > > -- > Regards, > > Rajagopal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- Digimer Papers and Projects: https://alteeve.ca "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From lists at alteeve.ca Mon Oct 1 05:49:36 2012 From: lists at alteeve.ca (Digimer) Date: Mon, 01 Oct 2012 01:49:36 -0400 Subject: [Linux-cluster] linux-cluster In-Reply-To: References: <5069269A.2050405@alteeve.ca> Message-ID: <50692EF0.2000505@alteeve.ca> Please keep replies on the mailing list. These question helps other in the future by being seachable in the archives. Xen and KVM are types of hypervisors, like virtualbox is, but they run on Linux hosts. To use them, you would need to install linux on the bare machine. This in turn requires a CPU that supports virtualization (which most CPUs made in the last few years do support). If you can install CentOS 6, you can use KVM which is what I recommend. digimer On 10/01/2012 01:26 AM, hitesh nayyar wrote: > One more thing...till now i have used this setup: > > have Windows vista OS ---> Virtual Box---->Red Hat installed. > > If i download Xen or KVM can i use the same setup instead of Virtual Box? 
> > Windows vista OS ---->Xen or KVM ---->Red Hat installed > > BR// > Hitesh > > On Mon, Oct 1, 2012 at 10:48 AM, hitesh nayyar > > wrote: > > Hello Again, > > Can you please let me know what refers to KVM or Xen? I have never > used this > > Thanks > > > On Mon, Oct 1, 2012 at 10:44 AM, Digimer > wrote: > > First up, give this section a read. It will explain what fencing > does and why you're seeing what you are; > > https://alteeve.ca/w/2-Node___Red_Hat_KVM_Cluster_Tutorial#__Concept.3B_Fencing > > > As for being virtualbox guests, you're in a bit of a spot as > there doesn't seem to be a native agent. Google shows this though; > > https://forums.virtualbox.org/__viewtopic.php?f=7&t=35372 > > > If you can switch to KVM or Xen, then you can use the > fence_virsh or fence_xvm agents. > > cheers > > > On 10/01/2012 12:53 AM, hitesh nayyar wrote: > > Hello, > > I am not aware of fencing.Yes you are correct i have not > configured > anything related to fencing. > > Can you please let me know as how can i proceed...Do i have > to purchase > some sort of hardware? > > How can i implement Power Fencing? > > I am using virtual box machines on my desktop in which linux are > installed and have been connected through swtich. > > > [root at hitesh12 ~]# cat /etc/cluster/cluster.conf > > > > > > > > > > > > > > > > > On Mon, Oct 1, 2012 at 9:13 AM, Rajagopal Swaminathan > > __>> wrote: > > Greetings, > > On Mon, Oct 1, 2012 at 8:46 AM, hitesh nayyar > > >> wrote: > > > > I have 2 Linux desktop and have following ip's and name: > > > > >From what I can gather, This seems to be a desktop > class machines. > Hence they may not have IPMI/ILO etc. > > I am 99.99% certain that fencing has not been configured. > > I also doubt if it has external storage. > > (I can identify as I tried my experiments with clusters > first time > with such desktop class machines) > > The only solution for this is Power Fencing. > > -- > Regards, > > Rajagopal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > > > https://www.redhat.com/__mailman/listinfo/linux-cluster > > > > > > -- > Digimer > Papers and Projects: https://alteeve.ca > "Hydrogen is just a colourless, odorless gas which, if left > alone in sufficient quantities for long periods of time, begins > to think about itself." > > > -- Digimer Papers and Projects: https://alteeve.ca "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From lists at alteeve.ca Mon Oct 1 06:19:39 2012 From: lists at alteeve.ca (Digimer) Date: Mon, 01 Oct 2012 02:19:39 -0400 Subject: [Linux-cluster] linux-cluster In-Reply-To: References: <5069269A.2050405@alteeve.ca> <50692EF0.2000505@alteeve.ca> Message-ID: <506935FB.8060606@alteeve.ca> You don't seem to be reading what I am typing. Please go back over the various replies and read again what I said. Follow the links and read what they say. And please don't reply only to me. Click "Reply All" and include the mailing list. digimer On 10/01/2012 02:11 AM, hitesh nayyar wrote: > Hi, > > I have a constraint of using Linux on bare machine for my 2 desktop. > > Is there not any way i can get use the agent or perform clustering on > Virtual box or VM ware software..... > > On Mon, Oct 1, 2012 at 11:19 AM, Digimer > wrote: > > Please keep replies on the mailing list. These question helps other > in the future by being seachable in the archives. 
> > Xen and KVM are types of hypervisors, like virtualbox is, but they > run on Linux hosts. To use them, you would need to install linux on > the bare machine. This in turn requires a CPU that supports > virtualization (which most CPUs made in the last few years do support). > > If you can install CentOS 6, you can use KVM which is what I recommend. > > digimer > > > On 10/01/2012 01:26 AM, hitesh nayyar wrote: > > One more thing...till now i have used this setup: > > have Windows vista OS ---> Virtual Box---->Red Hat installed. > > If i download Xen or KVM can i use the same setup instead of > Virtual Box? > > Windows vista OS ---->Xen or KVM ---->Red Hat installed > > BR// > Hitesh > > On Mon, Oct 1, 2012 at 10:48 AM, hitesh nayyar > > >> wrote: > > Hello Again, > > Can you please let me know what refers to KVM or Xen? I > have never > used this > > Thanks > > > On Mon, Oct 1, 2012 at 10:44 AM, Digimer > >> wrote: > > First up, give this section a read. It will explain > what fencing > does and why you're seeing what you are; > > https://alteeve.ca/w/2-Node_____Red_Hat_KVM_Cluster_Tutorial#____Concept.3B_Fencing > > > > > > > As for being virtualbox guests, you're in a bit of a > spot as > there doesn't seem to be a native agent. Google shows > this though; > > https://forums.virtualbox.org/____viewtopic.php?f=7&t=35372 > > > > > > > If you can switch to KVM or Xen, then you can use the > fence_virsh or fence_xvm agents. > > cheers > > > On 10/01/2012 12:53 AM, hitesh nayyar wrote: > > Hello, > > I am not aware of fencing.Yes you are correct i > have not > configured > anything related to fencing. > > Can you please let me know as how can i > proceed...Do i have > to purchase > some sort of hardware? > > How can i implement Power Fencing? > > I am using virtual box machines on my desktop in > which linux are > installed and have been connected through swtich. > > > [root at hitesh12 ~]# cat /etc/cluster/cluster.conf > > > > nodeid="1"/> > nodeid="2"/> > > > > > > > > > > > > On Mon, Oct 1, 2012 at 9:13 AM, Rajagopal Swaminathan > __> > > > __>__>> wrote: > > Greetings, > > On Mon, Oct 1, 2012 at 8:46 AM, hitesh nayyar > > > > __gma__il.com > > >>> wrote: > > > > I have 2 Linux desktop and have following > ip's and name: > > > > >From what I can gather, This seems to be a > desktop > class machines. > Hence they may not have IPMI/ILO etc. > > I am 99.99% certain that fencing has not been > configured. > > I also doubt if it has external storage. > > (I can identify as I tried my experiments with > clusters > first time > with such desktop class machines) > > The only solution for this is Power Fencing. > > -- > Regards, > > Rajagopal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > > > ____com > >> > https://www.redhat.com/____mailman/listinfo/linux-cluster > > > > __> > > > > > -- > Digimer > Papers and Projects: https://alteeve.ca > "Hydrogen is just a colourless, odorless gas which, if left > alone in sufficient quantities for long periods of > time, begins > to think about itself." > > > > > > -- > Digimer > Papers and Projects: https://alteeve.ca > "Hydrogen is just a colourless, odorless gas which, if left alone in > sufficient quantities for long periods of time, begins to think > about itself." > > -- Digimer Papers and Projects: https://alteeve.ca "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." 
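[For reference: the "fence saanvi12 dev 0.0 agent none result: error no method" lines in the original post are fenced reporting that no <fence> method is defined for that node, so every fence attempt fails and recovery hangs. A minimal sketch of a two-node cluster.conf using the fence_virsh agent mentioned above, for guests running on a KVM host, might look like the following; the cluster and node names are taken from the thread, but the host IP, login and password are placeholders, and the "port" values must match the libvirt domain (guest) names on the host:]

<?xml version="1.0"?>
<!-- Illustrative sketch only: host address and credentials are made up. -->
<cluster name="dhoni" config_version="2">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="hitesh12" nodeid="1">
      <fence>
        <method name="kvm">
          <!-- "port" is the libvirt guest name on the KVM host. -->
          <device name="kvm_host" port="hitesh12"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="saanvi12" nodeid="2">
      <fence>
        <method name="kvm">
          <device name="kvm_host" port="saanvi12"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- fence_virsh logs into the KVM host over SSH and runs virsh to
         power the guest off and back on. -->
    <fencedevice name="kvm_host" agent="fence_virsh"
                 ipaddr="192.168.1.1" login="root" passwd="secret"/>
  </fencedevices>
</cluster>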
From tserong at suse.com Mon Oct 1 12:22:57 2012 From: tserong at suse.com (Tim Serong) Date: Mon, 01 Oct 2012 22:22:57 +1000 Subject: [Linux-cluster] CFP: Cloud Infrastructure, Distributed Storage and High Availability at LCA 2013 Message-ID: <50698B21.4020802@suse.com> I'm pleased to announce that we will be holding a one day Cloud Infrastructure, Distributed Storage and High Availability mini conference[1] on Monday 28 January 2013 as part of linux.conf.au 2013 in Canberra, Australia[2]. This miniconf is about building reliable infrastructure, from two-node HA failover pairs to multi-thousand-core cloud systems. You might like to think of it as a sequel to the LCA 2012 High Availability and Distributed Storage miniconf[3]. Do any of the following describe you? * You're building cloud infrastructure for others to use (openstack, cloudstack, eucalyptus, ...) * Your data needs to be reliably available everywhere (ceph, glusterfs, drbd, ...) * Your system absolutely must be up all the time (pacemaker, corosync, ...) If so, this is the miniconf for you! Please consider submitting a presentation at: http://tinyurl.com/cidsha-lca2013 We're expecting most talk slots to be 25 minutes (including questions and changeover), but there will be openings for shorter lightning talks and maybe a couple of longer talks. CFP closes on Sunday November 4, 2012. Notifications of acceptance will be emailed out after this date. Note that there is also an OpenStack-specific miniconf[4] running on Tuesday 29 January. We're hoping this will give us a pretty awesome two-day LCA 2013 CloudFest. As a rough rule of thumb, more generic or infrastructure-related talks should go to Cloud, Distributed Storage & HA, while deeper OpenStack-specific talks should probably go to the OpenStack miniconf. If in doubt, or if you have any other questions, please contact me directly at tserong at suse.com. Thanks! Tim [1] http://lca2013.linux.org.au/schedule/30073/view_talk [2] http://lca2013.linux.org.au/ [3] http://lca2012.linux.org.au/wiki/index.php/Miniconfs/HighAvailabilityAndDistributedStorage (also videos at http://www.youtube.com/playlist?list=PLE70D0FFF98BC9579) [4] http://lca2013.linux.org.au/schedule/30100/view_talk?day=tuesday -- Tim Serong Senior Clustering Engineer SUSE tserong at suse.com From raju.rajsand at gmail.com Mon Oct 1 16:45:28 2012 From: raju.rajsand at gmail.com (Rajagopal Swaminathan) Date: Mon, 1 Oct 2012 22:15:28 +0530 Subject: [Linux-cluster] linux-cluster In-Reply-To: <506935FB.8060606@alteeve.ca> References: <5069269A.2050405@alteeve.ca> <50692EF0.2000505@alteeve.ca> <506935FB.8060606@alteeve.ca> Message-ID: Greetings, Hitesh, Please follow list guidlines. On Mon, Oct 1, 2012 at 11:49 AM, Digimer wrote: > You don't seem to be reading what I am typing. Please go back over the > various replies and read again what I said. Follow the links and read what > they say. > > And please don't reply only to me. Click "Reply All" and include the mailing > list. > >> >> I have a constraint of using Linux on bare machine for my 2 desktop. >> Can you please let me know as how can i >> proceed...Do i have >> to purchase >> some sort of hardware? Yes. You will need to buy power fencing device -- basically a power strip with a ethernet port I would strongly suggest you have two network port on each system. What you want to do with a cluster? >> >> One more thing...till now i have used this setup: >> >> have Windows vista OS ---> Virtual Box---->Red Hat installed. >> You have to be kidding. 
You are using Vista on bare metal for your HA? >> If i download Xen or KVM can i use the same setup instead of >> Virtual Box? >> >> Windows vista OS ---->Xen or KVM ---->Red Hat installed http://www.youtube.com/watch?v=oKI-tD0L18A >> [root at hitesh12 ~]# cat /etc/cluster/cluster.conf >> >> There needs to be a two_node directive somewhere in there. Read up. Better yet, get the help of a local technical person who knows what HA is. It is a lot more than a simple desktop install. Otherwise you need to invest quite a bit of time in learning, and money in getting some extra hardware (fence devices, switches, NICs, external storage -- if required). And don't commit to or do that in production without knowing what you are getting into. If you can post the objective of using a cluster more descriptively, perhaps you will get more specific information.
> > Digimer's _*excellent*_ tutorial covers more or less all that you need > to know about clusters. > > I wish I had that when I started playing around with that way back in 2007. > > -- > Regards, > > Rajagopal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Tue Oct 2 13:38:28 2012 From: lists at alteeve.ca (Digimer) Date: Tue, 02 Oct 2012 09:38:28 -0400 Subject: [Linux-cluster] linux-cluster In-Reply-To: References: <5069269A.2050405@alteeve.ca> <50692EF0.2000505@alteeve.ca> <506935FB.8060606@alteeve.ca> Message-ID: <506AEE54.7070209@alteeve.ca> On 10/02/2012 04:00 AM, Parvez Shaikh wrote: > What kind of cluster is this - an academic project or production quality > solution? > > If its former - go for manual fencing. You wont need fence device but > failover wont be automatic *Please* don't do this. Manual fencing support was dropped for a reason. It's *far* too easy to mess things up when an admin uses it before identifying a problem. > If its later - yes you'll need fence device This is the only sane option; Academic or production. Fencing is an integral part of the cluster and you do yourself no favour by not learning it in an academic setup. -- Digimer Papers and Projects: https://alteeve.ca "Hydrogen is just a colourless, odourless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From parvez.h.shaikh at gmail.com Tue Oct 2 13:43:40 2012 From: parvez.h.shaikh at gmail.com (Parvez Shaikh) Date: Tue, 2 Oct 2012 19:13:40 +0530 Subject: [Linux-cluster] linux-cluster In-Reply-To: <506AEE54.7070209@alteeve.ca> References: <5069269A.2050405@alteeve.ca> <50692EF0.2000505@alteeve.ca> <506935FB.8060606@alteeve.ca> <506AEE54.7070209@alteeve.ca> Message-ID: Hi Digimer, Could you please give me reference/case studies of problem about why manual fencing was dropped and how automated fencing is fixing those? Thanks, Parvez On Tue, Oct 2, 2012 at 7:08 PM, Digimer wrote: > On 10/02/2012 04:00 AM, Parvez Shaikh wrote: > >> What kind of cluster is this - an academic project or production quality >> solution? >> >> If its former - go for manual fencing. You wont need fence device but >> failover wont be automatic >> > > *Please* don't do this. Manual fencing support was dropped for a reason. > It's *far* too easy to mess things up when an admin uses it before > identifying a problem. > > > If its later - yes you'll need fence device >> > > This is the only sane option; Academic or production. Fencing is an > integral part of the cluster and you do yourself no favour by not learning > it in an academic setup. > > > -- > Digimer > Papers and Projects: https://alteeve.ca > "Hydrogen is just a colourless, odourless gas which, if left alone in > sufficient quantities for long periods of time, begins to think about > itself." > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lists at alteeve.ca Tue Oct 2 14:03:06 2012 From: lists at alteeve.ca (Digimer) Date: Tue, 02 Oct 2012 10:03:06 -0400 Subject: [Linux-cluster] linux-cluster In-Reply-To: References: <5069269A.2050405@alteeve.ca> <50692EF0.2000505@alteeve.ca> <506935FB.8060606@alteeve.ca> <506AEE54.7070209@alteeve.ca> Message-ID: <506AF41A.5000906@alteeve.ca> This talks about how manual fencing isn't actual fencing; https://fedorahosted.org/cluster/wiki/Fence There was a page where it was said that manual fencing was in no way supported, but I can't find it at the moment. The reason it is not safe is that an admin is likely to issue it in a panic while trying to get a hung cluster back online. If this happens without first ensuring the peer node(s) is fenced, you can walk into a split-brain. digimer On 10/02/2012 09:43 AM, Parvez Shaikh wrote: > Hi Digimer, > > Could you please give me reference/case studies of problem about why > manual fencing was dropped and how automated fencing is fixing those? > > Thanks, > Parvez > > On Tue, Oct 2, 2012 at 7:08 PM, Digimer > wrote: > > On 10/02/2012 04:00 AM, Parvez Shaikh wrote: > > What kind of cluster is this - an academic project or production > quality > solution? > > If its former - go for manual fencing. You wont need fence > device but > failover wont be automatic > > > *Please* don't do this. Manual fencing support was dropped for a > reason. It's *far* too easy to mess things up when an admin uses it > before identifying a problem. > > > If its later - yes you'll need fence device > > > This is the only sane option; Academic or production. Fencing is an > integral part of the cluster and you do yourself no favour by not > learning it in an academic setup. > > > -- > Digimer > Papers and Projects: https://alteeve.ca > "Hydrogen is just a colourless, odourless gas which, if left alone > in sufficient quantities for long periods of time, begins to think > about itself." > > -- Digimer Papers and Projects: https://alteeve.ca "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From raju.rajsand at gmail.com Tue Oct 2 16:08:09 2012 From: raju.rajsand at gmail.com (Rajagopal Swaminathan) Date: Tue, 2 Oct 2012 21:38:09 +0530 Subject: [Linux-cluster] linux-cluster In-Reply-To: <506AF41A.5000906@alteeve.ca> References: <5069269A.2050405@alteeve.ca> <50692EF0.2000505@alteeve.ca> <506935FB.8060606@alteeve.ca> <506AEE54.7070209@alteeve.ca> <506AF41A.5000906@alteeve.ca> Message-ID: Greetings, On Tue, Oct 2, 2012 at 7:33 PM, Digimer wrote: > > The reason it is not safe is that an admin is likely to issue it in a panic > while trying to get a hung cluster back online. If this happens without > +1 +1 +1 > first ensuring the peer node(s) is fenced, you can walk into a split-brain. My first attempts without fencing landed me in the nightmare of "been there done that" of cleaning up the mess of split brain or more aptly, "the fluid brain which hit the fan" To rephrase the old adage "To err is human, but to really, completely mess it up, it takes a cluster with split brain". It might be probably easier to use magnet to write bits on the disk than to clean up *that* mess. Not to talk about downtime and the fury of users. No baby, A HA cluster ain't worth its name without fencing. And mine was in an academic environment. 
-- Regards, Rajagopal From parvez.h.shaikh at gmail.com Wed Oct 3 05:23:17 2012 From: parvez.h.shaikh at gmail.com (Parvez Shaikh) Date: Wed, 3 Oct 2012 10:53:17 +0530 Subject: [Linux-cluster] Hi In-Reply-To: References: <1347383423.10128.555.camel@aeapen.blr.redhat.com> <1347445264.32050.398.camel@aeapen.blr.redhat.com> Message-ID: A curious observation, there is a sudden surge of sending emails on private addresses rather than sending over a mailing list. Please send your doubts / questions on mailing list " linux-cluster at redhat.com" instead of addressing personally. Regarding configuration for manual fencing - I don't have it with me, it was available with RHEL 5.5. Check it out in system-config-cluster tool if you can add manual fencing. Thanks, Parvez On Wed, Oct 3, 2012 at 10:46 AM, Renchu Mathew wrote: > Hi Purvez, > > I am trying to setup a test cluster environmet. But I haven't doen > fencing. Please find below error messages. Some time after the nodes > restarted, the other node is going down. can you please send me > theconfiguration for manual fencing? > > >> > Please find attached my cluster setup. It is not stable >> > and /var/log/messages shows the below errors. >> > >> > >> > Sep 11 08:49:10 node1 corosync[1814]: [QUORUM] Members[2]: 1 2 >> > Sep 11 08:49:10 node1 corosync[1814]: [QUORUM] Members[2]: 1 2 >> > Sep 11 08:49:10 node1 corosync[1814]: [CPG ] chosen downlist: >> > sender r(0) ip(192.168.1.251) ; members(old:2 left:1) >> > Sep 11 08:49:10 node1 corosync[1814]: [MAIN ] Completed service >> > synchronization, ready to provide service. >> > Sep 11 08:49:11 node1 corosync[1814]: cman killed by node 2 because we >> > were killed by cman_tool or other application >> > Sep 11 08:49:11 node1 fenced[1875]: telling cman to remove nodeid 2 >> > from cluster >> > Sep 11 08:49:11 node1 fenced[1875]: cluster is down, exiting >> > Sep 11 08:49:11 node1 gfs_controld[1950]: cluster is down, exiting >> > Sep 11 08:49:11 node1 gfs_controld[1950]: daemon cpg_dispatch error 2 >> > Sep 11 08:49:11 node1 gfs_controld[1950]: cpg_dispatch error 2 >> > Sep 11 08:49:11 node1 dlm_controld[1889]: cluster is down, exiting >> > Sep 11 08:49:11 node1 dlm_controld[1889]: daemon cpg_dispatch error 2 >> > Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2 >> > Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2 >> > Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2 >> > Sep 11 08:49:11 node1 fenced[1875]: daemon cpg_dispatch error 2 >> > Sep 11 08:49:11 node1 rgmanager[2409]: #67: Shutting down uncleanly >> > Sep 11 08:49:11 node1 rgmanager[17059]: [clusterfs] unmounting /Data >> > Sep 11 08:49:11 node1 rgmanager[17068]: [clusterfs] Sending SIGTERM to >> > processes on /Data >> > Sep 11 08:49:16 node1 rgmanager[17104]: [clusterfs] unmounting /Data >> > Sep 11 08:49:16 node1 rgmanager[17113]: [clusterfs] Sending SIGKILL to >> > processes on /Data >> > Sep 11 08:49:19 node1 kernel: dlm: closing connection to node 2 >> > Sep 11 08:49:19 node1 kernel: dlm: closing connection to node 1 >> > Sep 11 08:49:19 node1 kernel: dlm: gfs2: no userland control daemon, >> > stopping lockspace >> > Sep 11 08:49:22 node1 rgmanager[17149]: [clusterfs] unmounting /Data >> > Sep 11 08:49:22 node1 rgmanager[17158]: [clusterfs] Sending SIGKILL to >> > processes on /Data >> > >> > >> > >> > Also when I try to restart the cman service, below error comes. >> > Starting cluster: >> > Checking if cluster has been disabled at boot... [ OK ] >> > Checking Network Manager... 
[ OK ] >> > Global setup... [ OK ] >> > Loading kernel modules... [ OK ] >> > Mounting configfs... [ OK ] >> > Starting cman... [ OK ] >> > Waiting for quorum... [ OK ] >> > Starting fenced... [ OK ] >> > Starting dlm_controld... [ OK ] >> > Starting gfs_controld... [ OK ] >> > Unfencing self... fence_node: cannot connect to cman >> > [FAILED] >> > Stopping cluster: >> > Leaving fence domain... [ OK ] >> > Stopping gfs_controld... [ OK ] >> > Stopping dlm_controld... [ OK ] >> > Stopping fenced... [ OK ] >> > Stopping cman... [ OK ] >> > Unloading kernel modules... [ OK ] >> > Unmounting configfs... [ OK ] >> > >> > Thanks again. >> > Renchu Mathew >> > On Tue, Sep 11, 2012 at 9:10 PM, Arun Eapen CISSP, RHCA >> > wrote: >> > >> > >> > >> > Put the fenced in debug mode and copy the error messages, for >> > me to >> > debug >> > >> > On Tue, 2012-09-11 at 11:52 +0400, Renchu Mathew wrote: >> > > Hi Arun, >> > > >> > > I have done the RH436 course in conducted by you at Redhat >> > b'lore. How >> > > r u? >> > > >> > > I have configured a 2 node failover cluster setup (almost >> > same like >> > > our RH436 lab setup in b'lore) It is almost ok except >> > fencing. If I >> > > pull the active node network cable it is not switching to >> > the other >> > > automatically. It is getting hung. Then I have to do this >> > manually. Is >> > > there any script for creating the dummy fencing in RHCS >> > which will >> > > restart or shutdown the other node. Please find attached my >> > > cluster.conf file. is there anyway we can power fence using >> > APC UPS. >> > > >> > > Could you please help me if you get some time. >> > > >> > > Thanks and regards >> > > Renchu Mathew >> > > >> > > >> > > >> > >> > >> > >> > -- >> > Arun Eapen >> > CISSP, RHC{A,DS,E,I,SS,VA,X} >> > Senior Technical Consultant & Certification Poobah >> > Red Hat India Pvt. Ltd., >> > No - 4/1, Bannergatta Road, >> > IBC Knowledge Park, >> > 11th floor, Tower D, >> > Bangalore - 560029, INDIA. >> > >> > >> > >> >> >> -- >> Arun Eapen >> CISSP, RHC{A,DS,E,I,SS,VA,X} >> Senior Technical Consultant & Certification Poobah >> Red Hat India Pvt. Ltd., >> No - 4/1, Bannergatta Road, >> IBC Knowledge Park, >> 11th floor, Tower D, >> Bangalore - 560029, INDIA. >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at verwilst.be Thu Oct 4 13:47:31 2012 From: lists at verwilst.be (Bart Verwilst) Date: Thu, 04 Oct 2012 15:47:31 +0200 Subject: [Linux-cluster] Failover network device with rgmanager Message-ID: <2c3f847bbba16467723fe057dbded285@verwilst.be> Hi, I would like to make rgmanager manage a network interface i configured under sysconfig ( ifcfg-ethX ). It should be brought up by the active node as a resource, and ifdown'ed by the standby node. ( It's actually a GRE tunnel interface ). Is there a straightforward way on how to do this with CentOS 6.2 cman/rgmanager? Thanks in advance! Kind regards, Bart Verwilst From lhh at redhat.com Thu Oct 4 15:56:52 2012 From: lhh at redhat.com (Lon Hohberger) Date: Thu, 04 Oct 2012 11:56:52 -0400 Subject: [Linux-cluster] Failover network device with rgmanager In-Reply-To: <2c3f847bbba16467723fe057dbded285@verwilst.be> References: <2c3f847bbba16467723fe057dbded285@verwilst.be> Message-ID: <506DB1C4.2080609@redhat.com> On 10/04/2012 09:47 AM, Bart Verwilst wrote: > Hi, > > I would like to make rgmanager manage a network interface i configured > under sysconfig ( ifcfg-ethX ). 
It should be brought up by the active > node as a resource, and ifdown'ed by the standby node. ( It's actually a > GRE tunnel interface ). Is there a straightforward way on how to do this > with CentOS 6.2 cman/rgmanager? > 'script' resource, like: #!/bin/sh case $1 in start) ifup ethX exit $? ;; stop) ifdown ethX exit $? ;; status) ... ;; esac exit 1 -- Lon From heiko.nardmann at itechnical.de Thu Oct 4 16:22:49 2012 From: heiko.nardmann at itechnical.de (Heiko Nardmann) Date: Thu, 04 Oct 2012 18:22:49 +0200 Subject: [Linux-cluster] Failover network device with rgmanager In-Reply-To: <506DB1C4.2080609@redhat.com> References: <2c3f847bbba16467723fe057dbded285@verwilst.be> <506DB1C4.2080609@redhat.com> Message-ID: <506DB7D9.3080909@itechnical.de> Isn't that a standard ip resource inside cluster.conf? Kind regards, Heiko Am 04.10.2012 17:56, schrieb Lon Hohberger: > On 10/04/2012 09:47 AM, Bart Verwilst wrote: >> Hi, >> >> I would like to make rgmanager manage a network interface i configured >> under sysconfig ( ifcfg-ethX ). It should be brought up by the active >> node as a resource, and ifdown'ed by the standby node. ( It's actually a >> GRE tunnel interface ). Is there a straightforward way on how to do this >> with CentOS 6.2 cman/rgmanager? >> > 'script' resource, like: > > #!/bin/sh > > case $1 in > start) > ifup ethX > exit $? > ;; > stop) > ifdown ethX > exit $? > ;; > status) > ... > ;; > esac > > exit 1 > > -- Lon From mgrac at redhat.com Fri Oct 5 11:19:02 2012 From: mgrac at redhat.com (Marek Grac) Date: Fri, 05 Oct 2012 13:19:02 +0200 Subject: [Linux-cluster] fence-agents-3.1.10 stable release Message-ID: <506EC226.1010900@redhat.com> Welcome to the fence-agents 3.1.10 release. This release includes these updates: * Faster fencing in fence_vmware_soap * Action metadata is supported also on older fence agents * support for using sudo in fence_virsh The new source tarball can be downloaded here: https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-3.1.10.tar.xz To report bugs or issues: https://bugzilla.redhat.com/ Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other sysadministrators or power users. Thanks/congratulations to all people that contributed to achieve this great milestone. m, From mmorgan at dca.net Fri Oct 5 20:22:16 2012 From: mmorgan at dca.net (Michael Morgan) Date: Fri, 5 Oct 2012 16:22:16 -0400 Subject: [Linux-cluster] GFS2 showing the wrong directory contents on one node? Message-ID: <20121005202216.GK17352@staff.dca.net> Hello, I have a 6 node CentOS 5.8 cluster with 4 nodes mounting a GFS2 filesystem. Everything had been running nicely for about 2 years but over the past few months I've had a strange occurence happen twice. One of the two web server nodes will suddenly start listing the wrong directory contents, both nodes have been affected at different times. This only seems to affect one or two directories but it's hard to be certain since there are a large numer of them. There are no errors logged anywhere on the cluster. Unmounting GFS2 on this node usually causes a hang and eventual fence. The node will come back online without issue and begin functioning normally again. Just a few minutes ago it started happening again. I currently have services stopped but have not gone through the unmount/reboot process yet. Before I do that I figured I'd check the list to see if anyone has come across this before. 
Is there any GFS2/cluster information I should be dumping to track down the cause? Any insight would be appreciated. Thanks. -Mike From shanti.pahari at sierra.sg Mon Oct 8 01:57:01 2012 From: shanti.pahari at sierra.sg (Shanti Pahari) Date: Mon, 8 Oct 2012 09:57:01 +0800 (SGT) Subject: [Linux-cluster] LVM cannot initilize Message-ID: Hi all, After I added the root volume to volume_list in lvm.conf, I cannot initialize other LVs. If I remove volume_list from lvm.conf, only then can I initialize other LVs. But a volume_list with only the root volume is required for the cluster. Volume_list = [ "myrootvolume", "@hostname" ] Can you help me solve this? Regards, Shanti -------------- next part -------------- An HTML attachment was scrubbed... URL: From bmr at redhat.com Mon Oct 8 10:42:56 2012 From: bmr at redhat.com (Bryn M. Reeves) Date: Mon, 08 Oct 2012 11:42:56 +0100 Subject: [Linux-cluster] LVM cannot initilize In-Reply-To: References: Message-ID: <5072AE30.3000904@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/08/2012 02:57 AM, Shanti Pahari wrote: > If I remove volume_list from lvm.conf, only then can I initialize > other LVs. But a volume_list with only the root volume is required for > the cluster. > > > > Volume_list = [ "myrootvolume", "@hostname" ] It's 'volume_list' (lowercase 'v') and if you're specifying a logical volume (rather than a volume group) it needs to be "vgname/lvname".
Also make sure it goes in the 'activation' section - there should be a commented-out example you can use as a template in the default lvm.conf: # If volume_list is defined, each LV is only activated if there is a # match against the list. # "vgname" and "vgname/lvname" are matched exactly. # "@tag" matches any tag set in the LV or VG. # "@*" matches if any tag defined on the host is set in the LV or VG # # volume_list = [ "vg1", "vg2/lvol1", "@tag1", "@*" ] Regards, Bryn. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlByri8ACgkQ6YSQoMYUY94ZNgCeL/ectFyPgippkiQVEYTPWpn7 lP0AoIls3TalqQZgQ0M5fxJppFrUnjVK =AsGP -----END PGP SIGNATURE----- -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From bmr at redhat.com Mon Oct 8 12:46:57 2012 From: bmr at redhat.com (Bryn M. Reeves) Date: Mon, 08 Oct 2012 13:46:57 +0100 Subject: [Linux-cluster] LVM cannot initilize In-Reply-To: <25537b6b.00000e68.00001f81@sierra-A66> References: <5072AE30.3000904@redhat.com> <25537b6b.00000e68.00001f81@sierra-A66> Message-ID: <5072CB41.4090006@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/08/2012 01:13 PM, Shanti Pahari wrote: > volume_list = [ "vg_pdcpicpl01", "@PDC-PIC-PL-01" ] > > where my vg_pdcpicpl01 is Volume group rather than logical volume. > This is my root volume . Either a tag "@tagname", a volume group as "vgname", or a logical volume name as "vgname/lvname" (otherwise the tools cannot know which LV you mean if there are multiple LVs with the same name in different VGs). In your earlier example you seemed to have an LV name: >> Volume_list = [ "myrootvolume", "@hostname" ] Which won't work since the LV name is unqualified by a VG. Your later example: > volume_list = [ "vg_pdcpicpl01", "@PDC-PIC-PL-01" ] Looks correct assuming that vg_pdcpicpl01 is the name of a VG on your system. > In example I saw that I have to specify only VG of root . It's up to you whether you want to specify just an LV or a whole VG. > Can you help me where I am wrong ? It's hard to say. What error do you get initialising LVM? Try adding more -v if there's nothing useful printed (normally syntax errors in lvm.conf give a useful message). Failing that you could post your full lvm.conf to a pastebin somewhere and mail the link so that others can review your config. Regards, Bryn. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlByy0EACgkQ6YSQoMYUY95VEgCdFK9QqeTOGAiiYnZiuAJ6iHY5 BrUAmwbCBULcq9gEVBtqOg8wAAGXph5M =UsOX -----END PGP SIGNATURE----- From cfeist at redhat.com Tue Oct 9 00:27:36 2012 From: cfeist at redhat.com (Chris Feist) Date: Mon, 08 Oct 2012 19:27:36 -0500 Subject: [Linux-cluster] Announce: pcs-0.9.26 Message-ID: <50736F78.3060906@redhat.com> We've been making improvements to the pcs (pacemaker/corosync configuration system) command line tool over the past few months. Currently you can setup a basic cluster (including configuring corosync 2.0 udpu). David Vossel has also created a version of the "Clusters from Scratch" document that illustrates setting up a cluster using pcs. This should be showing up shortly. You can view the source here: https://github.com/feist/pcs/ Or download the latest tarball: https://github.com/downloads/feist/pcs/pcs-0.9.26.tar.gz There is also a Fedora 18 package that will be included with the next release. 
You should be able to find that package in the following locations... RPM: http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm SRPM: http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.src.rpm In the near future we are planning on having builds for SUSE & Ubuntu/Debian. We're also actively working on a GUI/Daemon that will allow control of your entire cluster from one node and/or a web browser. Please feel free to email me (cfeist at redhat.com) or open issues on the pcs project at github (https://github.com/feist/pcs/issues) if you have any questions or problems. Thanks! Chris From ming-ming.chen at hp.com Tue Oct 9 01:47:14 2012 From: ming-ming.chen at hp.com (Chen, Ming Ming) Date: Tue, 9 Oct 2012 01:47:14 +0000 Subject: [Linux-cluster] Configure multiple heartbeat on a redhat cluster In-Reply-To: <5057ED14.4030601@alteeve.ca> References: <5057D17D.9060108@alteeve.ca> <5057E9C5.60506@alteeve.ca> <5057ED14.4030601@alteeve.ca> Message-ID: <1D241511770E2F4BA89AFD224EDD527141025BAE@G9W0733.americas.hpqcorp.net> Hi, Is there a way to configure multiple heartbeat network in the /etc/cluster.conf file. I'm using redhat cluster. Regards Ming From ming-ming.chen at hp.com Tue Oct 9 01:55:01 2012 From: ming-ming.chen at hp.com (Chen, Ming Ming) Date: Tue, 9 Oct 2012 01:55:01 +0000 Subject: [Linux-cluster] problem quorum cman In-Reply-To: References: Message-ID: <1D241511770E2F4BA89AFD224EDD527141025C87@G9W0733.americas.hpqcorp.net> Hi, Have you ever resolved this issue? If so, what is the problem? I sometime see the same issue on my cluster. Ming From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of hakim abdellaoui Sent: Friday, July 20, 2012 2:15 AM To: linux-cluster at redhat.com Subject: [Linux-cluster] problem quorum cman Hi, I use rhel6.3 with packages : cman-3.0.12.1-32.el6.x86_64 rgmanager-3.0.12.1-12.el6.x86_64 openais-1.1.1-7.el6.x86_64 I have two virtual nodes (vmware) and a quorum share disk (it's a virtual disk i use scsi sharing multi-write) the cluster work sometime. if i reboot node2 the cman not start i have : Waiting for quorum... Timed-out waiting for cluster. On the log corosync i have : Jul 20 10:51:22 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. Jul 20 10:51:22 corosync [CPG ] chosen downlist: sender r(0) ip(192.168.10.154) ; members(old:1 left:0) Jul 20 10:51:22 corosync [MAIN ] Completed service synchronization, ready to provide service. Jul 20 10:51:23 corosync [CMAN ] quorum device unregistered On the node1 when i type clustat i have : Cluster Status for clusterweb @ Fri Jul 20 10:38:57 2012 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ server-1 1 Online, Local server-2 2 Offline /dev/block/8:16 0 Online, Quorum Disk If i restart cman on node1 and i restart cman on node2 the cman start properly a When i type clustat on both nodes i can see all online. I don't understand why i must restart on node1 the cman if i want to add the node2 on the cluster . You can see my cluster.conf Very thanks for your help Best regards. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lists at alteeve.ca Tue Oct 9 01:58:04 2012 From: lists at alteeve.ca (Digimer) Date: Mon, 08 Oct 2012 21:58:04 -0400 Subject: [Linux-cluster] Announce: pcs-0.9.26 In-Reply-To: <50736F78.3060906@redhat.com> References: <50736F78.3060906@redhat.com> Message-ID: <507384AC.5010805@alteeve.ca> Well, I was looking for a reason to download and start testing Fedora 18. Suppose this is a good enough reason. :) digimer On 10/08/2012 08:27 PM, Chris Feist wrote: > We've been making improvements to the pcs (pacemaker/corosync > configuration system) command line tool over the past few months. > > Currently you can setup a basic cluster (including configuring corosync > 2.0 udpu). > > David Vossel has also created a version of the "Clusters from Scratch" > document that illustrates setting up a cluster using pcs. This should > be showing up shortly. > > You can view the source here: https://github.com/feist/pcs/ > > Or download the latest tarball: > https://github.com/downloads/feist/pcs/pcs-0.9.26.tar.gz > > There is also a Fedora 18 package that will be included with the next > release. You should be able to find that package in the following > locations... > > RPM: > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm > > SRPM: > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.src.rpm > > In the near future we are planning on having builds for SUSE & > Ubuntu/Debian. > > We're also actively working on a GUI/Daemon that will allow control of > your entire cluster from one node and/or a web browser. > > Please feel free to email me (cfeist at redhat.com) or open issues on the > pcs project at github (https://github.com/feist/pcs/issues) if you have > any questions or problems. > > Thanks! > Chris > -- Digimer Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From fdinitto at redhat.com Tue Oct 9 06:53:23 2012 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Tue, 09 Oct 2012 08:53:23 +0200 Subject: [Linux-cluster] Announce: pcs-0.9.26 In-Reply-To: <50736F78.3060906@redhat.com> References: <50736F78.3060906@redhat.com> Message-ID: <5073C9E3.7000009@redhat.com> On 10/9/2012 2:27 AM, Chris Feist wrote: > We've been making improvements to the pcs (pacemaker/corosync > configuration system) command line tool over the past few months. > > Currently you can setup a basic cluster (including configuring corosync > 2.0 udpu). > > David Vossel has also created a version of the "Clusters from Scratch" > document that illustrates setting up a cluster using pcs. This should > be showing up shortly. > well done guys!!! Fabio From shanti.pahari at sierra.sg Tue Oct 9 09:07:10 2012 From: shanti.pahari at sierra.sg (Shanti Pahari) Date: Tue, 9 Oct 2012 17:07:10 +0800 (SGT) Subject: [Linux-cluster] LVM cannot initilize In-Reply-To: <5072CB41.4090006@redhat.com> References: <5072AE30.3000904@redhat.com> <25537b6b.00000e68.00001f81@sierra-A66> <5072CB41.4090006@redhat.com> Message-ID: <97898eab.00001f6c.00000041@sierra-A66> Hi Bryn, >From example: On each cluster node, edit /etc/lvm/lvm.conf and change the volume_list field to match the boot volume (myvg) and name of the node cluster interconnect (ha-web1). 
This restricts the list of volumes available during system boot to only the root volume and prevents cluster nodes from updating and potentially corrupting the metadata on the HA-LVM volume: volume_list = [ "myvg", "@ha-web1" ] so I added volume_list = [ " vg_pdcpicpl01 " , "@PDC-PIC-PL-01" ] PDC-PIC-PL-01 : is my hostname # dracut --hostonly --force /boot/initramfs-$(uname -r).img $(uname -r) # shutdown -r now "Activating ramdisk LVM changes" Reboot I have /dev/HA-Web-VG/ha-web-lv also, but after reboot I cannot initialize my volume it throws nor create logical volume lvcreate gets error message "not activating volume group lv does not pass activation filter" -----Original Message----- From: Bryn M. Reeves [mailto:bmr at redhat.com] Sent: Monday, 8 October, 2012 8:47 PM To: linux clustering Cc: Shanti Pahari Subject: Re: [Linux-cluster] LVM cannot initilize -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/08/2012 01:13 PM, Shanti Pahari wrote: > volume_list = [ "vg_pdcpicpl01", "@PDC-PIC-PL-01" ] > > where my vg_pdcpicpl01 is Volume group rather than logical volume. > This is my root volume . Either a tag "@tagname", a volume group as "vgname", or a logical volume name as "vgname/lvname" (otherwise the tools cannot know which LV you mean if there are multiple LVs with the same name in different VGs). In your earlier example you seemed to have an LV name: >> Volume_list = [ "myrootvolume", "@hostname" ] Which won't work since the LV name is unqualified by a VG. Your later example: > volume_list = [ "vg_pdcpicpl01", "@PDC-PIC-PL-01" ] Looks correct assuming that vg_pdcpicpl01 is the name of a VG on your system. > In example I saw that I have to specify only VG of root . It's up to you whether you want to specify just an LV or a whole VG. > Can you help me where I am wrong ? It's hard to say. What error do you get initialising LVM? Try adding more -v if there's nothing useful printed (normally syntax errors in lvm.conf give a useful message). Failing that you could post your full lvm.conf to a pastebin somewhere and mail the link so that others can review your config. Regards, Bryn. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlByy0EACgkQ6YSQoMYUY95VEgCdFK9QqeTOGAiiYnZiuAJ6iHY5 BrUAmwbCBULcq9gEVBtqOg8wAAGXph5M =UsOX -----END PGP SIGNATURE----- From mmorgan at dca.net Tue Oct 9 13:44:06 2012 From: mmorgan at dca.net (Michael Morgan) Date: Tue, 9 Oct 2012 09:44:06 -0400 Subject: [Linux-cluster] GFS2 showing the wrong directory contents on one node? In-Reply-To: <20121005202216.GK17352@staff.dca.net> References: <20121005202216.GK17352@staff.dca.net> Message-ID: <20121009134406.GB9351@staff.dca.net> Replying to myself, I was out of the office yesterday but I come back in this morning and everything looks correct again. Apache is still stopped and nobody has touched the server since I stopped services on Friday. Very strange -Mike On Fri, Oct 05, 2012 at 04:22:16PM -0400, Michael Morgan wrote: > Hello, > > I have a 6 node CentOS 5.8 cluster with 4 nodes mounting a GFS2 filesystem. > Everything had been running nicely for about 2 years but over the past few > months I've had a strange occurence happen twice. One of the two web server > nodes will suddenly start listing the wrong directory contents, both nodes have > been affected at different times. This only seems to affect one or two > directories but it's hard to be certain since there are a large numer of them. 
> There are no errors logged anywhere on the cluster. Unmounting GFS2 on this > node usually causes a hang and eventual fence. The node will come back online > without issue and begin functioning normally again. > > Just a few minutes ago it started happening again. I currently have services > stopped but have not gone through the unmount/reboot process yet. Before I do > that I figured I'd check the list to see if anyone has come across this before. > Is there any GFS2/cluster information I should be dumping to track down the > cause? Any insight would be appreciated. Thanks. > > -Mike > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From queszama at yahoo.in Tue Oct 9 16:25:00 2012 From: queszama at yahoo.in (Zama Ques) Date: Wed, 10 Oct 2012 00:25:00 +0800 (SGT) Subject: [Linux-cluster] Choosing a fencing device Message-ID: <1349799900.92783.YahooMailNeo@web193006.mail.sg3.yahoo.com> Hi All, Need help in selecting the right fencing device for our HA cluster of two nodes . The server hardware used is HP Proliant Servers and OS we are using is CentOS 5 There are two options for us in selecting the fencing device . One is selecting a SAN Brocade switch. In this case , we will use ILO as secondary fencing device . Other option for us is using HP ILO as primary fencing device and IPMI fencing for secondary fencing Of the two options which will be better to go for configuring fencing .Any known issues with ILO or SAN Brocade switch in configuring fencing ?? Any suggestions will be greatly helpful.? Thanks Zaman -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Tue Oct 9 16:58:41 2012 From: lists at alteeve.ca (Digimer) Date: Tue, 09 Oct 2012 12:58:41 -0400 Subject: [Linux-cluster] Choosing a fencing device In-Reply-To: <1349799900.92783.YahooMailNeo@web193006.mail.sg3.yahoo.com> References: <1349799900.92783.YahooMailNeo@web193006.mail.sg3.yahoo.com> Message-ID: <507457C1.1020402@alteeve.ca> On 10/09/2012 12:25 PM, Zama Ques wrote: > Hi All, > > Need help in selecting the right fencing device for our HA cluster of > two nodes . The server hardware used is HP Proliant Servers and OS we > are using is CentOS 5 > > There are two options for us in selecting the fencing device . > > > One is selecting a SAN Brocade switch. In this case , we will use ILO as > secondary fencing device . > > Other option for us is using HP ILO as primary fencing device and IPMI > fencing for secondary fencing > > > Of the two options which will be better to go for configuring fencing > .Any known issues with ILO or SAN Brocade switch in configuring fencing > ? Any suggestions will be greatly helpful. > > Thanks > Zaman There is no benefit to use fence_ilo and fence_ipmilan as they work on the same device... If one fails, the other will, too. Personally, I'd use fence_ipmilan (more tested than fence_ilo) as primary and SAN fencing as a backup in case the out of band management fails (as could happen if the node lost it's power). The reason I recommend the oob interface as primary is that power fencing has a chance of recovering the node where fabric fencing merely cuts it off, which is fine for fencing, but stays offline until an admin solves the problem and unfences the nodes. -- Digimer Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." 
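A minimal cluster.conf sketch of the layered approach described above -- IPMI-based power fencing tried first, Brocade fabric fencing as the backup. The node names, addresses, logins and switch ports below are placeholders rather than anyone's actual setup, and fabric fencing normally also wants an <unfence> section so the switch port is re-enabled once the node is healthy again:

  <cluster name="example" config_version="2">
    <clusternodes>
      <clusternode name="node1" nodeid="1">
        <fence>
          <!-- tried first: power-cycle the node via its BMC -->
          <method name="ipmi">
            <device name="ipmi_n1"/>
          </method>
          <!-- fallback: cut the node off at the SAN switch -->
          <method name="san">
            <device name="brocade1" port="1"/>
          </method>
        </fence>
      </clusternode>
      <clusternode name="node2" nodeid="2">
        <fence>
          <method name="ipmi">
            <device name="ipmi_n2"/>
          </method>
          <method name="san">
            <device name="brocade1" port="2"/>
          </method>
        </fence>
      </clusternode>
    </clusternodes>
    <fencedevices>
      <fencedevice name="ipmi_n1" agent="fence_ipmilan" ipaddr="10.0.0.1" login="admin" passwd="secret" lanplus="1"/>
      <fencedevice name="ipmi_n2" agent="fence_ipmilan" ipaddr="10.0.0.2" login="admin" passwd="secret" lanplus="1"/>
      <fencedevice name="brocade1" agent="fence_brocade" ipaddr="10.0.0.10" login="admin" passwd="secret"/>
    </fencedevices>
  </cluster>

Each node can then be tested with "fence_node <nodename>" before relying on the cluster to fence automatically.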
From shanti.pahari at sierra.sg Wed Oct 10 07:06:30 2012 From: shanti.pahari at sierra.sg (Shanti Pahari) Date: Wed, 10 Oct 2012 15:06:30 +0800 (SGT) Subject: [Linux-cluster] cannot run cluster service Message-ID: Dear all, I have cluster setup with 2 node and created web cluster service on it but it cannot run. I have not listed anything in lvm.conf volume_list because once I add anything in volume_list and reboot the system then I cannot mount and even cannot read the lv which I created for my web . It throws error as error message "not activating volume group lv does not pass activation filter" Therefore I didn't add anything in lvm.conf . Then I try to start my cluster servers for web server but the service failed. Please help me so that I can solve this out. I have attached my cluster.conf , lvdisplay , /var/log/messages and lvm.conf and my /etc/hosts. I will be greatful if anyone can help me! Thanks And my clustat: Cluster Status for PDC-PIC-PL-CL @ Wed Oct 10 14:58:05 2012 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ PDC-PIC-PL-CL1 1 Online, Local, rgmanager PDC-PIC-PL-CL2 2 Online, rgmanager Service Name Owner (Last) State ------- ---- ----- ------ ----- service:ha-web-service (PDC-PIC-PL-CL1) failed -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cluster.conf Type: application/octet-stream Size: 1438 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: lvdisplay.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: lvm.conf Type: application/octet-stream Size: 24568 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: hosts.txt URL: From andrew at beekhof.net Wed Oct 10 10:47:17 2012 From: andrew at beekhof.net (Andrew Beekhof) Date: Wed, 10 Oct 2012 21:47:17 +1100 Subject: [Linux-cluster] [Pacemaker] Announce: pcs-0.9.26 In-Reply-To: <50736F78.3060906@redhat.com> References: <50736F78.3060906@redhat.com> Message-ID: On Tue, Oct 9, 2012 at 11:27 AM, Chris Feist wrote: > We've been making improvements to the pcs (pacemaker/corosync configuration > system) command line tool over the past few months. > > Currently you can setup a basic cluster (including configuring corosync 2.0 > udpu). > > David Vossel has also created a version of the "Clusters from Scratch" > document that illustrates setting up a cluster using pcs. This should be > showing up shortly. Its now available at the usual location: http://www.clusterlabs.org/doc > > You can view the source here: https://github.com/feist/pcs/ > > Or download the latest tarball: > https://github.com/downloads/feist/pcs/pcs-0.9.26.tar.gz > > There is also a Fedora 18 package that will be included with the next > release. You should be able to find that package in the following > locations... > > RPM: > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm > > SRPM: > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.src.rpm > > In the near future we are planning on having builds for SUSE & > Ubuntu/Debian. > > We're also actively working on a GUI/Daemon that will allow control of your > entire cluster from one node and/or a web browser. 
> > Please feel free to email me (cfeist at redhat.com) or open issues on the pcs > project at github (https://github.com/feist/pcs/issues) if you have any > questions or problems. > > Thanks! > Chris > > _______________________________________________ > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org From andrewd at sterling.net Wed Oct 10 15:27:02 2012 From: andrewd at sterling.net (Andrew Denton) Date: Wed, 10 Oct 2012 08:27:02 -0700 Subject: [Linux-cluster] cannot run cluster service In-Reply-To: References: Message-ID: <507593C6.5040708@sterling.net> On 10/10/2012 12:06 AM, Shanti Pahari wrote: > I have cluster setup with 2 node and created web cluster service on it > but it cannot run. > > I have not listed anything in lvm.conf volume_list because once I add > anything in volume_list and reboot the system then I cannot mount and > even cannot read the lv which I created for my web . It throws error as > > error message "not activating volume group lv does not pass activation > filter" > > Therefore I didn't add anything in lvm.conf . Then I try to start my > cluster servers for web server but the service failed. > I've seen this failure too when building my cluster. You either need to add the system's volume groups to volume_list, or tag the system's vgs with the @hostname so it can still activate them. e.g. volume_list = [ "vg_pdcpicpl01", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] on one node and volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] on the other. Next it will complain about initrd being older than lvm.conf, so I've been running # mkinitrd -f /boot/initrd-`uname -r`.img `uname -r` Not sure if that's the right command but it works for me =) One of these days I'm going to tag the system's vgs properly so I can use the same lvm.conf across the nodes. I think it's something like lvchange --addtag PDC-PIC-PL-CL1 vg_pdcpicpl01/lv_root etc... By the way, to display how things are tagged, you have to do lvs -o +tags I wish it displayed them in lvdisplay, but it doesn't. -- Andrew -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 897 bytes Desc: OpenPGP digital signature URL: From lists at alteeve.ca Wed Oct 10 15:50:30 2012 From: lists at alteeve.ca (Digimer) Date: Wed, 10 Oct 2012 11:50:30 -0400 Subject: [Linux-cluster] cannot run cluster service In-Reply-To: References: Message-ID: <50759946.7060303@alteeve.ca> Your fencing is not setup properly. You define the fence devices, but do not use them as 's under each node definition. You should be able to 'fence_node ' and watch it get rebooted. Until then, your cluster will hang (by design) the first time there is a problem. Assuming you want to use clustered LVM, you need to change locking_type to '3' and, I advise, change 'falllback_to_local_locking' to '0'. This also requires that the 'clvmd' daemon is running, which in turn needs cman to be running first. What is your backing device for the LVM PV? digimer On 10/10/2012 03:06 AM, Shanti Pahari wrote: > Dear all, > > I have cluster setup with 2 node and created web cluster service on it > but it cannot run. 
> > I have not listed anything in lvm.conf volume_list because once I add > anything in volume_list and reboot the system then I cannot mount and > even cannot read the lv which I created for my web . It throws error as > > error message "not activating volume group lv does not pass activation > filter" > > Therefore I didn?t add anything in lvm.conf . Then I try to start my > cluster servers for web server but the service failed. > > Please help me so that I can solve this out. > > I have attached my cluster.conf , lvdisplay , /var/log/messages and > lvm.conf and my /etc/hosts. > > I will be greatful if anyone can help me! > > Thanks > > And my clustat: > > Cluster Status for PDC-PIC-PL-CL @ Wed Oct 10 14:58:05 2012 > > Member Status: Quorate > > Member Name ID Status > > ------ ---- ---- ------ > > PDC-PIC-PL-CL1 1 > Online, Local, rgmanager > > PDC-PIC-PL-CL2 2 > Online, rgmanager > > Service Name Owner > (Last) State > > ------- ---- ----- > ------ ----- > > service:ha-web-service > (PDC-PIC-PL-CL1) failed > > > -- Digimer Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From CBurke at innova-partners.com Wed Oct 10 18:45:13 2012 From: CBurke at innova-partners.com (Chip Burke) Date: Wed, 10 Oct 2012 18:45:13 +0000 Subject: [Linux-cluster] Configure multiple heartbeat on a redhat cluster In-Reply-To: <1D241511770E2F4BA89AFD224EDD527141025BAE@G9W0733.americas.hpqcorp.net> Message-ID: I have been looking for an answer to this myself. The only answer I have found is using bonded interfaces. https://access.redhat.com/knowledge/node/48157 However, seeing that it uses multicast, I am not sure it say you have NICs on a production LAN and then NICs on an iSCSI LAN, that they all send/receive heartbeat packets to the multicast address on all attached LANs. ________________________________________ Chip Burke On 10/8/12 9:47 PM, "Chen, Ming Ming" wrote: > > > Hi, > Is there a way to configure multiple heartbeat network in the >/etc/cluster.conf file. >I'm using redhat cluster. >Regards >Ming > > >-- >Linux-cluster mailing list >Linux-cluster at redhat.com >https://www.redhat.com/mailman/listinfo/linux-cluster From lists at alteeve.ca Thu Oct 11 02:36:48 2012 From: lists at alteeve.ca (Digimer) Date: Wed, 10 Oct 2012 22:36:48 -0400 Subject: [Linux-cluster] [Pacemaker] Announce: pcs-0.9.26 In-Reply-To: <50736F78.3060906@redhat.com> References: <50736F78.3060906@redhat.com> Message-ID: <507630C0.2010607@alteeve.ca> On 10/08/2012 08:27 PM, Chris Feist wrote: > We've been making improvements to the pcs (pacemaker/corosync > configuration system) command line tool over the past few months. > > Currently you can setup a basic cluster (including configuring corosync > 2.0 udpu). > > David Vossel has also created a version of the "Clusters from Scratch" > document that illustrates setting up a cluster using pcs. This should > be showing up shortly. > > You can view the source here: https://github.com/feist/pcs/ > > Or download the latest tarball: > https://github.com/downloads/feist/pcs/pcs-0.9.26.tar.gz > > There is also a Fedora 18 package that will be included with the next > release. You should be able to find that package in the following > locations... 
> > RPM: > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm > > SRPM: > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.src.rpm > > In the near future we are planning on having builds for SUSE & > Ubuntu/Debian. > > We're also actively working on a GUI/Daemon that will allow control of > your entire cluster from one node and/or a web browser. > > Please feel free to email me (cfeist at redhat.com) or open issues on the > pcs project at github (https://github.com/feist/pcs/issues) if you have > any questions or problems. > > Thanks! > Chris Hi Chris, I started following Andrew's new pcs-based tutorial today on a fresh, minimal F17 x86_64 install. Section 2.5 of CfS-pcs shows; === yum install -y pcs 2.5 Setup # systemctl start pcsd.service # systemctl enable pcsd.service === This fails, and Andrew suggested using the version of pcs you annouced here. Same problem though; === [root at an-c01n01 ~]# rpm -Uvh http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm Retrieving http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm Preparing... ########################################### [100%] 1:pcs ########################################### [100%] [root at an-c01n01 ~]# systemctl start pcsd.service Failed to issue method call: Unit pcsd.service failed to load: No such file or directory. See system logs and 'systemctl status pcsd.service' for details. [root at an-c01n01 ~]# rpm -q pacemaker corosync pcs pacemaker-1.1.7-2.fc17.x86_64 corosync-2.0.1-1.fc17.x86_64 pcs-0.9.26-1.fc18.noarch === Any thoughts? Cheers! -- Digimer Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From shanti.pahari at sierra.sg Thu Oct 11 02:54:42 2012 From: shanti.pahari at sierra.sg (Shanti Pahari) Date: Thu, 11 Oct 2012 10:54:42 +0800 (SGT) Subject: [Linux-cluster] cannot run cluster service In-Reply-To: <507593C6.5040708@sterling.net> References: <507593C6.5040708@sterling.net> Message-ID: <27d23437.00001f6c.000000ff@sierra-A66> Hi Andrew, Now I added following in my lvm.conf volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] and # dracut --hostonly --force /boot/initramfs-$(uname -r).img $(uname -r) # shutdown -r now "Activating ramdisk LVM changes" After that when the system tries to boot up: Kernel panic - not syncing: Attempted to kill init! So didn't have luck L volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] The hostname should be cluster connect name or initial hostname ? My /etc/hosts: 192.168.24.32 PDC-PIC-PL-01 PDC-PIC-PL-01.chcs.sg 192.168.25.132 PDC-PIC-PL-01-PM 192.168.26.13 PDC-PIC-PL-CL1 192.168.24.33 PDC-PIC-PL-02 PDC-PIC-PL-02.chcs.sg 192.168.25.133 PDC-PIC-PL-02-PM 192.168.26.14 PDC-PIC-PL-CL2 From: Andrew Denton [mailto:andrewd at sterling.net] Sent: Wednesday, 10 October, 2012 11:27 PM To: linux clustering Cc: Shanti Pahari Subject: Re: [Linux-cluster] cannot run cluster service On 10/10/2012 12:06 AM, Shanti Pahari wrote: I have cluster setup with 2 node and created web cluster service on it but it cannot run. I have not listed anything in lvm.conf volume_list because once I add anything in volume_list and reboot the system then I cannot mount and even cannot read the lv which I created for my web . It throws error as error message "not activating volume group lv does not pass activation filter" Therefore I didn't add anything in lvm.conf . 
Then I try to start my cluster servers for web server but the service failed. I've seen this failure too when building my cluster. You either need to add the system's volume groups to volume_list, or tag the system's vgs with the @hostname so it can still activate them. e.g. volume_list = [ "vg_pdcpicpl01", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] on one node and volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] on the other. Next it will complain about initrd being older than lvm.conf, so I've been running # mkinitrd -f /boot/initrd-`uname -r`.img `uname -r` Not sure if that's the right command but it works for me =) One of these days I'm going to tag the system's vgs properly so I can use the same lvm.conf across the nodes. I think it's something like lvchange --addtag PDC-PIC-PL-CL1 vg_pdcpicpl01/lv_root etc... By the way, to display how things are tagged, you have to do lvs -o +tags I wish it displayed them in lvdisplay, but it doesn't. -- Andrew -------------- next part -------------- An HTML attachment was scrubbed... URL: From shanti.pahari at sierra.sg Thu Oct 11 03:02:11 2012 From: shanti.pahari at sierra.sg (Shanti Pahari) Date: Thu, 11 Oct 2012 11:02:11 +0800 (SGT) Subject: [Linux-cluster] cannot run cluster service In-Reply-To: <50759946.7060303@alteeve.ca> References: <50759946.7060303@alteeve.ca> Message-ID: Thanks! I updated the fencing method in each of the nodes. But after I added volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] and # dracut --hostonly --force /boot/initramfs-$(uname -r).img $(uname -r) # shutdown -r now "Activating ramdisk LVM changes" After that when the system tries to boot up: Kernel panic ? not syncing: Attempted to kill init! So didn?t have luck ? Can you help me pls! -----Original Message----- From: Digimer [mailto:lists at alteeve.ca] Sent: Wednesday, 10 October, 2012 11:51 PM To: linux clustering Cc: Shanti Pahari Subject: Re: [Linux-cluster] cannot run cluster service Your fencing is not setup properly. You define the fence devices, but do not use them as 's under each node definition. You should be able to 'fence_node ' and watch it get rebooted. Until then, your cluster will hang (by design) the first time there is a problem. Assuming you want to use clustered LVM, you need to change locking_type to '3' and, I advise, change 'falllback_to_local_locking' to '0'. This also requires that the 'clvmd' daemon is running, which in turn needs cman to be running first. What is your backing device for the LVM PV? digimer On 10/10/2012 03:06 AM, Shanti Pahari wrote: > Dear all, > > I have cluster setup with 2 node and created web cluster service on it > but it cannot run. > > I have not listed anything in lvm.conf volume_list because once I add > anything in volume_list and reboot the system then I cannot mount and > even cannot read the lv which I created for my web . It throws error > as > > error message "not activating volume group lv does not pass activation > filter" > > Therefore I didn?t add anything in lvm.conf . Then I try to start my > cluster servers for web server but the service failed. > > Please help me so that I can solve this out. > > I have attached my cluster.conf , lvdisplay , /var/log/messages and > lvm.conf and my /etc/hosts. > > I will be greatful if anyone can help me! 
> > Thanks > > And my clustat: > > Cluster Status for PDC-PIC-PL-CL @ Wed Oct 10 14:58:05 2012 > > Member Status: Quorate > > Member Name ID > Status > > ------ ---- ---- ------ > > PDC-PIC-PL-CL1 1 > Online, Local, rgmanager > > PDC-PIC-PL-CL2 2 > Online, rgmanager > > Service Name Owner > (Last) State > > ------- ---- ----- > ------ ----- > > service:ha-web-service > (PDC-PIC-PL-CL1) failed > > > -- Digimer Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From ali.bendriss at gmail.com Thu Oct 11 10:03:02 2012 From: ali.bendriss at gmail.com (Ali Bendriss) Date: Thu, 11 Oct 2012 12:03:02 +0200 Subject: [Linux-cluster] locking_type Message-ID: <1552642.6zb82LHdvl@zapp> Hello, I'm runnning a two nodes clusters on linux: - current setup: cluster-3.1.93 LVM2-2.02.96 kernel 3.2.29 - previous setup: cluster-3.1.92 LVM2.2.02.84 kernel 3.4.3 Since the updrade to current setup, I'm only able to run clvmd if it is compiled using "--with-cluster=shared" and setting the locking_type = 2 in clvm.conf before that using the previous setup I was able to compile clvmd using "--with-cluster=internal" and setting the locking_type = 3 Is there any problem running a gfs2 fs with clvmd using the external shared library locking_library ? thanks -- Ali From ali.bendriss at gmail.com Thu Oct 11 10:13:45 2012 From: ali.bendriss at gmail.com (Ali Bendriss) Date: Thu, 11 Oct 2012 12:13:45 +0200 Subject: [Linux-cluster] snapshot status Message-ID: <1780785.LbQT9riKYZ@zapp> Hello, I'm runnning a two nodes clusters on linux using gfs2 (cluster-3.1.93, LVM2-2.02.96, kernel 3.2.29). I would like to know what is the current status of the snapshot support. I've got a third node that I would like to use for the backup. Could someone give me some hint about backuping a gfs2 shared file system. thanks -- Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From agk at redhat.com Thu Oct 11 10:36:05 2012 From: agk at redhat.com (Alasdair G Kergon) Date: Thu, 11 Oct 2012 11:36:05 +0100 Subject: [Linux-cluster] locking_type In-Reply-To: <1552642.6zb82LHdvl@zapp> References: <1552642.6zb82LHdvl@zapp> Message-ID: <20121011103604.GD2133@agk-dp.fab.redhat.com> On Thu, Oct 11, 2012 at 12:03:02PM +0200, Ali Bendriss wrote: > Since the updrade to current setup, I'm only able to run clvmd if it is > compiled using "--with-cluster=shared" and setting the locking_type = 2 in > clvm.conf What is the error you get? I don't think we stopped this working intentionally, but admittedly it's not a configuration we test very often. Alasdair From ali.bendriss at gmail.com Thu Oct 11 11:19:09 2012 From: ali.bendriss at gmail.com (Ali Bendriss) Date: Thu, 11 Oct 2012 13:19:09 +0200 Subject: [Linux-cluster] locking_type In-Reply-To: <20121011103604.GD2133@agk-dp.fab.redhat.com> References: <1552642.6zb82LHdvl@zapp> <20121011103604.GD2133@agk-dp.fab.redhat.com> Message-ID: <2094091.Y1l6vVcmm9@zapp> On Thursday, October 11, 2012 11:36:05 AM Alasdair G Kergon wrote: > On Thu, Oct 11, 2012 at 12:03:02PM +0200, Ali Bendriss wrote: > > Since the updrade to current setup, I'm only able to run clvmd if it is > > compiled using "--with-cluster=shared" and setting the locking_type = 2 in > > clvm.conf > > What is the error you get? 
> Starting the node with locking type = 3 and clvmd compiled using with- cluster=internal , I've got no error: clvmd: Cluster LVM daemon started - connected to CMAN but running for example vgscan desn't work: /sbin/vgscan connect() failed on local socket: No such file or directory Internal cluster locking initialisation failed. WARNING: Falling back to local file-based locking. Volume Groups with the clustered attribute will be inaccessible. Reading all physical volumes. This may take a while... Skipping clustered volume group samba4 Skipping clustered volume group ctdb Skipping clustered volume group shared Found volume group "main" using metadata type lvm2 the same command is working using clvmd compiled using the shared locking. more log below > I don't think we stopped this working intentionally, but admittedly it's > not a configuration we test very often. > I was thinking that "--with-cluster=internal", was the recommended configuration. > Alasdair -------------------------------------------------------------------------------------------- (using locking type = 3) and calling vgscan #clvmd -d 1 CLVMD[9d22b740]: Oct 11 13:03:51 CLVMD started CLVMD[9d22b740]: Oct 11 13:03:51 Connected to CMAN CLVMD[9d22b740]: Oct 11 13:03:51 CMAN initialisation complete CLVMD[9d22b740]: Oct 11 13:03:51 Created DLM lockspace for CLVMD. CLVMD[9d22b740]: Oct 11 13:03:51 DLM initialisation complete CLVMD[9d22b740]: Oct 11 13:03:51 Cluster ready, doing some more initialisation CLVMD[9d22b740]: Oct 11 13:03:51 starting LVM thread CLVMD[9d22a700]: Oct 11 13:03:51 LVM thread function started WARNING: Locking disabled. Be careful! This could corrupt your metadata. Incorrect metadata area header checksum on /dev/sdd at offset 4096 CLVMD[9d22a700]: Oct 11 13:03:51 getting initial lock for ve26qDQ7hcpgDH2fw19GFZkbKgTadysCNUNjh9w8HFdbVvQLBjZidl8QseraUBc0 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: 've26qDQ7hcpgDH2fw19GFZkbKgTadysCNUNjh9w8HFdbVvQLBjZidl8QseraUBc0' mode:1 flags=1 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: returning lkid 1 CLVMD[9d22a700]: Oct 11 13:03:51 getting initial lock for ve26qDQ7hcpgDH2fw19GFZkbKgTadysCAXDuTvYJ4ambKnLALpOffDSxPrjHliO0 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: 've26qDQ7hcpgDH2fw19GFZkbKgTadysCAXDuTvYJ4ambKnLALpOffDSxPrjHliO0' mode:1 flags=1 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: returning lkid 2 CLVMD[9d22a700]: Oct 11 13:03:51 getting initial lock for ByYnMJeHNgSIBuJENA2WLMe148edovbofR4f9clPHk2BUveeSUstcSEJzcOHt2BE CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: 'ByYnMJeHNgSIBuJENA2WLMe148edovbofR4f9clPHk2BUveeSUstcSEJzcOHt2BE' mode:1 flags=1 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: returning lkid 3 CLVMD[9d22a700]: Oct 11 13:03:51 getting initial lock for GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx6QSKUI7riC86QhzIX98cmu8rL4lHXJlO CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: 'GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx6QSKUI7riC86QhzIX98cmu8rL4lHXJlO' mode:1 flags=1 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: returning lkid 4 CLVMD[9d22a700]: Oct 11 13:03:51 getting initial lock for GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjxS36cpLJRPxWYtjPQChMIQZW7Zxx97aGL CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: 'GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjxS36cpLJRPxWYtjPQChMIQZW7Zxx97aGL' mode:1 flags=1 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: returning lkid 5 CLVMD[9d22a700]: Oct 11 13:03:51 getting initial lock for GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx7Vov15UStgsS1tgOISASG7bPjYf7NpYO CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: 'GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx7Vov15UStgsS1tgOISASG7bPjYf7NpYO' mode:1 flags=1 CLVMD[9d22a700]: 
Oct 11 13:03:51 sync_lock: returning lkid 6 CLVMD[9d22a700]: Oct 11 13:03:51 getting initial lock for GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx1OWkvGiyJGfz9u5Pcedbzj4hnT2Q6TY0 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: 'GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx1OWkvGiyJGfz9u5Pcedbzj4hnT2Q6TY0' mode:1 flags=1 CLVMD[9d22a700]: Oct 11 13:03:51 sync_lock: returning lkid 7 CLVMD[9d22a700]: Oct 11 13:03:51 Sub thread ready for work. CLVMD[9d22b740]: Oct 11 13:03:51 clvmd ready for work CLVMD[9d22b740]: Oct 11 13:03:51 Using timeout of 60 seconds CLVMD[9d22a700]: Oct 11 13:03:51 LVM thread waiting for work ------------------------------------------------------------------------------------------------------------------------------------- using locking type = 2 and calling vgscan # clvmd -d 1 CLVMD[6d153740]: Oct 11 13:07:57 CLVMD started CLVMD[6d153740]: Oct 11 13:07:57 Connected to CMAN CLVMD[6d153740]: Oct 11 13:07:57 CMAN initialisation complete CLVMD[6d153740]: Oct 11 13:07:57 Created DLM lockspace for CLVMD. CLVMD[6d153740]: Oct 11 13:07:57 DLM initialisation complete CLVMD[6d153740]: Oct 11 13:07:57 Cluster ready, doing some more initialisation CLVMD[6d153740]: Oct 11 13:07:57 starting LVM thread CLVMD[6d152700]: Oct 11 13:07:57 LVM thread function started WARNING: Locking disabled. Be careful! This could corrupt your metadata. Incorrect metadata area header checksum on /dev/sdd at offset 4096 CLVMD[6d152700]: Oct 11 13:07:57 getting initial lock for ve26qDQ7hcpgDH2fw19GFZkbKgTadysCNUNjh9w8HFdbVvQLBjZidl8QseraUBc0 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: 've26qDQ7hcpgDH2fw19GFZkbKgTadysCNUNjh9w8HFdbVvQLBjZidl8QseraUBc0' mode:1 flags=1 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: returning lkid 1 CLVMD[6d152700]: Oct 11 13:07:57 getting initial lock for ve26qDQ7hcpgDH2fw19GFZkbKgTadysCAXDuTvYJ4ambKnLALpOffDSxPrjHliO0 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: 've26qDQ7hcpgDH2fw19GFZkbKgTadysCAXDuTvYJ4ambKnLALpOffDSxPrjHliO0' mode:1 flags=1 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: returning lkid 2 CLVMD[6d152700]: Oct 11 13:07:57 getting initial lock for ByYnMJeHNgSIBuJENA2WLMe148edovbofR4f9clPHk2BUveeSUstcSEJzcOHt2BE CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: 'ByYnMJeHNgSIBuJENA2WLMe148edovbofR4f9clPHk2BUveeSUstcSEJzcOHt2BE' mode:1 flags=1 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: returning lkid 3 CLVMD[6d152700]: Oct 11 13:07:57 getting initial lock for GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx6QSKUI7riC86QhzIX98cmu8rL4lHXJlO CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: 'GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx6QSKUI7riC86QhzIX98cmu8rL4lHXJlO' mode:1 flags=1 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: returning lkid 4 CLVMD[6d152700]: Oct 11 13:07:57 getting initial lock for GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjxS36cpLJRPxWYtjPQChMIQZW7Zxx97aGL CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: 'GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjxS36cpLJRPxWYtjPQChMIQZW7Zxx97aGL' mode:1 flags=1 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: returning lkid 5 CLVMD[6d152700]: Oct 11 13:07:57 getting initial lock for GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx7Vov15UStgsS1tgOISASG7bPjYf7NpYO CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: 'GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx7Vov15UStgsS1tgOISASG7bPjYf7NpYO' mode:1 flags=1 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: returning lkid 6 CLVMD[6d152700]: Oct 11 13:07:57 getting initial lock for GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx1OWkvGiyJGfz9u5Pcedbzj4hnT2Q6TY0 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: 'GkJTcHFLl6YNc8M0QN7yYA0nWwhzHKjx1OWkvGiyJGfz9u5Pcedbzj4hnT2Q6TY0' mode:1 flags=1 CLVMD[6d152700]: Oct 11 13:07:57 sync_lock: 
returning lkid 7 Incorrect LVM locking library specified in lvm.conf, cluster operations may not work. CLVMD[6d152700]: Oct 11 13:07:57 Sub thread ready for work. CLVMD[6d152700]: Oct 11 13:07:57 LVM thread waiting for work CLVMD[6d153740]: Oct 11 13:07:57 clvmd ready for work CLVMD[6d153740]: Oct 11 13:07:57 Using timeout of 60 seconds CLVMD[6d153740]: Oct 11 13:08:56 Got new connection on fd 11 CLVMD[6d153740]: Oct 11 13:08:56 Read on local socket 11, len = 29 CLVMD[6d153740]: Oct 11 13:08:56 check_all_clvmds_running CLVMD[6d153740]: Oct 11 13:08:56 creating pipe, [12, 13] CLVMD[6d153740]: Oct 11 13:08:56 Creating pre&post thread CLVMD[6d153740]: Oct 11 13:08:56 Created pre&post thread, state = 0 CLVMD[6d131700]: Oct 11 13:08:56 in sub thread: client = 0x12d06d0 CLVMD[6d131700]: Oct 11 13:08:56 doing PRE command LOCK_VG 'P_#global' at 4 (client=0x12d06d0) CLVMD[6d131700]: Oct 11 13:08:56 sync_lock: 'P_#global' mode:4 flags=0 CLVMD[6d131700]: Oct 11 13:08:56 sync_lock: returning lkid 8 CLVMD[6d131700]: Oct 11 13:08:56 Writing status 0 down pipe 13 CLVMD[6d131700]: Oct 11 13:08:56 Waiting to do post command - state = 0 CLVMD[6d153740]: Oct 11 13:08:56 read on PIPE 12: 4 bytes: status: 0 CLVMD[6d153740]: Oct 11 13:08:56 background routine status was 0, sock_client=0x12d06d0 CLVMD[6d153740]: Oct 11 13:08:56 distribute command: XID = 0, flags=0x0 () CLVMD[6d153740]: Oct 11 13:08:56 add_to_lvmqueue: cmd=0x12d0a10. client=0x12d06d0, msg=0x12d02f0, len=29, csid=(nil), xid=0 CLVMD[6d153740]: Oct 11 13:08:56 Sending message to all cluster nodes CLVMD[6d152700]: Oct 11 13:08:56 process_work_item: local CLVMD[6d152700]: Oct 11 13:08:56 process_local_command: LOCK_VG (0x33) msg=0x12d0a50, msglen =29, client=0x12d06d0 CLVMD[6d152700]: Oct 11 13:08:56 do_lock_vg: resource 'P_#global', cmd = 0x4 LCK_VG (WRITE|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0 CLVMD[6d152700]: Oct 11 13:08:56 Refreshing context Incorrect metadata area header checksum on /dev/sdd at offset 4096 CLVMD[6d152700]: Oct 11 13:08:56 Reply from node node-10: 0 bytes CLVMD[6d152700]: Oct 11 13:08:56 Got 1 replies, expecting: 2 CLVMD[6d152700]: Oct 11 13:08:56 LVM thread waiting for work CLVMD[6d153740]: Oct 11 13:08:56 Reply from node node-11: 0 bytes CLVMD[6d153740]: Oct 11 13:08:56 Got 2 replies, expecting: 2 CLVMD[6d131700]: Oct 11 13:08:56 Got post command condition... CLVMD[6d131700]: Oct 11 13:08:56 Waiting for next pre command CLVMD[6d153740]: Oct 11 13:08:56 read on PIPE 12: 4 bytes: status: 0 CLVMD[6d153740]: Oct 11 13:08:56 background routine status was 0, sock_client=0x12d06d0 CLVMD[6d153740]: Oct 11 13:08:56 Send local reply CLVMD[6d153740]: Oct 11 13:08:56 Read on local socket 11, len = 28 CLVMD[6d131700]: Oct 11 13:08:56 Got pre command condition... CLVMD[6d131700]: Oct 11 13:08:56 doing PRE command LOCK_VG 'V_samba4' at 1 (client=0x12d06d0) CLVMD[6d131700]: Oct 11 13:08:56 sync_lock: 'V_samba4' mode:3 flags=0 CLVMD[6d131700]: Oct 11 13:08:56 sync_lock: returning lkid 9 CLVMD[6d131700]: Oct 11 13:08:56 Writing status 0 down pipe 13 CLVMD[6d131700]: Oct 11 13:08:56 Waiting to do post command - state = 0 CLVMD[6d153740]: Oct 11 13:08:56 read on PIPE 12: 4 bytes: status: 0 CLVMD[6d153740]: Oct 11 13:08:56 background routine status was 0, sock_client=0x12d06d0 CLVMD[6d153740]: Oct 11 13:08:56 distribute command: XID = 1, flags=0x1 (LOCAL) CLVMD[6d153740]: Oct 11 13:08:56 add_to_lvmqueue: cmd=0x12d0a10. 
client=0x12d06d0, msg=0x12d02f0, len=28, csid=(nil), xid=1 CLVMD[6d152700]: Oct 11 13:08:56 process_work_item: local CLVMD[6d152700]: Oct 11 13:08:56 process_local_command: LOCK_VG (0x33) msg=0x12d0a50, msglen =28, client=0x12d06d0 CLVMD[6d152700]: Oct 11 13:08:56 do_lock_vg: resource 'V_samba4', cmd = 0x1 LCK_VG (READ|VG), flags = 0x4 ( DMEVENTD_MONITOR ), critical_section = 0 CLVMD[6d152700]: Oct 11 13:08:56 Invalidating cached metadata for VG samba4 ... From jpokorny at redhat.com Thu Oct 11 11:25:54 2012 From: jpokorny at redhat.com (Jan =?utf-8?Q?Pokorn=C3=BD?=) Date: Thu, 11 Oct 2012 13:25:54 +0200 Subject: [Linux-cluster] [Pacemaker] Announce: pcs-0.9.26 In-Reply-To: <507630C0.2010607@alteeve.ca> References: <50736F78.3060906@redhat.com> <507630C0.2010607@alteeve.ca> Message-ID: <20121011112554.GC29887@redhat.com> Hello Digimer, On 10/10/12 22:36 -0400, Digimer wrote: > I started following Andrew's new pcs-based tutorial today on a fresh, > minimal F17 x86_64 install. Section 2.5 of CfS-pcs shows; > > === > yum install -y pcs > > 2.5 Setup > > > > # systemctl start pcsd.service > # systemctl enable pcsd.service > === > > This fails, and Andrew suggested using the version of pcs you annouced here. > Same problem though; > > === > [root at an-c01n01 ~]# rpm -Uvh > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm > Retrieving http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm > Preparing... ########################################### > [100%] > 1:pcs ########################################### > [100%] > [root at an-c01n01 ~]# systemctl start pcsd.service > Failed to issue method call: Unit pcsd.service failed to load: No such file > or directory. See system logs and 'systemctl status pcsd.service' for > details. > > [...] > > Any thoughts? this is part of pcs-gui project [1] packaging of which is probably pending. [1] https://github.com/feist/pcs-gui -- Jan From shanti.pahari at sierra.sg Thu Oct 11 13:43:23 2012 From: shanti.pahari at sierra.sg (Shanti Pahari) Date: Thu, 11 Oct 2012 21:43:23 +0800 (SGT) Subject: [Linux-cluster] lvm.conf for HA LVM Message-ID: <0a55cf02.00000e4c.00000039@sierra-A66> Hi all, Please help me to configure lvm.conf . I always get problem when I add volume_list in lvm.conf file. After I add volume_list = [ "vg_pdcpicpl101" , "@pdc-pic-pl-01" ] and reboot it later my other lvm goes down L cannot initialize it. Or do I need to add all my VG in volume list? Please help! Thanks, Shanti -------------- next part -------------- An HTML attachment was scrubbed... URL: From bergman at merctech.com Thu Oct 11 17:01:37 2012 From: bergman at merctech.com (bergman at merctech.com) Date: Thu, 11 Oct 2012 13:01:37 -0400 Subject: [Linux-cluster] cannot run cluster service In-Reply-To: Your message of "Thu, 11 Oct 2012 11:02:11 +0800." References: <50759946.7060303@alteeve.ca> Message-ID: <4136.1349974897@localhost> In the message dated: Thu, 11 Oct 2012 11:02:11 +0800, The pithy ruminations from "Shanti Pahari" on were: => Thanks! => => I updated the fencing method in each of the nodes. But after I added => volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] => and => # dracut --hostonly --force /boot/initramfs-$(uname -r).img $(uname -r) Note that the lvm.conf data that is stored within the initrd.img is not an exact copy of /etc/lvm/lvm.conf, but it is a filtered version, created by "lvm dumpconfig". Previous bugs have meant that the embedded version was not alway bootable...perhaps you're having a similar problem. 
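One quick way to check which lvm.conf a given ramdisk actually carries is to extract just that file and diff it against the on-disk copy. This is only a sketch, assuming a gzip-compressed dracut image; the image path is an example:

# mkdir /tmp/initrd-check && cd /tmp/initrd-check
# gunzip -c /boot/initramfs-$(uname -r).img | cpio -idmv '*etc/lvm/lvm.conf'
# diff -u etc/lvm/lvm.conf /etc/lvm/lvm.conf

If the two copies differ in volume_list (or the embedded copy is missing it entirely), the ramdisk needs to be rebuilt before the change takes effect at boot.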
=> # shutdown -r now "Activating ramdisk LVM changes" => => After that when the system tries to boot up: => Kernel panic ??? not syncing: Attempted to kill init! I don't recall information from you about what OS distribution you are using, but this problem sounds similar to: https://bugzilla.redhat.com/show_bug.cgi?id=517868 Before giving up on LVM and and moving all filesystem management out of RHCS control, I had several problems with the the embedded copy of /etc/lvm.conf that's stored within the initrd image. I ended up with a procedure where I would replace the embedded version (produced with "lvm dumpconfig" by the mkinitrd process) inside the initrd image with the full working version. This procedure was necessary after any kernel or lvm.conf change. An abbreviated set of steps (ommitting all error checking, etc) is: gunzip < $initrdfile > /tmp/initrd.unzipped.img cpio -i --make-directories < /tmp/initrd.unzipped.img cd /tmp/initrd.$$/etc/lvm cp /etc/lvm/lvm.conf . cd /tmp/initrd.$$ find ./ | cpio -H newc -o > /tmp/initrd.unzipped.img gzip < /tmp/initrd.unzipped.img > /tmp/initrd.zipped.img mv /tmp/initrd.zipped.img $initrdfile I do not recommend this as a standard procedure, but it might be worth doing once, in order to see whether the embedded version of the lvm.conf file used in the initrd.img is causing your problem. Mark => => So didn???t have luck ??? => => Can you help me pls! => => -----Original Message----- => From: Digimer [mailto:lists at alteeve.ca] => Sent: Wednesday, 10 October, 2012 11:51 PM => To: linux clustering => Cc: Shanti Pahari => Subject: Re: [Linux-cluster] cannot run cluster service => => Your fencing is not setup properly. You define the fence devices, but do not => use them as 's under each node definition. You should be able to => 'fence_node ' and watch it get rebooted. Until then, your cluster => will hang (by design) the first time there is a problem. => => Assuming you want to use clustered LVM, you need to change locking_type to => '3' and, I advise, change 'falllback_to_local_locking' to '0'. This also => requires that the 'clvmd' daemon is running, which in turn needs cman to be => running first. => => What is your backing device for the LVM PV? => => digimer => => On 10/10/2012 03:06 AM, Shanti Pahari wrote: => > Dear all, => > => > I have cluster setup with 2 node and created web cluster service on it => > but it cannot run. => > => > I have not listed anything in lvm.conf volume_list because once I add => > anything in volume_list and reboot the system then I cannot mount and => > even cannot read the lv which I created for my web . It throws error => > as => > => > error message "not activating volume group lv does not pass activation => > filter" => > => > Therefore I didn???t add anything in lvm.conf . Then I try to start my => > cluster servers for web server but the service failed. => > => > Please help me so that I can solve this out. => > => > I have attached my cluster.conf , lvdisplay , /var/log/messages and => > lvm.conf and my /etc/hosts. => > => > I will be greatful if anyone can help me! 
=> > => > Thanks => > => > And my clustat: => > => > Cluster Status for PDC-PIC-PL-CL @ Wed Oct 10 14:58:05 2012 => > => > Member Status: Quorate => > => > Member Name ID => > Status => > => > ------ ---- ---- ------ => > => > PDC-PIC-PL-CL1 1 => > Online, Local, rgmanager => > => > PDC-PIC-PL-CL2 2 => > Online, rgmanager => > => > Service Name Owner => > (Last) State => > => > ------- ---- ----- => > ------ ----- => > => > service:ha-web-service => > (PDC-PIC-PL-CL1) failed => > => > => > => => => -- => Digimer => Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, => odorless gas which, if left alone in sufficient quantities for long periods => of time, begins to think about itself." => => -- => Linux-cluster mailing list => Linux-cluster at redhat.com => https://www.redhat.com/mailman/listinfo/linux-cluster => From lists at alteeve.ca Thu Oct 11 17:14:10 2012 From: lists at alteeve.ca (Digimer) Date: Thu, 11 Oct 2012 13:14:10 -0400 Subject: [Linux-cluster] cannot run cluster service In-Reply-To: References: <50759946.7060303@alteeve.ca> Message-ID: <5076FE62.9000707@alteeve.ca> I don't think you addressed any of my comments. On 10/10/2012 11:02 PM, Shanti Pahari wrote: > Thanks! > > I updated the fencing method in each of the nodes. But after I added > volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] > and > # dracut --hostonly --force /boot/initramfs-$(uname -r).img $(uname -r) > # shutdown -r now "Activating ramdisk LVM changes" > > After that when the system tries to boot up: > Kernel panic ? not syncing: Attempted to kill init! > > So didn?t have luck ? > > Can you help me pls! > > -----Original Message----- > From: Digimer [mailto:lists at alteeve.ca] > Sent: Wednesday, 10 October, 2012 11:51 PM > To: linux clustering > Cc: Shanti Pahari > Subject: Re: [Linux-cluster] cannot run cluster service > > Your fencing is not setup properly. You define the fence devices, but do not > use them as 's under each node definition. You should be able to > 'fence_node ' and watch it get rebooted. Until then, your cluster > will hang (by design) the first time there is a problem. > > Assuming you want to use clustered LVM, you need to change locking_type to > '3' and, I advise, change 'falllback_to_local_locking' to '0'. This also > requires that the 'clvmd' daemon is running, which in turn needs cman to be > running first. > > What is your backing device for the LVM PV? > > digimer > > On 10/10/2012 03:06 AM, Shanti Pahari wrote: >> Dear all, >> >> I have cluster setup with 2 node and created web cluster service on it >> but it cannot run. >> >> I have not listed anything in lvm.conf volume_list because once I add >> anything in volume_list and reboot the system then I cannot mount and >> even cannot read the lv which I created for my web . It throws error >> as >> >> error message "not activating volume group lv does not pass activation >> filter" >> >> Therefore I didn?t add anything in lvm.conf . Then I try to start my >> cluster servers for web server but the service failed. >> >> Please help me so that I can solve this out. >> >> I have attached my cluster.conf , lvdisplay , /var/log/messages and >> lvm.conf and my /etc/hosts. >> >> I will be greatful if anyone can help me! 
>> >> Thanks >> >> And my clustat: >> >> Cluster Status for PDC-PIC-PL-CL @ Wed Oct 10 14:58:05 2012 >> >> Member Status: Quorate >> >> Member Name ID >> Status >> >> ------ ---- ---- ------ >> >> PDC-PIC-PL-CL1 1 >> Online, Local, rgmanager >> >> PDC-PIC-PL-CL2 2 >> Online, rgmanager >> >> Service Name Owner >> (Last) State >> >> ------- ---- ----- >> ------ ----- >> >> service:ha-web-service >> (PDC-PIC-PL-CL1) failed >> >> >> > > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, > odorless gas which, if left alone in sufficient quantities for long periods > of time, begins to think about itself." > -- Digimer Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From mkathuria at tuxtechnologies.co.in Thu Oct 11 17:36:17 2012 From: mkathuria at tuxtechnologies.co.in (Manish Kathuria) Date: Thu, 11 Oct 2012 23:06:17 +0530 Subject: [Linux-cluster] lvm.conf for HA LVM In-Reply-To: <0a55cf02.00000e4c.00000039@sierra-A66> References: <0a55cf02.00000e4c.00000039@sierra-A66> Message-ID: On Thu, Oct 11, 2012 at 7:13 PM, Shanti Pahari wrote: > Hi all, > > > > Please help me to configure lvm.conf . > > I always get problem when I add volume_list in lvm.conf file. > > After I add volume_list = [ ?vg_pdcpicpl101? , ?@pdc-pic-pl-01? ] and > reboot it later my other lvm goes down L cannot initialize it. Or do I need > to add all my VG in volume list? The Volume Groups which are to be shared using HA LVM should not be added to this volume list. They need to be included as resources in the cluster configuration. > You can refer to the following document for the steps to configure HA LVM https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/ap-ha-halvm-CA.html Thanks, -- Manish Kathuria From gumbs.alfred at att.net Thu Oct 11 17:49:25 2012 From: gumbs.alfred at att.net (Alfred Gumbs) Date: Thu, 11 Oct 2012 12:49:25 -0500 Subject: [Linux-cluster] cannot run cluster service In-Reply-To: <27d23437.00001f6c.000000ff@sierra-A66> References: <507593C6.5040708@sterling.net> <27d23437.00001f6c.000000ff@sierra-A66> Message-ID: <318E18D064474C05AA6F41B04602BF0A@AlfredPC> I'm not certain of your complete configuration. However looking at the entry that you placed in the volume_list. It looks like you listed the VG that is part of your cluster's resource group. However, the volume_lists should actually list all the volume groups that are not part of the cluster. The VG's in the volume_list are actually the ones that the system needs to bring up. The volume group that are part of the cluster will be brought up by rgmanager, so they should not be in volumes_list. The reason for the system panic is because the required volume groups were not listed in volume_list. So the kernel could not varyon the system VG. If I have mistakenly interprettted what you've done I'm sorry. ----- Original Message ----- From: Shanti Pahari To: 'Andrew Denton' ; 'linux clustering' Sent: Wednesday, October 10, 2012 9:54 PM Subject: Re: [Linux-cluster] cannot run cluster service Hi Andrew, Now I added following in my lvm.conf volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] and # dracut --hostonly --force /boot/initramfs-$(uname -r).img $(uname -r) # shutdown -r now "Activating ramdisk LVM changes" After that when the system tries to boot up: Kernel panic - not syncing: Attempted to kill init! 
So didn't have luck L volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] The hostname should be cluster connect name or initial hostname ? My /etc/hosts: 192.168.24.32 PDC-PIC-PL-01 PDC-PIC-PL-01.chcs.sg 192.168.25.132 PDC-PIC-PL-01-PM 192.168.26.13 PDC-PIC-PL-CL1 192.168.24.33 PDC-PIC-PL-02 PDC-PIC-PL-02.chcs.sg 192.168.25.133 PDC-PIC-PL-02-PM 192.168.26.14 PDC-PIC-PL-CL2 From: Andrew Denton [mailto:andrewd at sterling.net] Sent: Wednesday, 10 October, 2012 11:27 PM To: linux clustering Cc: Shanti Pahari Subject: Re: [Linux-cluster] cannot run cluster service On 10/10/2012 12:06 AM, Shanti Pahari wrote: I have cluster setup with 2 node and created web cluster service on it but it cannot run. I have not listed anything in lvm.conf volume_list because once I add anything in volume_list and reboot the system then I cannot mount and even cannot read the lv which I created for my web . It throws error as error message "not activating volume group lv does not pass activation filter" Therefore I didn't add anything in lvm.conf . Then I try to start my cluster servers for web server but the service failed. I've seen this failure too when building my cluster. You either need to add the system's volume groups to volume_list, or tag the system's vgs with the @hostname so it can still activate them. e.g. volume_list = [ "vg_pdcpicpl01", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] on one node and volume_list = [ "vg_pdcpicpl02", "@PDC-PIC-PL-CL1", "@PDC-PIC-PL-CL2" ] on the other. Next it will complain about initrd being older than lvm.conf, so I've been running # mkinitrd -f /boot/initrd-`uname -r`.img `uname -r` Not sure if that's the right command but it works for me =) One of these days I'm going to tag the system's vgs properly so I can use the same lvm.conf across the nodes. I think it's something like lvchange --addtag PDC-PIC-PL-CL1 vg_pdcpicpl01/lv_root etc... By the way, to display how things are tagged, you have to do lvs -o +tags I wish it displayed them in lvdisplay, but it doesn't. -- Andrew ------------------------------------------------------------------------------ -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Thu Oct 11 18:46:49 2012 From: lists at alteeve.ca (Digimer) Date: Thu, 11 Oct 2012 14:46:49 -0400 Subject: [Linux-cluster] [Pacemaker] Announce: pcs-0.9.26 In-Reply-To: <20121011112554.GC29887@redhat.com> References: <50736F78.3060906@redhat.com> <507630C0.2010607@alteeve.ca> <20121011112554.GC29887@redhat.com> Message-ID: <50771419.1000306@alteeve.ca> On 10/11/2012 07:25 AM, Jan Pokorn? wrote: > Hello Digimer, > > On 10/10/12 22:36 -0400, Digimer wrote: >> I started following Andrew's new pcs-based tutorial today on a fresh, >> minimal F17 x86_64 install. Section 2.5 of CfS-pcs shows; >> >> === >> yum install -y pcs >> >> 2.5 Setup >> >> >> >> # systemctl start pcsd.service >> # systemctl enable pcsd.service >> === >> >> This fails, and Andrew suggested using the version of pcs you annouced here. >> Same problem though; >> >> === >> [root at an-c01n01 ~]# rpm -Uvh >> http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm >> Retrieving http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm >> Preparing... 
########################################### >> [100%] >> 1:pcs ########################################### >> [100%] >> [root at an-c01n01 ~]# systemctl start pcsd.service >> Failed to issue method call: Unit pcsd.service failed to load: No such file >> or directory. See system logs and 'systemctl status pcsd.service' for >> details. >> >> [...] >> >> Any thoughts? > > this is part of pcs-gui project [1] packaging of which is probably pending. > > [1] https://github.com/feist/pcs-gui Ah, so the daemon isn't needed if a user doesn't care to use the GUI? -- Digimer Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From andrew at beekhof.net Fri Oct 12 01:00:11 2012 From: andrew at beekhof.net (Andrew Beekhof) Date: Fri, 12 Oct 2012 12:00:11 +1100 Subject: [Linux-cluster] [Pacemaker] Announce: pcs-0.9.26 In-Reply-To: <50771419.1000306@alteeve.ca> References: <50736F78.3060906@redhat.com> <507630C0.2010607@alteeve.ca> <20121011112554.GC29887@redhat.com> <50771419.1000306@alteeve.ca> Message-ID: On Fri, Oct 12, 2012 at 5:46 AM, Digimer wrote: > On 10/11/2012 07:25 AM, Jan Pokorn? wrote: >> >> Hello Digimer, >> >> On 10/10/12 22:36 -0400, Digimer wrote: >>> >>> I started following Andrew's new pcs-based tutorial today on a fresh, >>> minimal F17 x86_64 install. Section 2.5 of CfS-pcs shows; >>> >>> === >>> yum install -y pcs >>> >>> 2.5 Setup >>> >>> >>> >>> # systemctl start pcsd.service >>> # systemctl enable pcsd.service >>> === >>> >>> This fails, and Andrew suggested using the version of pcs you annouced >>> here. >>> Same problem though; >>> >>> === >>> [root at an-c01n01 ~]# rpm -Uvh >>> http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm >>> Retrieving >>> http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm >>> Preparing... ########################################### >>> [100%] >>> 1:pcs ########################################### >>> [100%] >>> [root at an-c01n01 ~]# systemctl start pcsd.service >>> Failed to issue method call: Unit pcsd.service failed to load: No such >>> file >>> or directory. See system logs and 'systemctl status pcsd.service' for >>> details. >>> >>> [...] >>> >>> Any thoughts? >> >> >> this is part of pcs-gui project [1] packaging of which is probably >> pending. >> >> [1] https://github.com/feist/pcs-gui > > > Ah, so the daemon isn't needed if a user doesn't care to use the GUI? I believe it is needed if you want to do anything more than talk to the local node. Which includes initial cluster setup. I talked to Chris just now, he wanted to add PAM support (instead of using pcs_passwd) before releasing that part for upstream. New packages including the daemon pieces (with PAM support) should land in the next day or so. From lists at alteeve.ca Fri Oct 12 01:26:29 2012 From: lists at alteeve.ca (Digimer) Date: Thu, 11 Oct 2012 21:26:29 -0400 Subject: [Linux-cluster] [Pacemaker] Announce: pcs-0.9.26 In-Reply-To: References: <50736F78.3060906@redhat.com> <507630C0.2010607@alteeve.ca> <20121011112554.GC29887@redhat.com> <50771419.1000306@alteeve.ca> Message-ID: <507771C5.2080001@alteeve.ca> On 10/11/2012 09:00 PM, Andrew Beekhof wrote: > On Fri, Oct 12, 2012 at 5:46 AM, Digimer wrote: >> On 10/11/2012 07:25 AM, Jan Pokorn? 
wrote: >>> >>> Hello Digimer, >>> >>> On 10/10/12 22:36 -0400, Digimer wrote: >>>> >>>> I started following Andrew's new pcs-based tutorial today on a fresh, >>>> minimal F17 x86_64 install. Section 2.5 of CfS-pcs shows; >>>> >>>> === >>>> yum install -y pcs >>>> >>>> 2.5 Setup >>>> >>>> >>>> >>>> # systemctl start pcsd.service >>>> # systemctl enable pcsd.service >>>> === >>>> >>>> This fails, and Andrew suggested using the version of pcs you annouced >>>> here. >>>> Same problem though; >>>> >>>> === >>>> [root at an-c01n01 ~]# rpm -Uvh >>>> http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm >>>> Retrieving >>>> http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm >>>> Preparing... ########################################### >>>> [100%] >>>> 1:pcs ########################################### >>>> [100%] >>>> [root at an-c01n01 ~]# systemctl start pcsd.service >>>> Failed to issue method call: Unit pcsd.service failed to load: No such >>>> file >>>> or directory. See system logs and 'systemctl status pcsd.service' for >>>> details. >>>> >>>> [...] >>>> >>>> Any thoughts? >>> >>> >>> this is part of pcs-gui project [1] packaging of which is probably >>> pending. >>> >>> [1] https://github.com/feist/pcs-gui >> >> >> Ah, so the daemon isn't needed if a user doesn't care to use the GUI? > > I believe it is needed if you want to do anything more than talk to > the local node. Which includes initial cluster setup. > I talked to Chris just now, he wanted to add PAM support (instead of > using pcs_passwd) before releasing that part for upstream. > > New packages including the daemon pieces (with PAM support) should > land in the next day or so. > Awesome, I'll try it out once it's available. -- Digimer Papers and Projects: https://alteeve.ca/w/ "Hydrogen is just a colourless, odorless gas which, if left alone in sufficient quantities for long periods of time, begins to think about itself." From a.holway at syseleven.de Fri Oct 12 15:22:39 2012 From: a.holway at syseleven.de (Andrew Holway) Date: Fri, 12 Oct 2012 17:22:39 +0200 Subject: [Linux-cluster] some clvmd / locking problem Message-ID: <1D60B84E-0ABD-4A5C-8E53-67D1EA796537@syseleven.de> Hello, I am trying to set up a 4 node cluster with a shared iSCSI storage device. I cannot start clvmd: service clvmd start just hangs. I cannot stop cman: node001: Working directory: /root node001: Stopping cluster: node001: Leaving fence domain... found dlm lockspace /sys/kernel/dlm/clvmd node001: fence_tool: cannot leave due to active systems node001: [FAILED] I find these errors in /var/log/cluster/dlm_controld.log Oct 12 17:12:13 dlm_controld daemon cpg_join error retrying Oct 12 17:12:23 dlm_controld daemon cpg_join error retrying Oct 12 17:12:33 dlm_controld daemon cpg_join error retrying [root at node001 clvmd]# clvmd status clvmd failed in initialisation Any ideas? Thanks, Andrew From lists at alteeve.ca Fri Oct 12 15:28:36 2012 From: lists at alteeve.ca (Digimer) Date: Fri, 12 Oct 2012 11:28:36 -0400 Subject: [Linux-cluster] some clvmd / locking problem In-Reply-To: <1D60B84E-0ABD-4A5C-8E53-67D1EA796537@syseleven.de> References: <1D60B84E-0ABD-4A5C-8E53-67D1EA796537@syseleven.de> Message-ID: <50783724.5040703@alteeve.ca> On 10/12/2012 11:22 AM, Andrew Holway wrote: > Hello, > > I am trying to set up a 4 node cluster with a shared iSCSI storage device. > > I cannot start clvmd: service clvmd start just hangs. 
> > I cannot stop cman: > > node001: Working directory: /root > node001: Stopping cluster: > node001: Leaving fence domain... found dlm lockspace /sys/kernel/dlm/clvmd > node001: fence_tool: cannot leave due to active systems > node001: [FAILED] > > I find these errors in /var/log/cluster/dlm_controld.log > > Oct 12 17:12:13 dlm_controld daemon cpg_join error retrying > Oct 12 17:12:23 dlm_controld daemon cpg_join error retrying > Oct 12 17:12:33 dlm_controld daemon cpg_join error retrying > > [root at node001 clvmd]# clvmd status > clvmd failed in initialisation > > Any ideas? > > Thanks, > > Andrew Can you paste your cluster.conf please? I suspect something went wrong, it tried to fence and then failed to do so, so it's blocked. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From a.holway at syseleven.de Fri Oct 12 18:52:35 2012 From: a.holway at syseleven.de (Andrew Holway) Date: Fri, 12 Oct 2012 20:52:35 +0200 Subject: [Linux-cluster] some clvmd / locking problem In-Reply-To: <50783724.5040703@alteeve.ca> References: <1D60B84E-0ABD-4A5C-8E53-67D1EA796537@syseleven.de> <50783724.5040703@alteeve.ca> Message-ID: <48CCA736-1A1B-43F5-B94E-8C9EE28E0384@syseleven.de> On Oct 12, 2012, at 5:28 PM, Digimer wrote: > On 10/12/2012 11:22 AM, Andrew Holway wrote: >> Hello, >> >> I am trying to set up a 4 node cluster with a shared iSCSI storage device. >> >> I cannot start clvmd: service clvmd start just hangs. >> >> I cannot stop cman: >> >> node001: Working directory: /root >> node001: Stopping cluster: >> node001: Leaving fence domain... found dlm lockspace /sys/kernel/dlm/clvmd >> node001: fence_tool: cannot leave due to active systems >> node001: [FAILED] >> >> I find these errors in /var/log/cluster/dlm_controld.log >> >> Oct 12 17:12:13 dlm_controld daemon cpg_join error retrying >> Oct 12 17:12:23 dlm_controld daemon cpg_join error retrying >> Oct 12 17:12:33 dlm_controld daemon cpg_join error retrying >> >> [root at node001 clvmd]# clvmd status >> clvmd failed in initialisation >> >> Any ideas? >> >> Thanks, >> >> Andrew > > Can you paste your cluster.conf please? I suspect something went wrong, it tried to fence and then failed to do so, so it's blocked. :) I had two node id's the same. Thanks Andrew > > -- > Digimer > Papers and Projects: https://alteeve.ca/w/ > What if the cure for cancer is trapped in the mind of a person without access to education? From shanti.pahari at sierra.sg Fri Oct 12 19:47:19 2012 From: shanti.pahari at sierra.sg (Shanti Pahari) Date: Sat, 13 Oct 2012 03:47:19 +0800 (SGT) Subject: [Linux-cluster] cannot detect SAN disk in RHEL6.1 Message-ID: <423ded17.00000e6c.00000023@sierra-A66> Hi , When I added FC external disk in RHEL 6.1 it didn't load in /dev/mapper . After reboot also it didn't detect. Any help ? Highly appreciated. thanks -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lists at alteeve.ca Fri Oct 12 19:53:12 2012 From: lists at alteeve.ca (Digimer) Date: Fri, 12 Oct 2012 15:53:12 -0400 Subject: [Linux-cluster] some clvmd / locking problem In-Reply-To: <48CCA736-1A1B-43F5-B94E-8C9EE28E0384@syseleven.de> References: <1D60B84E-0ABD-4A5C-8E53-67D1EA796537@syseleven.de> <50783724.5040703@alteeve.ca> <48CCA736-1A1B-43F5-B94E-8C9EE28E0384@syseleven.de> Message-ID: <50787528.7000602@alteeve.ca> On 10/12/2012 02:52 PM, Andrew Holway wrote: > > On Oct 12, 2012, at 5:28 PM, Digimer wrote: > >> On 10/12/2012 11:22 AM, Andrew Holway wrote: >>> Hello, >>> >>> I am trying to set up a 4 node cluster with a shared iSCSI storage device. >>> >>> I cannot start clvmd: service clvmd start just hangs. >>> >>> I cannot stop cman: >>> >>> node001: Working directory: /root >>> node001: Stopping cluster: >>> node001: Leaving fence domain... found dlm lockspace /sys/kernel/dlm/clvmd >>> node001: fence_tool: cannot leave due to active systems >>> node001: [FAILED] >>> >>> I find these errors in /var/log/cluster/dlm_controld.log >>> >>> Oct 12 17:12:13 dlm_controld daemon cpg_join error retrying >>> Oct 12 17:12:23 dlm_controld daemon cpg_join error retrying >>> Oct 12 17:12:33 dlm_controld daemon cpg_join error retrying >>> >>> [root at node001 clvmd]# clvmd status >>> clvmd failed in initialisation >>> >>> Any ideas? >>> >>> Thanks, >>> >>> Andrew >> >> Can you paste your cluster.conf please? I suspect something went wrong, it tried to fence and then failed to do so, so it's blocked. > > :) I had two node id's the same. > > Thanks > > Andrew Heh, I've done that before, too. >_> -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From raju.rajsand at gmail.com Fri Oct 12 20:00:24 2012 From: raju.rajsand at gmail.com (Rajagopal Swaminathan) Date: Sat, 13 Oct 2012 01:30:24 +0530 Subject: [Linux-cluster] cannot detect SAN disk in RHEL6.1 In-Reply-To: <423ded17.00000e6c.00000023@sierra-A66> References: <423ded17.00000e6c.00000023@sierra-A66> Message-ID: Greetings, On Sat, Oct 13, 2012 at 1:17 AM, Shanti Pahari wrote: > When I added FC external disk in RHEL 6.1 it didn?t load in /dev/mapper . > > After reboot also it didn?t detect. > > > Highly appreciated. IMHO not appreciated. You have been very cryptic in your answers. Why don't you buy Redhat support? -- Regards, Rajagopal From shanti.pahari at sierra.sg Fri Oct 12 23:37:47 2012 From: shanti.pahari at sierra.sg (Shanti Pahari) Date: Sat, 13 Oct 2012 07:37:47 +0800 (SGT) Subject: [Linux-cluster] cannot detect SAN disk in RHEL6.1 In-Reply-To: References: <423ded17.00000e6c.00000023@sierra-A66> Message-ID: <6f3021d4.00000e6c.00000027@sierra-A66> Multipathd is running And when I try /sbin/multipath -v0 it didn't show anything but when lsmod |grep dm then I saw dm-multipath here. But still cannot detect. [root at PDC-PIC-PL-01 ~]# /sbin/multipath -v0 [root at PDC-PIC-PL-01 ~]# modprobe dm-multipath [root at PDC-PIC-PL-01 ~]# /sbin/multipath -v0 [root at PDC-PIC-PL-01 ~]# lsmod |grep dm dm_mirror 14067 0 dm_region_hash 12136 1 dm_mirror dm_log 10120 2 dm_mirror,dm_region_hash dm_round_robin 2651 6 dm_multipath 18266 4 dm_round_robin dm_mod 75539 16 dm_mirror,dm_log,dm_multipath [root at PDC-PIC-PL-01 ~]# service multipathd status multipathd (pid 1567) is running... 
[root at PDC-PIC-PL-01 ~]# From: Ben .T.George [mailto:bentech4you at gmail.com] Sent: Saturday, 13 October, 2012 4:57 AM To: shanti.pahari at sierra.sg Subject: Re: [Linux-cluster] cannot detect SAN disk in RHEL6.1 HI check multipath demon is running or not..also check dm-multipath kernel module is loaded or not regards, Ben On Fri, Oct 12, 2012 at 10:47 PM, Shanti Pahari wrote: Hi , When I added FC external disk in RHEL 6.1 it didn't load in /dev/mapper . After reboot also it didn't detect. Any help ? Highly appreciated. thanks -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.holway at syseleven.de Sat Oct 13 10:40:11 2012 From: a.holway at syseleven.de (Andrew Holway) Date: Sat, 13 Oct 2012 12:40:11 +0200 Subject: [Linux-cluster] Linux clustering for high availability databases and other services Message-ID: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> Hello, We have been experimenting with various storage technologies in order to create moderately highly available database services. I have the following equipment in my lab: 4x HP G8 servers with * Mellanox QDR InfiniBand * 10GE adapters * Lots of memory and the latest, most powerful CPUS. * Centos 6.0 Oracle ZFS appliance * Infiniband * NFS (over ethernet and infiniband) * iSCSI (over ethernet and infiniband) * Various RDMA protocols that are not supported by oracle on redhat. Nimble Storage device * iSCSI over 10G ethernet Brocade 10G switches. Can I have some guidance on possible setups for HA database services? I have tested and have a good understanding of all the technology components but I am a bit confused how I should be glueing them together. I need a focus :) Thanks, Andrew From raju.rajsand at gmail.com Sat Oct 13 16:01:59 2012 From: raju.rajsand at gmail.com (Rajagopal Swaminathan) Date: Sat, 13 Oct 2012 21:31:59 +0530 Subject: [Linux-cluster] Linux clustering for high availability databases and other services In-Reply-To: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> References: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> Message-ID: Greetings, On Sat, Oct 13, 2012 at 4:10 PM, Andrew Holway wrote: > Hello, > > We have been experimenting with various storage technologies in order to create moderately highly available database services. > > Can I have some guidance on possible setups for HA database services? I have tested and have a good understanding of all the technology components but I am a bit confused how I should be glueing them together. I need a focus :) > > Thanks, Do you mean active/active Oracle RAC type? -- Regards, Rajagopal From heiko.nardmann at itechnical.de Sun Oct 14 14:06:02 2012 From: heiko.nardmann at itechnical.de (Heiko Nardmann) Date: Sun, 14 Oct 2012 16:06:02 +0200 Subject: [Linux-cluster] Linux clustering for high availability databases and other services In-Reply-To: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> References: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> Message-ID: <507AC6CA.8080602@itechnical.de> Hi, RedHat provides support for setting up such scenarios. I recommend buying RHEL and contacting them. Kind regards, Heiko Am 13.10.2012 12:40, schrieb Andrew Holway: > Hello, > > We have been experimenting with various storage technologies in order to create moderately highly available database services. 
> > I have the following equipment in my lab: > > 4x HP G8 servers with > * Mellanox QDR InfiniBand > * 10GE adapters > * Lots of memory and the latest, most powerful CPUS. > * Centos 6.0 > > Oracle ZFS appliance > * Infiniband > * NFS (over ethernet and infiniband) > * iSCSI (over ethernet and infiniband) > * Various RDMA protocols that are not supported by oracle on redhat. > > Nimble Storage device > * iSCSI over 10G ethernet > > Brocade 10G switches. > > Can I have some guidance on possible setups for HA database services? I have tested and have a good understanding of all the technology components but I am a bit confused how I should be glueing them together. I need a focus :) > > Thanks, > > Andrew > > From raju.rajsand at gmail.com Sun Oct 14 14:38:18 2012 From: raju.rajsand at gmail.com (Rajagopal Swaminathan) Date: Sun, 14 Oct 2012 20:08:18 +0530 Subject: [Linux-cluster] Linux clustering for high availability databases and other services In-Reply-To: <507AC6CA.8080602@itechnical.de> References: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> <507AC6CA.8080602@itechnical.de> Message-ID: Greetings, On Sun, Oct 14, 2012 at 7:36 PM, Heiko Nardmann wrote: > > RedHat provides support for setting up such scenarios. I recommend > buying RHEL and contacting them. ++1 Please note that active/active file sharing is *very* different from active/active DB server. AFAIK, only Oracle RAC and IBM DB2 have Active/Active HA DB options: no escape from spending money .... :) -- Regards, Rajagopal From a.holway at syseleven.de Sun Oct 14 17:38:57 2012 From: a.holway at syseleven.de (Andrew Holway) Date: Sun, 14 Oct 2012 19:38:57 +0200 Subject: [Linux-cluster] Linux clustering for high availability databases and other services In-Reply-To: References: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> Message-ID: <75F0F61B-117E-4EFE-B265-C61DB52C75D0@syseleven.de> > > Do you mean active/active Oracle RAC type? No, Perhaps active / passive mysql type. > > -- > Regards, > > Rajagopal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From a.holway at syseleven.de Sun Oct 14 17:57:24 2012 From: a.holway at syseleven.de (Andrew Holway) Date: Sun, 14 Oct 2012 19:57:24 +0200 Subject: [Linux-cluster] Linux clustering for high availability databases and other services In-Reply-To: References: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> Message-ID: <8114940B-2405-4A7B-A189-71B50666B67E@syseleven.de> > > Do you mean active/active Oracle RAC type? We are using a lot Mysql in house. Perhaps high availability is the wrong phrase. More Available perhaps? > > -- > Regards, > > Rajagopal > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From christian.masopust at siemens.com Sun Oct 14 19:36:48 2012 From: christian.masopust at siemens.com (Masopust, Christian) Date: Sun, 14 Oct 2012 21:36:48 +0200 Subject: [Linux-cluster] Linux clustering for high availability databases and other services In-Reply-To: <507AC6CA.8080602@itechnical.de> References: <344E72A6-BFE0-4413-88AD-8A066ED39D54@syseleven.de> <507AC6CA.8080602@itechnical.de> Message-ID: Hi Andrew, maybe I understand you completely wrong, but... Do you know "Galera Cluster" ? If you are on MySQL maybe you can give it a try? 
br, christian > Am 13.10.2012 12:40, schrieb Andrew Holway: > > Hello, > > > > We have been experimenting with various storage > technologies in order to create moderately highly available > database services. > > > > I have the following equipment in my lab: > > > > 4x HP G8 servers with > > * Mellanox QDR InfiniBand > > * 10GE adapters > > * Lots of memory and the latest, most powerful CPUS. > > * Centos 6.0 > > > > Oracle ZFS appliance > > * Infiniband > > * NFS (over ethernet and infiniband) > > * iSCSI (over ethernet and infiniband) > > * Various RDMA protocols that are not supported by oracle > on redhat. > > > > Nimble Storage device > > * iSCSI over 10G ethernet > > > > Brocade 10G switches. > > > > Can I have some guidance on possible setups for HA database > services? I have tested and have a good understanding of all > the technology components but I am a bit confused how I > should be glueing them together. I need a focus :) > > > > Thanks, > > > > Andrew > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From bmr at redhat.com Mon Oct 15 11:08:13 2012 From: bmr at redhat.com (Bryn M. Reeves) Date: Mon, 15 Oct 2012 12:08:13 +0100 Subject: [Linux-cluster] cannot detect SAN disk in RHEL6.1 In-Reply-To: <423ded17.00000e6c.00000023@sierra-A66> References: <423ded17.00000e6c.00000023@sierra-A66> Message-ID: <507BEE9D.10304@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/12/2012 08:47 PM, Shanti Pahari wrote: > When I added FC external disk in RHEL 6.1 it didn't load in > /dev/mapper . > > After reboot also it didn't detect. Check for SCSI devices being registered (/proc/scsi/scsi, lsscsi and dmesg). Also read: http://tinyurl.com/93rnlbn [access.redhat.com, RHEL Storage Administration Guide). Regards, Bryn. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlB77p0ACgkQ6YSQoMYUY94sBwCdFpKSB5gdgdR27CcoE/RTTPkO xXsAoIPFK1KA2CpHgsYHaubW5s1ocIWg =XaKW -----END PGP SIGNATURE----- From epretorious at yahoo.com Mon Oct 15 20:41:00 2012 From: epretorious at yahoo.com (Eric) Date: Mon, 15 Oct 2012 13:41:00 -0700 (PDT) Subject: [Linux-cluster] Getting started with cLVM Message-ID: <1350333660.23559.YahooMailNeo@web121705.mail.ne1.yahoo.com> I've been reading about cLVM and I'm having a difficult time getting my mind wrapped around The Big Picture. Looking at Figure?1.2, ?CLVM Overview, of? the Red Hat LVM Administrator Guide I'm a bit puzzled. Specifically: Are storage devices exported directly to cLVM daemons? Or are LV's exported directly to cLVM daemonds for consumption? Or will another shared-disk protocol (e.g., GNBD or iSCSI) be required for "the last mile"? Eric Pretorious Truckee, CA From lists at alteeve.ca Mon Oct 15 21:37:10 2012 From: lists at alteeve.ca (Digimer) Date: Mon, 15 Oct 2012 17:37:10 -0400 Subject: [Linux-cluster] Getting started with cLVM In-Reply-To: <1350333660.23559.YahooMailNeo@web121705.mail.ne1.yahoo.com> References: <1350333660.23559.YahooMailNeo@web121705.mail.ne1.yahoo.com> Message-ID: <507C8206.6090908@alteeve.ca> On 10/15/2012 04:41 PM, Eric wrote: > I've been reading about cLVM and I'm having a difficult time getting my mind wrapped around The Big Picture. > > Looking at Figure 1.2, ?CLVM Overview, of the Red Hat LVM Administrator Guide I'm a bit puzzled. Specifically: Are storage devices exported directly to cLVM daemons? 
Or are LV's exported directly to cLVM daemonds for consumption? Or will another shared-disk protocol (e.g., GNBD or iSCSI) be required for "the last mile"? > > Eric Pretorious > Truckee, CA Clustered LVM, fundamentally, swaps out the normal internal locking to DLM (distributed lock manager), which is provided by cman under EL6 and corosync (v2+). Beyond this, you can think of LVM as you always have. So, commonly, the PV(s) would be some form of shared storage; DRBD and SANs are the most common I think. When you pvcreate /dev/foo (foo being your shared storage), you can immediately 'pvscan' on the other nodes and see the new PV. Likewise, once you assign the PV to a VG on one node, 'vgscan' on all the other nodes will immediately see the new/expanded VG. Likewise with LV creation/resize/removes. Beyond this, there is nothing further special about clustered LVM over "normal" LVM. hth digimer -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From cfeist at redhat.com Wed Oct 17 00:02:41 2012 From: cfeist at redhat.com (Chris Feist) Date: Tue, 16 Oct 2012 19:02:41 -0500 Subject: [Linux-cluster] [Pacemaker] Announce: pcs-0.9.26 In-Reply-To: <50736F78.3060906@redhat.com> References: <50736F78.3060906@redhat.com> Message-ID: <507DF5A1.3060905@redhat.com> On 10/08/12 19:27, Chris Feist wrote: > We've been making improvements to the pcs (pacemaker/corosync configuration > system) command line tool over the past few months. > > Currently you can setup a basic cluster (including configuring corosync 2.0 udpu). > > David Vossel has also created a version of the "Clusters from Scratch" document > that illustrates setting up a cluster using pcs. This should be showing up > shortly. Just an update, I've updated the pcs (to 0.9.27) and included the pcsd daemon with the fedora packages. You can grab the updated packages here: http://people.redhat.com/cfeist/pcs/ And you should be able to used the new Clusters from Scratch optimized for the pcs CLI here: http://www.clusterlabs.org/doc/ Just a couple things to note (this should be shortly updated in the notes). To run pcs on Fedora 17/18 you'll need to turn off selinux & disable the firewall (or at least allow traffic on port 2224). To disable SELinux set 'SELINUX=permissive' in /etc/selinux/config and reboot To disable the firewall run 'systemctl stop iptables.service' (to permanently disable run 'systemctl disable iptables.service') The pcs_passwd command has been removed. In it's place you can do authentication with the hacluster user. Just set the hacluster user password (passwd hacluster) and then use that user and password to authenticate with pcs. If you have any questions or any issues don't hesitate to contact me, we're still working out the bugs in the new pcsd daemon and we appreciate all the feedback we can get. Thanks, Chris > > You can view the source here: https://github.com/feist/pcs/ > > Or download the latest tarball: > https://github.com/downloads/feist/pcs/pcs-0.9.26.tar.gz > > There is also a Fedora 18 package that will be included with the next release. > You should be able to find that package in the following locations... > > RPM: > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm > > SRPM: > http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.src.rpm > > In the near future we are planning on having builds for SUSE & Ubuntu/Debian. 
> > We're also actively working on a GUI/Daemon that will allow control of your > entire cluster from one node and/or a web browser. > > Please feel free to email me (cfeist at redhat.com) or open issues on the pcs > project at github (https://github.com/feist/pcs/issues) if you have any > questions or problems. > > Thanks! > Chris > > _______________________________________________ > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org From andrew at beekhof.net Wed Oct 17 02:09:18 2012 From: andrew at beekhof.net (Andrew Beekhof) Date: Wed, 17 Oct 2012 13:09:18 +1100 Subject: [Linux-cluster] [Pacemaker] Announce: pcs-0.9.26 In-Reply-To: <507DF5A1.3060905@redhat.com> References: <50736F78.3060906@redhat.com> <507DF5A1.3060905@redhat.com> Message-ID: On Wed, Oct 17, 2012 at 11:02 AM, Chris Feist wrote: > On 10/08/12 19:27, Chris Feist wrote: >> >> We've been making improvements to the pcs (pacemaker/corosync >> configuration >> system) command line tool over the past few months. >> >> Currently you can setup a basic cluster (including configuring corosync >> 2.0 udpu). >> >> David Vossel has also created a version of the "Clusters from Scratch" >> document >> that illustrates setting up a cluster using pcs. This should be showing >> up >> shortly. > > > Just an update, I've updated the pcs (to 0.9.27) and included the pcsd > daemon with the fedora packages. You can grab the updated packages here: > > http://people.redhat.com/cfeist/pcs/ > > And you should be able to used the new Clusters from Scratch optimized for > the pcs CLI here: http://www.clusterlabs.org/doc/ Those docs have now been updated to match the new release. > > Just a couple things to note (this should be shortly updated in the notes). > > To run pcs on Fedora 17/18 you'll need to turn off selinux & disable the > firewall (or at least allow traffic on port 2224). > > To disable SELinux set 'SELINUX=permissive' in /etc/selinux/config and > reboot > To disable the firewall run 'systemctl stop iptables.service' (to > permanently disable run 'systemctl disable iptables.service') > > The pcs_passwd command has been removed. In it's place you can do > authentication with the hacluster user. Just set the hacluster user > password (passwd hacluster) and then use that user and password to > authenticate with pcs. > > If you have any questions or any issues don't hesitate to contact me, we're > still working out the bugs in the new pcsd daemon and we appreciate all the > feedback we can get. > > Thanks, > Chris > > >> >> You can view the source here: https://github.com/feist/pcs/ >> >> Or download the latest tarball: >> https://github.com/downloads/feist/pcs/pcs-0.9.26.tar.gz >> >> There is also a Fedora 18 package that will be included with the next >> release. >> You should be able to find that package in the following locations... >> >> RPM: >> http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.noarch.rpm >> >> SRPM: >> http://people.redhat.com/cfeist/pcs/pcs-0.9.26-1.fc18.src.rpm >> >> In the near future we are planning on having builds for SUSE & >> Ubuntu/Debian. >> >> We're also actively working on a GUI/Daemon that will allow control of >> your >> entire cluster from one node and/or a web browser. 
>> >> Please feel free to email me (cfeist at redhat.com) or open issues on the pcs >> project at github (https://github.com/feist/pcs/issues) if you have any >> questions or problems. >> >> Thanks! >> Chris >> >> _______________________________________________ >> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >> >> Project Home: http://www.clusterlabs.org >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >> Bugs: http://bugs.clusterlabs.org > > > > _______________________________________________ > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org From terance at socialtwist.com Wed Oct 17 13:12:21 2012 From: terance at socialtwist.com (Terance Dias) Date: Wed, 17 Oct 2012 18:42:21 +0530 Subject: [Linux-cluster] CMAN nodes in different LANs In-Reply-To: References: Message-ID: Hi, We're trying to create a cluster in which the nodes lie in 2 different LANs. Since the nodes lie in different networks, they cannot resolve the other node by their internal IP. So in my cluster.conf file, I've provided their external IPs. But now when I start CMAN service, I get the following error. ----------------------------------- Starting cluster: Checking Network Manager... [ OK ] Global setup... [ OK ] Loading kernel modules... [ OK ] Mounting configfs... [ OK ] Starting cman... Cannot find node name in cluster.conf Unable to get the configuration Cannot find node name in cluster.conf cman_tool: corosync daemon didn't start [FAILED] ------------------------------------- My cluster.conf file is as below -------------------------------------