From pieter.baele at gmail.com Wed Feb 2 08:43:36 2011 From: pieter.baele at gmail.com (Pieter Baele) Date: Wed, 2 Feb 2011 09:43:36 +0100 Subject: [Linux-cluster] setting multicast address on bond0 Message-ID: Hi, What's the correct way to set the multicast address on a bonded interface? (RHEL6 - Cluster 3.x) I can't use multicast on the primary interface because of network topology (2 sites....). So I want to set up a multicast address on bond0 (2 interfaces so this is fault-tolerant) Adding fails ccs_config_validate validation. (RHEL 5 cluster worked, but for some reason this doesn't work in the latest version) I've taken a look at the cluster.rng file.... Sincereley Pieter Baele From mgrac at redhat.com Wed Feb 2 15:01:34 2011 From: mgrac at redhat.com (Marek Grac) Date: Wed, 02 Feb 2011 16:01:34 +0100 Subject: [Linux-cluster] Configuring a samba resource under RHCS In-Reply-To: References: Message-ID: <4D4971CE.6080905@redhat.com> On 10/18/2010 01:46 PM, C. L. Martinez wrote: > Hi all, > > How can I configure different shared folders with samba under RHCS?? > Exists some resource agents?? I need to allow to access to sme Windows > 7 And Windows 2008 R2 clients without AD authentication. > Resource agent for samba does not modify shared folders configuration, so there is no difference between setting one or more of them. All you have to do is to add samba resource agent, ip address(es) on which samba should listen, and filesystem on which are shared folders (there is no auto-detection). m, From pieter.baele at gmail.com Wed Feb 2 15:47:57 2011 From: pieter.baele at gmail.com (Pieter Baele) Date: Wed, 2 Feb 2011 16:47:57 +0100 Subject: [Linux-cluster] clvmd mirroring problems on split-site SAN (2 node cluster) - cpg_dispatch failed: SA_AIS_ERR_LIBRARY Message-ID: Hi, After doing a lot of research on the several ways to mirror a device from one LUN to another on another side, I had this problem: Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY ............ What does it mean and could be the reason? clustat on the one node shows bode online, on the other one is offline... From linux at alteeve.com Wed Feb 2 15:50:58 2011 From: linux at alteeve.com (Digimer) Date: Wed, 02 Feb 2011 10:50:58 -0500 Subject: [Linux-cluster] setting multicast address on bond0 In-Reply-To: References: Message-ID: <4D497D62.2050500@alteeve.com> On 02/02/2011 03:43 AM, Pieter Baele wrote: > Adding fails > ccs_config_validate validation. > (RHEL 5 cluster worked, but for some reason this doesn't work in the > latest version) > > I've taken a look at the cluster.rng file.... > > Sincereley > Pieter Baele I don't believe that 'interface=' is valid. 
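In the Cluster 3 schema the multicast address hangs off the <cman> element, and the interface is chosen implicitly: corosync binds to whichever interface owns the IP that each clusternode name resolves to, so pointing the node names at the bond0 addresses is what selects bond0. A minimal sketch of that shape follows — the cluster name, node names and multicast address are placeholders, not taken from this thread, so check it against your own cluster.rng before relying on it:

    <cluster name="example" config_version="1">
      <cman>
        <!-- Cluster 3: the multicast address lives here; the per-node
             interface= attribute from the Cluster 2 (RHEL 5) schema is
             not in the schema any more, as noted above -->
        <multicast addr="239.192.100.1"/>
      </cman>
      <clusternodes>
        <!-- these names should resolve to the addresses configured on bond0 -->
        <clusternode name="node1-bond" nodeid="1" votes="1"/>
        <clusternode name="node2-bond" nodeid="2" votes="1"/>
      </clusternodes>
    </cluster>

That would also explain the ccs_config_validate failure described earlier: a RHEL 5 style per-node multicast element carried over into a RHEL 6 config no longer validates.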
-- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org
From pieter.baele at gmail.com Wed Feb 2 15:57:39 2011 From: pieter.baele at gmail.com (Pieter Baele) Date: Wed, 2 Feb 2011 16:57:39 +0100 Subject: [Linux-cluster] clvmd mirroring problems on split-site SAN (2 node cluster) - cpg_dispatch failed: SA_AIS_ERR_LIBRARY In-Reply-To: References: Message-ID: On Wed, Feb 2, 2011 at 16:47, Pieter Baele wrote:
> Hi,
>
> After doing a lot of research on the several ways to mirror a device
> from one LUN to another on another side,
> I had this problem:
>
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
> Feb 2 16:42:49 x cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY
>
> ............
>
>
> What does it mean and could be the reason?
>
> clustat on the one node shows node online, on the other one is offline...
I also received this warning some minutes ago:
Message from syslogd at x at Feb 2 16:54:35 ... corosync[6368]: [TOTEM ] LOGSYS EMERGENCY: TOTEM Unable to write to /var/log/cluster/corosync.log.
Message from syslogd at x at Feb 2 16:54:36 ... corosync[6368]: [QUORUM] LOGSYS EMERGENCY: QUORUM Unable to write to /var/log/cluster/corosync.log.
Do I have to fiddle with the mcast parameters?
From brett.dellegrazie at gmail.com Wed Feb 2 16:52:50 2011 From: brett.dellegrazie at gmail.com (Brett Delle Grazie) Date: Wed, 2 Feb 2011 16:52:50 +0000 Subject: [Linux-cluster] clvmd mirroring problems on split-site SAN (2 node cluster) - cpg_dispatch failed: SA_AIS_ERR_LIBRARY In-Reply-To: References: Message-ID: Hi, On 2 February 2011 15:57, Pieter Baele wrote:
> On Wed, Feb 2, 2011 at 16:47, Pieter Baele wrote:
> I also received this warning some minutes ago:
>
> Message from syslogd at x at Feb 2 16:54:35 ...
> corosync[6368]: [TOTEM ] LOGSYS EMERGENCY: TOTEM Unable to write to
> /var/log/cluster/corosync.log.
>
> Message from syslogd at x at Feb 2 16:54:36 ...
> corosync[6368]: [QUORUM] LOGSYS EMERGENCY: QUORUM Unable to write
> to /var/log/cluster/corosync.log.
>
> Do I have to fiddle with the mcast parameters?
>
I know this is obvious but is /var/log full?
> > 4 GB free ;-) Regards, Pieter From dgmorales at gmail.com Wed Feb 2 22:53:33 2011 From: dgmorales at gmail.com (Diego Morales) Date: Wed, 2 Feb 2011 20:53:33 -0200 Subject: [Linux-cluster] Fence agent for Citrix XenServer / XCP Message-ID: I'm setting up some GFS clusters on top of Citrix XenServer (XCP is its "free as in freedom" counterpart). And so I'm looking for some fencing agents to use with that. I did some googling and it seems that these do not support libvirt (at least not "officially", not yet). So I guess the use of fence_xvm or fence_virsh may be tricky or even impossible. What I expected to find was some fence_agent built using XenAPI.py (that they support) or SSH'ing and using the xe command. Didn't find, thought about doing it myself. But before that... probably I'm not the only one using rhcs & friends on XenServer, so does anybody has some nice pointers? Thanks in advance, Diego Morales From sklemer at gmail.com Thu Feb 3 07:26:35 2011 From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=) Date: Thu, 3 Feb 2011 09:26:35 +0200 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 Message-ID: Hello. I followed redhat instruction trying install HA-LVM with clvmd. ( rhcs 5.6 - rgmanager 2.0.52-9 ) I can't make it work. lvm.conf- locking_type=3 clvmd work Its failed saying HA-LVM is not configured correctly. The manual said that we should run "lvchange -a n lvxx" edit the cluster.conf & start the service. But From lvm.conf : case $1 in start) if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ .....c ]]; then ha_lvm_proper_setup_check || exit 1 If the vg is not taged as cluster than the ha_lvm is looking for volume_list in lvm.conf. I am confused- Does the VG should taged as cluster ?? ( BTW - the old fashion HA-LVM is worked with no problems ) redhat instructions : *To set up HA LVM Failover (using the preferred CLVM variant), perform the following steps:* 1. Ensure that the parameter locking_type in the global section of /etc/lvm/lvm.conf is set to the value '3', that all the necessary LVM cluster packages are installed, and the necessary daemons are started (like 'clvmd' and the cluster mirror log daemon - if necessary). 2. Create the logical volume and filesystem using standard LVM2 and file system commands. For example: # pvcreate /dev/sd[cde]1 # vgcreate /dev/sd[cde]1 # lvcreate -L 10G -n # mkfs.ext3 /dev// # lvchange -an / 3. Edit /etc/cluster/cluster.conf to include the newly created logical volume as a resource in one of your services. Alternatively, configuration tools such as Conga or system-config-cluster may be used to create these entries. Below is a sample resource manager section from /etc/cluster/cluster.conf: Regards Shalom. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pieter.baele at gmail.com Thu Feb 3 07:37:20 2011 From: pieter.baele at gmail.com (Pieter Baele) Date: Thu, 3 Feb 2011 08:37:20 +0100 Subject: [Linux-cluster] clvmd mirroring problems on split-site SAN (2 node cluster) - cpg_dispatch failed: SA_AIS_ERR_LIBRARY In-Reply-To: References: Message-ID: On Wed, Feb 2, 2011 at 20:59, Pieter Baele wrote: > On Wed, Feb 2, 2011 at 17:52, Brett Delle Grazie > wrote: >> Hi, >>> >>> Do I have to fiddle with the mcast parameters? >>> >> >> I know this is obvious but is /var/log full? 
I was wrong, looked at the wrong server /var/log/messages is full very very fast always the same message: Feb 3 08:35:13 nodex cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY From corey.kovacs at gmail.com Thu Feb 3 09:13:47 2011 From: corey.kovacs at gmail.com (Corey Kovacs) Date: Thu, 3 Feb 2011 09:13:47 +0000 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Is using ha-lvm with clvmd a new capability? It's always been my understanding that the lvm locking type for using ha-lvm had to be set to '1'. I'd much rather be using clvmd if it is the way to go. Can you point me to the docs you are seeing these instructions in please? As for why your config isn't working, clvmd requires that it's resources are indeed tagged as cluster volumes, so you might try doing that and see how it goes. -C On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? wrote: > Hello. > > > > I followed redhat instruction trying install HA-LVM with clvmd. ( rhcs 5.6 - > rgmanager 2.0.52-9 ) > > > > I can't make it work. > > > > lvm.conf- locking_type=3 > > clvmd work > > Its failed saying HA-LVM is not configured correctly. > > The manual said that we should run "lvchange -a n lvxx" edit the > cluster.conf & start the service. > > > > But From lvm.conf : > > > > case $1 in > > start) > > ?? ? ? ?if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ .....c > ]]; then > > ?? ? ? ? ? ? ? ?ha_lvm_proper_setup_check || exit 1 > > > > If the vg is not taged as cluster than the ha_lvm is looking for volume_list > in lvm.conf. > > > > I am confused- Does the VG should taged as cluster ?? ?( BTW - the old > fashion HA-LVM is worked with no problems ) > > redhat instructions : > > To set up HA LVM Failover (using the preferred CLVM variant), perform the > following steps: > > > > 1. Ensure that the parameter?locking_type?in the global section > of?/etc/lvm/lvm.conf?is set to the value?'3', that all the necessary LVM > cluster packages are installed, and the necessary daemons are started (like > 'clvmd' and the cluster mirror log daemon - if necessary). > > > > 2. Create the logical volume and filesystem using standard LVM2 and file > system commands. For example: > > # pvcreate /dev/sd[cde]1 > > ?# vgcreate /dev/sd[cde]1 > > ?# lvcreate -L 10G -n > > ?# mkfs.ext3 /dev// > > ?# lvchange -an / > > > > 3. Edit /etc/cluster/cluster.conf to include the newly created logical > volume as a resource in one of your services. Alternatively, configuration > tools such as?Conga?or?system-config-cluster?may be used to create these > entries.? Below is a sample resource manager section > from?/etc/cluster/cluster.conf: > > > > ? ?? ?????? restricted="0"> ????????? > ????????? > ?? ?? ?????? name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> ?????? device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" fsid="64050" > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> ?? > ?? > ?????? ?????? ?? > > > > Regards > > Shalom. 
> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From brett.dellegrazie at gmail.com Thu Feb 3 09:17:14 2011 From: brett.dellegrazie at gmail.com (Brett Delle Grazie) Date: Thu, 3 Feb 2011 09:17:14 +0000 Subject: [Linux-cluster] clvmd mirroring problems on split-site SAN (2 node cluster) - cpg_dispatch failed: SA_AIS_ERR_LIBRARY In-Reply-To: References: Message-ID: On 3 February 2011 07:37, Pieter Baele wrote: > On Wed, Feb 2, 2011 at 20:59, Pieter Baele wrote: >> On Wed, Feb 2, 2011 at 17:52, Brett Delle Grazie >> wrote: >>> Hi, >>>> >>>> Do I have to fiddle with the mcast parameters? >>>> >>> >>> I know this is obvious but is /var/log full? > > I was wrong, looked at the wrong server > /var/log/messages is full very very fast > > always the same message: > > Feb ?3 08:35:13 nodex cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY Something is very odd and/or broken. What versions are you running: OS:? Kernel:? cman:? openais:? lvm2:? lvm2-cluster:? This is one you're probably going to have to raise with RedHat or someone on this list far more experienced than I. -- Best Regards, Brett Delle Grazie From sklemer at gmail.com Thu Feb 3 10:35:13 2011 From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=) Date: Thu, 3 Feb 2011 12:35:13 +0200 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: https://access.redhat.com/kb/docs/DOC-3068 On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs wrote: > Is using ha-lvm with clvmd a new capability? It's always been my > understanding that the lvm locking type for using ha-lvm had to be set > to '1'. > > I'd much rather be using clvmd if it is the way to go. Can you point > me to the docs you are seeing these instructions in please? > > As for why your config isn't working, clvmd requires that it's > resources are indeed tagged as cluster volumes, so you might try doing > that and see how it goes. > > -C > > On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? wrote: > > Hello. > > > > > > > > I followed redhat instruction trying install HA-LVM with clvmd. ( rhcs > 5.6 - > > rgmanager 2.0.52-9 ) > > > > > > > > I can't make it work. > > > > > > > > lvm.conf- locking_type=3 > > > > clvmd work > > > > Its failed saying HA-LVM is not configured correctly. > > > > The manual said that we should run "lvchange -a n lvxx" edit the > > cluster.conf & start the service. > > > > > > > > But From lvm.conf : > > > > > > > > case $1 in > > > > start) > > > > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ .....c > > ]]; then > > > > ha_lvm_proper_setup_check || exit 1 > > > > > > > > If the vg is not taged as cluster than the ha_lvm is looking for > volume_list > > in lvm.conf. > > > > > > > > I am confused- Does the VG should taged as cluster ?? ( BTW - the old > > fashion HA-LVM is worked with no problems ) > > > > redhat instructions : > > > > To set up HA LVM Failover (using the preferred CLVM variant), perform the > > following steps: > > > > > > > > 1. Ensure that the parameter locking_type in the global section > > of /etc/lvm/lvm.conf is set to the value '3', that all the necessary LVM > > cluster packages are installed, and the necessary daemons are started > (like > > 'clvmd' and the cluster mirror log daemon - if necessary). > > > > > > > > 2. Create the logical volume and filesystem using standard LVM2 and file > > system commands. 
For example: > > > > # pvcreate /dev/sd[cde]1 > > > > # vgcreate /dev/sd[cde]1 > > > > # lvcreate -L 10G -n > > > > # mkfs.ext3 /dev// > > > > # lvchange -an / > > > > > > > > 3. Edit /etc/cluster/cluster.conf to include the newly created logical > > volume as a resource in one of your services. Alternatively, > configuration > > tools such as Conga or system-config-cluster may be used to create these > > entries. Below is a sample resource manager section > > from /etc/cluster/cluster.conf: > > > > > > > > > restricted="0"> priority="1"/> > > > > > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" > fsid="64050" > > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> > > > > > > > > > > > > > Regards > > > > Shalom. > > > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From sklemer at gmail.com Thu Feb 3 10:38:54 2011 From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=) Date: Thu, 3 Feb 2011 12:38:54 +0200 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? wrote: > > https://access.redhat.com/kb/docs/DOC-3068 > > > On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs wrote: > >> Is using ha-lvm with clvmd a new capability? It's always been my >> understanding that the lvm locking type for using ha-lvm had to be set >> to '1'. >> >> I'd much rather be using clvmd if it is the way to go. Can you point >> me to the docs you are seeing these instructions in please? >> >> As for why your config isn't working, clvmd requires that it's >> resources are indeed tagged as cluster volumes, so you might try doing >> that and see how it goes. >> >> -C >> >> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? wrote: >> > Hello. >> > >> > >> > >> > I followed redhat instruction trying install HA-LVM with clvmd. ( rhcs >> 5.6 - >> > rgmanager 2.0.52-9 ) >> > >> > >> > >> > I can't make it work. >> > >> > >> > >> > lvm.conf- locking_type=3 >> > >> > clvmd work >> > >> > Its failed saying HA-LVM is not configured correctly. >> > >> > The manual said that we should run "lvchange -a n lvxx" edit the >> > cluster.conf & start the service. >> > >> > >> > >> > But From lvm.conf : >> > >> > >> > >> > case $1 in >> > >> > start) >> > >> > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ >> .....c >> > ]]; then >> > >> > ha_lvm_proper_setup_check || exit 1 >> > >> > >> > >> > If the vg is not taged as cluster than the ha_lvm is looking for >> volume_list >> > in lvm.conf. >> > >> > >> > >> > I am confused- Does the VG should taged as cluster ?? ( BTW - the old >> > fashion HA-LVM is worked with no problems ) >> > >> > redhat instructions : >> > >> > To set up HA LVM Failover (using the preferred CLVM variant), perform >> the >> > following steps: >> > >> > >> > >> > 1. Ensure that the parameter locking_type in the global section >> > of /etc/lvm/lvm.conf is set to the value '3', that all the necessary LVM >> > cluster packages are installed, and the necessary daemons are started >> (like >> > 'clvmd' and the cluster mirror log daemon - if necessary). >> > >> > >> > >> > 2. 
Create the logical volume and filesystem using standard LVM2 and file >> > system commands. For example: >> > >> > # pvcreate /dev/sd[cde]1 >> > >> > # vgcreate /dev/sd[cde]1 >> > >> > # lvcreate -L 10G -n >> > >> > # mkfs.ext3 /dev// >> > >> > # lvchange -an / >> > >> > >> > >> > 3. Edit /etc/cluster/cluster.conf to include the newly created logical >> > volume as a resource in one of your services. Alternatively, >> configuration >> > tools such as Conga or system-config-cluster may be used to create these >> > entries. Below is a sample resource manager section >> > from /etc/cluster/cluster.conf: >> > >> > >> > >> > > > restricted="0"> > priority="1"/> >> > >> > > > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> > > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" >> fsid="64050" >> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> >> >> > >> > >> > >> > >> > >> > Regards >> > >> > Shalom. >> > >> > >> > >> > -- >> > Linux-cluster mailing list >> > Linux-cluster at redhat.com >> > https://www.redhat.com/mailman/listinfo/linux-cluster >> > >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett.dellegrazie at gmail.com Thu Feb 3 10:39:43 2011 From: brett.dellegrazie at gmail.com (Brett Delle Grazie) Date: Thu, 3 Feb 2011 10:39:43 +0000 Subject: [Linux-cluster] clvmd mirroring problems on split-site SAN (2 node cluster) - cpg_dispatch failed: SA_AIS_ERR_LIBRARY In-Reply-To: References: Message-ID: On 3 February 2011 10:22, Pieter Baele wrote: > On Thu, Feb 3, 2011 at 10:17, Brett Delle Grazie > wrote: >> On 3 February 2011 07:37, Pieter Baele wrote: >>> Feb ?3 08:35:13 nodex cmirrord[3682]: cpg_dispatch failed: SA_AIS_ERR_LIBRARY >> >> Something is very odd and/or broken. >> >> What versions are you running: >> OS:? >> Kernel:? >> cman:? >> openais:? >> lvm2:? >> lvm2-cluster:? >> > > OS: RH 6.0 > Kernel: 2.6.32-71.el6.x86_64 > cman-3.0.12-23.el6.x86_64 > openais-1.1.1-6.el6.x86_64 > lvm2-2.02.72-8.el6.x86_64 > lvm2-cluster-2.02.72-8.el6.x86_64 > >> This is one you're probably going to have to raise with RedHat or >> someone on this list far >> more experienced than I. >> > Already placed it on the customer portal as well. > But mailing list are a better way to get the right specialists ;-) Then I suggest you mail lvm2 and/or OpenAIS mailing lists as well. Please don't take the discussion off list as it prevents others who have your problem from seeing the solution. CLVMD mirroring is quite new and its unlikely many people have experience with it. At this stage, there is nothing else I can provide apart from suggesting more obvious things like checking the state of OpenAIS / Corosync / whatever back-end you're using and checking you have no networking issues (failures, packet drops, broken multicast etc.) Good luck. > > Greetings, PieterB > -- Best Regards, Brett Delle Grazie From corey.kovacs at gmail.com Thu Feb 3 10:49:18 2011 From: corey.kovacs at gmail.com (Corey Kovacs) Date: Thu, 3 Feb 2011 10:49:18 +0000 Subject: [Linux-cluster] Multi-homing in rhel5 Message-ID: The cluster2 docs outline a procedure for multihoming which is unsupported by redhat. 
Is anyone actually using this method or are people more inclined to use configs in which secondary interfaces are given names by which the cluster then uses them as primary config nodes. For example, on my cluster I have eth0 as the primary interface for all normal system traffic, and eth1 as my cluster interconnect. eth0 - nodename eth1 - nodename-clu <-- cluster config points to this as nodes.... clients access the cluster services via eth0. I've seen other configs where people configure the cluster to use eth0 for cluster coms so that ricci/luci work correctly, but I don't use those. Is there an advantage of one method over the other ? From corey.kovacs at gmail.com Thu Feb 3 14:32:20 2011 From: corey.kovacs at gmail.com (Corey Kovacs) Date: Thu, 3 Feb 2011 14:32:20 +0000 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Excellent, Thanks -C On Thu, Feb 3, 2011 at 10:38 AM, ???? ???? wrote: > > > On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? wrote: >> >> >> >> https://access.redhat.com/kb/docs/DOC-3068 >> >> On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs >> wrote: >>> >>> Is using ha-lvm with clvmd a new capability? It's always been my >>> understanding that the lvm locking type for using ha-lvm had to be set >>> to '1'. >>> >>> I'd much rather be using clvmd if it is the way to go. Can you point >>> me to the docs you are seeing these instructions in please? >>> >>> As for why your config isn't working, clvmd requires that it's >>> resources are indeed tagged as cluster volumes, so you might try doing >>> that and see how it goes. >>> >>> -C >>> >>> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? wrote: >>> > Hello. >>> > >>> > >>> > >>> > I followed redhat instruction trying install HA-LVM with clvmd. ( rhcs >>> > 5.6 - >>> > rgmanager 2.0.52-9 ) >>> > >>> > >>> > >>> > I can't make it work. >>> > >>> > >>> > >>> > lvm.conf- locking_type=3 >>> > >>> > clvmd work >>> > >>> > Its failed saying HA-LVM is not configured correctly. >>> > >>> > The manual said that we should run "lvchange -a n lvxx" edit the >>> > cluster.conf & start the service. >>> > >>> > >>> > >>> > But From lvm.conf : >>> > >>> > >>> > >>> > case $1 in >>> > >>> > start) >>> > >>> > ?? ? ? ?if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ >>> > .....c >>> > ]]; then >>> > >>> > ?? ? ? ? ? ? ? ?ha_lvm_proper_setup_check || exit 1 >>> > >>> > >>> > >>> > If the vg is not taged as cluster than the ha_lvm is looking for >>> > volume_list >>> > in lvm.conf. >>> > >>> > >>> > >>> > I am confused- Does the VG should taged as cluster ?? ?( BTW - the old >>> > fashion HA-LVM is worked with no problems ) >>> > >>> > redhat instructions : >>> > >>> > To set up HA LVM Failover (using the preferred CLVM variant), perform >>> > the >>> > following steps: >>> > >>> > >>> > >>> > 1. Ensure that the parameter?locking_type?in the global section >>> > of?/etc/lvm/lvm.conf?is set to the value?'3', that all the necessary >>> > LVM >>> > cluster packages are installed, and the necessary daemons are started >>> > (like >>> > 'clvmd' and the cluster mirror log daemon - if necessary). >>> > >>> > >>> > >>> > 2. Create the logical volume and filesystem using standard LVM2 and >>> > file >>> > system commands. For example: >>> > >>> > # pvcreate /dev/sd[cde]1 >>> > >>> > ?# vgcreate /dev/sd[cde]1 >>> > >>> > ?# lvcreate -L 10G -n >>> > >>> > ?# mkfs.ext3 /dev// >>> > >>> > ?# lvchange -an / >>> > >>> > >>> > >>> > 3. 
Edit /etc/cluster/cluster.conf to include the newly created logical >>> > volume as a resource in one of your services. Alternatively, >>> > configuration >>> > tools such as?Conga?or?system-config-cluster?may be used to create >>> > these >>> > entries.? Below is a sample resource manager section >>> > from?/etc/cluster/cluster.conf: >>> > >>> > >>> > >>> > ? ??? ?????? >> > ordered="1" >>> > restricted="0"> ????????? >> > priority="1"/> >>> > ????????? >>> > ?? ?? ?????? >> > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> ?????? >> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" >>> > fsid="64050" >>> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> >>> > >>> > ?? >>> > ?????? ?????? ?? >>> > >>> > >>> > >>> > Regards >>> > >>> > Shalom. >>> > >>> > >>> > >>> > -- >>> > Linux-cluster mailing list >>> > Linux-cluster at redhat.com >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>> > >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From Colin.Simpson at iongeo.com Thu Feb 3 15:37:41 2011 From: Colin.Simpson at iongeo.com (Colin Simpson) Date: Thu, 03 Feb 2011 15:37:41 +0000 Subject: [Linux-cluster] Multi-homing in rhel5 In-Reply-To: References: Message-ID: <1296747461.23971.15.camel@cowie.iouk.ioroot.tld> I'd like to know best practice on this too. It's always seemed a bit unclear to me how to configure this if fail over is used "alt-name". Or how good or quick the failover is or worth it over bonding. Colin On Thu, 2011-02-03 at 10:49 +0000, Corey Kovacs wrote: > The cluster2 docs outline a procedure for multihoming which is > unsupported by redhat. > > Is anyone actually using this method or are people more inclined to > use configs in which secondary interfaces are given names by which the > cluster then uses them as primary config nodes. > > For example, on my cluster I have eth0 as the primary interface for > all normal system traffic, and eth1 as my cluster interconnect. > > eth0 - nodename > eth1 - nodename-clu <-- cluster config points to this as nodes.... > > clients access the cluster services via eth0. > > I've seen other configs where people configure the cluster to use eth0 > for cluster coms so that ricci/luci work correctly, but I don't use > those. > > Is there an advantage of one method over the other ? > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed. If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original. From Bennie_R_Thomas at raytheon.com Thu Feb 3 20:35:07 2011 From: Bennie_R_Thomas at raytheon.com (Bennie Thomas) Date: Thu, 03 Feb 2011 14:35:07 -0600 Subject: [Linux-cluster] Running cluster tools using non-root user In-Reply-To: References: Message-ID: <4D4B117B.2060804@raytheon.com> For the /usr/sbin/clustat to work for the basic user you must set uid. 
chmod u+s /usr/sbin/clustat You do not need sudo for this command to work. Now for clusvcadm. I would set up sudo for this, that way you can limit the users. Andrew Beekhof wrote: > On Thu, Jan 27, 2011 at 10:56 AM, Parvez Shaikh > wrote: > >> I believe Pacemaker is not same as "RHCS" >> > > Correct. At least not yet anyway. > Thats why I called my reply a shameless plug since it was for a > competing project. > > Pacemaker does ship in RHEL6 though. > > >> or do they share code? >> > > A Pacemaker installation shares almost all the underlying > infrastructure of what you know as RHCS - it just replaces the > rgmanager part. > > >> If yes, in which version of RHCS would this feature would be available? >> > > We can't comment on future releases sorry. > > >> I require to enable service, disable service, and get status. I am using CLI >> tools and any scripting trick can help me running clusvcadm and/or clustat. >> >> su -c "clusvcadm...." require entering password, can this also be eliminated >> using sudoers? >> >> Thanks >> >> On Wed, Jan 26, 2011 at 3:22 PM, Andrew Beekhof wrote: >> >>> [Shameless plug] >>> >>> The next version of Pacemaker (1.1.6) will have this feature :-) >>> The patches were merged form our devel branch about a week ago. >>> >>> [/Shameless plug] >>> >>> On Tue, Jan 25, 2011 at 10:39 AM, Parvez Shaikh >>> wrote: >>> >>>> Hi all >>>> >>>> Is it possible to run cluster tools like clustat or clusvcadm etc. using >>>> non-root user? >>>> >>>> If yes, to which groups this user should belong to? Otherwise can this >>>> be >>>> done using sudo(and sudoers) file. >>>> >>>> As of now I get following error on clustat - >>>> >>>> Could not connect to CMAN: Permission denied >>>> >>>> >>>> Thanks, >>>> Parvez >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> >> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- Bennie Thomas Sr. Information Systems Technologist II Raytheon Company 972.205.4126 972.205.6363 fax 888.347.1660 pager Bennie_R_Thomas at raytheon.com DISCLAIMER: This message contains information that may be confidential and privileged. Unless you are the addressee (or authorized to receive mail for the addressee), you should not use, copy or disclose to anyone this message or any information contained in this message. If you have received this message in error, please so advise the sender by reply e-mail and delete this message. Thank you for your cooperation. Any views or opinions presented are solely those of the author and do not necessarily represent those of Raytheon unless specifically stated. Electronic communications including email may be monitored by Raytheon for operational or business reasons. -------------- next part -------------- An HTML attachment was scrubbed... URL: From punit_j at rediffmail.com Fri Feb 4 12:14:41 2011 From: punit_j at rediffmail.com (punit_j) Date: 4 Feb 2011 12:14:41 -0000 Subject: [Linux-cluster] =?utf-8?q?Redhat_cluster_not_Quorate?= Message-ID: <20110204121441.17755.qmail@f5mail-236-235.rediffmail.com> Hi , I am using Redhat cluster suite for HA for my services. 
I have a 3+ 1 node cluster with 1 vote each for a node and also a Quoram disk with votes=3. So the total expected_votes is 7. The problem I am facing is if my 1 node goes down it causes all the nodes to be fenced and cluster to go inquorate. Is this an issue with my number of votes I assigned ? Thanks and Regards, Punit -------------- next part -------------- An HTML attachment was scrubbed... URL: From nehemiasjahcob at gmail.com Fri Feb 4 12:27:28 2011 From: nehemiasjahcob at gmail.com (Nehemias Urzua Q.) Date: Fri, 4 Feb 2011 09:27:28 -0300 Subject: [Linux-cluster] Redhat cluster not Quorate In-Reply-To: <20110204121441.17755.qmail@f5mail-236-235.rediffmail.com> References: <20110204121441.17755.qmail@f5mail-236-235.rediffmail.com> Message-ID: Hi You can send your configuration file please. best regards 2011/2/4 punit_j > Hi , > > I am using Redhat cluster suite for HA for my services. I have a 3+ 1 node > cluster with 1 vote each for a node and also a Quoram disk with votes=3. So > the total expected_votes is 7. > > The problem I am facing is if my 1 node goes down it causes all the nodes > to be fenced and cluster to go inquorate. > > Is this an issue with my number of votes I assigned ? > > Thanks and Regards, > Punit > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sklemer at gmail.com Fri Feb 4 13:13:01 2011 From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=) Date: Fri, 4 Feb 2011 15:13:01 +0200 Subject: [Linux-cluster] Redhat cluster not Quorate In-Reply-To: <20110204121441.17755.qmail@f5mail-236-235.rediffmail.com> References: <20110204121441.17755.qmail@f5mail-236-235.rediffmail.com> Message-ID: Hi. Can you please attach the cluster.conf file ? Shalom. On Fri, Feb 4, 2011 at 2:14 PM, punit_j wrote: > Hi , > > I am using Redhat cluster suite for HA for my services. I have a 3+ 1 node > cluster with 1 vote each for a node and also a Quoram disk with votes=3. So > the total expected_votes is 7. > > The problem I am facing is if my 1 node goes down it causes all the nodes > to be fenced and cluster to go inquorate. > > Is this an issue with my number of votes I assigned ? > > Thanks and Regards, > Punit > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From share2dom at gmail.com Fri Feb 4 14:32:55 2011 From: share2dom at gmail.com (Dominic Geevarghese) Date: Fri, 4 Feb 2011 20:02:55 +0530 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Hi, I am not sure about the error you are getting but it would be great if you could try the preferred method locking_type = 1 volume_list [ "your-root-vg-name" , "@hostname" ] rebuild initrd add the and resources in cluster.conf , start the cman, rgmanager . Thanks, On Thu, Feb 3, 2011 at 8:02 PM, Corey Kovacs wrote: > Excellent, > > > Thanks > > -C > > On Thu, Feb 3, 2011 at 10:38 AM, ???? ???? wrote: > > > > > > On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? wrote: > >> > >> > >> > >> https://access.redhat.com/kb/docs/DOC-3068 > >> > >> On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs > >> wrote: > >>> > >>> Is using ha-lvm with clvmd a new capability? It's always been my > >>> understanding that the lvm locking type for using ha-lvm had to be set > >>> to '1'. 
> >>> > >>> I'd much rather be using clvmd if it is the way to go. Can you point > >>> me to the docs you are seeing these instructions in please? > >>> > >>> As for why your config isn't working, clvmd requires that it's > >>> resources are indeed tagged as cluster volumes, so you might try doing > >>> that and see how it goes. > >>> > >>> -C > >>> > >>> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? wrote: > >>> > Hello. > >>> > > >>> > > >>> > > >>> > I followed redhat instruction trying install HA-LVM with clvmd. ( > rhcs > >>> > 5.6 - > >>> > rgmanager 2.0.52-9 ) > >>> > > >>> > > >>> > > >>> > I can't make it work. > >>> > > >>> > > >>> > > >>> > lvm.conf- locking_type=3 > >>> > > >>> > clvmd work > >>> > > >>> > Its failed saying HA-LVM is not configured correctly. > >>> > > >>> > The manual said that we should run "lvchange -a n lvxx" edit the > >>> > cluster.conf & start the service. > >>> > > >>> > > >>> > > >>> > But From lvm.conf : > >>> > > >>> > > >>> > > >>> > case $1 in > >>> > > >>> > start) > >>> > > >>> > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ > >>> > .....c > >>> > ]]; then > >>> > > >>> > ha_lvm_proper_setup_check || exit 1 > >>> > > >>> > > >>> > > >>> > If the vg is not taged as cluster than the ha_lvm is looking for > >>> > volume_list > >>> > in lvm.conf. > >>> > > >>> > > >>> > > >>> > I am confused- Does the VG should taged as cluster ?? ( BTW - the > old > >>> > fashion HA-LVM is worked with no problems ) > >>> > > >>> > redhat instructions : > >>> > > >>> > To set up HA LVM Failover (using the preferred CLVM variant), perform > >>> > the > >>> > following steps: > >>> > > >>> > > >>> > > >>> > 1. Ensure that the parameter locking_type in the global section > >>> > of /etc/lvm/lvm.conf is set to the value '3', that all the necessary > >>> > LVM > >>> > cluster packages are installed, and the necessary daemons are started > >>> > (like > >>> > 'clvmd' and the cluster mirror log daemon - if necessary). > >>> > > >>> > > >>> > > >>> > 2. Create the logical volume and filesystem using standard LVM2 and > >>> > file > >>> > system commands. For example: > >>> > > >>> > # pvcreate /dev/sd[cde]1 > >>> > > >>> > # vgcreate /dev/sd[cde]1 > >>> > > >>> > # lvcreate -L 10G -n > >>> > > >>> > # mkfs.ext3 /dev// > >>> > > >>> > # lvchange -an / > >>> > > >>> > > >>> > > >>> > 3. Edit /etc/cluster/cluster.conf to include the newly created > logical > >>> > volume as a resource in one of your services. Alternatively, > >>> > configuration > >>> > tools such as Conga or system-config-cluster may be used to create > >>> > these > >>> > entries. Below is a sample resource manager section > >>> > from /etc/cluster/cluster.conf: > >>> > > >>> > > >>> > > >>> > >>> > ordered="1" > >>> > restricted="0"> >>> > priority="1"/> > >>> > > >>> > >>> > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> >>> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" > >>> > fsid="64050" > >>> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> > >>> > > >>> > recovery="relocate"> > >>> > > >>> > > >>> > > >>> > > >>> > Regards > >>> > > >>> > Shalom. 
> >>> > > >>> > > >>> > > >>> > -- > >>> > Linux-cluster mailing list > >>> > Linux-cluster at redhat.com > >>> > https://www.redhat.com/mailman/listinfo/linux-cluster > >>> > > >>> > >>> -- > >>> Linux-cluster mailing list > >>> Linux-cluster at redhat.com > >>> https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sklemer at gmail.com Fri Feb 4 15:04:44 2011 From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=) Date: Fri, 4 Feb 2011 17:04:44 +0200 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Thanks. This is the old methos and its work great but its hard to maintain such cluster. Shalom. On Fri, Feb 4, 2011 at 4:32 PM, Dominic Geevarghese wrote: > > Hi, > > I am not sure about the error you are getting but it would be great if you > could try the preferred method > > locking_type = 1 > volume_list [ "your-root-vg-name" , "@hostname" ] > > rebuild initrd > > add the and resources in cluster.conf , start the cman, > rgmanager . > > > Thanks, > > On Thu, Feb 3, 2011 at 8:02 PM, Corey Kovacs wrote: > >> Excellent, >> >> >> Thanks >> >> -C >> >> On Thu, Feb 3, 2011 at 10:38 AM, ???? ???? wrote: >> > >> > >> > On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? wrote: >> >> >> >> >> >> >> >> https://access.redhat.com/kb/docs/DOC-3068 >> >> >> >> On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs >> >> wrote: >> >>> >> >>> Is using ha-lvm with clvmd a new capability? It's always been my >> >>> understanding that the lvm locking type for using ha-lvm had to be set >> >>> to '1'. >> >>> >> >>> I'd much rather be using clvmd if it is the way to go. Can you point >> >>> me to the docs you are seeing these instructions in please? >> >>> >> >>> As for why your config isn't working, clvmd requires that it's >> >>> resources are indeed tagged as cluster volumes, so you might try doing >> >>> that and see how it goes. >> >>> >> >>> -C >> >>> >> >>> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? wrote: >> >>> > Hello. >> >>> > >> >>> > >> >>> > >> >>> > I followed redhat instruction trying install HA-LVM with clvmd. ( >> rhcs >> >>> > 5.6 - >> >>> > rgmanager 2.0.52-9 ) >> >>> > >> >>> > >> >>> > >> >>> > I can't make it work. >> >>> > >> >>> > >> >>> > >> >>> > lvm.conf- locking_type=3 >> >>> > >> >>> > clvmd work >> >>> > >> >>> > Its failed saying HA-LVM is not configured correctly. >> >>> > >> >>> > The manual said that we should run "lvchange -a n lvxx" edit the >> >>> > cluster.conf & start the service. >> >>> > >> >>> > >> >>> > >> >>> > But From lvm.conf : >> >>> > >> >>> > >> >>> > >> >>> > case $1 in >> >>> > >> >>> > start) >> >>> > >> >>> > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ >> >>> > .....c >> >>> > ]]; then >> >>> > >> >>> > ha_lvm_proper_setup_check || exit 1 >> >>> > >> >>> > >> >>> > >> >>> > If the vg is not taged as cluster than the ha_lvm is looking for >> >>> > volume_list >> >>> > in lvm.conf. >> >>> > >> >>> > >> >>> > >> >>> > I am confused- Does the VG should taged as cluster ?? 
( BTW - the >> old >> >>> > fashion HA-LVM is worked with no problems ) >> >>> > >> >>> > redhat instructions : >> >>> > >> >>> > To set up HA LVM Failover (using the preferred CLVM variant), >> perform >> >>> > the >> >>> > following steps: >> >>> > >> >>> > >> >>> > >> >>> > 1. Ensure that the parameter locking_type in the global section >> >>> > of /etc/lvm/lvm.conf is set to the value '3', that all the necessary >> >>> > LVM >> >>> > cluster packages are installed, and the necessary daemons are >> started >> >>> > (like >> >>> > 'clvmd' and the cluster mirror log daemon - if necessary). >> >>> > >> >>> > >> >>> > >> >>> > 2. Create the logical volume and filesystem using standard LVM2 and >> >>> > file >> >>> > system commands. For example: >> >>> > >> >>> > # pvcreate /dev/sd[cde]1 >> >>> > >> >>> > # vgcreate /dev/sd[cde]1 >> >>> > >> >>> > # lvcreate -L 10G -n >> >>> > >> >>> > # mkfs.ext3 /dev// >> >>> > >> >>> > # lvchange -an / >> >>> > >> >>> > >> >>> > >> >>> > 3. Edit /etc/cluster/cluster.conf to include the newly created >> logical >> >>> > volume as a resource in one of your services. Alternatively, >> >>> > configuration >> >>> > tools such as Conga or system-config-cluster may be used to create >> >>> > these >> >>> > entries. Below is a sample resource manager section >> >>> > from /etc/cluster/cluster.conf: >> >>> > >> >>> > >> >>> > >> >>> > > >>> > ordered="1" >> >>> > restricted="0"> > >>> > priority="1"/> >> >>> > >> >>> > > >>> > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> > name="FS" >> >>> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" >> >>> > fsid="64050" >> >>> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> >> >>> > >> >>> > > recovery="relocate"> >> >>> > >> >>> > >> >>> > >> >>> > >> >>> > Regards >> >>> > >> >>> > Shalom. >> >>> > >> >>> > >> >>> > >> >>> > -- >> >>> > Linux-cluster mailing list >> >>> > Linux-cluster at redhat.com >> >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >> >>> > >> >>> >> >>> -- >> >>> Linux-cluster mailing list >> >>> Linux-cluster at redhat.com >> >>> https://www.redhat.com/mailman/listinfo/linux-cluster >> > >> > >> > -- >> > Linux-cluster mailing list >> > Linux-cluster at redhat.com >> > https://www.redhat.com/mailman/listinfo/linux-cluster >> > >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From share2dom at gmail.com Fri Feb 4 15:23:02 2011 From: share2dom at gmail.com (dOminic) Date: Fri, 4 Feb 2011 20:53:02 +0530 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Hi Shalom, Are you still facing problem implementing HA-LVM with locking_type = 3 setting ?. If yes, it would be great if you could provide the following details . So that others can also check * steps you are following along with complete output * status in "clustat" after making changes in cluster.conf * attach cluster.conf and /var/log/messages. dominic On Fri, Feb 4, 2011 at 8:34 PM, ???? ???? wrote: > Thanks. > > This is the old methos and its work great but its hard to maintain such > cluster. > > Shalom. 
> > > On Fri, Feb 4, 2011 at 4:32 PM, Dominic Geevarghese wrote: > >> >> Hi, >> >> I am not sure about the error you are getting but it would be great if you >> could try the preferred method >> >> locking_type = 1 >> volume_list [ "your-root-vg-name" , "@hostname" ] >> >> rebuild initrd >> >> add the and resources in cluster.conf , start the cman, >> rgmanager . >> >> >> Thanks, >> >> On Thu, Feb 3, 2011 at 8:02 PM, Corey Kovacs wrote: >> >>> Excellent, >>> >>> >>> Thanks >>> >>> -C >>> >>> On Thu, Feb 3, 2011 at 10:38 AM, ???? ???? wrote: >>> > >>> > >>> > On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? wrote: >>> >> >>> >> >>> >> >>> >> https://access.redhat.com/kb/docs/DOC-3068 >>> >> >>> >> On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs >> > >>> >> wrote: >>> >>> >>> >>> Is using ha-lvm with clvmd a new capability? It's always been my >>> >>> understanding that the lvm locking type for using ha-lvm had to be >>> set >>> >>> to '1'. >>> >>> >>> >>> I'd much rather be using clvmd if it is the way to go. Can you point >>> >>> me to the docs you are seeing these instructions in please? >>> >>> >>> >>> As for why your config isn't working, clvmd requires that it's >>> >>> resources are indeed tagged as cluster volumes, so you might try >>> doing >>> >>> that and see how it goes. >>> >>> >>> >>> -C >>> >>> >>> >>> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? wrote: >>> >>> > Hello. >>> >>> > >>> >>> > >>> >>> > >>> >>> > I followed redhat instruction trying install HA-LVM with clvmd. ( >>> rhcs >>> >>> > 5.6 - >>> >>> > rgmanager 2.0.52-9 ) >>> >>> > >>> >>> > >>> >>> > >>> >>> > I can't make it work. >>> >>> > >>> >>> > >>> >>> > >>> >>> > lvm.conf- locking_type=3 >>> >>> > >>> >>> > clvmd work >>> >>> > >>> >>> > Its failed saying HA-LVM is not configured correctly. >>> >>> > >>> >>> > The manual said that we should run "lvchange -a n lvxx" edit the >>> >>> > cluster.conf & start the service. >>> >>> > >>> >>> > >>> >>> > >>> >>> > But From lvm.conf : >>> >>> > >>> >>> > >>> >>> > >>> >>> > case $1 in >>> >>> > >>> >>> > start) >>> >>> > >>> >>> > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ >>> >>> > .....c >>> >>> > ]]; then >>> >>> > >>> >>> > ha_lvm_proper_setup_check || exit 1 >>> >>> > >>> >>> > >>> >>> > >>> >>> > If the vg is not taged as cluster than the ha_lvm is looking for >>> >>> > volume_list >>> >>> > in lvm.conf. >>> >>> > >>> >>> > >>> >>> > >>> >>> > I am confused- Does the VG should taged as cluster ?? ( BTW - the >>> old >>> >>> > fashion HA-LVM is worked with no problems ) >>> >>> > >>> >>> > redhat instructions : >>> >>> > >>> >>> > To set up HA LVM Failover (using the preferred CLVM variant), >>> perform >>> >>> > the >>> >>> > following steps: >>> >>> > >>> >>> > >>> >>> > >>> >>> > 1. Ensure that the parameter locking_type in the global section >>> >>> > of /etc/lvm/lvm.conf is set to the value '3', that all the >>> necessary >>> >>> > LVM >>> >>> > cluster packages are installed, and the necessary daemons are >>> started >>> >>> > (like >>> >>> > 'clvmd' and the cluster mirror log daemon - if necessary). >>> >>> > >>> >>> > >>> >>> > >>> >>> > 2. Create the logical volume and filesystem using standard LVM2 and >>> >>> > file >>> >>> > system commands. For example: >>> >>> > >>> >>> > # pvcreate /dev/sd[cde]1 >>> >>> > >>> >>> > # vgcreate /dev/sd[cde]1 >>> >>> > >>> >>> > # lvcreate -L 10G -n >>> >>> > >>> >>> > # mkfs.ext3 /dev// >>> >>> > >>> >>> > # lvchange -an / >>> >>> > >>> >>> > >>> >>> > >>> >>> > 3. 
Edit /etc/cluster/cluster.conf to include the newly created >>> logical >>> >>> > volume as a resource in one of your services. Alternatively, >>> >>> > configuration >>> >>> > tools such as Conga or system-config-cluster may be used to create >>> >>> > these >>> >>> > entries. Below is a sample resource manager section >>> >>> > from /etc/cluster/cluster.conf: >>> >>> > >>> >>> > >>> >>> > >>> >>> > >> >>> > ordered="1" >>> >>> > restricted="0"> >> >>> > priority="1"/> >>> >>> > >>> >>> > >> >>> > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> >> name="FS" >>> >>> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" >>> >>> > fsid="64050" >>> >>> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> >>> >>> > >>> >>> > >> recovery="relocate"> >>> >>> > >>> >>> > >>> >>> > >>> >>> > >>> >>> > Regards >>> >>> > >>> >>> > Shalom. >>> >>> > >>> >>> > >>> >>> > >>> >>> > -- >>> >>> > Linux-cluster mailing list >>> >>> > Linux-cluster at redhat.com >>> >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>> >>> > >>> >>> >>> >>> -- >>> >>> Linux-cluster mailing list >>> >>> Linux-cluster at redhat.com >>> >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> > >>> > >>> > -- >>> > Linux-cluster mailing list >>> > Linux-cluster at redhat.com >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>> > >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhayden.public at gmail.com Fri Feb 4 18:57:45 2011 From: rhayden.public at gmail.com (Robert Hayden) Date: Fri, 4 Feb 2011 12:57:45 -0600 Subject: [Linux-cluster] IPv6 Setup with RHCS Message-ID: I have searched for a concrete example of RHCS in a pure IPv6 environment, but I have only found references that IPv6 is supported. Does anyone have experience with setting up RHCS with IPv6 that they would be willing to share? Any good, technical papers out there? In particular, I would like to stay in the RHEL 5.x release, but would consider RHEL 6 options as well. I am wanting to protect a simple, custom application that requires a floating IPv6 IP address resource. Thanks Robert -------------- next part -------------- An HTML attachment was scrubbed... URL: From shea.benn at gmail.com Fri Feb 4 19:07:03 2011 From: shea.benn at gmail.com (Shea Bennett) Date: Fri, 4 Feb 2011 14:07:03 -0500 Subject: [Linux-cluster] IPv6 Setup with RHCS In-Reply-To: References: Message-ID: Robert, I am searching for the same thing and haven't found anything definitive on http://docs.redhat.com. I have a support call to RedHat setup for Monday. I will update with what I find out. Shea On Fri, Feb 4, 2011 at 13:57, Robert Hayden wrote: > I have searched for a concrete example of RHCS in a pure IPv6 environment, > but I have only found references that IPv6 is supported. > > Does anyone have experience with setting up RHCS with IPv6 that they would > be willing to share? Any good, technical papers out there? In particular, > I would like to stay in the RHEL 5.x release, but would consider RHEL 6 > options as well. 
I am wanting to protect a simple, custom application that > requires a floating IPv6 IP address resource. > > Thanks > Robert > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- *P** Please consider the environment before printing this e-mail* This e-mail message and all documents that accompany it may contain privileged or confidential information, and are intended only for the use of the individual or entity to which addressed. Any unauthorized disclosure or distribution of this e-mail message is prohibited. If you have received this e-mail message in error, please notify me immediately. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sklemer at gmail.com Fri Feb 4 22:07:36 2011 From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=) Date: Sat, 5 Feb 2011 00:07:36 +0200 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Hello Dominic. I will be in lab on monday and collect all steps & logs files. btw - Is it redhat recommendation to preferre using HA-LVM with the locking_type=1 method ?? Regards Shalom On Fri, Feb 4, 2011 at 5:23 PM, dOminic wrote: > Hi Shalom, > > Are you still facing problem implementing HA-LVM with locking_type = 3 > setting ?. If yes, it would be great if you could provide the following > details . > So that others can also check > > * steps you are following along with complete output > * status in "clustat" after making changes in cluster.conf > * attach cluster.conf and /var/log/messages. > > dominic > > On Fri, Feb 4, 2011 at 8:34 PM, ???? ???? wrote: > >> Thanks. >> >> This is the old methos and its work great but its hard to maintain such >> cluster. >> >> Shalom. >> >> >> On Fri, Feb 4, 2011 at 4:32 PM, Dominic Geevarghese wrote: >> >>> >>> Hi, >>> >>> I am not sure about the error you are getting but it would be great if >>> you could try the preferred method >>> >>> locking_type = 1 >>> volume_list [ "your-root-vg-name" , "@hostname" ] >>> >>> rebuild initrd >>> >>> add the and resources in cluster.conf , start the cman, >>> rgmanager . >>> >>> >>> Thanks, >>> >>> On Thu, Feb 3, 2011 at 8:02 PM, Corey Kovacs wrote: >>> >>>> Excellent, >>>> >>>> >>>> Thanks >>>> >>>> -C >>>> >>>> On Thu, Feb 3, 2011 at 10:38 AM, ???? ???? wrote: >>>> > >>>> > >>>> > On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? wrote: >>>> >> >>>> >> >>>> >> >>>> >> https://access.redhat.com/kb/docs/DOC-3068 >>>> >> >>>> >> On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs < >>>> corey.kovacs at gmail.com> >>>> >> wrote: >>>> >>> >>>> >>> Is using ha-lvm with clvmd a new capability? It's always been my >>>> >>> understanding that the lvm locking type for using ha-lvm had to be >>>> set >>>> >>> to '1'. >>>> >>> >>>> >>> I'd much rather be using clvmd if it is the way to go. Can you point >>>> >>> me to the docs you are seeing these instructions in please? >>>> >>> >>>> >>> As for why your config isn't working, clvmd requires that it's >>>> >>> resources are indeed tagged as cluster volumes, so you might try >>>> doing >>>> >>> that and see how it goes. >>>> >>> >>>> >>> -C >>>> >>> >>>> >>> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? >>>> wrote: >>>> >>> > Hello. >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > I followed redhat instruction trying install HA-LVM with clvmd. ( >>>> rhcs >>>> >>> > 5.6 - >>>> >>> > rgmanager 2.0.52-9 ) >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > I can't make it work. 
>>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > lvm.conf- locking_type=3 >>>> >>> > >>>> >>> > clvmd work >>>> >>> > >>>> >>> > Its failed saying HA-LVM is not configured correctly. >>>> >>> > >>>> >>> > The manual said that we should run "lvchange -a n lvxx" edit the >>>> >>> > cluster.conf & start the service. >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > But From lvm.conf : >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > case $1 in >>>> >>> > >>>> >>> > start) >>>> >>> > >>>> >>> > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ >>>> >>> > .....c >>>> >>> > ]]; then >>>> >>> > >>>> >>> > ha_lvm_proper_setup_check || exit 1 >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > If the vg is not taged as cluster than the ha_lvm is looking for >>>> >>> > volume_list >>>> >>> > in lvm.conf. >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > I am confused- Does the VG should taged as cluster ?? ( BTW - the >>>> old >>>> >>> > fashion HA-LVM is worked with no problems ) >>>> >>> > >>>> >>> > redhat instructions : >>>> >>> > >>>> >>> > To set up HA LVM Failover (using the preferred CLVM variant), >>>> perform >>>> >>> > the >>>> >>> > following steps: >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > 1. Ensure that the parameter locking_type in the global section >>>> >>> > of /etc/lvm/lvm.conf is set to the value '3', that all the >>>> necessary >>>> >>> > LVM >>>> >>> > cluster packages are installed, and the necessary daemons are >>>> started >>>> >>> > (like >>>> >>> > 'clvmd' and the cluster mirror log daemon - if necessary). >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > 2. Create the logical volume and filesystem using standard LVM2 >>>> and >>>> >>> > file >>>> >>> > system commands. For example: >>>> >>> > >>>> >>> > # pvcreate /dev/sd[cde]1 >>>> >>> > >>>> >>> > # vgcreate /dev/sd[cde]1 >>>> >>> > >>>> >>> > # lvcreate -L 10G -n >>>> >>> > >>>> >>> > # mkfs.ext3 /dev// >>>> >>> > >>>> >>> > # lvchange -an / >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > 3. Edit /etc/cluster/cluster.conf to include the newly created >>>> logical >>>> >>> > volume as a resource in one of your services. Alternatively, >>>> >>> > configuration >>>> >>> > tools such as Conga or system-config-cluster may be used to create >>>> >>> > these >>>> >>> > entries. Below is a sample resource manager section >>>> >>> > from /etc/cluster/cluster.conf: >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > >>> >>> > ordered="1" >>>> >>> > restricted="0"> >>> >>> > priority="1"/> >>>> >>> > >>>> >>> > >>> >>> > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> >>> name="FS" >>>> >>> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" >>>> >>> > fsid="64050" >>>> >>> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> >>>> >>> > >>>> >>> > >>> recovery="relocate"> >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > Regards >>>> >>> > >>>> >>> > Shalom. 
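The sample resource manager section quoted just above lost its XML element
names on its way through the list archive, so only the attribute values
survived. A rough reconstruction of that cluster.conf fragment is shown
below; the element names and anything not visible in the quoted attributes
(the failover domain, node and service names, priorities, autostart) are
filled in as examples only:

<rm>
    <failoverdomains>
        <failoverdomain name="FD" ordered="1" restricted="0">
            <failoverdomainnode name="node-01" priority="1"/>
            <failoverdomainnode name="node-02" priority="2"/>
        </failoverdomain>
    </failoverdomains>
    <resources>
        <lvm name="lvm" vg_name="shared_vg" lv_name="ha-lv"/>
        <fs name="FS" device="/dev/shared_vg/ha-lv" force_fsck="0"
            force_unmount="1" fsid="64050" fstype="ext3"
            mountpoint="/mnt" options="" self_fence="0"/>
    </resources>
    <service autostart="1" domain="FD" name="serv" recovery="relocate">
        <lvm ref="lvm"/>
        <fs ref="FS"/>
    </service>
</rm>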
>>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > -- >>>> >>> > Linux-cluster mailing list >>>> >>> > Linux-cluster at redhat.com >>>> >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> > >>>> >>> >>>> >>> -- >>>> >>> Linux-cluster mailing list >>>> >>> Linux-cluster at redhat.com >>>> >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> > >>>> > >>>> > -- >>>> > Linux-cluster mailing list >>>> > Linux-cluster at redhat.com >>>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>>> > >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdinitto at redhat.com Sat Feb 5 06:55:55 2011 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Sat, 05 Feb 2011 07:55:55 +0100 Subject: [Linux-cluster] IPv6 Setup with RHCS In-Reply-To: References: Message-ID: <4D4CF47B.5060103@redhat.com> Hi Robert On 02/04/2011 07:57 PM, Robert Hayden wrote: > I have searched for a concrete example of RHCS in a pure IPv6 > environment, but I have only found references that IPv6 is supported. > > Does anyone have experience with setting up RHCS with IPv6 that they > would be willing to share? Yes, I use IPv6 for testing RHCS before each release. The real question is: do you want to use IPv6 for cluster heartbeat/backend and/or do you want RHCS to drive for example IPv6 virtual IPs? > Any good, technical papers out there? In > particular, I would like to stay in the RHEL 5.x release, but would > consider RHEL 6 options as well. I am wanting to protect a simple, > custom application that requires a floating IPv6 IP address resource. VIP should work just fine in both, but I only tested RHEL6 deeply with IPv6 (both backend and VIP). Keep in mind that any application you want to use (that being custom made or any other resource) must be IPv6 aware. RHCS will only manage the VIP for you and will not make your application IPv6 compliant (a common thing people ask, while it sounds obvious to many, it is source of confusion to many more). In terms of configuration, it is really no different than an IPv4 VIP. Simply replace the ipv4 address (or add one.. you get the idea) with an IPv6 address. Same requirements apply too. The IPv4 VIP requires one node interface to have an IPv4 on the same network. This is no different for IPv6. That means, before you start setting up IPv6 in RHCS, make sure that IPv6 is configured and working on the host system otherwise you will spend endless time debugging. Fabio From sklemer at gmail.com Sat Feb 5 07:37:37 2011 From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=) Date: Sat, 5 Feb 2011 09:37:37 +0200 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Hi. After reading the manual again ,I think i know what was my setup problem. The VGs should be taged as cluster , ( this action will path the lvm.sh check ) & the LVs should be deactivated. The cluster will activate the LVs as exclusive. (i will check it on monday ). Shalom. 
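For anyone following along, the tagging and deactivation steps described
here can be sketched as follows, reusing the shared_vg/ha-lv names from the
sample config earlier in the thread (they are examples, not values from
this setup). Run on one node while clvmd is running cluster-wide:

# vgchange -c y shared_vg
      (marks the VG as clustered - this sets the trailing "c" in the
       attribute string that lvm.sh tests for)
# lvchange -a n shared_vg/ha-lv
      (leave the LV deactivated so that rgmanager controls activation)
# vgs -o vg_name,vg_attr shared_vg
      (the attribute string should now end in "c")

When rgmanager starts the service, the lvm resource agent activates the
volume exclusively, which is roughly what

# lvchange -a ey shared_vg/ha-lv

does by hand; while the LV is held exclusively on the active node, a plain
"lvchange -a y" of the same LV on the passive node should be refused.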
On Sat, Feb 5, 2011 at 12:07 AM, ???? ???? wrote: > Hello Dominic. > > I will be in lab on monday and collect all steps & logs files. > > btw - Is it redhat recommendation to preferre using HA-LVM with the > locking_type=1 method ?? > > > Regards > > Shalom > > > On Fri, Feb 4, 2011 at 5:23 PM, dOminic wrote: > >> Hi Shalom, >> >> Are you still facing problem implementing HA-LVM with locking_type = 3 >> setting ?. If yes, it would be great if you could provide the following >> details . >> So that others can also check >> >> * steps you are following along with complete output >> * status in "clustat" after making changes in cluster.conf >> * attach cluster.conf and /var/log/messages. >> >> dominic >> >> On Fri, Feb 4, 2011 at 8:34 PM, ???? ???? wrote: >> >>> Thanks. >>> >>> This is the old methos and its work great but its hard to maintain such >>> cluster. >>> >>> Shalom. >>> >>> >>> On Fri, Feb 4, 2011 at 4:32 PM, Dominic Geevarghese >> > wrote: >>> >>>> >>>> Hi, >>>> >>>> I am not sure about the error you are getting but it would be great if >>>> you could try the preferred method >>>> >>>> locking_type = 1 >>>> volume_list [ "your-root-vg-name" , "@hostname" ] >>>> >>>> rebuild initrd >>>> >>>> add the and resources in cluster.conf , start the cman, >>>> rgmanager . >>>> >>>> >>>> Thanks, >>>> >>>> On Thu, Feb 3, 2011 at 8:02 PM, Corey Kovacs wrote: >>>> >>>>> Excellent, >>>>> >>>>> >>>>> Thanks >>>>> >>>>> -C >>>>> >>>>> On Thu, Feb 3, 2011 at 10:38 AM, ???? ???? wrote: >>>>> > >>>>> > >>>>> > On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? >>>>> wrote: >>>>> >> >>>>> >> >>>>> >> >>>>> >> https://access.redhat.com/kb/docs/DOC-3068 >>>>> >> >>>>> >> On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs < >>>>> corey.kovacs at gmail.com> >>>>> >> wrote: >>>>> >>> >>>>> >>> Is using ha-lvm with clvmd a new capability? It's always been my >>>>> >>> understanding that the lvm locking type for using ha-lvm had to be >>>>> set >>>>> >>> to '1'. >>>>> >>> >>>>> >>> I'd much rather be using clvmd if it is the way to go. Can you >>>>> point >>>>> >>> me to the docs you are seeing these instructions in please? >>>>> >>> >>>>> >>> As for why your config isn't working, clvmd requires that it's >>>>> >>> resources are indeed tagged as cluster volumes, so you might try >>>>> doing >>>>> >>> that and see how it goes. >>>>> >>> >>>>> >>> -C >>>>> >>> >>>>> >>> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? >>>>> wrote: >>>>> >>> > Hello. >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > I followed redhat instruction trying install HA-LVM with clvmd. ( >>>>> rhcs >>>>> >>> > 5.6 - >>>>> >>> > rgmanager 2.0.52-9 ) >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > I can't make it work. >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > lvm.conf- locking_type=3 >>>>> >>> > >>>>> >>> > clvmd work >>>>> >>> > >>>>> >>> > Its failed saying HA-LVM is not configured correctly. >>>>> >>> > >>>>> >>> > The manual said that we should run "lvchange -a n lvxx" edit the >>>>> >>> > cluster.conf & start the service. >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > But From lvm.conf : >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > case $1 in >>>>> >>> > >>>>> >>> > start) >>>>> >>> > >>>>> >>> > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) >>>>> =~ >>>>> >>> > .....c >>>>> >>> > ]]; then >>>>> >>> > >>>>> >>> > ha_lvm_proper_setup_check || exit 1 >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > If the vg is not taged as cluster than the ha_lvm is looking for >>>>> >>> > volume_list >>>>> >>> > in lvm.conf. 
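For reference, the volume_list line that the script falls back to checking
(the tag-based, locking_type=1 variant mentioned earlier in the thread) is
written in /etc/lvm/lvm.conf along these lines; the volume group and node
names are only examples:

    volume_list = [ "VolGroup00", "@node1.example.com" ]

The entry after the "@" has to match the node name used in cluster.conf,
and the initrd has to be rebuilt afterwards so the restriction also applies
at boot time, e.g.:

    # mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)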
>>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > I am confused- Does the VG should taged as cluster ?? ( BTW - >>>>> the old >>>>> >>> > fashion HA-LVM is worked with no problems ) >>>>> >>> > >>>>> >>> > redhat instructions : >>>>> >>> > >>>>> >>> > To set up HA LVM Failover (using the preferred CLVM variant), >>>>> perform >>>>> >>> > the >>>>> >>> > following steps: >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > 1. Ensure that the parameter locking_type in the global section >>>>> >>> > of /etc/lvm/lvm.conf is set to the value '3', that all the >>>>> necessary >>>>> >>> > LVM >>>>> >>> > cluster packages are installed, and the necessary daemons are >>>>> started >>>>> >>> > (like >>>>> >>> > 'clvmd' and the cluster mirror log daemon - if necessary). >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > 2. Create the logical volume and filesystem using standard LVM2 >>>>> and >>>>> >>> > file >>>>> >>> > system commands. For example: >>>>> >>> > >>>>> >>> > # pvcreate /dev/sd[cde]1 >>>>> >>> > >>>>> >>> > # vgcreate /dev/sd[cde]1 >>>>> >>> > >>>>> >>> > # lvcreate -L 10G -n >>>>> >>> > >>>>> >>> > # mkfs.ext3 /dev// >>>>> >>> > >>>>> >>> > # lvchange -an / >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > 3. Edit /etc/cluster/cluster.conf to include the newly created >>>>> logical >>>>> >>> > volume as a resource in one of your services. Alternatively, >>>>> >>> > configuration >>>>> >>> > tools such as Conga or system-config-cluster may be used to >>>>> create >>>>> >>> > these >>>>> >>> > entries. Below is a sample resource manager section >>>>> >>> > from /etc/cluster/cluster.conf: >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>> >>> > ordered="1" >>>>> >>> > restricted="0"> >>>> >>> > priority="1"/> >>>>> >>> > >>>>> >>> > >>>>> >>>> >>> > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> >>>> name="FS" >>>>> >>> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" >>>>> >>> > fsid="64050" >>>>> >>> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> >>>>> >>> > >>>>> >>> > >>>> recovery="relocate"> >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > Regards >>>>> >>> > >>>>> >>> > Shalom. >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > -- >>>>> >>> > Linux-cluster mailing list >>>>> >>> > Linux-cluster at redhat.com >>>>> >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>> > >>>>> >>> >>>>> >>> -- >>>>> >>> Linux-cluster mailing list >>>>> >>> Linux-cluster at redhat.com >>>>> >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> > >>>>> > >>>>> > -- >>>>> > Linux-cluster mailing list >>>>> > Linux-cluster at redhat.com >>>>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> > >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From share2dom at gmail.com Sat Feb 5 10:24:24 2011 From: share2dom at gmail.com (dOminic) Date: Sat, 5 Feb 2011 15:54:24 +0530 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Hi, If you are you using traditional HA-LVM setup with locking_type = 1 , then you need to setup tagging / initrd rebuild Anything went wrong with tagging.... then it won't prevent admin to activate vg, remove the VG which is activated/used in another node. # vgchange -ay domvg 1 logical volume(s) in volume group "domvg" now active # lvremove /dev/domvg/domlv Do you really want to remove active logical volume domlv? [y/n]: y Logical volume "domlv" successfully removed If you are using clvmd variant with locking_type = 3 and cluster is *active/passive* then clvmd won't let you to remove the VG which is activate/used in another node. # lvremove /dev/domvg/domlv Do you really want to remove active clustered logical volume domlv? [y/n]: y Error locking on node node1.domtest.com: LV domvg/domlv in use: not deactivating Unable to deactivate logical volume "domlv" - dominic On Sat, Feb 5, 2011 at 1:07 PM, ???? ???? wrote: > Hi. > > After reading the manual again ,I think i know what was my setup problem. > > The VGs should be taged as cluster , ( this action will path the lvm.sh > check ) & the LVs should be deactivated. > > The cluster will activate the LVs as exclusive. (i will check it on monday > ). > > Shalom. > > > On Sat, Feb 5, 2011 at 12:07 AM, ???? ???? wrote: > >> Hello Dominic. >> >> I will be in lab on monday and collect all steps & logs files. >> >> btw - Is it redhat recommendation to preferre using HA-LVM with the >> locking_type=1 method ?? >> >> >> Regards >> >> Shalom >> >> >> On Fri, Feb 4, 2011 at 5:23 PM, dOminic wrote: >> >>> Hi Shalom, >>> >>> Are you still facing problem implementing HA-LVM with locking_type = 3 >>> setting ?. If yes, it would be great if you could provide the following >>> details . >>> So that others can also check >>> >>> * steps you are following along with complete output >>> * status in "clustat" after making changes in cluster.conf >>> * attach cluster.conf and /var/log/messages. >>> >>> dominic >>> >>> On Fri, Feb 4, 2011 at 8:34 PM, ???? ???? wrote: >>> >>>> Thanks. >>>> >>>> This is the old methos and its work great but its hard to maintain such >>>> cluster. >>>> >>>> Shalom. >>>> >>>> >>>> On Fri, Feb 4, 2011 at 4:32 PM, Dominic Geevarghese < >>>> share2dom at gmail.com> wrote: >>>> >>>>> >>>>> Hi, >>>>> >>>>> I am not sure about the error you are getting but it would be great if >>>>> you could try the preferred method >>>>> >>>>> locking_type = 1 >>>>> volume_list [ "your-root-vg-name" , "@hostname" ] >>>>> >>>>> rebuild initrd >>>>> >>>>> add the and resources in cluster.conf , start the cman, >>>>> rgmanager . >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> On Thu, Feb 3, 2011 at 8:02 PM, Corey Kovacs wrote: >>>>> >>>>>> Excellent, >>>>>> >>>>>> >>>>>> Thanks >>>>>> >>>>>> -C >>>>>> >>>>>> On Thu, Feb 3, 2011 at 10:38 AM, ???? ???? wrote: >>>>>> > >>>>>> > >>>>>> > On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? >>>>>> wrote: >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> https://access.redhat.com/kb/docs/DOC-3068 >>>>>> >> >>>>>> >> On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs < >>>>>> corey.kovacs at gmail.com> >>>>>> >> wrote: >>>>>> >>> >>>>>> >>> Is using ha-lvm with clvmd a new capability? It's always been my >>>>>> >>> understanding that the lvm locking type for using ha-lvm had to be >>>>>> set >>>>>> >>> to '1'. 
>>>>>> >>> >>>>>> >>> I'd much rather be using clvmd if it is the way to go. Can you >>>>>> point >>>>>> >>> me to the docs you are seeing these instructions in please? >>>>>> >>> >>>>>> >>> As for why your config isn't working, clvmd requires that it's >>>>>> >>> resources are indeed tagged as cluster volumes, so you might try >>>>>> doing >>>>>> >>> that and see how it goes. >>>>>> >>> >>>>>> >>> -C >>>>>> >>> >>>>>> >>> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? >>>>>> wrote: >>>>>> >>> > Hello. >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > I followed redhat instruction trying install HA-LVM with clvmd. >>>>>> ( rhcs >>>>>> >>> > 5.6 - >>>>>> >>> > rgmanager 2.0.52-9 ) >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > I can't make it work. >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > lvm.conf- locking_type=3 >>>>>> >>> > >>>>>> >>> > clvmd work >>>>>> >>> > >>>>>> >>> > Its failed saying HA-LVM is not configured correctly. >>>>>> >>> > >>>>>> >>> > The manual said that we should run "lvchange -a n lvxx" edit the >>>>>> >>> > cluster.conf & start the service. >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > But From lvm.conf : >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > case $1 in >>>>>> >>> > >>>>>> >>> > start) >>>>>> >>> > >>>>>> >>> > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) >>>>>> =~ >>>>>> >>> > .....c >>>>>> >>> > ]]; then >>>>>> >>> > >>>>>> >>> > ha_lvm_proper_setup_check || exit 1 >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > If the vg is not taged as cluster than the ha_lvm is looking for >>>>>> >>> > volume_list >>>>>> >>> > in lvm.conf. >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > I am confused- Does the VG should taged as cluster ?? ( BTW - >>>>>> the old >>>>>> >>> > fashion HA-LVM is worked with no problems ) >>>>>> >>> > >>>>>> >>> > redhat instructions : >>>>>> >>> > >>>>>> >>> > To set up HA LVM Failover (using the preferred CLVM variant), >>>>>> perform >>>>>> >>> > the >>>>>> >>> > following steps: >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > 1. Ensure that the parameter locking_type in the global section >>>>>> >>> > of /etc/lvm/lvm.conf is set to the value '3', that all the >>>>>> necessary >>>>>> >>> > LVM >>>>>> >>> > cluster packages are installed, and the necessary daemons are >>>>>> started >>>>>> >>> > (like >>>>>> >>> > 'clvmd' and the cluster mirror log daemon - if necessary). >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > 2. Create the logical volume and filesystem using standard LVM2 >>>>>> and >>>>>> >>> > file >>>>>> >>> > system commands. For example: >>>>>> >>> > >>>>>> >>> > # pvcreate /dev/sd[cde]1 >>>>>> >>> > >>>>>> >>> > # vgcreate /dev/sd[cde]1 >>>>>> >>> > >>>>>> >>> > # lvcreate -L 10G -n >>>>>> >>> > >>>>>> >>> > # mkfs.ext3 /dev// >>>>>> >>> > >>>>>> >>> > # lvchange -an / >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > 3. Edit /etc/cluster/cluster.conf to include the newly created >>>>>> logical >>>>>> >>> > volume as a resource in one of your services. Alternatively, >>>>>> >>> > configuration >>>>>> >>> > tools such as Conga or system-config-cluster may be used to >>>>>> create >>>>>> >>> > these >>>>>> >>> > entries. 
Below is a sample resource manager section >>>>>> >>> > from /etc/cluster/cluster.conf: >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>> >>> > ordered="1" >>>>>> >>> > restricted="0"> >>>>> >>> > priority="1"/> >>>>>> >>> > >>>>>> >>> > >>>>>> >>>>> >>> > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> >>>>> name="FS" >>>>>> >>> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" >>>>>> >>> > fsid="64050" >>>>>> >>> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> >>>>>> >>> > >>>>>> >>> > >>>>> recovery="relocate"> >>>>>> >>> > >>>>>> >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > Regards >>>>>> >>> > >>>>>> >>> > Shalom. >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > -- >>>>>> >>> > Linux-cluster mailing list >>>>>> >>> > Linux-cluster at redhat.com >>>>>> >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>> > >>>>>> >>> >>>>>> >>> -- >>>>>> >>> Linux-cluster mailing list >>>>>> >>> Linux-cluster at redhat.com >>>>>> >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> > >>>>>> > >>>>>> > -- >>>>>> > Linux-cluster mailing list >>>>>> > Linux-cluster at redhat.com >>>>>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> > >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>>>> >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdinitto at redhat.com Tue Feb 8 07:44:22 2011 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Tue, 08 Feb 2011 08:44:22 +0100 Subject: [Linux-cluster] fence-agents 3.1.1 stable release Message-ID: <4D50F456.30009@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Welcome to the second fence-agents standalone release. This release contains a few bug fixes and a brand new agent for Eaton ePDU devices (courtesy of Arnaud Quette). The new source tarball can be downloaded here: https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-3.1.1.tar.xz To report bugs or issues: https://bugzilla.redhat.com/ Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other sysadministrators or power users. Thanks/congratulations to all people that contributed to achieve this great milestone. Happy clustering, Fabio Under the hood (from 3.1.0): Fabio M. 
Di Nitto (3): Fix build for distributions that don't use bash as default shell build: fix make dist target Update COPYRIGHT file Marek 'marx' Grac (3): fence_ipmilan: Add "diag" option to support "ipmitool chassis power diag" fence_ipmilan: Fix manual page to describe usage with HP iLO 3 fence_eaton_snmp: New fence agent for Eaton devices Ryan O'Hara (6): fence_scsi: identify dm-multipath devices correctly fence_scsi: fix regular expression for grep fence_scsi: always do sg_turs before registration fence_scsi: always do sg_turs for dm-mp devices fence_scsi: verify that on/off actions succeed fence_scsi: properly log errors for all commands configure.ac | 1 + doc/COPYRIGHT | 19 ++-- fence/agents/Makefile.am | 1 + fence/agents/alom/Makefile.am | 2 +- fence/agents/apc/Makefile.am | 2 +- fence/agents/apc_snmp/Makefile.am | 2 +- fence/agents/bladecenter/Makefile.am | 2 +- fence/agents/cisco_mds/Makefile.am | 2 +- fence/agents/cisco_ucs/Makefile.am | 2 +- fence/agents/drac5/Makefile.am | 2 +- fence/agents/eaton_snmp/Makefile.am | 16 +++ fence/agents/eaton_snmp/README | 20 +++ fence/agents/eaton_snmp/fence_eaton_snmp.py | 177 +++++++++++++++++++++++++ fence/agents/eps/Makefile.am | 2 +- fence/agents/ibmblade/Makefile.am | 2 +- fence/agents/ifmib/Makefile.am | 2 +- fence/agents/ilo/Makefile.am | 2 +- fence/agents/ilo_mp/Makefile.am | 2 +- fence/agents/intelmodular/Makefile.am | 2 +- fence/agents/ipmilan/Makefile.am | 2 +- fence/agents/ipmilan/ipmilan.c | 36 +++++- fence/agents/ldom/Makefile.am | 2 +- fence/agents/lpar/Makefile.am | 2 +- fence/agents/node_assassin/Makefile.am | 2 +- fence/agents/rhevm/Makefile.am | 2 +- fence/agents/rsa/Makefile.am | 2 +- fence/agents/sanbox2/Makefile.am | 2 +- fence/agents/scsi/fence_scsi.pl | 189 +++++++++++++++++++++++---- fence/agents/virsh/Makefile.am | 2 +- fence/agents/vmware/Makefile.am | 2 +- fence/agents/wti/Makefile.am | 2 +- make/fencebuild.mk | 2 +- make/release.mk | 1 - 33 files changed, 445 insertions(+), 63 deletions(-) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBCAAGBQJNUPRVAAoJEFA6oBJjVJ+OvR8P/RhSI+A8HaF8817LlxMFP/5v bmP/tr3TrLpUC+gnnauTizrGuBjVogmUz9aO8VWS2wFcpf8NZpwzPrps8v2HAIZr dEdB8l2yhQsis5cPuIWV8YiOPrp1S/+ewQxadFfmNQUuS+OrwSR4qA8pxAlw/mxW 4OuXhJzLTsK4RxV/rD3K8q1vrEiN3MgAW/ql1sDL94U5Rgs8RTL+FhXMqEqmBXl6 D/ZMnSD5KCYXNOw9r4wblxDkTdm1zP0s6oTM/6VZimYS1UxvuBZJaaxLcnixj+k8 MTCaVawCJtK6PcJXyf3+iHT9OuaFPvQCnn20sNerHuMJWd5jEoyY4lrDMvas73/F ryJwHMwc/JpiXvbbNuMyS+oYyMFLqW1HSqR3SigiNtgMcoFPRYo1/UdbsTFvHxQe p9V9W6mTggODLukEex5ShWFkyTS5IoZMniACey4bXdpvU/DJ797l0tqsJIouRv8Z oRuOPCpX2BP7YAPj34fq82CgmUrPHklDevC6/qyjw8dp+PyRpLKXVyCJeotvYvrF I4KzW4kjbgsXzRYdGPcIC27HbQ9lF0St21zinZQZzaZMLxFw9D4Re8/50dxILT41 AZs/nhVEZxz+lHKOjx5nW1bLXS8+oYUrhCxt2zNoBs6Evnz/5avsYlCBgOn7g37L FmiqOW4U2iOdZDme10SN =jrmg -----END PGP SIGNATURE----- From rossnick-lists at cybercat.ca Wed Feb 9 15:26:17 2011 From: rossnick-lists at cybercat.ca (Nicolas Ross) Date: Wed, 9 Feb 2011 10:26:17 -0500 Subject: [Linux-cluster] Heartbeat ? Message-ID: Hi ! I curently have a router (in fact 2) in CentOS 5.5 that uses the redhat's heartbeat package to manage high-availibility ressource (an ip each of 2 interfaces, and that's about it). This package was removed from RHEL 6. Was it replace by pacemaker ? What's the equivalent in RHEL 6 ? Regards, From gordan at bobich.net Wed Feb 9 16:14:13 2011 From: gordan at bobich.net (Gordan Bobic) Date: Wed, 09 Feb 2011 16:14:13 +0000 Subject: [Linux-cluster] Heartbeat ? 
In-Reply-To: References: Message-ID: <4D52BD55.1060603@bobich.net> Nicolas Ross wrote: > I curently have a router (in fact 2) in CentOS 5.5 that uses the > redhat's heartbeat package to manage high-availibility ressource (an ip > each of 2 interfaces, and that's about it). > > This package was removed from RHEL 6. Was it replace by pacemaker ? > What's the equivalent in RHEL 6 ? I'm using heartbeat from clusterlabs.org yum repository. The F13 packages work just fine on RHEL6. Have a look here: http://www.clusterlabs.org/wiki/Install Gordan From veliogluh at itu.edu.tr Wed Feb 9 16:38:46 2011 From: veliogluh at itu.edu.tr (Hakan VELIOGLU) Date: Wed, 09 Feb 2011 18:38:46 +0200 Subject: [Linux-cluster] Piranha ipv6 support In-Reply-To: <4D52BD55.1060603@bobich.net> References: <4D52BD55.1060603@bobich.net> Message-ID: <20110209183846.5866264gswpd0z6e@webmail.itu.edu.tr> Hi, Does piranha package (LVS tool) has ipv6 load balance support in RHEL 6? ?f yes how can I set ipv6 addresses in /etc/sysconfig/ha/lvs.cf config file? Thanks... Hakan VELIOGLU RHCE LPIC-1 From sklemer at gmail.com Thu Feb 10 07:08:50 2011 From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=) Date: Thu, 10 Feb 2011 09:08:50 +0200 Subject: [Linux-cluster] falied to implement HA-LVM with clvmd rhcs5.6 In-Reply-To: References: Message-ID: Hello. The HA-LVM + clvmd is working!!! The Vg should be tagged as cluster. The major problem I noticed when I checked the cluster is that on the passive system I am able to run vgchange -c n vgxx ; vgchange -a y vgxx ; mount the LV out of the cluster - While this lv is mounted on the active member. Shalom. On Sat, Feb 5, 2011 at 12:24 PM, dOminic wrote: > Hi, > > If you are you using traditional HA-LVM setup with locking_type = 1 , then > you need to setup tagging / initrd rebuild > Anything went wrong with tagging.... then it won't prevent admin to > activate vg, remove the VG which is activated/used in another node. > > # vgchange -ay domvg > 1 logical volume(s) in volume group "domvg" now active > # lvremove /dev/domvg/domlv > Do you really want to remove active logical volume domlv? [y/n]: y > Logical volume "domlv" successfully removed > > If you are using clvmd variant with locking_type = 3 and cluster is > *active/passive* then clvmd won't let you to remove the VG which is > activate/used in another node. > > # lvremove /dev/domvg/domlv > Do you really want to remove active clustered logical volume domlv? [y/n]: > y > Error locking on node node1.domtest.com: LV domvg/domlv in use: not > deactivating > Unable to deactivate logical volume "domlv" > > - dominic > > On Sat, Feb 5, 2011 at 1:07 PM, ???? ???? wrote: > >> Hi. >> >> After reading the manual again ,I think i know what was my setup >> problem. >> >> The VGs should be taged as cluster , ( this action will path the lvm.sh >> check ) & the LVs should be deactivated. >> >> The cluster will activate the LVs as exclusive. (i will check it on >> monday ). >> >> Shalom. >> >> >> On Sat, Feb 5, 2011 at 12:07 AM, ???? ???? wrote: >> >>> Hello Dominic. >>> >>> I will be in lab on monday and collect all steps & logs files. >>> >>> btw - Is it redhat recommendation to preferre using HA-LVM with the >>> locking_type=1 method ?? >>> >>> >>> Regards >>> >>> Shalom >>> >>> >>> On Fri, Feb 4, 2011 at 5:23 PM, dOminic wrote: >>> >>>> Hi Shalom, >>>> >>>> Are you still facing problem implementing HA-LVM with locking_type = 3 >>>> setting ?. If yes, it would be great if you could provide the following >>>> details . 
>>>> So that others can also check >>>> >>>> * steps you are following along with complete output >>>> * status in "clustat" after making changes in cluster.conf >>>> * attach cluster.conf and /var/log/messages. >>>> >>>> dominic >>>> >>>> On Fri, Feb 4, 2011 at 8:34 PM, ???? ???? wrote: >>>> >>>>> Thanks. >>>>> >>>>> This is the old methos and its work great but its hard to maintain such >>>>> cluster. >>>>> >>>>> Shalom. >>>>> >>>>> >>>>> On Fri, Feb 4, 2011 at 4:32 PM, Dominic Geevarghese < >>>>> share2dom at gmail.com> wrote: >>>>> >>>>>> >>>>>> Hi, >>>>>> >>>>>> I am not sure about the error you are getting but it would be great if >>>>>> you could try the preferred method >>>>>> >>>>>> locking_type = 1 >>>>>> volume_list [ "your-root-vg-name" , "@hostname" ] >>>>>> >>>>>> rebuild initrd >>>>>> >>>>>> add the and resources in cluster.conf , start the cman, >>>>>> rgmanager . >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> On Thu, Feb 3, 2011 at 8:02 PM, Corey Kovacs wrote: >>>>>> >>>>>>> Excellent, >>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> -C >>>>>>> >>>>>>> On Thu, Feb 3, 2011 at 10:38 AM, ???? ???? >>>>>>> wrote: >>>>>>> > >>>>>>> > >>>>>>> > On Thu, Feb 3, 2011 at 12:35 PM, ???? ???? >>>>>>> wrote: >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> >> https://access.redhat.com/kb/docs/DOC-3068 >>>>>>> >> >>>>>>> >> On Thu, Feb 3, 2011 at 11:13 AM, Corey Kovacs < >>>>>>> corey.kovacs at gmail.com> >>>>>>> >> wrote: >>>>>>> >>> >>>>>>> >>> Is using ha-lvm with clvmd a new capability? It's always been my >>>>>>> >>> understanding that the lvm locking type for using ha-lvm had to >>>>>>> be set >>>>>>> >>> to '1'. >>>>>>> >>> >>>>>>> >>> I'd much rather be using clvmd if it is the way to go. Can you >>>>>>> point >>>>>>> >>> me to the docs you are seeing these instructions in please? >>>>>>> >>> >>>>>>> >>> As for why your config isn't working, clvmd requires that it's >>>>>>> >>> resources are indeed tagged as cluster volumes, so you might try >>>>>>> doing >>>>>>> >>> that and see how it goes. >>>>>>> >>> >>>>>>> >>> -C >>>>>>> >>> >>>>>>> >>> On Thu, Feb 3, 2011 at 7:26 AM, ???? ???? >>>>>>> wrote: >>>>>>> >>> > Hello. >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > I followed redhat instruction trying install HA-LVM with clvmd. >>>>>>> ( rhcs >>>>>>> >>> > 5.6 - >>>>>>> >>> > rgmanager 2.0.52-9 ) >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > I can't make it work. >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > lvm.conf- locking_type=3 >>>>>>> >>> > >>>>>>> >>> > clvmd work >>>>>>> >>> > >>>>>>> >>> > Its failed saying HA-LVM is not configured correctly. >>>>>>> >>> > >>>>>>> >>> > The manual said that we should run "lvchange -a n lvxx" edit >>>>>>> the >>>>>>> >>> > cluster.conf & start the service. >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > But From lvm.conf : >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > case $1 in >>>>>>> >>> > >>>>>>> >>> > start) >>>>>>> >>> > >>>>>>> >>> > if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) >>>>>>> =~ >>>>>>> >>> > .....c >>>>>>> >>> > ]]; then >>>>>>> >>> > >>>>>>> >>> > ha_lvm_proper_setup_check || exit 1 >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > If the vg is not taged as cluster than the ha_lvm is looking >>>>>>> for >>>>>>> >>> > volume_list >>>>>>> >>> > in lvm.conf. >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > I am confused- Does the VG should taged as cluster ?? 
( BTW - >>>>>>> the old >>>>>>> >>> > fashion HA-LVM is worked with no problems ) >>>>>>> >>> > >>>>>>> >>> > redhat instructions : >>>>>>> >>> > >>>>>>> >>> > To set up HA LVM Failover (using the preferred CLVM variant), >>>>>>> perform >>>>>>> >>> > the >>>>>>> >>> > following steps: >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > 1. Ensure that the parameter locking_type in the global section >>>>>>> >>> > of /etc/lvm/lvm.conf is set to the value '3', that all the >>>>>>> necessary >>>>>>> >>> > LVM >>>>>>> >>> > cluster packages are installed, and the necessary daemons are >>>>>>> started >>>>>>> >>> > (like >>>>>>> >>> > 'clvmd' and the cluster mirror log daemon - if necessary). >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > 2. Create the logical volume and filesystem using standard LVM2 >>>>>>> and >>>>>>> >>> > file >>>>>>> >>> > system commands. For example: >>>>>>> >>> > >>>>>>> >>> > # pvcreate /dev/sd[cde]1 >>>>>>> >>> > >>>>>>> >>> > # vgcreate /dev/sd[cde]1 >>>>>>> >>> > >>>>>>> >>> > # lvcreate -L 10G -n >>>>>>> >>> > >>>>>>> >>> > # mkfs.ext3 /dev// >>>>>>> >>> > >>>>>>> >>> > # lvchange -an / >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > 3. Edit /etc/cluster/cluster.conf to include the newly created >>>>>>> logical >>>>>>> >>> > volume as a resource in one of your services. Alternatively, >>>>>>> >>> > configuration >>>>>>> >>> > tools such as Conga or system-config-cluster may be used to >>>>>>> create >>>>>>> >>> > these >>>>>>> >>> > entries. Below is a sample resource manager section >>>>>>> >>> > from /etc/cluster/cluster.conf: >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>> >>> > ordered="1" >>>>>>> >>> > restricted="0"> >>>>>> >>> > priority="1"/> >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>>>>> >>> > name="lvm" vg_name="shared_vg" lv_name="ha-lv"/> >>>>>> name="FS" >>>>>>> >>> > device="/dev/shared_vg/ha-lv" force_fsck="0" force_unmount="1" >>>>>>> >>> > fsid="64050" >>>>>>> >>> > fstype="ext3" mountpoint="/mnt" options="" self_fence="0"/> >>>>>>> >>> > >>>>>>> >>> > >>>>>> recovery="relocate"> >>>>>>> >>> > >>>>>>> >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > Regards >>>>>>> >>> > >>>>>>> >>> > Shalom. 
>>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > >>>>>>> >>> > -- >>>>>>> >>> > Linux-cluster mailing list >>>>>>> >>> > Linux-cluster at redhat.com >>>>>>> >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> >>> > >>>>>>> >>> >>>>>>> >>> -- >>>>>>> >>> Linux-cluster mailing list >>>>>>> >>> Linux-cluster at redhat.com >>>>>>> >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> > >>>>>>> > >>>>>>> > -- >>>>>>> > Linux-cluster mailing list >>>>>>> > Linux-cluster at redhat.com >>>>>>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> > >>>>>>> >>>>>>> -- >>>>>>> Linux-cluster mailing list >>>>>>> Linux-cluster at redhat.com >>>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Linux-cluster mailing list >>>>>> Linux-cluster at redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>>> >>>>> >>>>> >>>>> -- >>>>> Linux-cluster mailing list >>>>> Linux-cluster at redhat.com >>>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>>> >>>> >>>> >>>> -- >>>> Linux-cluster mailing list >>>> Linux-cluster at redhat.com >>>> https://www.redhat.com/mailman/listinfo/linux-cluster >>>> >>> >>> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From parvez.h.shaikh at gmail.com Fri Feb 11 07:21:27 2011 From: parvez.h.shaikh at gmail.com (Parvez Shaikh) Date: Fri, 11 Feb 2011 12:51:27 +0530 Subject: [Linux-cluster] Tuning red hat cluster Message-ID: Hi, As per my understanding rgmanager invokes 'status' on resource groups periodically to determine if these resources are up or down. I observed that this period is of around 30 seconds. Is it possible to tune or adjust this period for individual services or resource groups? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachar at awst.at Fri Feb 11 10:34:45 2011 From: zachar at awst.at (zachar at awst.at) Date: Fri, 11 Feb 2011 11:34:45 +0100 (CET) Subject: [Linux-cluster] =?utf-8?q?Tuning_red_hat_cluster?= Message-ID: Hi, http://sources.redhat.com/cluster/wiki/FAQ/RGManager#rgm_interval Regards, Balazs Parvez Shaikh schrieb: > Hi, > > As per my understanding rgmanager invokes 'status' on resource groups > periodically to determine if these resources are up or down. > > I observed that this period is of around 30 seconds. Is it possible to > tune > or adjust this period for individual services or resource groups? > > Thanks From kitgerrits at gmail.com Sat Feb 12 10:51:09 2011 From: kitgerrits at gmail.com (Kit Gerrits) Date: Sat, 12 Feb 2011 11:51:09 +0100 Subject: [Linux-cluster] A better understanding of multicast issues In-Reply-To: <4D3D9CA1.7040707@alteeve.com> Message-ID: <4d566626.857a0e0a.6cd4.0d41@mx.google.com> Digimer, Did you ever get a reply from anyone? If what you say is true, failure of one of our HSRP(HA) switches/routers might break the cluster. (if they don't share multicast menberships) I would guess that multicast groups originate in the cluster, not the switch. In that case, if the switch has been rebooted, the cluster needs to re-create the multicast groups on the switch. I would guess that the cluster itself needs to check if the switch is properly handling multicast. 
(subscribe to its own group and check if the packets are being handles correctly) This should provide an insight into clustering/multicast: http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note0918 6a008059a9df.shtml Regards, Kit -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer Sent: maandag 24 januari 2011 16:37 To: linux clustering Subject: [Linux-cluster] A better understanding of multicast issues Hi all, It seems to me that a very good number of clustering problems end up being multicast and smart switch related. I know that IGMP snooping and STP are often the cause, and PIM can help solve it. Despite understanding this, though, I can't quite understand exactly *why* IGMP snooping and STP break things. Reading up on them leads me to think that they should cleanly create and handle multicast groups, but this obviously isn't the case. When a switch restarts, shouldn't it send a request to clients asking to resubscribe to multicast groups? When corosync starts, I expect it would also send multicast joins. Sorry if the question is a little vague or odd. I'm trying to get my head around the troubles when, on the surface, the docs seem to make the process of creating/managing multicast quite simple and straight forward. Thanks! -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From corey.kovacs at gmail.com Sat Feb 12 15:36:20 2011 From: corey.kovacs at gmail.com (Corey Kovacs) Date: Sat, 12 Feb 2011 15:36:20 +0000 Subject: [Linux-cluster] A better understanding of multicast issues In-Reply-To: <4d566626.857a0e0a.6cd4.0d41@mx.google.com> References: <4D3D9CA1.7040707@alteeve.com> <4d566626.857a0e0a.6cd4.0d41@mx.google.com> Message-ID: When a multicast group is "joined" the switch/router will periodically (three mins i think) send out a query to the members to see if the connection is still needed. If a member does not reply to this query, then the connection is dropped for that port. If a switch is rebooted, then it's up to the member to re-establish the connection I believe, not the switch. Snooping is not generally a problem unless it's broken in the switch/router firmware. If it is, then you might need an upgrade. We use Multicast for all sorts of things and have indeed run into some problems on devices like Flex-10 cards for HP c7000 blade chassis that didn't do igmp snooping correctly, but we have gotten fixes for these issues from various vendors. If you have a planned outage for a switch, you can have your network people relocate the querier for a particular multicast group to another switch accessible to say a bonded pair or something. Things get really odd if you are on two separate switches that aren't stacked. Generally speaking, multicast isn't hard, you just have to think backwards. -C On Sat, Feb 12, 2011 at 10:51 AM, Kit Gerrits wrote: > > Digimer, > > Did you ever get a reply from anyone? > > If what you say is true, failure of one of our HSRP(HA) switches/routers > might break the cluster. > (if they don't share multicast menberships) > > I would guess that ?multicast groups originate in the cluster, not the > switch. > In that case, if the switch has been rebooted, the cluster needs to > re-create the multicast groups on the switch. 
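A quick way to watch this from the node side - the interface and addresses
below are only examples - is to look at the group memberships and the IGMP
traffic directly:

# ip maddr show dev eth0
      (lists the multicast groups this node has joined; the cluster's
       group is normally one of the 239.192.x.x addresses cman picks)
# tcpdump -n -i eth0 igmp
      (shows the membership queries from the querier and the reports the
       node answers with - after a switch reboot you would expect to see
       a fresh query followed by a report for the cluster's group)

If the reports stop showing up, a snooping switch will eventually age the
port out of the group, which matches the symptoms discussed in this thread.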
> > I would guess that the cluster itself needs to check if the switch is > properly handling multicast. > (subscribe to its own group and check if the packets are being handles > correctly) > > This should provide an insight into clustering/multicast: > http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note0918 > 6a008059a9df.shtml > > > Regards, > > Kit > > > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer > Sent: maandag 24 januari 2011 16:37 > To: linux clustering > Subject: [Linux-cluster] A better understanding of multicast issues > > Hi all, > > ?It seems to me that a very good number of clustering problems end up being > multicast and smart switch related. I know that IGMP snooping and STP are > often the cause, and PIM can help solve it. Despite understanding this, > though, I can't quite understand exactly *why* IGMP snooping and STP break > things. > > ?Reading up on them leads me to think that they should cleanly create and > handle multicast groups, but this obviously isn't the case. When a switch > restarts, shouldn't it send a request to clients asking to resubscribe to > multicast groups? When corosync starts, I expect it would also send > multicast joins. > > ?Sorry if the question is a little vague or odd. I'm trying to get my head > around the troubles when, on the surface, the docs seem to make the process > of creating/managing multicast quite simple and straight forward. > > Thanks! > > -- > Digimer > E-Mail: digimer at alteeve.com > AN!Whitepapers: http://alteeve.com > Node Assassin: ?http://nodeassassin.org > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From linux at alteeve.com Sat Feb 12 16:21:44 2011 From: linux at alteeve.com (Digimer) Date: Sat, 12 Feb 2011 11:21:44 -0500 Subject: [Linux-cluster] A better understanding of multicast issues In-Reply-To: <4d566626.857a0e0a.6cd4.0d41@mx.google.com> References: <4d566626.857a0e0a.6cd4.0d41@mx.google.com> Message-ID: <4D56B398.70102@alteeve.com> On 02/12/2011 05:51 AM, Kit Gerrits wrote: > > Digimer, > > Did you ever get a reply from anyone? > > If what you say is true, failure of one of our HSRP(HA) switches/routers > might break the cluster. > (if they don't share multicast menberships) > > I would guess that multicast groups originate in the cluster, not the > switch. > In that case, if the switch has been rebooted, the cluster needs to > re-create the multicast groups on the switch. > > I would guess that the cluster itself needs to check if the switch is > properly handling multicast. > (subscribe to its own group and check if the packets are being handles > correctly) > > This should provide an insight into clustering/multicast: > http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note0918 > 6a008059a9df.shtml > > > Regards, > > Kit Hi Kit, I did not, and thank you for replying. So the frequent multicast breakdowns, given that it's fairly rare for switches to reset, is probably in the periodic checks done by the switches. I wonder then if corosync, for whatever reasons, doesn't or isn't able to answer the requests (quickly enough). Perhaps the process takes too much time? Corosync will, by default, decare a ring dead after ~3s. More to think about, and I appreciate that link. Thanks. 
:) -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From linux at alteeve.com Sat Feb 12 16:23:48 2011 From: linux at alteeve.com (Digimer) Date: Sat, 12 Feb 2011 11:23:48 -0500 Subject: [Linux-cluster] A better understanding of multicast issues In-Reply-To: References: <4D3D9CA1.7040707@alteeve.com> <4d566626.857a0e0a.6cd4.0d41@mx.google.com> Message-ID: <4D56B414.1050706@alteeve.com> On 02/12/2011 10:36 AM, Corey Kovacs wrote: > When a multicast group is "joined" the switch/router will periodically > (three mins i think) send out a query to the members to see if the > connection is still needed. If a member does not reply to this query, > then the connection is dropped for that port. If a switch is rebooted, > then it's up to the member to re-establish the connection I believe, > not the switch. Snooping is not generally a problem unless it's broken > in the switch/router firmware. If it is, then you might need an > upgrade. We use Multicast for all sorts of things and have indeed run > into some problems on devices like Flex-10 cards for HP c7000 blade > chassis that didn't do igmp snooping correctly, but we have gotten > fixes for these issues from various vendors. > > If you have a planned outage for a switch, you can have your network > people relocate the querier for a particular multicast group to > another switch accessible to say a bonded pair or something. Things > get really odd if you are on two separate switches that aren't > stacked. > > Generally speaking, multicast isn't hard, you just have to think backwards. > > -C Thanks for the reply, Corey. When I replied to Kit, I think I addressed issues you both brought up (yay head colds!). So let me extend my thanks to you here, this has given my more to consider. I'll have to fire up tcpdump on the the cluster and wait and see. -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From sachinbhugra at hotmail.com Sun Feb 13 09:14:43 2011 From: sachinbhugra at hotmail.com (Sachin Bhugra) Date: Sun, 13 Feb 2011 14:44:43 +0530 Subject: [Linux-cluster] Cluster node hangs Message-ID: Hi , I have setup a two node cluster in lab, with Vmware Server, and hence used manual fencing. It includes a iSCSI GFS2 partition and it service Apache in Active/Passive mode. Cluster works and I am able to relocate service between nodes with no issues. However, the problem comes when I shutdown the node, for testing, which is presently holding the service. When the node becomes unavailable, service gets relocated and GFS partition gets mounted on the other node, however it is not accessible. If I try to do a "ls/du" on GFS partition, the command hangs. On the other hand the node which was shutdown gets stuck at "unmounting file system". I tried using fence_manual -n nodename and then fence_ack_manual -n nodename, however it still remains the same. Can someone please help me is what I am doing wrong? Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekuric at redhat.com Sun Feb 13 09:41:55 2011 From: ekuric at redhat.com (Elvir Kuric) Date: Sun, 13 Feb 2011 10:41:55 +0100 Subject: [Linux-cluster] Cluster node hangs In-Reply-To: References: Message-ID: <4D57A763.8030700@redhat.com> On 02/13/2011 10:14 AM, Sachin Bhugra wrote: > Hi , > > I have setup a two node cluster in lab, with Vmware Server, and hence > used manual fencing. 
It includes a iSCSI GFS2 partition and it service > Apache in Active/Passive mode. > > Cluster works and I am able to relocate service between nodes with no > issues. However, the problem comes when I shutdown the node, for > testing, which is presently holding the service. When the node becomes > unavailable, service gets relocated and GFS partition gets mounted on > the other node, however it is not accessible. If I try to do a "ls/du" > on GFS partition, the command hangs. On the other hand the node which > was shutdown gets stuck at "unmounting file system". > > I tried using fence_manual -n nodename and then fence_ack_manual -n > nodename, however it still remains the same. > > Can someone please help me is what I am doing wrong? > > Thanks, > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster It would be good to see /etc/fstab configuration used on cluster nodes. If /gfs partition is mounted manually it will not be unmounted correctly in case you restart node ( and not executing umount prior restart ), and will hang during shutdown/reboot process. More at: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Global_File_System_2/index.html Regards, Elvir -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekuric at redhat.com Sun Feb 13 09:52:51 2011 From: ekuric at redhat.com (Elvir Kuric) Date: Sun, 13 Feb 2011 10:52:51 +0100 Subject: [Linux-cluster] Cluster node hangs In-Reply-To: <4D57A763.8030700@redhat.com> References: <4D57A763.8030700@redhat.com> Message-ID: <4D57A9F3.90408@redhat.com> On 02/13/2011 10:41 AM, Elvir Kuric wrote: > On 02/13/2011 10:14 AM, Sachin Bhugra wrote: >> Hi , >> >> I have setup a two node cluster in lab, with Vmware Server, and hence >> used manual fencing. It includes a iSCSI GFS2 partition and it >> service Apache in Active/Passive mode. >> >> Cluster works and I am able to relocate service between nodes with no >> issues. However, the problem comes when I shutdown the node, for >> testing, which is presently holding the service. When the node >> becomes unavailable, service gets relocated and GFS partition gets >> mounted on the other node, however it is not accessible. If I try to >> do a "ls/du" on GFS partition, the command hangs. On the other hand >> the node which was shutdown gets stuck at "unmounting file system". >> >> I tried using fence_manual -n nodename and then fence_ack_manual -n >> nodename, however it still remains the same. >> >> Can someone please help me is what I am doing wrong? >> >> Thanks, >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > It would be good to see /etc/fstab configuration used on cluster > nodes. If /gfs partition is mounted manually it will not be unmounted > correctly in case you restart node ( and not executing umount prior > restart ), and will hang during shutdown/reboot process. > > More at: > http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Global_File_System_2/index.html Edit: above link, section 3.4Special Considerations when Mounting GFS2 File Systems > > > Regards, > > Elvir > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sachinbhugra at hotmail.com Sun Feb 13 10:19:01 2011 From: sachinbhugra at hotmail.com (Sachin Bhugra) Date: Sun, 13 Feb 2011 15:49:01 +0530 Subject: [Linux-cluster] Cluster node hangs In-Reply-To: <4D57A9F3.90408@redhat.com> References: , <4D57A763.8030700@redhat.com>, <4D57A9F3.90408@redhat.com> Message-ID: Thank for the reply and link. However, GFS2 is not listed in fstab, it is only handled by cluster config. Date: Sun, 13 Feb 2011 10:52:51 +0100 From: ekuric at redhat.com To: linux-cluster at redhat.com Subject: Re: [Linux-cluster] Cluster node hangs Message body On 02/13/2011 10:41 AM, Elvir Kuric wrote: On 02/13/2011 10:14 AM, Sachin Bhugra wrote: Hi , I have setup a two node cluster in lab, with Vmware Server, and hence used manual fencing. It includes a iSCSI GFS2 partition and it service Apache in Active/Passive mode. Cluster works and I am able to relocate service between nodes with no issues. However, the problem comes when I shutdown the node, for testing, which is presently holding the service. When the node becomes unavailable, service gets relocated and GFS partition gets mounted on the other node, however it is not accessible. If I try to do a "ls/du" on GFS partition, the command hangs. On the other hand the node which was shutdown gets stuck at "unmounting file system". I tried using fence_manual -n nodename and then fence_ack_manual -n nodename, however it still remains the same. Can someone please help me is what I am doing wrong? Thanks, -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster It would be good to see /etc/fstab configuration used on cluster nodes. If /gfs partition is mounted manually it will not be unmounted correctly in case you restart node ( and not executing umount prior restart ), and will hang during shutdown/reboot process. More at: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Global_File_System_2/index.html Edit: above link, section 3.4 Special Considerations when Mounting GFS2 File Systems Regards, Elvir -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From shariq.siddiqui at yahoo.com Sun Feb 13 11:16:28 2011 From: shariq.siddiqui at yahoo.com (Shariq Siddiqui) Date: Sun, 13 Feb 2011 03:16:28 -0800 (PST) Subject: [Linux-cluster] want to use GFS2 In-Reply-To: References: Message-ID: <428984.82494.qm@web39802.mail.mud.yahoo.com> Dear All, I want to use GFS2 filesystem in shared storage. and trying to mount it in two node, what initially I need to do to fullfill this task. I don't need whole cluster suit i need minimal configuration. Please HELP Best Regards, Shariq Siddiqui Advanced Operations Technology PO.Box : 25904 - Riyadh 11476 Riyadh Saudi Arabia Tel : +966 1 291 0605 - Fax:+966 1 291 3328 -------------- next part -------------- An HTML attachment was scrubbed... URL: From share2dom at gmail.com Sun Feb 13 14:32:55 2011 From: share2dom at gmail.com (dOminic) Date: Sun, 13 Feb 2011 20:02:55 +0530 Subject: [Linux-cluster] Cluster node hangs In-Reply-To: References: <4D57A763.8030700@redhat.com> <4D57A9F3.90408@redhat.com> Message-ID: Hi, Whats the msg you are getting in logs ?. 
It would be great if you could attach log mesgs along with cluster.conf -dominic On Sun, Feb 13, 2011 at 3:49 PM, Sachin Bhugra wrote: > Thank for the reply and link. However, GFS2 is not listed in fstab, it is > only handled by cluster config. > > ------------------------------ > Date: Sun, 13 Feb 2011 10:52:51 +0100 > From: ekuric at redhat.com > To: linux-cluster at redhat.com > Subject: Re: [Linux-cluster] Cluster node hangs > > > On 02/13/2011 10:41 AM, Elvir Kuric wrote: > > On 02/13/2011 10:14 AM, Sachin Bhugra wrote: > > Hi , > > I have setup a two node cluster in lab, with Vmware Server, and hence used > manual fencing. It includes a iSCSI GFS2 partition and it service Apache in > Active/Passive mode. > > Cluster works and I am able to relocate service between nodes with no > issues. However, the problem comes when I shutdown the node, for testing, > which is presently holding the service. When the node becomes unavailable, > service gets relocated and GFS partition gets mounted on the other node, > however it is not accessible. If I try to do a "ls/du" on GFS partition, the > command hangs. On the other hand the node which was shutdown gets stuck at > "unmounting file system". > > I tried using fence_manual -n nodename and then fence_ack_manual -n > nodename, however it still remains the same. > > Can someone please help me is what I am doing wrong? > > Thanks, > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > It would be good to see /etc/fstab configuration used on cluster nodes. > If /gfs partition is mounted manually it will not be unmounted correctly in > case you restart node ( and not executing umount prior restart ), and will > hang during shutdown/reboot process. > > More at: > http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Global_File_System_2/index.html > > > Edit: above link, section 3.4 Special Considerations when Mounting GFS2 > File Systems > > > > Regards, > > Elvir > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- Linux-cluster mailing list Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From share2dom at gmail.com Sun Feb 13 14:39:15 2011 From: share2dom at gmail.com (dOminic) Date: Sun, 13 Feb 2011 20:09:15 +0530 Subject: [Linux-cluster] want to use GFS2 In-Reply-To: <428984.82494.qm@web39802.mail.mud.yahoo.com> References: <428984.82494.qm@web39802.mail.mud.yahoo.com> Message-ID: Hi, You need to setup a simple cluster and a proper fencing mechanism . No need to configure any services since you want to use GFS2 on both the nodes. Start the cluster, mount the gfs2 by /etc/fstab entry. Note: You can't use GFS2 without a Cluster setup. Dominic On Sun, Feb 13, 2011 at 4:46 PM, Shariq Siddiqui wrote: > Dear All, > I want to use GFS2 filesystem in shared storage. > > and trying to mount it in two node, > > what initially I need to do to fullfill this task. > I don't need whole cluster suit i need minimal configuration. 
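As a minimal sketch of the setup described above - the cluster name, volume
and mount point are placeholders - once cman, clvmd and fencing are running
on both nodes:

# mkfs.gfs2 -p lock_dlm -t mycluster:gfsvol -j 2 /dev/shared_vg/gfslv
      (the name before the colon must match the cluster name in
       cluster.conf; -j 2 creates one journal per node)

and an /etc/fstab entry on both nodes such as:

/dev/shared_vg/gfslv  /gfs  gfs2  defaults,noatime  0 0

so that the gfs2 init script mounts it at boot, after the cluster services
have started.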
> > Please HELP > > > Best Regards, > > Shariq Siddiqui > > Advanced Operations Technology > > PO.Box : 25904 - Riyadh 11476 > > Riyadh Saudi Arabia > > Tel : +966 1 291 0605 - > > Fax:+966 1 291 3328 > [image: View Shariq Siddiqui's profile on LinkedIn] > > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rpeterso at redhat.com Mon Feb 14 13:37:21 2011 From: rpeterso at redhat.com (Bob Peterson) Date: Mon, 14 Feb 2011 08:37:21 -0500 (EST) Subject: [Linux-cluster] want to use GFS2 In-Reply-To: Message-ID: <674877142.9531.1297690641329.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> ----- Original Message ----- | Hi, | | You need to setup a simple cluster and a proper fencing mechanism . No | need | to configure any services since you want to use GFS2 on both the | nodes. | Start the cluster, mount the gfs2 by /etc/fstab entry. | | Note: You can't use GFS2 without a Cluster setup. | | Dominic Hi, Well, technically you can use GFS2 without a cluster setup. I believe Red Hat doesn't support it, and the storage can't be mounted by more than a single computer (with "lock_nolock" locking protocol), but it can be done. Regards, Bob Peterson Red Hat File Systems From randy.brown at noaa.gov Mon Feb 14 13:53:21 2011 From: randy.brown at noaa.gov (Randy Brown) Date: Mon, 14 Feb 2011 08:53:21 -0500 Subject: [Linux-cluster] Problem with machines fencing one another in 2 Node NFS cluster Message-ID: <4D5933D1.10009@noaa.gov> Hello, I am running a 2 node cluster being used as a NAS head for a Lefthand Networks iSCSI SAN to provide NFS mounts out to my network. Things have been OK for a while, but I recently lost one of the nodes as a result of a patching problem. In an effort to recreate the failed node, I imaged the working node and installed that image on the failed node. I set it's hostname and IP settings correctly and the machine booted and joined the cluster just fine. Or at least it appeared so. Things ran OK for the last few weeks, but I recently started seeing a behavior where the nodes start fencing each other. I'm wondering if there is something as a result of cloning the nodes that could be the problem. Possibly something that should be different but isn't because of the cloning? I am running CentOS 5.5 with the following package versions: Kernel - 2.6.18-194.11.3.el5 #1 SMP cman-2.0.115-34.el5_5.4 lvm2-cluster-2.02.56-7.el5_5.4 gfs2-utils-0.1.62-20.el5 kmod-gfs-0.1.34-12.el5.centos rgmanager-2.0.52-6.el5.centos.8 I have a Qlogic qla4062 HBA in the node running: QLogic iSCSI HBA Driver (f8b83000) v5.01.03.04 I will gladly provide more information as needed. Thank you, Randy -------------- next part -------------- A non-text attachment was scrubbed... 
Name: randy_brown.vcf Type: text/x-vcard Size: 313 bytes Desc: not available URL: From rhurst at bidmc.harvard.edu Mon Feb 14 13:54:05 2011 From: rhurst at bidmc.harvard.edu (rhurst at bidmc.harvard.edu) Date: Mon, 14 Feb 2011 08:54:05 -0500 Subject: [Linux-cluster] want to use GFS2 In-Reply-To: <674877142.9531.1297690641329.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> References: <674877142.9531.1297690641329.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> Message-ID: <50168EC934B8D64AA8D8DD37F840F3DE05670D38F2@EVS2CCR.its.caregroup.org> Just two cents, I believe Red Hat does support GFS2 on single server using lock_nolock, because we do SAN "snaps" of actively clustered GFS2 volumes (simple flatfiles, no databases) and present the snap luns to a media agent server to backup the data oob. RHN was okay with that configuration and we have been running it this way on GFS and GFS2 for 5-years without issue. -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bob Peterson Sent: Monday, February 14, 2011 8:37 AM To: linux clustering Subject: Re: [Linux-cluster] want to use GFS2 ----- Original Message ----- | Hi, | | You need to setup a simple cluster and a proper fencing mechanism . No | need to configure any services since you want to use GFS2 on both the | nodes. | Start the cluster, mount the gfs2 by /etc/fstab entry. | | Note: You can't use GFS2 without a Cluster setup. | | Dominic Hi, Well, technically you can use GFS2 without a cluster setup. I believe Red Hat doesn't support it, and the storage can't be mounted by more than a single computer (with "lock_nolock" locking protocol), but it can be done. Regards, Bob Peterson Red Hat File Systems -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From linux at alteeve.com Mon Feb 14 14:03:36 2011 From: linux at alteeve.com (Digimer) Date: Mon, 14 Feb 2011 09:03:36 -0500 Subject: [Linux-cluster] Problem with machines fencing one another in 2 Node NFS cluster In-Reply-To: <4D5933D1.10009@noaa.gov> References: <4D5933D1.10009@noaa.gov> Message-ID: <4D593638.30802@alteeve.com> On 02/14/2011 08:53 AM, Randy Brown wrote: > Hello, > > I am running a 2 node cluster being used as a NAS head for a Lefthand > Networks iSCSI SAN to provide NFS mounts out to my network. Things have > been OK for a while, but I recently lost one of the nodes as a result of > a patching problem. In an effort to recreate the failed node, I imaged > the working node and installed that image on the failed node. I set > it's hostname and IP settings correctly and the machine booted and > joined the cluster just fine. Or at least it appeared so. Things ran > OK for the last few weeks, but I recently started seeing a behavior > where the nodes start fencing each other. I'm wondering if there is > something as a result of cloning the nodes that could be the problem. > Possibly something that should be different but isn't because of the > cloning? > > I am running CentOS 5.5 with the following package versions: > > Kernel - 2.6.18-194.11.3.el5 #1 SMP > cman-2.0.115-34.el5_5.4 > lvm2-cluster-2.02.56-7.el5_5.4 > gfs2-utils-0.1.62-20.el5 > kmod-gfs-0.1.34-12.el5.centos > rgmanager-2.0.52-6.el5.centos.8 > > I have a Qlogic qla4062 HBA in the node running: QLogic iSCSI HBA Driver > (f8b83000) v5.01.03.04 > > I will gladly provide more information as needed. 
> > Thank you, > Randy Silly question, but are the NICs mapped to their MAC addresses? If so, did you update the MAC addresses after cloning the server to reflect the actual MAC addresses? Assuming so, do you have managed switches? If so, can you test by swapping out a simple, unmanaged switch? This sounds like a multicast issue at some level. Fencing happens once the totem ring is declared failed. Do you see anything interesting in the log files prior to the fence? Can you run tcpdump to see what is happening on the interface(s) prior to the fence? -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From shariq.siddiqui at yahoo.com Mon Feb 14 18:45:26 2011 From: shariq.siddiqui at yahoo.com (Shariq Siddiqui) Date: Mon, 14 Feb 2011 10:45:26 -0800 (PST) Subject: [Linux-cluster] want to use GFS2 In-Reply-To: <50168EC934B8D64AA8D8DD37F840F3DE05670D38F2@EVS2CCR.its.caregroup.org> References: <674877142.9531.1297690641329.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> <50168EC934B8D64AA8D8DD37F840F3DE05670D38F2@EVS2CCR.its.caregroup.org> Message-ID: <71316.4030.qm@web39807.mail.mud.yahoo.com> Thanks All for your reply, I treid it with lock_nolock and its working fine with me, But only in one server at a time But I want to take a benafit of GFS2 to use this storage with two servers as a central storage. So?each one can write easily on it. My point is this can we use it without making cluster UP? or is there any other filesystem through which I can fullfill my requirment?? ? Best Regards, Shariq Siddiqui ? ? ________________________________ From: "rhurst at bidmc.harvard.edu" To: linux-cluster at redhat.com Sent: Mon, February 14, 2011 4:54:05 PM Subject: Re: [Linux-cluster] want to use GFS2 Just two cents, I believe Red Hat does support GFS2 on single server using lock_nolock, because we do SAN "snaps" of actively clustered GFS2 volumes (simple flatfiles, no databases) and present the snap luns to a media agent server to backup the data oob.? RHN was okay with that configuration and we have been running it this way on GFS and GFS2 for 5-years without issue. -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bob Peterson Sent: Monday, February 14, 2011 8:37 AM To: linux clustering Subject: Re: [Linux-cluster] want to use GFS2 ----- Original Message ----- | Hi, | | You need to setup a simple cluster and a proper fencing mechanism . No | need to configure any services since you want to use GFS2 on both the | nodes. | Start the cluster, mount the gfs2 by /etc/fstab entry. | | Note: You can't use GFS2 without a Cluster setup. | | Dominic Hi, Well, technically you can use GFS2 without a cluster setup. I believe Red Hat doesn't support it, and the storage can't be mounted by more than a single computer (with "lock_nolock" locking protocol), but it can be done. Regards, Bob Peterson Red Hat File Systems -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rpeterso at redhat.com Mon Feb 14 19:01:10 2011 From: rpeterso at redhat.com (Bob Peterson) Date: Mon, 14 Feb 2011 14:01:10 -0500 (EST) Subject: [Linux-cluster] want to use GFS2 In-Reply-To: <71316.4030.qm@web39807.mail.mud.yahoo.com> Message-ID: <1886322021.17117.1297710070432.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> ----- Original Message ----- | Thanks All for your reply, | | I treid it with lock_nolock and its working fine with me, But only in | one server | at a time | But I want to take a benafit of GFS2 to use this storage with two | servers as a | central storage. | So each one can write easily on it. | My point is this can we use it without making cluster UP? | | or is there any other filesystem through which I can fullfill my | requirment? | | Best Regards, | | Shariq Siddiqui Hi Shariq, If you want to share the storage with gfs2, you need to set up a simple cluster. Regards, Bob Peterson Red Hat File Systems From vincent.blondel at ing.be Mon Feb 14 21:02:40 2011 From: vincent.blondel at ing.be (vincent.blondel at ing.be) Date: Mon, 14 Feb 2011 22:02:40 +0100 Subject: [Linux-cluster] Two nodes DRBD - Fail-Over Actif/Passif Cluster. Message-ID: <294881FE3F4013418806F0CE6E73A7B6052F302466@ing.com> Hello all, I just installed last week two servers, each of them with Redhat Linux Enterprise 6.0 on it for hosting in a near future Blue Coat Reporter. Installation is ok but now I am trying to configure these both servers in cluster. First of all, I never configured any cluster with Linux ... Servers are both HP DL380R06 with disk cabinet directly attached on it. (twice exactly same hardware specs). What I would like to get is simply getting an Actif/Passif clustering mode with bidirectional disk space synchronization. This means, both servers are running. Only, the first one is running Reporter. During this time, disk spaces are continuously synchronized. When first one is down, second one becomes actif and when first one is running again, it synchronizes the disks and becomes primary again. server 1 is reporter1.lab.intranet with ip 10.30.30.90 server 2 is reporter2.lab.intranet with ip 10.30.30.91 the load balanced ip should be 10.30.30.92 .. After some days of research on the net, I came to the conclusion that I could be happy with a solution including, DRBD/GFS2 with Redhat Cluster Suite. I am first trying to get a complete picture running on two vmware fusion (Linux Redhat Enterprise Linux 6) on my macosx before configuring my real servers. So, after some hours of research on the net, I found some articles and links that seem to describe what I wanna get ... http://gcharriere.com/blog/?p=73 http://www.linuxtopia.org/online_books/rhel6/rhel_6_cluster_admin/rhel_6_cluster_ch-config-cli-CA.html http://www.drbd.org/users-guide/users-guide.html and the DRBD packages for RHEL6 that I did not find anywhere .. http://elrepo.org/linux/elrepo/el6/i386/RPMS/ I just only configured till now the first part, meaning cluster services but the first issue occur .. below the cluster.conf file ... and this is the result I get on both servers ... [root at reporter1 ~]# clustat Cluster Status for cluster @ Mon Feb 14 22:22:53 2011 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ reporter1.lab.intranet 1 Online, Local, rgmanager reporter2.lab.intranet 2 Online, rgmanager Service Name Owner (Last) State ------- ---- ----- ------ ----- service:example_apache (none) stopped as you can see, everything is stopped or in other words nothing runs .. 
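The cluster.conf attached to the message above was stripped out by the list archive, so purely as a sketch for comparison, a minimal resource-manager section for a two-node Apache service on RHEL 6 usually looks something like this (the 10.30.30.92 address and node names come from the message; the failover domain name and the init script path are assumptions):

    <rm>
       <failoverdomains>
          <failoverdomain name="reporterdom" ordered="1" restricted="1">
             <failoverdomainnode name="reporter1.lab.intranet" priority="1"/>
             <failoverdomainnode name="reporter2.lab.intranet" priority="2"/>
          </failoverdomain>
       </failoverdomains>
       <service autostart="1" domain="reporterdom" name="example_apache" recovery="relocate">
          <ip address="10.30.30.92" monitor_link="on"/>
          <script file="/etc/init.d/httpd" name="httpd"/>
       </service>
    </rm>

rgmanager brings the service IP up as an address on the node running the service and calls the init script itself, so no separate start/stop script has to be written. A service created disabled, or defined with autostart="0", stays in the stopped state until it is enabled by hand, for example with "clusvcadm -e example_apache".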
so my question are : did I forget something in my conf file ? did I make something wrong in my conf file ? do I have to configure manually load balanced ip 10.30.30.92 as an alias ip on both sides or is it done automatically by redhat cluster ? I just made a simple try with apache but I do not find anywhere reference to the start/stop script for apache in the examples, is that normal ?? do you have some best practice regarding this picture ?? many thks to help me because I certainly have a bad understanding on some points. Regards Vincent From niks at logik-internet.rs Mon Feb 14 23:39:21 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 00:39:21 +0100 Subject: [Linux-cluster] Cluster with shared storage on low budget Message-ID: <4D59BD29.30906@logik-internet.rs> Hello, I need to setup cluster of 3 servers without separate storage device (SAN). Servers should join their local hard drives to create shared storage space. Every server in cluster has public (100Mbps) and private (1Gbps) NIC. Private 1Gbit network will be used for exchange of data (files) on shared storage. Additional request is that data on shared storage is highly redundant (complete mirroring is required). I was wondering if following setup is possible and if anyone has any experience or comments on it: - Servers will export local disks of same size as iSCSI targets - Each server will access other's two servers disk over iSCSI initiator - CLVM will be used to set Volume Group on all 3 disks. In theory this VG will work on all servers, because they'll have access to all disks (either directly or over iSCSI).
- CLVM will be used to create Logical Volume with mirroring option set to 3 (-m 3). Since there are 3 disks (physical devices) forming VG, each server will have redundant copy of same data. - Created Logical Volume will have GFS2 on it, so that it can be shared by cluster. - Web server will store web application files (scripts and photos) on created GFS. If it works, this setup should provide shared storage for cluster, built from already available local hard drives in servers forming cluster. By using LVM mirroring, each server will have the same copy of data, which should make cluster resistant to failure of any server. I was wondering, is LVM smart enough to optimize reading and use local drive for read operations? Looking forward to your comments. Best Regards, Nikola -------------- next part -------------- An HTML attachment was scrubbed... URL: From work at fajar.net Mon Feb 14 23:45:21 2011 From: work at fajar.net (Fajar A. Nugraha) Date: Tue, 15 Feb 2011 06:45:21 +0700 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59BD29.30906@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> Message-ID: On Tue, Feb 15, 2011 at 6:39 AM, Nikola Savic wrote: > > ? Hello, > > ? I need to setup cluster of 3 servers without separate storage device > (SAN). Servers should join their local hard drives to create shared storage > space. Every server in cluster has public (100Mbps) and private (1Gbps) NIC. > Private 1Gbit network will be used for exchange of data (files) on shared > storage. Additional request is that data on shared storage is highly > redundant (complete mirroring is required). > > ? I was wondering if following setup is possible and if anyone has any > experience or comments on it: > - Servers will export local disks of same size as iSCSI targets > - Each server will access other's two servers disk over iSCSI initiator > - CLVM will be used to set Volume Group on all 3 disks. In theory this VG > will work on all servers, because they'll have access to all disks (either > directly or over iSCSI). > - CLVM will be used to create Logical Volume with mirroring option set to 3 > (-m 3). Since there are 3 disks (physical devices) forming VG, each server > will have redundant copy of same data. > - Created Logical Volume will have GFS2 on it, so that it can be shared by > cluster. > - Web server will store web application files (scripts and photos) on > created GFS. > > ? If it works, this setup should provide shared storage for cluster, built > from already available local hard drives in servers forming cluster. By > using LVM mirroring, each server will have the same copy of data, which > should make cluster resistant to failure of any server. > > ? I was wondering, is LVM smart enough to optimize reading and use local > drive for read operations? Can LVM mirror handle one server outage? Can it automatically pick the difference when it's back on? Looks like you'd better stick with drbd. -- Fajar From linux at alteeve.com Mon Feb 14 23:56:42 2011 From: linux at alteeve.com (Digimer) Date: Mon, 14 Feb 2011 18:56:42 -0500 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59BD29.30906@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> Message-ID: <4D59C13A.9030807@alteeve.com> On 02/14/2011 06:39 PM, Nikola Savic wrote: > > Hello, > > I need to setup cluster of 3 servers without separate storage device > (SAN). Servers should join their local hard drives to create shared > storage space. 
Every server in cluster has public (100Mbps) and private > (1Gbps) NIC. Private 1Gbit network will be used for exchange of data > (files) on shared storage. Additional request is that data on shared > storage is highly redundant (complete mirroring is required). > > I was wondering if following setup is possible and if anyone has any > experience or comments on it: > - Servers will export local disks of same size as iSCSI targets > - Each server will access other's two servers disk over iSCSI initiator > - CLVM will be used to set Volume Group on all 3 disks. In theory this > VG will work on all servers, because they'll have access to all disks > (either directly or over iSCSI). > - CLVM will be used to create Logical Volume with mirroring option set > to 3 (-m 3). Since there are 3 disks (physical devices) forming VG, each > server will have redundant copy of same data. > - Created Logical Volume will have GFS2 on it, so that it can be shared > by cluster. > - Web server will store web application files (scripts and photos) on > created GFS. > > If it works, this setup should provide shared storage for cluster, > built from already available local hard drives in servers forming > cluster. By using LVM mirroring, each server will have the same copy of > data, which should make cluster resistant to failure of any server. > > I was wondering, is LVM smart enough to optimize reading and use local > drive for read operations? > > Looking forward to your comments. > > Best Regards, > Nikola I'd recommend looking at created a three-way DRBD resource. Use this resource as your cLVM PV/VG/LVs. On these LVs you can use GFS2 for the actual shared file system where your data can reside. A few notes: - You *MUST* have fencing setup. This is not an option, and manual fencing will not suffice. If your server have IPMI (or vendor equivalents like DRAC, iLO, etc) then you are fine. If not, you will need an external device like an addressable PDU (see APC or Tripplite). - LVM optimization is not something I can comment on. - We've not discussed complexity. Clustering is not inherently hard, but there is a lot to know and a lot can go wrong. Give yourself ample time to work through problems and test failure scenarios. Do not expect to launch in a month. It will likely takes a few months at minimum to be ready for production. - Join the #linux-cluster IRC channel on freenode.net. There are good people there who can help you out as you learn. - Be patient and have fun. :) -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From linux at alteeve.com Mon Feb 14 23:57:38 2011 From: linux at alteeve.com (Digimer) Date: Mon, 14 Feb 2011 18:57:38 -0500 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: References: <4D59BD29.30906@logik-internet.rs> Message-ID: <4D59C172.8070703@alteeve.com> On 02/14/2011 06:45 PM, Fajar A. Nugraha wrote: > On Tue, Feb 15, 2011 at 6:39 AM, Nikola Savic wrote: >> >> Hello, >> >> I need to setup cluster of 3 servers without separate storage device >> (SAN). Servers should join their local hard drives to create shared storage >> space. Every server in cluster has public (100Mbps) and private (1Gbps) NIC. >> Private 1Gbit network will be used for exchange of data (files) on shared >> storage. Additional request is that data on shared storage is highly >> redundant (complete mirroring is required). 
>> >> I was wondering if following setup is possible and if anyone has any >> experience or comments on it: >> - Servers will export local disks of same size as iSCSI targets >> - Each server will access other's two servers disk over iSCSI initiator >> - CLVM will be used to set Volume Group on all 3 disks. In theory this VG >> will work on all servers, because they'll have access to all disks (either >> directly or over iSCSI). >> - CLVM will be used to create Logical Volume with mirroring option set to 3 >> (-m 3). Since there are 3 disks (physical devices) forming VG, each server >> will have redundant copy of same data. >> - Created Logical Volume will have GFS2 on it, so that it can be shared by >> cluster. >> - Web server will store web application files (scripts and photos) on >> created GFS. >> >> If it works, this setup should provide shared storage for cluster, built >> from already available local hard drives in servers forming cluster. By >> using LVM mirroring, each server will have the same copy of data, which >> should make cluster resistant to failure of any server. >> >> I was wondering, is LVM smart enough to optimize reading and use local >> drive for read operations? > > Can LVM mirror handle one server outage? Can it automatically pick the > difference when it's back on? > Looks like you'd better stick with drbd. You can use DRBD as the PV for clustered LVM. :) -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From niks at logik-internet.rs Tue Feb 15 01:49:01 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 02:49:01 +0100 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59C13A.9030807@alteeve.com> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> Message-ID: <4D59DB8D.5080606@logik-internet.rs> Digimer wrote: > On 02/14/2011 06:39 PM, Nikola Savic wrote: > >> Hello, >> >> I need to setup cluster of 3 servers without separate storage device >> (SAN). Servers should join their local hard drives to create shared >> storage space. Every server in cluster has public (100Mbps) and private >> (1Gbps) NIC. Private 1Gbit network will be used for exchange of data >> (files) on shared storage. Additional request is that data on shared >> storage is highly redundant (complete mirroring is required). >> >> I was wondering if following setup is possible and if anyone has any >> experience or comments on it: >> - Servers will export local disks of same size as iSCSI targets >> - Each server will access other's two servers disk over iSCSI initiator >> - CLVM will be used to set Volume Group on all 3 disks. In theory this >> VG will work on all servers, because they'll have access to all disks >> (either directly or over iSCSI). >> - CLVM will be used to create Logical Volume with mirroring option set >> to 3 (-m 3). Since there are 3 disks (physical devices) forming VG, each >> server will have redundant copy of same data. >> - Created Logical Volume will have GFS2 on it, so that it can be shared >> by cluster. >> - Web server will store web application files (scripts and photos) on >> created GFS. >> >> If it works, this setup should provide shared storage for cluster, >> built from already available local hard drives in servers forming >> cluster. By using LVM mirroring, each server will have the same copy of >> data, which should make cluster resistant to failure of any server. 
>> >> I was wondering, is LVM smart enough to optimize reading and use local >> drive for read operations? >> >> Looking forward to your comments. >> >> Best Regards, >> Nikola >> > > I'd recommend looking at created a three-way DRBD resource. Use this > resource as your cLVM PV/VG/LVs. On these LVs you can use GFS2 for the > actual shared file system where your data can reside. > Thank you for prompt reply! Is there howto I can look into for this kind of setup? I assume that, when DRBD is used, only one of three mirrored devices is available for writing. That would require that one of servers exports writable DRBD using iSCSI or GNBD, so other cluster servers could access it. Am I right? What is main reason you would suggest DRBD and not solution based on LVM mirroring? Is it - Performance - Reliability - Better failover Best Regards, Nikola -------------- next part -------------- An HTML attachment was scrubbed... URL: From linux at alteeve.com Tue Feb 15 01:51:00 2011 From: linux at alteeve.com (Digimer) Date: Mon, 14 Feb 2011 20:51:00 -0500 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59DB8D.5080606@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> Message-ID: <4D59DC04.6060305@alteeve.com> On 02/14/2011 08:49 PM, Nikola Savic wrote: > Digimer wrote: >> On 02/14/2011 06:39 PM, Nikola Savic wrote: >> >>> Hello, >>> >>> I need to setup cluster of 3 servers without separate storage device >>> (SAN). Servers should join their local hard drives to create shared >>> storage space. Every server in cluster has public (100Mbps) and private >>> (1Gbps) NIC. Private 1Gbit network will be used for exchange of data >>> (files) on shared storage. Additional request is that data on shared >>> storage is highly redundant (complete mirroring is required). >>> >>> I was wondering if following setup is possible and if anyone has any >>> experience or comments on it: >>> - Servers will export local disks of same size as iSCSI targets >>> - Each server will access other's two servers disk over iSCSI initiator >>> - CLVM will be used to set Volume Group on all 3 disks. In theory this >>> VG will work on all servers, because they'll have access to all disks >>> (either directly or over iSCSI). >>> - CLVM will be used to create Logical Volume with mirroring option set >>> to 3 (-m 3). Since there are 3 disks (physical devices) forming VG, each >>> server will have redundant copy of same data. >>> - Created Logical Volume will have GFS2 on it, so that it can be shared >>> by cluster. >>> - Web server will store web application files (scripts and photos) on >>> created GFS. >>> >>> If it works, this setup should provide shared storage for cluster, >>> built from already available local hard drives in servers forming >>> cluster. By using LVM mirroring, each server will have the same copy of >>> data, which should make cluster resistant to failure of any server. >>> >>> I was wondering, is LVM smart enough to optimize reading and use local >>> drive for read operations? >>> >>> Looking forward to your comments. >>> >>> Best Regards, >>> Nikola >>> >> >> I'd recommend looking at created a three-way DRBD resource. Use this >> resource as your cLVM PV/VG/LVs. On these LVs you can use GFS2 for the >> actual shared file system where your data can reside. >> > > Thank you for prompt reply! > > Is there howto I can look into for this kind of setup? 
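Not a full howto, but as a rough sketch of the order the layers stack up in (all names below are placeholders, and the value passed to -t has to match the cluster name in cluster.conf):

    # with a dual-primary DRBD resource already up as /dev/drbd0 on every node:
    pvcreate /dev/drbd0
    vgcreate -c y shared_vg /dev/drbd0          # -c y marks the VG clustered; clvmd must be running
    lvcreate -n shared_lv -l 100%FREE shared_vg
    mkfs.gfs2 -p lock_dlm -t mycluster:shared -j 3 /dev/shared_vg/shared_lv   # one journal per node
    mount -t gfs2 /dev/shared_vg/shared_lv /srv/shared

mkfs.gfs2 is run once, from one node; the mount is then repeated on every node that should use the filesystem.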
I assume that, > when DRBD is used, only one of three mirrored devices is available for > writing. That would require that one of servers exports writable DRBD > using iSCSI or GNBD, so other cluster servers could access it. Am I right? > > What is main reason you would suggest DRBD and not solution based on > LVM mirroring? Is it > - Performance > - Reliability > - Better failover > > Best Regards, > Nikola I have an in-progress tutorial, which I would recommend as a guide only. If you are interested, I will send you the link off-list. As for your question; No, you can read/write to the shared storage at the same time without the need for iSCSI. DRBD can run in "Primary/Primary[/Primary]" mode. Then you layer onto this clustered LVM followed by GFS2. Once up, all three nodes can access and edit the same storage space at the same time. So you're taking advantage of all three technologies. As for mirrored LVM, I've not tried it yet as DRBD->cLVM->GFS2 has worked quite well for me. -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From linux at alteeve.com Tue Feb 15 02:37:39 2011 From: linux at alteeve.com (Digimer) Date: Mon, 14 Feb 2011 21:37:39 -0500 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59E3A1.4070508@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> Message-ID: <4D59E6F3.3050002@alteeve.com> On 02/14/2011 09:23 PM, Nikola Savic wrote: >> I have an in-progress tutorial, which I would recommend as a guide only. >> If you are interested, I will send you the link off-list. >> >> As for your question; No, you can read/write to the shared storage at >> the same time without the need for iSCSI. DRBD can run in >> "Primary/Primary[/Primary]" mode. Then you layer onto this clustered LVM >> followed by GFS2. Once up, all three nodes can access and edit the same >> storage space at the same time. >> >> So you're taking advantage of all three technologies. As for mirrored >> LVM, I've not tried it yet as DRBD->cLVM->GFS2 has worked quite well for me. > > I just read about Primary/Primary configuration in DRBD's User Guide, > but would love to get link to tutorial you mentioned, especially if it > covers fancing :) When one of servers is restarted and there is delay in > data being written to DRBD, what happens when sever is back up? Is > booting stopped by DRBD until synchronization is done, or does it try to > do it in background? If it's done in background, how does > Primary/Primary mode work? > > Thanks, > Nikola Once the cluster manager (corosync in Cluster3, openais in Cluster2) stops getting messages from a node (be it hung or dead), it starts a counter. Once the counter exceeds a set threshold, the node is declared dead and a fence is called against that node. This should, when working properly, reliably prevent the node from trying to access the shared storage (ie: stop it from trying to complete a write operation). Once, and *only* if the fence was successful, the cluster will reform. Once the cluster configuration is in place, recovery of the file system can begin (ie: the journal can be replayed). Finally, normal operation can continue, albeit with one less node. This is also where the resource manager (rgmanager or pacemaker) start shuffling around any resources that were lost when the node went down. 
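For reference, the fence call described above is whatever cluster.conf declares for the node; a minimal IPMI-based declaration looks roughly like this (names, addresses and credentials are made up):

    <clusternode name="node1.example.com" nodeid="1">
       <fence>
          <method name="ipmi">
             <device name="ipmi_node1" action="reboot"/>
          </method>
       </fence>
    </clusternode>
    <!-- one clusternode block per node, then: -->
    <fencedevices>
       <fencedevice agent="fence_ipmilan" name="ipmi_node1" ipaddr="192.168.1.101" login="admin" passwd="secret" lanplus="1"/>
    </fencedevices>

The delay between a node going quiet and the fence being called is governed mainly by the totem token timeout (plus post_fail_delay in <fence_daemon/>), which is roughly the "counter" and "threshold" described above.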
Traditionally, fencing involves rebooting the lost node, in the hopes that it will come back in a healthier state. Assuming it does come up healthy, a couple main steps must occur. First, it will rejoin the other DRBD members. These members will have a "dirty block" list in memory which will allow them to quickly bring the recovered server back into sync. During this time, you can bring that node online (ie: set it primary and start accessing it via GFS2). However, note that it can not be the sole primary device until it is fully sync'ed. Second, the cluster reforms to restore the recovered node. Once the member has successfully joined, the resource manager (again, rgmanager or pacemaker) will begin reorganizing the clustered resources as per your configuration. An important note: If the fence call fails (either because of a fault in the fence device or due to misconfiguration), the cluster will hang and *all* access to the shared storage will stop. *This is by design!* The reason is that, should the cluster falsely assume the node was dead, begin recovering the journal and then the hung node recovered and tried to complete the write, the shared filesystem would be corrupted. That is; "It is better a hung cluster than a corrupt cluster." This is why fencing is so critical. :) -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org From niks at logik-internet.rs Tue Feb 15 02:53:33 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 03:53:33 +0100 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59E6F3.3050002@alteeve.com> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> Message-ID: <4D59EAAD.8030301@logik-internet.rs> Digimer wrote: > First, it will rejoin the other DRBD members. These members will have a > "dirty block" list in memory which will allow them to quickly bring the > recovered server back into sync. During this time, you can bring that > node online (ie: set it primary and start accessing it via GFS2). > However, note that it can not be the sole primary device until it is > fully sync'ed. > If I understand you well, even before sync is completely done DRBD will take care of reading and writing of dirty blocks on problematic node that got back online? Let's say that node was down for longer time and that synchronization can take few minutes, maybe more. If all services start working before sync is complete, it can happen that web applications tries to write into or read from dirty block(s). Will DRBD take care of that? If not, is there way to suspend startup of services (web server and similar) until sync is done? Thanks for detailed replies! Regards, Nikola From stefan at lsd.co.za Tue Feb 15 06:14:58 2011 From: stefan at lsd.co.za (Stefan Lesicnik) Date: Tue, 15 Feb 2011 08:14:58 +0200 (SAST) Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59DC04.6060305@alteeve.com> Message-ID: <1170617551.7574.1297750498041.JavaMail.root@zcs-jhb-lsd> I have an in-progress tutorial, which I would recommend as a guide only. If you are interested, I will send you the link off-list. As for your question; No, you can read/write to the shared storage at the same time without the need for iSCSI. DRBD can run in "Primary/Primary[/Primary]" mode. 
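A minimal dual-primary sketch in DRBD 8.3 syntax for the two-node case (host names, backing disks, addresses and the fence-handler path are placeholders):

    resource r0 {
       protocol C;
       startup  { become-primary-on both; }
       net {
          allow-two-primaries;
          after-sb-0pri discard-zero-changes;   # split-brain policies; fencing is still required for safety
          after-sb-1pri discard-secondary;
          after-sb-2pri disconnect;
       }
       disk     { fencing resource-and-stonith; }
       handlers { fence-peer "/path/to/fence-handler.sh"; }   # wire this to the cluster's fence agent
       on nodeA { device /dev/drbd0; disk /dev/sda3; address 192.168.10.1:7788; meta-disk internal; }
       on nodeB { device /dev/drbd0; disk /dev/sda3; address 192.168.10.2:7788; meta-disk internal; }
    }

(With DRBD 8.3 a resource takes at most two primaries; a third copy is usually added as a stacked resource.)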
Then you layer onto this clustered LVM followed by GFS2. Once up, all three nodes can access and edit the same storage space at the same time. So you're taking advantage of all three technologies. As for mirrored LVM, I've not tried it yet as DRBD->cLVM->GFS2 has worked quite well for me. -- Digimer E-Mail: digimer at alteeve.com AN!Whitepapers: http://alteeve.com Node Assassin: http://nodeassassin.org Hi, Sorry my mail client isn't indenting above reply (zimbra - shrug). I have used DRBD to mirror 2 SAN's. I tested the active / active with ocfs2 and must say the performance knock was really terrible. It may have been application specific (many little files), but i think the general consensus is use active / passive with some cluster failover if you can. But please do test active / active and let us know if its better! (I also know there is a new version of drbd that is meant to improve dual primary mode) Stefan -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From gordan at bobich.net Tue Feb 15 09:57:18 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 09:57:18 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59EAAD.8030301@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> Message-ID: <4D5A4DFE.8050101@bobich.net> Nikola Savic wrote: > Digimer wrote: >> First, it will rejoin the other DRBD members. These members will have a >> "dirty block" list in memory which will allow them to quickly bring the >> recovered server back into sync. During this time, you can bring that >> node online (ie: set it primary and start accessing it via GFS2). >> However, note that it can not be the sole primary device until it is >> fully sync'ed. >> > > If I understand you well, even before sync is completely done DRBD > will take care of reading and writing of dirty blocks on problematic > node that got back online? Let's say that node was down for longer time > and that synchronization can take few minutes, maybe more. If all > services start working before sync is complete, it can happen that web > applications tries to write into or read from dirty block(s). Will DRBD > take care of that? If not, is there way to suspend startup of services > (web server and similar) until sync is done? DRBD and GFS will take care of that for you. DRBD directs reads to nodes that are up to date until everything is in sync. Make sure that in drbd.conf you put in a stonith parameter pointing at your fencing agent with suitable parameters, and set the timeout to slightly less than what you have it set in cluster.conf. That will ensure that you are protected from the race condition where DRBD might drop out but the node starts heartbeating between then and when the fencing timeout occurs. Oh, and if you are going to use DRBD there is no reason to use LVM. Gordan From work at fajar.net Tue Feb 15 10:08:45 2011 From: work at fajar.net (Fajar A. 
Nugraha) Date: Tue, 15 Feb 2011 17:08:45 +0700 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A4DFE.8050101@bobich.net> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> Message-ID: On Tue, Feb 15, 2011 at 4:57 PM, Gordan Bobic wrote: > Nikola Savic wrote: >> ?If I understand you well, even before sync is completely done DRBD >> will take care of reading and writing of dirty blocks on problematic >> node that got back online? Let's say that node was down for longer time >> and that synchronization can take few minutes, maybe more. If all >> services start working before sync is complete, it can happen that web >> applications tries to write into or read from dirty block(s). Will DRBD >> take care of that? If not, is there way to suspend startup of services >> (web server and similar) until sync is done? > > DRBD and GFS will take care of that for you. DRBD directs reads to nodes > that are up to date until everything is in sync. Really? Can you point to a documentation that said so? IIRC the block device /dev/drbd* on a node will not be accessible for read/write until it's synced. > > Make sure that in drbd.conf you put in a stonith parameter pointing at your > fencing agent with suitable parameters, and set the timeout to slightly less > than what you have it set in cluster.conf. That will ensure that you are > protected from the race condition where DRBD might drop out but the node > starts heartbeating between then and when the fencing timeout occurs. > > Oh, and if you are going to use DRBD there is no reason to use LVM. There are two ways to use DRBD with LVM in a cluster: (1) Use drbd on partition/disk, and use CLVM on top of that (2) create local LVM, and use drbd on top of the LVs Personally I prefer (2), since this setup allows LVM snapshots, and faster to resync if I want to reinitialize a drbd device on one of the nodes (like when a split brain occurred, which was often on my fencingless-test-setup a while back). -- Fajar From gordan at bobich.net Tue Feb 15 10:23:36 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 10:23:36 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> Message-ID: <4D5A5428.20400@bobich.net> Fajar A. Nugraha wrote: > On Tue, Feb 15, 2011 at 4:57 PM, Gordan Bobic wrote: >> Nikola Savic wrote: >>> If I understand you well, even before sync is completely done DRBD >>> will take care of reading and writing of dirty blocks on problematic >>> node that got back online? Let's say that node was down for longer time >>> and that synchronization can take few minutes, maybe more. If all >>> services start working before sync is complete, it can happen that web >>> applications tries to write into or read from dirty block(s). Will DRBD >>> take care of that? If not, is there way to suspend startup of services >>> (web server and similar) until sync is done? >> DRBD and GFS will take care of that for you. 
DRBD directs reads to nodes >> that are up to date until everything is in sync. > > Really? Can you point to a documentation that said so? > IIRC the block device /dev/drbd* on a node will not be accessible for > read/write until it's synced. If you are running in primary/primary mode, the block device will most definitely be available in rw mode as soon as drbd has connected to the cluster and established where to get the most up to date copy from. I haven't looked through the documentation recently so don't have a link handy but I have several clusters with this setup deployed, so I'm reasonably confident I know what I'm talking about. :) >> Make sure that in drbd.conf you put in a stonith parameter pointing at your >> fencing agent with suitable parameters, and set the timeout to slightly less >> than what you have it set in cluster.conf. That will ensure that you are >> protected from the race condition where DRBD might drop out but the node >> starts heartbeating between then and when the fencing timeout occurs. >> >> Oh, and if you are going to use DRBD there is no reason to use LVM. > > There are two ways to use DRBD with LVM in a cluster: > (1) Use drbd on partition/disk, and use CLVM on top of that > (2) create local LVM, and use drbd on top of the LVs > > Personally I prefer (2), since this setup allows LVM snapshots, and > faster to resync if I want to reinitialize a drbd device on one of the > nodes (like when a split brain occurred, which was often on my > fencingless-test-setup a while back). I don't see what the purpose of (1) is. I can sort of see where you are coming from with snapshots in (2), but what you are describing doesn't sound like something you would ever want to use in production. Just because you _can_ use LVM doesn't mean that you _should_ use it. Another bad thing about LVM if you are using it on top of RAID or an SSD is that its headers will throw the FS completely out of alignment if you don't pre-compensate for it. Gordan From niks at logik-internet.rs Tue Feb 15 11:49:38 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 12:49:38 +0100 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A4DFE.8050101@bobich.net> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> Message-ID: <4D5A6852.5000103@logik-internet.rs> Gordan Bobic wrote: > DRBD and GFS will take care of that for you. DRBD directs reads to > nodes that are up to date until everything is in sync. > > Make sure that in drbd.conf you put in a stonith parameter pointing at > your fencing agent with suitable parameters, and set the timeout to > slightly less than what you have it set in cluster.conf. That will > ensure that you are protected from the race condition where DRBD might > drop out but the node starts heartbeating between then and when the > fencing timeout occurs. > > Oh, and if you are going to use DRBD there is no reason to use LVM. This is interesting approach. I understand that DRBD with GFS2 doesn't require LVM between, but it does bring some inflexibility: * For each logical volume, one has to setup separate DRBD * Cluster wide logical volume resizing not easy * No snapshot - this is very important to me for MySQL backups. What is main reason for you not to use LVM on top of DRBD? 
Is it just that you didn't require benefits it brings? Or, it makes more problems by your opinion? Best Regards, Nikola -------------- next part -------------- An HTML attachment was scrubbed... URL: From gordan at bobich.net Tue Feb 15 12:04:44 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 12:04:44 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A6852.5000103@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> Message-ID: <4D5A6BDC.8080708@bobich.net> Nikola Savic wrote: > Gordan Bobic wrote: >> DRBD and GFS will take care of that for you. DRBD directs reads to >> nodes that are up to date until everything is in sync. >> >> Make sure that in drbd.conf you put in a stonith parameter pointing at >> your fencing agent with suitable parameters, and set the timeout to >> slightly less than what you have it set in cluster.conf. That will >> ensure that you are protected from the race condition where DRBD might >> drop out but the node starts heartbeating between then and when the >> fencing timeout occurs. >> >> Oh, and if you are going to use DRBD there is no reason to use LVM. > > This is interesting approach. I understand that DRBD with GFS2 doesn't > require LVM between, but it does bring some inflexibility: > > * For each logical volume, one has to setup separate DRBD Can you elaborate what you are referring to? Partitions? There is technically nothing stopping you from partitioning a DRBD device. Also depending on what you are doing, you may find that having one DRBD device per disk is preferable in terms of performance and reliability to having a mirrored pool (effectively RAID01). Pool of mirrors (RAID10) is more resilient. > * Cluster wide logical volume resizing not easy Are you really going to spoon-feed the space expansions that much, thus causing unnecessary fragmentation? If you size your storage sensibly, you won't need to upgrade it for a few years, and when the time comes to upgrade it you may well need to replace the servers while you're at it. Volume resizing is, IMO, over-rated and unnecessary in most cases, except where data growth is quite mind-boggling (in which case you won't be using MySQL anyway). > * No snapshot - this is very important to me for MySQL backups. Last I checked CLVM couldn't do snapshots, but that may have changed recently. Snapshots also aren't even remotely ideal for MySQL backups. You really need a replicated server to take reliable backups from. > What is main reason for you not to use LVM on top of DRBD? Is it just > that you didn't require benefits it brings? Or, it makes more problems > by your opinion? Traditionally, CLVM didn't provide any tangible benefits (no snapshots), and I never found myself in a situation where dynamically growing a volume with randomly assembled storage was required. If you are JBOD-ing a bunch of cheap SATA disks, you might as well size the storage correctly to begin with and not have to bother with LVM. I'm assuming this is what you are doing since you are doing it on the cheap (SAN-less). If you are using a SAN, the SAN will provide functionality to grow the exported block device and you can just grow the fs onto that, without needing LVM. 
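To make that last point concrete (the device and mount point below are placeholders): once the backing block device has been enlarged, a mounted GFS2 can be grown onto the new space online, from a single node.

    # e.g. after a SAN LUN resize, or "drbdadm resize r0" once both backing disks have grown:
    gfs2_grow /srv/shared      # run on one node only, against the mounted filesystem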
So apart from snapshots (non-clustered) or a setup like what was suggested earlier, to have DRBD on top of local LVM to gain local-consistency snapshot capability in a cluster (not sure I'd trust that with my data, but it may be good for non-production environments), I don't really see the advantage. Snapshots also only give you crash-level consistency, which I never felt was good enough for applications like databases. A replicated slave that you can shut down is generally a more reliable solution for backups. Gordan From thomas at sjolshagen.net Tue Feb 15 12:12:25 2011 From: thomas at sjolshagen.net (Thomas Sjolshagen) Date: Tue, 15 Feb 2011 07:12:25 -0500 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A6852.5000103@logik-internet.rs> References: "\"<4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com>" <4D59E3A1.4070508@logik-internet.rs>" <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> Message-ID: <6b22c9afe6618e0ab302dae12c4b74aa@sjolshagen.net> On Tue, 15 Feb 2011 12:49:38 +0100, Nikola Savic wrote: > This is interesting approach. I understand that DRBD with GFS2 doesn't require LVM between, but it does bring some inflexibility: > > * For each logical volume, one has to setup separate DRBD > * Cluster wide logical volume resizing not easy > * No snapshot - this is very important to me for MySQL backups. > > What is main reason for you not to use LVM on top of DRBD? Is it just that you didn't require benefits it brings? Or, it makes more problems by your opinion? Just so you realize; If you intend to use clvm (i.e. lvme in a cluster where you expect to be able to write to the volume from more than one node at/around the same time w/o a full-on failover), you will _not_ have snapshot support. And no, this isn't "not supported" as in "nobody to call if you encounter a problem", it's "not supported" as in "the tools will not let you create a snapshot of the LV". However, you ought to be able to configure one of the DRBD mirror members as part of a split/mount read-only/merge based equivalent and thus get a similar result, I think. // Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From niks at logik-internet.rs Tue Feb 15 12:19:41 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 13:19:41 +0100 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D59E6F3.3050002@alteeve.com> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> Message-ID: <4D5A6F5D.1050301@logik-internet.rs> Digimer wrote: > Once, and *only* if the fence was successful, the cluster will reform. > Once the cluster configuration is in place, recovery of the file system > can begin (ie: the journal can be replayed). Finally, normal operation > can continue, albeit with one less node. This is also where the resource > manager (rgmanager or pacemaker) start shuffling around any resources > that were lost when the node went down. > From guide you sent me, I understood that fencing to work well servers should have IPMI available on motherboards. My client is going to purchase servers at Hetzner from their EQ-Line. I asked their support if IPMI is available. 
Since my other client already has server with 'em, I tried to install ipmi related packages (like you specified in guide). IPMI service doesn't start, so I assume it's not available or not turned on in BIOS. How would cluster work if no IPMI or similar technology is available for fencing? In case one of nodes dies and no fencing is available, cluster will hang until administrator does manual fancing? Best Regards, Nikola From gordan at bobich.net Tue Feb 15 12:20:04 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 12:20:04 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <6b22c9afe6618e0ab302dae12c4b74aa@sjolshagen.net> References: "\"<4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com>" <4D59E3A1.4070508@logik-internet.rs>" <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> <6b22c9afe6618e0ab302dae12c4b74aa@sjolshagen.net> Message-ID: <4D5A6F74.5030500@bobich.net> Thomas Sjolshagen wrote: > On Tue, 15 Feb 2011 12:49:38 +0100, Nikola Savic wrote: > >> This is interesting approach. I understand that DRBD with GFS2 >> doesn't require LVM between, but it does bring some inflexibility: >> >> * For each logical volume, one has to setup separate DRBD >> * Cluster wide logical volume resizing not easy >> * No snapshot - this is very important to me for MySQL backups. >> >> What is main reason for you not to use LVM on top of DRBD? Is it >> just that you didn't require benefits it brings? Or, it makes more >> problems by your opinion? > > Just so you realize; If you intend to use clvm (i.e. lvme in a cluster > where you expect to be able to write to the volume from more than one > node at/around the same time w/o a full-on failover), you will _not_ > have snapshot support. And no, this isn't "not supported" as in "nobody > to call if you encounter a problem", it's "not supported" as in "the > tools will not let you create a snapshot of the LV". > > However, you ought to be able to configure one of the DRBD mirror > members as part of a split/mount read-only/merge based equivalent and > thus get a similar result, I think. Indeed, that is right - you can drop a server out of the cluster, stop drbd replication and mount it read-only (lock_nolock) and use that as a "snapshot". The added benefit is that it won't cause massive cluster slow-down through lock-bouncing during the backup. Gordan From gordan at bobich.net Tue Feb 15 12:30:06 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 12:30:06 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A6F5D.1050301@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D5A6F5D.1050301@logik-internet.rs> Message-ID: <4D5A71CE.4030007@bobich.net> Nikola Savic wrote: > Digimer wrote: >> Once, and *only* if the fence was successful, the cluster will reform. >> Once the cluster configuration is in place, recovery of the file system >> can begin (ie: the journal can be replayed). Finally, normal operation >> can continue, albeit with one less node. This is also where the resource >> manager (rgmanager or pacemaker) start shuffling around any resources >> that were lost when the node went down. 
>> > > From guide you sent me, I understood that fencing to work well servers > should have IPMI available on motherboards. > > My client is going to purchase servers at Hetzner from their EQ-Line. > I asked their support if IPMI is available. Since my other client > already has server with 'em, I tried to install ipmi related packages > (like you specified in guide). IPMI service doesn't start, so I assume > it's not available or not turned on in BIOS. That doesn't mean much. The IPMI service isn't what you use for fencing in this context. It's for diagnostics (e.g. advanced sensor readings, fan speeds, temperatures, voltages, etc.). Think of it as lm_sensors on steroids. For fencing you need to connect to the machine externally over the network via IPMI, and this will run at firmware level (i.e. you need to be able to power the machine on and off without an OS running). > How would cluster work if no IPMI or similar technology is available > for fencing? In case one of nodes dies and no fencing is available, > cluster will hang until administrator does manual fancing? Yes, that's about the size of it. There are add-in cards you can use to add fencing functionality even if you don't have this built into the server, e.g. Raritan eRIC G4 and similar. I wrote a fencing agent for those, you should be able to find it in the redhat bugzilla. They can be found for about ?175 or so. That may or may not compare favourably to what you can get with the servers from the vendor. Alternatively, you can use network controllable power bars for fencing, they may work out cheaper (you need one eRIC card per server, and assuming your servers have dual PSUs, you'd only need two power bars). Something else just occurs to me - you mentioned MySQL. You do realize that the performance of it will be attrocious on a shared cluster file system (ANY shared cluster file system), right? Unless you only intend to run mysqld on a single node at a time (in which case there's no point in putting it on a cluster file system). Gordan From niks at logik-internet.rs Tue Feb 15 13:05:35 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 14:05:35 +0100 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A6BDC.8080708@bobich.net> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> <4D5A6BDC.8080708@bobich.net> Message-ID: <4D5A7A1F.60804@logik-internet.rs> Gordan Bobic wrote: >> What is main reason for you not to use LVM on top of DRBD? Is it just >> that you didn't require benefits it brings? Or, it makes more >> problems by your opinion? > > Traditionally, CLVM didn't provide any tangible benefits (no > snapshots), and I never found myself in a situation where dynamically > growing a volume with randomly assembled storage was required. If you > are JBOD-ing a bunch of cheap SATA disks, you might as well size the > storage correctly to begin with and not have to bother with LVM. I'm > assuming this is what you are doing since you are doing it on the > cheap (SAN-less). If you are using a SAN, the SAN will provide > functionality to grow the exported block device and you can just grow > the fs onto that, without needing LVM. 
> > So apart from snapshots (non-clustered) or a setup like what was > suggested earlier, to have DRBD on top of local LVM to gain > local-consistency snapshot capability in a cluster (not sure I'd trust > that with my data, but it may be good for non-production > environments), I don't really see the advantage. Snapshots also only > give you crash-level consistency, which I never felt was good enough > for applications like databases. A replicated slave that you can shut > down is generally a more reliable solution for backups. Thank you for detailed response! I generally like idea of removing unneeded levels of technology. In case DRBD+GFS2 is used for shared storage, do I need cluster suite? Can GFS2 in this setup without cluster setup? Thanks, Nikola From niks at logik-internet.rs Tue Feb 15 13:14:42 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 14:14:42 +0100 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A71CE.4030007@bobich.net> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D5A6F5D.1050301@logik-internet.rs> <4D5A71CE.4030007@bobich.net> Message-ID: <4D5A7C42.3080003@logik-internet.rs> Gordan Bobic wrote: > Something else just occurs to me - you mentioned MySQL. You do realize > that the performance of it will be attrocious on a shared cluster file > system (ANY shared cluster file system), right? Unless you only intend > to run mysqld on a single node at a time (in which case there's no > point in putting it on a cluster file system). MySQL Master and Slave(s) will run on single node. No two MySQL instances will run on same set of data. Shared storage for MySQL data should enable easier movement of MySQL instance between nodes. Eg. when MySQL master needs to be moved from one node to other, I assume it would be easier with DRBD, because I would "only" need to stop MySQL on one node and start it on other configured to use same set of data. Additionally, floating IP address assigned to MySQL master would need to be re-assigned to new node. Slaves would also need to be restarted to connect to new master. Even without floating IP used only my MySQL Master, slaves and web application can easily be reconfigured to use new IP. Do you see problem in this kind of setup? Thanks, Nikola From gordan at bobich.net Tue Feb 15 13:12:58 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 13:12:58 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A7A1F.60804@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> <4D5A6BDC.8080708@bobich.net> <4D5A7A1F.60804@logik-internet.rs> Message-ID: <4D5A7BDA.30104@bobich.net> Nikola Savic wrote: > Gordan Bobic wrote: >>> What is main reason for you not to use LVM on top of DRBD? Is it just >>> that you didn't require benefits it brings? Or, it makes more >>> problems by your opinion? >> Traditionally, CLVM didn't provide any tangible benefits (no >> snapshots), and I never found myself in a situation where dynamically >> growing a volume with randomly assembled storage was required. 
If you >> are JBOD-ing a bunch of cheap SATA disks, you might as well size the >> storage correctly to begin with and not have to bother with LVM. I'm >> assuming this is what you are doing since you are doing it on the >> cheap (SAN-less). If you are using a SAN, the SAN will provide >> functionality to grow the exported block device and you can just grow >> the fs onto that, without needing LVM. >> >> So apart from snapshots (non-clustered) or a setup like what was >> suggested earlier, to have DRBD on top of local LVM to gain >> local-consistency snapshot capability in a cluster (not sure I'd trust >> that with my data, but it may be good for non-production >> environments), I don't really see the advantage. Snapshots also only >> give you crash-level consistency, which I never felt was good enough >> for applications like databases. A replicated slave that you can shut >> down is generally a more reliable solution for backups. > > Thank you for detailed response! > > I generally like idea of removing unneeded levels of technology. > > In case DRBD+GFS2 is used for shared storage, do I need cluster suite? > Can GFS2 in this setup without cluster setup? No, it cannot. GFS2's locking is dependant on the cman service being up and quorate, so yes, you still need the cluster suite being up and running, since that is what handles fencing. You could replace DRBD+GFS+RHCS with, say, DRBD+OCFS2+Heartbeat, but that wouldn't gain you anything either way - you'd still need fencing configured and working. Note that DRBD should also have fencing (stonith) configured, and on a lower time-out than the rest of the cluster layer to eliminate possibility of split-braining. Gordan From gordan at bobich.net Tue Feb 15 13:31:42 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 13:31:42 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A7C42.3080003@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D5A6F5D.1050301@logik-internet.rs> <4D5A71CE.4030007@bobich.net> <4D5A7C42.3080003@logik-internet.rs> Message-ID: <4D5A803E.5090106@bobich.net> Nikola Savic wrote: > Gordan Bobic wrote: >> Something else just occurs to me - you mentioned MySQL. You do realize >> that the performance of it will be attrocious on a shared cluster file >> system (ANY shared cluster file system), right? Unless you only intend >> to run mysqld on a single node at a time (in which case there's no >> point in putting it on a cluster file system). > > MySQL Master and Slave(s) will run on single node. No two MySQL > instances will run on same set of data. Shared storage for MySQL data > should enable easier movement of MySQL instance between nodes. Eg. when > MySQL master needs to be moved from one node to other, I assume it would > be easier with DRBD, because I would "only" need to stop MySQL on one > node and start it on other configured to use same set of data. There is a better way to do that. Run DRBD in active-passive mode, and grab the fail-over scripts from heartbeat. Then set up a dependency in cluster.conf that will handle a combined service of DRBD disk (handling active/passive switch), file system (mounting the fs once the DRBD becomes active locally, and mysql. You define them as dependant on each other in cluster.conf by suitable nesting. 
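A minimal sketch of that kind of nesting (the service layout, resource names and the DRBD wrapper script below are illustrative only, not taken from this thread; the wrapper is assumed to map start/stop onto promoting/demoting the DRBD resource, much like heartbeat's drbddisk script):

  <rm>
    <service name="mysql-svc" autostart="1" recovery="relocate">
      <!-- promote DRBD first (hypothetical wrapper script) -->
      <script name="drbd-r0" file="/etc/init.d/drbd-r0-primary">
        <!-- mount the fs off the DRBD device once it is primary here -->
        <fs name="mysqlfs" device="/dev/drbd0" mountpoint="/var/lib/mysql"
            fstype="ext3" force_unmount="1">
          <!-- floating IP next, mysqld last -->
          <ip address="10.0.0.10" monitor_link="1">
            <script name="mysqld" file="/etc/init.d/mysqld"/>
          </ip>
        </fs>
      </script>
    </service>
  </rm>

rgmanager starts nested children after their parent and stops them in reverse order, so this expresses the DRBD -> filesystem -> IP -> mysqld chain described above.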
> Additionally, floating IP address assigned to MySQL master would need to > be re-assigned to new node. You can make that IP a part of the dependency stack mentioned above. > Slaves would also need to be restarted to > connect to new master. Even without floating IP used only my MySQL > Master, slaves and web application can easily be reconfigured to use new > IP. Do you see problem in this kind of setup? If the IP fails over and the FS is consistent you don't need to change any configs - MySQL slaves will re-try connecting until they succeed. Just make sure your bin-logs are on the same mount as the rest of MySQL, since they have to fail over with the rest of the DB. Gordan From jeff.sturm at eprize.com Tue Feb 15 15:55:43 2011 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Tue, 15 Feb 2011 10:55:43 -0500 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5A6BDC.8080708@bobich.net> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs><4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> <4D5A6BDC.8080708@bobich.net> Message-ID: <64D0546C5EBBD147B75DE133D798665F0855C0F4@hugo.eprize.local> > -----Original Message----- > From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] > On Behalf Of Gordan Bobic > Sent: Tuesday, February 15, 2011 7:05 AM > > Volume resizing is, IMO, over-rated and unnecessary in most cases, except where data > growth is quite mind-boggling (in which case you won't be using MySQL anyway). We actually resize volumes often. Some of our storage volumes have 30 LUNs or more. We have so many because we've virtualized most of our infrastructure, and some of the hosts are single-purpose hosts. We don't want to allocate too more storage in advance, simply because it's easier to grow than to shrink. Stop the host, grow the volume, e2fsck/resize2fs, start up and go. Much nicer than increasing disk capacity on physical hosts. CLVM works well for this, but that's about all it's good for IMHO. I prefer to use the SAN's native volume management over CLVM when available. Haven't tried DRBD yet but I'm really tempted... it sounds like it has come a long way since its modest beginnings. -Jeff From gordan at bobich.net Tue Feb 15 16:17:03 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 16:17:03 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <64D0546C5EBBD147B75DE133D798665F0855C0F4@hugo.eprize.local> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs><4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> <4D5A6BDC.8080708@bobich.net> <64D0546C5EBBD147B75DE133D798665F0855C0F4@hugo.eprize.local> Message-ID: <4D5AA6FF.8080608@bobich.net> Jeff Sturm wrote: >> -----Original Message----- >> From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] >> On Behalf Of Gordan Bobic >> Sent: Tuesday, February 15, 2011 7:05 AM >> >> Volume resizing is, IMO, over-rated and unnecessary in most cases, > except where data >> growth is quite mind-boggling (in which case you won't be using MySQL > anyway). 
> > We actually resize volumes often. Some of our storage volumes have 30 > LUNs or more. We have so many because we've virtualized most of our > infrastructure, and some of the hosts are single-purpose hosts. > > We don't want to allocate too more storage in advance, simply because > it's easier to grow than to shrink. Stop the host, grow the volume, > e2fsck/resize2fs, start up and go. Much nicer than increasing disk > capacity on physical hosts. Seems labour and downtime intensive to me. Maybe I'm just used to environments where that is an unacceptable tradeoff vs. ?40/TB for storage. Not to mention that it makes you totally reliant on SAN level redundancy, which I also generally deem unacceptable except on very high end SANs that have mirroring features. Additionally, considering you can self-build a multi-TB iSCSI SAN for a few hundred ?/$/? which will have volume growing ability (use sparse files for iSCSI volumes and write a byte to a greater offset), I cannot really see any justification whatsoever for using LVM with SAN based storage. > Haven't tried DRBD yet but I'm really tempted... it sounds like it has > come a long way since its modest beginnings. Not sure how far back you are talking about but I have been using it in production in both active-active and active-passive configurations since at least 2007 with no problems. From the usage point of view, the changes have been negligible. Gordan From rpeterso at redhat.com Tue Feb 15 16:24:26 2011 From: rpeterso at redhat.com (Bob Peterson) Date: Tue, 15 Feb 2011 11:24:26 -0500 (EST) Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <64D0546C5EBBD147B75DE133D798665F0855C0F4@hugo.eprize.local> Message-ID: <263367529.33108.1297787066881.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> ----- Original Message ----- | We don't want to allocate too more storage in advance, simply because | it's easier to grow than to shrink. Stop the host, grow the volume, | e2fsck/resize2fs, start up and go. Much nicer than increasing disk | capacity on physical hosts. These might be good for ext3/4, but with gfs and gfs2 you can lvresize and gfs2_grow while the lv is mounted. In fact, we expect it. Just make sure the vg has the clustered bit set (vgchange -cy) first. Regards, Bob Peterson Red Hat File Systems From ajb2 at mssl.ucl.ac.uk Tue Feb 15 17:59:08 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Tue, 15 Feb 2011 17:59:08 +0000 Subject: [Linux-cluster] optimising DLM speed? Message-ID: <4D5ABEEC.30609@mssl.ucl.ac.uk> After lots of headbanging, I'm slowly realising that limits on GFS2 lock rates and totem message passing appears to be the main inhibitor of cluster performance. Even on disks which are only mounted on one node (using lock_dlm), the ping_pong rate is - quite frankly - appalling, at about 5000 locks/second, falling off to single digits when 3 nodes are active on the same directory. 
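(For context: the ping_pong figures above come from the ctdb/samba ping_pong fcntl-lock tester run against a file on the filesystem under test. The exact invocation isn't given in this thread, but it is typically along the lines of:

  # one more concurrent lock than the number of nodes taking part
  ping_pong /gfs2/scratch/test.dat 4

which reports the achieved lock rate per second.)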
totem's defaults are pretty low:

(from man openais.conf)

max messages/second = 17
window_size = 50
encryption = on
encryption/decryption threads = 1
netmtu = 1500

I suspect tuning these would have a marked effect on performance

gfs_controld and dlm_controld aren't even appearing in the CPU usage tables (24Gb dual 5560CPUs) We have 2 GFS clusters, 2 nodes (imap) and 3 nodes (fileserving) The imap system has around 2.5-3 million small files in the Maildir imap tree, whilst the fileserver cluster has ~90 1Tb filesystems of 1-4 million files apiece (fileserver total is around 150 million files) When things get busy or when users get silly and drop 10,000 files in a directory, performance across the entire cluster goes downhill badly - not just in the affected disk or directory. Even worse: backups - it takes 20-28 hours to run a 0 file incremental backup of a 2.1 million file system (ext4 takes about 8 minutes for the same file set!) All heartbeat/lock traffic is handled across a dedicated Gb switch with each cluster in its own vlan to ensure no external cruft gets in to cause problems. I'm seeing heartbeat/lock lan traffic peak out at about 120kb/s and 4000pps per node at the moment. Clearly the switch isn't the problem - and using hardware accelerated igb devices I'm pretty sure the networking's fine too. SAN side, there are 4 8Gb Qlogic cards facing the fabric and right now the whole mess talks to a Nexsan atabeast (which is slow, but seldom gets its command queue maxed out.) Has anyone played much with the totem message timings? If so, what results have you had? As a comparison, the same hardware using EXT4 on a standalone system can trivially max out multiple 1Gb/s interfaces while transferring 1-2Mb/s files and gives lock rates of 1.8-2.5 million locks/second even with multiple ping_pong processes running. From swhiteho at redhat.com Tue Feb 15 18:20:20 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Tue, 15 Feb 2011 18:20:20 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5ABEEC.30609@mssl.ucl.ac.uk> References: <4D5ABEEC.30609@mssl.ucl.ac.uk> Message-ID: <1297794020.2711.21.camel@dolmen> Hi, On Tue, 2011-02-15 at 17:59 +0000, Alan Brown wrote: > After lots of headbanging, I'm slowly realising that limits on GFS2 lock > rates and totem message passing appears to be the main inhibitor of > cluster performance. > > Even on disks which are only mounted on one node (using lock_dlm), the > ping_pong rate is - quite frankly - appalling, at about 5000 > locks/second, falling off to single digits when 3 nodes are active on > the same directory. > Let me try and explain what is going on here.... the posix (fcntl) locks which you are using, do not go through the dlm, or at least not the main part of the dlm. The lock requests are sent to either gfs_controld or dlm_controld, depending upon the version of RHCS where the requests are processed in userspace via corosync/openais. > totem's defaults are pretty low: > > (from man openais.conf) > > max messages/second = 17 > window_size = 50 > encryption = on > encryption/decryption threads = 1 > netmtu = 1500 > > I suspect tuning these would have a marked effect on performance > > gfs_controld and dlm_controld aren't even appearing in the CPU usage > tables (24Gb dual 5560CPUs) > Only one of gfs_controld/dlm_controld will have any part in dealing with the locks that you are concerned with, depending on the version.
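(Worth noting alongside this: the userspace plock path has its own throttle, separate from the totem settings quoted above. On cluster2/RHEL5 it is a gfs_controld option and on cluster3/RHEL6 a dlm_controld option, both settable from cluster.conf - roughly along these lines, with illustrative values:

  <!-- cluster2 / RHEL5: plocks go through gfs_controld -->
  <gfs_controld plock_rate_limit="0" plock_ownership="1"/>

  <!-- cluster3 / RHEL6: plocks go through dlm_controld -->
  <dlm plock_rate_limit="0" plock_ownership="1"/>

plock_rate_limit="0" removes the built-in cap on plock messages per second (the default is deliberately low), and plock_ownership lets a node cache ownership of locks it keeps re-acquiring instead of doing a corosync round trip every time; see gfs_controld(8) and dlm_controld(8) for details.)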
> We have 2 GFS clusters, 2 nodes (imap) and 3 nodes (fileserving) > > The imap system has around 2.5-3 million small files in the Maildir imap > tree, whilst the fileserver cluster has ~90 1Tb filesystems of 1-4 > million files apiece (fileserver total is around 150 million files) > > When things get busy or when users get silly and drop 10,000 files in a > directory, performance across the entire cluster goes downhill badly - > not just in the affected disk or directory. > > Even worse: backups - it takes 20-28 hours to run a 0 file incremental > backup of a 2.1million file system (ext4 takes about 8 minutes for the > same file set!) > The issues you've reported here don't sound to me as if they are related to the rate of posix locks which can be granted. These sound to me a lot more like issues relating to the I/O pattern on the filesystem. How is the data spread out across directories and across nodes? Do you try to keep users local to a single node for the imap servers? Is the backup just doing a single pass scan over the whole fileystem? > > All heartbeat/lock traffic is handled across a dedicated Gb switch with > each cluster in its own vlan to ensure no external cruft gets in to > cause problems. > > I'm seeing heartbeat/lock lan traffic peak out at about 120kb/s and > 4000pps per node at the moment. Clearly the switch isn't the problem - > and using hardware acclerated igb devices I'm pretty sure the > networking's fine too. > During the actual workload, or just during the ping pong test? Steve. > SAN side, there are 4 8Gb Qlogic cards facing the fabric and right now > the whole mess talks to a Nexsan atabeast (which is slow, but seldom > gets its commmand queue maxed out.) > > Has anyone played much with the totem message timings? if so what > results have you had? > > As a comparison, the same hardware using EXT4 on a standalone system can > trivially max out multiple 1Gb/s interfaces while transferring 1-2Mb/s > files and gives lock rates of 1.8-2.5 million locks/second even with > multiple ping_pong processes running. > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From yvette at dbtgroup.com Tue Feb 15 18:19:53 2011 From: yvette at dbtgroup.com (yvette hirth) Date: Tue, 15 Feb 2011 18:19:53 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <6b22c9afe6618e0ab302dae12c4b74aa@sjolshagen.net> References: "\"<4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com>" <4D59E3A1.4070508@logik-internet.rs>" <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> <6b22c9afe6618e0ab302dae12c4b74aa@sjolshagen.net> Message-ID: <4D5AC3C9.8000401@dbtgroup.com> Thomas Sjolshagen wrote: > Just so you realize; If you intend to use clvm (i.e. lvme in a cluster > where you expect to be able to write to the volume from more than one > node at/around the same time w/o a full-on failover), you will _not_ > have snapshot support. And no, this isn't "not supported" as in "nobody > to call if you encounter a problem", it's "not supported" as in "the > tools will not let you create a snapshot of the LV". i've been listening to this discussion with much interest, as we would like to improve the currency of our backup files. 
right now we have an ensemble of GFS2 LV's ("pri") as our primary data store, and a "matching ensemble" of XFS LV's ("bak") as our backup data store. an hourly cron job rsync's all LV's in the ensemble from pri => bak. it's incredibly reliable, but this reduces our mean backup currency by 1/2 hour. one upside is that i've got snapshots that are only 1/2 hour old, and are daily backed up to tape. the conversation seems to indicate that we can change the bak LV's from XFS to GFS2 and have drbd auto-sync the pri LV changes made to the bak LV's - yes? this would reduce our backup currency from a mean of 1/2 hour to theoretically, "atomic" (more likely "mere seconds"). i assUme we have to change from XFS to GFS2, as drbd doesn't appear to do file system conversions... if our assumptions are correct, are there any guides / manuals / doc on how to do this? it's most tempting to try, since if it doesn't work, the hourly cron rsync's could be simply reinstated. many thanks in advance to any and all who can advise... yvette From niks at logik-internet.rs Tue Feb 15 20:09:26 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 21:09:26 +0100 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <64D0546C5EBBD147B75DE133D798665F0855C0F4@hugo.eprize.local> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs><4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> <4D5A6BDC.8080708@bobich.net> <64D0546C5EBBD147B75DE133D798665F0855C0F4@hugo.eprize.local> Message-ID: <4D5ADD76.2050600@logik-internet.rs> Jeff Sturm wrote: > We actually resize volumes often. Some of our storage volumes have 30 > LUNs or more. We have so many because we've virtualized most of our > infrastructure, and some of the hosts are single-purpose hosts. > Can you please provide more information on how storage is organized? Are you using SAN or local hard disks in nodes? Is there mirroring of data and how is it implemented in your system? Thanks, Nikola From grimme at atix.de Tue Feb 15 20:07:31 2011 From: grimme at atix.de (Marc Grimme) Date: Tue, 15 Feb 2011 21:07:31 +0100 (CET) Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <1297794020.2711.21.camel@dolmen> Message-ID: <13588217.76.1297800448396.JavaMail.marc@mobilix-20> Hi Steve, I think lately I observed a very similar behavior with RHEL5 and gfs2. It was a gfs2 filesystem that had about 2Mio files with sum of 2GB in a directory. When I did a du -shx . in this directory it took about 5 Minutes (noatime mountoption given). Independently on how much nodes took part in the cluster (in the end I only tested with one node). This was only for the first time running all later executed du commands were much faster. When I mounted the exact same filesystem with lockproto=lock_nolock it took about 10-20 seconds to proceed with the same command. 
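(For anyone wanting to repeat that comparison: gfs2 lets you override the lock protocol at mount time without touching the superblock, so with the filesystem unmounted on every other node something like

  mount -t gfs2 -o lockproto=lock_nolock /dev/vg_test/lv_gfs2 /mnt/test

gives a single-node, DLM-free mount - the device path is just a placeholder. Never do this while any other node still has the filesystem mounted.)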
Next I started to analyze this with oprofile and observed the following result:

opreport --long-file-names:
CPU: AMD64 family10, speed 2900.11 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        symbol name
200569   46.7639  search_rsb_list
118905   27.7234  create_lkb
32499     7.5773  search_bucket
4125      0.9618  find_lkb
3641      0.8489  process_send_sockets
3420      0.7974  dlm_scan_rsbs
3184      0.7424  _request_lock
3012      0.7023  find_rsb
2735      0.6377  receive_from_sock
2610      0.6085  _receive_message
2543      0.5929  dlm_allocate_rsb
2299      0.5360  dlm_hash2nodeid
2228      0.5195  _create_message
2180      0.5083  dlm_astd
2163      0.5043  dlm_find_lockspace_global
2109      0.4917  dlm_find_lockspace_local
2074      0.4836  dlm_lowcomms_get_buffer
2060      0.4803  dlm_lock
1982      0.4621  put_rsb
..

opreport --image /gfs2
CPU: AMD64 family10, speed 2900.11 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        symbol name
9310     15.5600  search_bucket
6268     10.4758  do_promote
2704      4.5192  gfs2_glock_put
2289      3.8256  gfs2_glock_hold
2286      3.8206  gfs2_glock_schedule_for_reclaim
2204      3.6836  gfs2_glock_nq
2204      3.6836  run_queue
2001      3.3443  gfs2_holder_wake
..

opreport --image /dlm
CPU: AMD64 family10, speed 2900.11 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        symbol name
200569   46.7639  search_rsb_list
118905   27.7234  create_lkb
32499     7.5773  search_bucket
4125      0.9618  find_lkb
3641      0.8489  process_send_sockets
3420      0.7974  dlm_scan_rsbs
3184      0.7424  _request_lock
3012      0.7023  find_rsb
2735      0.6377  receive_from_sock
2610      0.6085  _receive_message
2543      0.5929  dlm_allocate_rsb
2299      0.5360  dlm_hash2nodeid
2228      0.5195  _create_message
..

This very much reminded me of a similar test we've done years ago with gfs (see http://www.open-sharedroot.org/Members/marc/blog/blog-on-dlm/red-hat-dlm-__find_lock_by_id/profile-data-with-diffrent-table-sizes). Does this not show that during the du command 46% of the time the kernel stays in the dlm:search_rsb_list function while looking for locks? It still looks like the hashtable for the locks in dlm is much too small and searching inside the hashmap is not constant anymore? It would be really interesting how long the described backup takes when the gfs2 filesystem is mounted exclusively on one node without locking. For me it looks like you're facing a similar problem with gfs2 that has been worked around in gfs by introducing the glock_purge functionality, which leads to a much smaller glock->dlm->hashtable and makes backups and the like much faster. I hope this helps. Thanks and regards Marc. ----- Original Message ----- From: "Steven Whitehouse" To: "linux clustering" Sent: Dienstag, 15. Februar 2011 19:20:20 Subject: Re: [Linux-cluster] optimising DLM speed? Hi, On Tue, 2011-02-15 at 17:59 +0000, Alan Brown wrote: > After lots of headbanging, I'm slowly realising that limits on GFS2 lock > rates and totem message passing appears to be the main inhibitor of > cluster performance. > > Even on disks which are only mounted on one node (using lock_dlm), the > ping_pong rate is - quite frankly - appalling, at about 5000 > locks/second, falling off to single digits when 3 nodes are active on > the same directory. > Let me try and explain what is going on here.... the posix (fcntl) locks which you are using, do not go through the dlm, or at least not the main part of the dlm.
The lock requests are sent to either gfs_controld or dlm_controld, depending upon the version of RHCS where the requests are processed in userspace via corosync/openais. > totem's defaults are pretty low: > > (from man openais.conf) > > max messages/second = 17 > window_size = 50 > encryption = on > encryption/decryption threads = 1 > netmtu = 1500 > > I suspect tuning these would have a marked effect on performance > > gfs_controld and dlm_controld aren't even appearing in the CPU usage > tables (24Gb dual 5560CPUs) > Only one of gfs_controld/dlm_controld will have any part in dealing with the locks that you are concerned with, depending on the version. > We have 2 GFS clusters, 2 nodes (imap) and 3 nodes (fileserving) > > The imap system has around 2.5-3 million small files in the Maildir imap > tree, whilst the fileserver cluster has ~90 1Tb filesystems of 1-4 > million files apiece (fileserver total is around 150 million files) > > When things get busy or when users get silly and drop 10,000 files in a > directory, performance across the entire cluster goes downhill badly - > not just in the affected disk or directory. > > Even worse: backups - it takes 20-28 hours to run a 0 file incremental > backup of a 2.1million file system (ext4 takes about 8 minutes for the > same file set!) > The issues you've reported here don't sound to me as if they are related to the rate of posix locks which can be granted. These sound to me a lot more like issues relating to the I/O pattern on the filesystem. How is the data spread out across directories and across nodes? Do you try to keep users local to a single node for the imap servers? Is the backup just doing a single pass scan over the whole fileystem? > > All heartbeat/lock traffic is handled across a dedicated Gb switch with > each cluster in its own vlan to ensure no external cruft gets in to > cause problems. > > I'm seeing heartbeat/lock lan traffic peak out at about 120kb/s and > 4000pps per node at the moment. Clearly the switch isn't the problem - > and using hardware acclerated igb devices I'm pretty sure the > networking's fine too. > During the actual workload, or just during the ping pong test? Steve. > SAN side, there are 4 8Gb Qlogic cards facing the fabric and right now > the whole mess talks to a Nexsan atabeast (which is slow, but seldom > gets its commmand queue maxed out.) > > Has anyone played much with the totem message timings? if so what > results have you had? > > As a comparison, the same hardware using EXT4 on a standalone system can > trivially max out multiple 1Gb/s interfaces while transferring 1-2Mb/s > files and gives lock rates of 1.8-2.5 million locks/second even with > multiple ping_pong processes running. > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -- Marc Grimme Tel: +49 89 4523538-14 Fax: +49 89 9901766-0 E-Mail: grimme at atix.de ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 | 85716 Unterschleissheim | www.atix.de Registergericht: Amtsgericht Muenchen, Registernummer: HRB 168930, USt.-Id.: DE209485962 | Vorstand: Thomas Merz (Vors.), Marc Grimme, Mark Hlawatschek, Jan R. Bergrath | Vorsitzender des Aufsichtsrats: Dr. 
Martin Buss From ajb2 at mssl.ucl.ac.uk Tue Feb 15 20:24:44 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Tue, 15 Feb 2011 20:24:44 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5ABEEC.30609@mssl.ucl.ac.uk> References: <4D5ABEEC.30609@mssl.ucl.ac.uk> Message-ID: <4D5AE10C.9020106@mssl.ucl.ac.uk> The setup described is all on RHEL5.6. Fileserver filesystems are each mounted on one cluster node only (scattered across nodes) and then NFS exported as individual services for portability. (That exposed a major race condition with exportfs as it's not parallel aware in any way, shape or form) Imapserver filesystems are mounted on one node and ALL imap activity happens on that node (hot standby) Up to EL5.6 this has been pretty unstable, panicing regularly under load and losing filesystem to a FC driver bug I'll describe separately. From ajb2 at mssl.ucl.ac.uk Tue Feb 15 20:36:16 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Tue, 15 Feb 2011 20:36:16 +0000 Subject: [Linux-cluster] QLA2xxx tagged queue bug. Message-ID: <4D5AE3C0.5060006@mssl.ucl.ac.uk> I'm documenting this in case anyone else gets bitten (This is supposed to have been fixed since October, but we encountered it in the last few days on RHEL5.6 - either it's not fully fixed or the patch has fallen out of the production kernel) We kept getting GFS and GFS2 filesystems mysteriously going dead with "input/output" errors over the last 2 years, which has been traced to a bug in qla2xxx: A QUEUE FULL or BUSY from the target results in a generic error being passed up to dm-multipath from the qla2xxx driver (instead of the driver backing off the queue size and trying again a few milliseconds later.) When Dm-multipath receives an error, it marks the path to the target "bad" and tries another path. If the queue full condition doesn't clear quickly there is a cascade of path failures followed by the target being marked as BAD when they've all failed. If "queue_if_no_path" isn't explicitly enabled in /etc/multipath,conf, that causes the i/o error symptoms described above. Even if the target's tagged queue recovers before all paths fail, there tends to be a big hiccup in GFS(2) operations. If multipathing's queue_if_no_path is enabled and the OS has to wait for the target to return, there will be an even longer glitch. Currently the only workaround available is to set the qla2xxx tagged queue depth to a very low value via module options. Qla2xxx's tagged queue depth is PER LUN, while most target tagged queues are PER DEVICE (eg: A Nexsan Satabeast presenting 6 luns has 255 commands in total, not per lun). It's pretty easy to end up with more requests coming out of the initiators than the targets can handle simultaneously. From ajb2 at mssl.ucl.ac.uk Tue Feb 15 20:45:09 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Tue, 15 Feb 2011 20:45:09 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5AE10C.9020106@mssl.ucl.ac.uk> References: <4D5ABEEC.30609@mssl.ucl.ac.uk> <4D5AE10C.9020106@mssl.ucl.ac.uk> Message-ID: <4D5AE5D5.1050002@mssl.ucl.ac.uk> > It would be really interesting how long the described backup takes when the gfs2 filesystem is mounted exclusively on one node without locking. 
The 2 million inode system backs up in about 30 minutes when mounted lock_nolock (0 file incremental backup using bacula) > For me it looks like you're facing a similar problem with gfs2 that has been worked around with gfs by introducing the glock_purge functionality that leads to a much smaller glock->dlm->hashtable and makes backups and the like much faster. Quite likely. Backup performance under GFS2 is slightly worse than with GFS. "ls -l" in a directory is _significantly_ worse and can take up to 4 minutes for a directory with 4000 files onboard (Remember: This is with the GFS2 filesystem mounted lock_dlm on one node only!) Compounding matters, we have a network /home - mounted on the fileservers and NFSv3 exported to ~150 RHEL5 desktops (Lots of small files, LOTS of random access). KDE, Openoffice, Thunderbird, Mozilla are all pretty lock/cachefile happy and hit the network /home export fairly hard, so when there's a performance issue the users get pretty noisy. From vincent.blondel at ing.be Tue Feb 15 20:50:04 2011 From: vincent.blondel at ing.be (vincent.blondel at ing.be) Date: Tue, 15 Feb 2011 21:50:04 +0100 Subject: [Linux-cluster] Two nodes DRBD - Fail-Over Actif/Passif Cluster. In-Reply-To: <294881FE3F4013418806F0CE6E73A7B6052F302466@VPNLCMS92081.europe.intranet> References: <294881FE3F4013418806F0CE6E73A7B6052F302466@VPNLCMS92081.europe.intranet> Message-ID: <294881FE3F4013418806F0CE6E73A7B6052F302474@ing.com> > Hello all, > > I just installed last week two servers, each of them with Redhat Linux Enterprise 6.0 on it for hosting in a near future Blue Coat Reporter. Installation is ok but now I am trying to configure these both servers in cluster. > > First of all, I never configured any cluster with Linux ... > > Servers are both HP DL380R06 with disk cabinet directly attached on it. (twice exactly same hardware specs). > > What I would like to get is simply getting an Actif/Passif clustering mode with bidirectional disk space synchronization. This means, both servers are running. Only, the first one is running Reporter. During this time, disk spaces are continuously synchronized. When first one is down, second one becomes actif and when first one is running again, it synchronizes the disks and becomes primary again. > > server 1 is reporter1.lab.intranet with ip 10.30.30.90 > server 2 is reporter2.lab.intranet with ip 10.30.30.91 > > the load balanced ip should be 10.30.30.92 .. > > After some days of research on the net, I came to the conclusion that I could be happy with a solution including, DRBD/GFS2 with Redhat Cluster Suite. > > I am first trying to get a complete picture running on two vmware fusion (Linux Redhat Enterprise Linux 6) on my macosx before configuring my real servers. > > So, after some hours of research on the net, I found some articles and links that seem to describe what I wanna get ... > > http://gcharriere.com/blog/?p=73 > http://www.linuxtopia.org/online_books/rhel6/rhel_6_cluster_admin/rhel_6_cluster_ch-config-cli-CA.html > http://www.drbd.org/users-guide/users-guide.html > > and the DRBD packages for RHEL6 that I did not find anywhere .. > > http://elrepo.org/linux/elrepo/el6/i386/RPMS/ > > I just only configured till now the first part, meaning cluster services but the first issue occur .. > > below the cluster.conf file ... > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > and this is the result I get on both servers ... 
>
> [root at reporter1 ~]# clustat
> Cluster Status for cluster @ Mon Feb 14 22:22:53 2011
> Member Status: Quorate
>
>  Member Name                ID   Status
>  ------ ----                ---- ------
>  reporter1.lab.intranet     1    Online, Local, rgmanager
>  reporter2.lab.intranet     2    Online, rgmanager
>
>  Service Name               Owner (Last)   State
>  ------- ----               ----- ------   -----
>  service:example_apache     (none)         stopped
>
> as you can see, everything is stopped or in other words nothing runs .. so my questions are:
>
> did I forget something in my conf file ?
> did I make something wrong in my conf file ?
> do I have to configure manually load balanced ip 10.30.30.92 as an alias ip on both sides or is it done automatically by redhat cluster ?
> I just made a simple try with apache but I do not find anywhere reference to the start/stop script for apache in the examples, is that normal ??
> do you have some best practice regarding this picture ??
>
> many thks to help me because I certainly have a bad understanding on some points.
> any idea to solve this problem, .. many thks ??
> Regards
> Vincent
----------------------------------------------------------------- From gordan at bobich.net Tue Feb 15 21:01:21 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 21:01:21 +0000 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5AC3C9.8000401@dbtgroup.com> References: "\"<4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com>" <4D59E3A1.4070508@logik-internet.rs>" <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs> <4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs> <6b22c9afe6618e0ab302dae12c4b74aa@sjolshagen.net> <4D5AC3C9.8000401@dbtgroup.com> Message-ID: <4D5AE9A1.9080604@bobich.net> On 02/15/2011 06:19 PM, yvette hirth wrote: > Thomas Sjolshagen wrote: > >> Just so you realize; If you intend to use clvm (i.e. lvme in a cluster >> where you expect to be able to write to the volume from more than one >> node at/around the same time w/o a full-on failover), you will _not_ >> have snapshot support. And no, this isn't "not supported" as in >> "nobody to call if you encounter a problem", it's "not supported" as >> in "the tools will not let you create a snapshot of the LV". > > i've been listening to this discussion with much interest, as we would > like to improve the currency of our backup files. > > right now we have an ensemble of GFS2 LV's ("pri") as our primary data > store, and a "matching ensemble" of XFS LV's ("bak") as our backup data > store. an hourly cron job rsync's all LV's in the ensemble from pri => > bak. it's incredibly reliable, but this reduces our mean backup currency > by 1/2 hour. one upside is that i've got snapshots that are only 1/2 > hour old, and are daily backed up to tape. > > the conversation seems to indicate that we can change the bak LV's from > XFS to GFS2 and have drbd auto-sync the pri LV changes made to the bak > LV's - yes? this would reduce our backup currency from a mean of 1/2 > hour to theoretically, "atomic" (more likely "mere seconds"). i assUme > we have to change from XFS to GFS2, as drbd doesn't appear to do file > system conversions... > > if our assumptions are correct, are there any guides / manuals / doc on > how to do this? it's most tempting to try, since if it doesn't work, the > hourly cron rsync's could be simply reinstated. I'm not sure you realize what this would require. DRBD is a block device. You would have to start with a new partition/disk, "format" it for DRBD (creates DRBD metadata on the block device), then create GFS on top of it and put your files in. It's a backup+restore job to migrate to and from it. If you were to do this, your backup node would have to be a part of your DRBD cluster (all nodes need to share the DRBD device, unless you plan to only use it on the SAN that all the nodes connect to the volume from). You would then drop the backup node out of the cluster completely and make sure it cannot reconnect (this is vitally important), mount the GFS FS from DRBD ro with lock_nolock, and then back that up. Unless you are happy with just a block level mirror, which won't help you if data is accidentally deleted. DRBD is network RAID1, and RAID (of any level) is not a replacement for backups - but I'm sure you know that. Gordan From gordan at bobich.net Tue Feb 15 21:09:14 2011 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 15 Feb 2011 21:09:14 +0000 Subject: [Linux-cluster] Two nodes DRBD - Fail-Over Actif/Passif Cluster. 
In-Reply-To: <294881FE3F4013418806F0CE6E73A7B6052F302474@ing.com> References: <294881FE3F4013418806F0CE6E73A7B6052F302466@VPNLCMS92081.europe.intranet> <294881FE3F4013418806F0CE6E73A7B6052F302474@ing.com> Message-ID: <4D5AEB7A.6090207@bobich.net> On 02/15/2011 08:50 PM, vincent.blondel at ing.be wrote: >> below the cluster.conf file ... >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> and this is the result I get on both servers ... >> >> [root at reporter1 ~]# clustat >> Cluster Status for cluster @ Mon Feb 14 22:22:53 2011 >> Member Status: Quorate >> >> Member Name ID Status >> ------ ---- ---- ------ >> reporter1.lab.intranet 1 Online, Local, rgmanager >> reporter2.lab.intranet 2 Online, rgmanager >> >> Service Name Owner (Last) State >> ------- ---- ----- ------ ----- >> service:example_apache (none) stopped >> >> as you can see, everything is stopped or in other words nothing runs .. so my question are : Having a read through /var/log/messages for possible causes would be a good start. >> do I have to configure manually load balanced ip 10.30.30.92 as an alias ip on both sides or is it done automatically by redhat cluster ? RHCS will automatically assign the IP to an interface that is on the same subnet. You most definitely shouldn't create the IP manually on any of the nodes. >> I just made a simple try with apache but I do not find anywhere reference to the start/stop script for apache in the examples, is that normal ?? >> do you have some best practice regarding this picture ?? I'm not familiar with the tag in cluster.conf, I usually configure most things as init script resources. Gordan From niks at logik-internet.rs Tue Feb 15 21:26:19 2011 From: niks at logik-internet.rs (Nikola Savic) Date: Tue, 15 Feb 2011 22:26:19 +0100 Subject: [Linux-cluster] Organizing 3 servers into cluster Message-ID: <4D5AEF7B.7030700@logik-internet.rs> Hello, I need to setup cluster using 3 servers. Thanks to everybody involved from this mailing list in previous post, we have concluded that DRBD+GFS2 is the best approach for building shared storage from local hard drives. It will enable mirroring of data between nodes using DRBD, and concurrent access to file systems thanks to GFS2. Main purpose of this cluster is hosting of single web site (web application). Main services we'll have are Web server (httpd) and MySQL. We also use memcached for shared session and caching. Cluster should provide following benefits: - High Availability - High Performance (balancing of web application execution on cluster nodes) - Traffic balancing This means that all 3 servers will execute web application and provide content to visitors. We didn't plan to use load balancers in front of web servers, because traffic balancing is important. That is why DNS round-robin approach was planned, which we already use for two server architecture used at moment. Web server on each node will be directly accessed by visitors, spread by use of DNS round-robin. One of servers will have MySQL Master used for writing, while other two will have MySQL Slave instances for reading. Each MySQL instance will execute on separate data on shared storage. I was confused by following line in RedHat's Cluster documentation, related to High Availablity: "An HA service can run on only one cluster node at a time to maintain data integrity". Does this mean that web servers can not work in parallel on all cluster nodes? 
Or, is this limitaion related to combination of IP address and service (eg. web server on IP 10.1.1.1)? When MySQL is in question, I have even more doubts on how to implement HA automatically. Failure of node where MySQL Master executes should result in automatic start of MySQL service on different node. In our configuration with 3 servers, there is no spare node, so MySQL Master and one of MySQL Slave instances should run on same server. If two nodes fail, all three MySQL instances will fail to single node :). If I understand MySQL docs well, this is not problem, but each instance must use different port, socket, data folders (which we already have separated). I didn't notice that MySQL instance can connect to specific IP. Does anyone have experience with this kind of setup? I know that one can run more than one instance of MySQL on single server, executed on different sets of data and connected to different ports. However, I'm not sure if it's possible to setup and if it's stable in cluster environment with HA? if it's possible, what are things I should take care of? Finally, for setup like this would you use available servers (3 of 'em in my case) as cluster nodes, or would you use 'em as bare metal machines for virtual servers with specific roles (eg. web server, MySQL master, Memcached, etc.), that can be easily moved from one base metal machine to other, in case of failure? Best Regards, Nikola -------------- next part -------------- An HTML attachment was scrubbed... URL: From ajb2 at mssl.ucl.ac.uk Tue Feb 15 21:35:37 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Tue, 15 Feb 2011 21:35:37 +0000 Subject: [Linux-cluster] Linux-cluster Digest, Vol 82, Issue 20 In-Reply-To: References: Message-ID: <4D5AF1A9.3000307@mssl.ucl.ac.uk> >> I'm seeing heartbeat/lock lan traffic peak out at about 120kb/s and >> 4000pps per node at the moment. Clearly the switch isn't the problem - >> and using hardware acclerated igb devices I'm pretty sure the >> networking's fine too. >> > During the actual workload, or just during the ping pong test? During the actual workload. From list at fajar.net Tue Feb 15 22:09:36 2011 From: list at fajar.net (Fajar A. Nugraha) Date: Wed, 16 Feb 2011 05:09:36 +0700 Subject: [Linux-cluster] Organizing 3 servers into cluster In-Reply-To: <4D5AEF7B.7030700@logik-internet.rs> References: <4D5AEF7B.7030700@logik-internet.rs> Message-ID: On Wed, Feb 16, 2011 at 4:26 AM, Nikola Savic wrote: > > ? Hello, > > ? I need to setup cluster using 3 servers. Thanks to everybody involved from > this mailing list in previous post, we have concluded that DRBD+GFS2 is the > best approach for building shared storage from local hard drives. It will > enable mirroring of data between nodes using DRBD, and concurrent access to > file systems thanks to GFS2. > > ? Main purpose of this cluster is hosting of single web site (web > application). Main services we'll have are Web server (httpd) and MySQL. We > also use memcached for shared session and caching. Cluster should provide > following benefits: > - High Availability > - High Performance (balancing of web application execution on cluster nodes) > - Traffic balancing Before you get your hopes too high, make sure you test it first. DRBD will have some performance penalty compared to plain local block device, and GFS (or any other cluster file system) will have some performance penalty compared to ext3/4. Then there's also the additional layer of complexity involved (e.g.fencing, cluster service, etc.). 
Whether or not the penalty is acceptable depends on your needs. Depending on your needs, it might be possible that the "best" setup would be to dedicate one of the nodes as NAS server using nfs4 on top of ext4 for the other two nodes, and setup two floating IPs with something like vrrp. > > ? This means that all 3 servers will execute web application and provide > content to visitors. We didn't plan to use load balancers in front of web > servers, because traffic balancing is important. That is why DNS round-robin > approach was planned, which we already use for two server architecture used > at moment. Web server on each node will be directly accessed by visitors, > spread by use of DNS round-robin. One of servers will have MySQL Master used > for writing, while other two will have MySQL Slave instances for reading. > Each MySQL instance will execute on separate data on shared storage. > > ? I was confused by following line in RedHat's Cluster documentation, > related to High Availablity: "An HA service can run on only one cluster node > at a time to maintain data integrity". Does this mean that web servers can > not work in parallel on all cluster nodes? Or, is this limitaion related to > combination of IP address and service (eg. web server on IP 10.1.1.1)? I believe it's also related to what filesystem and what service you use. When you use ext3/4 for storage, obviously it can only be mounted on one node. Similar thing with application, MySQL requires exclusive access to its data directory while httpd has no problem sharing it's DocumentRoot with other http instances. -- Fajar From sachinbhugra at hotmail.com Tue Feb 15 22:24:34 2011 From: sachinbhugra at hotmail.com (sachin) Date: Wed, 16 Feb 2011 03:54:34 +0530 Subject: [Linux-cluster] Cluster node hangs In-Reply-To: References: <4D57A763.8030700@redhat.com> <4D57A9F3.90408@redhat.com> Message-ID: Sorry for the delay friends. Actually, logs are scattered in different log files: 1. For rgmamager logs I have configured /var/log/cluster.log 2. Other cluster logs are going to messages file. Presently I am trying to find a way using which I can gather all the logs under one file other than messages. Seems I can use feature in cluster.conf, comments?? I am having openldap logging enabled on this server which is also using local4 facility and logs from cluster and ldap are getting mixed up. From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of dOminic Sent: Sunday, February 13, 2011 8:03 PM To: linux clustering Subject: Re: [Linux-cluster] Cluster node hangs Hi, Whats the msg you are getting in logs ?. It would be great if you could attach log mesgs along with cluster.conf -dominic On Sun, Feb 13, 2011 at 3:49 PM, Sachin Bhugra wrote: Thank for the reply and link. However, GFS2 is not listed in fstab, it is only handled by cluster config. _____ Date: Sun, 13 Feb 2011 10:52:51 +0100 From: ekuric at redhat.com To: linux-cluster at redhat.com Subject: Re: [Linux-cluster] Cluster node hangs On 02/13/2011 10:41 AM, Elvir Kuric wrote: On 02/13/2011 10:14 AM, Sachin Bhugra wrote: Hi , I have setup a two node cluster in lab, with Vmware Server, and hence used manual fencing. It includes a iSCSI GFS2 partition and it service Apache in Active/Passive mode. Cluster works and I am able to relocate service between nodes with no issues. However, the problem comes when I shutdown the node, for testing, which is presently holding the service. 
When the node becomes unavailable, service gets relocated and GFS partition gets mounted on the other node, however it is not accessible. If I try to do a "ls/du" on GFS partition, the command hangs. On the other hand the node which was shutdown gets stuck at "unmounting file system". I tried using fence_manual -n nodename and then fence_ack_manual -n nodename, however it still remains the same. Can someone please help me is what I am doing wrong? Thanks, -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster It would be good to see /etc/fstab configuration used on cluster nodes. If /gfs partition is mounted manually it will not be unmounted correctly in case you restart node ( and not executing umount prior restart ), and will hang during shutdown/reboot process. More at: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Glo bal_File_System_2/index.html Edit: above link, section 3.4 Special Considerations when Mounting GFS2 File Systems Regards, Elvir -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent.blondel at ing.be Wed Feb 16 05:55:35 2011 From: vincent.blondel at ing.be (vincent.blondel at ing.be) Date: Wed, 16 Feb 2011 06:55:35 +0100 Subject: [Linux-cluster] Two nodes DRBD - Fail-Over Actif/Passif Cluster. In-Reply-To: <4D5AEB7A.6090207@bobich.net> References: <294881FE3F4013418806F0CE6E73A7B6052F302466@VPNLCMS92081.europe.intranet> <294881FE3F4013418806F0CE6E73A7B6052F302474@ing.com> <4D5AEB7A.6090207@bobich.net> Message-ID: <294881FE3F4013418806F0CE6E73A7B6052F302477@ing.com> >>> below the cluster.conf file ... >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> and this is the result I get on both servers ... >>> >>> [root at reporter1 ~]# clustat >>> Cluster Status for cluster @ Mon Feb 14 22:22:53 2011 >>> Member Status: Quorate >>> >>> Member Name ID Status >>> ------ ---- ---- ------ >>> reporter1.lab.intranet 1 Online, Local, rgmanager >>> reporter2.lab.intranet 2 Online, rgmanager >>> >>> Service Name Owner (Last) State >>> ------- ---- ----- ------ ----- >>> service:example_apache (none) stopped >>> >>> as you can see, everything is stopped or in other words nothing runs .. so my question are : > >Having a read through /var/log/messages for possible causes would be a >good start. > this is what I see in the /var/log/messages file ... Feb 16 07:36:54 reporter1 corosync[1250]: [MAIN ] Corosync Cluster Engine ('1.2.3'): started and ready to provide service. Feb 16 07:36:54 reporter1 corosync[1250]: [MAIN ] Corosync built-in features: nss rdma Feb 16 07:36:54 reporter1 corosync[1250]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf Feb 16 07:36:54 reporter1 corosync[1250]: [MAIN ] Successfully parsed cman config Feb 16 07:36:54 reporter1 corosync[1250]: [TOTEM ] Initializing transport (UDP/IP). Feb 16 07:36:54 reporter1 corosync[1250]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). 
Feb 16 07:36:55 reporter1 corosync[1250]: [TOTEM ] The network interface [10.30.30.90] is now up. Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] Using quorum provider quorum_cman Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 Feb 16 07:36:55 reporter1 corosync[1250]: [CMAN ] CMAN 3.0.12 (built Aug 17 2010 14:08:49) started Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90 Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: openais checkpoint service B.01.01 Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync extended virtual synchrony service Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync configuration service Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync cluster config database access v1.01 Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync profile loading service Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] Using quorum provider quorum_cman Feb 16 07:36:55 reporter1 corosync[1250]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 Feb 16 07:36:55 reporter1 corosync[1250]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine. Feb 16 07:36:55 reporter1 corosync[1250]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Feb 16 07:36:55 reporter1 corosync[1250]: [CMAN ] quorum regained, resuming activity Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] This node is within the primary component and will provide service. Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] Members[1]: 1 Feb 16 07:36:55 reporter1 corosync[1250]: [QUORUM] Members[1]: 1 Feb 16 07:36:55 reporter1 corosync[1250]: [CPG ] downlist received left_list: 0 Feb 16 07:36:55 reporter1 corosync[1250]: [CPG ] chosen downlist from node r(0) ip(10.30.30.90) Feb 16 07:36:55 reporter1 corosync[1250]: [MAIN ] Completed service synchronization, ready to provide service. Feb 16 07:36:56 reporter1 fenced[1302]: fenced 3.0.12 started Feb 16 07:36:57 reporter1 dlm_controld[1319]: dlm_controld 3.0.12 started Feb 16 07:36:57 reporter1 gfs_controld[1374]: gfs_controld 3.0.12 started Feb 16 07:37:03 reporter1 corosync[1250]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Feb 16 07:37:03 reporter1 corosync[1250]: [QUORUM] Members[2]: 1 2 Feb 16 07:37:03 reporter1 corosync[1250]: [QUORUM] Members[2]: 1 2 Feb 16 07:37:03 reporter1 corosync[1250]: [CPG ] downlist received left_list: 0 Feb 16 07:37:03 reporter1 corosync[1250]: [CPG ] downlist received left_list: 0 Feb 16 07:37:03 reporter1 corosync[1250]: [CPG ] chosen downlist from node r(0) ip(10.30.30.90) >>> do I have to configure manually load balanced ip 10.30.30.92 as an alias ip on both sides or is it done automatically by redhat cluster ? > >RHCS will automatically assign the IP to an interface that is on the >same subnet. You most definitely shouldn't create the IP manually on any >of the nodes. > >>> I just made a simple try with apache but I do not find anywhere reference to the start/stop script for apache in the examples, is that normal ?? >>> do you have some best practice regarding this picture ?? 
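Gordan's answer just below suggests handling apache as an init-script resource; for reference, that usually ends up as a fragment like the following in the rm section of cluster.conf. This is an illustrative sketch only: the node names and the 10.30.30.92 service address are taken from the posts above, the failover domain name is made up, and it has not been run through ccs_config_validate:

  <rm>
    <failoverdomains>
      <failoverdomain name="example_fd" ordered="0" restricted="0">
        <failoverdomainnode name="reporter1.lab.intranet"/>
        <failoverdomainnode name="reporter2.lab.intranet"/>
      </failoverdomain>
    </failoverdomains>
    <service autostart="1" domain="example_fd" name="example_apache" recovery="relocate">
      <ip address="10.30.30.92" monitor_link="1"/>
      <script file="/etc/init.d/httpd" name="httpd-init"/>
    </service>
  </rm>

rgmanager then simply calls the init script with start/stop/status, so a service that sits in the "stopped" state is very often an init script whose status exit code rgmanager does not accept.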
> >I'm not familiar with the tag in cluster.conf, I usually >configure most things as init script resources. > >Gordan
From shariq.siddiqui at yahoo.com Wed Feb 16 10:04:00 2011 From: shariq.siddiqui at yahoo.com (Shariq Siddiqui) Date: Wed, 16 Feb 2011 02:04:00 -0800 (PST) Subject: [Linux-cluster] RAW Devices performance issue Message-ID: <506336.34162.qm@web39801.mail.mud.yahoo.com> Dear All, I am going to install Oracle RAC on two servers with shared SAN storage (servers and storage are IBM). OS = RHEL 5u5 x86_64. We used multipathing and created multipath devices, i.e. /dev/mapper/mpath1. Then I created the raw device /dev/raw/raw1 on top of the /dev/mapper/mpath1 block device, as per the prerequisites for Oracle Clusterware. Everything looks good, but we are facing the following performance issue: when we run the command #dd if=/dev/zero of=/dev/mapper/mpath1 bs=1024 count=1000 the write rate is approx. 34 MB/s, but if we run #dd if=/dev/zero of=/dev/raw/raw1 bs=1024 count=1000 the write rate is very slow, around 253 KB/s. Please advise how to tune the performance. Best Regards, Shariq Siddiqui
From list at fajar.net Wed Feb 16 10:34:07 2011 From: list at fajar.net (Fajar A.
Nugraha) Date: Wed, 16 Feb 2011 17:34:07 +0700 Subject: [Linux-cluster] RAW Devices performance issue In-Reply-To: <506336.34162.qm@web39801.mail.mud.yahoo.com> References: <506336.34162.qm@web39801.mail.mud.yahoo.com> Message-ID: On Wed, Feb 16, 2011 at 5:04 PM, Shariq Siddiqui wrote: > > > Dear All, > > I am going to install Oracle RAC on two Servers, With shared SAN storage (Servers and Storage is IBM) > OS = RHEL 5u5 x64 bit > > And we used multipathing mechanism and created multipathing devices. > i.e. /dev/mapper/mpath1. > > Then I created raw device /dev/raw/raw1 of this /dev/mapper/mpath1 Block device as per pre-reqs for Oracle Cluster. > > Every thing looks good, But we faced the performance issue as under... > > when we run command : > #dd if=/dev/zero of=/dev/mapper/mpath1 bs=1024 count=1000 > the writing rate is approx. 34 MB/s > > But If we run command > #dd if=/dev/zero of=/dev/raw/raw1 bs=1024 count=1000 > the writing rate is very slow like 253 KB/s > > Please advice how to tune the performance. Shouldn't you ask Oracle about that? My GUESS is that in the first one the I/O is buffered, while in the second /dev/raw/raw* is simply a block device opened with O_DIRECT (thus bypassing buffer cache). You may want to retry dd with "oflag=direct" and compare the results. You might want to look at http://en.wikipedia.org/wiki/Raw_device http://download.oracle.com/docs/cd/B19306_01/relnotes.102/b15659/toc.htm#CJAICHEG Again, better ask Oracle if you want to be sure.
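To make the suggested re-test concrete: the 253 KB/s figure is usually a property of doing 1 KB O_DIRECT writes, not of the raw binding itself. A hedged sketch of the comparison (device names as in the original post; both commands overwrite the start of that LUN, so only run them on scratch storage):

  # apples-to-apples: force O_DIRECT on the multipath device as well
  dd if=/dev/zero of=/dev/mapper/mpath1 bs=1024 count=1000 oflag=direct
  # then repeat both paths with a larger block size
  dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=100
  dd if=/dev/zero of=/dev/mapper/mpath1 bs=1M count=100 oflag=direct

If the large-block numbers converge, the raw device is behaving normally and the original gap was simply the missing page cache at bs=1024.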
-- Fajar From stefan at lsd.co.za Wed Feb 16 10:36:37 2011 From: stefan at lsd.co.za (Stefan Lesicnik) Date: Wed, 16 Feb 2011 12:36:37 +0200 (SAST) Subject: [Linux-cluster] RAW Devices performance issue In-Reply-To: <506336.34162.qm@web39801.mail.mud.yahoo.com> Message-ID: <844193781.8667.1297852597836.JavaMail.root@zcs-jhb-lsd> ----- Original Message ----- > From: "Shariq Siddiqui" > To: linux4oracle at yahoogroups.com, linux-cluster at redhat.com > Sent: Wednesday, 16 February, 2011 12:04:00 PM > Subject: [Linux-cluster] RAW Devices performance issue > Dear All, > > I am going to install Oracle RAC on two Servers, With shared SAN > storage (Servers and Storage is IBM) > OS = RHEL 5u5 x64 bit > > And we used multipathing mechanism and created multipathing devices. > i.e. /dev/mapper/mpath1. > > Then I created raw device /dev/raw/raw1 of this /dev/mapper/mpath1 > Block device as per pre-reqs for Oracle Cluster. > > Every thing looks good, But we faced the performance issue as under... > > when we run command : > #dd if=/dev/zero of=/dev/mapper/mpath1 bs=1024 count=1000 > the writing rate is approx. 34 MB/s > > But If we run command > #dd if=/dev/zero of=/dev/raw/raw1 bs=1024 count=1000 > the writing rate is very slow like 253 KB/s > > Please advice how to tune the performance. Hi, I dont know anything about using raw devices, but I do know the write speed through the multipath device for the SAN is slow. Try fix that performance first - check SAN cache write is enabled, check your raid levels and over how many disks. I cant say what you should get, but i've seen local non raided disks write much faster Stefan From swhiteho at redhat.com Wed Feb 16 11:13:38 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 16 Feb 2011 11:13:38 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <13588217.76.1297800448396.JavaMail.marc@mobilix-20> References: <13588217.76.1297800448396.JavaMail.marc@mobilix-20> Message-ID: <1297854818.2464.20.camel@dolmen> Hi, On Tue, 2011-02-15 at 21:07 +0100, Marc Grimme wrote: > Hi Steve, > I think lately I observed a very similar behavior with RHEL5 and gfs2. > It was a gfs2 filesystem that had about 2Mio files with sum of 2GB in a directory. When I did a du -shx . in this directory it took about 5 Minutes (noatime mountoption given). Independently on how much nodes took part in the cluster (in the end I only tested with one node). This was only for the first time running all later executed du commands were much faster. > When I mounted the exact same filesystem with lockproto=lock_nolock it took about 10-20 seconds to proceed with the same command. > > Next I started to analyze this with oprofile and observed the following result: > > opreport --long-file-names: > CPU: AMD64 family10, speed 2900.11 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000 > samples % symbol name > 200569 46.7639 search_rsb_list The resource table size is by default 256 entries in size. Assuming that you have enough ram that all 4m locks (for 2m files) are in memory at the same time, that is approx 15625 resources per hash chain, so it would make sense that this would start to slow things down a bit. There is a config option to increase the resource table size though, so perhaps you could try that? > 118905 27.7234 create_lkb This reads down a hash chain in the lkb table. That table is larger by default (1024), which is probably why there is less cpu time burned here. 
On the other hand, the hash chain might be read more than once if there is a collision on the lock ids. Again it is a config option, so it should be possible to increase the size of the table. > 32499 7.5773 search_bucket > 4125 0.9618 find_lkb > 3641 0.8489 process_send_sockets > 3420 0.7974 dlm_scan_rsbs > 3184 0.7424 _request_lock > 3012 0.7023 find_rsb > 2735 0.6377 receive_from_sock > 2610 0.6085 _receive_message > 2543 0.5929 dlm_allocate_rsb > 2299 0.5360 dlm_hash2nodeid > 2228 0.5195 _create_message > 2180 0.5083 dlm_astd > 2163 0.5043 dlm_find_lockspace_global > 2109 0.4917 dlm_find_lockspace_local > 2074 0.4836 dlm_lowcomms_get_buffer > 2060 0.4803 dlm_lock > 1982 0.4621 put_rsb > .. > > opreport --image /gfs2 > CPU: AMD64 family10, speed 2900.11 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000 > samples % symbol name > 9310 15.5600 search_bucket This should get better in RHEL6.1 and above, due to the new design of glock hash table. The patch is already in upstream. The glock hash table is much larger than the dlm hash table, though there are still scalability issues due to the locking and that we cannot currently grow the hash table. > 6268 10.4758 do_promote The result in do_promote is interesting, as I wouldn't have expected that to show up here really, so I'll look into that when I have a moment and try to figure out what is going on. > 2704 4.5192 gfs2_glock_put > 2289 3.8256 gfs2_glock_hold > 2286 3.8206 gfs2_glock_schedule_for_reclaim > 2204 3.6836 gfs2_glock_nq > 2204 3.6836 run_queue > 2001 3.3443 gfs2_holder_wake > .. > > opreport --image /dlm > CPU: AMD64 family10, speed 2900.11 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000 > samples % symbol name > 200569 46.7639 search_rsb_list > 118905 27.7234 create_lkb > 32499 7.5773 search_bucket > 4125 0.9618 find_lkb > 3641 0.8489 process_send_sockets > 3420 0.7974 dlm_scan_rsbs > 3184 0.7424 _request_lock > 3012 0.7023 find_rsb > 2735 0.6377 receive_from_sock > 2610 0.6085 _receive_message > 2543 0.5929 dlm_allocate_rsb > 2299 0.5360 dlm_hash2nodeid > 2228 0.5195 _create_message > .. > > This very much reminded me on a similar test we've done years ago with gfs (see http://www.open-sharedroot.org/Members/marc/blog/blog-on-dlm/red-hat-dlm-__find_lock_by_id/profile-data-with-diffrent-table-sizes). > > Does this not show that during the du command 46% of the time the kernel stays in the dlm:search_rsb_list function while looking out for locks. It still looks like the hashtable for the lock in dlm is much too small and searching inside the hashmap is not constant anymore? > > I would be really interesting how long the described backup takes when the gfs2 filesystem is mounted exclusively on one node without locking. > For me it looks like you're facing a similar problem with gfs2 that has been worked around with gfs by introducing the glock_purge functionality that leads to a much smaller glock->dlm->hashtable and makes backups and the like much faster. > > I hope this helps. > > Thanks and regards > Marc. > Many thanks for this information, it is really helpful to get feedback like this which helps identify issues in the code, Steve. 
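For anyone wanting to reproduce the measurement that started this sub-thread, the comparison is roughly as follows; the device and mount point are placeholders, and the lock_nolock pass is only safe while every other node has the filesystem unmounted:

  # cold-cache run under normal cluster locking
  echo 3 > /proc/sys/vm/drop_caches
  time du -shx /mnt/gfs2test

  # remount without cluster locking -- single node ONLY
  umount /mnt/gfs2test
  mount -t gfs2 -o lockproto=lock_nolock /dev/vg_test/lv_test /mnt/gfs2test
  echo 3 > /proc/sys/vm/drop_caches
  time du -shx /mnt/gfs2test

The difference between the two runs is the part that the DLM and glock table discussion below is trying to shrink.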
From chauhan.anujsingh at gmail.com Wed Feb 16 11:54:37 2011 From: chauhan.anujsingh at gmail.com (Anuj) Date: Wed, 16 Feb 2011 17:24:37 +0530 Subject: [Linux-cluster] Linux-cluster Digest, Vol 82, Issue 19 In-Reply-To: References: Message-ID: Hi, Hello to all ! will you please guide me how can do a practice of clustrign as well as loadbalancer for testing enviorment can all of you please guide me what are the basic requirements i have three centos machine apache,Mysql and postfix is runing on these machines -- *Regards.*.// Anuj Singh Chauhan (Voice): 09013203509* * On Tue, Feb 15, 2011 at 10:30 PM, wrote: > Send Linux-cluster mailing list submissions to > linux-cluster at redhat.com > > To subscribe or unsubscribe via the World Wide Web, visit > https://www.redhat.com/mailman/listinfo/linux-cluster > or, via email, send a message with subject or body 'help' to > linux-cluster-request at redhat.com > > You can reach the person managing the list at > linux-cluster-owner at redhat.com > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Linux-cluster digest..." > > > Today's Topics: > > 1. Re: Cluster with shared storage on low budget (Gordan Bobic) > 2. Re: Cluster with shared storage on low budget (Jeff Sturm) > 3. Re: Cluster with shared storage on low budget (Gordan Bobic) > 4. Re: Cluster with shared storage on low budget (Bob Peterson) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 15 Feb 2011 13:31:42 +0000 > From: Gordan Bobic > To: linux clustering > Subject: Re: [Linux-cluster] Cluster with shared storage on low budget > Message-ID: <4D5A803E.5090106 at bobich.net> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Nikola Savic wrote: > > Gordan Bobic wrote: > >> Something else just occurs to me - you mentioned MySQL. You do realize > >> that the performance of it will be attrocious on a shared cluster file > >> system (ANY shared cluster file system), right? Unless you only intend > >> to run mysqld on a single node at a time (in which case there's no > >> point in putting it on a cluster file system). > > > > MySQL Master and Slave(s) will run on single node. No two MySQL > > instances will run on same set of data. Shared storage for MySQL data > > should enable easier movement of MySQL instance between nodes. Eg. when > > MySQL master needs to be moved from one node to other, I assume it would > > be easier with DRBD, because I would "only" need to stop MySQL on one > > node and start it on other configured to use same set of data. > > There is a better way to do that. Run DRBD in active-passive mode, and > grab the fail-over scripts from heartbeat. Then set up a dependency in > cluster.conf that will handle a combined service of DRBD disk (handling > active/passive switch), file system (mounting the fs once the DRBD > becomes active locally, and mysql. You define them as dependant on each > other in cluster.conf by suitable nesting. > > > Additionally, floating IP address assigned to MySQL master would need to > > be re-assigned to new node. > > You can make that IP a part of the dependency stack mentioned above. > > > Slaves would also need to be restarted to > > connect to new master. Even without floating IP used only my MySQL > > Master, slaves and web application can easily be reconfigured to use new > > IP. Do you see problem in this kind of setup? 
> > If the IP fails over and the FS is consistent you don't need to change > any configs - MySQL slaves will re-try connecting until they succeed. > Just make sure your bin-logs are on the same mount as the rest of MySQL, > since they have to fail over with the rest of the DB. > > Gordan > > > > ------------------------------ > > Message: 2 > Date: Tue, 15 Feb 2011 10:55:43 -0500 > From: Jeff Sturm > To: linux clustering > Subject: Re: [Linux-cluster] Cluster with shared storage on low budget > Message-ID: > <64D0546C5EBBD147B75DE133D798665F0855C0F4 at hugo.eprize.local> > Content-Type: text/plain; charset="us-ascii" > > > -----Original Message----- > > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] > > On Behalf Of Gordan Bobic > > Sent: Tuesday, February 15, 2011 7:05 AM > > > > Volume resizing is, IMO, over-rated and unnecessary in most cases, > except where data > > growth is quite mind-boggling (in which case you won't be using MySQL > anyway). > > We actually resize volumes often. Some of our storage volumes have 30 > LUNs or more. We have so many because we've virtualized most of our > infrastructure, and some of the hosts are single-purpose hosts. > > We don't want to allocate too more storage in advance, simply because > it's easier to grow than to shrink. Stop the host, grow the volume, > e2fsck/resize2fs, start up and go. Much nicer than increasing disk > capacity on physical hosts. > > CLVM works well for this, but that's about all it's good for IMHO. I > prefer to use the SAN's native volume management over CLVM when > available. > > Haven't tried DRBD yet but I'm really tempted... it sounds like it has > come a long way since its modest beginnings. > > -Jeff > > > > > > ------------------------------ > > Message: 3 > Date: Tue, 15 Feb 2011 16:17:03 +0000 > From: Gordan Bobic > To: linux clustering > Subject: Re: [Linux-cluster] Cluster with shared storage on low budget > Message-ID: <4D5AA6FF.8080608 at bobich.net> > Content-Type: text/plain; charset=windows-1252; format=flowed > > Jeff Sturm wrote: > >> -----Original Message----- > >> From: linux-cluster-bounces at redhat.com > > [mailto:linux-cluster-bounces at redhat.com] > >> On Behalf Of Gordan Bobic > >> Sent: Tuesday, February 15, 2011 7:05 AM > >> > >> Volume resizing is, IMO, over-rated and unnecessary in most cases, > > except where data > >> growth is quite mind-boggling (in which case you won't be using MySQL > > anyway). > > > > We actually resize volumes often. Some of our storage volumes have 30 > > LUNs or more. We have so many because we've virtualized most of our > > infrastructure, and some of the hosts are single-purpose hosts. > > > > We don't want to allocate too more storage in advance, simply because > > it's easier to grow than to shrink. Stop the host, grow the volume, > > e2fsck/resize2fs, start up and go. Much nicer than increasing disk > > capacity on physical hosts. > > Seems labour and downtime intensive to me. Maybe I'm just used to > environments where that is an unacceptable tradeoff vs. ?40/TB for storage. > > Not to mention that it makes you totally reliant on SAN level > redundancy, which I also generally deem unacceptable except on very high > end SANs that have mirroring features. > > Additionally, considering you can self-build a multi-TB iSCSI SAN for a > few hundred ?/$/? 
which will have volume growing ability (use sparse > files for iSCSI volumes and write a byte to a greater offset), I cannot > really see any justification whatsoever for using LVM with SAN based > storage. > > > Haven't tried DRBD yet but I'm really tempted... it sounds like it has > > come a long way since its modest beginnings. > > Not sure how far back you are talking about but I have been using it in > production in both active-active and active-passive configurations since > at least 2007 with no problems. From the usage point of view, the > changes have been negligible. > > Gordan > > > > ------------------------------ > > Message: 4 > Date: Tue, 15 Feb 2011 11:24:26 -0500 (EST) > From: Bob Peterson > To: linux clustering > Subject: Re: [Linux-cluster] Cluster with shared storage on low budget > Message-ID: > < > 263367529.33108.1297787066881.JavaMail.root at zmail06.collab.prod.int.phx2.redhat.com > > > > Content-Type: text/plain; charset=utf-8 > > ----- Original Message ----- > | We don't want to allocate too more storage in advance, simply because > | it's easier to grow than to shrink. Stop the host, grow the volume, > | e2fsck/resize2fs, start up and go. Much nicer than increasing disk > | capacity on physical hosts. > > These might be good for ext3/4, but with gfs and gfs2 you can lvresize > and gfs2_grow while the lv is mounted. In fact, we expect it. > Just make sure the vg has the clustered bit set (vgchange -cy) first. > > Regards, > > Bob Peterson > Red Hat File Systems > > > > ------------------------------ > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > End of Linux-cluster Digest, Vol 82, Issue 19 > ********************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ajb2 at mssl.ucl.ac.uk Wed Feb 16 12:02:32 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Wed, 16 Feb 2011 12:02:32 +0000 Subject: [Linux-cluster] optimising DLM speed? Message-ID: <4D5BBCD8.9010808@mssl.ucl.ac.uk> > There is a config option to increase the resource table size though, so perhaps you could try that? ..details? From swhiteho at redhat.com Wed Feb 16 12:57:27 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 16 Feb 2011 12:57:27 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5BBCD8.9010808@mssl.ucl.ac.uk> References: <4D5BBCD8.9010808@mssl.ucl.ac.uk> Message-ID: <1297861047.2522.4.camel@dolmen> Hi, On Wed, 2011-02-16 at 12:02 +0000, Alan Brown wrote: > > There is a config option to increase the resource table size though, > so perhaps you could try that? > > ..details? > > You can set it via the configfs interface: echo "4096" > /sys/kernel/config/dlm//cluster/rsbtbl_size It doesn't change once a lockspace has been created, so the new table size needs to be set before mounting the filesystem, otherwise it will not take effect. The size must be a power of two. Likewise the lkbtbl_size can be set the same way, Steve. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From ajb2 at mssl.ucl.ac.uk Wed Feb 16 14:12:30 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Wed, 16 Feb 2011 14:12:30 +0000 Subject: [Linux-cluster] optimising DLM speed? 
Message-ID: <4D5BDB4E.3010006@mssl.ucl.ac.uk> > You can set it via the configfs interface: Given 24Gb ram, 100 filesystems, several hundred million of files and the usual user habits of trying to put 100k files in a directory: Is 24Gb enough or should I add more memory? (96Gb is easy, beyond that is harder) What would you consider safe maximums for these settings? What about the following parameters? buffer_size dirtbl_size From swhiteho at redhat.com Wed Feb 16 17:33:22 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 16 Feb 2011 17:33:22 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5BDB4E.3010006@mssl.ucl.ac.uk> References: <4D5BDB4E.3010006@mssl.ucl.ac.uk> Message-ID: <1297877602.2522.69.camel@dolmen> Hi, On Wed, 2011-02-16 at 14:12 +0000, Alan Brown wrote: > > You can set it via the configfs interface: > > Given 24Gb ram, 100 filesystems, several hundred million of files and > the usual user habits of trying to put 100k files in a directory: > > Is 24Gb enough or should I add more memory? (96Gb is easy, beyond that > is harder) > The more memory you add, the greater the potential for caching large numbers of inodes, which in turn implies larger numbers of dlm locks. So you are much more likely to see these issues with large ram sizes. If you can easily do 96G, then I'd say start with that. > What would you consider safe maximums for these settings? > That is a more tricky question. There might be some issues if you go above 2^16 hash buckets due to the way in which dlm organises its hash buckets. Dave Teigland can give you more info on that. > What about the following parameters? > > buffer_size I doubt that this will need adjusting. > dirtbl_size That might need adjusting too, although it didn't appear to be significant on the profile results, Steve. > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From teigland at redhat.com Wed Feb 16 17:52:27 2011 From: teigland at redhat.com (David Teigland) Date: Wed, 16 Feb 2011 12:52:27 -0500 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <13588217.76.1297800448396.JavaMail.marc@mobilix-20> References: <1297794020.2711.21.camel@dolmen> <13588217.76.1297800448396.JavaMail.marc@mobilix-20> Message-ID: <20110216175227.GB2291@redhat.com> On Tue, Feb 15, 2011 at 09:07:31PM +0100, Marc Grimme wrote: > Hi Steve, > I think lately I observed a very similar behavior with RHEL5 and gfs2. > It was a gfs2 filesystem that had about 2Mio files with sum of 2GB in a directory. When I did a du -shx . in this directory it took about 5 Minutes (noatime mountoption given). Independently on how much nodes took part in the cluster (in the end I only tested with one node). This was only for the first time running all later executed du commands were much faster. > When I mounted the exact same filesystem with lockproto=lock_nolock it took about 10-20 seconds to proceed with the same command. > > Next I started to analyze this with oprofile and observed the following result: > > opreport --long-file-names: > CPU: AMD64 family10, speed 2900.11 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000 > samples % symbol name > 200569 46.7639 search_rsb_list > 118905 27.7234 create_lkb Hi Marc, thanks for sending this again, I remember that you pointed these out a long time ago, but had forgotten just how bad those searches were. 
I really do need to do some optimizing there. > This very much reminded me on a similar test we've done years ago with > gfs (see http://www.open-sharedroot.org/Members/marc/blog/blog-on-dlm/red-hat-dlm-__find_lock_by_id/profile-data-with-diffrent-table-sizes). > > Does this not show that during the du command 46% of the time the kernel > stays in the dlm:search_rsb_list function while looking out for locks. > It still looks like the hashtable for the lock in dlm is much too small > and searching inside the hashmap is not constant anymore? We should definately check if the default hash table sizes should be increased. Dave From teigland at redhat.com Wed Feb 16 17:58:48 2011 From: teigland at redhat.com (David Teigland) Date: Wed, 16 Feb 2011 12:58:48 -0500 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5BDB4E.3010006@mssl.ucl.ac.uk> References: <4D5BDB4E.3010006@mssl.ucl.ac.uk> Message-ID: <20110216175848.GC2291@redhat.com> On Wed, Feb 16, 2011 at 02:12:30PM +0000, Alan Brown wrote: > > You can set it via the configfs interface: > > Given 24Gb ram, 100 filesystems, several hundred million of files > and the usual user habits of trying to put 100k files in a > directory: > > Is 24Gb enough or should I add more memory? (96Gb is easy, beyond > that is harder) > > What would you consider safe maximums for these settings? > > What about the following parameters? > > buffer_size > dirtbl_size Don't change the buffer size, but I'd increase all the hash table sizes to 4096 and see if anything changes. echo "4096" > /sys/kernel/config/dlm/cluster/rsbtbl_size echo "4096" > /sys/kernel/config/dlm/cluster/lkbtbl_size echo "4096" > /sys/kernel/config/dlm/cluster/dirtbl_size (Before gfs file systems are mounted as Steve mentioned.) Dave From ajb2 at mssl.ucl.ac.uk Wed Feb 16 19:07:10 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Wed, 16 Feb 2011 19:07:10 +0000 Subject: [Linux-cluster] optimising DLM speed? Message-ID: <4D5C205E.4010708@mssl.ucl.ac.uk> Steve: To add some interest (and give you numbers to work with as far as dlm config tuning goes), here are a selection of real world lock figures from our file cluster (cat $d | wc -l) /sys/kernel/debug/dlm/WwwHome-gfs2_locks 162299 (webserver exports) /sys/kernel/debug/dlm/soft2-gfs2_locks 198890 (Mainly IDL software - it's hopelessly inefficient, 32Gb partition) /sys/kernel/debug/dlm/home-gfs2_locks 74649 (users' /home directories, 150Gb partition) /sys/kernel/debug/dlm/User1_locks 318337 (thunderbird, mozilla, openoffice caches, 200gb partition) /sys/kernel/debug/dlm/Peace04-gfs2_locks 265955 (solar wind data) /sys/kernel/debug/dlm/Peace05-gfs2_locks 332267 /sys/kernel/debug/dlm/Peace06-gfs2_locks 283588 At the other end of the spectrum: /sys/kernel/debug/dlm/xray0-gfs2_locks 24917 (solar observation data) /sys/kernel/debug/dlm/xray2-gfs2_locks 558 /sys/kernel/debug/dlm/cassini2-gfs2_locks 598 (cassini probe data from Saturn) /sys/kernel/debug/dlm/cassini3-gfs2_locks 80 /sys/kernel/debug/dlm/cassini4-gfs2_locks 246 /sys/kernel/debug/dlm/rgoplates-gfs2_locks 27 (global archive of 100 years' worth of photographic plates from Greenwich observatory) Directories may have up to 90k entries in them, although we try very hard to encourage users to use nested structures and keep directories below 1000 entries for human readability (exceptions tend to be mirrors of offsite archives), but the counterpoint to is that it drives the number of directories up - which is why I was asking about the dirtbl_size entry. 
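Going back to Dave's echo commands just above: because these configfs values only affect lockspaces created after they are written, ordering matters. A sketch of where the step can sit on a RHEL-style system (the value is only a placeholder and must be a power of two; see the follow-ups later in this thread for what actually proved stable on one production cluster):

  service cman start          # dlm_controld creates /sys/kernel/config/dlm/cluster
  for t in rsbtbl_size lkbtbl_size dirtbl_size; do
      echo 1024 > /sys/kernel/config/dlm/cluster/$t
  done
  service clvmd start         # clvmd creates the first lockspace...
  mount -a -t gfs2            # ...and each GFS2 mount creates another

Anything already mounted (or a clvmd already running) keeps the old table sizes until it is stopped and restarted.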
~98% of directories are below 4000 entries. FSes usually have 400k-2M inodes in use. Does that help with tuning recommendations? From swhiteho at redhat.com Wed Feb 16 19:25:01 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 16 Feb 2011 19:25:01 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5C205E.4010708@mssl.ucl.ac.uk> References: <4D5C205E.4010708@mssl.ucl.ac.uk> Message-ID: <1297884301.2522.74.camel@dolmen> Hi, On Wed, 2011-02-16 at 19:07 +0000, Alan Brown wrote: > Steve: > > To add some interest (and give you numbers to work with as far as dlm > config tuning goes), here are a selection of real world lock figures > from our file cluster (cat $d | wc -l) > > /sys/kernel/debug/dlm/WwwHome-gfs2_locks 162299 (webserver exports) > /sys/kernel/debug/dlm/soft2-gfs2_locks 198890 (Mainly IDL software - > it's hopelessly inefficient, 32Gb partition) > /sys/kernel/debug/dlm/home-gfs2_locks 74649 (users' /home directories, > 150Gb partition) > /sys/kernel/debug/dlm/User1_locks 318337 (thunderbird, mozilla, > openoffice caches, 200gb partition) > /sys/kernel/debug/dlm/Peace04-gfs2_locks 265955 (solar wind data) > /sys/kernel/debug/dlm/Peace05-gfs2_locks 332267 > /sys/kernel/debug/dlm/Peace06-gfs2_locks 283588 > A faster way to just grab lock numbers is to grep for gfs2 in /proc/slabinfo as that will show how many are allocated at any one time. > At the other end of the spectrum: > > /sys/kernel/debug/dlm/xray0-gfs2_locks 24917 (solar observation data) > /sys/kernel/debug/dlm/xray2-gfs2_locks 558 > /sys/kernel/debug/dlm/cassini2-gfs2_locks 598 (cassini probe data from > Saturn) > /sys/kernel/debug/dlm/cassini3-gfs2_locks 80 > /sys/kernel/debug/dlm/cassini4-gfs2_locks 246 > /sys/kernel/debug/dlm/rgoplates-gfs2_locks 27 (global archive of 100 > years' worth of photographic plates from Greenwich observatory) > > > Directories may have up to 90k entries in them, although we try very > hard to encourage users to use nested structures and keep directories > below 1000 entries for human readability (exceptions tend to be mirrors > of offsite archives), but the counterpoint to is that it drives the > number of directories up - which is why I was asking about the > dirtbl_size entry. > The dirtbl refers to the DLM's resource directory and not to the directories which are in the filesystem. So the dirtbl will scale according to the number of dlm locks, which in turn scales with the number of cached inodes. Directories of the size (number of entries) which you have indicated should not be causing a problem as lookup should still be quite fast at that scale. > ~98% of directories are below 4000 entries. > > FSes usually have 400k-2M inodes in use. > The important thing from the dlm tuning point of view is how many of those inodes are cached on each node at once, so using the slabinfo trick above will show that. > Does that help with tuning recommendations? > It is always useful to have some background information like this, and I think as a first step trying Dave's suggested DLM table config changes is a good plan, Steve. > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From ajb2 at mssl.ucl.ac.uk Wed Feb 16 19:36:09 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Wed, 16 Feb 2011 19:36:09 +0000 Subject: [Linux-cluster] optimising DLM speed? 
Message-ID: <4D5C2729.2070809@mssl.ucl.ac.uk> > A faster way to just grab lock numbers is to grep for gfs2 in /proc/slabinfo as that will show how many are allocated at any one time. True, but it doesn't show mow many are used per fs. FWIW, here are current stats on each cluster node (it's evening and lightly loaded) gfs2_quotad 47 108 144 27 1 : tunables 120 60 8 : slabdata 4 4 0 gfs2_rgrpd 9563 9618 184 21 1 : tunables 120 60 8 : slabdata 458 458 0 gfs2_bufdata 318804 318840 96 40 1 : tunables 120 60 8 : slabdata 7971 7971 1 gfs2_inode 725605 725605 800 5 1 : tunables 54 27 8 : slabdata 145121 145121 0 gfs2_glock 738297 738297 424 9 1 : tunables 54 27 8 : slabdata 82033 82033 0 gfs2_quotad 94 189 144 27 1 : tunables 120 60 8 : slabdata 7 7 0 gfs2_rgrpd 1658 1680 184 21 1 : tunables 120 60 8 : slabdata 80 80 0 gfs2_bufdata 1065806 1067080 96 40 1 : tunables 120 60 8 : slabdata 26677 26677 0 gfs2_inode 986986 1024845 800 5 1 : tunables 54 27 8 : slabdata 204969 204969 0 gfs2_glock 1105575 1812825 424 9 1 : tunables 54 27 8 : slabdata 201425 201425 1 gfs2_quotad 45 108 144 27 1 : tunables 120 60 8 : slabdata 4 4 2 gfs2_rgrpd 6515 6573 184 21 1 : tunables 120 60 8 : slabdata 313 313 0 gfs2_bufdata 100785 101000 96 40 1 : tunables 120 60 8 : slabdata 2525 2525 0 gfs2_inode 2954515 2954515 800 5 1 : tunables 54 27 8 : slabdata 590903 590903 0 gfs2_glock 3332311 3639843 424 9 1 : tunables 54 27 8 : slabdata 404427 404427 0 From ajb2 at mssl.ucl.ac.uk Wed Feb 16 19:41:04 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Wed, 16 Feb 2011 19:41:04 +0000 Subject: [Linux-cluster] optimising DLM speed? Message-ID: <4D5C2850.7090100@mssl.ucl.ac.uk> > Directories of the size (number of entries) which you have indicated should not be causing a problem as lookup should still be quite fast at that scale. Perhaps, but even so 4000 file directories usually take over a minute to "ls -l" , while 85k file/directories take 5 mins (20-40 mins on a bad day) - and this is mounted lock_dlm, single-node-only From swhiteho at redhat.com Wed Feb 16 20:19:21 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 16 Feb 2011 20:19:21 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5C2729.2070809@mssl.ucl.ac.uk> References: <4D5C2729.2070809@mssl.ucl.ac.uk> Message-ID: <1297887561.2522.77.camel@dolmen> Hi, On Wed, 2011-02-16 at 19:36 +0000, Alan Brown wrote: > > A faster way to just grab lock numbers is to grep for gfs2 > in /proc/slabinfo as that will show how many are allocated at any one > time. > > True, but it doesn't show mow many are used per fs. > For the GFS2 glocks, that doesn't matter - all of the glocks are held in a single hash table no matter how many filesystems there are. The DLM however has hash tables for each lockspace (per filesystem) so it might make a difference there. 
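Both views are cheap to collect side by side. A small sketch combining the two commands already used in this thread (assumes debugfs is mounted at /sys/kernel/debug; lockspace names will differ per cluster):

  # per-lockspace DLM lock counts (one lockspace per mounted GFS2 filesystem, plus clvmd)
  for d in /sys/kernel/debug/dlm/*_locks; do
      printf '%-45s %s\n' "$d" "$(wc -l < "$d")"
  done
  # node-wide totals: cached gfs2 inodes and glocks
  grep gfs2 /proc/slabinfo | awk '{print $1, $2}'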
> FWIW, here are current stats on each cluster node (it's evening and > lightly loaded) > > > gfs2_quotad 47 108 144 27 1 : tunables 120 60 > 8 : slabdata 4 4 0 > gfs2_rgrpd 9563 9618 184 21 1 : tunables 120 60 > 8 : slabdata 458 458 0 > gfs2_bufdata 318804 318840 96 40 1 : tunables 120 60 > 8 : slabdata 7971 7971 1 > gfs2_inode 725605 725605 800 5 1 : tunables 54 27 > 8 : slabdata 145121 145121 0 > gfs2_glock 738297 738297 424 9 1 : tunables 54 27 > 8 : slabdata 82033 82033 0 > > gfs2_quotad 94 189 144 27 1 : tunables 120 60 > 8 : slabdata 7 7 0 > gfs2_rgrpd 1658 1680 184 21 1 : tunables 120 60 > 8 : slabdata 80 80 0 > gfs2_bufdata 1065806 1067080 96 40 1 : tunables 120 60 > 8 : slabdata 26677 26677 0 > gfs2_inode 986986 1024845 800 5 1 : tunables 54 27 > 8 : slabdata 204969 204969 0 > gfs2_glock 1105575 1812825 424 9 1 : tunables 54 27 > 8 : slabdata 201425 201425 1 > > gfs2_quotad 45 108 144 27 1 : tunables 120 60 > 8 : slabdata 4 4 2 > gfs2_rgrpd 6515 6573 184 21 1 : tunables 120 60 > 8 : slabdata 313 313 0 > gfs2_bufdata 100785 101000 96 40 1 : tunables 120 60 > 8 : slabdata 2525 2525 0 > gfs2_inode 2954515 2954515 800 5 1 : tunables 54 27 > 8 : slabdata 590903 590903 0 > gfs2_glock 3332311 3639843 424 9 1 : tunables 54 27 > 8 : slabdata 404427 404427 0 > Thanks for the info. There is now a bug open (bz #678102) for increasing the default DLM hash table size, Steve. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From swhiteho at redhat.com Wed Feb 16 20:26:20 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 16 Feb 2011 20:26:20 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5C2850.7090100@mssl.ucl.ac.uk> References: <4D5C2850.7090100@mssl.ucl.ac.uk> Message-ID: <1297887980.2522.83.camel@dolmen> Hi, On Wed, 2011-02-16 at 19:41 +0000, Alan Brown wrote: > > Directories of the size (number of entries) which you have indicated > should not be causing a problem as lookup should still be quite fast at > that scale. > > Perhaps, but even so 4000 file directories usually take over a minute to > "ls -l" , while 85k file/directories take 5 mins (20-40 mins on a bad > day) - and this is mounted lock_dlm, single-node-only > > Yes, ls -l will always take longer because it is not just accessing the directory, but also every inode in the directory. As a result the I/O pattern will generally be poor. Also, the order in which GFS2 returns the directory entries is not efficient if it is used for doing the stat calls associated with the ls -l. Better performance could be obtained by sorting the inodes to run stat on into inode number order. The reason that the ordering is not ideal is that without that we could not maintain a uniform view of the directory from a readers point of view while other processes are adding or removing entries. It is a historical issue that we have inherited from GFS and I've spent some time trying to come up with a solution in kernel space, but in the end, a userland solution may be a better way to solve it. I assume that once the directory has been read in once, that it acesses will be much faster on subsequent occasions, Steve. 
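Steve's "userland solution" can be approximated with standard tools today: read the directory once to get names plus inode numbers, sort by inode number, and only then stat. A hedged sketch, assuming GNU coreutils and file names without embedded newlines (how much it wins depends on how much of the cost is stat ordering on a given filesystem):

  cd /gfs2/bigdir
  ls -1i --color=never \
      | sort -n \
      | sed 's/^ *[0-9]* //' \
      | xargs -d '\n' stat --format='%i %s %y %n' > /dev/null

GNU ls should take the inode number from the directory entry itself here, so the expensive per-inode work all happens in the inode-ordered stat pass rather than in readdir order.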
From jeff.sturm at eprize.com Wed Feb 16 20:25:56 2011 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Wed, 16 Feb 2011 15:25:56 -0500 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <4D5ADD76.2050600@logik-internet.rs> References: <4D59BD29.30906@logik-internet.rs> <4D59C13A.9030807@alteeve.com> <4D59DB8D.5080606@logik-internet.rs> <4D59DC04.6060305@alteeve.com> <4D59E3A1.4070508@logik-internet.rs> <4D59E6F3.3050002@alteeve.com> <4D59EAAD.8030301@logik-internet.rs><4D5A4DFE.8050101@bobich.net> <4D5A6852.5000103@logik-internet.rs><4D5A6BDC.8080708@bobich.net><64D0546C5EBBD147B75DE133D798665F0855C0F4@hugo.eprize.local> <4D5ADD76.2050600@logik-internet.rs> Message-ID: <64D0546C5EBBD147B75DE133D798665F0855C138@hugo.eprize.local> > -----Original Message----- > From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] > On Behalf Of Nikola Savic > Sent: Tuesday, February 15, 2011 3:09 PM > To: linux clustering > Subject: Re: [Linux-cluster] Cluster with shared storage on low budget > > Jeff Sturm wrote: > > We actually resize volumes often. Some of our storage volumes have 30 > > LUNs or more. We have so many because we've virtualized most of our > > infrastructure, and some of the hosts are single-purpose hosts. > > > > Can you please provide more information on how storage is organized? > > Are you using SAN or local hard disks in nodes? Is there mirroring of data and how is > it implemented in your system? To answer your questions, these nodes are paravirtualized under the Xen hypervisor. The physical volumes are kept on a central storage server (commercial SAN appliance), organized into a clustered volume group by CLVM. Out of that volume group we have 30+ logical volumes, each of which are simple filesystem images, mounted as the root filesystem on one of the virtual hosts. -Jeff From ajb2 at mssl.ucl.ac.uk Wed Feb 16 20:38:57 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Wed, 16 Feb 2011 20:38:57 +0000 Subject: [Linux-cluster] optimising DLM speed? Message-ID: <4D5C35E1.7050404@mssl.ucl.ac.uk> > For the GFS2 glocks, that doesn't matter - all of the glocks are held in a single hash table no matter how many filesystems there are. Given nearly 4 mlllion glocks currently on one of the boxes in a quiet state (and nearly 6 million if everything was on one node), is the existing hash table large enough? From jeff.sturm at eprize.com Wed Feb 16 21:08:55 2011 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Wed, 16 Feb 2011 16:08:55 -0500 Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <263367529.33108.1297787066881.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> References: <64D0546C5EBBD147B75DE133D798665F0855C0F4@hugo.eprize.local> <263367529.33108.1297787066881.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> Message-ID: <64D0546C5EBBD147B75DE133D798665F0855C139@hugo.eprize.local> > -----Original Message----- > From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] > On Behalf Of Bob Peterson > Sent: Tuesday, February 15, 2011 11:24 AM > To: linux clustering > Subject: Re: [Linux-cluster] Cluster with shared storage on low budget > > ----- Original Message ----- > | We don't want to allocate too more storage in advance, simply because > | it's easier to grow than to shrink. Stop the host, grow the volume, > | e2fsck/resize2fs, start up and go. Much nicer than increasing disk > | capacity on physical hosts. 
> > These might be good for ext3/4, but with gfs and gfs2 you can lvresize and gfs2_grow > while the lv is mounted. In fact, we expect it. > Just make sure the vg has the clustered bit set (vgchange -cy) first. You know, that's a good point. We don't use GFS2 for any non-clustered fs, right now, but why not? Are you saying I can do an online gfs2_grow even with lock_nolock? -Jeff From ajb2 at mssl.ucl.ac.uk Wed Feb 16 21:12:58 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Wed, 16 Feb 2011 21:12:58 +0000 Subject: [Linux-cluster] optimising DLM speed? Message-ID: <4D5C3DDA.6090906@mssl.ucl.ac.uk> > Yes, ls -l will always take longer because it is not just accessing the directory, but also every inode in the directory. As a result the I/O pattern will generally be poor. I know and accept that. It's common to most filesystems but the access time is particularly pronounced with GFS2 (presumably because of the added latencies) The problem is that users don't see things from the same point of view, so there's a constant flow of complaints about "slow servers". They think that holding down the number of files/directory is an unreasonable restriction - and in some cases (NASA/ESA archives) I can't even explain the reasons why as the people involved are unreachable. This is despite quite documentable performance gains from breaking up large directories even on non-cluster filesystems - We saw a ls -lR speedup of around 700x when moving one directory structure from flat (130k files) to nested. The same poor I/O pattern has a direct bearing on incremental backup speeds - backup software has to stat() a file (at minimum - SHA hash comparisons are even more overhead) to see if anything's changed, which means in large directories a backup may drop down to scan rates of 10 files/second or lower and seldom exceeds 100 files/second at best. (Bacula is pretty good about caching and issues a fadvise(notneeded) after each file is checked. I just wish other filesystem-crawling processes did the same) > I assume that once the directory has been read in once, that it acesses will be much faster on subsequent occasions, Correct - but after 5-10 idle minutes the cached information is lost and the pattern repeats. > It is a historical issue that we have inherited from GFS and I've spent some time trying to come up with a solution in kernel space, but in the end, a userland solution may be a better way to solve it. In the case of NFS clients, I'm seriously looking at trying to move to RHEL6 and use fscache - this should help reduce load a little but won't help for uncached directories. If you have any suggestions on the [nfs export|client mount] side to try and help things I'm open to suggestions. From rpeterso at redhat.com Wed Feb 16 21:20:05 2011 From: rpeterso at redhat.com (Bob Peterson) Date: Wed, 16 Feb 2011 16:20:05 -0500 (EST) Subject: [Linux-cluster] Cluster with shared storage on low budget In-Reply-To: <64D0546C5EBBD147B75DE133D798665F0855C139@hugo.eprize.local> Message-ID: <883242944.61521.1297891205419.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> ----- Original Message ----- | You know, that's a good point. We don't use GFS2 for any non-clustered | fs, right now, but why not? Are you saying I can do an online | gfs2_grow | even with lock_nolock? | | -Jeff Hi Jeff, Yes, you should be able to. Regards, Bob Peterson Red Hat File Systems From fdinitto at redhat.com Thu Feb 17 09:19:57 2011 From: fdinitto at redhat.com (Fabio M. 
Di Nitto) Date: Thu, 17 Feb 2011 10:19:57 +0100 Subject: [Linux-cluster] Announcing "Cluster in a BOX" project Message-ID: <4D5CE83D.7050400@redhat.com> Hi all, A lot of people find it hard to set up their first cluster, or simply don't have time to repeat the same setup over and over, whether for development, for some basic testing, or to showcase cluster technologies to other people. The "Cluster in a BOX" project (cbox for short) is one script to set up a KVM-based virtual test cluster in a matter of a few minutes. cbox is still in its early development and has several limitations and some strict requirements. Plans are to include as many cluster technologies and configuration examples as possible and to remove as many limitations as we possibly can. If your cluster project or technology is not there yet, or your distribution is not supported, it's simply because I do not have the resources to do it all by myself. Do not take it personally, we will get it there together, and I absolutely welcome comments, patches and feedback at any time. Support for pacemaker, DRBD and OCFS2 will come shortly. cbox documentation is here (in temporary lack of a more neutral location): http://sources.redhat.com/cluster/wiki/cluster_in_a_box Please read it carefully before running cbox. Fabio
From swhiteho at redhat.com Thu Feb 17 10:13:15 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Thu, 17 Feb 2011 10:13:15 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <4D5C35E1.7050404@mssl.ucl.ac.uk> References: <4D5C35E1.7050404@mssl.ucl.ac.uk> Message-ID: <1297937595.2552.9.camel@dolmen> Hi, On Wed, 2011-02-16 at 20:38 +0000, Alan Brown wrote: > > For the GFS2 glocks, that doesn't matter - all of the glocks are held > in a single hash table no matter how many filesystems there are. > > Given nearly 4 million glocks currently on one of the boxes in a quiet > state (and nearly 6 million if everything was on one node), is the > existing hash table large enough? > > It is a concern. The table cannot be realistically expanded forever, and expanding it "on the fly" would be very tricky. There are however other factors which determine the scalability of the hash table, not just the number of hash heads. By using RCU for the upstream code, we've been able to reduce locking and improve speed by a significant factor without needing to increase the number of list heads in the hash table. We did increase that number though, anyway, since the new system we are using can put both the hash chain lock and the hash table head into a single pointer.
That means less space for locks and therefore we increased the number of hash table heads at that time. However large we grow the table though, it will never really be "enough" so that probably the next development will be to have trees rather than chains of glocks under each hash table head, and at least then the chain lengths will scale with log(N) rather than N. The issue with doing that is making such a thing work with RCU. We do go to some lengths to avoid doing hash lookups at all. Once a glock has been attached to an inode, we don't do any lookups in the hash table again until the inode has been pushed out of the cache, so it will only show up on a workload which is constantly scanning new inodes which are not in cache already. At least until now, the time taken to do the I/O associated with such operations has been much larger, so that it didn't really show up as an important performance item. Obviously if it causes problems, then we'll look into addressing them. Hopefully that explains a bit more of our reasoning behind the decisions that have been made. Please let us know if we can be of further help, Steve. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From linux-cluster at redhat.com Thu Feb 17 12:33:07 2011 From: linux-cluster at redhat.com (Mailbot for etexusa.com) Date: Thu, 17 Feb 2011 04:33:07 -0800 Subject: [Linux-cluster] DSN: failed (ADROITHOUSE@REDIFFMAIL.COM) Message-ID: This is a Delivery Status Notification (DSN). I was unable to deliver your message to adroithouse at rediffmail.com. I said (end of message) And they gave me the error; 552 suspicious virus code detected in executables attached, message not accepted (#5.3.4) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/rfc822-headers Size: 515 bytes Desc: not available URL: From ajb2 at mssl.ucl.ac.uk Thu Feb 17 21:24:41 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Thu, 17 Feb 2011 21:24:41 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <20110216175848.GC2291@redhat.com> References: <4D5BDB4E.3010006@mssl.ucl.ac.uk> <20110216175848.GC2291@redhat.com> Message-ID: <4D5D9219.8090002@mssl.ucl.ac.uk> David Teigland wrote: > > Don't change the buffer size, but I'd increase all the hash table sizes to > 4096 and see if anything changes. > > echo "4096" > /sys/kernel/config/dlm/cluster/rsbtbl_size > echo "4096" > /sys/kernel/config/dlm/cluster/lkbtbl_size > echo "4096" > /sys/kernel/config/dlm/cluster/dirtbl_size Increasing rsbtbl_size to 4096 or higher results in FSes refusing to mount and clvm refusing to start - both with "cannot allocate memory" At 2048, it works, but gfs_controld and dlm_controld exited when I tried to mount all FSes on one node as a test. At 1024 it seems stable. The other settings seemed to have applied OK. So far, reports are positive (but it's quiet at the moment) I've got a strace of clvmd trying to start with rsbtbl_size set to 4096. Should I post it here or would you prefer it mailed direct? From teigland at redhat.com Thu Feb 17 21:29:48 2011 From: teigland at redhat.com (David Teigland) Date: Thu, 17 Feb 2011 16:29:48 -0500 Subject: [Linux-cluster] optimising DLM speed? 
In-Reply-To: <4D5D9219.8090002@mssl.ucl.ac.uk> References: <4D5BDB4E.3010006@mssl.ucl.ac.uk> <20110216175848.GC2291@redhat.com> <4D5D9219.8090002@mssl.ucl.ac.uk> Message-ID: <20110217212948.GA9582@redhat.com> On Thu, Feb 17, 2011 at 09:24:41PM +0000, Alan Brown wrote: > David Teigland wrote: > > > >Don't change the buffer size, but I'd increase all the hash table sizes to > >4096 and see if anything changes. > > > >echo "4096" > /sys/kernel/config/dlm/cluster/rsbtbl_size > >echo "4096" > /sys/kernel/config/dlm/cluster/lkbtbl_size > >echo "4096" > /sys/kernel/config/dlm/cluster/dirtbl_size > > Increasing rsbtbl_size to 4096 or higher results in FSes refusing to > mount and clvm refusing to start - both with "cannot allocate > memory" > > At 2048, it works, but gfs_controld and dlm_controld exited when I > tried to mount all FSes on one node as a test. > > At 1024 it seems stable. > > The other settings seemed to have applied OK. So far, reports are > positive (but it's quiet at the moment) > > I've got a strace of clvmd trying to start with rsbtbl_size set to > 4096. Should I post it here or would you prefer it mailed direct? Thanks for testing, you can post here. From ajb2 at mssl.ucl.ac.uk Fri Feb 18 12:40:41 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Fri, 18 Feb 2011 12:40:41 +0000 Subject: [Linux-cluster] optimising DLM speed? In-Reply-To: <20110217212948.GA9582@redhat.com> References: <4D5BDB4E.3010006@mssl.ucl.ac.uk> <20110216175848.GC2291@redhat.com> <4D5D9219.8090002@mssl.ucl.ac.uk> <20110217212948.GA9582@redhat.com> Message-ID: <4D5E68C9.1000206@mssl.ucl.ac.uk> David Teigland wrote: >> I've got a strace of clvmd trying to start with rsbtbl_size set to >> 4096. Should I post it here or would you prefer it mailed direct? > > Thanks for testing, you can post here. > -------------- next part -------------- A non-text attachment was scrubbed... Name: ClvmdFailedStartStrace.gz Type: application/x-gzip Size: 42737 bytes Desc: not available URL: From jon at whiteheat.org.uk Fri Feb 18 13:30:28 2011 From: jon at whiteheat.org.uk (Jonathan Gowar) Date: Fri, 18 Feb 2011 13:30:28 +0000 Subject: [Linux-cluster] WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not configured Message-ID: <4D5E7474.2080500@whiteheat.org.uk> I've been following the cluster from scratch guide, by Beekhof. I'm using Debian 6, so I don't know how much that might confuse things; I appreciate there are a few debian-specifics. Before adding the drbd pacemaker resource crm status looked fine. 
After configuring the resource I get the following error from crm_mon:- WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not configured Here is the crm configuration, and monitor:- root at squeeze:~# crm configure show node sleeze node sneeze node squeeze primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip="xxx.xxx.xxx.xxx" cidr_netmask="32" \ op monitor interval="30s" primitive WebData ocf:linbit:drbd \ params drbd_resource="wwwdata" \ op monitor interval="60s" primitive WebSite ocf:heartbeat:apache \ params configfile="/etc/apache2/apache2.conf" \ op monitor interval="1m" ms WebDataClone WebData \ meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" colocation website-with-ip inf: WebSite ClusterIP order apache-after-ip inf: ClusterIP WebSite property $id="cib-bootstrap-options" \ dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \ cluster-infrastructure="openais" \ expected-quorum-votes="3" \ stonith-enabled="false" rsc_defaults $id="rsc-options" \ resource-stickiness="100" root at squeeze:~# crm status ============ Last updated: Fri Feb 18 13:15:53 2011 Stack: openais Current DC: sneeze - partition with quorum Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b 3 Nodes configured, 3 expected votes 3 Resources configured. ============ Online: [ squeeze sneeze sleeze ] ClusterIP (ocf::heartbeat:IPaddr2): Started sneeze WebSite (ocf::heartbeat:apache): Started sneeze Master/Slave Set: WebDataClone Masters: [ squeeze ] Slaves: [ sneeze ] Failed actions: WebData_monitor_0 (node=sleeze, call=4, rc=6, status=complete): not configured WebData_monitor_0 (node=sneeze, call=9, rc=6, status=complete): not configured WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not configured Does anyone have any ideas as to how I might investigate where the problem is. Kind regards, Jon From Jason_Henderson at Mitel.com Fri Feb 18 17:47:17 2011 From: Jason_Henderson at Mitel.com (Jason_Henderson at Mitel.com) Date: Fri, 18 Feb 2011 12:47:17 -0500 Subject: [Linux-cluster] HP iLO3 and /sbin/fence_ilo not working Message-ID: According to this knowledge base article at redhat, https://access.redhat.com/kb/docs/DOC-39336, the /sbin/fence_ilo script should work with iLO3. I am using the version of cman mentioned in the article. The iLO3 firmware is at the latest revision 1.16. The fence_ilo script just returns a connect/login error as follows: [root at node01 ~]# /sbin/fence_ilo -o status -a '10.39.170.233' -l mitel -p ilopassword -v Unable to connect/login to fencing device The login credentials are correct as I can connect via ssh: [root at node01 ~]# ssh mitel at 10.39.170.233 mitel at 10.39.170.233's password: User:mitel logged-in to ILOUSE103N281.(10.39.170.233) iLO 3 Standard 1.16 at Dec 17 2010 Server Name: host is unnamed Server Power: On hpiLO-> Is their anything else that needs to be done to execute the script successfully? The fence_ilo script works on previous iLO versions. Linked Article: Issue * How do I set up HP iLO3 as a fence device in a Red Hat Cluster Suite (RHCS) cluster? * Why does fencing fail on Red Hat Cluster Suite when using HP's iLO3? Environment * Red Hat Enterprise Linux (RHEL) 5 * Red Hat Enterprise Linux 6 * Red Hat Cluster Suite * HP iLO3 Resolution Support for the iLO3 fence device in RHEL5 has been added with the release of cman 2.0.115-34.el5_5.4 through erratum RHEA-2010-0876. No special setup is required after installing this erratum to get the iLO3 to work. 
HP has also released new firmware to address an issue with fence_ipmi which could cause the server to be powered on instead of off. We therefore advise you to upgrade to firmware version 1.15 (28 Oct 2010) as provided by HP. For installation instructions or problem resolution with this firmware version we refer you to the HP website. Root Cause HP iLO3 was not supported by fence_ilo. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jason_Henderson at Mitel.com Fri Feb 18 18:04:01 2011 From: Jason_Henderson at Mitel.com (Jason_Henderson at Mitel.com) Date: Fri, 18 Feb 2011 13:04:01 -0500 Subject: [Linux-cluster] HP iLO3 and /sbin/fence_ilo not working In-Reply-To: Message-ID: I think I mis-understood the article in combination with a reply from HP tech support on the issue. Looks like the fence_ipmilan agent is what changed to support fencing with iLO3, not fence_ilo. linux-cluster-bounces at redhat.com wrote on 02/18/2011 12:47:17 PM: > > According to this knowledge base article at redhat, https://access. > redhat.com/kb/docs/DOC-39336, the /sbin/fence_ilo script should work > with iLO3. I am using the version of cman mentioned in the article. > The iLO3 firmware is at the latest revision 1.16. > The fence_ilo script just returns a connect/login error as follows: > > [root at node01 ~]# /sbin/fence_ilo -o status -a '10.39.170.233' -l > mitel -p ilopassword -v > Unable to connect/login to fencing device > > The login credentials are correct as I can connect via ssh: > > [root at node01 ~]# ssh mitel at 10.39.170.233 > mitel at 10.39.170.233's password: > User:mitel logged-in to ILOUSE103N281.(10.39.170.233) > iLO 3 Standard 1.16 at Dec 17 2010 > Server Name: host is unnamed > Server Power: On > > hpiLO-> > > Is their anything else that needs to be done to execute the script > successfully? The fence_ilo script works on previous iLO versions. > > > Linked Article: > > Issue > > * How do I set up HP iLO3 as a fence device in a Red Hat > Cluster Suite (RHCS) cluster? > * Why does fencing fail on Red Hat Cluster Suite when using HP's iLO3? > > Environment > > * Red Hat Enterprise Linux (RHEL) 5 > * Red Hat Enterprise Linux 6 > * Red Hat Cluster Suite > * HP iLO3 > > Resolution > > Support for the iLO3 fence device in RHEL5 has been added with the > release of cman 2.0.115-34.el5_5.4 through erratum RHEA-2010-0876. > No special setup is required after installing this erratum to get > the iLO3 to work. > > HP has also released new firmware to address an issue with > fence_ipmi which could cause the server to be powered on instead of > off. We therefore advise you to upgrade to firmware version 1.15 (28 > Oct 2010) as provided by HP. For installation instructions or > problem resolution with this firmware version we refer you to the HP website. > > Root Cause > > HP iLO3 was not supported by fence_ilo.-- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From jon at whiteheat.org.uk Fri Feb 18 20:52:11 2011 From: jon at whiteheat.org.uk (Jonathan Gowar) Date: Fri, 18 Feb 2011 20:52:11 +0000 Subject: [Linux-cluster] WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not configured In-Reply-To: <4D5E7474.2080500@whiteheat.org.uk> References: <4D5E7474.2080500@whiteheat.org.uk> Message-ID: <4D5EDBFB.7010400@whiteheat.org.uk> On 18/02/11 13:30, Jonathan Gowar wrote: > I've been following the cluster from scratch guide, by Beekhof. 
I'm > using Debian 6, so I don't know how much that might confuse things; I > appreciate there are a few debian-specifics. > > Before adding the drbd pacemaker resource crm status looked fine. After > configuring the resource I get the following error from crm_mon:- > > WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not > configured > > Here is the crm configuration, and monitor:- > > root at squeeze:~# crm configure show > node sleeze > node sneeze > node squeeze > primitive ClusterIP ocf:heartbeat:IPaddr2 \ > params ip="xxx.xxx.xxx.xxx" cidr_netmask="32" \ > op monitor interval="30s" > primitive WebData ocf:linbit:drbd \ > params drbd_resource="wwwdata" \ > op monitor interval="60s" > primitive WebSite ocf:heartbeat:apache \ > params configfile="/etc/apache2/apache2.conf" \ > op monitor interval="1m" > ms WebDataClone WebData \ > meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" > notify="true" > colocation website-with-ip inf: WebSite ClusterIP > order apache-after-ip inf: ClusterIP WebSite > property $id="cib-bootstrap-options" \ > dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \ > cluster-infrastructure="openais" \ > expected-quorum-votes="3" \ > stonith-enabled="false" > rsc_defaults $id="rsc-options" \ > resource-stickiness="100" > root at squeeze:~# crm status > ============ > Last updated: Fri Feb 18 13:15:53 2011 > Stack: openais > Current DC: sneeze - partition with quorum > Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b > 3 Nodes configured, 3 expected votes > 3 Resources configured. > ============ > > Online: [ squeeze sneeze sleeze ] > > ClusterIP (ocf::heartbeat:IPaddr2): Started sneeze > WebSite (ocf::heartbeat:apache): Started sneeze > Master/Slave Set: WebDataClone > Masters: [ squeeze ] > Slaves: [ sneeze ] > > Failed actions: > WebData_monitor_0 (node=sleeze, call=4, rc=6, status=complete): not > configured > WebData_monitor_0 (node=sneeze, call=9, rc=6, status=complete): not > configured > WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not > configured > > Does anyone have any ideas as to how I might investigate where the > problem is. > > Kind regards, > Jon Hi, Found out how to debug failing resources:- http://www.clusterlabs.org/wiki/Debugging_Resource_Failures I managed to clear 1 problem, fuser was not installed; that means psmisc for Debian users. 
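A rough way to chase this kind of agent failure outside Pacemaker, following the debugging guide linked above, is to run the resource agent by hand with tracing. A minimal sketch, assuming the stock agent path /usr/lib/ocf/resource.d/linbit/drbd and the wwwdata resource from this configuration; the two environment variables shown are the usual minimum and the agent may expect further OCF_RESKEY_* settings:

export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_drbd_resource=wwwdata
bash -x /usr/lib/ocf/resource.d/linbit/drbd monitor   # trace the monitor action
echo "monitor returned $?"                            # compare against the OCF return codes

If the trace stops on shell syntax errors rather than on anything DRBD-related, the agent is most likely being interpreted by a POSIX shell such as dash instead of bash.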
root at squeeze:~# crm configure show node sleeze node sneeze node squeeze primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip="80.87.131.245" cidr_netmask="32" \ op monitor interval="30s" primitive WebData ocf:linbit:drbd \ params drbd_resource="wwwdata" \ op monitor interval="60s" primitive WebFS ocf:heartbeat:Filesystem \ params device="/dev/drbd/by-res/wwwdata" directory="/var/www/drbd" fstype="ext4" \ meta is-managed="true" primitive WebSite ocf:heartbeat:apache \ params configfile="/etc/apache2/apache2.conf" \ op monitor interval="1m" ms WebDataClone WebData \ meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" is-managed="false" location cli-prefer-WebSite WebSite \ rule $id="cli-prefer-rule-WebSite" inf: #uname eq sleeze colocation WebSite-with-WebFS inf: WebSite WebFS colocation fs_on_drbd inf: WebFS WebDataClone:Master colocation website-with-ip inf: WebSite ClusterIP order WebFS-after-WebData inf: WebDataClone:promote WebFS:start order WebSite-after-WebFS inf: WebFS WebSite order apache-after-ip inf: ClusterIP WebSite property $id="cib-bootstrap-options" \ dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \ cluster-infrastructure="openais" \ expected-quorum-votes="3" \ stonith-enabled="false" \ last-lrm-refresh="1298043091" rsc_defaults $id="rsc-options" \ resource-stickiness="100" Here are a couple of bad looking lines from the debug output:- /usr/lib/ocf/resource.d/linbit/drbd: 1: [[: not found /usr/lib/ocf/resource.d/linbit/drbd: 1: 0x080307: not found /usr/lib/ocf/resource.d/linbit/drbd: 1: Bad substitution n.b. See full debug report at http://pastebin.com/pjKxBu8K OCF Return Code: 2 OCF Alias: OCF_ERR_ARGS Description: "The resource's configuration is not valid on this machine. Eg. Refers to a location/tool not found on the node." Recovery Type: hard Let me know if there's anything else I need to post. Kind regards, Jon From jon at whiteheat.org.uk Fri Feb 18 21:51:55 2011 From: jon at whiteheat.org.uk (Jonathan Gowar) Date: Fri, 18 Feb 2011 21:51:55 +0000 Subject: [Linux-cluster] WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not configured In-Reply-To: <4D5EDBFB.7010400@whiteheat.org.uk> References: <4D5E7474.2080500@whiteheat.org.uk> <4D5EDBFB.7010400@whiteheat.org.uk> Message-ID: <4D5EE9FB.9040407@whiteheat.org.uk> On 18/02/11 20:52, Jonathan Gowar wrote: > On 18/02/11 13:30, Jonathan Gowar wrote: >> I've been following the cluster from scratch guide, by Beekhof. I'm >> using Debian 6, so I don't know how much that might confuse things; I >> appreciate there are a few debian-specifics. >> >> Before adding the drbd pacemaker resource crm status looked fine. 
After >> configuring the resource I get the following error from crm_mon:- >> >> WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not >> configured >> >> Here is the crm configuration, and monitor:- >> >> root at squeeze:~# crm configure show >> node sleeze >> node sneeze >> node squeeze >> primitive ClusterIP ocf:heartbeat:IPaddr2 \ >> params ip="xxx.xxx.xxx.xxx" cidr_netmask="32" \ >> op monitor interval="30s" >> primitive WebData ocf:linbit:drbd \ >> params drbd_resource="wwwdata" \ >> op monitor interval="60s" >> primitive WebSite ocf:heartbeat:apache \ >> params configfile="/etc/apache2/apache2.conf" \ >> op monitor interval="1m" >> ms WebDataClone WebData \ >> meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" >> notify="true" >> colocation website-with-ip inf: WebSite ClusterIP >> order apache-after-ip inf: ClusterIP WebSite >> property $id="cib-bootstrap-options" \ >> dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \ >> cluster-infrastructure="openais" \ >> expected-quorum-votes="3" \ >> stonith-enabled="false" >> rsc_defaults $id="rsc-options" \ >> resource-stickiness="100" >> root at squeeze:~# crm status >> ============ >> Last updated: Fri Feb 18 13:15:53 2011 >> Stack: openais >> Current DC: sneeze - partition with quorum >> Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b >> 3 Nodes configured, 3 expected votes >> 3 Resources configured. >> ============ >> >> Online: [ squeeze sneeze sleeze ] >> >> ClusterIP (ocf::heartbeat:IPaddr2): Started sneeze >> WebSite (ocf::heartbeat:apache): Started sneeze >> Master/Slave Set: WebDataClone >> Masters: [ squeeze ] >> Slaves: [ sneeze ] >> >> Failed actions: >> WebData_monitor_0 (node=sleeze, call=4, rc=6, status=complete): not >> configured >> WebData_monitor_0 (node=sneeze, call=9, rc=6, status=complete): not >> configured >> WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not >> configured >> >> Does anyone have any ideas as to how I might investigate where the >> problem is. >> >> Kind regards, >> Jon > > Hi, > > Found out how to debug failing resources:- > > http://www.clusterlabs.org/wiki/Debugging_Resource_Failures > > I managed to clear 1 problem, fuser was not installed; that means psmisc > for Debian users. 
> > > root at squeeze:~# crm configure show > node sleeze > node sneeze > node squeeze > primitive ClusterIP ocf:heartbeat:IPaddr2 \ > params ip="xxx.xxx.xxx.xxx" cidr_netmask="32" \ > op monitor interval="30s" > primitive WebData ocf:linbit:drbd \ > params drbd_resource="wwwdata" \ > op monitor interval="60s" > primitive WebFS ocf:heartbeat:Filesystem \ > params device="/dev/drbd/by-res/wwwdata" directory="/var/www/drbd" > fstype="ext4" \ > meta is-managed="true" > primitive WebSite ocf:heartbeat:apache \ > params configfile="/etc/apache2/apache2.conf" \ > op monitor interval="1m" > ms WebDataClone WebData \ > meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" > notify="true" is-managed="false" > location cli-prefer-WebSite WebSite \ > rule $id="cli-prefer-rule-WebSite" inf: #uname eq sleeze > colocation WebSite-with-WebFS inf: WebSite WebFS > colocation fs_on_drbd inf: WebFS WebDataClone:Master > colocation website-with-ip inf: WebSite ClusterIP > order WebFS-after-WebData inf: WebDataClone:promote WebFS:start > order WebSite-after-WebFS inf: WebFS WebSite > order apache-after-ip inf: ClusterIP WebSite > property $id="cib-bootstrap-options" \ > dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \ > cluster-infrastructure="openais" \ > expected-quorum-votes="3" \ > stonith-enabled="false" \ > last-lrm-refresh="1298043091" > rsc_defaults $id="rsc-options" \ > resource-stickiness="100" > > > Here are a couple of bad looking lines from the debug output:- > > > /usr/lib/ocf/resource.d/linbit/drbd: 1: [[: not found > /usr/lib/ocf/resource.d/linbit/drbd: 1: 0x080307: not found > /usr/lib/ocf/resource.d/linbit/drbd: 1: Bad substitution > > > n.b. See full debug report at http://pastebin.com/pjKxBu8K > > OCF Return Code: 2 > OCF Alias: OCF_ERR_ARGS > Description: "The resource's configuration is not valid on this machine. > Eg. Refers to a location/tool not found on the node." > Recovery Type: hard > > Let me know if there's anything else I need to post. > > Kind regards, > Jon Hi, This appeared to be a problem running 3 nodes. Stopping corosync on one of the nodes levitated the problem. Is it possible to have a 3 node cluster, 3 running apache, 2 running DRBD? If so, can someone point me in the direction of how to. Kind regards, Jon From jakov.sosic at srce.hr Sat Feb 19 00:36:02 2011 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Sat, 19 Feb 2011 01:36:02 +0100 Subject: [Linux-cluster] MySQL MASTER->SLAVE agent? Message-ID: <4D5F1072.1050202@srce.hr> Hi. Would it be possible to use the MySQL replication as a way of achieving HA? When MASTER is down to take appropriate actions and declare SLAVE the NEW master, and take on IP address? Has anyone tested this kind of setup? Obviously, failback should be impossible and request manual action. This would pose as a good solution for environments without shared storage and without DRBD. -- Jakov Sosic www.srce.hr From crosa at redhat.com Sat Feb 19 01:05:33 2011 From: crosa at redhat.com (Cleber Rosa) Date: Fri, 18 Feb 2011 23:05:33 -0200 Subject: [Linux-cluster] MySQL MASTER->SLAVE agent? In-Reply-To: <4D5F1072.1050202@srce.hr> References: <4D5F1072.1050202@srce.hr> Message-ID: <4D5F175D.9070903@redhat.com> On 02/18/2011 10:36 PM, Jakov Sosic wrote: > Hi. > > Would it be possible to use the MySQL replication as a way of achieving HA? > > When MASTER is down to take appropriate actions and declare SLAVE the > NEW master, and take on IP address? Has anyone tested this kind of > setup? 
Obviously, failback should be impossible and request manual action. > > This would pose as a good solution for environments without shared > storage and without DRBD. > > Jakov, Actually MySQL allows for MASTER<->MASTER replication. I've successfully deployed that on some 10 sites or so top of RHCS using nothing but a floating IP address for MySQL (plus other resources for other services). Of course, you'd better keep an eye on MySQL's replication health when doing that. CR. From dan.candea at quah.ro Sat Feb 19 10:22:05 2011 From: dan.candea at quah.ro (Dan Candea) Date: Sat, 19 Feb 2011 12:22:05 +0200 Subject: [Linux-cluster] MySQL MASTER->SLAVE agent? In-Reply-To: <4D5F175D.9070903@redhat.com> References: <4D5F1072.1050202@srce.hr> <4D5F175D.9070903@redhat.com> Message-ID: <4D5F99CD.2060709@quah.ro> On 19.02.2011 03:05, Cleber Rosa wrote: > On 02/18/2011 10:36 PM, Jakov Sosic wrote: >> Hi. >> >> Would it be possible to use the MySQL replication as a way of >> achieving HA? >> >> When MASTER is down to take appropriate actions and declare SLAVE the >> NEW master, and take on IP address? Has anyone tested this kind of >> setup? Obviously, failback should be impossible and request manual >> action. >> >> This would pose as a good solution for environments without shared >> storage and without DRBD. >> >> > Jakov, > > Actually MySQL allows for MASTER<->MASTER replication. I've > successfully deployed that on some 10 sites or so top of RHCS using > nothing but a floating IP address for MySQL (plus other resources for > other services). > > Of course, you'd better keep an eye on MySQL's replication health when > doing that. > > CR. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster you could try with ndb tables, it's the mysql cluster engine -- Dan C?ndea Does God Play Dice? From crosa at redhat.com Sat Feb 19 21:27:01 2011 From: crosa at redhat.com (Cleber Rosa) Date: Sat, 19 Feb 2011 19:27:01 -0200 Subject: [Linux-cluster] MySQL MASTER->SLAVE agent? In-Reply-To: <4D5F99CD.2060709@quah.ro> References: <4D5F1072.1050202@srce.hr> <4D5F175D.9070903@redhat.com> <4D5F99CD.2060709@quah.ro> Message-ID: <4D6035A5.4020703@redhat.com> On 02/19/2011 08:22 AM, Dan Candea wrote: > On 19.02.2011 03:05, Cleber Rosa wrote: >> On 02/18/2011 10:36 PM, Jakov Sosic wrote: >>> Hi. >>> >>> Would it be possible to use the MySQL replication as a way of >>> achieving HA? >>> >>> When MASTER is down to take appropriate actions and declare SLAVE the >>> NEW master, and take on IP address? Has anyone tested this kind of >>> setup? Obviously, failback should be impossible and request manual >>> action. >>> >>> This would pose as a good solution for environments without shared >>> storage and without DRBD. >>> >>> >> Jakov, >> >> Actually MySQL allows for MASTER<->MASTER replication. I've >> successfully deployed that on some 10 sites or so top of RHCS using >> nothing but a floating IP address for MySQL (plus other resources for >> other services). >> >> Of course, you'd better keep an eye on MySQL's replication health >> when doing that. >> >> CR. >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > you could try with ndb tables, it's the mysql cluster engine > AFAIK this still requires a "data storage" node (which is centralized), but I may be totally outdated on the subject. 
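On Cleber's point about keeping an eye on replication health, a minimal sketch of the sort of slave-side check that can be run from cron or wrapped into an agent; the monitor user, its password and the 300-second lag threshold are placeholders, not anything taken from this thread:

# crude slave-side replication check; exits non-zero when intervention is needed
STATUS=$(mysql -u monitor -pMONITOR_PW -e 'SHOW SLAVE STATUS\G')
IO=$(echo "$STATUS"  | awk '/Slave_IO_Running:/  {print $2}')
SQL=$(echo "$STATUS" | awk '/Slave_SQL_Running:/ {print $2}')
LAG=$(echo "$STATUS" | awk '/Seconds_Behind_Master:/ {print $2}')
if [ "$IO" != "Yes" ] || [ "$SQL" != "Yes" ]; then
    echo "replication threads not running (IO=$IO SQL=$SQL)"
    exit 1
fi
if [ "$LAG" = "NULL" ] || [ "$LAG" -gt 300 ]; then
    echo "slave lag is $LAG seconds"
    exit 1
fi
echo "replication healthy, ${LAG}s behind master"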
From pieter.baele at gmail.com Mon Feb 21 07:54:22 2011 From: pieter.baele at gmail.com (Pieter Baele) Date: Mon, 21 Feb 2011 08:54:22 +0100 Subject: [Linux-cluster] CLVM mirror using Pacemaker (RHEL6) Message-ID: Hi, I added a DLM resource, but when I try to add clvm in crm, I get the following error: crm(live)configure# primitive clvm ocf:lvm2:clvmd params daemon_timeout="30" op monitor interval="60" timeout="60" ERROR: ocf:lvm2:clvmd: could not parse meta-data: ERROR: ocf:lvm2:clvmd: no such resource agent How can I set up clvm (mirroring) using Pacemaker DLM integration? Met vriendelijke groeten, Pieter Baele www.pieterb.be From linux-cluster at redhat.com Mon Feb 21 07:57:29 2011 From: linux-cluster at redhat.com (Mailbot for etexusa.com) Date: Sun, 20 Feb 2011 23:57:29 -0800 Subject: [Linux-cluster] DSN: failed (Message could not be delivered) Message-ID: This is a Delivery Status Notification (DSN). I was unable to deliver your message to guhantex at eth.net. I said RCPT TO: And they gave me the error; 550 5.1.1 unknown or illegal alias: guhantex at eth.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/rfc822-headers Size: 499 bytes Desc: not available URL: From andrew at beekhof.net Mon Feb 21 09:52:33 2011 From: andrew at beekhof.net (Andrew Beekhof) Date: Mon, 21 Feb 2011 10:52:33 +0100 Subject: [Linux-cluster] WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not configured In-Reply-To: <4D5EE9FB.9040407@whiteheat.org.uk> References: <4D5E7474.2080500@whiteheat.org.uk> <4D5EDBFB.7010400@whiteheat.org.uk> <4D5EE9FB.9040407@whiteheat.org.uk> Message-ID: On Fri, Feb 18, 2011 at 10:51 PM, Jonathan Gowar wrote: > On 18/02/11 20:52, Jonathan Gowar wrote: >> >> On 18/02/11 13:30, Jonathan Gowar wrote: >>> >>> I've been following the cluster from scratch guide, by Beekhof. I'm >>> using Debian 6, so I don't know how much that might confuse things; I >>> appreciate there are a few debian-specifics. >>> >>> Before adding the drbd pacemaker resource crm status looked fine. 
After >>> configuring the resource I get the following error from crm_mon:- >>> >>> WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not >>> configured >>> >>> Here is the crm configuration, and monitor:- >>> >>> root at squeeze:~# crm configure show >>> node sleeze >>> node sneeze >>> node squeeze >>> primitive ClusterIP ocf:heartbeat:IPaddr2 \ >>> params ip="xxx.xxx.xxx.xxx" cidr_netmask="32" \ >>> op monitor interval="30s" >>> primitive WebData ocf:linbit:drbd \ >>> params drbd_resource="wwwdata" \ >>> op monitor interval="60s" >>> primitive WebSite ocf:heartbeat:apache \ >>> params configfile="/etc/apache2/apache2.conf" \ >>> op monitor interval="1m" >>> ms WebDataClone WebData \ >>> meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" >>> notify="true" >>> colocation website-with-ip inf: WebSite ClusterIP >>> order apache-after-ip inf: ClusterIP WebSite >>> property $id="cib-bootstrap-options" \ >>> dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \ >>> cluster-infrastructure="openais" \ >>> expected-quorum-votes="3" \ >>> stonith-enabled="false" >>> rsc_defaults $id="rsc-options" \ >>> resource-stickiness="100" >>> root at squeeze:~# crm status >>> ============ >>> Last updated: Fri Feb 18 13:15:53 2011 >>> Stack: openais >>> Current DC: sneeze - partition with quorum >>> Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b >>> 3 Nodes configured, 3 expected votes >>> 3 Resources configured. >>> ============ >>> >>> Online: [ squeeze sneeze sleeze ] >>> >>> ClusterIP (ocf::heartbeat:IPaddr2): Started sneeze >>> WebSite (ocf::heartbeat:apache): Started sneeze >>> Master/Slave Set: WebDataClone >>> Masters: [ squeeze ] >>> Slaves: [ sneeze ] >>> >>> Failed actions: >>> WebData_monitor_0 (node=sleeze, call=4, rc=6, status=complete): not >>> configured >>> WebData_monitor_0 (node=sneeze, call=9, rc=6, status=complete): not >>> configured >>> WebData_monitor_0 (node=squeeze, call=11, rc=6, status=complete): not >>> configured >>> >>> Does anyone have any ideas as to how I might investigate where the >>> problem is. >>> >>> Kind regards, >>> Jon >> >> Hi, >> >> Found out how to debug failing resources:- >> >> http://www.clusterlabs.org/wiki/Debugging_Resource_Failures >> >> I managed to clear 1 problem, fuser was not installed; that means psmisc >> for Debian users. 
>> >> >> root at squeeze:~# crm configure show >> node sleeze >> node sneeze >> node squeeze >> primitive ClusterIP ocf:heartbeat:IPaddr2 \ >> params ip="xxx.xxx.xxx.xxx" cidr_netmask="32" \ >> op monitor interval="30s" >> primitive WebData ocf:linbit:drbd \ >> params drbd_resource="wwwdata" \ >> op monitor interval="60s" >> primitive WebFS ocf:heartbeat:Filesystem \ >> params device="/dev/drbd/by-res/wwwdata" directory="/var/www/drbd" >> fstype="ext4" \ >> meta is-managed="true" >> primitive WebSite ocf:heartbeat:apache \ >> params configfile="/etc/apache2/apache2.conf" \ >> op monitor interval="1m" >> ms WebDataClone WebData \ >> meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" >> notify="true" is-managed="false" >> location cli-prefer-WebSite WebSite \ >> rule $id="cli-prefer-rule-WebSite" inf: #uname eq sleeze >> colocation WebSite-with-WebFS inf: WebSite WebFS >> colocation fs_on_drbd inf: WebFS WebDataClone:Master >> colocation website-with-ip inf: WebSite ClusterIP >> order WebFS-after-WebData inf: WebDataClone:promote WebFS:start >> order WebSite-after-WebFS inf: WebFS WebSite >> order apache-after-ip inf: ClusterIP WebSite >> property $id="cib-bootstrap-options" \ >> dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \ >> cluster-infrastructure="openais" \ >> expected-quorum-votes="3" \ >> stonith-enabled="false" \ >> last-lrm-refresh="1298043091" >> rsc_defaults $id="rsc-options" \ >> resource-stickiness="100" >> >> >> Here are a couple of bad looking lines from the debug output:- >> >> >> /usr/lib/ocf/resource.d/linbit/drbd: 1: [[: not found >> /usr/lib/ocf/resource.d/linbit/drbd: 1: 0x080307: not found >> /usr/lib/ocf/resource.d/linbit/drbd: 1: Bad substitution >> >> >> n.b. See full debug report at http://pastebin.com/pjKxBu8K >> >> OCF Return Code: 2 >> OCF Alias: OCF_ERR_ARGS >> Description: "The resource's configuration is not valid on this machine. >> Eg. Refers to a location/tool not found on the node." >> Recovery Type: hard >> >> Let me know if there's anything else I need to post. >> >> Kind regards, >> Jon > > Hi, > > ?This appeared to be a problem running 3 nodes. ?Stopping corosync on one of > the nodes levitated the problem. > > Is it possible to have a 3 node cluster, 3 running apache, 2 running DRBD? Should be possible > ?If so, can someone point me in the direction of how to. Depends on what errors are being thrown From andrew at beekhof.net Mon Feb 21 09:58:37 2011 From: andrew at beekhof.net (Andrew Beekhof) Date: Mon, 21 Feb 2011 10:58:37 +0100 Subject: [Linux-cluster] MySQL MASTER->SLAVE agent? In-Reply-To: <4D5F1072.1050202@srce.hr> References: <4D5F1072.1050202@srce.hr> Message-ID: On Sat, Feb 19, 2011 at 1:36 AM, Jakov Sosic wrote: > Hi. > > Would it be possible to use the MySQL replication as a way of achieving HA? > > When MASTER is down to take appropriate actions and declare SLAVE the > NEW master, and take on IP address? That would be tricky for rgmanager since it doesn't understand the concept of multi-state resources. I know people have done it with Pacemaker (also available in RHEL6) though. > Has anyone tested this kind of > setup? Obviously, failback should be impossible and request manual action. > > This would pose as a good solution for environments without shared > storage and without DRBD. 
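For the Pacemaker route Andrew mentions, a minimal sketch of the usual shape of such a setup, assuming a replication-capable ocf:heartbeat:mysql agent; the resource names, credentials, the address 192.0.2.10 and the exact parameter names are illustrative and vary between resource-agents versions:

crm configure primitive p_mysql ocf:heartbeat:mysql \
    params binary="/usr/sbin/mysqld" config="/etc/my.cnf" \
           replication_user="repl" replication_passwd="REPL_PW" \
    op monitor interval="30s" role="Slave" \
    op monitor interval="20s" role="Master"
crm configure ms ms_mysql p_mysql \
    meta master-max="1" clone-max="2" notify="true"
crm configure primitive p_vip ocf:heartbeat:IPaddr2 params ip="192.0.2.10" cidr_netmask="24"
crm configure colocation vip-with-master inf: p_vip ms_mysql:Master
crm configure order promote-before-vip inf: ms_mysql:promote p_vip:start

Colocating the floating address with the Master role is what moves clients to whichever node gets promoted.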
> > > -- > Jakov Sosic > www.srce.hr > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From thiagoh at digirati.com.br Mon Feb 21 15:19:41 2011 From: thiagoh at digirati.com.br (Thiago Henrique) Date: Mon, 21 Feb 2011 12:19:41 -0300 Subject: [Linux-cluster] Segfault in GFS2 Message-ID: <1298301581.21845.36.camel@thiagohenrique06> Hello, I'm making a simple test with GFS2: I run simultaneously on both nodes, a script that make write operations in the filesystem. It causes GFS2 to dump a stack trace and fault. I have a cluster configured with two nodes like this: Ubuntu 10.04.1 LTS Kernel 2.6.35-23-generic drbd8-source-2:8.3.7-1ubuntu2.1 drbd8-utils-2:8.3.8.1-0ubuntu1 cman-3.0.2-2ubuntu3.1 libcman3-3.0.2-2ubuntu3.1 gfs2-tools-3.0.2-2ubuntu3.1 Is this known? What other kind of information could be useful to help find this issue? Thanks, -- Thiago Henrique STACK TRACE: ################################################################################ /var/log/kern.log: Feb 20 06:29:39 wcluster1 kernel: [142560.304056] INFO: task gfs2_quotad:1813 blocked for more than 120 seconds. Feb 20 06:29:39 wcluster1 kernel: [142560.304075] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Feb 20 06:29:39 wcluster1 kernel: [142560.304089] gfs2_quotad D f4887e0c 0 1813 2 0x00000000 Feb 20 06:29:39 wcluster1 kernel: [142560.304098] f4887e1c 00000046 00000002 f4887e0c f5778744 c05d99e0 c08c3700 c08c3700 Feb 20 06:29:39 wcluster1 kernel: [142560.304114] e70ea676 00008184 c08c3700 c08c3700 e70c4587 00008184 00000000 c08c3700 Feb 20 06:29:39 wcluster1 kernel: [142560.304123] c08c3700 f545bf70 00000001 f4887e50 00000000 f4887e58 f4887e24 f85ab73d Feb 20 06:29:39 wcluster1 kernel: [142560.304133] Call Trace: Feb 20 06:29:39 wcluster1 kernel: [142560.304174] [] gfs2_glock_holder_wait+0xd/0x20 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304192] [] __wait_on_bit+0x4d/0x70 Feb 20 06:29:39 wcluster1 kernel: [142560.304203] [] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304214] [] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304220] [] out_of_line_wait_on_bit+0xab/0xc0 Feb 20 06:29:39 wcluster1 kernel: [142560.304231] [] ? wake_bit_function+0x0/0x50 Feb 20 06:29:39 wcluster1 kernel: [142560.304242] [] gfs2_glock_wait+0x32/0x40 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304254] [] gfs2_glock_nq+0x29e/0x350 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304266] [] ? default_spin_lock_flags+0x8/0x10 Feb 20 06:29:39 wcluster1 kernel: [142560.304272] [] ? _raw_spin_lock_irqsave+0x2f/0x50 Feb 20 06:29:39 wcluster1 kernel: [142560.304296] [] gfs2_statfs_sync+0x4c/0x1b0 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304304] [] ? del_timer_sync+0x19/0x20 Feb 20 06:29:39 wcluster1 kernel: [142560.304319] [] ? gfs2_statfs_sync+0x44/0x1b0 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304323] [] ? process_timeout+0x0/0x10 Feb 20 06:29:39 wcluster1 kernel: [142560.304337] [] quotad_check_timeo+0x3e/0xa0 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304343] [] ? finish_wait+0x4f/0x70 Feb 20 06:29:39 wcluster1 kernel: [142560.304356] [] gfs2_quotad+0x20a/0x250 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304362] [] ? schedule+0x37a/0x7a0 Feb 20 06:29:39 wcluster1 kernel: [142560.304367] [] ? autoremove_wake_function+0x0/0x50 Feb 20 06:29:39 wcluster1 kernel: [142560.304380] [] ? 
gfs2_quotad+0x0/0x250 [gfs2] Feb 20 06:29:39 wcluster1 kernel: [142560.304386] [] kthread +0x74/0x80 Feb 20 06:29:39 wcluster1 kernel: [142560.304390] [] ? kthread+0x0/0x80 Feb 20 06:29:39 wcluster1 kernel: [142560.304397] [] kernel_thread_helper+0x6/0x10 ################################################################################ From ooolinux at 163.com Tue Feb 22 02:55:00 2011 From: ooolinux at 163.com (yue) Date: Tue, 22 Feb 2011 10:55:00 +0800 (CST) Subject: [Linux-cluster] hi,question about gfs2 Message-ID: <30cb8ebe.1293c.12e4b4a8e15.Coremail.ooolinux@163.com> 1.if i can deploy gfs2 on fedora12. if it is ok to build from source code ? 2.the max node gfs2 can manger? i use san, if i have 100 machines,if gfs2 can work over those nodes? thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From ooolinux at 163.com Tue Feb 22 03:04:39 2011 From: ooolinux at 163.com (yue) Date: Tue, 22 Feb 2011 11:04:39 +0800 (CST) Subject: [Linux-cluster] hi,question about gfs2 Message-ID: <6d94d69.12d04.12e4b5362f4.Coremail.ooolinux@163.com> 1.if i can deploy gfs2 on fedora12. if it is ok to build from source code ? 2.the max node gfs2 can manger? i use san, if i have 100 machines,if gfs2 can work over those nodes? thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew at beekhof.net Tue Feb 22 12:39:33 2011 From: andrew at beekhof.net (Andrew Beekhof) Date: Tue, 22 Feb 2011 13:39:33 +0100 Subject: [Linux-cluster] hi,question about gfs2 In-Reply-To: <30cb8ebe.1293c.12e4b4a8e15.Coremail.ooolinux@163.com> References: <30cb8ebe.1293c.12e4b4a8e15.Coremail.ooolinux@163.com> Message-ID: 2011/2/22 yue : > 1.if i can deploy gfs2 on fedora12. Why would you do that? Isn't F-12 unsupported in less than 2 months? > if it is ok to build from source code > ? > 2.the max node? gfs2 can manger?? i use san, if i have 100 machines,if gfs2 > can work over those nodes? > > thanks > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From rpeterso at redhat.com Tue Feb 22 13:48:42 2011 From: rpeterso at redhat.com (Bob Peterson) Date: Tue, 22 Feb 2011 08:48:42 -0500 (EST) Subject: [Linux-cluster] hi,question about gfs2 In-Reply-To: <6d94d69.12d04.12e4b5362f4.Coremail.ooolinux@163.com> Message-ID: <205773825.125610.1298382522204.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> ----- Original Message ----- | 1.if i can deploy gfs2 on fedora12. if it is ok to build from source | code ? Yes you can. However, as Andrew said, it's probably a mistake. You're better off using Fedora 14 where the code base is newer and it will be supported longer. You can build it from source, but find a source that's compatible with your kernel may be a challenge. GFS2 has advanced to match ongoing kernel development. You can build the Fedora 12 kernel from source RPMS, but you're likely going to encounter bugs that have already been fixed by later revisions. If you go with Fedora 14, it may be easier to compile the latest source from the GFS2 kernel git repo. | 2.the max node gfs2 can manger? i use san, if i have 100 machines,if | gfs2 can work over those nodes? | | thanks GFS2 does not care how many nodes are in your cluster. The only thing that cares is the rest of the cluster infrastructure. However, we don't recommend that many nodes for various reasons. For one thing, your network may be clogged with lots of traffic, which may interfere with proper cluster communications. 
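On the build-from-source half of the question, the userspace side is the easy part. A rough sketch using the gfs2-utils 3.1.1 tarball announced further down in this digest; it assumes a standard autotools build with the corosync and cluster development headers already installed, so expect the details to differ per release:

wget https://fedorahosted.org/releases/g/f/gfs2-utils/gfs2-utils-3.1.1.tar.gz
tar xzf gfs2-utils-3.1.1.tar.gz
cd gfs2-utils-3.1.1
./configure          # from a git checkout, run ./autogen.sh first
make
make install         # as root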
Regards, Bob Peterson Red Hat File Systems From ooolinux at 163.com Wed Feb 23 01:59:09 2011 From: ooolinux at 163.com (yue) Date: Wed, 23 Feb 2011 09:59:09 +0800 (CST) Subject: [Linux-cluster] hi,question about gfs2 In-Reply-To: References: <30cb8ebe.1293c.12e4b4a8e15.Coremail.ooolinux@163.com> Message-ID: <16cbd14.1841.12e503dc631.Coremail.ooolinux@163.com> 1.hi, i do not know fc12 unsupported in less than 2 months now my environment is fc12 2.if there are rpm ,rpm is good. 3.i want gfs2 to host xen image-disk. 100G--200G image file. and need xen live migration amoung gluster. i do not know gfs2's performance. thanks ---------------- At 2011-02-22 20:39:33?"Andrew Beekhof" wrote: >2011/2/22 yue : >> 1.if i can deploy gfs2 on fedora12. > >Why would you do that? Isn't F-12 unsupported in less than 2 months? > >> if it is ok to build from source code >> ? >> 2.the max node gfs2 can manger? i use san, if i have 100 machines,if gfs2 >> can work over those nodes? >> >> thanks >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ad+lists at uni-x.org Wed Feb 23 08:48:45 2011 From: ad+lists at uni-x.org (Alexander Dalloz) Date: Wed, 23 Feb 2011 09:48:45 +0100 Subject: [Linux-cluster] hi,question about gfs2 In-Reply-To: References: <30cb8ebe.1293c.12e4b4a8e15.Coremail.ooolinux@163.com> Message-ID: <4D64C9ED.6020604@uni-x.org> Am 22.02.2011 13:39, schrieb Andrew Beekhof: > 2011/2/22 yue : >> 1.if i can deploy gfs2 on fedora12. > > Why would you do that? Isn't F-12 unsupported in less than 2 months? F 12 *is* unsupported since nearly 3 months! It went EOL on 2010-12-02 https://fedoraproject.org/wiki/Fedora_Release_Life_Cycle#Maintenance_Schedule Alexander From jeff.sturm at eprize.com Wed Feb 23 13:42:23 2011 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Wed, 23 Feb 2011 08:42:23 -0500 Subject: [Linux-cluster] hi,question about gfs2 In-Reply-To: <16cbd14.1841.12e503dc631.Coremail.ooolinux@163.com> References: <30cb8ebe.1293c.12e4b4a8e15.Coremail.ooolinux@163.com> <16cbd14.1841.12e503dc631.Coremail.ooolinux@163.com> Message-ID: <64D0546C5EBBD147B75DE133D798665F0855C1EA@hugo.eprize.local> GFS2 isn't the only way to get live Xen migration. If it's simpler, you can implement CLVM on shared storage, and use logical volumes to contain disk images. Your cluster infrastructure will ensure consistency of volume group metadata and still provide for domain failover. -Jeff From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of yue Sent: Tuesday, February 22, 2011 8:59 PM To: Andrew Beekhof Cc: cluster-devel; linux clustering Subject: Re: [Linux-cluster] hi,question about gfs2 1.hi, i do not know fc12 unsupported in less than 2 months now my environment is fc12 2.if there are rpm ,rpm is good. 3.i want gfs2 to host xen image-disk. 100G--200G image file. and need xen live migration amoung gluster. i do not know gfs2's performance. thanks ---------------- At 2011-02-22 20:39:33?"Andrew Beekhof" wrote: >2011/2/22 yue : >> 1.if i can deploy gfs2 on fedora12. > >Why would you do that? Isn't F-12 unsupported in less than 2 months? > >> if it is ok to build from source code >> ? >> 2.the max node gfs2 can manger? i use san, if i have 100 machines,if gfs2 >> can work over those nodes? 
>> >> thanks >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachar at awst.at Wed Feb 23 14:45:02 2011 From: zachar at awst.at (zachar at awst.at) Date: Wed, 23 Feb 2011 15:45:02 +0100 (CET) Subject: [Linux-cluster] =?utf-8?q?hi=2Cquestion_about_gfs2?= Message-ID: Is this solution supported by RedHat? If I am correct, it isn't: https://access.redhat.com/kb/docs/DOC-17651 With HA-LVM you will loose the live-migration... And if I am correct, as the redhat cluster suite's resource script do not activate the vgs (or lvs) exclusively (that can be an option when you are using clustered LVM), nothing would prevent the second node to start the vm that is already running on the master node (which would probably kill your filesystems in your vm) in a split brain situation (except a "reliable" fencing). If I am correct, the only supported method is CLVMD+GFS(2) if you want live migration for a VM. Is this true? Regards, Balazs Jeff Sturm schrieb: > GFS2 isn't the only way to get live Xen migration. If it's simpler, you > can implement CLVM on shared storage, and use logical volumes to contain > disk images. Your cluster infrastructure will ensure consistency of > volume group metadata and still provide for domain failover. > > > > -Jeff > > > > From: linux-cluster-bounces at redhat.com [mailto: > linux-cluster-bounces at redhat.com] On Behalf Of yue > Sent: Tuesday, February 22, 2011 8:59 PM > To: Andrew Beekhof > Cc: cluster-devel; linux clustering > Subject: Re: [Linux-cluster] hi,question about gfs2 > > > > 1.hi, i do not know fc12 unsupported in less than 2 months > > now my environment is fc12 > > 2.if there are rpm ,rpm is good. > 3.i want gfs2 to host xen image-disk. 100G--200G image file. > and need xen live migration amoung gluster. > i do not know gfs2's performance. > thanks > ---------------- > At 2011-02-22 20:39:33?"Andrew Beekhof" wrote: > > >2011/2/22 yue : > >> 1.if i can deploy gfs2 on fedora12. > > > >Why would you do that? Isn't F-12 unsupported in less than 2 months? > > > >> if it is ok to build from source code > >> ? > >> 2.the max node gfs2 can manger? i use san, if i have 100 machines, > if gfs2 > >> can work over those nodes? > >> > >> thanks > >> > >> > >> -- > >> Linux-cluster mailing list > >> Linux-cluster at redhat.com > >> https://www.redhat.com/mailman/listinfo/linux-cluster > >> > > > From rpeterso at redhat.com Wed Feb 23 18:08:07 2011 From: rpeterso at redhat.com (Bob Peterson) Date: Wed, 23 Feb 2011 13:08:07 -0500 (EST) Subject: [Linux-cluster] New gfs2-utils-3.1.1 release In-Reply-To: <2038341380.151565.1298484449813.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> Message-ID: <1628649254.151580.1298484487360.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> Hi, I just wanted to let everyone know that I just did a new build of gfs2-utils. 
The new source tarball can be downloaded here: https://fedorahosted.org/releases/g/f/gfs2-utils/gfs2-utils-3.1.1.tar.gz To report bugs or issues: https://bugzilla.redhat.com/ Regards, Bob Peterson Red Hat File Systems Changes since 3.1.0: Bob Peterson (14): gfs2_edit savemeta doesn't save all leaf blocks for large dirs fsck.gfs2: segfault in pass1b gfs2_edit: fix segfault in set_bitmap when block is in rgrp gfs2_edit: fix careless compiler warning gfs2_edit: Fix error message on blockalloc when outside bitmap gfs2_edit: add -d option for printing journal details gfs2_edit: has problems printing gfs1 journals gfs2_edit: print large block numbers better gfs2_edit: handle corrupt file systems better fsck.gfs2: can't repair rgrps resulting from gfs_grow->gfs2_convert fsck.gfs2: reports master/root dinodes as unused and fixes the bitmap gfs2-utils: minor corrections to README.build GFS2: mkfs.gfs2 segfaults with 18.55TB and -b512 mkfs.gfs2 should support discard request generation Ben Marzinski (1): gfs2_grow: fix growing on full filesystems Dave Teigland (1): gfs_controld: remove oom_adj Steve Whitehouse (5): libgfs2: Move gfs2_query into gfs2_convert libgfs2: Remove unused function get_sysfs_uinit() libgfs2: Remove calls to gettext from libgfs2 strings: Clean up strings tune: Clean up and make closer to tune2fs From Ning.Bao at statcan.gc.ca Wed Feb 23 18:49:58 2011 From: Ning.Bao at statcan.gc.ca (Ning.Bao at statcan.gc.ca) Date: Wed, 23 Feb 2011 13:49:58 -0500 Subject: [Linux-cluster] question about Fabric fencing in GFS2 Message-ID: Hi I am very new to GFS2. When I am reading GFS2 docs, I have noticed fabric fencing method. If I understand correctly, fabric fencing requires cluster node would be able to login into the SAN switch to disable ports. Does such kind of access put admin password of the SAN switch in cluster.conf in clear text? If yes, it could be a bad idea for storage admins, If not, does storage admins need create a special account which can only disable the ports for the particular host? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From scooter at cgl.ucsf.edu Wed Feb 23 20:17:00 2011 From: scooter at cgl.ucsf.edu (Scooter Morris) Date: Wed, 23 Feb 2011 12:17:00 -0800 Subject: [Linux-cluster] multiple gfs2_tool shrinks cause hang? Message-ID: <4D656B3C.20603@cgl.ucsf.edu> Hi all, I recently had a hang on our cluster that I unwittingly caused and wondered if anyone else has seen anything similar. We were noticing a definitely slow-down in one filesystem and doing some investigation, I noticed that one of the nodes had a large number of locks gfs2_glock in /proc/slabinfo was very large. I decided to try doing a gfs2_tool shrink on the filesystem that was going to slow. I noticed some reduction in the number of locks, but not a lot, so I did it again. Everything dropped into D wait on that filesystem, as did several of the kernel threads. Has anyone else seen this behavior? Is this a known bug? -- scooter From sachinbhugra at hotmail.com Wed Feb 23 20:27:08 2011 From: sachinbhugra at hotmail.com (sachin) Date: Thu, 24 Feb 2011 01:57:08 +0530 Subject: [Linux-cluster] Cluster node hangs In-Reply-To: References: <4D57A763.8030700@redhat.com> <4D57A9F3.90408@redhat.com> Message-ID: Hi Dominic, Below is my cluster.conf: ===================================