From hirantha at vcs.informatics.lk  Tue Nov  1 07:46:07 2005
From: hirantha at vcs.informatics.lk (Hirantha Wijayawardena)
Date: Tue, 1 Nov 2005 13:46:07 +0600
Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced
Message-ID: <1419854.1130833720823.JavaMail.root@ux-mail>

Dear All,

We are setting up a 2-node cluster (node1 and node2) with RHCSv4 on RHELv4 for one of my clients. My hardware is 2 HP DL380s with iLO as the fence device for each node. An MSA500 is the shared storage for both nodes.

The cluster rpms installed successfully with the latest kernel and RHCS updates from RHN. Initially all the services (ccsd, cman, fenced etc.) start smoothly. The issue is that when we unplug the network cable of node1, node2 fences node1 and shuts the machine down; then node1 automatically shuts itself down. Now both nodes are down. So we start one node (say node1) and it hangs on the fencing domain state - when we start the other node (say node2), node2 shuts down node1 and then node2 shuts itself down again. It is very difficult to get a clear picture of these states, since I couldn't work out how to configure the iLO fence device on both nodes.

Please advise how to configure the HP iLOs on both nodes and how to rectify this issue.

Here is my cluster.conf:

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From teigland at redhat.com  Tue Nov  1 15:37:57 2005
From: teigland at redhat.com (David Teigland)
Date: Tue, 1 Nov 2005 09:37:57 -0600
Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced
In-Reply-To: <1419854.1130833720823.JavaMail.root@ux-mail>
References: <1419854.1130833720823.JavaMail.root@ux-mail>
Message-ID: <20051101153757.GA21919@redhat.com>

On Tue, Nov 01, 2005 at 01:46:07PM +0600, Hirantha Wijayawardena wrote:
> Dear All,
>
> We are setting up a 2-node cluster (node1 and node2) with RHCSv4 on RHELv4
> for one of my clients. My hardware is 2 HP DL380s with iLO as the fence
> device for each node. An MSA500 is the shared storage for both nodes.
>
> The cluster rpms installed successfully with the latest kernel and RHCS
> updates from RHN. Initially all the services (ccsd, cman, fenced etc.)
> start smoothly. The issue is that when we unplug the network cable of
> node1, node2 fences node1 and shuts the machine down; then node1
> automatically shuts itself down. Now both nodes are down. So we start one
> node (say node1) and it hangs on the fencing domain state - when we start
> the other node (say node2), node2 shuts down node1 and then node2 shuts
> itself down again. It is very difficult to get a clear picture of these
> states, since I couldn't work out how to configure the iLO fence device
> on both nodes.
>
> Please advise how to configure the HP iLOs on both nodes and how to
> rectify this issue.

In this special two node configuration, if both nodes are still alive they will each try to fence the other by design. We expect that A will fence B before B can fence A -- that's always the case if you have a single fencing device, but with iLO I believe it's possible for both nodes to fence each other in parallel. That would result in both being rebooted instead of just one as we intend. In practice, I'd expect that one node may often be faster than the other by a slight margin resulting in just one node being rebooted. Another way to get around this problem is by using the fenced -f option to specify different post-fail-delay values for the two nodes. 
On node1 do 'fenced -f 1' and on node2 do 'fenced -f 6'. This will give node1 a five second head-start and it should fence node2 before node2 can fence node1. > > > > > > > > > > > > > > > > login="Administrator" name="HPiLO_node1" passwd="RWE232WE"/> > login="Administrator" name="HPiLO_node2" passwd="QWD31D4D"/> > I've never configured fence_ilo before, but you may want to check this. You specify in node A's section how others will fence node A (not how node A will fence another node). So, shouldn't node1 list HPiLO_node1 as its fence device and node2 list HPiLO_node2? Dave From lhh at redhat.com Tue Nov 1 20:40:47 2005 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 01 Nov 2005 15:40:47 -0500 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced In-Reply-To: <20051101153757.GA21919@redhat.com> References: <1419854.1130833720823.JavaMail.root@ux-mail> <20051101153757.GA21919@redhat.com> Message-ID: <1130877647.29380.92.camel@ayanami.boston.redhat.com> On Tue, 2005-11-01 at 09:37 -0600, David Teigland wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > login="Administrator" name="HPiLO_node1" passwd="RWE232WE"/> > > > login="Administrator" name="HPiLO_node2" passwd="QWD31D4D"/> > > > > I've never configured fence_ilo before, but you may want to check this. > You specify in node A's section how others will fence node A > (not how node A will fence another node). So, shouldn't node1 list > HPiLO_node1 as its fence device and node2 list HPiLO_node2? Yes, this is correct - the configuration looks backwards. -- Lon From hirantha at vcs.informatics.lk Wed Nov 2 04:20:43 2005 From: hirantha at vcs.informatics.lk (Hirantha Wijayawardena) Date: Wed, 2 Nov 2005 10:20:43 +0600 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced In-Reply-To: <1130877647.29380.92.camel@ayanami.boston.redhat.com> Message-ID: <7457931.1130907806091.JavaMail.root@ux-mail> Thanks all, But I didn't get - the configuration is backwards!! I'm very new to RHCS but not for Linux Cluster and please help on this. As you know I have 2 fence devices each node has its own. Node1 has HPiLO_node1 Node2 has HPiLO_node2 So I configured as follows Is this correct? And I will do what Dave suggests and let you guys know. Before that please advice me is my configuration on fence devices are correct. Thanks in advance - Hirantha -----Original Message----- From: Lon Hohberger [mailto:lhh at redhat.com] Sent: Wednesday, November 02, 2005 2:41 AM To: linux clustering Cc: Hirantha Wijayawardena Subject: Re: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced On Tue, 2005-11-01 at 09:37 -0600, David Teigland wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > login="Administrator" name="HPiLO_node1" passwd="RWE232WE"/> > > > login="Administrator" name="HPiLO_node2" passwd="QWD31D4D"/> > > > > I've never configured fence_ilo before, but you may want to check this. > You specify in node A's section how others will fence node A > (not how node A will fence another node). So, shouldn't node1 list > HPiLO_node1 as its fence device and node2 list HPiLO_node2? Yes, this is correct - the configuration looks backwards. 
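For illustration, a minimal sketch of what the corrected fence sections would look like - reconstructed from the fencedevice attributes that survived the archive's HTML scrubbing; the clusternode/method layout and the votes values are standard-schema assumptions, not the poster's actual file:

    <clusternode name="node1" votes="1">
      <fence>
        <method name="1">
          <device name="HPiLO_node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" votes="1">
      <fence>
        <method name="1">
          <device name="HPiLO_node2"/>
        </method>
      </fence>
    </clusternode>
    <fencedevices>
      <fencedevice agent="fence_ilo" hostname="10.10.10.1" login="Administrator"
                   name="HPiLO_node1" passwd="RWE232WE"/>
      <fencedevice agent="fence_ilo" hostname="10.10.10.2" login="Administrator"
                   name="HPiLO_node2" passwd="QWD31D4D"/>
    </fencedevices>

That is, node1's fence block names the iLO card inside node1 (the device other nodes use to power node1 off), and likewise for node2.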
-- Lon From hlawatschek at atix.de Wed Nov 2 08:49:25 2005 From: hlawatschek at atix.de (Mark Hlawatschek) Date: Wed, 02 Nov 2005 09:49:25 +0100 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced In-Reply-To: <7457931.1130907806091.JavaMail.root@ux-mail> References: <7457931.1130907806091.JavaMail.root@ux-mail> Message-ID: <1130921365.4109.13.camel@falballa.gallien.atix> Hi Hirantha, The fence device for nodeN has to be the ILO device that is used to fence nodeN - i.e. the ILO device inside nodeN. For the cluster.conf this means: (...) (...) and so on... I hope that helps, Mark On Wed, 2005-11-02 at 10:20 +0600, Hirantha Wijayawardena wrote: > Thanks all, > > But I didn't get - the configuration is backwards!! > > I'm very new to RHCS but not for Linux Cluster and please help on this. > > As you know I have 2 fence devices each node has its own. > > Node1 has HPiLO_node1 > Node2 has HPiLO_node2 > > So I configured as follows > > > > > > > > > > > > > > > > > > > login="Administrator" name="HPiLO_node1" passwd="RWE232WE"/> > login="Administrator" name="HPiLO_node2" > passwd="QWD31D4D"/> > > > Is this correct? > > And I will do what Dave suggests and let you guys know. Before that please > advice me is my configuration on fence devices are correct. > > Thanks in advance > > - Hirantha > > > -----Original Message----- > From: Lon Hohberger [mailto:lhh at redhat.com] > Sent: Wednesday, November 02, 2005 2:41 AM > To: linux clustering > Cc: Hirantha Wijayawardena > Subject: Re: [Linux-cluster] RHSCv4 2-node cluster hangs while > startingfenced > > On Tue, 2005-11-01 at 09:37 -0600, David Teigland wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > login="Administrator" name="HPiLO_node1" > passwd="RWE232WE"/> > > > > > login="Administrator" name="HPiLO_node2" > passwd="QWD31D4D"/> > > > > > > > I've never configured fence_ilo before, but you may want to check this. > > You specify in node A's section how others will fence node A > > (not how node A will fence another node). So, shouldn't node1 list > > HPiLO_node1 as its fence device and node2 list HPiLO_node2? > > Yes, this is correct - the configuration looks backwards. > > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- Mark Hlawatschek From hirantha at vcs.informatics.lk Wed Nov 2 10:40:25 2005 From: hirantha at vcs.informatics.lk (Hirantha Wijayawardena) Date: Wed, 2 Nov 2005 16:40:25 +0600 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced In-Reply-To: <1130921365.4109.13.camel@falballa.gallien.atix> Message-ID: <1472992.1130930589935.JavaMail.root@ux-mail> Dear all, Thanks a lot for support you guys. I managed to start the cluster with fenced. I tried what Dave told me - but I'm not sure whether this is what he told coz I don't have a clue I changed the fence-levels between 2 nodes. Say: Node1 --> HPiLO_node2 --> fence-level-1 Node2 --> HPiLO_node1 --> fence-level-6 But I do not have a clue what is fence-level and what was the different between each level and why is start to work after changing as above. Here is my .conf file and please advice me what is fence-levels and how it works.. 
Thanks all your support - Hirantha -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Mark Hlawatschek Sent: Wednesday, November 02, 2005 2:49 PM To: linux clustering Subject: RE: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced Hi Hirantha, The fence device for nodeN has to be the ILO device that is used to fence nodeN - i.e. the ILO device inside nodeN. For the cluster.conf this means: (...) (...) and so on... I hope that helps, Mark On Wed, 2005-11-02 at 10:20 +0600, Hirantha Wijayawardena wrote: > Thanks all, > > But I didn't get - the configuration is backwards!! > > I'm very new to RHCS but not for Linux Cluster and please help on this. > > As you know I have 2 fence devices each node has its own. > > Node1 has HPiLO_node1 > Node2 has HPiLO_node2 > > So I configured as follows > > > > > > > > > > > > > > > > > > > login="Administrator" name="HPiLO_node1" passwd="RWE232WE"/> > login="Administrator" name="HPiLO_node2" > passwd="QWD31D4D"/> > > > Is this correct? > > And I will do what Dave suggests and let you guys know. Before that please > advice me is my configuration on fence devices are correct. > > Thanks in advance > > - Hirantha > > > -----Original Message----- > From: Lon Hohberger [mailto:lhh at redhat.com] > Sent: Wednesday, November 02, 2005 2:41 AM > To: linux clustering > Cc: Hirantha Wijayawardena > Subject: Re: [Linux-cluster] RHSCv4 2-node cluster hangs while > startingfenced > > On Tue, 2005-11-01 at 09:37 -0600, David Teigland wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > login="Administrator" name="HPiLO_node1" > passwd="RWE232WE"/> > > > > > login="Administrator" name="HPiLO_node2" > passwd="QWD31D4D"/> > > > > > > > I've never configured fence_ilo before, but you may want to check this. > > You specify in node A's section how others will fence node A > > (not how node A will fence another node). So, shouldn't node1 list > > HPiLO_node1 as its fence device and node2 list HPiLO_node2? > > Yes, this is correct - the configuration looks backwards. > > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- Mark Hlawatschek -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From vlad at nkmz.donetsk.ua Wed Nov 2 11:24:09 2005 From: vlad at nkmz.donetsk.ua (Vlad) Date: Wed, 2 Nov 2005 13:24:09 +0200 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced In-Reply-To: <1419854.1130833720823.JavaMail.root@ux-mail> References: <1419854.1130833720823.JavaMail.root@ux-mail> Message-ID: <184378242.20051102132409@nkmz.donetsk.ua> You can configure HP iLOs from server BIOS ( press [F8] on server startup). It's for virtual server power manage, virtual local console manage only. Any local server OS can't see iLO as network device. iLO work as 'http://'. You can access iLO server from network workstation from MS Internet Explorer as 'http://'. For examlpe, in you local network, http://192.168.31.1 I think that you can't use iLO port for server network and cluster. Tuesday, November 1, 2005, 9:46:07 AM, you wrote: HW> Dear All, HW> We are setting up a2 nodecluster(node1 and node2)withRHCSv4 HW> with RHELv4 for one of my clients. My hardware is 2 HPDL380 with HW> iLO as a fence device for each node. MSA500 is shared storage for HW> both nodes. 
HW> The cluster rpms installed successfully with the latest kernel and
HW> RHCS updates from RHN. Initially all the services (ccsd, cman,
HW> fenced etc.) start smoothly. The issue is that when we unplug the
HW> network cable of node1, node2 fences node1 and shuts the machine
HW> down; then node1 automatically shuts itself down. Now both nodes
HW> are down. So we start one node (say node1) and it hangs on the
HW> fencing domain state - when we start the other node (say node2),
HW> node2 shuts down node1 and then node2 shuts itself down again. It
HW> is very difficult to get a clear picture of these states, since I
HW> couldn't work out how to configure the iLO fence device on both nodes.

HW> Please advise how to configure the HP iLOs on both nodes and how to
HW> rectify this issue.

HW> Here is my cluster.conf:

HW> [cluster.conf quoted here in the original message; the XML elements
HW> were stripped by the archive's HTML scrubbing, leaving only the
HW> fencedevice attributes for HPiLO_node1 (10.10.10.1) and
HW> HPiLO_node2 (10.10.10.2)]

-- 
Best regards,
Vlad                          mailto:vlad at nkmz.donetsk.ua

From vlad at nkmz.donetsk.ua  Wed Nov  2 11:27:42 2005
From: vlad at nkmz.donetsk.ua (Vlad)
Date: Wed, 2 Nov 2005 13:27:42 +0200
Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced
In-Reply-To: <1419854.1130833720823.JavaMail.root@ux-mail>
References: <1419854.1130833720823.JavaMail.root@ux-mail>
Message-ID: <1031275036.20051102132742@nkmz.donetsk.ua>

For the cluster interconnect you can use a network switch/hub or a crossover network cable (only for a 2-node cluster).

Tuesday, November 1, 2005, 9:46:07 AM, you wrote:

HW> [original message quoted in full again; the cluster.conf XML was
HW> stripped by the archive]

-- 
Best regards,
Vlad                          mailto:vlad at nkmz.donetsk.ua

From vlad at nkmz.donetsk.ua  Wed Nov  2 11:35:02 2005
From: vlad at nkmz.donetsk.ua (Vlad)
Date: Wed, 2 Nov 2005 13:35:02 +0200
Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced
In-Reply-To: <1419854.1130833720823.JavaMail.root@ux-mail>
References: <1419854.1130833720823.JavaMail.root@ux-mail>
Message-ID: <975731838.20051102133502@nkmz.donetsk.ua>

In an HP DL380 G3, for example, you have 2 embedded NICs (NIC1 and NIC2). One (NIC2) is for the server's public network and the second (NIC1) is for the cluster interconnect network.

Tuesday, November 1, 2005, 9:46:07 AM, you wrote:

HW> [original message quoted in full again; the cluster.conf XML was
HW> stripped by the archive]

-- 
Best regards,
Vlad                          mailto:vlad at nkmz.donetsk.ua

From hirantha at vcs.informatics.lk  Wed Nov  2 12:00:43 2005
From: hirantha at vcs.informatics.lk (Hirantha Wijayawardena)
Date: Wed, 2 Nov 2005 18:00:43 +0600
Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced
In-Reply-To: <184378242.20051102132409@nkmz.donetsk.ua>
Message-ID: <24530537.1130935408898.JavaMail.root@ux-mail>

Hi Vlad,

The RHCS docs say it is possible, and I managed to set up the HP iLO as a fence device. Now my issue is related to RHCS - what are fence levels and how do they work?

Cheers,

- Hirantha

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Vlad
Sent: Wednesday, November 02, 2005 5:24 PM
To: linux clustering
Subject: Re: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced

[Vlad's message of Wed, 2 Nov 2005 13:24:09 +0200 quoted in full, including the stripped cluster.conf]

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

From erwan at seanodes.com  Wed Nov  2 13:21:15 2005
From: erwan at seanodes.com (Velu Erwan)
Date: Wed, 02 Nov 2005 14:21:15 +0100
Subject: [Linux-cluster] Readhead Issues using cluster-1.01.00
In-Reply-To: <4366443F.9000707@seanodes.com>
References: <4366443F.9000707@seanodes.com>
Message-ID: <4368BD4B.6030009@seanodes.com>

Velu Erwan a écrit :
> Hi,
> I've been playing with cluster-1.01.00 and I found the reading very slow.
> I've been trying to setup "max_readahead" using gfs_tool and
> performances are unchanged.
> I've investigated more on that problem with a colleague.

He found that gfs is now using a diaper volume. This volume maps the physical device I provide to gfs, but we had 2 questions about that:

1°) Why is this volume so big? On my system it reaches ~8192 exabytes! 
The first time I saw that I thought it was an error... [root at max4 ~]# cat /proc/partitions | grep -e "major" -e "diapered" major minor #blocks name 252 0 9223372036854775807 diapered_g1v1 [root at max4 ~]# 2?) Regarding the source code, this diaper volume never set the "gd->queue->backing_dev_info.ra_pages" which is set to zero by gd->queue = blk_alloc_queue(GFP_KERNEL); Is it needed to enforce the cache/lock management or is it just a miss ? This could explain why the reading performances are low while gfs_read() makes a generic_file_read() isn't it ? From erling.nygaard at gmail.com Wed Nov 2 13:30:01 2005 From: erling.nygaard at gmail.com (Erling Nygaard) Date: Wed, 2 Nov 2005 14:30:01 +0100 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while starting fenced In-Reply-To: <24530537.1130935408898.JavaMail.root@ux-mail> References: <184378242.20051102132409@nkmz.donetsk.ua> <24530537.1130935408898.JavaMail.root@ux-mail> Message-ID: Vlad, you are misunderstanding what he is trying to use the iLO card for. In this setup is is being used as a fencing device, not an network card. (As you say, it is indeed not a network card) In other word, the iLO card in node A is used by other nodes in the cluster to fence (in this case forcefully power off) node A, the iLO card is _not_ used by node A for any ethernet traffic. To quote from old an old mail on this topic: Fencing is the act of forcefully preventing a node from being able to access resources after that node has been evicted from the cluster This is done in an attempt to avoid corruption. The canonical example of when it is needed is the live-hang scenario, as you described: 1. node A hangs with I/Os pending to a shared file system 2. node B and node C decide that node A is dead and recover resources allocated on node A (including the shared file system) 3. node A resumes normal operation 4. node A completes I/Os to shared file system At this point, the shared file system is probably corrupt. If you're lucky, fsck will fix it -- if you're not, you'll need to restore from backup. I/O fencing (STONITH, STOMITH, or whatever we want to call it) prevents the last step (step 4) from happening. How fencing is done (power cycling via external switch, SCSI reservations, FC zoning, integrated methods like IPMI, iLO, manual intervention, etc.) is unimportant - so long as whatever method is used can guarantee that step 4 can not complete. Please excuse my English -------------- next part -------------- An HTML attachment was scrubbed... URL: From erwan at seanodes.com Wed Nov 2 14:27:17 2005 From: erwan at seanodes.com (Velu Erwan) Date: Wed, 02 Nov 2005 15:27:17 +0100 Subject: [Linux-cluster] [cluster-1.01.00] Lock_gulmd startup position Message-ID: <4368CCC5.9030402@seanodes.com> One of my nodes had crashed (the lockserver node) and some was rebooting... Ok this situation is really weird... but as my lockserver wasn't able to reboot, lock_gulmd of the other nodes are waiting. But lock_gulmd is started at position 22 in the initscript (S22) and usually sshd is started later (S55). It means that if nodes can't find the master node because there is a cluster.conf with a lockserver defined and if your lockserver is not started or reachable your rebooting nodes are blocked and you can't ssh them ! Is it possible to start the lock_gulmd after sshd ? 
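For reference, one hedged way to push lock_gulmd after sshd on a SysV-init system - a sketch only, assuming the init script is the lock_gulmd one at S22 mentioned above and that runlevel 3 is in use:

    # Give lock_gulmd a start priority higher than sshd's S55, for example
    # by raising the start number in the "# chkconfig:" header of
    # /etc/init.d/lock_gulmd and re-registering the script:
    chkconfig --del lock_gulmd
    chkconfig --add lock_gulmd
    # ...or, more crudely, by renaming the runlevel symlink by hand:
    mv /etc/rc3.d/S22lock_gulmd /etc/rc3.d/S56lock_gulmd

Whether starting the lock manager that late is safe for the rest of the cluster startup ordering is a separate question.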
Erwan, From teigland at redhat.com Wed Nov 2 15:31:52 2005 From: teigland at redhat.com (David Teigland) Date: Wed, 2 Nov 2005 09:31:52 -0600 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced In-Reply-To: <1472992.1130930589935.JavaMail.root@ux-mail> References: <1130921365.4109.13.camel@falballa.gallien.atix> <1472992.1130930589935.JavaMail.root@ux-mail> Message-ID: <20051102153152.GB7484@redhat.com> On Wed, Nov 02, 2005 at 04:40:25PM +0600, Hirantha Wijayawardena wrote: > Here is my .conf file and please advice me what is fence-levels and how it > works.. > > > post_join_delay="3"/> > > > > > > > > > > > > > > > > > > > > > No, you've misunderstood. You're cluster.conf file should look like this: Notice that values in have been swapped. If this doesn't work, my other suggestion was to do: node1$ fenced -f 1 node2$ fenced -f 6 which has nothing to do with the cluster.conf file. Dave From lhh at redhat.com Wed Nov 2 16:07:49 2005 From: lhh at redhat.com (Lon Hohberger) Date: Wed, 02 Nov 2005 11:07:49 -0500 Subject: [Fwd: RE: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced] Message-ID: <1130947669.29380.199.camel@ayanami.boston.redhat.com> Whoops, forgot to cc linux-cluster. -- Lon -------------- next part -------------- An embedded message was scrubbed... From: Lon Hohberger Subject: RE: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced Date: Wed, 02 Nov 2005 11:03:14 -0500 Size: 2269 URL: From teigland at redhat.com Wed Nov 2 16:18:16 2005 From: teigland at redhat.com (David Teigland) Date: Wed, 2 Nov 2005 10:18:16 -0600 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs while startingfenced In-Reply-To: <20051102153152.GB7484@redhat.com> References: <1130921365.4109.13.camel@falballa.gallien.atix> <1472992.1130930589935.JavaMail.root@ux-mail> <20051102153152.GB7484@redhat.com> Message-ID: <20051102161816.GA7766@redhat.com> On Wed, Nov 02, 2005 at 09:31:52AM -0600, David Teigland wrote: > Notice that values in have been swapped. If this > doesn't work, my other suggestion was to do: > > node1$ fenced -f 1 > node2$ fenced -f 6 Sorry, this may be confusing. The fenced options are not an alternative to writing a correct cluster.conf file. If, after cluster.conf is correct, you get both nodes fencing each other in parallel, then you may want to try these fenced options. Dave From erwan at seanodes.com Wed Nov 2 16:36:18 2005 From: erwan at seanodes.com (Velu Erwan) Date: Wed, 02 Nov 2005 17:36:18 +0100 Subject: [Linux-cluster] Readhead Issues using cluster-1.01.00 In-Reply-To: <4368BD4B.6030009@seanodes.com> References: <4366443F.9000707@seanodes.com> <4368BD4B.6030009@seanodes.com> Message-ID: <4368EB02.6030402@seanodes.com> Velu Erwan a ?crit : > Velu Erwan a ?crit : > > 1?) Why this volume is so big ? On my system it reaches ~8192 ExaBytes ! > The first time I saw that I thought it was an error... > > [root at max4 ~]# cat /proc/partitions | grep -e "major" -e "diapered" > major minor #blocks name > 252 0 9223372036854775807 diapered_g1v1 > [root at max4 ~]# > I don't know if it's normal or not but gd->capacity is set to zero then -1 is substract. As gd->capacity is a unsigned long we reach the maximum size. > > 2?) Regarding the source code, this diaper volume never set the > "gd->queue->backing_dev_info.ra_pages" which is set to zero by > gd->queue = blk_alloc_queue(GFP_KERNEL); > Is it needed to enforce the cache/lock management or is it just a miss ? 
> This could explain why the reading performances are low while > gfs_read() makes a generic_file_read() isn't it ? I've made this patch which still uses a hardcoded value but where the diapered volume have a ra_pages set. Using 2048 give some excellent results. This patch make the previous one obsolete for sure. Please found it attached. But I don't know how it affects gfs for its cache/lock management because maybe having some pages in cache could create some coherency troubles. What do you think about that ? Erwan, -------------- next part -------------- A non-text attachment was scrubbed... Name: cluster-1.01.00-readahead2.patch Type: text/x-patch Size: 331 bytes Desc: not available URL: From jan at bruvoll.com Wed Nov 2 16:49:04 2005 From: jan at bruvoll.com (Jan Bruvoll) Date: Wed, 02 Nov 2005 16:49:04 +0000 Subject: [Linux-cluster] Upgrade from 32-bit to 64-bit servers Message-ID: <4368EE00.3010907@bruvoll.com> Dear list, as we are quickly filling up our 8Tb GFS file system, we have started planning the migration to 64-bit linux "clients". In that respect I have been looking for information on how to "upgrade" the filesystem, and I haven't been able to find out whether we have to do anything in particular to the filesystem when switching, or whether the on-disk data structures will be the same for 64 bit machines? Can anybody on the list shed light on how we would move on from the 8Tb limitation? TIA & best regards Jan Bruvoll From lhh at redhat.com Wed Nov 2 16:58:15 2005 From: lhh at redhat.com (Lon Hohberger) Date: Wed, 02 Nov 2005 11:58:15 -0500 Subject: [Linux-cluster] Upgrade from 32-bit to 64-bit servers In-Reply-To: <4368EE00.3010907@bruvoll.com> References: <4368EE00.3010907@bruvoll.com> Message-ID: <1130950695.29380.201.camel@ayanami.boston.redhat.com> On Wed, 2005-11-02 at 16:49 +0000, Jan Bruvoll wrote: > Dear list, > > as we are quickly filling up our 8Tb GFS file system, we have started > planning the migration to 64-bit linux "clients". In that respect I have > been looking for information on how to "upgrade" the filesystem, and I > haven't been able to find out whether we have to do anything in > particular to the filesystem when switching, or whether the on-disk data > structures will be the same for 64 bit machines? Can anybody on the list > shed light on how we would move on from the 8Tb limitation? The GFS on-disk metadata structures are 64-bit clean. -- Lon From teigland at redhat.com Wed Nov 2 17:29:35 2005 From: teigland at redhat.com (David Teigland) Date: Wed, 2 Nov 2005 11:29:35 -0600 Subject: [Linux-cluster] Readhead Issues using cluster-1.01.00 In-Reply-To: <4368EB02.6030402@seanodes.com> References: <4366443F.9000707@seanodes.com> <4368BD4B.6030009@seanodes.com> <4368EB02.6030402@seanodes.com> Message-ID: <20051102172935.GB7766@redhat.com> On Wed, Nov 02, 2005 at 05:36:18PM +0100, Velu Erwan wrote: > Velu Erwan a ?crit : > > >Velu Erwan a ?crit : > > > >1?) Why this volume is so big ? On my system it reaches ~8192 ExaBytes ! > >The first time I saw that I thought it was an error... > > > >[root at max4 ~]# cat /proc/partitions | grep -e "major" -e "diapered" > >major minor #blocks name > >252 0 9223372036854775807 diapered_g1v1 > >[root at max4 ~]# > > > I don't know if it's normal or not but gd->capacity is set to zero then > -1 is substract. > As gd->capacity is a unsigned long we reach the maximum size. 
The size of the diaper device is intentionally set to the max size; all requests are then just passed through to the real device regardless of how large the real device is. > >2?) Regarding the source code, this diaper volume never set the > >"gd->queue->backing_dev_info.ra_pages" which is set to zero by > >gd->queue = blk_alloc_queue(GFP_KERNEL); > >Is it needed to enforce the cache/lock management or is it just a miss ? > >This could explain why the reading performances are low while > >gfs_read() makes a generic_file_read() isn't it ? > > I've made this patch which still uses a hardcoded value but where the > diapered volume have a ra_pages set. > Using 2048 give some excellent results. > This patch make the previous one obsolete for sure. Please found it > attached. > But I don't know how it affects gfs for its cache/lock management > because maybe having some pages in cache could create some coherency > troubles. > > What do you think about that ? I don't know, but here are a couple things you might look into: - Did this problem exist a few kernel versions ago? We should try the RHEL4 kernel (or something close to it) and the version of gfs that runs on that (RHEL4 cvs branch). If that version is ok, then there's probably a recent kernel change that we've missed that requires us to do something new. - Remove the diaper code and see if that changes things. Look at these patches where I removed the diaper code from gfs2; do the equivalent for the version of gfs you're playing with: http://sources.redhat.com/ml/cluster-cvs/2005-q3/msg00184.html I suspect that read-ahead should indeed be happening and that something has broken it recently. I think we should first figure out how it worked in the past. Thanks, Dave From jan at bruvoll.com Wed Nov 2 17:53:29 2005 From: jan at bruvoll.com (Jan Bruvoll) Date: Wed, 02 Nov 2005 17:53:29 +0000 Subject: [Linux-cluster] Upgrade from 32-bit to 64-bit servers In-Reply-To: <1130950695.29380.201.camel@ayanami.boston.redhat.com> References: <4368EE00.3010907@bruvoll.com> <1130950695.29380.201.camel@ayanami.boston.redhat.com> Message-ID: <4368FD19.2000601@bruvoll.com> Lon Hohberger wrote: >On Wed, 2005-11-02 at 16:49 +0000, Jan Bruvoll wrote: > > >>Dear list, >> >>as we are quickly filling up our 8Tb GFS file system, we have started >>planning the migration to 64-bit linux "clients". In that respect I have >>been looking for information on how to "upgrade" the filesystem, and I >>haven't been able to find out whether we have to do anything in >>particular to the filesystem when switching, or whether the on-disk data >>structures will be the same for 64 bit machines? Can anybody on the list >>shed light on how we would move on from the 8Tb limitation? >> >> > >The GFS on-disk metadata structures are 64-bit clean. > > Hi, thanks for your reply. Ok, so the ondisk data structures can stay. Is there then anything in particular we need to do to be able to grow the fs past 8Tb, apart from attaching clients that understand "large integers"? Will we have to reformat with a larger blocksize, or can we continue using the current disk contents as-is? 
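For reference, once the size limit itself is out of the way, growing an existing GFS filesystem in place is normally a two-step job - a hedged sketch, assuming CLVM-managed storage and the gfs_grow tool shipped with GFS 6.1; the device names and sizes below are made up:

    lvextend -L +4T /dev/vg_san/lv_gfs   # grow the underlying clustered LV
    gfs_grow /mnt/gfs                    # grow the mounted GFS to fill the new space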
Best regards Jan From dawson at fnal.gov Wed Nov 2 18:02:05 2005 From: dawson at fnal.gov (Troy Dawson) Date: Wed, 02 Nov 2005 12:02:05 -0600 Subject: [Linux-cluster] Upgrade from 32-bit to 64-bit servers In-Reply-To: <1130950695.29380.201.camel@ayanami.boston.redhat.com> References: <4368EE00.3010907@bruvoll.com> <1130950695.29380.201.camel@ayanami.boston.redhat.com> Message-ID: <4368FF1D.3050102@fnal.gov> Lon Hohberger wrote: > On Wed, 2005-11-02 at 16:49 +0000, Jan Bruvoll wrote: > >>Dear list, >> >>as we are quickly filling up our 8Tb GFS file system, we have started >>planning the migration to 64-bit linux "clients". In that respect I have >>been looking for information on how to "upgrade" the filesystem, and I >>haven't been able to find out whether we have to do anything in >>particular to the filesystem when switching, or whether the on-disk data >>structures will be the same for 64 bit machines? Can anybody on the list >>shed light on how we would move on from the 8Tb limitation? > > > The GFS on-disk metadata structures are 64-bit clean. > > -- Lon > I don't believe switching your machines to 64-bit is going to help you. I have both 64 bit and 32 bit using the same GFS file system without any problem, and without doing anything to the filesystem. Also, from the release notes for GFS 6.1 http://www.redhat.com/docs/manuals/csgfs/release-notes/GFS_6_1-relnotes.txt --------------------- Increased total storage supported Red Hat GFS now supports 8 terabytes of storage. For more information about Red Hat GFS requirements, refer to the Red Hat GFS 6.1 Administrator's Guide, "Chapter 2. System Requirements". --------------------- And if you are wondering about the guide, it again says 8 terabytes at http://www.redhat.com/docs/manuals/csgfs/browse/rh-gfs-en/s1-sysreq-fibredevice.html Troy -- __________________________________________________ Troy Dawson dawson at fnal.gov (630)840-6468 Fermilab ComputingDivision/CSS CSI Group __________________________________________________ From rkenna at redhat.com Wed Nov 2 21:39:10 2005 From: rkenna at redhat.com (Rob Kenna) Date: Wed, 02 Nov 2005 16:39:10 -0500 Subject: [Linux-cluster] Upgrade from 32-bit to 64-bit servers In-Reply-To: <4368FF1D.3050102@fnal.gov> References: <4368EE00.3010907@bruvoll.com> <1130950695.29380.201.camel@ayanami.boston.redhat.com> <4368FF1D.3050102@fnal.gov> Message-ID: <436931FE.2060703@redhat.com> Troy Dawson wrote: > Lon Hohberger wrote: > >> On Wed, 2005-11-02 at 16:49 +0000, Jan Bruvoll wrote: >> >>> Dear list, >>> >>> as we are quickly filling up our 8Tb GFS file system, we have started >>> planning the migration to 64-bit linux "clients". In that respect I have >>> been looking for information on how to "upgrade" the filesystem, and I >>> haven't been able to find out whether we have to do anything in >>> particular to the filesystem when switching, or whether the on-disk data >>> structures will be the same for 64 bit machines? Can anybody on the list >>> shed light on how we would move on from the 8Tb limitation? >> >> >> >> The GFS on-disk metadata structures are 64-bit clean. >> >> -- Lon >> > > I don't believe switching your machines to 64-bit is going to help you. > I have both 64 bit and 32 bit using the same GFS file system without > any problem, and without doing anything to the filesystem. 
> > Also, from the release notes for GFS 6.1 > http://www.redhat.com/docs/manuals/csgfs/release-notes/GFS_6_1-relnotes.txt > --------------------- > Increased total storage supported > > Red Hat GFS now supports 8 terabytes of storage. For more > information about Red Hat GFS requirements, refer to the Red Hat GFS 6.1 > Administrator's Guide, "Chapter 2. System Requirements". > --------------------- > > And if you are wondering about the guide, it again says 8 terabytes at This needs some clarification. GFS 6.1 is capable of supporting multiple 16TB filesystems on 32-bit OS architectures and 8EB on 64 bit OS architecures. We are currently supporting multiple 8TB file systems and will officially support larger file systems in time. The point is GFS is a 64 bit filesystem, with the same on-disk layout, regardless of the CPU architecture of the cluster nodes. You can also run mixed 32/64 architecures across x86/EM64T/AMD64/ia64. Obviously, mixed 32/64 combo's can not exceed 16TB. - Rob > > http://www.redhat.com/docs/manuals/csgfs/browse/rh-gfs-en/s1-sysreq-fibredevice.html > > > Troy -- Robert Kenna / Red Hat Sr Product Mgr - Storage 10 Technology Park Drive Westford, MA 01886 o: (978) 392-2410 (x22410) f: (978) 392-1001 c: (978) 771-6314 rkenna at redhat.com From jan at bruvoll.com Thu Nov 3 00:46:52 2005 From: jan at bruvoll.com (Jan Bruvoll) Date: Thu, 03 Nov 2005 00:46:52 +0000 Subject: [Linux-cluster] Upgrade from 32-bit to 64-bit servers In-Reply-To: <436931FE.2060703@redhat.com> References: <4368EE00.3010907@bruvoll.com> <1130950695.29380.201.camel@ayanami.boston.redhat.com> <4368FF1D.3050102@fnal.gov> <436931FE.2060703@redhat.com> Message-ID: <43695DFC.1000402@bruvoll.com> Rob Kenna wrote: > This needs some clarification. GFS 6.1 is capable of supporting > multiple 16TB filesystems on 32-bit OS architectures and 8EB on 64 bit > OS architecures. We are currently supporting multiple 8TB file > systems and will officially support larger file systems in time. The > point is GFS is a 64 bit filesystem, with the same on-disk layout, > regardless of the CPU architecture of the cluster nodes. > > You can also run mixed 32/64 architecures across x86/EM64T/AMD64/ia64. > Obviously, mixed 32/64 combo's can not exceed 16TB. Hi, thanks for the clarification - but I still have the following question: - will we be able to take the already existing filesystem, created on a 32-bit node, use it on a 64-bit platform (i.e. upgrade all the nodes), and from there extend it to whatever size >16Tb we desire, or will we have to adjust blocksizes, etc. to go past the 32-bit platform limit? I understand that the filesystem itself, on-disk, is not the part imposing the limit - however I think I have seen somewhere that one has to use larger block sizes to reach the upper limits of file system size. Is this correct? Thanks again Jan From hirantha at vcs.informatics.lk Thu Nov 3 05:28:16 2005 From: hirantha at vcs.informatics.lk (Hirantha Wijayawardena) Date: Thu, 3 Nov 2005 11:28:16 +0600 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs whilestartingfenced In-Reply-To: <1130947393.29380.195.camel@ayanami.boston.redhat.com> Message-ID: <22648586.1130998264092.JavaMail.root@ux-mail> Hi All, Thanks a lot for your support.. My cluster is running smooth now but I need to change what Dave and Lon suggests. I will re-configure as Dave says and if not second option he suggested. Lon's guide read the same configuration but I didn't get ======> this part What does it means? 
Please advice as I needed to understand to setup RHCS Thanks in advice - Hirantha -----Original Message----- From: Lon Hohberger [mailto:lhh at redhat.com] Sent: Wednesday, November 02, 2005 10:03 PM To: Hirantha Wijayawardena Subject: RE: [Linux-cluster] RHSCv4 2-node cluster hangs whilestartingfenced On Wed, 2005-11-02 at 16:40 +0600, Hirantha Wijayawardena wrote: > Dear all, > > Thanks a lot for support you guys. I managed to start the cluster with > fenced. I tried what Dave told me - but I'm not sure whether this is what he > told coz I don't have a clue > > I changed the fence-levels between 2 nodes. Say: > Node1 --> HPiLO_node2 --> fence-level-1 > Node2 --> HPiLO_node1 --> fence-level-6 The point is that Node1 should have the device(s) controlling *Node1* in its fence block. Node1's fence block is *not* this: "What devices can Node1 use to fence off Node2...?" ... it is: "What devices can other nodes use to fence off Node1...?" I'm going to go out on a limb and state that I think HPiLO_node2 is the iLO card residing in Node2, and HPiLO_node1 is the iLO card which is in Node1. If that is the case, I suspect you need to have your config look like this: Fence levels are sets of procedures to try for fencing. Most users (you included) should only ever need 1 fence level. :) -- Lon From masotti at mclink.it Thu Nov 3 12:28:11 2005 From: masotti at mclink.it (Marco Masotti) Date: Thu, 3 Nov 2005 13:28:11 +0100 (CET) Subject: [Linux-cluster] (no subject) Message-ID: <1.3.200511031327.80317@mclink.it> Hello, I've setup a 2-node cluster using the following versions of software: - cluster 1.01.00 - device-mapper 1.01.05 - LVM2 2.0.1.09 - kernel is 2.6.13-4 SMP (RH-FC4), running on physical SMP machine [biceleron] - kernel is 2.16.13-ARCH (as packaged by Archlinux), running on GSX virtual machine [archlinux], GSX software running on [biceleron] The cluster is formed without any evident problem at startup. My problem: ----------- The problem happens when I try to create a logical volume, getting the following: On the first node [biceleron], with the actual physical disk attached: [root at biceleron]# lvcreate -L10000 -ntest1vg VolGroupHdf Error locking on node archlinux: Internal lvm error, check syslog Failed to activate new LV. On the second node [archlinux], /var/log/daemon.log shows: Nov 3 13:08:48 archlinux lvm[2670]: Volume group for uuid not found: np60FVh26Fpvf3NlNrwM0EIiaNa41un5nR6ShP77FzT5waM6CoS0Bm2vzu0X8Izb Please also note that, locally on [biceleron], the logical volume gets actually created under /dev/VolGroupHdf/test1vg Thank you for any hint Marco M. From masotti at mclink.it Thu Nov 3 12:29:52 2005 From: masotti at mclink.it (Marco Masotti) Date: Thu, 3 Nov 2005 13:29:52 +0100 (CET) Subject: [Linux-cluster] Error locking on node, Internal lvm error, when creating logical volume Message-ID: <1.3.200511031329.81790@mclink.it> Hello, I've setup a 2-node cluster using the following versions of software: - cluster 1.01.00 - device-mapper 1.01.05 - LVM2 2.0.1.09 - kernel is 2.6.13-4 SMP (RH-FC4), running on physical SMP machine [biceleron] - kernel is 2.16.13-ARCH (as packaged by Archlinux), running on GSX virtual machine [archlinux], GSX software running on [biceleron] The cluster is formed without any evident problem at startup. 
My problem: ----------- The problem happens when I try to create a logical volume, getting the following: On the first node [biceleron], with the actual physical disk attached: [root at biceleron]# lvcreate -L10000 -ntest1vg VolGroupHdf Error locking on node archlinux: Internal lvm error, check syslog Failed to activate new LV. On the second node [archlinux], /var/log/daemon.log shows: Nov 3 13:08:48 archlinux lvm[2670]: Volume group for uuid not found: np60FVh26Fpvf3NlNrwM0EIiaNa41un5nR6ShP77FzT5waM6CoS0Bm2vzu0X8Izb Please also note that, locally on [biceleron], the logical volume gets actually created under /dev/VolGroupHdf/test1vg Thank you for any hint Marco M. From lhh at redhat.com Thu Nov 3 15:57:38 2005 From: lhh at redhat.com (Lon Hohberger) Date: Thu, 03 Nov 2005 10:57:38 -0500 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs whilestartingfenced In-Reply-To: <22648586.1130998264092.JavaMail.root@ux-mail> References: <22648586.1130998264092.JavaMail.root@ux-mail> Message-ID: <1131033458.29380.289.camel@ayanami.boston.redhat.com> On Thu, 2005-11-03 at 11:28 +0600, Hirantha Wijayawardena wrote: > Hi All, > > Thanks a lot for your support.. > > My cluster is running smooth now but I need to change what Dave and Lon > suggests. I will re-configure as Dave says and if not second option he > suggested. Lon's guide read the same configuration but I didn't get > > > ======> this part > > > What does it means? It was there for emphasis. -- Lon From erwan at seanodes.com Thu Nov 3 16:21:06 2005 From: erwan at seanodes.com (Velu Erwan) Date: Thu, 03 Nov 2005 17:21:06 +0100 Subject: [Linux-cluster] Readhead Issues using cluster-1.01.00 In-Reply-To: <20051102172935.GB7766@redhat.com> References: <4366443F.9000707@seanodes.com> <4368BD4B.6030009@seanodes.com> <4368EB02.6030402@seanodes.com> <20051102172935.GB7766@redhat.com> Message-ID: <436A38F2.60204@seanodes.com> David Teigland a ?crit : [...] >I don't know, but here are a couple things you might look into: > >- Did this problem exist a few kernel versions ago? We should try the > RHEL4 kernel (or something close to it) and the version of gfs that > runs on that (RHEL4 cvs branch). If that version is ok, then there's > probably a recent kernel change that we've missed that requires us to > do something new. > > > I've diffed the GFS-kernel-2.6.9-42.2.src.rpm and cluster-1.01.00 and code is so close that it can't explain the loss of the readahead. >- Remove the diaper code and see if that changes things. Look at these > patches where I removed the diaper code from gfs2; do the equivalent > for the version of gfs you're playing with: > http://sources.redhat.com/ml/cluster-cvs/2005-q3/msg00184.html > > > It will be my next test but it sounds like a leak in the diaper implementation could be the explanation of the trouble.My patch shows it. Removing the diaper will make gfs using my device where the readahead is already set : so the performances will be there. The gfs version I had tested (which helps me to determine what are the nominal performances) was 6.0 where diaper didn't exists. >I suspect that read-ahead should indeed be happening and that something >has broken it recently. I think we should first figure out how it worked >in the past. > > I don't manage how the diaperer disk could have a readahead value if gfs doesn't specify one even more when the diapered volume can't be tuned using blktool or hdparm. 
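Since the readahead2.patch attachment was scrubbed from the archive, here is a rough sketch of the kind of change being described - not the posted patch, just the identifiers quoted in this thread with illustrative error handling around them:

    /* Sketch only: after the diaper device's queue is allocated, give it
     * a non-zero read-ahead window.  blk_alloc_queue() leaves
     * backing_dev_info.ra_pages at 0, which disables read-ahead on the
     * diapered volume. */
    gd->queue = blk_alloc_queue(GFP_KERNEL);
    if (!gd->queue)
            goto fail;
    /* 2048 pages (~8 MB with 4 KB pages) is the hard-coded value
     * reported above to give good results. */
    gd->queue->backing_dev_info.ra_pages = 2048;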
From teigland at redhat.com Thu Nov 3 17:06:19 2005 From: teigland at redhat.com (David Teigland) Date: Thu, 3 Nov 2005 11:06:19 -0600 Subject: [Linux-cluster] Readhead Issues using cluster-1.01.00 In-Reply-To: <436A38F2.60204@seanodes.com> References: <4366443F.9000707@seanodes.com> <4368BD4B.6030009@seanodes.com> <4368EB02.6030402@seanodes.com> <20051102172935.GB7766@redhat.com> <436A38F2.60204@seanodes.com> Message-ID: <20051103170619.GA22567@redhat.com> On Thu, Nov 03, 2005 at 05:21:06PM +0100, Velu Erwan wrote: > >- Did this problem exist a few kernel versions ago? We should try the > > RHEL4 kernel (or something close to it) and the version of gfs that > > runs on that (RHEL4 cvs branch). If that version is ok, then there's > > probably a recent kernel change that we've missed that requires us to > > do something new. > > I've diffed the GFS-kernel-2.6.9-42.2.src.rpm and cluster-1.01.00 and > code is so close that it can't explain the loss of the readahead. I'm thinking that something changed in the kernel, outside gfs, between 2.6.9-x and 2.6.13 that is causing this. The fact that gfs is the same is probably the problem -- gfs may need to be updated for readahead to work properly on newer kernels. I suspect that gfs-6.1 readahead will work fine on 2.6.9, let's start there. > >- Remove the diaper code and see if that changes things. Look at these > > patches where I removed the diaper code from gfs2; do the equivalent > > for the version of gfs you're playing with: > > http://sources.redhat.com/ml/cluster-cvs/2005-q3/msg00184.html > > > It will be my next test but it sounds like a leak in the diaper > implementation could be the explanation of the trouble.My patch shows it. > Removing the diaper will make gfs using my device where the readahead is > already set : so the performances will be there. OK, so something in the kernel related to block devices and readahead may have changed recently that now requires us to set some new diaper device values (perhaps the ones you've tried.) We need to identify what kernel change that was (go back through kernel changelog) and then figure out the corresponding gfs fix. > The gfs version I had tested (which helps me to determine what are the > nominal performances) was 6.0 where diaper didn't exists. Yes, that's the right idea, and we can narrow in on the regression even further if we find that readahead works on 6.1 + 2.6.9. Then see if it works on 2.6.11, etc. Eventually we'll find the kernel where it broke and can look through the changelog for that one kernel release. Dave From hardyjm at potsdam.edu Thu Nov 3 22:01:57 2005 From: hardyjm at potsdam.edu (Jeff Hardy) Date: Thu, 03 Nov 2005 17:01:57 -0500 Subject: [Linux-cluster] GNBD client, memory starvation Message-ID: <1131055317.2600.31.camel@fritzdesk.potsdam.edu> I am having a few issues with memory exhaustion on gnbd clients when writing large files to a gnbd server that re-exports ATAoE storage. These are files 6 or 7 GB or larger. If I export a SCSI disk from that same gnbd server I do not have issues, so it leads me to believe the slower ATAoE storage relative to local disk is causing the cache to fill up on the gnbd client. On the gnbd server, I export some ATAoE storage that has been chunked up by LVM: gnbd_export -c -d /dev/san00/lv00 -e lv00 And on the client: gnbd_import -i 10.0.0.1 -n Can any one suggest some proc settings to fiddle with, or other things to try? The gnbd server and clients are both FC4. Just one client, so no fencing or anything else. 
I have tried fiddling with different values in a couple of proc settings, and it seems to have helped a bit. In fact, I could not reliably write large files to the exported SCSI disk either, until I made these changes: echo 5 > /proc/sys/vm/dirty_background_ratio echo 5 > /proc/sys/vm/dirty_ratio My testwrite script is a Perl script that just writes 1's to a file. From hirantha at vcs.informatics.lk Fri Nov 4 03:01:43 2005 From: hirantha at vcs.informatics.lk (Hirantha Wijayawardena) Date: Fri, 4 Nov 2005 09:01:43 +0600 Subject: [Linux-cluster] RHSCv4 2-node cluster hangs whilestartingfenced Re In-Reply-To: <1131033458.29380.289.camel@ayanami.boston.redhat.com> Message-ID: <33148349.1131075881147.JavaMail.root@ux-mail> Hi Dave/Lon, Thanks a million for your support.. - Hirantha -----Original Message----- From: Lon Hohberger [mailto:lhh at redhat.com] Sent: Thursday, November 03, 2005 9:58 PM To: Hirantha Wijayawardena Cc: 'linux clustering'; 'David Teigland' Subject: RE: [Linux-cluster] RHSCv4 2-node cluster hangs whilestartingfenced On Thu, 2005-11-03 at 11:28 +0600, Hirantha Wijayawardena wrote: > Hi All, > > Thanks a lot for your support.. > > My cluster is running smooth now but I need to change what Dave and Lon > suggests. I will re-configure as Dave says and if not second option he > suggested. Lon's guide read the same configuration but I didn't get > > > ======> this part > > > What does it means? It was there for emphasis. -- Lon From sgrieve at star-telegram.com Fri Nov 4 05:35:05 2005 From: sgrieve at star-telegram.com (Grieve, Shane) Date: Thu, 3 Nov 2005 23:35:05 -0600 Subject: [Linux-cluster] clvmd Compile problem RHEL4 Message-ID: After checking out LVM2 from the redhat cvs I am have problems getting this to compile on RHEL4 cluster and device-mapper compile fine but LVM errors out with. 
T_INTERNAL -DMIRRORED_INTERNAL -DDEVMAPPER_SUPPORT -DO_DIRECT_SUPPORT -DHAVE_LIBDL -DHAVE_GETOPTLONG -DMODPROBE_CMD=\"/sbin/modprobe\" -fPIC -Wall -Wundef -Wshadow -Wcast-align -Wwrite-strings -Wmissing-prototypes -Wmissing-declarations -Wnested-externs -Winline -O2 -D_REENTRANT -fno-strict-aliasing clvmd.c -o clvmd.o clvmd-command.c: In function `lock_vg': clvmd-command.c:157: warning: implicit declaration of function `hash_create' clvmd-command.c:157: warning: nested extern declaration of `hash_create' clvmd-command.c:157: warning: assignment makes pointer from integer without a cast clvmd-command.c:170: warning: implicit declaration of function `hash_lookup' clvmd-command.c:170: warning: nested extern declaration of `hash_lookup' clvmd-command.c:178: warning: implicit declaration of function `hash_remove' clvmd-command.c:178: warning: nested extern declaration of `hash_remove' clvmd-command.c:186: warning: implicit declaration of function `hash_insert' clvmd-command.c:186: warning: nested extern declaration of `hash_insert' clvmd-command.c: In function `cmd_client_cleanup': clvmd-command.c:274: warning: implicit declaration of function `hash_iterate' clvmd-command.c:274: warning: nested extern declaration of `hash_iterate' clvmd-command.c:274: error: syntax error before '{' token clvmd-command.c:276: warning: implicit declaration of function `hash_get_key' clvmd-command.c:276: warning: nested extern declaration of `hash_get_key' clvmd-command.c:276: warning: initialization makes pointer from integer without a cast clvmd-command.c:279: error: `lkid' undeclared (first use in this function) clvmd-command.c:279: error: (Each undeclared identifier is reported only once clvmd-command.c:279: error: for each function it appears in.) clvmd-command.c:282: warning: implicit declaration of function `hash_destroy' clvmd-command.c:282: warning: nested extern declaration of `hash_destroy' clvmd-command.c:282: error: `lock_hash' undeclared (first use in this function) clvmd-command.c: At top level: clvmd-command.c:285: error: syntax error before '}' token make[2]: *** [clvmd-command.o] Error 1 make[2]: *** Waiting for unfinished jobs.... make[2]: Leaving directory `/cvs/LVM2/daemons/clvmd' make[1]: *** [clvmd] Error 2 make[1]: Leaving directory `/cvs/LVM2/daemons' make: *** [daemons] Error 2 [root at lnewlep LVM2]# uname -a Linux lnewlep 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686 i686 i386 GNU/Linux My configure options were. ./configure --with-clvmd=all --with-cluster=shared --disable-selinux Thanks in Advance Shane -------------- next part -------------- An HTML attachment was scrubbed... URL: From pcaulfie at redhat.com Fri Nov 4 08:20:01 2005 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Fri, 04 Nov 2005 08:20:01 +0000 Subject: [Linux-cluster] clvmd Compile problem RHEL4 In-Reply-To: References: Message-ID: <436B19B1.4000003@redhat.com> Grieve, Shane wrote: > After checking out LVM2 from the redhat cvs I am have problems getting > this to compile on RHEL4 cluster and device-mapper compile fine but LVM > errors out with. > Yep, CVS is in heavy development at the moment. Use a downloaded tarball instead. 
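For example - a sketch only, using the release another poster mentioned (2.01.09) and the configure switches quoted above; pick whatever current tarball matches your device-mapper from the LVM2 download area:

    # Build clvmd from a released LVM2 tarball instead of CVS HEAD
    tar xzf LVM2.2.01.09.tgz
    cd LVM2.2.01.09
    ./configure --with-clvmd=all --with-cluster=shared --disable-selinux
    make
    make install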
-- patrick From sequel at neofreak.org Fri Nov 4 14:14:06 2005 From: sequel at neofreak.org (DeadManMoving) Date: Fri, 04 Nov 2005 09:14:06 -0500 Subject: [Linux-cluster] activeMonitor() Message-ID: <1131113646.26125.9.camel@sequel.info.polymtl.ca> Hi list, is there someone who can tell me (or shed some lights) on what is the purpose of the new activeMonitor() function in the new (rgmanager-1.9.39-0) /usr/share/cluster/fs.sh? I can see it is use before mounting a device or after umounting a device but in my case, i don't see anything in $mp/.clumanager beside the usual rmtab (as i see in the activeMonitor() function, it's suppose to write devmon.data and devmon.pid there). Also, is there any plan to implement quota in fs.sh? Cause i have to manually patch that file each time i upgrade my cluster. Thanks a lot for your time! Tony Lapointe From lhh at redhat.com Fri Nov 4 15:56:29 2005 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 04 Nov 2005 10:56:29 -0500 Subject: [Linux-cluster] activeMonitor() In-Reply-To: <1131113646.26125.9.camel@sequel.info.polymtl.ca> References: <1131113646.26125.9.camel@sequel.info.polymtl.ca> Message-ID: <1131119789.29380.364.camel@ayanami.boston.redhat.com> On Fri, 2005-11-04 at 09:14 -0500, DeadManMoving wrote: > Hi list, > > is there someone who can tell me (or shed some lights) on what is the > purpose of the new activeMonitor() function in the new > (rgmanager-1.9.39-0) /usr/share/cluster/fs.sh? I can see it is use > before mounting a device or after umounting a device but in my case, i > don't see anything in $mp/.clumanager beside the usual rmtab (as i see > in the activeMonitor() function, it's suppose to write devmon.data and > devmon.pid there). Some people don't like the 60-second write check interval, so activeMonitor() spawns a daemon which monitors fs activity using direct I/O in order to attempt to detect file system problems faster (2 second interval). It's a really simple daemon, which exits / reboots depending on the configuration. The only problem is that the daemon is not not there, probably because the developer did something daft and forgot to do "cvs add" before running "cvs commit"... > Also, is there any plan to implement quota in fs.sh? Cause i have to > manually patch that file each time i upgrade my cluster. Why don't you post that patch to linux-cluster? -- Lon From sequel at neofreak.org Fri Nov 4 17:20:40 2005 From: sequel at neofreak.org (DeadManMoving) Date: Fri, 04 Nov 2005 12:20:40 -0500 Subject: [Linux-cluster] activeMonitor() In-Reply-To: <1131119789.29380.364.camel@ayanami.boston.redhat.com> References: <1131113646.26125.9.camel@sequel.info.polymtl.ca> <1131119789.29380.364.camel@ayanami.boston.redhat.com> Message-ID: <1131124840.26125.15.camel@sequel.info.polymtl.ca> On Fri, 2005-11-04 at 10:56 -0500, Lon Hohberger wrote: > On Fri, 2005-11-04 at 09:14 -0500, DeadManMoving wrote: > > Hi list, > > > > is there someone who can tell me (or shed some lights) on what is the > > purpose of the new activeMonitor() function in the new > > (rgmanager-1.9.39-0) /usr/share/cluster/fs.sh? I can see it is use > > before mounting a device or after umounting a device but in my case, i > > don't see anything in $mp/.clumanager beside the usual rmtab (as i see > > in the activeMonitor() function, it's suppose to write devmon.data and > > devmon.pid there). 
> > Some people don't like the 60-second write check interval, so > activeMonitor() spawns a daemon which monitors fs activity using direct > I/O in order to attempt to detect file system problems faster (2 second > interval). > > It's a really simple daemon, which exits / reboots depending on the > configuration. Do you have any idea of what will be the configuration directives (in cluster.conf?) to deal with that daemon? > > The only problem is that the daemon is not not there, probably because > the developer did something daft and forgot to do "cvs add" before > running "cvs commit"... > > > > Also, is there any plan to implement quota in fs.sh? Cause i have to > > manually patch that file each time i upgrade my cluster. > > Why don't you post that patch to linux-cluster? > I'll try as soon as i can, cause right now, the only thing i do is quotaon after the device is mounted and quotaoff before the device is unmounted because i always need quota support (will try to post a patch to make it conditionnal depending if the user specified quota in $mount_options). > -- Lon > Tony Lapointe From lhh at redhat.com Fri Nov 4 17:22:26 2005 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 04 Nov 2005 12:22:26 -0500 Subject: [Linux-cluster] activeMonitor() In-Reply-To: <1131124840.26125.15.camel@sequel.info.polymtl.ca> References: <1131113646.26125.9.camel@sequel.info.polymtl.ca> <1131119789.29380.364.camel@ayanami.boston.redhat.com> <1131124840.26125.15.camel@sequel.info.polymtl.ca> Message-ID: <1131124946.29380.380.camel@ayanami.boston.redhat.com> On Fri, 2005-11-04 at 12:20 -0500, DeadManMoving wrote: > Do you have any idea of what will be the configuration directives (in > cluster.conf?) to deal with that daemon? active_monitor="1" It doesn't work without the daemon, which I will get to committing today or Monday. > I'll try as soon as i can, cause right now, the only thing i do is > quotaon after the device is mounted and quotaoff before the device is > unmounted because i always need quota support (will try to post a patch > to make it conditionnal depending if the user specified quota in > $mount_options). Great! -- Lon From sequel at neofreak.org Fri Nov 4 19:51:34 2005 From: sequel at neofreak.org (DeadManMoving) Date: Fri, 04 Nov 2005 14:51:34 -0500 Subject: [Linux-cluster] activeMonitor() In-Reply-To: <1131124946.29380.380.camel@ayanami.boston.redhat.com> References: <1131113646.26125.9.camel@sequel.info.polymtl.ca> <1131119789.29380.364.camel@ayanami.boston.redhat.com> <1131124840.26125.15.camel@sequel.info.polymtl.ca> <1131124946.29380.380.camel@ayanami.boston.redhat.com> Message-ID: <1131133894.26125.22.camel@sequel.info.polymtl.ca> On Fri, 2005-11-04 at 12:22 -0500, Lon Hohberger wrote: > On Fri, 2005-11-04 at 12:20 -0500, DeadManMoving wrote: > > > Do you have any idea of what will be the configuration directives (in > > cluster.conf?) to deal with that daemon? > > active_monitor="1" > > It doesn't work without the daemon, which I will get to committing today > or Monday. > > > I'll try as soon as i can, cause right now, the only thing i do is > > quotaon after the device is mounted and quotaoff before the device is > > unmounted because i always need quota support (will try to post a patch > > to make it conditionnal depending if the user specified quota in > > $mount_options). > > Great! > Here it is, i can't tell if this is the best way to do it and i can't test it right now cause my cluster is in production and i don't have any other servers available right now. 
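The patch attachment itself was scrubbed from this archive, so as a rough illustration only: the approach described above -- quotaon after the mount, quotaoff before the unmount, and only when quota options appear in the configured mount options -- might be sketched as below. The helper name is made up for illustration and this is not the actual fs_quota.diff; fs.sh's $mp (mount point) and $mount_options variables are assumed.

  # illustrative sketch, not the real fs_quota.diff
  quota_requested() {
          case ",$mount_options," in
          *,quota,*|*,usrquota,*|*,grpquota,*) return 0 ;;
          esac
          return 1
  }
  # after a successful mount:
  quota_requested && quotaon -ug "$mp"
  # just before unmounting:
  quota_requested && quotaoff -ug "$mp"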
Any comments are welcome! > -- Lon > Thanks, Tony Lapointe -------------- next part -------------- A non-text attachment was scrubbed... Name: fs_quota.diff Type: text/x-patch Size: 704 bytes Desc: not available URL: From morehead at choopa.com Thu Nov 3 22:45:57 2005 From: morehead at choopa.com (Dave Morhead) Date: Thu, 03 Nov 2005 17:45:57 -0500 Subject: [Linux-cluster] gfs mount fedora core4 Message-ID: <436A9325.5040707@choopa.com> I have been working to get a small cluster up and running with GFS. I have installed all the required packages with yum, according to the docs (http://www.redhat.com/docs/manuals/csgfs/browse/rh-cs-en/) and (http://www.redhat.com/docs/manuals/csgfs/admin-guide/) however when i attempt to mount my GFS filesystem with this command: mount -t GFS /dev/volume1/lvol0 /clusterfs1 I get the following error: mount: unknown filesystem type 'GFS' im not sure where to go from here, am i missing a package of some sort? on a side note i used lvcreate to make the logical filesystem I am unable to find pool_tools not sure if that has any effect. Thanks! From bunk at stusta.de Fri Nov 4 12:06:40 2005 From: bunk at stusta.de (Adrian Bunk) Date: Fri, 4 Nov 2005 13:06:40 +0100 Subject: [Linux-cluster] [-mm patch] drivers/dlm/: possible cleanups Message-ID: <20051104120640.GB5587@stusta.de> This patch contains the following possible cleanups: - every file should #include the headers containing the prototypes for it's global functions - make needlessly global functions static - #if 0 the following unused global functions: - device.c: dlm_device_free_devices - lock.c: dlm_remove_from_waiters - lockspace.c: dlm_find_lockspace_name Please review which of these changes do make sense. Signed-off-by: Adrian Bunk --- drivers/dlm/ast.c | 1 + drivers/dlm/device.c | 7 ++++--- drivers/dlm/dir.c | 1 + drivers/dlm/lock.c | 4 +++- drivers/dlm/lock.h | 2 -- drivers/dlm/lockspace.c | 2 ++ drivers/dlm/lockspace.h | 1 - drivers/dlm/memory.c | 1 + drivers/dlm/midcomms.c | 1 + drivers/dlm/recover.c | 1 + drivers/dlm/recoverd.c | 1 + drivers/dlm/requestqueue.c | 1 + drivers/dlm/util.c | 1 + 13 files changed, 17 insertions(+), 7 deletions(-) --- linux-2.6.14-rc5-mm1-full/drivers/dlm/ast.c.old 2005-11-04 11:21:45.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/ast.c 2005-11-04 11:21:57.000000000 +0100 @@ -13,6 +13,7 @@ #include "dlm_internal.h" #include "lock.h" +#include "ast.h" #define WAKE_ASTS 0 --- linux-2.6.14-rc5-mm1-full/drivers/dlm/dir.c.old 2005-11-04 11:22:15.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/dir.c 2005-11-04 11:22:26.000000000 +0100 @@ -21,6 +21,7 @@ #include "recover.h" #include "util.h" #include "lock.h" +#include "dir.h" static void put_free_de(struct dlm_ls *ls, struct dlm_direntry *de) --- linux-2.6.14-rc5-mm1-full/drivers/dlm/memory.c.old 2005-11-04 11:22:45.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/memory.c 2005-11-04 11:22:58.000000000 +0100 @@ -13,6 +13,7 @@ #include "dlm_internal.h" #include "config.h" +#include "memory.h" static kmem_cache_t *lkb_cache; --- linux-2.6.14-rc5-mm1-full/drivers/dlm/device.c.old 2005-11-04 11:25:18.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/device.c 2005-11-04 11:26:39.000000000 +0100 @@ -39,7 +39,6 @@ #include #include "lvb_table.h" -#include "device.h" static struct file_operations _dlm_fops; static const char *name_prefix="dlm"; @@ -1050,6 +1049,7 @@ return status; } +#if 0 /* Called when the cluster is shutdown uncleanly, all lockspaces have been summarily removed */ void 
dlm_device_free_devices() @@ -1069,6 +1069,7 @@ } up(&user_ls_lock); } +#endif /* 0 */ static struct file_operations _dlm_fops = { .open = dlm_open, @@ -1089,7 +1090,7 @@ /* * Create control device */ -int __init dlm_device_init(void) +static int __init dlm_device_init(void) { int r; @@ -1110,7 +1111,7 @@ return 0; } -void __exit dlm_device_exit(void) +static void __exit dlm_device_exit(void) { misc_deregister(&ctl_device); } --- linux-2.6.14-rc5-mm1-full/drivers/dlm/lock.h.old 2005-11-04 11:26:57.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/lock.h 2005-11-04 11:28:28.000000000 +0100 @@ -13,7 +13,6 @@ #ifndef __LOCK_DOT_H__ #define __LOCK_DOT_H__ -void dlm_print_lkb(struct dlm_lkb *lkb); void dlm_print_rsb(struct dlm_rsb *r); int dlm_receive_message(struct dlm_header *hd, int nodeid, int recovery); int dlm_modes_compat(int mode1, int mode2); @@ -22,7 +21,6 @@ void dlm_put_rsb(struct dlm_rsb *r); void dlm_hold_rsb(struct dlm_rsb *r); int dlm_put_lkb(struct dlm_lkb *lkb); -int dlm_remove_from_waiters(struct dlm_lkb *lkb); void dlm_scan_rsbs(struct dlm_ls *ls); int dlm_purge_locks(struct dlm_ls *ls); --- linux-2.6.14-rc5-mm1-full/drivers/dlm/lock.c.old 2005-11-04 11:27:20.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/lock.c 2005-11-04 11:28:42.000000000 +0100 @@ -152,7 +152,7 @@ {0, 0, 0, 0, 0, 0, 0, 0} /* PD */ }; -void dlm_print_lkb(struct dlm_lkb *lkb) +static void dlm_print_lkb(struct dlm_lkb *lkb) { printk(KERN_ERR "lkb: nodeid %d id %x remid %x exflags %x flags %x\n" " status %d rqmode %d grmode %d wait_type %d ast_type %d\n", @@ -751,10 +751,12 @@ return error; } +#if 0 int dlm_remove_from_waiters(struct dlm_lkb *lkb) { return remove_from_waiters(lkb); } +#endif /* 0 */ static void dir_remove(struct dlm_rsb *r) { --- linux-2.6.14-rc5-mm1-full/drivers/dlm/lockspace.h.old 2005-11-04 11:28:59.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/lockspace.h 2005-11-04 11:29:06.000000000 +0100 @@ -18,7 +18,6 @@ void dlm_lockspace_exit(void); struct dlm_ls *dlm_find_lockspace_global(uint32_t id); struct dlm_ls *dlm_find_lockspace_local(void *id); -struct dlm_ls *dlm_find_lockspace_name(char *name, int namelen); void dlm_put_lockspace(struct dlm_ls *ls); #endif /* __LOCKSPACE_DOT_H__ */ --- linux-2.6.14-rc5-mm1-full/drivers/dlm/lockspace.c.old 2005-11-04 11:29:17.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/lockspace.c 2005-11-04 11:43:53.000000000 +0100 @@ -239,10 +239,12 @@ return ls; } +#if 0 struct dlm_ls *dlm_find_lockspace_name(char *name, int namelen) { return find_lockspace_name(name, namelen); } +#endif /* 0 */ struct dlm_ls *dlm_find_lockspace_global(uint32_t id) { --- linux-2.6.14-rc5-mm1-full/drivers/dlm/midcomms.c.old 2005-11-04 11:30:11.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/midcomms.c 2005-11-04 11:30:29.000000000 +0100 @@ -29,6 +29,7 @@ #include "config.h" #include "rcom.h" #include "lock.h" +#include "midcomms.h" static void copy_from_cb(void *dst, const void *base, unsigned offset, --- linux-2.6.14-rc5-mm1-full/drivers/dlm/recover.c.old 2005-11-04 11:30:58.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/recover.c 2005-11-04 11:31:11.000000000 +0100 @@ -21,6 +21,7 @@ #include "lock.h" #include "lowcomms.h" #include "member.h" +#include "recover.h" /* --- linux-2.6.14-rc5-mm1-full/drivers/dlm/recoverd.c.old 2005-11-04 11:31:28.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/recoverd.c 2005-11-04 11:31:42.000000000 +0100 @@ -20,6 +20,7 @@ #include "lowcomms.h" #include "lock.h" #include 
"requestqueue.h" +#include "recoverd.h" /* If the start for which we're re-enabling locking (seq) has been superseded --- linux-2.6.14-rc5-mm1-full/drivers/dlm/requestqueue.c.old 2005-11-04 11:32:04.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/requestqueue.c 2005-11-04 11:32:15.000000000 +0100 @@ -15,6 +15,7 @@ #include "lock.h" #include "dir.h" #include "config.h" +#include "requestqueue.h" struct rq_entry { struct list_head list; --- linux-2.6.14-rc5-mm1-full/drivers/dlm/util.c.old 2005-11-04 11:32:32.000000000 +0100 +++ linux-2.6.14-rc5-mm1-full/drivers/dlm/util.c 2005-11-04 11:32:40.000000000 +0100 @@ -12,6 +12,7 @@ #include "dlm_internal.h" #include "rcom.h" +#include "util.h" static void header_out(struct dlm_header *hd) { From sboyd at redhat.com Sun Nov 6 05:33:09 2005 From: sboyd at redhat.com (Sean Boyd) Date: Sun, 06 Nov 2005 15:33:09 +1000 Subject: [Linux-cluster] Error locking on node, Internal lvm error, when creating logical volume In-Reply-To: <1.3.200511031329.81790@mclink.it> References: <1.3.200511031329.81790@mclink.it> Message-ID: <1131255189.14988.109.camel@sboyd.brisbane.redhat.com> On Thu, 2005-11-03 at 13:29 +0100, Marco Masotti wrote: > Hello, > > I've setup a 2-node cluster using the following versions of software: > > - cluster 1.01.00 > - device-mapper 1.01.05 > - LVM2 2.0.1.09 > > - kernel is 2.6.13-4 SMP (RH-FC4), running on physical SMP machine [biceleron] > - kernel is 2.16.13-ARCH (as packaged by Archlinux), running on GSX virtual machine [archlinux], GSX software running on [biceleron] > > > The cluster is formed without any evident problem at startup. > > My problem: > ----------- > The problem happens when I try to create a logical volume, getting the following: > > On the first node [biceleron], with the actual physical disk attached: > > [root at biceleron]# lvcreate -L10000 -ntest1vg VolGroupHdf > Error locking on node archlinux: Internal lvm error, check syslog > Failed to activate new LV. > > > > On the second node [archlinux], /var/log/daemon.log shows: > > Nov 3 13:08:48 archlinux lvm[2670]: Volume group for uuid not > found: np60FVh26Fpvf3NlNrwM0EIiaNa41un5nR6ShP77FzT5waM6CoS0Bm2vzu0X8Izb > This may have to do with the fact that kpartx is not integrated into the rc.sysinit script. Therefore there are no mapped partitions in the /etc/lvm/.cache. Make sure your lvm.conf has filters for the disks associated with the dm device. Make sure you have run kpartx -a (to map dm partitions) then restart the clvm daemon (to populate the cache with the partitions). This is an issue with RHEL4 U2. I haven't checked FC4. --Sean From masotti at mclink.it Sun Nov 6 21:27:52 2005 From: masotti at mclink.it (Marco Masotti) Date: Sun, 6 Nov 2005 22:27:52 +0100 (CET) Subject: [Linux-cluster] Error locking on node, Internal lvm error, Message-ID: <1.3.200511062227.18193@mclink.it> > ========================== > Date: Sun, 06 Nov 2005 15:33:09 +1000 > From: Sean Boyd > To: linux clustering > Cc: masotti at mclink.it > Subject: Re: [Linux-cluster] Error locking on node, Internal > lvm error, when creating logical volume > ========================== [...] Thank you for your reply. It looks like I missed asking myself the question "where is your underlying network block device system?". Banal as it may seem, actually it was not there yet and the two nodes were not sharing any data blocks! As a result, the volume group was not found on the other related cluster member. 
Eventually, if deemed anyhow useful, that may be added as hint in some future troubleshooting guide. In my revised setup, the data blocks were then supplied using an iSCSI-based San, with target and initiator ( open-iscsi and I-scsi Enterprise Target) respectively still running on the two cluster's virtual machines. It is worth saying that performances were pretty good on this 366MHz dual celeron physical host, with dd=/dev/zero of=./gfs/somedata bs=1M count=4096 performing roughly at 8MByte/s > > This may have to do with the fact that kpartx is not integrated > into the > rc.sysinit script. Therefore there are no mapped partitions in > the /etc/lvm/.cache. > > Make sure your lvm.conf has filters for the disks associated > with the dm > device. > > Make sure you have run kpartx -a (to map dm partitions) then > restart > the clvm daemon (to populate the cache with the partitions). > > This is an issue with RHEL4 U2. I haven't checked FC4. > > --Sean > I cannot find any kpartx executable in my software loads, can you please tell which of them is it part of? Thank you Marco M. From sboyd at redhat.com Sun Nov 6 23:22:35 2005 From: sboyd at redhat.com (Sean Boyd) Date: Mon, 07 Nov 2005 09:22:35 +1000 Subject: [Linux-cluster] Error locking on node, Internal lvm error, In-Reply-To: <1.3.200511062227.18193@mclink.it> References: <1.3.200511062227.18193@mclink.it> Message-ID: <1131319355.14988.114.camel@sboyd.brisbane.redhat.com> On Sun, 2005-11-06 at 22:27 +0100, Marco Masotti wrote: > > ========================== > > Date: Sun, 06 Nov 2005 15:33:09 +1000 > > From: Sean Boyd > > To: linux clustering > > Cc: masotti at mclink.it > > Subject: Re: [Linux-cluster] Error locking on node, Internal > > lvm error, when creating logical volume > > ========================== > > [...] > > Thank you for your reply. > It looks like I missed asking myself the question "where is your underlying network block device system?". Banal as it may seem, actually it was not there yet and the two nodes were not sharing any data blocks! As a result, the volume group was not found on the other related cluster member. Eventually, if deemed anyhow useful, that may be added as hint in some future troubleshooting guide. > > In my revised setup, the data blocks were then supplied using an iSCSI-based San, with target and initiator ( open-iscsi and I-scsi Enterprise Target) respectively still running on the two cluster's virtual machines. > > It is worth saying that performances were pretty good on this 366MHz dual celeron physical host, with dd=/dev/zero of=./gfs/somedata bs=1M count=4096 performing roughly at 8MByte/s > > > > > This may have to do with the fact that kpartx is not integrated > > into the > > rc.sysinit script. Therefore there are no mapped partitions in > > the /etc/lvm/.cache. > > > > Make sure your lvm.conf has filters for the disks associated > > with the dm > > device. > > > > Make sure you have run kpartx -a (to map dm partitions) then > > restart > > the clvm daemon (to populate the cache with the partitions). > > > > This is an issue with RHEL4 U2. I haven't checked FC4. > > > > --Sean > > > > I cannot find any kpartx executable in my software loads, can you please tell which of them is it part of? I may have assumed too much. You're not multipathing to the storage are you? In any event l have seen this issue when the underlying devices weren't in the /etc/lvm/.cache file. 
Once the block devices and their partitions were configured correctly a restart of clvm populated the lvm cache and fixed the issue. HTH --Sean From nattaponv at hotmail.com Mon Nov 7 11:37:06 2005 From: nattaponv at hotmail.com (nattapon viroonsri) Date: Mon, 07 Nov 2005 11:37:06 +0000 Subject: [Linux-cluster] Fence device, How it work Message-ID: I tend to use APC AP7900 as fence device. So when failover have occure backup node control APC to power cycle failnode. But im not sure how kind of this device work . Is it just power off or power cycle at outlet that fail node was pluged ? If so failnode was shutdown or reboot uncleanly ? (look like no ups when have no power) Im afraid of data corrupt may occur. Or cluster manager instruct failnode to shutdown before instruct APC to power cycle ? ( So it reboot or shutdown cleanly) Nattapon, Regards _________________________________________________________________ Don't just search. Find. Check out the new MSN Search! http://search.msn.com/ From Axel.Thimm at ATrpms.net Mon Nov 7 12:58:12 2005 From: Axel.Thimm at ATrpms.net (Axel Thimm) Date: Mon, 7 Nov 2005 13:58:12 +0100 Subject: [Linux-cluster] cman multihome setup deprecated? Message-ID: <20051107125812.GB15722@neu.nirvana> Hi, I'm looking into setting up cman multihome & multicast setup. I see that the man page had the multihome bits removed: revision 1.6 date: 2005/02/14 03:53:51; author: teigland; state: Exp; lines: +2 -36 remove multihome setup from man page revision 1.5.2.1 date: 2005/02/14 03:54:46; author: teigland; state: Exp; lines: +2 -36 remove multihome setup from man page It still seems to work, although I have to route the multicasts manually to really have them running over both interfaces (the interface parameter in cluster.conf is ignored). Is this functionality deprecated and to be removed in a future update? Or does it need some work to be republished? What is the recommended precedure for having a dedicated cluster network with a failover setup over a secondary network? E.g. I have all cluster nodes connected over two network, the client ("LAN") network, and a dedicated cluster network ("cluster-net"). What I'd like to achive it to have high-performance by allowing cman/dlm to have its own network, but also high-availablity by using the "LAN" network when the "cluster-net" fails, e.g. broken cable or broken switch. Thanks! -- Axel.Thimm at ATrpms.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From Axel.Thimm at ATrpms.net Mon Nov 7 13:06:03 2005 From: Axel.Thimm at ATrpms.net (Axel Thimm) Date: Mon, 7 Nov 2005 14:06:03 +0100 Subject: [Linux-cluster] Registring multicast IPs at iana Message-ID: <20051107130603.GC15722@neu.nirvana> Hi, shouldn't cman apply for some default multicast addresses assigned by iana? The recommendations found in some cman-over-multicast docs use either the "subnet-broadcast" address, or some RIP2 router stuff. It would be cleaner to use some currently unassigned multicast addresses. Either in 224.0.1/24 if there would be only one or a few or a block higher up. -- Axel.Thimm at ATrpms.net -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From pcaulfie at redhat.com Mon Nov 7 15:17:52 2005 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Mon, 07 Nov 2005 15:17:52 +0000 Subject: [Linux-cluster] Registring multicast IPs at iana In-Reply-To: <20051107130603.GC15722@neu.nirvana> References: <20051107130603.GC15722@neu.nirvana> Message-ID: <436F7020.5010707@redhat.com> Axel Thimm wrote: > Hi, > > shouldn't cman apply for some default multicast addresses assigned by > iana? The recommendations found in some cman-over-multicast docs use > either the "subnet-broadcast" address, or some RIP2 router stuff. > > It would be cleaner to use some currently unassigned multicast > addresses. Either in 224.0.1/24 if there would be only one or a few or > a block higher up. Yes. we should register a multicast address. The default action is to use broadcast and there is no default multicast address. There's one mentioned in some example config files (which I think is the router one you mentioned) -- patrick From pcaulfie at redhat.com Mon Nov 7 15:22:04 2005 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Mon, 07 Nov 2005 15:22:04 +0000 Subject: [Linux-cluster] cman multihome setup deprecated? In-Reply-To: <20051107125812.GB15722@neu.nirvana> References: <20051107125812.GB15722@neu.nirvana> Message-ID: <436F711C.5080306@redhat.com> Axel Thimm wrote: > Hi, > > I'm looking into setting up cman multihome & multicast setup. I see > that the man page had the multihome bits removed: > > revision 1.6 > date: 2005/02/14 03:53:51; author: teigland; state: Exp; lines: +2 -36 > remove multihome setup from man page > > revision 1.5.2.1 > date: 2005/02/14 03:54:46; author: teigland; state: Exp; lines: +2 -36 > remove multihome setup from man page > > It still seems to work, although I have to route the multicasts > manually to really have them running over both interfaces (the > interface parameter in cluster.conf is ignored). > > Is this functionality deprecated and to be removed in a future update? > Or does it need some work to be republished? > > What is the recommended precedure for having a dedicated cluster > network with a failover setup over a secondary network? E.g. I have > all cluster nodes connected over two network, the client ("LAN") > network, and a dedicated cluster network ("cluster-net"). > > What I'd like to achive it to have high-performance by allowing > cman/dlm to have its own network, but also high-availablity by using > the "LAN" network when the "cluster-net" fails, e.g. broken cable or > broken switch. "deprecated" isn't quite the right word! But I'm not sure what is. As it stands, cman in RHEL4 supports multihome. The problem is that the DLM doesn't so it's not much use in most environments. At the head of CVS, cman does not support multihome (though this should get added to AIS soon-ish) but the transport layer of the DLM does :) Currently for failover we recommend the bonding network device which doesn't achieve what you want, I agree. -- patrick From lhh at redhat.com Mon Nov 7 18:38:54 2005 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 07 Nov 2005 13:38:54 -0500 Subject: [Linux-cluster] Fence device, How it work In-Reply-To: References: Message-ID: <1131388734.21393.6.camel@ayanami.boston.redhat.com> On Mon, 2005-11-07 at 11:37 +0000, nattapon viroonsri wrote: > I tend to use APC AP7900 as fence device. > > So when failover have occure backup node control APC to power cycle > failnode. 
> But im not sure how kind of this device work . > Is it just power off or power cycle at outlet that fail node was pluged ? Power-cycle. > If so failnode was shutdown or reboot uncleanly ? (look like no ups when > have no power) It is an "unclean" power-off. > Im afraid of data corrupt may occur. Most journalled file systems will just replay the journal and life will continue after it reboots. > Or cluster manager instruct failnode to shutdown before instruct APC to > power cycle ? ( So it reboot or shutdown cleanly) It does not instruct the node to shut down. It turns off the power to the node. When a node appears down, it is currently assumed that we can no longer communicate with it, so telling it to shut down will not work. (This isn't always the case, but for now, this is how it works.) -- Lon From lhh at redhat.com Mon Nov 7 18:40:07 2005 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 07 Nov 2005 13:40:07 -0500 Subject: [Linux-cluster] Registring multicast IPs at iana In-Reply-To: <436F7020.5010707@redhat.com> References: <20051107130603.GC15722@neu.nirvana> <436F7020.5010707@redhat.com> Message-ID: <1131388807.21393.8.camel@ayanami.boston.redhat.com> On Mon, 2005-11-07 at 15:17 +0000, Patrick Caulfield wrote: > Axel Thimm wrote: > > Yes. we should register a multicast address. The default action is to use > broadcast and there is no default multicast address. There's one mentioned in > some example config files (which I think is the router one you mentioned) 224.0.0.0/24 is the router range, I think... Don't remember. It's been awhile. -- Lon From teigland at redhat.com Mon Nov 7 20:04:31 2005 From: teigland at redhat.com (David Teigland) Date: Mon, 7 Nov 2005 14:04:31 -0600 Subject: [Linux-cluster] Re: [-mm patch] drivers/dlm/: possible cleanups In-Reply-To: <20051104120640.GB5587@stusta.de> References: <20051104120640.GB5587@stusta.de> Message-ID: <20051107200431.GC20531@redhat.com> On Fri, Nov 04, 2005 at 01:06:40PM +0100, Adrian Bunk wrote: > This patch contains the following possible cleanups: > - every file should #include the headers containing the prototypes for > it's global functions Including unnecessary headers doesn't sound right. > - make needlessly global functions static > - #if 0 the following unused global functions: > - device.c: dlm_device_free_devices > - lock.c: dlm_remove_from_waiters > - lockspace.c: dlm_find_lockspace_name I've removed the unused functions and added the statics. Thanks, Dave From masotti at mclink.it Tue Nov 8 09:44:11 2005 From: masotti at mclink.it (Marco Masotti) Date: Tue, 8 Nov 2005 10:44:11 +0100 (CET) Subject: [Linux-cluster] Error locking on node, Internal lvm error, Message-ID: <1.3.200511081044.2037@mclink.it> > ========================== > Date: Mon, 07 Nov 2005 09:22:35 +1000 > From: Sean Boyd > To: Marco Masotti > Cc: linux-cluster at redhat.com > Subject: Re: [Linux-cluster] Error locking on node, Internal > lvm error, > ========================== [...] > > > > I cannot find any kpartx executable in my software loads, can > you please tell which of them is it part of? > > I may have assumed too much. You're not multipathing to the storage > are > you? > I agree and, in fact, my early setup was still primitive with respect of multipath. I downloaded the multipath-tool software and will try to have a look. > In any event l have seen this issue when the underlying devices > weren't > in the /etc/lvm/.cache file. 
Once the block devices and their > partitions > were configured correctly a restart of clvm populated the lvm > cache and > fixed the issue. > Thank you for your hint. Marco M. > HTH > > --Sean > From Axel.Thimm at ATrpms.net Tue Nov 8 10:38:57 2005 From: Axel.Thimm at ATrpms.net (Axel Thimm) Date: Tue, 8 Nov 2005 11:38:57 +0100 Subject: [Linux-cluster] Re: Registring multicast IPs at iana In-Reply-To: <1131388807.21393.8.camel@ayanami.boston.redhat.com> References: <20051107130603.GC15722@neu.nirvana> <436F7020.5010707@redhat.com> <1131388807.21393.8.camel@ayanami.boston.redhat.com> Message-ID: <20051108103857.GB2757@neu.nirvana> On Mon, Nov 07, 2005 at 01:40:07PM -0500, Lon Hohberger wrote: > On Mon, 2005-11-07 at 15:17 +0000, Patrick Caulfield wrote: > > Axel Thimm wrote: > > Yes. we should register a multicast address. The default action is to use > > broadcast and there is no default multicast address. There's one mentioned in > > some example config files (which I think is the router one you mentioned) Perhaps until having a registered multicast address, the docs should use something in the range 239.x.x.x? Perhaps even 239.255.x.x, which is the "minimal enclosing local scope" implying that cluster heartbeat should not go across the whole organizational scope? Otherwise 239.192/14 is for organizational scope. > 224.0.0.0/24 is the router range, I think... Don't remember. It's been > awhile. 224.0.0/24 is for local network control, 224.0.1/24 is internetwork control including stuff like NTP syncing. There are unassigned IPs from 224.0.1.179 upwards. If they don't know what to do with you you get into 224.0.23.x upwards (currently 224.0.23.160). That's for ipv4, ipv6 is another beast. It depends on how many IPs you'd like to reserve. I'd go for at least two to have a default config for redundant intracluster networking. And cman would by default use only the first one. http://www.iana.org/cgi-bin/multicast.pl -- Axel.Thimm at ATrpms.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From Axel.Thimm at ATrpms.net Tue Nov 8 10:45:51 2005 From: Axel.Thimm at ATrpms.net (Axel Thimm) Date: Tue, 8 Nov 2005 11:45:51 +0100 Subject: [Linux-cluster] Re: cman multihome setup deprecated? In-Reply-To: <436F711C.5080306@redhat.com> References: <20051107125812.GB15722@neu.nirvana> <436F711C.5080306@redhat.com> Message-ID: <20051108104551.GC2757@neu.nirvana> On Mon, Nov 07, 2005 at 03:22:04PM +0000, Patrick Caulfield wrote: > Axel Thimm wrote: > > Is this functionality deprecated and to be removed in a future update? > > Or does it need some work to be republished? > > > > What is the recommended precedure for having a dedicated cluster > > network with a failover setup over a secondary network? E.g. I have > > all cluster nodes connected over two network, the client ("LAN") > > network, and a dedicated cluster network ("cluster-net"). > > > > What I'd like to achive it to have high-performance by allowing > > cman/dlm to have its own network, but also high-availablity by using > > the "LAN" network when the "cluster-net" fails, e.g. broken cable or > > broken switch. > > "deprecated" isn't quite the right word! But I'm not sure what is. "in works" ? :) > As it stands, cman in RHEL4 supports multihome. The problem is that the DLM > doesn't so it's not much use in most environments. So what would be the impact if the primary network breaks down? 
cman continues to operate happily, e.g. the heartbeat works, the cluster remains quorate etc., but GFS cannot operate its locks anymore? Would that harm more than it would help? I.e. should I better implement multihoming on a lower level, e.g. a dynamic router on each node? > At the head of CVS, cman does not support multihome (though this should get > added to AIS soon-ish) but the transport layer of the DLM does :) That's for RHEL5 I guess :) > Currently for failover we recommend the bonding network device which doesn't > achieve what you want, I agree. -- Axel.Thimm at ATrpms.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From pcaulfie at redhat.com Tue Nov 8 10:57:18 2005 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Tue, 08 Nov 2005 10:57:18 +0000 Subject: [Linux-cluster] Re: cman multihome setup deprecated? In-Reply-To: <20051108104551.GC2757@neu.nirvana> References: <20051107125812.GB15722@neu.nirvana> <436F711C.5080306@redhat.com> <20051108104551.GC2757@neu.nirvana> Message-ID: <4370848E.6040308@redhat.com> Axel Thimm wrote: > On Mon, Nov 07, 2005 at 03:22:04PM +0000, Patrick Caulfield wrote: > >>Axel Thimm wrote: >> >>>Is this functionality deprecated and to be removed in a future update? >>>Or does it need some work to be republished? >>> >>>What is the recommended precedure for having a dedicated cluster >>>network with a failover setup over a secondary network? E.g. I have >>>all cluster nodes connected over two network, the client ("LAN") >>>network, and a dedicated cluster network ("cluster-net"). >>> >>>What I'd like to achive it to have high-performance by allowing >>>cman/dlm to have its own network, but also high-availablity by using >>>the "LAN" network when the "cluster-net" fails, e.g. broken cable or >>>broken switch. >> >>"deprecated" isn't quite the right word! But I'm not sure what is. > > > "in works" ? :) Something like that, yes >>As it stands, cman in RHEL4 supports multihome. The problem is that the DLM >>doesn't so it's not much use in most environments. > > > So what would be the impact if the primary network breaks down? cman > continues to operate happily, e.g. the heartbeat works, the cluster > remains quorate etc., but GFS cannot operate its locks anymore? > > Would that harm more than it would help? I.e. should I better > implement multihoming on a lower level, e.g. a dynamic router on each > node? It's not useful - that's why the multihome capability in cman is played down or denied. So, if you wanted do do something smarter than bonded interfaces with redundant switches then you'll have to do some smart routing/switching/bonding, yes. > >>At the head of CVS, cman does not support multihome (though this should get >>added to AIS soon-ish) but the transport layer of the DLM does :) > > > That's for RHEL5 I guess :) Certainly. > >>Currently for failover we recommend the bonding network device which doesn't >>achieve what you want, I agree. 
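For reference, the bonding recommended just above is the stock kernel bonding driver; on a RHEL4-era system an active-backup pair for the cluster interface might be configured roughly as follows (device names, IP address and netmask are placeholders):

  # /etc/modprobe.conf
  alias bond0 bonding
  options bond0 mode=1 miimon=100    # mode 1 = active-backup

  # /etc/sysconfig/network-scripts/ifcfg-bond0
  DEVICE=bond0
  IPADDR=192.168.10.11
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none

  # /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise ifcfg-eth1)
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none

This guards against a failed NIC, cable or switch port, but as discussed it is not the same as giving cman and the DLM a second, independent network.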
> -- patrick From mwill at penguincomputing.com Tue Nov 8 15:52:22 2005 From: mwill at penguincomputing.com (Michael Will) Date: Tue, 08 Nov 2005 07:52:22 -0800 Subject: [Linux-cluster] Fence device, How it work In-Reply-To: <1131388734.21393.6.camel@ayanami.boston.redhat.com> References: <1131388734.21393.6.camel@ayanami.boston.redhat.com> Message-ID: <4370C9B6.7070006@penguincomputing.com> Lon Hohberger wrote: >On Mon, 2005-11-07 at 11:37 +0000, nattapon viroonsri wrote: > > >>I tend to use APC AP7900 as fence device. >> >>So when failover have occure backup node control APC to power cycle >>failnode. >>But im not sure how kind of this device work . >>Is it just power off or power cycle at outlet that fail node was pluged ? >> >> > >Power-cycle. > > I always wondered about this. If the node has a problem, chances are that rebooting does not fix it. Now if the node comes up semi-functional and attempts to regain control over the ressource that it owned before, then that could be bad. Should it not rather be shut-down so an human intervention can fix it before it is being made operational again? I/O fencing instead of power fencing kind of works like this, you undo the i/o block once you know the node is fine again. Michael -- Michael Will Penguin Computing Corp. Sales Engineer 415-954-2822 415-954-2899 fx mwill at penguincomputing.com From gwood at dragonhold.org Tue Nov 8 16:56:57 2005 From: gwood at dragonhold.org (Graham Wood) Date: Tue, 8 Nov 2005 16:56:57 -0000 (GMT) Subject: [Linux-cluster] Fence device, How it work In-Reply-To: <4370C9B6.7070006@penguincomputing.com> References: <1131388734.21393.6.camel@ayanami.boston.redhat.com> <4370C9B6.7070006@penguincomputing.com> Message-ID: <56665.208.178.77.200.1131469017.squirrel@208.178.77.200> > Now if the node comes up semi-functional and attempts to regain > control over the ressource that it owned before, then that could > be bad. So don't have it automatically try and do anything with the cluster resources on a boot. The I/O block just guarantees that it doesn't do anything - not having it doesn't force you to do anything with it. From clusterbuilder at gmail.com Tue Nov 8 17:30:08 2005 From: clusterbuilder at gmail.com (Nick I) Date: Tue, 8 Nov 2005 10:30:08 -0700 Subject: [Linux-cluster] backbone of a large-file streaming system Message-ID: Hi, I help maintain a Web site at www.clusterbuilder.org. You might have seen before that I have a section called Ask the Cluster Expert, where I am building a knowledgebase of cluster and grid information. When someone asks a question I am researching the answer to build this knowledgebase out. I received the following question: *"I am looking for a variety of solutions to be a backbone of a large-file streaming system providing thousands of concurrent download streams. Preferably commodity hardware and Linux, though I'm open to commercial solutions."* I am wondering if anyone here has suggestions of what applications will work best for this type of setup. You can respond to the question at www.clusterbuilder.org/FAQ or respond in email. Thanks, Nick -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lhh at redhat.com Tue Nov 8 20:09:17 2005 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 08 Nov 2005 15:09:17 -0500 Subject: [Linux-cluster] activeMonitor() In-Reply-To: <1131133894.26125.22.camel@sequel.info.polymtl.ca> References: <1131113646.26125.9.camel@sequel.info.polymtl.ca> <1131119789.29380.364.camel@ayanami.boston.redhat.com> <1131124840.26125.15.camel@sequel.info.polymtl.ca> <1131124946.29380.380.camel@ayanami.boston.redhat.com> <1131133894.26125.22.camel@sequel.info.polymtl.ca> Message-ID: <1131480557.21393.67.camel@ayanami.boston.redhat.com> On Fri, 2005-11-04 at 14:51 -0500, DeadManMoving wrote: > Here it is, i can't tell if this is the best way to do it and i can't > test it right now cause my cluster is in production and i don't have any > other servers available right now. Any comments are welcome! Actually, it looks pretty good. Thanks! -- Lon From lhh at redhat.com Tue Nov 8 21:40:39 2005 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 08 Nov 2005 16:40:39 -0500 Subject: [Linux-cluster] Fence device, How it work In-Reply-To: <4370C9B6.7070006@penguincomputing.com> References: <1131388734.21393.6.camel@ayanami.boston.redhat.com> <4370C9B6.7070006@penguincomputing.com> Message-ID: <1131486039.21393.153.camel@ayanami.boston.redhat.com> On Tue, 2005-11-08 at 07:52 -0800, Michael Will wrote: > > > >Power-cycle. > > > > > I always wondered about this. If the node has a problem, chances are > that rebooting does not > fix it. Now if the node comes up semi-functional and attempts to regain > control over the ressource > that it owned before, then that could be bad. Should it not rather be > shut-down so an human intervention > can fix it before it is being made operational again? This is a bit long, but maybe it will clear some things up a little. As far as a node taking over a resource it thinks it still has after a reboot (without notifying the other nodes of its intentions), that would be a bug in the cluster software, and a really *bad* one too! A couple of things to remember when thinking about failures and fencing: (a) Failures are rare. A decent PC has something like a 99.95% uptime (I wish I knew where I heard/read this long ago) - with no redundancy at all. A server with ECC RAM, RAID for internal disks, etc. probably has a higher uptime. (b) The hardware component most likely to fail is a hard disk (moving parts). If that's the root hard disk, the machine probably won't boot again. If it's the shared RAID set, then the whole cluster will likely have problems. (c) I hate to say this, but the kernel is probably more likely to fail (panic, hang) than any single piece of hardware. (d) Consider this (I think this is an example of what you said?): 1. Node A fails 2. Node B reboots node A 3. Node A correctly boots and rejoins cluster 4. Node A mounts a GFS file system correctly 5. Node A corrupts the GFS file system What is the chance that 5 will happen without data corruption occurring before 1? Very slim, but nonzero - which brings me to my next point... (e) Always make backups of critical data, no matter what sort of block device or cluster technology you are using. A bad RAM chip (e.g. a parity RAM chip missing double-bit errors) can cause periodic, quiet data corruption. Chances of this happening are also very slim, but again, nonzero.
(f) If you're worried about (d) and are willing to take the expected uptime hit for a given node when that node fails, even given (c), you can always change the cluster configuration to turn "off" a node instead of reboot it. :) (g) You can chkconfig --del the cluster components so that they don't automatically start on reboot; same effect as (f): the node won't reacquire the resources if it never rejoins the cluster... > I/O fencing instead of power fencing kind of works like this, you undo > the i/o block once you know > the node is fine again. Typically, we refer to that as "fabric level fencing" vs. "power level fencing", both fit in with the I/O fencing paradigm in preventing a node from flushing buffers after it has misbehaved. Note that typically the only way to be 100% positive a node has no buffers waiting after it has been fenced at the fabric level is a hard reboot. Many administrators will reboot a failed node as a first attempt to fix it anyway - so we're just saving them a step :) (Again, if you want, you can always do (f) or (g) above...) -- Lon From lhh at redhat.com Tue Nov 8 21:55:58 2005 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 08 Nov 2005 16:55:58 -0500 Subject: [Linux-cluster] Re: Registring multicast IPs at iana In-Reply-To: <20051108103857.GB2757@neu.nirvana> References: <20051107130603.GC15722@neu.nirvana> <436F7020.5010707@redhat.com> <1131388807.21393.8.camel@ayanami.boston.redhat.com> <20051108103857.GB2757@neu.nirvana> Message-ID: <1131486958.21393.155.camel@ayanami.boston.redhat.com> On Tue, 2005-11-08 at 11:38 +0100, Axel Thimm wrote: > 224.0.0/24 is for local network control, 224.0.1/24 is internetwork > control including stuff like NTP syncing. There are unassigned IPs > from 224.0.1.179 upwards. If they don't know what to do with you you > get into 224.0.23.x upwards (currently 224.0.23.160). That's for ipv4, > ipv6 is another beast. True. > It depends on how many IPs you'd like to reserve. I'd go for at least > two to have a default config for redundant intracluster > networking. And cman would by default use only the first one. I wonder if we should just ask for a /20 or so range, so that all F/OSS cluster projects can use the same known range in the future? /me shrugs -- Lon From Axel.Thimm at ATrpms.net Wed Nov 9 01:12:25 2005 From: Axel.Thimm at ATrpms.net (Axel Thimm) Date: Wed, 9 Nov 2005 02:12:25 +0100 Subject: [Linux-cluster] Re: Registring multicast IPs at iana In-Reply-To: <1131486958.21393.155.camel@ayanami.boston.redhat.com> References: <20051107130603.GC15722@neu.nirvana> <436F7020.5010707@redhat.com> <1131388807.21393.8.camel@ayanami.boston.redhat.com> <20051108103857.GB2757@neu.nirvana> <1131486958.21393.155.camel@ayanami.boston.redhat.com> Message-ID: <20051109011225.GA21754@neu.nirvana> On Tue, Nov 08, 2005 at 04:55:58PM -0500, Lon Hohberger wrote: > On Tue, 2005-11-08 at 11:38 +0100, Axel Thimm wrote: > > It depends on how many IPs you'd like to reserve. I'd go for at least > > two to have a default config for redundant intracluster > > networking. And cman would by default use only the first one. > > I wonder if we should just ask for a /20 or so range, so that all F/OSS > cluster projects can use the same known range in the future? /me shrugs /20? Do you think one would need 4096 IPs? I thought that cman and similar cluster constructs *should* be restricted to local or organizational scope, and even if one would assign one IP per cluster that sounds like too much. OTOH better more than less. 
-- Axel.Thimm at ATrpms.net -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From mwill at penguincomputing.com Wed Nov 9 00:46:40 2005 From: mwill at penguincomputing.com (Michael Will) Date: Tue, 08 Nov 2005 16:46:40 -0800 Subject: [Linux-cluster] Fence device, How it work In-Reply-To: <1131486039.21393.153.camel@ayanami.boston.redhat.com> References: <1131388734.21393.6.camel@ayanami.boston.redhat.com> <4370C9B6.7070006@penguincomputing.com> <1131486039.21393.153.camel@ayanami.boston.redhat.com> Message-ID: <437146F0.6020803@penguincomputing.com> I was more thinking along those lines: 1. node A fails 2. node B reboots node A 3. node A fails again because it has not been fixed. now we could have a 2-3-2 loop. worst case situation is that 3. is actually 3.1 node A comes up and starts reaquiring its ressource 3.2 node A fails again because it has not been fixed 3.3 goto 2 Your recommendation f/g is exactly what I was wondering about as an alternative. I know it is possible but try to understand why it would not be the default behavior. In active/passive heartbeat style setups I set the nice-failback option so it does not try to reclaim ressources unless the other node fails, but I wonder what is the best path in a multinode active/active setup. Michael Lon Hohberger wrote: > On Tue, 2005-11-08 at 07:52 -0800, Michael Will wrote: > >>> Power-cycle. >>> >> I always wondered about this. If the node has a problem, chances are >> that rebooting does not >> fix it. Now if the node comes up semi-functional and attempts to regain >> control over the ressource >> that it owned before, then that could be bad. Should it not rather be >> shut-down so an human intervention >> can fix it before it is being made operational again? >> > > This is a bit long, but maybe it will clear some things up a little. As > far as a node taking over a resource it thinks it still has after a > reboot (without notifying the other nodes of its intentions), that would > be a bug the cluster software, and a really *bad* one too! > > A couple of things to remember when thinking about failures and fencing: > > (a) Failures are rare. A decent PC has something like a 99.95% uptime > (I wish I knew where I heard/read this long ago) uptime - with no > redundancy at all. A server with ECC RAM, RAID for internal disks, etc. > probably has a higher uptime. > > (b) The hardware component most likely to fail is a hard disk (moving > parts). If that's the root hard disk, the machine probably won't boot > again. If it's the shared RAID set, then the whole cluster will likely > have problems. > > (c) I hate to say this, but the kernel is probably more likely to fail > (panic, hang) than any single piece of hardware. > > (d) Consider this (I think this is an example of what you said?): > 1. Node A fails > 2. Node B reboots node A > 3. Node A correctly boots and rejoins cluster > 4. Node A mounts a GFS file system correctly > 5. Node A corrupts the GFS file system > > What is the chance that 5 will happen without data corruption occurring > during before 1? Very slim, but nonzero - which brings me to my next > point... > > (e) Always make backups of critical data, no matter what sort of block > device or cluster technology you are using. A bad RAM chip (e.g. an > parity RAM chip missing a double-bit errors) can cause periodic, quiet > data corruption. Chances of this happening are also very slim, but > again, nonzero. 
Probably at least as likely to happen as (d). > > (f) If you're worried about (d) and are willing to take the expected > uptime hit for a given node when that node fails, even given (c), you > can always change the cluster configuration to turn "off" a node instead > of reboot it. :) > > (g) You can chkconfig --del the cluster components so that they don't > automatically start on reboot; same effect as (f): the node won't > reacquire the resources if it never rejoins the cluster... > > > >> I/O fencing instead of power fencing kind of works like this, you undo >> the i/o block once you know >> the node is fine again. >> > > Typically, we refer to that as "fabric level fencing" vs. "power level > fencing", both fit in with the I/O fencing paradigm in preventing a node > from flushing buffers after it has misbehaved. > > Note that typically the only way to be 100% positive a node has no > buffers waiting after it has been fenced at the fabric level is a hard > reboot. > > Many administrators will reboot a failed node as a first attempt to fix > it anyway - so we're just saving them a step :) (Again, if you want, > you can always do (f) or (g) above...) > > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Michael Will Penguin Computing Corp. Sales Engineer 415-954-2822 415-954-2899 fx mwill at penguincomputing.com From edwardam at interlix.com Wed Nov 9 06:46:30 2005 From: edwardam at interlix.com (Edward Muller) Date: Wed, 9 Nov 2005 00:46:30 -0600 Subject: [Linux-cluster] OOPS Message-ID: <200511090046.30188.edwardam@interlix.com> Using version 1.01.00 of the cluster software with version 2.01.09 of lvm/clvm on an aoe device I get the following error, repeatable under high IO (rsync triggers it). When I use the same aoe device with a normal filesystem (i.e. not through lvm) the oops does not occur. Also the OOPS goes away when I disable highmem for my kernel. I've been able to create this oops on 2.6.13-gentoo-r5 (yes, I'm using gentoo) and vanilla 2.6.14. If anyone needs any more information please let me know. 
Oops: 0002 [#1] Modules linked in: dlm cman ipv6 usbcore e1000 aoe CPU: 0 EIP: 0060:[] Not tainted VLI EFLAGS: 00010006 (2.6.14) EIP is at aoecmd_ata_rsp+0x20d/0x370 [aoe] eax: 00000400 ebx: f631083c ecx: 00000100 edx: 00000400 esi: f595d036 edi: 00000400 ebp: f78d3000 esp: c050fdc0 ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c050e000 task=c04a3ba0) Stack: f78d3000 00000002 00000000 f7edd740 00000003 00000000 00000024 f7c70f80 00000286 c9eb0634 f7c503c0 00000086 f4500034 41e01640 f7c13900 d6aad39c c1bec680 c013f4c3 c1bb1a00 f7c06f48 f7c06f48 d6aad39c 00000092 f7c06f48 Call Trace: [] mempool_free+0x33/0x80 [] __freed_request+0x9c/0xb0 [] elv_queue_empty+0x26/0x30 [] __alloc_skb+0x55/0x130 [] aoenet_rcv+0xd4/0x150 [aoe] [] e1000_alloc_rx_buffers+0x7b/0x3b0 [e1000] [] netif_receive_skb+0x22c/0x300 [] e1000_clean_rx_irq+0x17d/0x4b0 [e1000] [] add_timer_randomness+0x15d/0x160 [] e1000_clean+0x4b/0xf0 [e1000] [] net_rx_action+0x74/0x100 [] __do_softirq+0x7b/0x90 [] do_softirq+0x26/0x30 [] do_IRQ+0x1e/0x30 [] common_interrupt+0x1a/0x20 [] mwait_idle+0x33/0x50 [] acpi_processor_idle+0x100/0x299 [] cpu_idle+0x50/0x60 [] start_kernel+0x17d/0x1c0 [] unknown_bootoption+0x0/0x1e0 Code: 00 00 00 e9 53 ff ff ff 0f b6 51 02 8b 84 24 bc 00 00 00 8b 48 58 c1 e2 09 8d 41 dc 39 d0 72 1b 89 d1 8b 7b 0c 83 c6 0c c1 e9 02 a5 89 d1 83 e1 03 74 02 f3 a4 e9 c3 fe ff ff 89 4c 24 04 c7 <0>Kernel panic - not syncing: Fatal exception in interrupt -- Edward Muller - Interlix edwardam at interlix.com 417-862-0573 PGP Key: http://interlix.com/Members/edwardam/pgpkeys -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From lhh at redhat.com Wed Nov 9 15:09:32 2005 From: lhh at redhat.com (Lon Hohberger) Date: Wed, 09 Nov 2005 10:09:32 -0500 Subject: [Linux-cluster] Fence device, How it work In-Reply-To: <437146F0.6020803@penguincomputing.com> References: <1131388734.21393.6.camel@ayanami.boston.redhat.com> <4370C9B6.7070006@penguincomputing.com> <1131486039.21393.153.camel@ayanami.boston.redhat.com> <437146F0.6020803@penguincomputing.com> Message-ID: <1131548972.5741.9.camel@ayanami.boston.redhat.com> On Tue, 2005-11-08 at 16:46 -0800, Michael Will wrote: > In active/passive heartbeat style setups I set the nice-failback > option so it does not try to reclaim ressources unless the other > node fails, but I wonder what is the best path in a multinode > active/active setup. Oh! That's the default behavior. Place the services where you want them on cluster startup, and they won't go back to that node after a failover unless you set them up to do so. There's no clean way to specify "preferred node - but only care at the first quorum formation" at the moment for several reasons. So, the "dead cluster -> form quorum -> place service X on node Y" doesn't work. This is partially because we don't really know the quorum incarnation #s (we have to function without them), and partially because nodes can join the cluster but never start the resource manager pieces (ex: if they were only mounting GFS volumes). 
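To illustrate the default behaviour described above: with an unordered failover domain, a service stays on whichever node it was started on and is only relocated when its current owner fails, with no automatic fail-back. A rough rgmanager fragment for cluster.conf (all names are placeholders) might look like:

  <rm>
      <failoverdomains>
          <!-- ordered="0": no preferred node, so no automatic fail-back -->
          <failoverdomain name="mydomain" ordered="0" restricted="0">
              <failoverdomainnode name="node1" priority="1"/>
              <failoverdomainnode name="node2" priority="1"/>
          </failoverdomain>
      </failoverdomains>
      <service name="myservice" domain="mydomain" autostart="1">
          <!-- ip / fs / script resources go here -->
      </service>
  </rm>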
-- Lon From rstevens at vitalstream.com Wed Nov 9 16:48:18 2005 From: rstevens at vitalstream.com (Rick Stevens) Date: Wed, 09 Nov 2005 08:48:18 -0800 Subject: [Linux-cluster] Fence device, How it work In-Reply-To: <437146F0.6020803@penguincomputing.com> References: <1131388734.21393.6.camel@ayanami.boston.redhat.com> <4370C9B6.7070006@penguincomputing.com> <1131486039.21393.153.camel@ayanami.boston.redhat.com> <437146F0.6020803@penguincomputing.com> Message-ID: <1131554898.22937.44.camel@prophead.corp.publichost.com> On Tue, 2005-11-08 at 16:46 -0800, Michael Will wrote: > I was more thinking along those lines: > > 1. node A fails > 2. node B reboots node A > 3. node A fails again because it has not been fixed. > > now we could have a 2-3-2 loop. worst case situation is > that 3. is actually > 3.1 node A comes up and starts reaquiring its ressource > 3.2 node A fails again because it has not been fixed > 3.3 goto 2 > > Your recommendation f/g is exactly what I was wondering about > as an alternative. I know it is possible but try to understand > why it would not be the default behavior. > > In active/passive heartbeat style setups I set the nice-failback > option so it does not try to reclaim ressources unless the other > node fails, but I wonder what is the best path in a multinode > active/active setup. IMHO, an auto reboot is never a good option. Theoretically, node A failed for some reason, and a human should examine it to find out what the problem is/was. Recovering a fenced node should require manual operator intervention--if for no other reason than to verify that a reboot will not cause a repeat of the incident. Fencing should a) turn off the fenced node's ability to reacquire resources; b) power down the fenced node (if possible); and c) alert the operator that fencing occurred. > Lon Hohberger wrote: > > On Tue, 2005-11-08 at 07:52 -0800, Michael Will wrote: > > > >>> Power-cycle. > >>> > >> I always wondered about this. If the node has a problem, chances are > >> that rebooting does not > >> fix it. Now if the node comes up semi-functional and attempts to regain > >> control over the ressource > >> that it owned before, then that could be bad. Should it not rather be > >> shut-down so an human intervention > >> can fix it before it is being made operational again? > >> > > > > This is a bit long, but maybe it will clear some things up a little. As > > far as a node taking over a resource it thinks it still has after a > > reboot (without notifying the other nodes of its intentions), that would > > be a bug the cluster software, and a really *bad* one too! > > > > A couple of things to remember when thinking about failures and fencing: > > > > (a) Failures are rare. A decent PC has something like a 99.95% uptime > > (I wish I knew where I heard/read this long ago) uptime - with no > > redundancy at all. A server with ECC RAM, RAID for internal disks, etc. > > probably has a higher uptime. > > > > (b) The hardware component most likely to fail is a hard disk (moving > > parts). If that's the root hard disk, the machine probably won't boot > > again. If it's the shared RAID set, then the whole cluster will likely > > have problems. > > > > (c) I hate to say this, but the kernel is probably more likely to fail > > (panic, hang) than any single piece of hardware. > > > > (d) Consider this (I think this is an example of what you said?): > > 1. Node A fails > > 2. Node B reboots node A > > 3. Node A correctly boots and rejoins cluster > > 4. 
From thomsonr at ucalgary.ca Wed Nov 9 17:17:49 2005
From: thomsonr at ucalgary.ca (Ryan Thomson)
Date: Wed, 9 Nov 2005 10:17:49 -0700 (MST)
Subject: [Linux-cluster] clusterfs.sh returning generic error (1)
Message-ID: <49172.136.159.234.44.1131556669.squirrel@136.159.234.44>

Hi list,

I'm having some issues setting up a GFS mount with an NFS export on
RHEL4 using the latest cluster suite packages from RHN. I'm using GFS
from CVS (RHEL4 branch) and LVM2 (clvmd) built from the source tarball
(2.2.01.09), if that makes any difference.

The problem I am having is this: I set up a service with a GFS
resource, an NFS export resource and an NFS client resource. The
service starts fine and I can mount the NFS export over the network
from clients. After one minute, and every minute after that, errors
appear in my logs and the service is restarted.

I looked at clusterfs.sh and saw that it is supposed to do an
"isMounted" check every minute... but how is that check failing if I
can access everything just fine, locally and over NFS?
Here is the error as I am seeing it in /var/log/messages:

Nov 9 10:00:59 wolverine clurgmgrd[6901]: status on clusterfs "people" returned 1 (generic error)
Nov 9 10:00:59 wolverine clurgmgrd[6901]: Stopping service NFS people
Nov 9 10:00:59 wolverine clurgmgrd: [6901]: Removing IPv4 address 136.159.***.*** from eth0
Nov 9 10:00:59 wolverine clurgmgrd: [6901]: Removing export: 136.159.***.0/24:/people
Nov 9 10:00:59 wolverine clurgmgrd: [6901]: unmounting /dev/mapper/BIOCOMP-people (/people)
Nov 9 10:00:59 wolverine clurgmgrd[6901]: Service NFS people is recovering
Nov 9 10:00:59 wolverine clurgmgrd[6901]: Recovering failed service NFS people
Nov 9 10:01:00 wolverine kernel: GFS: Trying to join cluster "lock_nolock", ""
Nov 9 10:01:00 wolverine kernel: GFS: fsid=dm-1.0: Joined cluster. Now mounting FS...
Nov 9 10:01:00 wolverine kernel: GFS: fsid=dm-1.0: jid=0: Trying to acquire journal lock...
Nov 9 10:01:00 wolverine kernel: GFS: fsid=dm-1.0: jid=0: Looking at journal...
Nov 9 10:01:00 wolverine kernel: GFS: fsid=dm-1.0: jid=0: Done
Nov 9 10:01:00 wolverine clurmtabd[27592]: #20: Failed set log level
Nov 9 10:01:00 wolverine clurgmgrd: [6901]: Adding export: 136.159.***.0/24:/people (rw,sync)
Nov 9 10:01:00 wolverine clurgmgrd: [6901]: Adding IPv4 address 136.159.***.*** to eth0
Nov 9 10:01:01 wolverine clurgmgrd[6901]: Service NFS people started

And here is my cluster.conf file:
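On the "isMounted" question above: the status operation that clurgmgrd
runs every minute essentially verifies that the GFS device is still
present in the mount table and returns non-zero otherwise, which
rgmanager reports as the "generic error" in the log. A minimal,
hypothetical sketch of that style of check follows; it is not the actual
clusterfs.sh shipped with rgmanager, and the device and mount point are
copied from the log purely as examples:

  #!/bin/bash
  # Hypothetical mount check in the spirit of a clusterfs "status"
  # operation. Not the real clusterfs.sh; the names below are examples.

  device="/dev/mapper/BIOCOMP-people"
  mountpoint="/people"

  is_mounted() {
      local dev mp m_dev m_mp rest
      # Resolve symlinks so /dev/mapper/* and the kernel's /dev/dm-* name
      # compare equal.
      dev=$(readlink -f "$1")
      mp="$2"
      while read -r m_dev m_mp rest; do
          if [ "$(readlink -f "$m_dev")" = "$dev" ] && [ "$m_mp" = "$mp" ]; then
              return 0                      # mounted: status succeeds (0)
          fi
      done < /proc/mounts
      return 1                              # not mounted: generic error (1)
  }

  if is_mounted "$device" "$mountpoint"; then
      echo "status OK"
  else
      echo "status failed"
      exit 1
  fi

One way such a check can fail even though the filesystem is clearly
mounted is a device-name mismatch between the configured device and what
the mount table reports (for example /dev/mapper/BIOCOMP-people versus
/dev/dm-1), which is why the sketch resolves symlinks before comparing;
whether that applies here would depend on the actual configuration.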