From yamato at redhat.com  Fri Jul  1 02:56:13 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Fri, 01 Jul 2011 11:56:13 +0900 (JST)
Subject: [Linux-cluster] [PATCH] /config/dlm//comms//addr_list
In-Reply-To: <20110630213414.GC16480@redhat.com>
References: <20110609140546.GA30732@redhat.com>
	<20110630.213710.655406242398069789.yamato@redhat.com>
	<20110630213414.GC16480@redhat.com>
Message-ID: <20110701.115613.319846100234085929.yamato@redhat.com>

On Thu, 30 Jun 2011 17:34:14 -0400, David Teigland wrote:
> On Thu, Jun 30, 2011 at 09:37:10PM +0900, Masatake YAMATO wrote:
>> Added addr_list. Could you try my patch?
>>
>> Signed-off-by: Masatake YAMATO
>
> Thanks, it looks good, I'll push it to the next branch.  Do you use this
> mainly for debugging? or is there some other reason that I should note in
> the commit message?
> Dave
>

For understanding dlm and for debugging my cluster.conf :)

Masatake YAMATO

From yamato at redhat.com  Fri Jul  1 07:26:58 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Fri, 01 Jul 2011 16:26:58 +0900 (JST)
Subject: [Linux-cluster] [PATCH] dumping the unknown address when got a connect from non cluster node
Message-ID: <20110701.162658.1002310074831263303.yamato@redhat.com>

Another patch useful for debugging cluster.conf and network configuration.

This is useful when you build a cluster with nodes connected to each other
with a software bridge (virbrN). If you install a wrong iptables
configuration, dlm cannot establish connections. You will just see

    dlm: connect from non cluster node

in dmesg. It is difficult to understand quickly what has happened.
This patch dumps the address of the non cluster node.

Signed-off-by: Masatake YAMATO

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index bffa1e7..90c1c2e 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -748,7 +748,12 @@ static int tcp_accept_from_sock(struct connection *con)
 	/* Get the new node's NODEID */
 	make_sockaddr(&peeraddr, 0, &len);
 	if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) {
+		int i;
+		unsigned char *b=(unsigned char *)&peeraddr;
 		log_print("connect from non cluster node");
+		for (i=0; i<sizeof(struct sockaddr_storage); i++)
+			printk("%02x ", b[i]);
+		printk("\n");
 		sock_release(newsock);
 		mutex_unlock(&con->sock_mutex);
 		return -1;

From yamato at redhat.com  Fri Jul  1 08:45:56 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Fri, 01 Jul 2011 17:45:56 +0900 (JST)
Subject: [Linux-cluster] [PATCH] trivial fix
Message-ID: <20110701.174556.1041000521202229132.yamato@redhat.com>

Fix a typo.

Signed-off-by: Masatake YAMATO

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index bffa1e7..f0d4855 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -932,7 +932,7 @@ static void tcp_connect_to_sock(struct connection *con)
 	int one = 1;

 	if (con->nodeid == 0) {
-		log_print("attempt to connect sock 0 foiled");
+		log_print("attempt to connect sock 0 failed");
 		return;
 	}

From sklemer at gmail.com  Fri Jul  1 09:03:39 2011
From: sklemer at gmail.com (Shalom Klemer)
Date: Fri, 1 Jul 2011 12:03:39 +0300
Subject: [Linux-cluster] fence_ipmilan fails to reboot
In-Reply-To:
References:
Message-ID:

Hi.

I think you need to add power_wait="10" & lanplus="1".

Try this line:

fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="xx.xx.xx.xx"
lanplus="1" login="xxxt" name="node1_ilo" passwd="yyy"

Regards

Shalom.

On Thu, Jun 30, 2011 at 1:03 PM, Parvez Shaikh wrote:

> Hi all,
>
> I am on RHEL 5.5; and I have two rack mounted servers with IPMI configured.
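For reference, a fence_ipmilan definition of that kind normally ends up in
cluster.conf as a fence device plus a per-node reference to it. The sketch
below is only an illustration: the device name, node name, address and
credentials are placeholders rather than values from this thread, and
auth="password" is only needed if the BMC expects password authentication:

    <fencedevices>
        <fencedevice agent="fence_ipmilan" name="node1_ipmi" ipaddr="xx.xx.xx.xx"
                     login="admin" passwd="password" lanplus="1"
                     auth="password" power_wait="10"/>
    </fencedevices>

    <clusternode name="node1.example.com" nodeid="1" votes="1">
        <fence>
            <method name="1">
                <device name="node1_ipmi"/>
            </method>
        </fence>
    </clusternode>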
> > When I run command from the prompt to reboot the server through > fence_ipmilan, it shutsdown the server fine but it fails to power it on > > # fence_ipmilan -a -l admin -p password -o reboot >> > Rebooting machine @ IPMI:...Failed >> > > But I can power it on or power off just fine > >> >> # fence_ipmilan -a -l admin -p password -o on >> > Powering on machine @ IPMI:...Done >> > > Due to this my fencing is failing and failover is not happening. > > I have questions around this - > > 1. Can we provide action (off or reboot) in cluster.conf for ipmi lan > fencing? > 2. Is there anything wrong in my configuration? Cluster.conf file is pasted > below > 3. Is this a known issue which is fixed in newer versions > > Here is how my cluster.conf looks like - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > Parvez > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ccaulfie at redhat.com Fri Jul 1 10:21:26 2011 From: ccaulfie at redhat.com (Christine Caulfield) Date: Fri, 01 Jul 2011 11:21:26 +0100 Subject: [Linux-cluster] [PATCH] trivial fix In-Reply-To: <20110701.174556.1041000521202229132.yamato@redhat.com> References: <20110701.174556.1041000521202229132.yamato@redhat.com> Message-ID: <4E0D9FA6.4040001@redhat.com> On 01/07/11 09:45, Masatake YAMATO wrote: > Fix a typo. > > Signed-off-by: Masatake YAMATO > > diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c > index bffa1e7..f0d4855 100644 > --- a/fs/dlm/lowcomms.c > +++ b/fs/dlm/lowcomms.c > @@ -932,7 +932,7 @@ static void tcp_connect_to_sock(struct connection *con) > int one = 1; > > if (con->nodeid == 0) { > - log_print("attempt to connect sock 0 foiled"); > + log_print("attempt to connect sock 0 failed"); > return; > } > That's not a typo. I did mean "foiled" and not "failed". Read the code and it will make sense ;-) Chrissie From parvez.h.shaikh at gmail.com Fri Jul 1 10:24:54 2011 From: parvez.h.shaikh at gmail.com (Parvez Shaikh) Date: Fri, 1 Jul 2011 15:54:54 +0530 Subject: [Linux-cluster] fence_ipmilan fails to reboot - SOLVED Message-ID: Hi all, Thanks for your responses, after providing auth=password; fencing succeeded Thanks, Parvez On Fri, Jul 1, 2011 at 2:33 PM, ???? ???? wrote: > Hi. > > I think you need to add the power_wait"10" & lanplus="1" > > Try this line: > > fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="xx.xx.xx.xx" > lanplus="1" login="xxxt" name="node1_ilo" passwd="yyy > > > Regards > > Shalom. > > On Thu, Jun 30, 2011 at 1:03 PM, Parvez Shaikh wrote: > >> Hi all, >> >> I am on RHEL 5.5; and I have two rack mounted servers with IPMI >> configured. >> >> When I run command from the prompt to reboot the server through >> fence_ipmilan, it shutsdown the server fine but it fails to power it on >> >> # fence_ipmilan -a -l admin -p password -o reboot >>> >> Rebooting machine @ IPMI:...Failed >>> >> >> But I can power it on or power off just fine >> >>> >>> # fence_ipmilan -a -l admin -p password -o on >>> >> Powering on machine @ IPMI:...Done >>> >> >> Due to this my fencing is failing and failover is not happening. >> >> I have questions around this - >> >> 1. Can we provide action (off or reboot) in cluster.conf for ipmi lan >> fencing? >> 2. Is there anything wrong in my configuration? Cluster.conf file is >> pasted below >> 3. 
Is this a known issue which is fixed in newer versions >> >> Here is how my cluster.conf looks like - >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Thanks, >> Parvez >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From teigland at redhat.com Fri Jul 1 17:41:36 2011 From: teigland at redhat.com (David Teigland) Date: Fri, 1 Jul 2011 13:41:36 -0400 Subject: [Linux-cluster] [PATCH] dumping the unknown address when got a connect from non cluster node In-Reply-To: <20110701.162658.1002310074831263303.yamato@redhat.com> References: <20110701.162658.1002310074831263303.yamato@redhat.com> Message-ID: <20110701174136.GC23008@redhat.com> On Fri, Jul 01, 2011 at 04:26:58PM +0900, Masatake YAMATO wrote: > Another patch useful for debugging cluster.conf and network configuration. > > This is useful when you build a cluster with nodes connected each others with > a software bridge(virbrN). If you install wrong iptabels configuration, dlm > cannot establish connections. You will just see > > dlm: connect from non cluster node > > in demsg. It is difficult to understand what happens quickly. > This patch dumps the address of the non cluster node. > > > Signed-off-by: Masatake YAMATO > > diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c > index bffa1e7..90c1c2e 100644 > --- a/fs/dlm/lowcomms.c > +++ b/fs/dlm/lowcomms.c > @@ -748,7 +748,12 @@ static int tcp_accept_from_sock(struct connection *con) > /* Get the new node's NODEID */ > make_sockaddr(&peeraddr, 0, &len); > if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) { > + int i; > + unsigned char *b=(unsigned char *)&peeraddr; > log_print("connect from non cluster node"); > + for (i=0; i + printk("%02x ", b[i]); > + printk("\n"); > sock_release(newsock); > mutex_unlock(&con->sock_mutex); > return -1; Could you use print_hex_dump_bytes instead? Dave From yamato at redhat.com Mon Jul 4 03:11:12 2011 From: yamato at redhat.com (Masatake YAMATO) Date: Mon, 04 Jul 2011 12:11:12 +0900 (JST) Subject: [Linux-cluster] [PATCH] trivial fix In-Reply-To: <4E0D9FA6.4040001@redhat.com> References: <20110701.174556.1041000521202229132.yamato@redhat.com> <4E0D9FA6.4040001@redhat.com> Message-ID: <20110704.121112.149598301835655010.yamato@redhat.com> > On 01/07/11 09:45, Masatake YAMATO wrote: >> Fix a typo. >> >> Signed-off-by: Masatake YAMATO >> >> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c >> index bffa1e7..f0d4855 100644 >> --- a/fs/dlm/lowcomms.c >> +++ b/fs/dlm/lowcomms.c >> @@ -932,7 +932,7 @@ static void tcp_connect_to_sock(struct connection >> *con) >> int one = 1; >> >> if (con->nodeid == 0) { >> - log_print("attempt to connect sock 0 foiled"); >> + log_print("attempt to connect sock 0 failed"); >> return; >> } >> > > That's not a typo. I did mean "foiled" and not "failed". Read the code > and it will make sense ;-) Oh, sorry. 
> Chrissie
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

From yamato at redhat.com  Mon Jul  4 03:25:51 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Mon, 04 Jul 2011 12:25:51 +0900 (JST)
Subject: [Linux-cluster] [PATCH V2] dumping the unknown address when got a connect from non cluster node
In-Reply-To: <20110701174136.GC23008@redhat.com>
References: <20110701.162658.1002310074831263303.yamato@redhat.com>
	<20110701174136.GC23008@redhat.com>
Message-ID: <20110704.122551.868642282278092140.yamato@redhat.com>

>> Another patch useful for debugging cluster.conf and network configuration.
>>
>> This is useful when you build a cluster with nodes connected to each other
>> with a software bridge (virbrN). If you install a wrong iptables
>> configuration, dlm cannot establish connections. You will just see
>>
>>     dlm: connect from non cluster node
>>
>> in dmesg. It is difficult to understand quickly what has happened.
>> This patch dumps the address of the non cluster node.
>>
>>
>> Signed-off-by: Masatake YAMATO
>>
>> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
>> index bffa1e7..90c1c2e 100644
>> --- a/fs/dlm/lowcomms.c
>> +++ b/fs/dlm/lowcomms.c
>> @@ -748,7 +748,12 @@ static int tcp_accept_from_sock(struct connection *con)
>>  	/* Get the new node's NODEID */
>>  	make_sockaddr(&peeraddr, 0, &len);
>>  	if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) {
>> +		int i;
>> +		unsigned char *b=(unsigned char *)&peeraddr;
>>  		log_print("connect from non cluster node");
>> +		for (i=0; i<sizeof(struct sockaddr_storage); i++)
>> +			printk("%02x ", b[i]);
>> +		printk("\n");
>>  		sock_release(newsock);
>>  		mutex_unlock(&con->sock_mutex);
>>  		return -1;
>
> Could you use print_hex_dump_bytes instead?
> Dave

Here is the revised version.

This patch is useful when you build a cluster with nodes connected to each
other with a software bridge (virbrN). If you install a wrong iptables
configuration, dlm cannot establish connections. You will just see

    dlm: connect from non cluster node

in dmesg. It is difficult to understand quickly what has happened.
This patch dumps the address of the non cluster node with the
print_hex_dump_bytes function:

    dlm: connect from non cluster node
    ss: 02 00 00 00 c0 a8 97 01 00 00 00 00 00 00 00 00  ................
    ....

Using print_hex_dump_bytes was suggested by David Teigland.

Signed-off-by: Masatake YAMATO

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index bffa1e7..a762e9f 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -512,12 +512,10 @@ static void process_sctp_notification(struct connection *con,
 	}
 	make_sockaddr(&prim.ssp_addr, 0, &addr_len);
 	if (dlm_addr_to_nodeid(&prim.ssp_addr, &nodeid)) {
-		int i;
 		unsigned char *b=(unsigned char *)&prim.ssp_addr;
 		log_print("reject connect from unknown addr");
-		for (i=0; i<sizeof(struct sockaddr_storage); i++)
-			printk("%02x ", b[i]);
-		printk("\n");
+		print_hex_dump_bytes("ss: ", DUMP_PREFIX_NONE, b,
+				     sizeof(struct sockaddr_storage));
 		sctp_send_shutdown(prim.ssp_assoc_id);
 		return;
 	}
@@ -748,7 +746,10 @@ static int tcp_accept_from_sock(struct connection *con)
 	/* Get the new node's NODEID */
 	make_sockaddr(&peeraddr, 0, &len);
 	if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) {
+		unsigned char *b=(unsigned char *)&peeraddr;
 		log_print("connect from non cluster node");
+		print_hex_dump_bytes("ss: ", DUMP_PREFIX_NONE, b,
+				     sizeof(struct sockaddr_storage));
 		sock_release(newsock);
 		mutex_unlock(&con->sock_mutex);
 		return -1;

From parvez.h.shaikh at gmail.com  Tue Jul  5 11:32:34 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Tue, 5 Jul 2011 17:02:34 +0530
Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster
Message-ID:

Hi all,

I was trying to find out how much time it takes for RHCS to detect a
failure and recover from it. I found the link -
http://www.redhat.com/whitepapers/rha/RHA_ClusterSuiteWPPDF.pdf

It says that the network polling interval is 2 seconds and 6 retries are
attempted before declaring a node as failed. I want to know whether we can
tune or configure this, say instead of 6 retries I want only 3 retries.
Also reducing network polling time from 2 seconds to say 1 second (can it be less than 1 second, which I think would consume more CPU)? Also I have a script resource and I see it invoked with status argument after every 30 seconds, can we configure that as well? Failover also involve fencing, any pointers on how can we control / configure fencing time would also be useful,I use bladecenter fencing, IPMI fencing as well as UCS fencing. Thanks, Parvez -------------- next part -------------- An HTML attachment was scrubbed... URL: From ccaulfie at redhat.com Tue Jul 5 12:28:40 2011 From: ccaulfie at redhat.com (Christine Caulfield) Date: Tue, 05 Jul 2011 13:28:40 +0100 Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster In-Reply-To: References: Message-ID: <4E130378.1090408@redhat.com> That's a *very* old document. it's from 2003 and refers to RHEL2.1 .. which I sincerely hope you weren't planning to implement. Before you do anything more I recommend you read the documentation for the actual version of clustering you are going to install https://access.redhat.com/knowledge/docs/Red_Hat_Enterprise_Linux/ Chrissie On 05/07/11 12:32, Parvez Shaikh wrote: > Hi all, > > I was trying to find out how much time does it take for RHCS to detect > failure and recover from it. I found the link - > http://www.redhat.com/whitepapers/rha/RHA_ClusterSuiteWPPDF.pdf > > It says that network polling interval is 2 seconds and 6 retries are > attempted before declaring a node as failed. I want to know can we tune > this or configure it, say instead of 6 retries I want only 3 retries. > Also reducing network polling time from 2 seconds to say 1 second (can > it be less than 1 second, which I think would consume more CPU)? > > Also I have a script resource and I see it invoked with status argument > after every 30 seconds, can we configure that as well? > > Failover also involve fencing, any pointers on how can we control / > configure fencing time would also be useful,I use bladecenter fencing, > IPMI fencing as well as UCS fencing. > > Thanks, > Parvez > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From parvez.h.shaikh at gmail.com Tue Jul 5 12:43:02 2011 From: parvez.h.shaikh at gmail.com (Parvez Shaikh) Date: Tue, 5 Jul 2011 18:13:02 +0530 Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster In-Reply-To: <4E130378.1090408@redhat.com> References: <4E130378.1090408@redhat.com> Message-ID: Hello Christine, Thanks for the link enlisting various documents, I have RHC running over RHEL 5.5 and has been working fine. However I would greatly appreciate, some document or pointers which help me in estimate failover time or adjust it; if that is possible. I have been through Administration Guide and could not find how I can adjust it. Thanks, Parvez On Tue, Jul 5, 2011 at 5:58 PM, Christine Caulfield wrote: > That's a *very* old document. it's from 2003 and refers to RHEL2.1 .. which > I sincerely hope you weren't planning to implement. > > Before you do anything more I recommend you read the documentation for the > actual version of clustering you are going to install > > https://access.redhat.com/**knowledge/docs/Red_Hat_**Enterprise_Linux/ > > Chrissie > > > On 05/07/11 12:32, Parvez Shaikh wrote: > >> Hi all, >> >> I was trying to find out how much time does it take for RHCS to detect >> failure and recover from it. 
I found the link - >> http://www.redhat.com/**whitepapers/rha/RHA_**ClusterSuiteWPPDF.pdf >> >> It says that network polling interval is 2 seconds and 6 retries are >> attempted before declaring a node as failed. I want to know can we tune >> this or configure it, say instead of 6 retries I want only 3 retries. >> Also reducing network polling time from 2 seconds to say 1 second (can >> it be less than 1 second, which I think would consume more CPU)? >> >> Also I have a script resource and I see it invoked with status argument >> after every 30 seconds, can we configure that as well? >> >> Failover also involve fencing, any pointers on how can we control / >> configure fencing time would also be useful,I use bladecenter fencing, >> IPMI fencing as well as UCS fencing. >> >> Thanks, >> Parvez >> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/**mailman/listinfo/linux-cluster >> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/**mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ccaulfie at redhat.com Tue Jul 5 13:20:47 2011 From: ccaulfie at redhat.com (Christine Caulfield) Date: Tue, 05 Jul 2011 14:20:47 +0100 Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster In-Reply-To: References: <4E130378.1090408@redhat.com> Message-ID: <4E130FAF.1060600@redhat.com> Hiya, I don't have the URL to have but I'm pretty sure there's something in the Red Hat knowledge base about calculating failover times. You'll need to have paid support to get at it. Failing that here's a document I wrote that talks about configuring the insider bits of openais and cman. "man 5 openais.conf" is also helpful. I hope this helps :-) Chrissie On 05/07/11 13:43, Parvez Shaikh wrote: > Hello Christine, > > Thanks for the link enlisting various documents, I have RHC running over > RHEL 5.5 and has been working fine. However I would greatly appreciate, > some document or pointers which help me in estimate failover time or > adjust it; if that is possible. > > I have been through Administration Guide and could not find how I can > adjust it. > > Thanks, > Parvez > > On Tue, Jul 5, 2011 at 5:58 PM, Christine Caulfield > wrote: > > That's a *very* old document. it's from 2003 and refers to RHEL2.1 > .. which I sincerely hope you weren't planning to implement. > > Before you do anything more I recommend you read the documentation > for the actual version of clustering you are going to install > > https://access.redhat.com/__knowledge/docs/Red_Hat___Enterprise_Linux/ > > > Chrissie > > > On 05/07/11 12:32, Parvez Shaikh wrote: > > Hi all, > > I was trying to find out how much time does it take for RHCS to > detect > failure and recover from it. I found the link - > http://www.redhat.com/__whitepapers/rha/RHA___ClusterSuiteWPPDF.pdf > > > It says that network polling interval is 2 seconds and 6 retries are > attempted before declaring a node as failed. I want to know can > we tune > this or configure it, say instead of 6 retries I want only 3 > retries. > Also reducing network polling time from 2 seconds to say 1 > second (can > it be less than 1 second, which I think would consume more CPU)? > > Also I have a script resource and I see it invoked with status > argument > after every 30 seconds, can we configure that as well? 
> > Failover also involve fencing, any pointers on how can we control / > configure fencing time would also be useful,I use bladecenter > fencing, > IPMI fencing as well as UCS fencing. > > Thanks, > Parvez > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/__mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/__mailman/listinfo/linux-cluster > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From ccaulfie at redhat.com Tue Jul 5 15:10:56 2011 From: ccaulfie at redhat.com (Christine Caulfield) Date: Tue, 05 Jul 2011 16:10:56 +0100 Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster In-Reply-To: <4E130FAF.1060600@redhat.com> References: <4E130378.1090408@redhat.com> <4E130FAF.1060600@redhat.com> Message-ID: <4E132980.2050409@redhat.com> I forgot to paste the URL, sorry! http://people.redhat.com/ccaulfie/docs/CmanYinYang.pdf Chrissie On 05/07/11 14:20, Christine Caulfield wrote: > Hiya, > > I don't have the URL to have but I'm pretty sure there's something in > the Red Hat knowledge base about calculating failover times. You'll need > to have paid support to get at it. > > Failing that here's a document I wrote that talks about configuring the > insider bits of openais and cman. "man 5 openais.conf" is also helpful. > > I hope this helps :-) > > Chrissie > > On 05/07/11 13:43, Parvez Shaikh wrote: >> Hello Christine, >> >> Thanks for the link enlisting various documents, I have RHC running over >> RHEL 5.5 and has been working fine. However I would greatly appreciate, >> some document or pointers which help me in estimate failover time or >> adjust it; if that is possible. >> >> I have been through Administration Guide and could not find how I can >> adjust it. >> >> Thanks, >> Parvez >> >> On Tue, Jul 5, 2011 at 5:58 PM, Christine Caulfield > > wrote: >> >> That's a *very* old document. it's from 2003 and refers to RHEL2.1 >> .. which I sincerely hope you weren't planning to implement. >> >> Before you do anything more I recommend you read the documentation >> for the actual version of clustering you are going to install >> >> https://access.redhat.com/__knowledge/docs/Red_Hat___Enterprise_Linux/ >> >> >> Chrissie >> >> >> On 05/07/11 12:32, Parvez Shaikh wrote: >> >> Hi all, >> >> I was trying to find out how much time does it take for RHCS to >> detect >> failure and recover from it. I found the link - >> http://www.redhat.com/__whitepapers/rha/RHA___ClusterSuiteWPPDF.pdf >> >> >> It says that network polling interval is 2 seconds and 6 retries are >> attempted before declaring a node as failed. I want to know can >> we tune >> this or configure it, say instead of 6 retries I want only 3 >> retries. >> Also reducing network polling time from 2 seconds to say 1 >> second (can >> it be less than 1 second, which I think would consume more CPU)? >> >> Also I have a script resource and I see it invoked with status >> argument >> after every 30 seconds, can we configure that as well? >> >> Failover also involve fencing, any pointers on how can we control / >> configure fencing time would also be useful,I use bladecenter >> fencing, >> IPMI fencing as well as UCS fencing. 
>> >> Thanks, >> Parvez >> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/__mailman/listinfo/linux-cluster >> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/__mailman/listinfo/linux-cluster >> >> >> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From swap_project at yahoo.com Wed Jul 6 03:21:02 2011 From: swap_project at yahoo.com (Srija) Date: Tue, 5 Jul 2011 20:21:02 -0700 (PDT) Subject: [Linux-cluster] Cluster node issue In-Reply-To: Message-ID: <1309922462.81116.YahooMailClassic@web112808.mail.gq1.yahoo.com> Hi, We have 16 nodes cluster. Recently facing issues with the nodes. The problem is, occassionaly find one of the nodes is not accessable through ssh. The node is up and running, the zen guests on the nodes are also pingable . But the node, and the guests on the nodes are not able to accessable. Very recently it happened to one of the node again. The nodes are of rhel5.5, kernel 2.6.18-194.3.1.el5xen #1 SMP Sun May 2 04:26:43 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux when it happens that node becomes detached from the cluster. If anybody can give some hints that will be really appreciated. Not sure it is the kernel but or not...Here is the few line of the log file, when it happened last time. Thanks in advance.. ____________________________________________________________ Jul 1 17:11:03 server crond[11715]: (root) CMD (python /usr/share/rhn/virtualization/poller.py) Jul 1 17:11:03 server crond[11716]: (root) CMD (python /usr/share/rhn/virtualization/poller.py) Jul 1 17:11:01 server crond[11685]: (root) error: Job execution of per-minute job scheduled for 17:10 delayed into subsequent minute 17:11. Skipping job run. Jul 1 17:11:03 server crond[11685]: CRON (root) ERROR: cannot set security context Jul 1 17:17:13 server xinetd[6778]: START: pblocald pid=11896 from=xxx.xx.222.4 Jul 1 17:21:01 server crond[11852]: (root) error: Job execution of per-minute job scheduled for 17:15 delayed into subsequent minute 17:21. Skipping job run. Jul 1 17:21:01 server crond[11852]: CRON (root) ERROR: cannot set security context Jul 1 17:21:05 server crond[12031]: (root) CMD (python /usr/share/rhn/virtualization/poller.py) Jul 1 17:23:34 server INFO: task cmahealthd:7492 blocked for more than 120 seconds. Jul 1 17:23:37 server "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Jul 1 17:23:37 server cmahealthd D 0000000000000180 0 7492 1 7507 7430 (NOTLB) Jul 1 17:23:37 server ffff880065f29b18 0000000000000282 0000000000000000 0000000000000000 Jul 1 17:23:37 server 0000000000000009 ffff880065c35040 ffff88007f6720c0 000000000001d982 Jul 1 17:23:37 server ffff880065c35228 ffff88007e16b400 Jul 1 17:23:37 server Call Trace: Jul 1 17:23:37 server [] __wake_up_common+0x3e/0x68 Jul 1 17:23:37 server [] base_probe+0x0/0x36 Jul 1 17:23:37 server [] wait_for_completion+0x7d/0xaa Jul 1 17:23:52 server [] default_wake_function+0x0/0xe Jul 1 17:23:52 server [] call_usermodehelper_keys+0xe3/0xf8 Jul 1 17:23:52 server [] __call_usermodehelper+0x0/0x4f Jul 1 17:23:52 server [] find_get_page+0x4d/0x55 Jul 1 17:23:52 server [] request_module+0x139/0x14d Jul 1 17:23:52 server [] mntput_no_expire+0x19/0x89 Jul 1 17:23:52 server [] link_path_walk+0xa6/0xb2 Jul 1 17:23:52 server [] mutex_lock+0xd/0x1d Jul 1 17:23:52 server [] base_probe+0x1e/0x36 Jul 1 17:23:52 server [] kobj_lookup+0x132/0x19b Jul 1 17:31:24 server xinetd[6778]: START: pblocald pid=12151 from=xxx.xx.222.4 Jul 1 17:28:16 server openais[6172]: [TOTEM] entering GATHER state from 12. Jul 1 17:28:41 server openais[6172]: [TOTEM] Creating commit token because I am the rep. Jul 1 17:28:41 server openais[6172]: [TOTEM] Saving state aru 20a high seq received 20a Jul 1 17:28:41 server openais[6172]: [TOTEM] Storing new sequence id for ring 2e90 Jul 1 17:28:49 server openais[6172]: [TOTEM] entering COMMIT state. Jul 1 17:31:30 server openais[6172]: [TOTEM] Creating commit token because I am the rep. Jul 1 17:31:30 server openais[6172]: [TOTEM] Storing new sequence id for ring 2e94 Jul 1 17:31:30 server openais[6172]: [TOTEM] entering COMMIT state. Jul 1 17:31:30 server openais[6172]: [TOTEM] entering GATHER state from 13. Jul 1 17:31:30 server openais[6172]: [TOTEM] Creating commit token because I am the rep. Jul 1 17:33:30 server [] chrdev_open+0x53/0x183 Jul 1 17:33:30 server [] chrdev_open+0x0/0x183 Jul 1 17:33:30 server [] __dentry_open+0xd9/0x1dc Jul 1 17:33:30 server [] do_filp_open+0x2a/0x38 Jul 1 17:33:30 server [] do_sys_open+0x44/0xbe Jul 1 17:33:30 server [] ia32_sysret+0x0/0x5 Jul 1 17:33:30 server Jul 1 17:31:30 server openais[6172]: [TOTEM] Storing new sequence id for ring 2e98 Jul 1 17:31:30 server openais[6172]: [TOTEM] entering COMMIT state. Jul 1 17:31:30 server openais[6172]: [TOTEM] entering RECOVERY state. Jul 1 17:31:30 server openais[6172]: [TOTEM] position [0] member 192.168.xxx.9: Jul 1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11916 rep 192.168.xxx.9 Jul 1 17:31:30 server openais[6172]: [TOTEM] aru 20a high delivered 20a received flag 1 Jul 1 17:31:30 server openais[6172]: [TOTEM] position [1] member 192.168.xxx.10: Jul 1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10 Jul 1 17:31:30 server openais[6172]: [TOTEM] aru e2 high delivered e2 received flag 1 Jul 1 17:31:30 server openais[6172]: [TOTEM] position [2] member 192.168.xxx.11: Jul 1 17:33:30 server INFO: task cmahealthd:7492 blocked for more than 120 seconds. Jul 1 17:33:30 server "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Jul 1 17:33:30 server cmahealthd D 0000000000000180 0 7492 1 7507 7430 (NOTLB) Jul 1 17:33:30 server ffff880065f29b18 0000000000000282 0000000000000000 0000000000000000 Jul 1 17:33:30 server 0000000000000009 ffff880065c35040 ffff88007f6720c0 000000000001d982 Jul 1 17:33:30 server ffff880065c35228 ffff88007e16b400 Jul 1 17:33:30 server Call Trace: Jul 1 17:33:30 server [] __wake_up_common+0x3e/0x68 Jul 1 17:33:30 server [] base_probe+0x0/0x36 Jul 1 17:33:30 server [] wait_for_completion+0x7d/0xaa Jul 1 17:33:29 server dlm_controld[6272]: cluster is down, exiting Jul 1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10 Jul 1 17:31:30 server openais[6172]: [TOTEM] aru e2 high delivered e2 received flag 1 Jul 1 17:31:30 server openais[6172]: [TOTEM] position [3] member 192.168.xxx.12: Jul 1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10 Jul 1 17:31:30 server openais[6172]: [TOTEM] aru e2 high delivered e2 received flag 1 Jul 1 17:31:30 server openais[6172]: [TOTEM] position [4] member 192.168.xxx.13: Jul 1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10 Jul 1 17:31:30 server openais[6172]: [TOTEM] aru e2 high delivered e2 received flag 1 Jul 1 17:31:30 server openais[6172]: [TOTEM] position [5] member 192.168.xxx.14: Jul 1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10 Jul 1 17:33:30 server [] default_wake_function+0x0/0xe Jul 1 17:33:30 server [] call_usermodehelper_keys+0xe3/0xf8 Jul 1 17:33:30 server [] __call_usermodehelper+0x0/0x4f Jul 1 17:33:30 server [] find_get_page+0x4d/0x55 Jul 1 17:33:30 server [] request_module+0x139/0x14d Jul 1 17:33:30 server [] mntput_no_expire+0x19/0x89 Jul 1 17:33:30 server [] link_path_walk+0xa6/0xb2 Jul 1 17:33:30 server [] mutex_lock+0xd/0x1d Jul 1 17:33:30 server [] base_probe+0x1e/0x36 Jul 1 17:33:30 server [] kobj_lookup+0x132/0x19b Jul 1 17:33:30 server gfs_controld[6280]: cluster is down, exiting From member at linkedin.com Wed Jul 6 12:11:42 2011 From: member at linkedin.com (Arif Bhai Surat via LinkedIn) Date: Wed, 6 Jul 2011 12:11:42 +0000 (UTC) Subject: [Linux-cluster] Invitation to connect on LinkedIn Message-ID: <122461066.15294886.1309954302233.JavaMail.app@ela4-bed77.prod> LinkedIn ------------ Arif Bhai Surat requested to add you as a connection on LinkedIn: ------------------------------------------ Marian, I'd like to add you to my professional network on LinkedIn. - Arif Bhai Accept invitation from Arif Bhai Surat http://www.linkedin.com/e/-odgn7o-gps8ywhk-6a/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2940492630_2/1BpC5vrmRLoRZcjkkZt5YCpnlOt3RApnhMpmdzgmhxrSNBszYOnP0Pdz8Vd30Qej99bRhepD95rkhIbP0Td3gMdjsQd3cLrCBxbOYWrSlI/EML_comm_afe/ View invitation from Arif Bhai Surat http://www.linkedin.com/e/-odgn7o-gps8ywhk-6a/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2940492630_2/39vc3cSczAQc3gVcAALqnpPbOYWrSlI/svi/ ------------------------------------------ DID YOU KNOW you can use your LinkedIn profile as your website? Select a vanity URL and then promote this address on your business cards, email signatures, website, etc http://www.linkedin.com/e/-odgn7o-gps8ywhk-6a/ewp/inv-21/ -- (c) 2011, LinkedIn Corporation -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From helen_heath at fastmail.fm Wed Jul 6 12:13:38 2011 From: helen_heath at fastmail.fm (Helen Heath) Date: Wed, 06 Jul 2011 13:13:38 +0100 Subject: [Linux-cluster] how to disable one node Message-ID: <1309954418.14584.2148764421@webmail.messagingengine.com> Hi all - I hope someone can shed some light on this. I have a 2-node cluster running on RedHat 3 which has a shared /clust1 filesystem and is connected to a network power switch. There is something very wrong with the cluster, as every day currently it is rebooting whichever is the primary node, for no reason I can track down. No hardware faults anywhere in the cluster, no failures of any kind logging in any log files, etc etc. It started out well over a year ago rebooting the primary node every other week, then across time it progressed to once a week, then once a day. I logged a call with RedHat way back when it first started; nothing was ever found to be the problem, and of course in time, RedHat v3 went out of support and they would no longer assist in troubleshooting the issue. Prior to this problem starting the cluster had been running happily with no issues for about 5 years. Now this cluster is shortly being replaced with new hardware and RedHat 5, so hopefully whatever is the problem will as mysteriously vanish as it appeared. However, I need to stop this daily reboot as it is playing havoc with the application that runs on this system (a heavily-utilised database) and having tried everything I can think of, I decided to 'break' the cluster; ie, take down one node so that only one node remains running the application. I cannot find a way to do this that persists across a reboot of the node that should be out of the cluster. I've run "/sbin/chkconfig --del clumanager" and it did take the service out of chkconfig (I verified this). The RedHat document http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html /Cluster_Administration/s1-admin-disable.html seems to indicate this should persist across a reboot - ie, you reboot the node and it does not attempt to rejoin the cluster; however, this didn't work! The primary node cluster monitoring software saw that the secondary node was down, STONITH kicked in, the NPS powered the port this node is connected to off and back on, the secondary node rebooted and rejoined the cluster! Does anyone know how to either temporarily remove the secondary node from the cluster in such a way that persists across reboots but can be easily brought back into the cluster when needed, or else (and preferably) how to temporarily stop the cluster monitoring software running on the primary node from even looking out for the secondary node - as in, it doesn't care whether the secondary node is up or not? I've checked for the period the secondary node is down that the primary node is quite happy to carry on processing as usual but as soon as the cluster monitoring software on the primary node realises the secondary node is down, it reboots it, and I'm back to square one! This is now really urgent (I've been trying to find an answer to this for some weeks now) as I go on holiday on Friday and I really don't want to leave my second-in-command with a mess on his hands! thanks -- Helen Heath helen_heath at fastmail.fm =*= Everything that has a beginning has an ending. Make your peace with that and all will be well. -- Buddhist saying -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From anprice at redhat.com Wed Jul 6 13:55:07 2011 From: anprice at redhat.com (Andrew Price) Date: Wed, 06 Jul 2011 14:55:07 +0100 Subject: [Linux-cluster] gfs2-utils 3.1.2 Released Message-ID: <4E14693B.7090401@redhat.com> Hi, gfs2-utils 3.1.2 has been released. This version features various bug fixes, compression for gfs2_edit savemeta, and improved translation infrastructure. See below for a full list of changes. The source tarball is available from: https://fedorahosted.org/released/gfs2-utils/gfs2-utils-3.1.2.tar.gz To report bugs or issues, please use: https://bugzilla.redhat.com/ Regards, Andy Price Red Hat File Systems Changes since 3.1.1: Abhijith Das (2): gfs2_convert: exits with success without doing anything gfs2_convert exits with success without doing anything Andrew Price (5): gfs2_edit: Add compression to savemeta and restoremeta gfs2_utils: More error handling improvements gfs2-utils: quieten some new build warnings gfs2_edit: Fix savemeta compression for older zlibs gfs2-utils: Fix up make-tarball.sh Benjamin Marzinski (1): gfs2_grow: write one rindex entry and then the rest Bob Peterson (2): gfs2_edit savemeta was not saving some directory info fsck.gfs2 only rebuilds one missing journal at a time Carlos Maiolino (5): Add i18n support to gfs2-utils Track translatable files gfs2_convert: Add i18n support gfs2_convert: set translatable strings i18n support: Add gfs2_convert to translatable list Steven Whitehouse (1): Remove last traces of unlinked file from gfs2-utils From pradhanparas at gmail.com Wed Jul 6 16:15:00 2011 From: pradhanparas at gmail.com (Paras pradhan) Date: Wed, 6 Jul 2011 11:15:00 -0500 Subject: [Linux-cluster] DR node in a cluster Message-ID: Hi, My GFS2 linux cluster has three nodes. Two at the data center and one at the DR site. If the nodes at DR site break/turnoff, all the services move to DR node. But if the 2 nodes at the data center lost communication with the DR node, I am not sure how does the cluster handles the split brain. So I am looking for some recommendation in this kind of scenario. I am usig Qdisk votes (=3) in this case. -- Here is the cman_tool status output. - Version: 6.2.0 Config Version: 74 Cluster Name: vrprd Cluster Id: 3304 Cluster Member: Yes Cluster Generation: 1720 Membership state: Cluster-Member Nodes: 3 Expected votes: 6 Quorum device votes: 3 Total votes: 6 Quorum: 4 Active subsystems: 10 Flags: Dirty Ports Bound: 0 11 177 Node name: vrprd1.hostmy.com Node ID: 2 Multicast addresses: x.x.x.244 Node addresses: x.x.x.96 -- Thanks! Paras. -------------- next part -------------- An HTML attachment was scrubbed... URL: From swhiteho at redhat.com Wed Jul 6 16:28:34 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 06 Jul 2011 17:28:34 +0100 Subject: [Linux-cluster] DR node in a cluster In-Reply-To: References: Message-ID: <1309969714.2739.16.camel@menhir> Hi, On Wed, 2011-07-06 at 11:15 -0500, Paras pradhan wrote: > Hi, > > > My GFS2 linux cluster has three nodes. Two at the data center and one > at the DR site. If the nodes at DR site break/turnoff, all the > services move to DR node. But if the 2 nodes at the data center lost > communication with the DR node, I am not sure how does the cluster > handles the split brain. So I am looking for some recommendation in > this kind of scenario. I am usig Qdisk votes (=3) in this case. > > Using GFS2 in stretched clusters like this is not something that we support or recommend. 
It might work in some circumstances, but it is very complicated to ensure that recovery will work correctly in all cases. If you don't have enough nodes at a site to allow quorum to be established, then when communication fails between sites you must fence those nodes or risk data corruption when communication is re-established, Steve. > -- > Here is the cman_tool status output. > > > > > - > Version: 6.2.0 > Config Version: 74 > Cluster Name: vrprd > Cluster Id: 3304 > Cluster Member: Yes > Cluster Generation: 1720 > Membership state: Cluster-Member > Nodes: 3 > Expected votes: 6 > Quorum device votes: 3 > Total votes: 6 > Quorum: 4 > Active subsystems: 10 > Flags: Dirty > Ports Bound: 0 11 177 > Node name: vrprd1.hostmy.com > Node ID: 2 > Multicast addresses: x.x.x.244 > Node addresses: x.x.x.96 > -- > > > Thanks! > Paras. > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From Chris.Jankowski at hp.com Wed Jul 6 16:46:04 2011 From: Chris.Jankowski at hp.com (Jankowski, Chris) Date: Wed, 6 Jul 2011 16:46:04 +0000 Subject: [Linux-cluster] DR node in a cluster In-Reply-To: References: Message-ID: <036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net> Paras, A curiosity question: How do you make sure that your storage will survive failure of *either* of your site without loss of data and continuity of service? What storage configuration are you using? Thanks and regards, Chris From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan Sent: Thursday, 7 July 2011 02:15 To: linux clustering Subject: [Linux-cluster] DR node in a cluster Hi, My GFS2 linux cluster has three nodes. Two at the data center and one at the DR site. If the nodes at DR site break/turnoff, all the services move to DR node. But if the 2 nodes at the data center lost communication with the DR node, I am not sure how does the cluster handles the split brain. So I am looking for some recommendation in this kind of scenario. I am usig Qdisk votes (=3) in this case. -- Here is the cman_tool status output. - Version: 6.2.0 Config Version: 74 Cluster Name: vrprd Cluster Id: 3304 Cluster Member: Yes Cluster Generation: 1720 Membership state: Cluster-Member Nodes: 3 Expected votes: 6 Quorum device votes: 3 Total votes: 6 Quorum: 4 Active subsystems: 10 Flags: Dirty Ports Bound: 0 11 177 Node name: vrprd1.hostmy.com Node ID: 2 Multicast addresses: x.x.x.244 Node addresses: x.x.x.96 -- Thanks! Paras. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradhanparas at gmail.com Wed Jul 6 17:16:39 2011 From: pradhanparas at gmail.com (Paras pradhan) Date: Wed, 6 Jul 2011 12:16:39 -0500 Subject: [Linux-cluster] DR node in a cluster In-Reply-To: <036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net> References: <036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net> Message-ID: Chris, All the nodes are connected to a single SAN at this moment through fibre. @steven: -- If you don't have enough nodes at a site to allow quorum to be established, then when communication fails between sites you must fence those nodes or risk data corruption when communication is re-established, ----- Yes true, but in this case a single node can made the cluster quorate. (qdisk vote=3 ,node votes=3, total=6) which is not recommened I guess (?). 
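For a layout like this, the quorum-disk side of cluster.conf is usually
declared along the following lines. This is only a sketch: the label and
the heuristic target below are placeholders, not values taken from this
cluster. With three 1-vote nodes plus a 3-vote qdisk, expected_votes is 6
and quorum is 4, which matches the cman_tool output above; the heuristic
(for example, pinging a router at the main site) is what decides which
partition keeps the qdisk votes when the sites lose contact:

    <cman expected_votes="6"/>
    <quorumd interval="1" tko="10" votes="3" min_score="1" label="vrprd_qdisk">
        <heuristic program="ping -c1 -w1 192.168.1.254" score="1" interval="2" tko="5"/>
    </quorumd>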
Steve

On Wed, Jul 6, 2011 at 11:46 AM, Jankowski, Chris wrote:

> Paras,
>
> A curiosity question:
>
> How do you make sure that your storage will survive failure of *either*
> of your site without loss of data and continuity of service?
>
> What storage configuration are you using?
>
> Thanks and regards,
>
> Chris
>
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan
> Sent: Thursday, 7 July 2011 02:15
> To: linux clustering
> Subject: [Linux-cluster] DR node in a cluster
>
> Hi,
>
> My GFS2 linux cluster has three nodes. Two at the data center and one at
> the DR site. If the nodes at DR site break/turnoff, all the services move
> to DR node. But if the 2 nodes at the data center lost communication with
> the DR node, I am not sure how does the cluster handles the split brain.
> So I am looking for some recommendation in this kind of scenario. I am
> usig Qdisk votes (=3) in this case.
>
> --
> Here is the cman_tool status output.
>
> -
> Version: 6.2.0
> Config Version: 74
> Cluster Name: vrprd
> Cluster Id: 3304
> Cluster Member: Yes
> Cluster Generation: 1720
> Membership state: Cluster-Member
> Nodes: 3
> Expected votes: 6
> Quorum device votes: 3
> Total votes: 6
> Quorum: 4
> Active subsystems: 10
> Flags: Dirty
> Ports Bound: 0 11 177
> Node name: vrprd1.hostmy.com
> Node ID: 2
> Multicast addresses: x.x.x.244
> Node addresses: x.x.x.96
> --
>
> Thanks!
> Paras.
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rossnick-lists at cybercat.ca  Wed Jul  6 17:08:13 2011
From: rossnick-lists at cybercat.ca (Nicolas Ross)
Date: Wed, 6 Jul 2011 13:08:13 -0400
Subject: [Linux-cluster] Cluster and remote location
Message-ID: <829B72D777C94E15A5ED9D1CF15E5E94@versa>

Hi !

In our current setup we have an 8-node cluster at site A. In the near
future, we will have a different cluster at site B. Both sites will be
bridged with a LAN extension, and we plan on bridging the "service" vlan,
the one that the cluster services operate on. The "totem-ring" vlan will
remain private on both sides.

For some services, we may overlap the IPs in both clusters, so that such a
service could only be run from one cluster at a time.

So is there anything I should pay attention to? I must stress that they
will be different clusters at each site, and they will have separate
fibre-channel networks and disks, separate totem cluster networks and a
shared service network.

Thanks,

From fdinitto at redhat.com  Thu Jul  7 05:29:42 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Thu, 07 Jul 2011 07:29:42 +0200
Subject: [Linux-cluster] Cluster and remote location
In-Reply-To: <829B72D777C94E15A5ED9D1CF15E5E94@versa>
References: <829B72D777C94E15A5ED9D1CF15E5E94@versa>
Message-ID: <4E154446.6040707@redhat.com>

On 07/06/2011 07:08 PM, Nicolas Ross wrote:
> Hi !
>
> In our curent setup we have an 8-node cluster at site A. In the near
> future, we will have a different cluster at site B.
Both site will be > bridged with a lan-extension, and we plan on bridging the "service" > vlan, the one that that cluster services operetes on. The "totem-ring" > vlan will remain private on both sides. > > For some services, we may overlap the ips in both cluster, so that this > service could only be run from one cluster at the time. > > So is there anything I should pay attention to ? Well yes.. what you describe is a multi-site cluster (as we agreed to call it at LPC 2010 http://etherpad.osuosl.org/lpc2010-high-availability-clustering). There is no infrastructure to support this setup yet and to avoid services to be running at the same time on cluster A and B. you can still do a setup that involves cluster A to be active and B in hot-standby (for example cluster B would have no rgmanager running but everything else can be ready). Manual failover has to be done by sysadmin between clusters. Protection against nodes failing within the same cluster is still operational as-is. Fabio From Chris.Jankowski at hp.com Thu Jul 7 08:01:43 2011 From: Chris.Jankowski at hp.com (Jankowski, Chris) Date: Thu, 7 Jul 2011 08:01:43 +0000 Subject: [Linux-cluster] DR node in a cluster In-Reply-To: References: <036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net> Message-ID: <036B68E61A28CA49AC2767596576CD596F69021EEC@GVW1113EXC.americas.hpqcorp.net> Paras, With your SAN on one site, what is the point of having a stretched cluster? If your datacenter, where the SAN is located, burns down, you've lost all your data. The DR servers in the DR datacenter are kind of useless without the data on shared storage. Regards, Chris From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan Sent: Thursday, 7 July 2011 03:17 To: linux clustering Subject: Re: [Linux-cluster] DR node in a cluster Chris, All the nodes are connected to a single SAN at this moment through fibre. @steven: -- If you don't have enough nodes at a site to allow quorum to be established, then when communication fails between sites you must fence those nodes or risk data corruption when communication is re-established, ----- Yes true, but in this case a single node can made the cluster quorate. (qdisk vote=3 ,node votes=3, total=6) which is not recommened I guess (?). Steve On Wed, Jul 6, 2011 at 11:46 AM, Jankowski, Chris > wrote: Paras, A curiosity question: How do you make sure that your storage will survive failure of *either* of your site without loss of data and continuity of service? What storage configuration are you using? Thanks and regards, Chris From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan Sent: Thursday, 7 July 2011 02:15 To: linux clustering Subject: [Linux-cluster] DR node in a cluster Hi, My GFS2 linux cluster has three nodes. Two at the data center and one at the DR site. If the nodes at DR site break/turnoff, all the services move to DR node. But if the 2 nodes at the data center lost communication with the DR node, I am not sure how does the cluster handles the split brain. So I am looking for some recommendation in this kind of scenario. I am usig Qdisk votes (=3) in this case. -- Here is the cman_tool status output. 
- Version: 6.2.0 Config Version: 74 Cluster Name: vrprd Cluster Id: 3304 Cluster Member: Yes Cluster Generation: 1720 Membership state: Cluster-Member Nodes: 3 Expected votes: 6 Quorum device votes: 3 Total votes: 6 Quorum: 4 Active subsystems: 10 Flags: Dirty Ports Bound: 0 11 177 Node name: vrprd1.hostmy.com Node ID: 2 Multicast addresses: x.x.x.244 Node addresses: x.x.x.96 -- Thanks! Paras. -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From pradhanparas at gmail.com Thu Jul 7 17:26:26 2011 From: pradhanparas at gmail.com (Paras pradhan) Date: Thu, 7 Jul 2011 12:26:26 -0500 Subject: [Linux-cluster] DR node in a cluster In-Reply-To: <036B68E61A28CA49AC2767596576CD596F69021EEC@GVW1113EXC.americas.hpqcorp.net> References: <036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net> <036B68E61A28CA49AC2767596576CD596F69021EEC@GVW1113EXC.americas.hpqcorp.net> Message-ID: Yes because of the licensing issue we are now limited to a single San but not in the future. Thanks guys for the replies Paras On Thursday, July 7, 2011, Jankowski, Chris wrote: > Paras,?With your SAN on one site, what is the point of having a stretched cluster?If your datacenter, where the SAN is located, burns down, you?ve lost all your data.The DR servers in the DR datacenter are kind of useless without the data on shared storage.?Regards,?Chris??From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan > Sent: Thursday, 7 July 2011 03:17 > To: linux clustering > Subject: Re: [Linux-cluster] DR node in a cluster?Chris,?All the nodes are connected to a single SAN at this moment through fibre.??@steven:?--?If you don't have enough nodes at a site to allow quorum to be > established, then when communication fails between sites you must fence > those nodes or risk data corruption when communication is > re-established, > -----?Yes true, but in this case a single node can made the cluster quorate. (qdisk vote=3 ,node votes=3, total=6) which is not recommened I guess (?). > SteveOn Wed, Jul 6, 2011 at 11:46 AM, Jankowski, Chris wrote:Paras,?A curiosity question:?How do you make sure that your storage will survive failure of *either* of your site without loss of data and continuity of service?What storage configuration are you using??Thanks and regards, > Chris?From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan > Sent: Thursday, 7 July 2011 02:15 > To: linux clustering > Subject: [Linux-cluster] DR node in a cluster?Hi,?My GFS2 linux cluster has three nodes. Two at the data center and one at the DR site. If the nodes at DR site break/turnoff, all the services move to DR node. But if the 2 nodes at the data center lost communication with the DR node, I am not sure how does the cluster handles the split brain. So I am looking for some recommendation in this kind of scenario. 
I am usig Qdisk votes (=3) in this case.?--Here is the cman_tool stat From teigland at redhat.com Thu Jul 7 20:39:45 2011 From: teigland at redhat.com (David Teigland) Date: Thu, 7 Jul 2011 16:39:45 -0400 Subject: [Linux-cluster] [PATCH V2] dumping the unknown address when got a connect from non cluster node In-Reply-To: <20110704.122551.868642282278092140.yamato@redhat.com> References: <20110701.162658.1002310074831263303.yamato@redhat.com> <20110701174136.GC23008@redhat.com> <20110704.122551.868642282278092140.yamato@redhat.com> Message-ID: <20110707203944.GA9863@redhat.com> On Mon, Jul 04, 2011 at 12:25:51PM +0900, Masatake YAMATO wrote: > >> Another patch useful for debugging cluster.conf and network configuration. > >> > >> This is useful when you build a cluster with nodes connected each others with > >> a software bridge(virbrN). If you install wrong iptabels configuration, dlm > >> cannot establish connections. You will just see Thanks, I've pushed both patches to the next branch in dlm.git, if you'd like to try them out. Dave From fdinitto at redhat.com Fri Jul 8 12:28:09 2011 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Fri, 08 Jul 2011 14:28:09 +0200 Subject: [Linux-cluster] fence-agents-3.1.5 stable release Message-ID: <4E16F7D9.9040903@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Welcome to the fence-agents 3.1.5 release. This release includes a few minor bug fixes and support for more devices (Eaton Switched ePDU). The new source tarball can be downloaded here: https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-3.1.5.tar.xz To report bugs or issues: https://bugzilla.redhat.com/ Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other sysadministrators or power users. Thanks/congratulations to all people that contributed to achieve this great milestone. Happy clustering, Fabio Under the hood (from 3.1.4): Arnaud Quette (1): eaton_snmp: add support for Eaton Switched ePDU Fabio M. 
Di Nitto (3): relaxng: ship bits required to build the schema at runtime relaxng: drop definition relaxng: drop static agents definitions Lon Hohberger (1): Make fence_ack_manual 'usage' more accessible Marek 'marx' Grac (3): fence_drac5: Fix support for Dell DRAC CMC fence_bladecenter: Reboot operation did not work correctly with - --missing- fence_drac5: Incorrect output of 'list' operation on Drac 5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBCAAGBQJOFvfXAAoJEFA6oBJjVJ+OjbsP/1pP6I2W86x++n+OJpOUvfNK 9ZXFBMvz3hphqHcYkACgoKE5qmQDYgCdH5EmJXrO4iQ0fCZ16/vyH2USqG3CqO7h 3VozL78IHjM8YDlssoWjD/vbzK5w6KbaC4Qpy2vcW73ARi1Ot0vsJN9InexyKorl XAT4MIBqdJNwSfI2wGT9pe1duoY0AqNrz+UlRssXjYgaxlxq/5LtVmbRnRpbypqO yBLtspTK+fSy0ofD2hxsOpTHDSgmMaj8REeN49iP924JbKbGE0FBl5yHh4Kd7UGI 6Q5gdbm7PetZAz9jubbJdH2yKRV5C0btzlvH7/LJsL4AK8qjA49cz7erygRU0sRv 1HOKFJ+xIWiNvp/I5AYhjIjMc0r1Eafrmpg7FgyhG9bNM6avmh7KiTSAxXfwOjRK H22uakA/cxwOGWMMwLfRsSqwh5H9QTDRrXbF2xMHuenq/qDVvgHxveFoLqWQSdGd HFsbcjWtfcPNer/+Fawk/7FoDkh2K0+EuhaPJdNplk+NkyTjw+EE67z6tdnislWB 0Ocx8YwX3ID7QjwySxPCNoayuUKUjDJiOgugEiHf4PrbGoqczQHug6O49IUwb2ZR oT2DwpJnMptQluh1f5Xkw7OLB+tVA8elbnDjqzejB/5aj4LGZy2r/3OrQkOGfdnK Stv3UdSiSk0mdv9SPF6F =omcd -----END PGP SIGNATURE----- From jpolo at wtransnet.com Fri Jul 8 15:41:08 2011 From: jpolo at wtransnet.com (Javi Polo) Date: Fri, 08 Jul 2011 17:41:08 +0200 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference Message-ID: <4E172514.7060806@wtransnet.com> Hello everyone! I've set up a cluster in order to use GFS2. The cluster works really well ;) Then, I've exported the GFS2 filesystem via NFS to share with machines outside the cluster, and in a read fashion it works OK, but as soon as I try to write in it, the filesystem seems to hang: root at file03:~# mount filepro01:/mnt/gfs /mnt/tmp -o soft root at file03:~# ls /mnt/tmp/ algo caca caca2 testa root at file03:~# mkdir /mnt/tmp/otracosa at this point, the NFS stopped working. I can see in the nfs client: [11132241.127470] nfs: server filepro01 not responding, timed out however, the directory was indeed created, and the other node can continue using the gfs2 filesystem (locally) On the NFS server (filepro01) looking at the logs I found some nasty things. This first part is mounting the filesystem, which is OK: [6234925.738508] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory [6234925.787305] NFSD: starting 90-second grace period [6234925.825811] GFS2 (built Feb 7 2011 16:11:33) installed [6234925.826698] GFS2: fsid=: Trying to join cluster "lock_dlm", "wtn_cluster:file01" [6234925.886991] GFS2: fsid=wtn_cluster:file01.0: Joined cluster. Now mounting FS... [6234925.975113] GFS2: fsid=wtn_cluster:file01.0: jid=0, already locked for use [6234925.975116] GFS2: fsid=wtn_cluster:file01.0: jid=0: Looking at journal... [6234926.075105] GFS2: fsid=wtn_cluster:file01.0: jid=0: Acquiring the transaction lock... [6234926.075152] GFS2: fsid=wtn_cluster:file01.0: jid=0: Replaying journal... [6234926.076200] GFS2: fsid=wtn_cluster:file01.0: jid=0: Replayed 8 of 9 blocks [6234926.076204] GFS2: fsid=wtn_cluster:file01.0: jid=0: Found 1 revoke tags [6234926.076649] GFS2: fsid=wtn_cluster:file01.0: jid=0: Journal replayed in 1s [6234926.076800] GFS2: fsid=wtn_cluster:file01.0: jid=0: Done [6234926.076945] GFS2: fsid=wtn_cluster:file01.0: jid=1: Trying to acquire journal lock... 
[6234926.078723] GFS2: fsid=wtn_cluster:file01.0: jid=1: Looking at journal... [6234926.257645] GFS2: fsid=wtn_cluster:file01.0: jid=1: Done [6234926.258187] GFS2: fsid=wtn_cluster:file01.0: jid=2: Trying to acquire journal lock... [6234926.260966] GFS2: fsid=wtn_cluster:file01.0: jid=2: Looking at journal... [6234926.549636] GFS2: fsid=wtn_cluster:file01.0: jid=2: Done [6234930.789787] ipmi message handler version 39.2 and when we try to write from nfs client, bang: [6235083.656954] BUG: unable to handle kernel NULL pointer dereference at 00000024 [6235083.656973] IP: [] gfs2_drevalidate+0xe/0x200 [gfs2] [6235083.656992] *pdpt = 0000000001831027 *pde = 0000000000000000 [6235083.657003] Oops: 0000 [#1] SMP [6235083.657012] last sysfs file: /sys/module/dlm/initstate [6235083.657018] Modules linked in: ipmi_msghandler xenfs gfs2 ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dlm configfs nfsd e xportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc drbd lru_cache lp parport [last unloaded: scsi_transport_iscsi] [6235083.657090] [6235083.657095] Pid: 1497, comm: nfsd Tainted: G W 2.6.38-2-virtual #29~lucid1-Ubuntu / [6235083.657103] EIP: 0061:[] EFLAGS: 00010282 CPU: 0 [6235083.657115] EIP is at gfs2_drevalidate+0xe/0x200 [gfs2] [6235083.657120] EAX: eb9d7180 EBX: eb9d7180 ECX: ee2ec000 EDX: 00000000 [6235083.657127] ESI: eb924580 EDI: 00000000 EBP: c1dc5c68 ESP: c1dc5c20 [6235083.657133] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069 [6235083.657139] Process nfsd (pid: 1497, ti=c1dc4000 task=c1b18ca0 task.ti=c1dc4000) [6235083.657145] Stack: [6235083.657150] c1dc5c28 c0627afd c1dc5c68 c0242314 00000000 c1dc5c7c ee2dba0c ee2c02d0 [6235083.657170] 00000001 eb924580 c1a47038 c1dc5cb0 eb9d7188 00000004 14a2fc97 eb9d7180 [6235083.657190] eb924580 00000000 c1dc5c7c c023a18f eb9d7180 eb924580 eb925000 c1dc5ca0 [6235083.657210] Call Trace: [6235083.657220] [] ? _raw_spin_lock+0xd/0x10 [6235083.657230] [] ? __d_lookup+0xf4/0x150 [6235083.657242] [] ? gfs2_permission+0xcc/0x120 [gfs2] [6235083.657253] [] ? gfs2_check_acl+0x0/0x80 [gfs2] [6235083.657263] [] d_revalidate+0x1f/0x60 [6235083.657271] [] __lookup_hash+0xa2/0x180 [6235083.657284] [] ? encode_post_op_attr+0x86/0x90 [nfsd] [6235083.657292] [] lookup_one_len+0x43/0x80 [6235083.657303] [] compose_entry_fh+0x9f/0xe0 [nfsd] [6235083.657315] [] encode_entryplus_baggage+0x51/0xb0 [nfsd] [6235083.657327] [] encode_entry+0x2a5/0x2f0 [nfsd] [6235083.657338] [] nfs3svc_encode_entry_plus+0x40/0x50 [nfsd] [6235083.657349] [] nfsd_buffered_readdir+0xfd/0x1a0 [nfsd] [6235083.657361] [] ? nfs3svc_encode_entry_plus+0x0/0x50 [nfsd] [6235083.657372] [] nfsd_readdir+0x70/0xb0 [nfsd] [6235083.657383] [] nfsd3_proc_readdirplus+0xd8/0x200 [nfsd] [6235083.657394] [] ? nfs3svc_encode_entry_plus+0x0/0x50 [nfsd] [6235083.657405] [] nfsd_dispatch+0xd3/0x210 [nfsd] [6235083.657423] [] svc_process_common+0x2e3/0x590 [sunrpc] [6235083.657438] [] ? svc_xprt_received+0x2d/0x40 [sunrpc] [6235083.657452] [] ? svc_recv+0x48b/0x750 [sunrpc] [6235083.657465] [] svc_process+0xdc/0x140 [sunrpc] [6235083.657474] [] ? down_read+0x10/0x20 [6235083.657483] [] nfsd+0xb4/0x140 [nfsd] [6235083.657493] [] ? complete+0x4e/0x60 [6235083.657503] [] ? nfsd+0x0/0x140 [nfsd] [6235083.657513] [] kthread+0x74/0x80 [6235083.657520] [] ? 
kthread+0x0/0x80 [6235083.657528] [] kernel_thread_helper+0x6/0x10 [6235083.657533] Code: 8b 53 08 e8 75 d4 0a d2 f7 d0 89 03 31 c0 5b 5d c3 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 57 56 53 83 ec 3c 3e 8d 74 26 00 42 24 40 89 c3 b8 f6 ff ff ff 74 0d 83 c4 3c 5b 5e 5f 5d c3 [6235083.657652] EIP: [] gfs2_drevalidate+0xe/0x200 [gfs2] SS:ESP 0069:c1dc5c20 [6235083.865070] CR2: 0000000000000024 [6235083.865077] ---[ end trace 2dfc9195648a185b ]--- [6235099.205542] dlm: connecting to 2 Is this a bug? Is it known? Are there any workarounds? The gfs2+nfs server is a xen client, with ubuntu 10.04 and kernel 2.6.38-2-virtual # gfs2_tool version gfs2_tool 3.0.12 (built Jul 5 2011 16:52:20) Copyright (C) Red Hat, Inc. 2004-2010 All rights reserved. # cman_tool version 6.2.0 config 2011070805 Here's also the cluster.conf file, just in case ;) Thanks in advance :) -- Javi Polo Administrador de Sistemas Tel 93 734 97 70 Fax 93 734 97 71 jpolo at wtransnet.com From swhiteho at redhat.com Fri Jul 8 16:22:18 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Fri, 08 Jul 2011 17:22:18 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <4E172514.7060806@wtransnet.com> References: <4E172514.7060806@wtransnet.com> Message-ID: <1310142138.2705.37.camel@menhir> Hi, On Fri, 2011-07-08 at 17:41 +0200, Javi Polo wrote: > Hello everyone! > > I've set up a cluster in order to use GFS2. The cluster works really well ;) > Then, I've exported the GFS2 filesystem via NFS to share with machines > outside the cluster, and in a read fashion it works OK, but as soon as I > try to write in it, the filesystem seems to hang: > > root at file03:~# mount filepro01:/mnt/gfs /mnt/tmp -o soft > root at file03:~# ls /mnt/tmp/ > algo caca caca2 testa > root at file03:~# mkdir /mnt/tmp/otracosa > > at this point, the NFS stopped working. I can see in the nfs client: > > [11132241.127470] nfs: server filepro01 not responding, timed out > > however, the directory was indeed created, and the other node can > continue using the gfs2 filesystem (locally) > On the NFS server (filepro01) looking at the logs I found some nasty > things. This first part is mounting the filesystem, which is OK: > Currently we don't recommend using NFS on a GFS2 filesystem which is also being used locally. That will hopefully change in the future, however, in the mean time I'd suggest using the localflocks mount option on all the mounts (and be aware the fcntl/flock locking is then node local) to avoid problems that you are otherwise likely to hit during recovery. Also... > [6234925.738508] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state > recovery directory > [6234925.787305] NFSD: starting 90-second grace period > [6234925.825811] GFS2 (built Feb 7 2011 16:11:33) installed > [6234925.826698] GFS2: fsid=: Trying to join cluster "lock_dlm", > "wtn_cluster:file01" > [6234925.886991] GFS2: fsid=wtn_cluster:file01.0: Joined cluster. Now > mounting FS... > [6234925.975113] GFS2: fsid=wtn_cluster:file01.0: jid=0, already locked > for use > [6234925.975116] GFS2: fsid=wtn_cluster:file01.0: jid=0: Looking at > journal... > [6234926.075105] GFS2: fsid=wtn_cluster:file01.0: jid=0: Acquiring the > transaction lock... > [6234926.075152] GFS2: fsid=wtn_cluster:file01.0: jid=0: Replaying > journal... 
> [6234926.076200] GFS2: fsid=wtn_cluster:file01.0: jid=0: Replayed 8 of 9 > blocks > [6234926.076204] GFS2: fsid=wtn_cluster:file01.0: jid=0: Found 1 revoke tags > [6234926.076649] GFS2: fsid=wtn_cluster:file01.0: jid=0: Journal > replayed in 1s > [6234926.076800] GFS2: fsid=wtn_cluster:file01.0: jid=0: Done > [6234926.076945] GFS2: fsid=wtn_cluster:file01.0: jid=1: Trying to > acquire journal lock... > [6234926.078723] GFS2: fsid=wtn_cluster:file01.0: jid=1: Looking at > journal... > [6234926.257645] GFS2: fsid=wtn_cluster:file01.0: jid=1: Done > [6234926.258187] GFS2: fsid=wtn_cluster:file01.0: jid=2: Trying to > acquire journal lock... > [6234926.260966] GFS2: fsid=wtn_cluster:file01.0: jid=2: Looking at > journal... > [6234926.549636] GFS2: fsid=wtn_cluster:file01.0: jid=2: Done > [6234930.789787] ipmi message handler version 39.2 > That all looks ok, but... > and when we try to write from nfs client, bang: > > [6235083.656954] BUG: unable to handle kernel NULL pointer dereference > at 00000024 > [6235083.656973] IP: [] gfs2_drevalidate+0xe/0x200 [gfs2] > [6235083.656992] *pdpt = 0000000001831027 *pde = 0000000000000000 > [6235083.657003] Oops: 0000 [#1] SMP > [6235083.657012] last sysfs file: /sys/module/dlm/initstate > [6235083.657018] Modules linked in: ipmi_msghandler xenfs gfs2 ib_iser > rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp > libiscsi scsi_transport_iscsi dlm configfs nfsd e > xportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc drbd lru_cache lp > parport [last unloaded: scsi_transport_iscsi] > [6235083.657090] > [6235083.657095] Pid: 1497, comm: nfsd Tainted: G W > 2.6.38-2-virtual #29~lucid1-Ubuntu / > [6235083.657103] EIP: 0061:[] EFLAGS: 00010282 CPU: 0 > [6235083.657115] EIP is at gfs2_drevalidate+0xe/0x200 [gfs2] this should not happen. It looks like we are trying to look up something that is 24 (hex) bytes into a structure. Does the fs have posix acls enabled or selinux or something else using xattrs? Steve. From ajb2 at mssl.ucl.ac.uk Fri Jul 8 16:46:01 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Fri, 8 Jul 2011 17:46:01 +0100 (BST) Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <1310142138.2705.37.camel@menhir> References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> Message-ID: On Fri, 8 Jul 2011, Steven Whitehouse wrote: > Currently we don't recommend using NFS on a GFS2 filesystem which is > also being used locally. After much dealing with NFS internals, I would recommend NOT using it on any filesystem where the files are accessed locally. NFSv2/3 doesn't play nice with anything else which may access the underlaying disk (including Samba. The only "safe" method is to export your samba shares from a NFS client elsewhere on the network). YMMV. NFSv4 is supposedly better behaved. I've not tested it. From jpolo at wtransnet.com Fri Jul 8 16:59:50 2011 From: jpolo at wtransnet.com (Javi Polo) Date: Fri, 08 Jul 2011 18:59:50 +0200 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <1310142138.2705.37.camel@menhir> References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> Message-ID: <4E173786.1060507@wtransnet.com> thx for the fast reply! El 07/08/11 18:22, Steven Whitehouse escribi?: > Currently we don't recommend using NFS on a GFS2 filesystem which is > also being used locally. 
That will hopefully change in the future, > however, in the mean time I'd suggest using the localflocks mount option > on all the mounts (and be aware the fcntl/flock locking is then node > local) to avoid problems that you are otherwise likely to hit during > recovery. Also... > I dont think I really understood you. You mean that a host wich uses GFS2 locally is not recommended to export the filesystem via NFS, but if the host just uses it as NFS export, and who access the filesystem are just the nfs clients, it is allright :? > [6235083.656954] BUG: unable to handle kernel NULL pointer dereference >> this should not happen. It looks like we are trying to look up something >> that is 24 (hex) bytes into a structure. Does the fs have posix acls >> enabled or selinux or something else using xattrs? Nope, at least as far as I know. As I dont usually use ubuntu, I have checked to see if it had selinux enabled by default, or some ACLs related thing, but it seems it's not .... -- Javi Polo Administrador de Sistemas Tel 93 734 97 70 Fax 93 734 97 71 jpolo at wtransnet.com From Colin.Simpson at iongeo.com Fri Jul 8 17:04:56 2011 From: Colin.Simpson at iongeo.com (Colin Simpson) Date: Fri, 08 Jul 2011 18:04:56 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> Message-ID: <1310144696.3779.4.camel@bhac.iouk.ioroot.tld> That's not ideal either when Samba isn't too happy working over NFS, and that is not recommended by the Samba people as being a sensible config. Colin On Fri, 2011-07-08 at 17:46 +0100, Alan Brown wrote: > On Fri, 8 Jul 2011, Steven Whitehouse wrote: > > > Currently we don't recommend using NFS on a GFS2 filesystem which is > > also being used locally. > > After much dealing with NFS internals, I would recommend NOT using it > on > any filesystem where the files are accessed locally. > > NFSv2/3 doesn't play nice with anything else which may access the > underlaying disk (including Samba. The only "safe" method is to export > your samba shares from a NFS client elsewhere on the network). > > YMMV. NFSv4 is supposedly better behaved. I've not tested it. > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed. If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original. From ajb2 at mssl.ucl.ac.uk Fri Jul 8 17:36:53 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Fri, 8 Jul 2011 18:36:53 +0100 (BST) Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <1310144696.3779.4.camel@bhac.iouk.ioroot.tld> References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> <1310144696.3779.4.camel@bhac.iouk.ioroot.tld> Message-ID: On Fri, 8 Jul 2011, Colin Simpson wrote: > That's not ideal either when Samba isn't too happy working over NFS, and > that is not recommended by the Samba people as being a sensible config. 
I know but there's a real (and demonstrable) risk of data corruption for NFS vs _anything_ if NFS clients and local processes (or clients of other services such as a samba server) happen to grab the same file for writing at the same time. Apart from that, the 1 second granularity of NFS timestamps can (and has) result in writes made by non-nfs processes to cause NFS clients which have that file opened read/write to see "stale filehandle" errors due to the inode having changed when they weren't expecting it. We (should) all know NFS was a kludge. What's surprising is how much kludge stll remains in the current v2/3 code (which is surprisingly opaque and incredibly crufty, much of it dates from the early 1990s or earlier) As I said earlier, V4 is supposed to play a lot nicer but I haven't tested it - as as far as I know it's not suported on GFS systems anyway (That was the RH official line when I tried to get it working last time..) I'd love to get v4 running properly in active/active/active setup from multiple GFS-mounted fileservers to the clients. If anyone knows how to reliably do it on EL5.6 systems then I'm open to trying again as I believe that this would solve a number of issues being seen locally (including various crash bugs). On the other hand, v2/3 aren't going away anytime soon and some effort really needs to be put into making them work properly. On the gripping hand, I'd also like to see viable alternatives to NFS when it comes to feeding 100+ desktop clients Making them mount the filesystems using GFS might sound like an alternative until you consider what happens if any of them crash/reboot during the day. Batch processes can wait all day, but users with frozen desktops get irate - quickly. From Colin.Simpson at iongeo.com Fri Jul 8 18:36:45 2011 From: Colin.Simpson at iongeo.com (Colin Simpson) Date: Fri, 08 Jul 2011 19:36:45 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir><1310144696.3779.4.camel@bhac.iouk.ioroot.tld> Message-ID: <1310150205.3779.44.camel@bhac.iouk.ioroot.tld> Very interesting. Certainly in our application it would be highly unlikely that samba and NFS would try to write to the same file simultaneously that would very much be an edge case (and users would know the result would be undefined). I can certainly personally live with that level of potential file corruption, though I can see others may not. But I guess you are also telling me that file locking between the two wouldn't be helping here either? (I rule out NFSv2 as something we have thankfully eliminated). NFSv3 could be gone for us if we are lucky by 2012 (when RHEL 4 goes EOL and if RHEL5's NFSv4 is robust enough). Currently by default RHEL6 clusters export (with the standard RA) on NFSv4 and RHEL6 (and Fedora etc) mount these as NFSv4. So I'd hope supported.....I haven't as yet tried to wrap any security round these, from a discussion here a while back that looks like hard work. I'd certainly love to have pNFS to allow multiple active nodes. OT: My main NFS issue I have just now is supporting laptops with the automounter. NFS is just so undynamic. Once a mount is in place the client changing IP will leave the mount hung. And laptops do this all the time (on, off wired, wireless VPN etc). We have some nasty scripts that clean up the mounts when laptops move network (lots of forcing kills of autofs and umount -fl's). Mostly works ok. 
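Roughly this shape, in case anyone is curious (the overall approach is what we do, but the exact paths, ordering and service handling below are illustrative rather than our actual script):

#!/bin/bash
# Crude cleanup when a laptop changes network: stop the automounter,
# then forcibly/lazily unmount any NFS mounts that got left behind.
service autofs stop 2>/dev/null || killall -9 automount
awk '$3 == "nfs" || $3 == "nfs4" {print $2}' /proc/mounts |
while read mnt; do
    umount -f "$mnt" 2>/dev/null || umount -l "$mnt"
done
service autofs start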
Again if a user disconnects a laptop during an ongoing file operation they can expect undefined file contents. It's better than the alternative of hung mounts, lots of things hate that. We aren't talking complex file formats or operations here, copying source files, data, docs to the local disk, so no nasty binary file corruption issues. Maybe not such a great thing to do, but users like to have a consistent file system view that matches the office based systems. Sadly it looks like NFS is the least dynamic network component left in a Linux distro. I posted a longer version of this problem to the linux-nfs mailing list, I heard from someone that basically said the NFS committee and developers (not just Linux) are largely targeting NFS as Enterprise Storage protocol. I presume he means storage servers using NFS to share to say front end web servers. So less interested in certainly my use case. Possibly the best bet (in a while) for desktop network file sharing will be the Samba, they seem to be trying to target cifs (with full Unix extension) as being a solution for this. Thanks Colin On Fri, 2011-07-08 at 18:36 +0100, Alan Brown wrote: > On Fri, 8 Jul 2011, Colin Simpson wrote: > > > That's not ideal either when Samba isn't too happy working over NFS, > and > > that is not recommended by the Samba people as being a sensible > config. > > I know but there's a real (and demonstrable) risk of data corruption > for > NFS vs _anything_ if NFS clients and local processes (or clients of > other > services such as a samba server) happen to grab the same file for > writing > at the same time. > > Apart from that, the 1 second granularity of NFS timestamps can (and > has) > result in writes made by non-nfs processes to cause NFS clients which > have > that file opened read/write to see "stale filehandle" errors due to > the > inode having changed when they weren't expecting it. > > We (should) all know NFS was a kludge. What's surprising is how much > kludge stll remains in the current v2/3 code (which is surprisingly > opaque > and incredibly crufty, much of it dates from the early 1990s or > earlier) > > As I said earlier, V4 is supposed to play a lot nicer but I haven't > tested > it - as as far as I know it's not suported on GFS systems anyway (That > was > the RH official line when I tried to get it working last time..) > > I'd love to get v4 running properly in active/active/active setup from > multiple GFS-mounted fileservers to the clients. If anyone knows how > to > reliably do it on EL5.6 systems then I'm open to trying again as I > believe > that this would solve a number of issues being seen locally (including > various crash bugs). > > On the other hand, v2/3 aren't going away anytime soon and some effort > really needs to be put into making them work properly. > > On the gripping hand, I'd also like to see viable alternatives to NFS > when > it comes to feeding 100+ desktop clients > > Making them mount the filesystems using GFS might sound like an > alternative until you consider what happens if any of them > crash/reboot > during the day. Batch processes can wait all day, but users with > frozen > desktops get irate - quickly. > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed. 
If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original. From fdinitto at redhat.com Fri Jul 8 18:57:32 2011 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Fri, 08 Jul 2011 20:57:32 +0200 Subject: [Linux-cluster] cluster 3.1.4 release Message-ID: <4E17531C.7070305@redhat.com> Welcome to the cluster 3.1.4 release. This release fixes a few bugs and adds a new dynamic relaxng schema creation. In order to run this version of cman/cluster, it is strictly required to have fence-agents at least in version 3.1.5 and resource-agents in version 3.9.2. Alternatively you have to disable cluster.conf validation (see ccs_config_validate.8 for information). The new source tarball can be downloaded here: https://fedorahosted.org/releases/c/l/cluster/cluster-3.1.4.tar.xz ChangeLog: https://fedorahosted.org/releases/c/l/cluster/Changelog-3.1.4 To report bugs or issues: https://bugzilla.redhat.com/ Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other sysadministrators or power users. Thanks/congratulations to all people that contributed to achieve this great milestone. Happy clustering, Fabio From ajb2 at mssl.ucl.ac.uk Fri Jul 8 19:46:45 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Fri, 08 Jul 2011 20:46:45 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <1310150205.3779.44.camel@bhac.iouk.ioroot.tld> References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir><1310144696.3779.4.camel@bhac.iouk.ioroot.tld> <1310150205.3779.44.camel@bhac.iouk.ioroot.tld> Message-ID: <4E175EA5.10305@mssl.ucl.ac.uk> Colin Simpson wrote: > But I guess you are also telling me that file locking between the two > wouldn't be helping here either? Correct. NFSd (v2/3) doesn't pass client locks to the filesystem, nor does it respect locks set by other processes. It has a number of other foibles - try setting up a large number of services where you have one NFS export per service (eg, multiple disk mounts) and there's a good chance the exports will fail at startup because they all try to run at once and end up scribbling all over each other's export list (There's a name for this kind of failure mode, which I can't remember) From bfields at fieldses.org Fri Jul 8 21:09:05 2011 From: bfields at fieldses.org (J. Bruce Fields) Date: Fri, 8 Jul 2011 17:09:05 -0400 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> <1310144696.3779.4.camel@bhac.iouk.ioroot.tld> Message-ID: <20110708210905.GD13886@fieldses.org> On Fri, Jul 08, 2011 at 06:36:53PM +0100, Alan Brown wrote: > On Fri, 8 Jul 2011, Colin Simpson wrote: > > > That's not ideal either when Samba isn't too happy working over NFS, and > > that is not recommended by the Samba people as being a sensible config. 
> > I know but there's a real (and demonstrable) risk of data corruption for > NFS vs _anything_ if NFS clients and local processes (or clients of other > services such as a samba server) happen to grab the same file for writing > at the same time. With default mount options, the linux NFS client (like most NFS clients) assumes that a file has a most one writer at a time. (Applications that need to do write-sharing over NFS need to use file locking.) Note this issue isn't special to local process--the same restriction applies to two NFS clients writing to the same file. > Apart from that, the 1 second granularity of NFS timestamps The NFS protocol supports higher granularity timestamps. The limitation is the exported filesystem. If you're using something other than ext2/3, you're probably getting higher granularity. > can (and has) > result in writes made by non-nfs processes to cause NFS clients which have > that file opened read/write to see "stale filehandle" errors due to the > inode having changed when they weren't expecting it. Changing file data or attributes won't result in stale filehandle errors. (Bug reports welcome if you've seen otherwise.) Stale filehandle errors should only happen when a client attempts to use a file which no longer exists on the server. (E.g. if another client deletes a file while your client has it open.) (This can also happen if you rename a file across directories on a filesystem exported with the subtree_check option. The subtree_check option is deprecated, for that reason.) > We (should) all know NFS was a kludge. What's surprising is how much > kludge stll remains in the current v2/3 code (which is surprisingly opaque > and incredibly crufty, much of it dates from the early 1990s or earlier) Details welcome. > As I said earlier, V4 is supposed to play a lot nicer V4 has a number of improvements, but what I've described above applies across versions (module some technical details about timestamps vs. change attributes). --b. > but I haven't tested > it - as as far as I know it's not suported on GFS systems anyway (That was > the RH official line when I tried to get it working last time..) > > I'd love to get v4 running properly in active/active/active setup from > multiple GFS-mounted fileservers to the clients. If anyone knows how to > reliably do it on EL5.6 systems then I'm open to trying again as I believe > that this would solve a number of issues being seen locally (including > various crash bugs). > > On the other hand, v2/3 aren't going away anytime soon and some effort > really needs to be put into making them work properly. > > On the gripping hand, I'd also like to see viable alternatives to NFS when > it comes to feeding 100+ desktop clients > > Making them mount the filesystems using GFS might sound like an > alternative until you consider what happens if any of them crash/reboot > during the day. Batch processes can wait all day, but users with frozen > desktops get irate - quickly. 
> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From ajb2 at mssl.ucl.ac.uk Mon Jul 11 08:30:11 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Mon, 11 Jul 2011 09:30:11 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <20110708210905.GD13886@fieldses.org> References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> <1310144696.3779.4.camel@bhac.iouk.ioroot.tld> <20110708210905.GD13886@fieldses.org> Message-ID: <4E1AB493.7060302@mssl.ucl.ac.uk> On 08/07/11 22:09, J. Bruce Fields wrote: > With default mount options, the linux NFS client (like most NFS clients) > assumes that a file has a most one writer at a time. (Applications that > need to do write-sharing over NFS need to use file locking.) The problem is that file locking on V3 isn't passed back down to the filesystem - hence the issues with nfs vs samba (or local disk access(*)) on the same server. (*) Local disk access includes anything running on other nodes in a GFS/GFS2 environment. This precludes exporting the same GFS(2) filesystem on multiple cluster nodes. > The NFS protocol supports higher granularity timestamps. The limitation > is the exported filesystem. If you're using something other than > ext2/3, you're probably getting higher granularity. GFS/GFS2 in this case... >> can (and has) >> result in writes made by non-nfs processes to cause NFS clients which have >> that file opened read/write to see "stale filehandle" errors due to the >> inode having changed when they weren't expecting it. > > Changing file data or attributes won't result in stale filehandle > errors. (Bug reports welcome if you've seen otherwise.) I'll have to try and repeat the issue, but it's a race condition with a narrow window at the best of times. > Stale > filehandle errors should only happen when a client attempts to use a > file which no longer exists on the server. (E.g. if another client > deletes a file while your client has it open.) It's possible this has happened. I have no idea what user batch scripts are trying to do on the compute nodes, but in the case that was brought to my attention the file was edited on one node while another had it open. > (This can also happen if > you rename a file across directories on a filesystem exported with the > subtree_check option. The subtree_check option is deprecated, for that > reason.) All our FSes are exported no_subtree_check and at the root of the FS. >> We (should) all know NFS was a kludge. What's surprising is how much >> kludge stll remains in the current v2/3 code (which is surprisingly opaque >> and incredibly crufty, much of it dates from the early 1990s or earlier) > > Details welcome. The non-parallelisation in exportfs (leading to race conditions) for starters. We had to insert flock statements in every call to it in /usr/share/cluster/nfsclient.sh in order to have reliable service startups There are a number of RH Bugzilla tickets revolving around NFS behaviour which would be worth looking at. >> As I said earlier, V4 is supposed to play a lot nicer > > V4 has a number of improvements, but what I've described above applies > across versions (module some technical details about timestamps vs. > change attributes). Thanks for the input. NFS has been a major pain point in our organisation for years. If you have ideas for doing things better then I'm very interested. 
Alan From swhiteho at redhat.com Mon Jul 11 10:43:58 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Mon, 11 Jul 2011 11:43:58 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <4E1AB493.7060302@mssl.ucl.ac.uk> References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> <1310144696.3779.4.camel@bhac.iouk.ioroot.tld> <20110708210905.GD13886@fieldses.org> <4E1AB493.7060302@mssl.ucl.ac.uk> Message-ID: <1310381038.2766.9.camel@menhir> Hi, On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote: > On 08/07/11 22:09, J. Bruce Fields wrote: > > > With default mount options, the linux NFS client (like most NFS clients) > > assumes that a file has a most one writer at a time. (Applications that > > need to do write-sharing over NFS need to use file locking.) > > The problem is that file locking on V3 isn't passed back down to the > filesystem - hence the issues with nfs vs samba (or local disk > access(*)) on the same server. > > (*) Local disk access includes anything running on other nodes in a > GFS/GFS2 environment. This precludes exporting the same GFS(2) > filesystem on multiple cluster nodes. > Well the locks are kind of passed down, but there is not enough info to make it work correctly, hence we require the localflocks mount option to prevent this information from being passed down at all. > > > The NFS protocol supports higher granularity timestamps. The limitation > > is the exported filesystem. If you're using something other than > > ext2/3, you're probably getting higher granularity. > > GFS/GFS2 in this case... > GFS supports second resolution time stamps GFS2 supports nanosecond resolution time stamps > >> can (and has) > >> result in writes made by non-nfs processes to cause NFS clients which have > >> that file opened read/write to see "stale filehandle" errors due to the > >> inode having changed when they weren't expecting it. > > > > Changing file data or attributes won't result in stale filehandle > > errors. (Bug reports welcome if you've seen otherwise.) > > I'll have to try and repeat the issue, but it's a race condition with a > narrow window at the best of times. > GFS2 doesn't do anything odd with filehandles, they shouldn't be coming up as stale unless the inode has been removed. > > Stale > > filehandle errors should only happen when a client attempts to use a > > file which no longer exists on the server. (E.g. if another client > > deletes a file while your client has it open.) > > It's possible this has happened. I have no idea what user batch scripts > are trying to do on the compute nodes, but in the case that was brought > to my attention the file was edited on one node while another had it open. > That probably means the editor made a copy of it and then moved it back over the top of the original file, thus unlinking the original file. > > (This can also happen if > > you rename a file across directories on a filesystem exported with the > > subtree_check option. The subtree_check option is deprecated, for that > > reason.) > > All our FSes are exported no_subtree_check and at the root of the FS. > > >> We (should) all know NFS was a kludge. What's surprising is how much > >> kludge stll remains in the current v2/3 code (which is surprisingly opaque > >> and incredibly crufty, much of it dates from the early 1990s or earlier) > > > > Details welcome. > > The non-parallelisation in exportfs (leading to race conditions) for > starters. 
We had to insert flock statements in every call to it in > /usr/share/cluster/nfsclient.sh in order to have reliable service startups > > There are a number of RH Bugzilla tickets revolving around NFS behaviour > which would be worth looking at. > > >> As I said earlier, V4 is supposed to play a lot nicer > > > > V4 has a number of improvements, but what I've described above applies > > across versions (module some technical details about timestamps vs. > > change attributes). > > Thanks for the input. > > NFS has been a major pain point in our organisation for years. If you > have ideas for doing things better then I'm very interested. > > Alan > NFS and GFS2 is an area in which we are trying to gradually increase the supportable use cases. It is also a rather complex area, so it will take some time to do this, Steve. From bfields at fieldses.org Mon Jul 11 12:05:46 2011 From: bfields at fieldses.org (J. Bruce Fields) Date: Mon, 11 Jul 2011 08:05:46 -0400 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <1310381038.2766.9.camel@menhir> References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> <1310144696.3779.4.camel@bhac.iouk.ioroot.tld> <20110708210905.GD13886@fieldses.org> <4E1AB493.7060302@mssl.ucl.ac.uk> <1310381038.2766.9.camel@menhir> Message-ID: <20110711120546.GA26712@fieldses.org> On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse wrote: > Hi, > > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote: > > On 08/07/11 22:09, J. Bruce Fields wrote: > > > > > With default mount options, the linux NFS client (like most NFS clients) > > > assumes that a file has a most one writer at a time. (Applications that > > > need to do write-sharing over NFS need to use file locking.) > > > > The problem is that file locking on V3 isn't passed back down to the > > filesystem - hence the issues with nfs vs samba (or local disk > > access(*)) on the same server. The NFS server *does* acquire locks on the exported filesystem (and does it the same way for v2, v3, and v4). For local filesystems (ext3, xfs, btrfs), this is sufficient. For exports of cluster filesystems like gfs2, there are more complicated problems that, as Steve says, will require some work to do to fix. Samba is a more complicated issue due to the imperfect match between Windows and Linux locking semantics, but depending on how it's configured Samba will also acquire locks on the exported filesystem. --b. From Ralph.Grothe at itdz-berlin.de Mon Jul 11 13:11:44 2011 From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de) Date: Mon, 11 Jul 2011 15:11:44 +0200 Subject: [Linux-cluster] how to disable one node In-Reply-To: <1309954418.14584.2148764421@webmail.messagingengine.com> References: <1309954418.14584.2148764421@webmail.messagingengine.com> Message-ID: I'm not sure if you can access this doc (I think it requires a login account at RHN), and if this addresses your issue? In RHN Knowledge base there is this article entitled "How do I disable the cluster software on a member system in Red Hat Enterprise Linux?" https://access.redhat.com/kb/docs/DOC-5695 Good Luck Ralph ________________________________ From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Helen Heath Sent: Wednesday, July 06, 2011 2:14 PM To: linux-cluster at redhat.com Subject: [Linux-cluster] how to disable one node Hi all - I hope someone can shed some light on this. 
I have a 2-node cluster running on RedHat 3 which has a shared /clust1 filesystem and is connected to a network power switch. There is something very wrong with the cluster, as every day currently it is rebooting whichever is the primary node, for no reason I can track down. No hardware faults anywhere in the cluster, no failures of any kind logging in any log files, etc etc. It started out well over a year ago rebooting the primary node every other week, then across time it progressed to once a week, then once a day. I logged a call with RedHat way back when it first started; nothing was ever found to be the problem, and of course in time, RedHat v3 went out of support and they would no longer assist in troubleshooting the issue. Prior to this problem starting the cluster had been running happily with no issues for about 5 years. Now this cluster is shortly being replaced with new hardware and RedHat 5, so hopefully whatever is the problem will as mysteriously vanish as it appeared. However, I need to stop this daily reboot as it is playing havoc with the application that runs on this system (a heavily-utilised database) and having tried everything I can think of, I decided to 'break' the cluster; ie, take down one node so that only one node remains running the application. I cannot find a way to do this that persists across a reboot of the node that should be out of the cluster. I've run "/sbin/chkconfig --del clumanager" and it did take the service out of chkconfig (I verified this). The RedHat document http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html /Cluster_Administration/s1-admin-disable.html seems to indicate this should persist across a reboot - ie, you reboot the node and it does not attempt to rejoin the cluster; however, this didn't work! The primary node cluster monitoring software saw that the secondary node was down, STONITH kicked in, the NPS powered the port this node is connected to off and back on, the secondary node rebooted and rejoined the cluster! Does anyone know how to either temporarily remove the secondary node from the cluster in such a way that persists across reboots but can be easily brought back into the cluster when needed, or else (and preferably) how to temporarily stop the cluster monitoring software running on the primary node from even looking out for the secondary node - as in, it doesn't care whether the secondary node is up or not? I've checked for the period the secondary node is down that the primary node is quite happy to carry on processing as usual but as soon as the cluster monitoring software on the primary node realises the secondary node is down, it reboots it, and I'm back to square one! This is now really urgent (I've been trying to find an answer to this for some weeks now) as I go on holiday on Friday and I really don't want to leave my second-in-command with a mess on his hands! thanks -- Helen Heath helen_heath at fastmail.fm =*= Everything that has a beginning has an ending. Make your peace with that and all will be well. -- Buddhist saying From linux at alteeve.com Tue Jul 12 00:07:51 2011 From: linux at alteeve.com (Digimer) Date: Mon, 11 Jul 2011 20:07:51 -0400 Subject: [Linux-cluster] Detecting Windows Xen VM crash in RHCS2 (EL5) w/ rgmanager Message-ID: <4E1B9057.9010906@alteeve.com> Hi all, Doing some testing, I found that rgmanager detects a crash in my *nix VMs and properly restarts them. However, when I BSOD a Windows domU, the VM is left alone. Is it possible to have RGManager tell when a windows VM dies? 
If so, what would the magical incantation be? All packages from the stock repos: EL5.6 Xen 3.1 CMAN+RGmanager -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "I feel confined, only free to expand myself within boundaries." From Colin.Simpson at iongeo.com Tue Jul 12 01:29:56 2011 From: Colin.Simpson at iongeo.com (Colin Simpson) Date: Tue, 12 Jul 2011 02:29:56 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <20110711120546.GA26712@fieldses.org> References: <20110711120546.GA26712@fieldses.org> Message-ID: <1310434196.11870.17.camel@shyster> OK, so my question is, is there any other reason apart from the risk of individual file corruption from locking being incompatible between local/samba vs NFS that may lead to issues i.e. we aren't really interested in locking working between NFS and local/samba access just that it works consistently in NFS when accessing files that way (with a single node server) and locally/samba when accessing files that way. I mean I'm thinking of, for example, I have a build that generates source code via NFS then some time later a PC comes in via Samba and accesses these files for building on that environment. The two systems aren't requiring locking to work cross platform/protocol, just need to be exported to the two systems. But locking on each one separately is useful. If there are and we should be using all access via NFS on NFS exported filesystems, one issue that also springs to mind is commercial backup systems that support GFS2 but don't support backing up via NFS. Is there anything else I should know about GFS2 limitations? Is there a book "GFS: The Missing Manual"? :) Thanks Colin On Mon, 2011-07-11 at 13:05 +0100, J. Bruce Fields wrote: > On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse wrote: > > Hi, > > > > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote: > > > On 08/07/11 22:09, J. Bruce Fields wrote: > > > > > > > With default mount options, the linux NFS client (like most NFS > clients) > > > > assumes that a file has a most one writer at a time. > (Applications that > > > > need to do write-sharing over NFS need to use file locking.) > > > > > > The problem is that file locking on V3 isn't passed back down to > the > > > filesystem - hence the issues with nfs vs samba (or local disk > > > access(*)) on the same server. > > The NFS server *does* acquire locks on the exported filesystem (and > does > it the same way for v2, v3, and v4). > > For local filesystems (ext3, xfs, btrfs), this is sufficient. > > For exports of cluster filesystems like gfs2, there are more > complicated > problems that, as Steve says, will require some work to do to fix. > > Samba is a more complicated issue due to the imperfect match between > Windows and Linux locking semantics, but depending on how it's > configured Samba will also acquire locks on the exported filesystem. > > --b. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed. 
If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original.

From jpolo at wtransnet.com Tue Jul 12 12:49:54 2011
From: jpolo at wtransnet.com (Javi Polo)
Date: Tue, 12 Jul 2011 14:49:54 +0200
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference
In-Reply-To: <4E173786.1060507@wtransnet.com>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> <4E173786.1060507@wtransnet.com>
Message-ID: <4E1C42F2.6020609@wtransnet.com>

El 07/08/11 18:59, Javi Polo escribió:
> >> [6235083.656954] BUG: unable to handle kernel NULL pointer dereference
>>> this should not happen. It looks like we are trying to look up
>>> something
>>> that is 24 (hex) bytes into a structure. Does the fs have posix acls
>>> enabled or selinux or something else using xattrs?
>
> Nope, at least as far as I know. As I dont usually use ubuntu, I have
> checked to see if it had selinux enabled by default, or some ACLs
> related thing, but it seems it's not ....
>

Could anyone give me a hint on this matter? I'm not using selinux nor xattrs nor posix acls, just a plain gfs2 filesystem with 3 journals ...

thx

--
Javi Polo
Administrador de Sistemas
Tel 93 734 97 70
Fax 93 734 97 71
jpolo at wtransnet.com

From Colin.Simpson at iongeo.com Tue Jul 12 18:52:51 2011
From: Colin.Simpson at iongeo.com (Colin Simpson)
Date: Tue, 12 Jul 2011 19:52:51 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference
In-Reply-To: <1310434196.11870.17.camel@shyster>
References: <1310434196.11870.17.camel@shyster>
Message-ID: <1310496771.15833.86.camel@bhac.iouk.ioroot.tld>

I just ask this as I have a cluster where we wish to share project directories and home dirs and have them accessible by Linux clients via NFS and PC's via Samba. As I say the locking cross OS doesn't matter.

And using the 2.6.32-71.24.1.el6.x86_64 kernel we are seeing the kernel often panicking (every week or so) on one node. Could this be the cause?

It's hard to catch as the fencing has stopped me so far getting a good core (and the crashkernel param changed in 6.1; the new param doesn't play with the old kernel). Plus I guess I need to see if it happens on the latest kernels, but they are worse for me due to BZ#712139. I guess the first thing I'll get from support is to try the latest hotfix kernel (which I can only get once I've tested the test kernel). Also, the long fence intervals needed to capture it aren't great.

So is it time for me to look at going back to ext4 for an HA file server?

Can anyone from RH tell me if I'm wasting my time even trying this on GFS2 (that I will get instability and kernel crashes)?

Really unfortunate if so, as I really like the setup when it's working.....

Also, after a node crashes some GFS mounts aren't too happy, they take a long time to mount back on the original failed node. The filesystems are dirty when we fsck them, lots of

Ondisk and fsck bitmaps differ at block 109405952 (0x6856700)
Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free)
Metadata type is 0 (free)

Some differences in free space etc

Can anyone from RH tell me if I'm wasting my time even trying this on GFS2 (that I will get GFS2 instability and kernel crashes)?
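What I'll probably try next is double checking that kdump is actually armed and then living with a longer fence delay for a while, roughly this (the delay value is a guess, not a recommendation):

# Is a crash kernel actually reserved and loaded?
grep -o 'crashkernel=[^ ]*' /proc/cmdline
cat /sys/kernel/kexec_crash_loaded    # 1 means the capture kernel is loaded
service kdump status
# Then give kdump time to write the vmcore before the surviving node fences us,
# e.g. post_fail_delay="60" on <fence_daemon> in cluster.conf (illustrative value).

Not pretty, but it might finally get me a usable core.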
Thanks Colin On Tue, 2011-07-12 at 02:29 +0100, Colin Simpson wrote: > OK, so my question is, is there any other reason apart from the risk > of > individual file corruption from locking being incompatible between > local/samba vs NFS that may lead to issues i.e. we aren't really > interested in locking working between NFS and local/samba access just > that it works consistently in NFS when accessing files that way (with > a > single node server) and locally/samba when accessing files that way. > > I mean I'm thinking of, for example, I have a build that generates > source code via NFS then some time later a PC comes in via Samba and > accesses these files for building on that environment. The two systems > aren't requiring locking to work cross platform/protocol, just need to > be exported to the two systems. But locking on each one separately is > useful. > > If there are and we should be using all access via NFS on NFS exported > filesystems, one issue that also springs to mind is commercial backup > systems that support GFS2 but don't support backing up via NFS. > > Is there anything else I should know about GFS2 limitations? > Is there a book "GFS: The Missing Manual"? :) > > Thanks > > Colin > > On Mon, 2011-07-11 at 13:05 +0100, J. Bruce Fields wrote: > > On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse wrote: > > > Hi, > > > > > > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote: > > > > On 08/07/11 22:09, J. Bruce Fields wrote: > > > > > > > > > With default mount options, the linux NFS client (like most > NFS > > clients) > > > > > assumes that a file has a most one writer at a time. > > (Applications that > > > > > need to do write-sharing over NFS need to use file locking.) > > > > > > > > The problem is that file locking on V3 isn't passed back down to > > the > > > > filesystem - hence the issues with nfs vs samba (or local disk > > > > access(*)) on the same server. > > > > The NFS server *does* acquire locks on the exported filesystem (and > > does > > it the same way for v2, v3, and v4). > > > > For local filesystems (ext3, xfs, btrfs), this is sufficient. > > > > For exports of cluster filesystems like gfs2, there are more > > complicated > > problems that, as Steve says, will require some work to do to fix. > > > > Samba is a more complicated issue due to the imperfect match between > > Windows and Linux locking semantics, but depending on how it's > > configured Samba will also acquire locks on the exported filesystem. > > > > --b. > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > This email and any files transmitted with it are confidential and are > intended solely for the use of the individual or entity to whom they > are addressed. If you are not the original recipient or the person > responsible for delivering the email to the intended recipient, be > advised that you have received this email in error, and that any use, > dissemination, forwarding, printing, or copying of this email is > strictly prohibited. If you received this email in error, please > immediately notify the sender and delete the original. 
> > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > From swap_project at yahoo.com Tue Jul 12 19:07:26 2011 From: swap_project at yahoo.com (Srija) Date: Tue, 12 Jul 2011 12:07:26 -0700 (PDT) Subject: [Linux-cluster] totem token parameter setting In-Reply-To: <20110704.122551.868642282278092140.yamato@redhat.com> Message-ID: <1310497646.27199.YahooMailClassic@web112816.mail.gq1.yahoo.com> Hi, In a sixteen node cluster , I need to add . It needs the cluster nodes to be rebooted. In the test lab ( three node clusers) I modify this parameter rebooting one node at a time. Each node has zen guests. So before rebooting , moving guests to different node. In that way the guests are not being effected for the change over. Is it ok to reboot the server one at a time or it needs the whole cluster nodes down then bring up one by one? Heard that for this parameters, it is suggested to bring down the whole cluster node, then bring up. Pl. advice. Thanks again. From fdinitto at redhat.com Wed Jul 13 08:26:19 2011 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Wed, 13 Jul 2011 10:26:19 +0200 Subject: [Linux-cluster] cluster 3.1.5 release Message-ID: <4E1D56AB.9080907@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Welcome to the cluster 3.1.5 release. This release addresses two issues in ccs_update_schema and relax the requirements on fence-agents and resource-agents that were erroneously introduced in the 3.1.4 release. It is still highly recommended to update both fence-agents and resource-agents. The new source tarball can be downloaded here: https://fedorahosted.org/releases/c/l/cluster/cluster-3.1.5.tar.xz ChangeLog: https://fedorahosted.org/releases/c/l/cluster/Changelog-3.1.5 To report bugs or issues: https://bugzilla.redhat.com/ Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other sysadministrators or power users. Thanks/congratulations to all people that contributed to achieve this great milestone. 
Happy clustering, Fabio -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBCAAGBQJOHVaqAAoJEFA6oBJjVJ+Oq7IP/izCXBx94D7j1rqpAiB1wJca 5oX86vBJYOWGRA3iac7qx/RLJnd6yO60UFZ3X98vgx+pid1HLEf9Z5OYVmyVfhck agWchsxL995PbDo3Yc/uGZjtmAmy2XEbF9tAb/UtXvt/6/YEi/vDQiRCeBgX67pB NHTj9Pl3R68VgRlKve/68VrB7zmkkzJWfy8Xz2UARZ27A+qU2wWw/4i9ee/wtUc0 vFQJfFEVy5YtqZL+P2sga0G6ZxJOHugY0fbzgQMLjv/k+aeAZV/wVBmEoyvshGqZ YyH3EMzaxPxvgk8XKF8dRvq5PtbMpz0GOYvpsIj59FYUAcJ7ElGpsvIlWWn+VJZ9 RsHHddUvu+iZkH7Xmyz39AjZyiFhwwR/7qD1mnPWFMBQGdbwZU/k5+VeImw1Getn HTVI2J0g7r0waWDJodm9hTXl97yLvkQwvreyl2UzdueS3sqD7J7+Z5GhHoVc3xEt 9WmvU/TY/oaClcZGBnPQuLhxAwOkb3v0W+gEUEUZ+TCSkH0o5yQoIkJL/Qo0DtHQ AbxEc2xzXuzXKYLBH8Ce8gntJDDIojg0e19YJvtt/FRjA4L6S/CyqzafsJMVa3Pv 0s3AIYc7BdgmxCnx1MqxG13VKf1/huICwR8bj0VEZ4cdTxKSzgsXnivw9H2OFNn6 VL7+BR6SX6Bht1oo0xqe =j+xc -----END PGP SIGNATURE----- From swhiteho at redhat.com Wed Jul 13 15:53:59 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 13 Jul 2011 16:53:59 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernelNULL pointer deference In-Reply-To: <1310496771.15833.86.camel@bhac.iouk.ioroot.tld> References: <1310434196.11870.17.camel@shyster> <1310496771.15833.86.camel@bhac.iouk.ioroot.tld> Message-ID: <1310572439.3237.49.camel@menhir> Hi, On Tue, 2011-07-12 at 19:52 +0100, Colin Simpson wrote: > I just ask this as I have a cluster where we wish to share a project > directories and home dirs and have them accessible by Linux clients via > NFS and PC's via Samba. As I say the locking cross OS doesn't matter. > If it doesn't matter, then you can set the "localflocks" mount option on GFS2 and each mount will act as if it was a single node filesystem so far as locking is concerned. From a support perspective that config (active/active NFS and Samba) is not supported (by Red Hat), because we don't test it, because generally you do need locking in order to make it safe wrt to accesses between NFS/Samba. It is something where we'd like to expand our support in future though, and the more requests we receive the better idea we get of exactly what use cases people require, and thus where to spend our time. > And using 2.6.32-71.24.1.el6.x86_64 kernel we are seeing the kernel > often panicing (every week or so) on one node. Could this be the cause? > It shouldn't be. If the set up you've described causes problems then they will be in terms of coherency between the NFS and Samba exports, if you've got a panic then thats something else entirely. > It's hard to catch as the fencing has stopped me so far getting a good > core (and the change to crashkernel param which changed in 6.1 the new > param doesn't play with the old kernel) . Plus I guess I need to see if > it happens on the latest kernels, but they are worse for me due to > BZ#712139. I guess the first thing I'll get from support is try the > latest hotfix kernel (which I can only get once I've tested the test > kernel). Also plus long fence intervals aren't great to capture. > Do you not get messages in syslog? Thats the first thing to look at, getting a core is helpful, but often not essential in kernel debugging. > So is it time for me to look at going back to ext4 for an HA file > server. > > Can anyone from RH tell me if I'm wasting my time even trying this on > GFS2 (that I will get instability and kernel crashes)? > > Really unfortunate if so, as I really like the setup when it's > working..... 
> > Also, after a node crashes some GFS mounts aren't too happy, they take a > long time to mount back on the original failed node. The filesystems are > dirty when we fsck them lots of > > Ondisk and fsck bitmaps differ at block 109405952 (0x6856700) > Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free) > Metadata type is 0 (free) > > Some differences in free space etc > > Can anyone from RH tell me if I'm wasting my time even trying this on > GFS2 (that I will get GFS2 instability and kernel crashes)? > I suspect that it will not work exactly as you expect due to potential coherency issues, but you still should not be getting kernel crashes either way, Steve. > Thanks > > Colin > > On Tue, 2011-07-12 at 02:29 +0100, Colin Simpson wrote: > > OK, so my question is, is there any other reason apart from the risk > > of > > individual file corruption from locking being incompatible between > > local/samba vs NFS that may lead to issues i.e. we aren't really > > interested in locking working between NFS and local/samba access just > > that it works consistently in NFS when accessing files that way (with > > a > > single node server) and locally/samba when accessing files that way. > > > > I mean I'm thinking of, for example, I have a build that generates > > source code via NFS then some time later a PC comes in via Samba and > > accesses these files for building on that environment. The two systems > > aren't requiring locking to work cross platform/protocol, just need to > > be exported to the two systems. But locking on each one separately is > > useful. > > > > If there are and we should be using all access via NFS on NFS exported > > filesystems, one issue that also springs to mind is commercial backup > > systems that support GFS2 but don't support backing up via NFS. > > > > Is there anything else I should know about GFS2 limitations? > > Is there a book "GFS: The Missing Manual"? :) > > > > Thanks > > > > Colin > > > > On Mon, 2011-07-11 at 13:05 +0100, J. Bruce Fields wrote: > > > On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse wrote: > > > > Hi, > > > > > > > > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote: > > > > > On 08/07/11 22:09, J. Bruce Fields wrote: > > > > > > > > > > > With default mount options, the linux NFS client (like most > > NFS > > > clients) > > > > > > assumes that a file has a most one writer at a time. > > > (Applications that > > > > > > need to do write-sharing over NFS need to use file locking.) > > > > > > > > > > The problem is that file locking on V3 isn't passed back down to > > > the > > > > > filesystem - hence the issues with nfs vs samba (or local disk > > > > > access(*)) on the same server. > > > > > > The NFS server *does* acquire locks on the exported filesystem (and > > > does > > > it the same way for v2, v3, and v4). > > > > > > For local filesystems (ext3, xfs, btrfs), this is sufficient. > > > > > > For exports of cluster filesystems like gfs2, there are more > > > complicated > > > problems that, as Steve says, will require some work to do to fix. > > > > > > Samba is a more complicated issue due to the imperfect match between > > > Windows and Linux locking semantics, but depending on how it's > > > configured Samba will also acquire locks on the exported filesystem. > > > > > > --b. 
> > > > > > -- > > > Linux-cluster mailing list > > > Linux-cluster at redhat.com > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > > > > This email and any files transmitted with it are confidential and are > > intended solely for the use of the individual or entity to whom they > > are addressed. If you are not the original recipient or the person > > responsible for delivering the email to the intended recipient, be > > advised that you have received this email in error, and that any use, > > dissemination, forwarding, printing, or copying of this email is > > strictly prohibited. If you received this email in error, please > > immediately notify the sender and delete the original. > > > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From laszlo.budai at gmail.com Wed Jul 13 16:07:22 2011 From: laszlo.budai at gmail.com (Budai Laszlo) Date: Wed, 13 Jul 2011 19:07:22 +0300 Subject: [Linux-cluster] clurgmgrd[XXXX]: Error storing ip: Duplicate Message-ID: <4E1DC2BA.8040701@gmail.com> Hello everyone, I was asked to investigate why the rgmanager is not running on a red hat cluster. The cluster is on RHEL 5.3. #rpm -q cman rgmanager cman-2.0.98-1.el5 rgmanager-2.0.46-1.el5 Currently cman is running, rgmanager not. # clustat Cluster Status for prod-clust1 @ Wed Jul 13 15:42:16 2011 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ pnl-p 1 Online psd-p 2 Online, Local # cman_tool status Version: 6.1.0 Config Version: 14 Cluster Name: prod-clust1 Cluster Id: 3382 Cluster Member: Yes Cluster Generation: 1136 Membership state: Cluster-Member Nodes: 2 Expected votes: 1 Total votes: 2 Quorum: 1 Active subsystems: 7 Flags: 2node Dirty Ports Bound: 0 Node name: pnl-p Node ID: 1 Multicast addresses: 224.0.0.1 Node addresses: 10.0.0.2 # cman_tool services type level name id state fence 0 default 00010002 none # service rgmanager status clurgmgrd dead but pid file exists this is the situation on both nodes. for one of the nodes I cannot see any message from rgmanager, and it was confirmed that the error is older then the oldest log file. on the other node I can see the messages when rgmanager was started (after reboot) and here they are: messages.3:Jun 26 02:00:13 node-pnl-01 clurgmgrd[8720]: Resource Group Manager Starting messages.3:Jun 26 02:00:13 node-pnl-01 clurgmgrd[8720]: Error storing ip: Duplicate my question is what the second line means and what are the consequences? Is it possible that a duplicate IP would shut down the rgmanager? because after a few seconds (25 seconds as we can see) I can see the following: messages.3:Jun 26 02:00:35 node-pnl-01 clurgmgrd[8720]: Shutting down and later on: messages.3:Jun 26 02:00:59 node-pnl-01 clurgmgrd[8720]: Shutdown complete, exiting Right now it is not an option to start the rgmanager and test. I have to figure it out only from the log files. Thank you in advance for any ideas. 
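For what it is worth, one quick check I plan to run -- on the assumption, which I have not confirmed, that "Duplicate" means the same IP address resource is defined more than once in cluster.conf -- is simply to look for repeated addresses:

# rough sanity check only; assumes the stock /etc/cluster/cluster.conf location
grep -o 'address="[0-9.]*"' /etc/cluster/cluster.conf | sort | uniq -d

If that prints anything, the same ip resource appears at least twice in the configuration.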
Laszlo From Colin.Simpson at iongeo.com Wed Jul 13 16:41:33 2011 From: Colin.Simpson at iongeo.com (Colin Simpson) Date: Wed, 13 Jul 2011 17:41:33 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernelNULL pointer deference In-Reply-To: <1310572439.3237.49.camel@menhir> References: <1310434196.11870.17.camel@shyster><1310496771.15833.86.camel@bhac.iouk.ioroot.tld> <1310572439.3237.49.camel@menhir> Message-ID: <1310575293.32040.39.camel@bhac.iouk.ioroot.tld> Hi I'd be looking at wanting to do single node NFS (active/passive, failover) but we are currently running with CTDB Samba on both nodes for this same directory. Would that work with "localflocks" and/or be supported in such a config? I'm thinking the clustered samba would also have to go in such a config using "localflocks". Sadly the messages file says nothing at all, apart from one node reports the other node isn't responding and it fences it. There are kdump disks on the nodes, but the RHEL 6.1 update changed the kernel param to autocrashkernel=auto and this doesn't work with the 6.0 kernel (just we are currently running due to another bug in 6.1's latest kernel. I seem to remember I used "crashkernel=512M-2G:64M,2G-:128M" with the older kernels, but I can't remember. Maybe I should try that again but the only was I know to get a kdump is to set a large fence delay. Thanks Colin On Wed, 2011-07-13 at 16:53 +0100, Steven Whitehouse wrote: > Hi, > > On Tue, 2011-07-12 at 19:52 +0100, Colin Simpson wrote: > > I just ask this as I have a cluster where we wish to share a project > > directories and home dirs and have them accessible by Linux clients > via > > NFS and PC's via Samba. As I say the locking cross OS doesn't > matter. > > > If it doesn't matter, then you can set the "localflocks" mount option > on > GFS2 and each mount will act as if it was a single node filesystem so > far as locking is concerned. From a support perspective that config > (active/active NFS and Samba) is not supported (by Red Hat), because > we > don't test it, because generally you do need locking in order to make > it > safe wrt to accesses between NFS/Samba. > > It is something where we'd like to expand our support in future > though, > and the more requests we receive the better idea we get of exactly > what > use cases people require, and thus where to spend our time. > > > And using 2.6.32-71.24.1.el6.x86_64 kernel we are seeing the kernel > > often panicing (every week or so) on one node. Could this be the > cause? > > > It shouldn't be. If the set up you've described causes problems then > they will be in terms of coherency between the NFS and Samba exports, > if > you've got a panic then thats something else entirely. > > > It's hard to catch as the fencing has stopped me so far getting a > good > > core (and the change to crashkernel param which changed in 6.1 the > new > > param doesn't play with the old kernel) . Plus I guess I need to see > if > > it happens on the latest kernels, but they are worse for me due to > > BZ#712139. I guess the first thing I'll get from support is try the > > latest hotfix kernel (which I can only get once I've tested the test > > kernel). Also plus long fence intervals aren't great to capture. > > > Do you not get messages in syslog? Thats the first thing to look at, > getting a core is helpful, but often not essential in kernel > debugging. > > > So is it time for me to look at going back to ext4 for an HA file > > server. 
> > > > Can anyone from RH tell me if I'm wasting my time even trying this > on > > GFS2 (that I will get instability and kernel crashes)? > > > > Really unfortunate if so, as I really like the setup when it's > > working..... > > > > Also, after a node crashes some GFS mounts aren't too happy, they > take a > > long time to mount back on the original failed node. The filesystems > are > > dirty when we fsck them lots of > > > > Ondisk and fsck bitmaps differ at block 109405952 (0x6856700) > > Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free) > > Metadata type is 0 (free) > > > > Some differences in free space etc > > > > Can anyone from RH tell me if I'm wasting my time even trying this > on > > GFS2 (that I will get GFS2 instability and kernel crashes)? > > > I suspect that it will not work exactly as you expect due to potential > coherency issues, but you still should not be getting kernel crashes > either way, > > Steve. > > > Thanks > > > > Colin > > > > On Tue, 2011-07-12 at 02:29 +0100, Colin Simpson wrote: > > > OK, so my question is, is there any other reason apart from the > risk > > > of > > > individual file corruption from locking being incompatible between > > > local/samba vs NFS that may lead to issues i.e. we aren't really > > > interested in locking working between NFS and local/samba access > just > > > that it works consistently in NFS when accessing files that way > (with > > > a > > > single node server) and locally/samba when accessing files that > way. > > > > > > I mean I'm thinking of, for example, I have a build that generates > > > source code via NFS then some time later a PC comes in via Samba > and > > > accesses these files for building on that environment. The two > systems > > > aren't requiring locking to work cross platform/protocol, just > need to > > > be exported to the two systems. But locking on each one separately > is > > > useful. > > > > > > If there are and we should be using all access via NFS on NFS > exported > > > filesystems, one issue that also springs to mind is commercial > backup > > > systems that support GFS2 but don't support backing up via NFS. > > > > > > Is there anything else I should know about GFS2 limitations? > > > Is there a book "GFS: The Missing Manual"? :) > > > > > > Thanks > > > > > > Colin > > > > > > On Mon, 2011-07-11 at 13:05 +0100, J. Bruce Fields wrote: > > > > On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse > wrote: > > > > > Hi, > > > > > > > > > > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote: > > > > > > On 08/07/11 22:09, J. Bruce Fields wrote: > > > > > > > > > > > > > With default mount options, the linux NFS client (like > most > > > NFS > > > > clients) > > > > > > > assumes that a file has a most one writer at a time. > > > > (Applications that > > > > > > > need to do write-sharing over NFS need to use file > locking.) > > > > > > > > > > > > The problem is that file locking on V3 isn't passed back > down to > > > > the > > > > > > filesystem - hence the issues with nfs vs samba (or local > disk > > > > > > access(*)) on the same server. > > > > > > > > The NFS server *does* acquire locks on the exported filesystem > (and > > > > does > > > > it the same way for v2, v3, and v4). > > > > > > > > For local filesystems (ext3, xfs, btrfs), this is sufficient. > > > > > > > > For exports of cluster filesystems like gfs2, there are more > > > > complicated > > > > problems that, as Steve says, will require some work to do to > fix. 
> > > > > > > > Samba is a more complicated issue due to the imperfect match > between > > > > Windows and Linux locking semantics, but depending on how it's > > > > configured Samba will also acquire locks on the exported > filesystem. > > > > > > > > --b. > > > > > > > > -- > > > > Linux-cluster mailing list > > > > Linux-cluster at redhat.com > > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > > > > > > > > This email and any files transmitted with it are confidential and > are > > > intended solely for the use of the individual or entity to whom > they > > > are addressed. If you are not the original recipient or the > person > > > responsible for delivering the email to the intended recipient, be > > > advised that you have received this email in error, and that any > use, > > > dissemination, forwarding, printing, or copying of this email is > > > strictly prohibited. If you received this email in error, please > > > immediately notify the sender and delete the original. > > > > > > > > > > > > -- > > > Linux-cluster mailing list > > > Linux-cluster at redhat.com > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > From member at linkedin.com Thu Jul 14 01:19:33 2011 From: member at linkedin.com (Paul Morgan via LinkedIn) Date: Thu, 14 Jul 2011 01:19:33 +0000 (UTC) Subject: [Linux-cluster] Invitation to connect on LinkedIn Message-ID: <366543265.17198061.1310606373490.JavaMail.app@ela4-bed79.prod> LinkedIn ------------ Paul Morgan requested to add you as a connection on LinkedIn: ------------------------------------------ Marian, I'd like to add you to my professional network on LinkedIn. - Paul Accept invitation from Paul Morgan http://www.linkedin.com/e/-odgn7o-gq3171wt-58/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2959652231_2/1BpC5vrmRLoRZcjkkZt5YCpnlOt3RApnhMpmdzgmhxrSNBszYOnP4Pcz8RdzARej99bPxEpkISiz91bP4Nd3sUe34PdjcLrCBxbOYWrSlI/EML_comm_afe/ View invitation from Paul Morgan http://www.linkedin.com/e/-odgn7o-gq3171wt-58/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2959652231_2/39vcjcOczkSejkVcAALqnpPbOYWrSlI/svi/ ------------------------------------------ DID YOU KNOW you can showcase your professional knowledge on LinkedIn to receive job/consulting offers and enhance your professional reputation? Posting replies to questions on LinkedIn Answers puts you in front of the world's professional community. http://www.linkedin.com/e/-odgn7o-gq3171wt-58/abq/inv-24/ -- (c) 2011, LinkedIn Corporation -------------- next part -------------- An HTML attachment was scrubbed... URL: From laszlo.budai at gmail.com Thu Jul 14 15:42:55 2011 From: laszlo.budai at gmail.com (Budai Laszlo) Date: Thu, 14 Jul 2011 18:42:55 +0300 Subject: [Linux-cluster] question about failoverdomains Message-ID: <4E1F0E7F.5080405@gmail.com> Hi all, Please somebody tell me what is the behavior for the following failoverdomains configuration: This is a two node cluster. the idea is to create a configuration where some of the services prefer to run on one node, and other on the other node. For this I would have created two prioritized failover domains with the priorities set in such a way that in one domain the one node has higher priority than the other. 
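Roughly, the sort of configuration I have in mind looks like the following (the priorities and the name of the second domain are just placeholders to illustrate the idea, not my exact settings):

<failoverdomains>
  <failoverdomain name="dom1" ordered="1" restricted="0">
    <failoverdomainnode name="pnl-p" priority="1"/>
    <failoverdomainnode name="psd-p" priority="2"/>
  </failoverdomain>
  <failoverdomain name="dom2" ordered="1" restricted="0">
    <failoverdomainnode name="pnl-p" priority="2"/>
    <failoverdomainnode name="psd-p" priority="1"/>
  </failoverdomain>
</failoverdomains>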
My question related to this setup is: if a node fails (let suppose pnl-p fails) then the services which are configured to use dom1 will migrate to node psd-p. What will happen when the node recovers? will that service fail back to that node? Thank you, Laszlo From ajb2 at mssl.ucl.ac.uk Thu Jul 14 15:52:44 2011 From: ajb2 at mssl.ucl.ac.uk (Alan Brown) Date: Thu, 14 Jul 2011 16:52:44 +0100 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernelNULL pointer deference In-Reply-To: <1310575293.32040.39.camel@bhac.iouk.ioroot.tld> References: <1310434196.11870.17.camel@shyster><1310496771.15833.86.camel@bhac.iouk.ioroot.tld> <1310572439.3237.49.camel@menhir> <1310575293.32040.39.camel@bhac.iouk.ioroot.tld> Message-ID: <4E1F10CC.40406@mssl.ucl.ac.uk> > Maybe I should try that again but the > only was I know to get a kdump is to set a large fence delay. This is what I'd expect. We also found the fence delay has to be long enough to allow the crashdump to be written out. The only alternatives to speed this up are to use _very_ fast disk for the vmcore or dump to another machine over 10Gb links. Either way I suspect it'll max out around 30Mb/s if you have compression turned on. :( From linux at alteeve.com Thu Jul 14 15:52:49 2011 From: linux at alteeve.com (Digimer) Date: Thu, 14 Jul 2011 11:52:49 -0400 Subject: [Linux-cluster] question about failoverdomains In-Reply-To: <4E1F0E7F.5080405@gmail.com> References: <4E1F0E7F.5080405@gmail.com> Message-ID: <4E1F10D1.4020900@alteeve.com> On 07/14/2011 11:42 AM, Budai Laszlo wrote: > Hi all, > > Please somebody tell me what is the behavior for the following > failoverdomains configuration: > > > > > > > > > > > > This is a two node cluster. the idea is to create a configuration where > some of the services prefer to run on one node, and other on the other node. > For this I would have created two prioritized failover domains with the > priorities set in such a way that in one domain the one node has higher > priority than the other. > > My question related to this setup is: > if a node fails (let suppose pnl-p fails) then the services which are > configured to use dom1 will migrate to node psd-p. What will happen when > the node recovers? will that service fail back to that node? > > Thank you, > Laszlo These two domains will not offer failover. The domain contains only one node, so any services set to use either domain will only run on that given node. This is useful the cluster to starting local service (ie: clvmd, gfs2, etc). To get an ordered domain, define both nodes within a given domain, set 'ordered="1"' and then set different 'priority=""'. The node with the lower priority will be the "preferred node" and the other will be the fall-back. Pay attention to the "nofailback" option as well. See: https://fedorahosted.org/cluster/wiki/RGManager -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "I feel confined, only free to expand myself within boundaries." From laszlo.budai at gmail.com Thu Jul 14 20:38:50 2011 From: laszlo.budai at gmail.com (Budai Laszlo) Date: Thu, 14 Jul 2011 23:38:50 +0300 Subject: [Linux-cluster] question about failoverdomains In-Reply-To: <4E1F10D1.4020900@alteeve.com> References: <4E1F0E7F.5080405@gmail.com> <4E1F10D1.4020900@alteeve.com> Message-ID: <4E1F53DA.3070000@gmail.com> Thank you for the link. I've found the answer to my question there. 
Actually on this page: https://fedorahosted.org/cluster/wiki/FailoverDomains And the answer is: Yes, the service will migrate back (fail back) to the node which is member of its own domain. On the other side, you are wrong about the fact that these domains does not offer failover. That would be true if the domains were restricted. Kind regards, Laszlo On 07/14/2011 06:52 PM, Digimer wrote: > On 07/14/2011 11:42 AM, Budai Laszlo wrote: >> Hi all, >> >> Please somebody tell me what is the behavior for the following >> failoverdomains configuration: >> >> >> >> >> >> >> >> >> >> >> >> This is a two node cluster. the idea is to create a configuration where >> some of the services prefer to run on one node, and other on the other node. >> For this I would have created two prioritized failover domains with the >> priorities set in such a way that in one domain the one node has higher >> priority than the other. >> >> My question related to this setup is: >> if a node fails (let suppose pnl-p fails) then the services which are >> configured to use dom1 will migrate to node psd-p. What will happen when >> the node recovers? will that service fail back to that node? >> >> Thank you, >> Laszlo > These two domains will not offer failover. The domain contains only one > node, so any services set to use either domain will only run on that > given node. This is useful the cluster to starting local service (ie: > clvmd, gfs2, etc). > > To get an ordered domain, define both nodes within a given domain, set > 'ordered="1"' and then set different 'priority=""'. The node with > the lower priority will be the "preferred node" and the other will be > the fall-back. > > Pay attention to the "nofailback" option as well. > > See: https://fedorahosted.org/cluster/wiki/RGManager > From ifetch at du.edu Sat Jul 16 05:06:19 2011 From: ifetch at du.edu (Ivan Fetch) Date: Fri, 15 Jul 2011 23:06:19 -0600 Subject: [Linux-cluster] Multi-homing in rhel5 In-Reply-To: References: Message-ID: <8961ACA1-6DB3-4987-8281-664CF84C6957@du.edu> Hi COrey, On Feb 3, 2011, at 3:49 AM, Corey Kovacs wrote: > The cluster2 docs outline a procedure for multihoming which is > unsupported by redhat. > Which multihoming method is that? > Is anyone actually using this method or are people more inclined to > use configs in which secondary interfaces are given names by which the > cluster then uses them as primary config nodes. > > For example, on my cluster I have eth0 as the primary interface for > all normal system traffic, and eth1 as my cluster interconnect. > > eth0 - nodename > eth1 - nodename-clu <-- cluster config points to this as nodes.... > > clients access the cluster services via eth0. > > I've seen other configs where people configure the cluster to use eth0 > for cluster coms so that ricci/luci work correctly, but I don't use > those. > > Is there an advantage of one method over the other ? I am just getting started with RHCS, from Sun Cluster. I was planning to use private node host names as the primary cluster communication, using an ethernet bond of two NICs. I was also planning to have Luci available, until I am more adept at knowing what to add to cluster.conf by hand. Does Luci not function when the primary cluster communication is over a private node interconnect? Thanks, Ivan. . 
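P.S. To make the plan a bit more concrete, the layout I have in mind is roughly the following (host names and addresses are invented for illustration):

# /etc/hosts on every node: the -clu names live on a two-NIC bond
# that carries only cluster traffic
10.10.10.1   node1-clu
10.10.10.2   node2-clu
# cluster.conf would then name the cluster nodes node1-clu and node2-clu,
# while clients, ricci and luci reach the servers through their public names.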
From post at michael-neubert.de Mon Jul 18 11:57:58 2011 From: post at michael-neubert.de (Michael Neubert) Date: Mon, 18 Jul 2011 13:57:58 +0200 Subject: [Linux-cluster] clean_start in combination with post_join_delay Message-ID: <7c425127a2c120a67cfaa665751e1110.squirrel@www.michael-neubert.de> Hello, at the moment I am using a GFS2 setup with the following values for the fencing daemon in the cluster config: clean_start="1" >> Documentation: "prevent any startup fencing the daemon might do" post_join_delay="60" >> Documentation: "number of seconds the daemon will wait before fencing any victims after a node joins the domain" If I set clean_start to 1 the post_join_delay paramter will have no sense in my opinion. Or is there another reason I do not see, why to use post_join_delay even though clean_start is set to 1? Thanks for every reply. Best wishes Michael From jpolo at wtransnet.com Tue Jul 19 12:18:55 2011 From: jpolo at wtransnet.com (Javi Polo) Date: Tue, 19 Jul 2011 14:18:55 +0200 Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference In-Reply-To: <4E1C42F2.6020609@wtransnet.com> References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir> <4E173786.1060507@wtransnet.com> <4E1C42F2.6020609@wtransnet.com> Message-ID: <4E25762F.3070900@wtransnet.com> El 07/12/11 14:49, Javi Polo escribi?: >>> [6235083.656954] BUG: unable to handle kernel NULL pointer dereference >>>> this should not happen. It looks like we are trying to look up >>>> something >>>> that is 24 (hex) bytes into a structure. Does the fs have posix acls >>>> enabled or selinux or something else using xattrs? >> >> Nope, at least as far as I know. As I dont usually use ubuntu, I have >> checked to see if it had selinux enabled by default, or some ACLs >> related thing, but it seems it's not .... >> > > Anyone could hint me with this matter? > I'm not using selinux nor xattrs nor posix acls, just a plain gfs2 > filesystem with 3 journals ... > I finally manage to get it working by rolling back to an ubuntu linux-2.6.32-33-virtual kernel. I guess there's a problem in nfs modules, not in gfs2, because it crashed pretty much the same while testing with OCFS2 thanks you all :) -- Javi Polo Administrador de Sistemas Tel 93 734 97 70 Fax 93 734 97 71 jpolo at wtransnet.com From mcaubet at pic.es Tue Jul 19 14:19:17 2011 From: mcaubet at pic.es (Marc Caubet) Date: Tue, 19 Jul 2011 16:19:17 +0200 Subject: [Linux-cluster] Linux Cluster + KVM + LVM2 Message-ID: Hi, we are testing RedHat Cluster to build a KVM virtualization infrastructure. This is the first time we use the Linux Cluster so we are a little bit lost. Hope someone can help. Actually we have 2 hypervisors connected via Fiber Channel to a shared storage (both servers see the same 15TB device /dev/mapper/mpathb). Our idea is a shared storage to hold KVM virtual machines by using LVM2. Both server should be able to run Virtual Machines from the same storage, but we should be able to migrate or start virtual machines on the other server node on crash. So the plan is: - Virtual Machine image = Logical Volume - CLVM2 cluster: only one server node at the same time will be able to manage the volume group- - KVM Virtual Machine High Availability. Machines will run on one server node. If for some reason the server node crashes, the second will start / migrate the virtual machine. 
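To make the idea concrete, the sort of commands we have in mind for the storage side are roughly these (the volume group name is invented for the example; we do not have this working yet):

pvcreate /dev/mapper/mpathb
vgcreate -c y vg_vms /dev/mapper/mpathb   # -c y marks the volume group as clustered
lvcreate -n vm01 -L 20G vg_vms            # one logical volume per virtual machine disk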
Basically we woul like to know: - How can we create a cluster for the LVM2 shared storage (when we create it, it does not work since both server nodes have the VG as Active) - How can we create a cluster service for a virtual machine (we guess it has to be done 1 by 1) - Since we have 2 server nodes, how to increase the number of votes for quorum (qdisk over a heartbeat logical volume partition?) Thanks and best regards, -- Marc Caubet Serrabou PIC (Port d'Informaci? Cient?fica) Campus UAB, Edificio D E-08193 Bellaterra, Barcelona Tel: +34 93 581 33 22 Fax: +34 93 581 41 10 http://www.pic.es Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From linux at alteeve.com Tue Jul 19 14:50:52 2011 From: linux at alteeve.com (Digimer) Date: Tue, 19 Jul 2011 10:50:52 -0400 Subject: [Linux-cluster] Linux Cluster + KVM + LVM2 In-Reply-To: References: Message-ID: <4E2599CC.4010503@alteeve.com> On 07/19/2011 10:19 AM, Marc Caubet wrote: > Hi, > > we are testing RedHat Cluster to build a KVM virtualization > infrastructure. This is the first time we use the Linux Cluster so we > are a little bit lost. Hope someone can help. > > Actually we have 2 hypervisors connected via Fiber Channel to a shared > storage (both servers see the same 15TB device /dev/mapper/mpathb). > > Our idea is a shared storage to hold KVM virtual machines by using LVM2. > Both server should be able to run Virtual Machines from the same > storage, but we should be able to migrate or start virtual machines on > the other server node on crash. > > So the plan is: > > - Virtual Machine image = Logical Volume > - CLVM2 cluster: only one server node at the same time will be able to > manage the volume group- > - KVM Virtual Machine High Availability. Machines will run on one server > node. If for some reason the server node crashes, the second will start > / migrate the virtual machine. > > Basically we woul like to know: > > - How can we create a cluster for the LVM2 shared storage (when we > create it, it does not work since both server nodes have the VG as Active) > - How can we create a cluster service for a virtual machine (we guess it > has to be done 1 by 1) > - Since we have 2 server nodes, how to increase the number of votes for > quorum (qdisk over a heartbeat logical volume partition?) > > Thanks and best regards, You need to setup a cluster with fencing, which will let you then use clustered LVM, clvmd, which in turn uses the distributed lock manager, dlm. This will allow for the same LVs to be seen and used across cluster nodes. Then you will simply add the VMs as resources to rgmanager, which uses (and sits on top of) corosync, which is itself the core of the cluster. I'm guessing that you are using RHEL 6, so this may not map perfectly, but I described how to build a similar Xen-based HA VM cluster on EL5. The main differences are; Corosync instead of OpenAIS (changes nothing, configuration wise), ignore DRBD as you have a proper SAN and replace Xen with KVM. The small GFS2 partition is still recommended for central storage of the VM definitions (needed for migration and recovery). However, if you don't have a GFS2 license, you can manually keep the configs in sync on matching local directories on the nodes. See if this helps at all: http://wiki.alteeve.com/index.php/Red_Hat_Cluster_Service_2_Tutorial Best of luck. 
:) -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "At what point did we forget that the Space Shuttle was, essentially, a program that strapped human beings to an explosion and tried to stab through the sky with fire and math?"

From Ralph.Grothe at itdz-berlin.de Wed Jul 20 07:06:42 2011 From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de) Date: Wed, 20 Jul 2011 09:06:42 +0200 Subject: [Linux-cluster] Linux Cluster + KVM + LVM2 In-Reply-To: References: Message-ID: Hi Marc, though Digimer's RHCS tutorial is an excellent introduction to the RHCS cluster with a thorough step-by-step reference to setting up a Xen cluster, and I'd highly recommend you read it, you probably also would like to have a look at these articles in the cluster wiki which focus a little more narrowly on your questions. Here is described how to set up so-called HA-LVM, which avoids the clvmd overhead and is for settings like yours where you only require active/passive VG activation (i.e. a shared storage VG is only activated on a single cluster node at any time). This is achieved by tagging of the affected VGs/LVs. https://fedorahosted.org/cluster/wiki/LVMFailover Unfortunately, this article does not mention the required locking_type setting in lvm.conf. But if you have access to RHN this article on HA-LVM does, and it also outlines both methods, i.e. the so-called "preferred clvmd" method and the so-called "original" (or what I'd call "tagging") method which doesn't require clvmd: https://access.redhat.com/kb/docs/DOC-3068 Should you on the other hand require active/active VGs (i.e. simultaneous activation of the same shared VG on more than one cluster node), which I do not consider a requirement for a KVM cluster (but I lack any experience in this field), the recommended procedure is described here: https://access.redhat.com/kb/docs/DOC-17651 An important aside: after you've edited lvm.conf you are required to make a new initial ramdisk (or at least touch the mtime of the current initrd) or cluster services won't start on that node (watch entries in messages or wherever syslogd logs clulog stuff to). In this article you may find something about KVM VMs' migration. https://fedorahosted.org/cluster/wiki/KvmMigration Regards Ralph (an RHCS newbie himself) ________________________________ From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Marc Caubet Sent: Tuesday, July 19, 2011 4:19 PM To: linux-cluster at redhat.com Subject: [Linux-cluster] Linux Cluster + KVM + LVM2 Hi, we are testing RedHat Cluster to build a KVM virtualization infrastructure. This is the first time we use the Linux Cluster so we are a little bit lost. Hope someone can help. Actually we have 2 hypervisors connected via Fiber Channel to a shared storage (both servers see the same 15TB device /dev/mapper/mpathb). Our idea is a shared storage to hold KVM virtual machines by using LVM2. Both server should be able to run Virtual Machines from the same storage, but we should be able to migrate or start virtual machines on the other server node on crash. So the plan is: - Virtual Machine image = Logical Volume - CLVM2 cluster: only one server node at the same time will be able to manage the volume group- - KVM Virtual Machine High Availability. Machines will run on one server node. If for some reason the server node crashes, the second will start / migrate the virtual machine.
Basically we woul like to know: - How can we create a cluster for the LVM2 shared storage (when we create it, it does not work since both server nodes have the VG as Active) - How can we create a cluster service for a virtual machine (we guess it has to be done 1 by 1) - Since we have 2 server nodes, how to increase the number of votes for quorum (qdisk over a heartbeat logical volume partition?) Thanks and best regards, -- Marc Caubet Serrabou PIC (Port d'Informaci? Cient?fica) Campus UAB, Edificio D E-08193 Bellaterra, Barcelona Tel: +34 93 581 33 22 Fax: +34 93 581 41 10 http://www.pic.es Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html From mcaubet at pic.es Wed Jul 20 09:29:41 2011 From: mcaubet at pic.es (Marc Caubet) Date: Wed, 20 Jul 2011 11:29:41 +0200 Subject: [Linux-cluster] Linux Cluster + KVM + LVM2 In-Reply-To: <4E2599CC.4010503@alteeve.com> References: <4E2599CC.4010503@alteeve.com> Message-ID: Hi, thanks a lot for your reply. You need to setup a cluster with fencing, which will let you then use > clustered LVM, clvmd, which in turn uses the distributed lock manager, > dlm. This will allow for the same LVs to be seen and used across cluster > nodes. > Ok. I will try this. > > Then you will simply add the VMs as resources to rgmanager, which uses > (and sits on top of) corosync, which is itself the core of the cluster. > So each virtual machine will be a resource, is it right? > > I'm guessing that you are using RHEL 6, so this may not map perfectly, > but I described how to build a similar Xen-based HA VM cluster on EL5. > The main differences are; Corosync instead of OpenAIS (changes nothing, > configuration wise), ignore DRBD as you have a proper SAN and replace > Xen with KVM. The small GFS2 partition is still recommended for central > storage of the VM definitions (needed for migration and recovery). > However, if you don't have a GFS2 license, you can manually keep the > configs in sync on matching local directories on the nodes. > > See if this helps at all: > > http://wiki.alteeve.com/index.php/Red_Hat_Cluster_Service_2_Tutorial > Actually we are using SL6 but we probably will migrate to RHEL6 if this environment will be used as production infrastructure in the future. So we will consider GFS2. Thanks a lot for your answer. Marc > > Best of luck. :) > > -- > Digimer > E-Mail: digimer at alteeve.com > Freenode handle: digimer > Papers and Projects: http://alteeve.com > Node Assassin: http://nodeassassin.org > "At what point did we forget that the Space Shuttle was, essentially, > a program that strapped human beings to an explosion and tried to stab > through the sky with fire and math?" > -- Marc Caubet Serrabou PIC (Port d'Informaci? Cient?fica) Campus UAB, Edificio D E-08193 Bellaterra, Barcelona Tel: +34 93 581 33 22 Fax: +34 93 581 41 10 http://www.pic.es Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From mcaubet at pic.es Wed Jul 20 09:40:10 2011 From: mcaubet at pic.es (Marc Caubet) Date: Wed, 20 Jul 2011 11:40:10 +0200 Subject: [Linux-cluster] Linux Cluster + KVM + LVM2 In-Reply-To: References: Message-ID: Hi, thanks for your answer. 
On 20 July 2011 09:06, wrote: > Hi Marc, > > > though Digimer's RHCS tutorial is an excellent introducion to the > RHCS cluster with a thourough step by step reference to setting > up a Xen cluster, and I'd highly recommend you read it, you > probably also would like to have look at these articles in the > cluster wiki which focus a little more condensed on your > questions. > > Here is described how to set up so called HA-LVM which avoids the > clmvd overhead and is for settings like yours where you only > require active/passive VG activation (i.e. a shared storage VG is > only activated on a single cluster node at any time). > This is achieved by tagging of the affected VGs/LVs. > > https://fedorahosted.org/cluster/wiki/LVMFailover > > Unfortunately, this article lacks mentioning of the required > locking_type setting in lvm.conf. > > But if you have access to RHN this article on HA-LVM does, and it > also outlines both methods, i.e. the so called "preferred clvmd" > method and the so called "original" (or what I'd call "tagging") > method which doesn't require clvmd: > > https://access.redhat.com/kb/docs/DOC-3068 > > > Should you on the other hand require active/active VGs (i.e. > simultaneous activation of the same shared VG on more than one > cluster node), which I consider not a requirement for a KVM > cluster (but I lack any experience in this field) the recommended > procedure is described here: > > https://access.redhat.com/kb/docs/DOC-17651 > Great links, this is what we were looking for. We'll try this before testing GFS2 because we are preferably want to work directly over Logical Volumes. Thanks a lot for your reply, Marc > > > Important aside, after you've edited the lvm.conf you are > required to make a new initial ramdisk (or at least touch the > mtime of the current initrd) or cluster services won't start on > that node (watch entries in messages or whereever syslogd logs > clulog stuff to). > > > > In this article you may find something treating KVM VMs' > migration. > > https://fedorahosted.org/cluster/wiki/KvmMigration > > > Regards > Ralph > (an RHCS newbie himself) > ________________________________ > > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Marc > Caubet > Sent: Tuesday, July 19, 2011 4:19 PM > To: linux-cluster at redhat.com > Subject: [Linux-cluster] Linux Cluster + KVM + LVM2 > > > Hi, > > we are testing RedHat Cluster to build a KVM > virtualization infrastructure. This is the first time we use the > Linux Cluster so we are a little bit lost. Hope someone can help. > > Actually we have 2 hypervisors connected via Fiber > Channel to a shared storage (both servers see the same 15TB > device /dev/mapper/mpathb). > > Our idea is a shared storage to hold KVM virtual machines > by using LVM2. Both server should be able to run Virtual Machines > from the same storage, but we should be able to migrate or start > virtual machines on the other server node on crash. > > So the plan is: > > - Virtual Machine image = Logical Volume > - CLVM2 cluster: only one server node at the same time > will be able to manage the volume group- > - KVM Virtual Machine High Availability. Machines will > run on one server node. If for some reason the server node > crashes, the second will start / migrate the virtual machine. 
> > Basically we woul like to know: > > - How can we create a cluster for the LVM2 shared storage > (when we create it, it does not work since both server nodes have > the VG as Active) > - How can we create a cluster service for a virtual > machine (we guess it has to be done 1 by 1) > - Since we have 2 server nodes, how to increase the > number of votes for quorum (qdisk over a heartbeat logical volume > partition?) > > Thanks and best regards, > -- > Marc Caubet Serrabou > PIC (Port d'Informaci? Cient?fica) > Campus UAB, Edificio D > E-08193 Bellaterra, Barcelona > Tel: +34 93 581 33 22 > Fax: +34 93 581 41 10 > http://www.pic.es > Avis - Aviso - Legal Notice: > http://www.ifae.es/legal.html > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Marc Caubet Serrabou PIC (Port d'Informaci? Cient?fica) Campus UAB, Edificio D E-08193 Bellaterra, Barcelona Tel: +34 93 581 33 22 Fax: +34 93 581 41 10 http://www.pic.es Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From linux at alteeve.com Wed Jul 20 13:02:21 2011 From: linux at alteeve.com (Digimer) Date: Wed, 20 Jul 2011 09:02:21 -0400 Subject: [Linux-cluster] Linux Cluster + KVM + LVM2 In-Reply-To: References: <4E2599CC.4010503@alteeve.com> Message-ID: <4E26D1DD.4040203@alteeve.com> On 07/20/2011 05:29 AM, Marc Caubet wrote: > Hi, > > thanks a lot for your reply. > > You need to setup a cluster with fencing, which will let you then use > clustered LVM, clvmd, which in turn uses the distributed lock manager, > dlm. This will allow for the same LVs to be seen and used across cluster > nodes. > > > Ok. I will try this. > > > > Then you will simply add the VMs as resources to rgmanager, which uses > (and sits on top of) corosync, which is itself the core of the cluster. > > > So each virtual machine will be a resource, is it right? If you wish, yes. Having the resource management means that recovery of VMs lost to a host node failure will be automated. It is not, in itself, a requirement. > I'm guessing that you are using RHEL 6, so this may not map perfectly, > but I described how to build a similar Xen-based HA VM cluster on EL5. > The main differences are; Corosync instead of OpenAIS (changes nothing, > configuration wise), ignore DRBD as you have a proper SAN and replace > Xen with KVM. The small GFS2 partition is still recommended for central > storage of the VM definitions (needed for migration and recovery). > However, if you don't have a GFS2 license, you can manually keep the > configs in sync on matching local directories on the nodes. > > See if this helps at all: > > http://wiki.alteeve.com/index.php/Red_Hat_Cluster_Service_2_Tutorial > > > Actually we are using SL6 but we probably will migrate to RHEL6 if this > environment will be used as production infrastructure in the future. So > we will consider GFS2. > > Thanks a lot for your answer. > > Marc SL6 is based on RHEL6, so the cluster stack will be the same. A couple of notes; Ralph is right, of course, and those RHEL docs are well worth reading. They are certainly more authoritative than my wiki. There are a few things to consider though, if you proceed without a cluster; * Live migration of VMs (as opposed to cold recovery), requires the new and old hosts to simultaneously write to the same LV, iirc. Assuming I am right, then you need to make sure you LV is ACTIVE on both nodes at the same time. 
I do not know if that is (safely) possible without clustered LVM (and its use of DLM). * Without a cluster, VM recovery and what not will not be automatic, I don't believe. * Without the cluster's fencing, if an LV is (accidentally) flagged as ACTIVE on two nodes, there is nothing preventing corruption of the LV. For example, let's say that a node hangs... After a time, you (or a script) recovers the VM on another node. After, the original node unblocks and goes back to writing to the LV. Suddenly, you've got the same VM running twice on the same block device. Fencing puts the hung node into a known safe state by forcing it to shut down. Only then, after confirmation that the node is actually gone, will another node recover the resources. Building a minimal cluster with fencing is not that hard. It does require some reading and some patience, but it's effectively: * Set up shared SSH keys between nodes. * Edit /etc/cluster/cluster.conf ** define the nodes and how to fence them (device, port) ** define the fence device(s) (ip, user/pass, etc). * Start the cluster * Start clvmd * Create clustered LVs (lvcreate -c y...) Done! -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "At what point did we forget that the Space Shuttle was, essentially, a program that strapped human beings to an explosion and tried to stab through the sky with fire and math?"

From jordir at fib.upc.edu Fri Jul 22 10:08:59 2011 From: jordir at fib.upc.edu (Jordi Renye) Date: Fri, 22 Jul 2011 12:08:59 +0200 Subject: [Linux-cluster] rhel 6.1 gfs2 performance tests Message-ID: <4E294C3B.4000704@fib.upc.edu> Hi, We have configured redhat cluster RHEL 6.1 with two nodes. We have seen that performance of GFS2 on writing is half of ext3 partition.
> > For example, time of commands: > > time cp -Rp /usr /gfs2partition/usr > 0.681u 47.082s 7:01.80 11.3% 0+0k 561264+2994832io 0pf+0w > > whereas > > cp -R /usr /ext3partition/usr > 0.543u 24.041s 4:16.86 9.5% 0+0k 2728584+3166184io 2pf+0w > > With ping_pong tool from Samba.org we've got next results: > > Los resultados son los siguientes: > > ping_pong /gfs2partition/pingpongtestfile 3 > 1582 locks/sec > > With ping_pong test r/w: > > ping_pong -rw /gfs2partition/pingpongtestfile 3 > data increment = 2 > 4 locks/sec > > Do you think we can get better performance? Do you think > are "normal" and "good" results ? > > Which recommendations do you tell us to get better performance? > > For example, we don't have a heartbeat network exclusively, but > we have only one networks interface for application network and cluster > network. > Could we get better performance with one dedicated cluster network( for > dlm,heartbeath,...). > > Thanks in advanced, > It depends what you are trying to optimise for... what is the actual application that you want to run? cp doesn't use fcntl locks to the best of my knowledge, so I doubt that will have any particular effect on the performance. Also it would be quite unusual for fcntl locks to have any effect on the performance of the fs as a whole. Usually the most important factor is how the workload is balances between nodes. Also, did you mount with noatime, nodiratime? Steve. From jordir at fib.upc.edu Fri Jul 22 10:41:30 2011 From: jordir at fib.upc.edu (Jordi Renye) Date: Fri, 22 Jul 2011 12:41:30 +0200 Subject: [Linux-cluster] rhel 6.1 gfs2 performance tests In-Reply-To: <1311330750.2804.10.camel@menhir> References: <4E294C3B.4000704@fib.upc.edu> <1311330750.2804.10.camel@menhir> Message-ID: <4E2953DA.8090404@fib.upc.edu> We are sharing gfs2 partition through samba to three hundred clients aprox. Partition GFS2 is mounted in two nodes of cluster. Clients can boot in linux and windows. There is one share for home folder, another for profiles, another for shared applications and data: there is 5 shares. >> Also, did you mount with noatime, nodiratime? Yes, I'm mounting with these options. Jordi Renye LCFIB - UPC El 22/07/2011 12:32, Steven Whitehouse escribi?: > Hi, > > On Fri, 2011-07-22 at 12:08 +0200, Jordi Renye wrote: >> Hi, >> >> We have configured redhat cluster RHEL 6.1 with two nodes. >> We have seen that performance of GFS2 on writing is >> half of ext3 partition. >> >> For example, time of commands: >> >> time cp -Rp /usr /gfs2partition/usr >> 0.681u 47.082s 7:01.80 11.3% 0+0k 561264+2994832io 0pf+0w >> >> whereas >> >> cp -R /usr /ext3partition/usr >> 0.543u 24.041s 4:16.86 9.5% 0+0k 2728584+3166184io 2pf+0w >> >> With ping_pong tool from Samba.org we've got next results: >> >> Los resultados son los siguientes: >> >> ping_pong /gfs2partition/pingpongtestfile 3 >> 1582 locks/sec >> >> With ping_pong test r/w: >> >> ping_pong -rw /gfs2partition/pingpongtestfile 3 >> data increment = 2 >> 4 locks/sec >> >> Do you think we can get better performance? Do you think >> are "normal" and "good" results ? >> >> Which recommendations do you tell us to get better performance? >> >> For example, we don't have a heartbeat network exclusively, but >> we have only one networks interface for application network and cluster >> network. >> Could we get better performance with one dedicated cluster network( for >> dlm,heartbeath,...). >> >> Thanks in advanced, >> > It depends what you are trying to optimise for... 
what is the actual > application that you want to run? > > cp doesn't use fcntl locks to the best of my knowledge, so I doubt that > will have any particular effect on the performance. Also it would be > quite unusual for fcntl locks to have any effect on the performance of > the fs as a whole. > > Usually the most important factor is how the workload is balances > between nodes. Also, did you mount with noatime, nodiratime? > > Steve. > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- Jordi Renye Capel o o o T?cnic de Sistemes N1 o o o Laboratori de C?lcul o o o Facultat d'Inform?tica de Barcelona U P C Universitat Polit?cnica de Catalunya - Barcelona Tech E-mail : jordir at fib.upc.edu Tel. : 16943 Web : http://www.fib.upc.edu/ ====================================================================== Abans d'imprimir aquest missatge, si us plau, assegureu-vos que sigui necessari. El medi ambient ?s cosa de tots. --[ http://www.fib.upc.edu/disclaimer/ ]------------------------------ ADVERTIMENT / TEXT LEGAL: Aquest missatge pot contenir informaci? confidencial o legalment protegida i est? exclusivament adre?at a la persona o entitat destinat?ria. Si vost? no es el destinatari final o persona encarregada de recollir-lo, no est? autoritzat a llegir-lo, retenir-lo, modificar-lo, distribuir-lo, copiar-lo ni a revelar el seu contingut. Si ha rebut aquest correu electr?nic per error, li preguem que informi al remitent i elimini del seu sistema el missatge i el material annex que pugui contenir. Gr?cies per la seva col?laboraci?. From carlopmart at gmail.com Mon Jul 25 12:38:53 2011 From: carlopmart at gmail.com (carlopmart) Date: Mon, 25 Jul 2011 14:38:53 +0200 Subject: [Linux-cluster] Corosync goes cpu to 95-99% In-Reply-To: <4E04B61B.9070208@cybercat.ca> References: <4DD29D03.9080901@gmail.com> <4DD2BAC3.50509@redhat.com> <4DD2BD7D.5070704@gmail.com> <4DD2CA90.6090802@redhat.com> <3B50BA7445114813AE429BEE51A2BA52@versa> <4DD78908.2030801@gmail.com> <0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com> <4DD873C7.8080402@cybercat.ca> <22E7D11CD5E64E338A66811F31F06238@versa> <4DE545D7.1080703@redhat.com> <4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca> <4DE75602.1000408@gmail.com> <51BB988BCCF547E69BF222BDAF34C4DE@versa> <4E04B61B.9070208@cybercat.ca> Message-ID: <4E2D63DD.4050007@gmail.com> On 06/24/2011 06:06 PM, Nicolas Ross wrote: > >> Thanks for that, that'll prevent me from modifying a system file... >> >> And yes, I find it a little disapointing. We're now at 6.1, and our >> setup is exactly what RHCS was designed for... A GFS over fiber, httpd >> running content from that gfs... > > Two thing I need to mention in this issue. One, support doesn't think > anymore that it's a coro-sync specific issue, they are searching to a > driver issue or other source for this problem. > > Second, I downgraded my kernel to 2.6.32-71.29.1.el6 (pre-6.1, or 6.0), > for another issue, and since I did, I don't think I saw that issue > again. I saw spikes in my cpu graphs, but I'm not 100% sure that they > are caused by this issue. > > So, as a temporary work-around for this time, woule be (at your own > risks) to downgrade to 2.6.32-71.29.1.el6 kernel : > > yum install kernel-2.6.32-71.29.1.el6.x86_64 > > Regards, Hi Steven and Nicolas, Is this bug resolved in RHEL6.1 with all updates applied?? Do I need to use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1? Thanks. 
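(For reference, I am checking what is currently installed on each node with nothing more than:

rpm -q corosync cman kernel
uname -r

so that I can compare against whatever combination turns out to be the supported one.)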
-- CL Martinez carlopmart {at} gmail {d0t} com From sdake at redhat.com Mon Jul 25 13:44:09 2011 From: sdake at redhat.com (Steven Dake) Date: Mon, 25 Jul 2011 06:44:09 -0700 Subject: [Linux-cluster] Corosync goes cpu to 95-99% In-Reply-To: <4E2D63DD.4050007@gmail.com> References: <4DD29D03.9080901@gmail.com> <4DD2BAC3.50509@redhat.com> <4DD2BD7D.5070704@gmail.com> <4DD2CA90.6090802@redhat.com> <3B50BA7445114813AE429BEE51A2BA52@versa> <4DD78908.2030801@gmail.com> <0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com> <4DD873C7.8080402@cybercat.ca> <22E7D11CD5E64E338A66811F31F06238@versa> <4DE545D7.1080703@redhat.com> <4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca> <4DE75602.1000408@gmail.com> <51BB988BCCF547E69BF222BDAF34C4DE@versa> <4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com> Message-ID: <4E2D7329.6050607@redhat.com> On 07/25/2011 05:38 AM, carlopmart wrote: > On 06/24/2011 06:06 PM, Nicolas Ross wrote: >> >>> Thanks for that, that'll prevent me from modifying a system file... >>> >>> And yes, I find it a little disapointing. We're now at 6.1, and our >>> setup is exactly what RHCS was designed for... A GFS over fiber, httpd >>> running content from that gfs... >> >> Two thing I need to mention in this issue. One, support doesn't think >> anymore that it's a coro-sync specific issue, they are searching to a >> driver issue or other source for this problem. >> >> Second, I downgraded my kernel to 2.6.32-71.29.1.el6 (pre-6.1, or 6.0), >> for another issue, and since I did, I don't think I saw that issue >> again. I saw spikes in my cpu graphs, but I'm not 100% sure that they >> are caused by this issue. >> >> So, as a temporary work-around for this time, woule be (at your own >> risks) to downgrade to 2.6.32-71.29.1.el6 kernel : >> >> yum install kernel-2.6.32-71.29.1.el6.x86_64 >> >> Regards, > > Hi Steven and Nicolas, > > Is this bug resolved in RHEL6.1 with all updates applied?? Do I need to > use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1? > > Thanks. > the corosync portion is going through QE. The kernel portion remains open. Regards -steve From carlopmart at gmail.com Mon Jul 25 13:48:21 2011 From: carlopmart at gmail.com (carlopmart) Date: Mon, 25 Jul 2011 15:48:21 +0200 Subject: [Linux-cluster] Corosync goes cpu to 95-99% In-Reply-To: <4E2D7329.6050607@redhat.com> References: <4DD29D03.9080901@gmail.com> <4DD2BAC3.50509@redhat.com> <4DD2BD7D.5070704@gmail.com> <4DD2CA90.6090802@redhat.com> <3B50BA7445114813AE429BEE51A2BA52@versa> <4DD78908.2030801@gmail.com> <0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com> <4DD873C7.8080402@cybercat.ca> <22E7D11CD5E64E338A66811F31F06238@versa> <4DE545D7.1080703@redhat.com> <4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca> <4DE75602.1000408@gmail.com> <51BB988BCCF547E69BF222BDAF34C4DE@versa> <4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com> <4E2D7329.6050607@redhat.com> Message-ID: <4E2D7425.4070801@gmail.com> On 07/25/2011 03:44 PM, Steven Dake wrote: > On 07/25/2011 05:38 AM, carlopmart wrote: >> On 06/24/2011 06:06 PM, Nicolas Ross wrote: >>> >>>> Thanks for that, that'll prevent me from modifying a system file... >>>> >>>> And yes, I find it a little disapointing. We're now at 6.1, and our >>>> setup is exactly what RHCS was designed for... A GFS over fiber, httpd >>>> running content from that gfs... >>> >>> Two thing I need to mention in this issue. 
One, support doesn't think >>> anymore that it's a coro-sync specific issue, they are searching to a >>> driver issue or other source for this problem. >>> >>> Second, I downgraded my kernel to 2.6.32-71.29.1.el6 (pre-6.1, or 6.0), >>> for another issue, and since I did, I don't think I saw that issue >>> again. I saw spikes in my cpu graphs, but I'm not 100% sure that they >>> are caused by this issue. >>> >>> So, as a temporary work-around for this time, woule be (at your own >>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel : >>> >>> yum install kernel-2.6.32-71.29.1.el6.x86_64 >>> >>> Regards, >> >> Hi Steven and Nicolas, >> >> Is this bug resolved in RHEL6.1 with all updates applied?? Do I need to >> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1? >> >> Thanks. >> > > the corosync portion is going through QE. The kernel portion remains open. > > Regards > -steve > Thanks Steve, then, Can I use last corosync version provided with RHEL6.1 and last RHEL6.0's kernel version without problems?? -- CL Martinez carlopmart {at} gmail {d0t} com From swhiteho at redhat.com Mon Jul 25 13:51:11 2011 From: swhiteho at redhat.com (Steven Whitehouse) Date: Mon, 25 Jul 2011 14:51:11 +0100 Subject: [Linux-cluster] rhel 6.1 gfs2 performance tests In-Reply-To: <4E2953DA.8090404@fib.upc.edu> References: <4E294C3B.4000704@fib.upc.edu> <1311330750.2804.10.camel@menhir> <4E2953DA.8090404@fib.upc.edu> Message-ID: <1311601871.2697.7.camel@menhir> Hi, On Fri, 2011-07-22 at 12:41 +0200, Jordi Renye wrote: > We are sharing gfs2 partition through samba > to three hundred clients aprox. > > Partition GFS2 is mounted in two nodes of > cluster. > > Clients can boot in linux and windows. > > There is one share for home folder, another > for profiles, another for shared applications and > data: there is 5 shares. > > >> Also, did you mount with noatime, nodiratime? > > Yes, I'm mounting with these options. > > Jordi Renye > LCFIB - UPC > > Were the tests being run directly on gfs2, or via Samba in this case? Steve. > > El 22/07/2011 12:32, Steven Whitehouse escribi?: > > Hi, > > > > On Fri, 2011-07-22 at 12:08 +0200, Jordi Renye wrote: > >> Hi, > >> > >> We have configured redhat cluster RHEL 6.1 with two nodes. > >> We have seen that performance of GFS2 on writing is > >> half of ext3 partition. > >> > >> For example, time of commands: > >> > >> time cp -Rp /usr /gfs2partition/usr > >> 0.681u 47.082s 7:01.80 11.3% 0+0k 561264+2994832io 0pf+0w > >> > >> whereas > >> > >> cp -R /usr /ext3partition/usr > >> 0.543u 24.041s 4:16.86 9.5% 0+0k 2728584+3166184io 2pf+0w > >> > >> With ping_pong tool from Samba.org we've got next results: > >> > >> Los resultados son los siguientes: > >> > >> ping_pong /gfs2partition/pingpongtestfile 3 > >> 1582 locks/sec > >> > >> With ping_pong test r/w: > >> > >> ping_pong -rw /gfs2partition/pingpongtestfile 3 > >> data increment = 2 > >> 4 locks/sec > >> > >> Do you think we can get better performance? Do you think > >> are "normal" and "good" results ? > >> > >> Which recommendations do you tell us to get better performance? > >> > >> For example, we don't have a heartbeat network exclusively, but > >> we have only one networks interface for application network and cluster > >> network. > >> Could we get better performance with one dedicated cluster network( for > >> dlm,heartbeath,...). > >> > >> Thanks in advanced, > >> > > It depends what you are trying to optimise for... what is the actual > > application that you want to run? 
> > > > cp doesn't use fcntl locks to the best of my knowledge, so I doubt that > > will have any particular effect on the performance. Also it would be > > quite unusual for fcntl locks to have any effect on the performance of > > the fs as a whole. > > > > Usually the most important factor is how the workload is balances > > between nodes. Also, did you mount with noatime, nodiratime? > > > > Steve. > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > From sdake at redhat.com Mon Jul 25 15:42:03 2011 From: sdake at redhat.com (Steven Dake) Date: Mon, 25 Jul 2011 08:42:03 -0700 Subject: [Linux-cluster] Corosync goes cpu to 95-99% In-Reply-To: <4E2D7425.4070801@gmail.com> References: <4DD29D03.9080901@gmail.com> <4DD2BAC3.50509@redhat.com> <4DD2BD7D.5070704@gmail.com> <4DD2CA90.6090802@redhat.com> <3B50BA7445114813AE429BEE51A2BA52@versa> <4DD78908.2030801@gmail.com> <0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com> <4DD873C7.8080402@cybercat.ca> <22E7D11CD5E64E338A66811F31F06238@versa> <4DE545D7.1080703@redhat.com> <4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca> <4DE75602.1000408@gmail.com> <51BB988BCCF547E69BF222BDAF34C4DE@versa> <4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com> <4E2D7329.6050607@redhat.com> <4E2D7425.4070801@gmail.com> Message-ID: <4E2D8ECB.6020305@redhat.com> On 07/25/2011 06:48 AM, carlopmart wrote: > On 07/25/2011 03:44 PM, Steven Dake wrote: >> On 07/25/2011 05:38 AM, carlopmart wrote: >>> On 06/24/2011 06:06 PM, Nicolas Ross wrote: >>>> >>>>> Thanks for that, that'll prevent me from modifying a system file... >>>>> >>>>> And yes, I find it a little disapointing. We're now at 6.1, and our >>>>> setup is exactly what RHCS was designed for... A GFS over fiber, httpd >>>>> running content from that gfs... >>>> >>>> Two thing I need to mention in this issue. One, support doesn't think >>>> anymore that it's a coro-sync specific issue, they are searching to a >>>> driver issue or other source for this problem. >>>> >>>> Second, I downgraded my kernel to 2.6.32-71.29.1.el6 (pre-6.1, or 6.0), >>>> for another issue, and since I did, I don't think I saw that issue >>>> again. I saw spikes in my cpu graphs, but I'm not 100% sure that they >>>> are caused by this issue. >>>> >>>> So, as a temporary work-around for this time, woule be (at your own >>>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel : >>>> >>>> yum install kernel-2.6.32-71.29.1.el6.x86_64 >>>> >>>> Regards, >>> >>> Hi Steven and Nicolas, >>> >>> Is this bug resolved in RHEL6.1 with all updates applied?? Do I >>> need to >>> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1? >>> >>> Thanks. >>> >> >> the corosync portion is going through QE. The kernel portion remains >> open. >> >> Regards >> -steve >> > > Thanks Steve, then, Can I use last corosync version provided with > RHEL6.1 and last RHEL6.0's kernel version without problems?? > > > I recommend not mixing without a support signoff. 
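For anyone who does fall back on the kernel downgrade quoted earlier in this thread, a rough sketch of keeping the older kernel booted on RHEL 6 follows. The grub entry index and the yum exclude line are assumptions to check against your own /boot/grub/grub.conf and yum configuration, not supported guidance:

    # install the older kernel alongside the current one (it is not removed)
    yum install kernel-2.6.32-71.29.1.el6.x86_64

    # list the boot entries, then point "default=" at the older kernel
    # (the index is 0-based; "1" below is only an example)
    grep ^title /boot/grub/grub.conf
    sed -i 's/^default=.*/default=1/' /boot/grub/grub.conf

    # optionally hold kernel updates back until the fix ships,
    # by adding this line under [main] in /etc/yum.conf:
    #   exclude=kernel*

Whether that combination is acceptable in production is exactly the support sign-off question raised above.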
Regards -steve From carlopmart at gmail.com Mon Jul 25 15:45:11 2011 From: carlopmart at gmail.com (carlopmart) Date: Mon, 25 Jul 2011 17:45:11 +0200 Subject: [Linux-cluster] Corosync goes cpu to 95-99% In-Reply-To: <4E2D8ECB.6020305@redhat.com> References: <4DD29D03.9080901@gmail.com> <4DD2BAC3.50509@redhat.com> <4DD2BD7D.5070704@gmail.com> <4DD2CA90.6090802@redhat.com> <3B50BA7445114813AE429BEE51A2BA52@versa> <4DD78908.2030801@gmail.com> <0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com> <4DD873C7.8080402@cybercat.ca> <22E7D11CD5E64E338A66811F31F06238@versa> <4DE545D7.1080703@redhat.com> <4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca> <4DE75602.1000408@gmail.com> <51BB988BCCF547E69BF222BDAF34C4DE@versa> <4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com> <4E2D7329.6050607@redhat.com> <4E2D7425.4070801@gmail.com> <4E2D8ECB.6020305@redhat.com> Message-ID: <4E2D8F87.30508@gmail.com> On 07/25/2011 05:42 PM, Steven Dake wrote: >>>>> are caused by this issue. >>>>> >>>>> So, as a temporary work-around for this time, woule be (at your own >>>>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel : >>>>> >>>>> yum install kernel-2.6.32-71.29.1.el6.x86_64 >>>>> >>>>> Regards, >>>> >>>> Hi Steven and Nicolas, >>>> >>>> Is this bug resolved in RHEL6.1 with all updates applied?? Do I >>>> need to >>>> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1? >>>> >>>> Thanks. >>>> >>> >>> the corosync portion is going through QE. The kernel portion remains >>> open. >>> >>> Regards >>> -steve >>> >> >> Thanks Steve, then, Can I use last corosync version provided with >> RHEL6.1 and last RHEL6.0's kernel version without problems?? >> >> >> > > I recommend not mixing without a support signoff. > Then, how can I install rhcs under rhel6.x and prevent this bug?? -- CL Martinez carlopmart {at} gmail {d0t} com From sdake at redhat.com Mon Jul 25 16:04:27 2011 From: sdake at redhat.com (Steven Dake) Date: Mon, 25 Jul 2011 09:04:27 -0700 Subject: [Linux-cluster] Corosync goes cpu to 95-99% In-Reply-To: <4E2D8F87.30508@gmail.com> References: <4DD29D03.9080901@gmail.com> <4DD2BAC3.50509@redhat.com> <4DD2BD7D.5070704@gmail.com> <4DD2CA90.6090802@redhat.com> <3B50BA7445114813AE429BEE51A2BA52@versa> <4DD78908.2030801@gmail.com> <0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com> <4DD873C7.8080402@cybercat.ca> <22E7D11CD5E64E338A66811F31F06238@versa> <4DE545D7.1080703@redhat.com> <4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca> <4DE75602.1000408@gmail.com> <51BB988BCCF547E69BF222BDAF34C4DE@versa> <4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com> <4E2D7329.6050607@redhat.com> <4E2D7425.4070801@gmail.com> <4E2D8ECB.6020305@redhat.com> <4E2D8F87.30508@gmail.com> Message-ID: <4E2D940B.5020803@redhat.com> On 07/25/2011 08:45 AM, carlopmart wrote: > On 07/25/2011 05:42 PM, Steven Dake wrote: >>>>>> are caused by this issue. >>>>>> >>>>>> So, as a temporary work-around for this time, woule be (at your own >>>>>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel : >>>>>> >>>>>> yum install kernel-2.6.32-71.29.1.el6.x86_64 >>>>>> >>>>>> Regards, >>>>> >>>>> Hi Steven and Nicolas, >>>>> >>>>> Is this bug resolved in RHEL6.1 with all updates applied?? Do I >>>>> need to >>>>> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1? >>>>> >>>>> Thanks. >>>>> >>>> >>>> the corosync portion is going through QE. The kernel portion remains >>>> open. 
>>>> >>>> Regards >>>> -steve >>>> >>> >>> Thanks Steve, then, Can I use last corosync version provided with >>> RHEL6.1 and last RHEL6.0's kernel version without problems?? >>> >>> >>> >> >> I recommend not mixing without a support signoff. >> > > Then, how can I install rhcs under rhel6.x and prevent this bug?? > > get a support signoff. Also the corosync updates have not finished through our validation process. Only hot fixes (from support) are available Regards -steve From jordir at fib.upc.edu Tue Jul 26 10:36:55 2011 From: jordir at fib.upc.edu (Jordi Renye) Date: Tue, 26 Jul 2011 12:36:55 +0200 Subject: [Linux-cluster] rhel 6.1 gfs2 performance tests In-Reply-To: <1311601871.2697.7.camel@menhir> References: <4E294C3B.4000704@fib.upc.edu> <1311330750.2804.10.camel@menhir> <4E2953DA.8090404@fib.upc.edu> <1311601871.2697.7.camel@menhir> Message-ID: <4E2E98C7.6020100@fib.upc.edu> Tests run directly on gfs2. Soon, we would like testing through samba clients. Jordi Renye LCFIB - UPC El 25/07/2011 15:51, Steven Whitehouse escribi?: > Hi, > > On Fri, 2011-07-22 at 12:41 +0200, Jordi Renye wrote: >> We are sharing gfs2 partition through samba >> to three hundred clients aprox. >> >> Partition GFS2 is mounted in two nodes of >> cluster. >> >> Clients can boot in linux and windows. >> >> There is one share for home folder, another >> for profiles, another for shared applications and >> data: there is 5 shares. >> >>>> Also, did you mount with noatime, nodiratime? >> Yes, I'm mounting with these options. >> >> Jordi Renye >> LCFIB - UPC >> >> > Were the tests being run directly on gfs2, or via Samba in this case? > > Steve. > >> El 22/07/2011 12:32, Steven Whitehouse escribi?: >>> Hi, >>> >>> On Fri, 2011-07-22 at 12:08 +0200, Jordi Renye wrote: >>>> Hi, >>>> >>>> We have configured redhat cluster RHEL 6.1 with two nodes. >>>> We have seen that performance of GFS2 on writing is >>>> half of ext3 partition. >>>> >>>> For example, time of commands: >>>> >>>> time cp -Rp /usr /gfs2partition/usr >>>> 0.681u 47.082s 7:01.80 11.3% 0+0k 561264+2994832io 0pf+0w >>>> >>>> whereas >>>> >>>> cp -R /usr /ext3partition/usr >>>> 0.543u 24.041s 4:16.86 9.5% 0+0k 2728584+3166184io 2pf+0w >>>> >>>> With ping_pong tool from Samba.org we've got next results: >>>> >>>> Los resultados son los siguientes: >>>> >>>> ping_pong /gfs2partition/pingpongtestfile 3 >>>> 1582 locks/sec >>>> >>>> With ping_pong test r/w: >>>> >>>> ping_pong -rw /gfs2partition/pingpongtestfile 3 >>>> data increment = 2 >>>> 4 locks/sec >>>> >>>> Do you think we can get better performance? Do you think >>>> are "normal" and "good" results ? >>>> >>>> Which recommendations do you tell us to get better performance? >>>> >>>> For example, we don't have a heartbeat network exclusively, but >>>> we have only one networks interface for application network and cluster >>>> network. >>>> Could we get better performance with one dedicated cluster network( for >>>> dlm,heartbeath,...). >>>> >>>> Thanks in advanced, >>>> >>> It depends what you are trying to optimise for... what is the actual >>> application that you want to run? >>> >>> cp doesn't use fcntl locks to the best of my knowledge, so I doubt that >>> will have any particular effect on the performance. Also it would be >>> quite unusual for fcntl locks to have any effect on the performance of >>> the fs as a whole. >>> >>> Usually the most important factor is how the workload is balances >>> between nodes. Also, did you mount with noatime, nodiratime? >>> >>> Steve. 
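For readers benchmarking their own setups, the mount options being asked about usually end up in /etc/fstab along these lines; the logical volume name is a placeholder, and /gfs2partition is simply the mount point used in the tests above:

    /dev/vg_cluster/lv_gfs2  /gfs2partition  gfs2  noatime,nodiratime  0 0

or, for a one-off mount:

    mount -t gfs2 -o noatime,nodiratime /dev/vg_cluster/lv_gfs2 /gfs2partition

Both options stop every read from turning into an atime metadata update, which tends to matter more on a shared filesystem than on local ext3.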
>>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- Jordi Renye Capel o o o T?cnic de Sistemes N1 o o o Laboratori de C?lcul o o o Facultat d'Inform?tica de Barcelona U P C Universitat Polit?cnica de Catalunya - Barcelona Tech E-mail : jordir at fib.upc.edu Tel. : 16943 Web : http://www.fib.upc.edu/ ====================================================================== Abans d'imprimir aquest missatge, si us plau, assegureu-vos que sigui necessari. El medi ambient ?s cosa de tots. --[ http://www.fib.upc.edu/disclaimer/ ]------------------------------ ADVERTIMENT / TEXT LEGAL: Aquest missatge pot contenir informaci? confidencial o legalment protegida i est? exclusivament adre?at a la persona o entitat destinat?ria. Si vost? no es el destinatari final o persona encarregada de recollir-lo, no est? autoritzat a llegir-lo, retenir-lo, modificar-lo, distribuir-lo, copiar-lo ni a revelar el seu contingut. Si ha rebut aquest correu electr?nic per error, li preguem que informi al remitent i elimini del seu sistema el missatge i el material annex que pugui contenir. Gr?cies per la seva col?laboraci?. From ifetch at du.edu Wed Jul 27 05:47:03 2011 From: ifetch at du.edu (Ivan Fetch) Date: Tue, 26 Jul 2011 23:47:03 -0600 Subject: [Linux-cluster] RAID1 in RHCS / RHEL5 Message-ID: <33A35733-136D-41A6-9EAA-11D106E14583@du.edu> Hello, I am not finding a lot of docs or commentary about accomplishing RAID1 of two shared storage (FC SAN) LUNs, in RHCS. I wlikd like to RAID1 two mpath devices (using the device multipathing with RHEL5). The mpath devices (mpath0, mpath1) do not match up on all nodes, but I don't believe this technically matters, since md uses UUIDs to detects it's disk devices. I could just configure an md device, but it seems like there should be a notion of which node owns the md. WHat happens when the active node is resyncing a mirror, and that node dies or gets fenced? WHat happens if someone tries to operate on the md from the inactive node? I would very much appreciate hearing from those who are accomplishing this, how they did it, gotchas and lessons learned. Thanks, - Ivan . From gordan at bobich.net Wed Jul 27 08:05:16 2011 From: gordan at bobich.net (Gordan Bobic) Date: Wed, 27 Jul 2011 09:05:16 +0100 Subject: [Linux-cluster] RAID1 in RHCS / RHEL5 In-Reply-To: <33A35733-136D-41A6-9EAA-11D106E14583@du.edu> References: <33A35733-136D-41A6-9EAA-11D106E14583@du.edu> Message-ID: I don't think this has anything to do with clustering, but what you are probably looking for is DRBD: http://www.drbd.org/ Gordan On Tue, 26 Jul 2011 23:47:03 -0600, Ivan Fetch wrote: > Hello, > > I am not finding a lot of docs or commentary about accomplishing > RAID1 of two shared storage (FC SAN) LUNs, in RHCS. I wlikd like to > RAID1 two mpath devices (using the device multipathing with RHEL5). > The mpath devices (mpath0, mpath1) do not match up on all nodes, but > I > don't believe this technically matters, since md uses UUIDs to > detects > it's disk devices. > > I could just configure an md device, but it seems like there should > be a notion of which node owns the md. WHat happens when the active > node is resyncing a mirror, and that node dies or gets fenced? WHat > happens if someone tries to operate on the md from the inactive node? 
> > I would very much appreciate hearing from those who are accomplishing > this, how they did it, gotchas and lessons learned. > > Thanks, > > - Ivan > > > > > > > > > > > > > > > > > > > > > > > > . > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From ifetch at du.edu Wed Jul 27 15:14:08 2011 From: ifetch at du.edu (Ivan Fetch) Date: Wed, 27 Jul 2011 09:14:08 -0600 Subject: [Linux-cluster] RAID1 in RHCS / RHEL5 In-Reply-To: References: <33A35733-136D-41A6-9EAA-11D106E14583@du.edu> Message-ID: Hi GOrdan, Thanks for your reply. I had thought about DRBd, but would rather not mirror over network links, when I can mirror over the SAN. I am also not sure how DRBD would handle our running a Postgresql database on top of it, having never used DRBD in production. Do you have any experience with databases on DRBD? Thanks, Ivan. On Jul 27, 2011, at 2:05 AM, Gordan Bobic wrote: > I don't think this has anything to do with clustering, but what you are > probably looking for is DRBD: > http://www.drbd.org/ > > Gordan > > On Tue, 26 Jul 2011 23:47:03 -0600, Ivan Fetch wrote: >> Hello, >> >> I am not finding a lot of docs or commentary about accomplishing >> RAID1 of two shared storage (FC SAN) LUNs, in RHCS. I wlikd like to >> RAID1 two mpath devices (using the device multipathing with RHEL5). >> The mpath devices (mpath0, mpath1) do not match up on all nodes, but >> I >> don't believe this technically matters, since md uses UUIDs to >> detects >> it's disk devices. >> >> I could just configure an md device, but it seems like there should >> be a notion of which node owns the md. WHat happens when the active >> node is resyncing a mirror, and that node dies or gets fenced? WHat >> happens if someone tries to operate on the md from the inactive node? >> >> I would very much appreciate hearing from those who are accomplishing >> this, how they did it, gotchas and lessons learned. >> >> Thanks, >> >> - Ivan >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> . >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster . From gordan at bobich.net Wed Jul 27 15:39:58 2011 From: gordan at bobich.net (Gordan Bobic) Date: Wed, 27 Jul 2011 16:39:58 +0100 Subject: [Linux-cluster] RAID1 in RHCS / RHEL5 In-Reply-To: References: "<33A35733-136D-41A6-9EAA-11D106E14583@du.edu>" Message-ID: On Wed, 27 Jul 2011 09:14:08 -0600, Ivan Fetch wrote: > I had thought about DRBd, but would rather not mirror over network > links, when I can mirror over the SAN. Then I guess that is something you need to bring up with your SAN vendor. > I am also not sure how DRBD > would handle our running a Postgresql database on top of it, having > never used DRBD in production. Do you have any experience with > databases on DRBD? Yes, I have been using DRBD for backing databases in fail-over scenarios for years. Provided that your fencing works properely and that you don't try to mount the device on both sides at the same time, it'll work fine. 
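To make that concrete, a minimal sketch of the kind of two-node DRBD resource being described; hostnames, backing devices and addresses are placeholders rather than a tested configuration:

    resource r0 {
        protocol C;
        on nodea.example.com {
            device    /dev/drbd0;
            disk      /dev/mapper/mpath0;
            address   192.168.10.1:7788;
            meta-disk internal;
        }
        on nodeb.example.com {
            device    /dev/drbd0;
            disk      /dev/mapper/mpath1;
            address   192.168.10.2:7788;
            meta-disk internal;
        }
    }

Only the node that is currently Primary mounts /dev/drbd0 and runs PostgreSQL on it; the cluster's fencing is what guarantees the Secondary never mounts it at the same time, which is the point made above.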
Gordan From laszlo.budai at gmail.com Thu Jul 28 00:19:38 2011 From: laszlo.budai at gmail.com (Budai Laszlo) Date: Thu, 28 Jul 2011 03:19:38 +0300 Subject: [Linux-cluster] service startup order Message-ID: <4E30AB1A.1080102@gmail.com> Hello everybody, I would like to know how the Red Hat cluster starts up services in RHEL 4.5. I'm curious about the ordering of services. Does the cluster starts the services in the order as they appears in the service section of the cluster.conf? Is it starting one service at a time, or it starts the services in parallel? rgmanager-1.9.*68-1* Thank you, Laszlo -------------- next part -------------- An HTML attachment was scrubbed... URL: From linux at alteeve.com Thu Jul 28 00:50:38 2011 From: linux at alteeve.com (Digimer) Date: Wed, 27 Jul 2011 20:50:38 -0400 Subject: [Linux-cluster] service startup order In-Reply-To: <4E30AB1A.1080102@gmail.com> References: <4E30AB1A.1080102@gmail.com> Message-ID: <4E30B25E.5010302@alteeve.com> On 07/27/2011 08:19 PM, Budai Laszlo wrote: > Hello everybody, > > I would like to know how the Red Hat cluster starts up services in RHEL 4.5. > I'm curious about the ordering of services. Does the cluster starts the > services in the order as they appears in the service section of the > cluster.conf? > Is it starting one service at a time, or it starts the services in parallel? > > rgmanager-1.9.*68-1* > > Thank you, > Laszlo I'm not familiar with the intricacies of RHCS/rgmanager on EL4, but I suspect the rgmanager start order is the same. Parallel services will be started simultaneously. Services configured as service trees will start in the order that they are defined (and stopped in reverse order). This covers the start order well: - https://fedorahosted.org/cluster/wiki/ResourceTrees -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "At what point did we forget that the Space Shuttle was, essentially, a program that strapped human beings to an explosion and tried to stab through the sky with fire and math?" From Ralph.Grothe at itdz-berlin.de Thu Jul 28 06:24:48 2011 From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de) Date: Thu, 28 Jul 2011 08:24:48 +0200 Subject: [Linux-cluster] service startup order In-Reply-To: <4E30B25E.5010302@alteeve.com> References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com> Message-ID: Hi Digimer, hi Lazlo, sorry, for intruding your thread but this is something that I am also interested in and which I haven't fully fathomed yet. > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer > Sent: Thursday, July 28, 2011 2:51 AM > To: linux clustering > Subject: Re: [Linux-cluster] service startup order > > Parallel services will be started simultaneously. Services > configured as > service trees will start in the order that they are defined > (and stopped > in reverse order). > > This covers the start order well: > - https://fedorahosted.org/cluster/wiki/ResourceTrees > The referred to wiki article only seems to treat starting/stopping order and hierarchy (parent-child vs. sibling) of resources within one service aka resource group. That sounds pretty clear. But what about ordering and possible dependencies between separate services? You mentioned service trees. You didn't actually mean resource trees? 
If however your wording was deliberate (what I assume) I wonder if one can nest service tag blocks as one can nest resource tags within a single service block to express dependencies or hierarchy and hence starting order between and of services? Because all the sample configuration snippets I have seen so far in various docs lack such nesting of services. The reason that interests me is because I have such a case where a customer requires such a dependency between two distinct services that during normal operation (i.e. no node has left the cluster) are hosted on different nodes. I told them, from what I have perceived of HA clustering and RHCS in particular so far, that if they wish to express such an interdependency that they would have to put all resources which are now split up in two services, in a nested manner that would map the intended hierarchy, in a single service. Because they insisted on their layout I searched a little and discovered the, in the official RHCS Admin doc not mentioned, service tag attributes "depend" and "depend_mode". However, their usage at first seemed pretty useless because the clusterware seemed to completely ignore them and start/stop services in sometimes unpredictable ways and even restarted them at random. Until I, more by accident, discovered that additionally the "rm" tag's attribute "central_processing" needed to be defined and assigned to "1" or "true" for this feature to work approximately. I say apprimately here because we still have issues with this cluster that require futher testing. I hardly dare mentioning, that unfortunately this system already went into production, now of course lacking any HA, why we had to defer further testing. Regards Ralph From laszlo.budai at gmail.com Thu Jul 28 07:06:22 2011 From: laszlo.budai at gmail.com (Budai Laszlo) Date: Thu, 28 Jul 2011 10:06:22 +0300 Subject: [Linux-cluster] service startup order In-Reply-To: References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com> Message-ID: <4E310A6E.1090206@gmail.com> Hello everybody, @Digimer: thank you for that link. I was aware of it, and I knew about Resource trees in a service. But as Ralph has mentioned I was talking about services not resources. For instance the Solaris cluster allows one to define resource dependencies between resources even if they are members of different resource groups (a.k.a. services), and also allows for specifying resource groups dependencies. But as far as I know Red Hat Cluster Suite does not provides these features, or those are not documented enough (if at all). In RHEL6 there is an other resource group manager: Pacemaker which has a richer portfolio of dependencies, but right now it is still in the unsupported technology preview phase. Anyway it is not my case with RHEL 4.5 :( @Ralph: could you please provide me some references where have you found those attributes of the service tag (depend, depend_mode) and for the RM tag (central_processing)? Thank you. Kind regards, Laszlo On 07/28/2011 09:24 AM, Ralph.Grothe at itdz-berlin.de wrote: > Hi Digimer, hi Lazlo, > > sorry, for intruding your thread but this is something that I am > also interested in and which I haven't fully fathomed yet. > >> -----Original Message----- >> From: linux-cluster-bounces at redhat.com >> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer >> Sent: Thursday, July 28, 2011 2:51 AM >> To: linux clustering >> Subject: Re: [Linux-cluster] service startup order >> >> Parallel services will be started simultaneously. 
Services >> configured as >> service trees will start in the order that they are defined >> (and stopped >> in reverse order). >> >> This covers the start order well: >> - https://fedorahosted.org/cluster/wiki/ResourceTrees >> > The referred to wiki article only seems to treat > starting/stopping order and hierarchy (parent-child vs. sibling) > of resources within one service aka resource group. > That sounds pretty clear. > But what about ordering and possible dependencies between > separate services? > > You mentioned service trees. You didn't actually mean resource > trees? > If however your wording was deliberate (what I assume) I wonder > if one can nest service tag blocks as one can nest resource tags > within a single service block to express dependencies or > hierarchy and hence starting order between and of services? > Because all the sample configuration snippets I have seen so far > in various docs lack such nesting of services. > > The reason that interests me is because I have such a case where > a customer requires such a dependency between two distinct > services that during normal operation (i.e. no node has left the > cluster) are hosted on different nodes. > I told them, from what I have perceived of HA clustering and RHCS > in particular so far, that if they wish to express such an > interdependency that they would have to put all resources which > are now split up in two services, in a nested manner that would > map the intended hierarchy, in a single service. > > Because they insisted on their layout I searched a little and > discovered the, in the official RHCS Admin doc not mentioned, > service tag attributes "depend" and "depend_mode". > > However, their usage at first seemed pretty useless because the > clusterware seemed to completely ignore them and start/stop > services in sometimes unpredictable ways and even restarted them > at random. > Until I, more by accident, discovered that additionally the "rm" > tag's attribute "central_processing" needed to be defined and > assigned to "1" or "true" for this feature to work approximately. > I say apprimately here because we still have issues with this > cluster that require futher testing. > I hardly dare mentioning, that unfortunately this system already > went into production, now of course lacking any HA, > why we had to defer further testing. > > > Regards > Ralph > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From Ralph.Grothe at itdz-berlin.de Thu Jul 28 07:34:50 2011 From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de) Date: Thu, 28 Jul 2011 09:34:50 +0200 Subject: [Linux-cluster] service startup order In-Reply-To: <4E310A6E.1090206@gmail.com> References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com> <4E310A6E.1090206@gmail.com> Message-ID: > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Budai Laszlo > Sent: Thursday, July 28, 2011 9:06 AM > To: linux-cluster at redhat.com > Subject: Re: [Linux-cluster] service startup order > > @Ralph: could you please provide me some references where > have you found > those attributes of the service tag (depend, depend_mode) and > for the RM > tag (central_processing)? Thank you. 
Even Digimer is metioning this attribute in her excellent wiki http://wiki.alteeve.com/index.php/RHCS_v2_cluster.conf#central_pr ocessing If you have a login account at RHN you may find this article in their knowledge base https://access.redhat.com/kb/docs/DOC-26981 Apart from that the only text where this attrib was used that I have come across so far was an RH doc treating deployment of SAP on RHCS But it only appears there in a sample config snippet without further explanation. I guess that they needed to activate it because they were using the RHCS event processing interface? http://www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf > > Kind regards, > Laszlo > > > On 07/28/2011 09:24 AM, Ralph.Grothe at itdz-berlin.de wrote: > > Hi Digimer, hi Lazlo, > > > > sorry, for intruding your thread but this is something that I am > > also interested in and which I haven't fully fathomed yet. > > > >> -----Original Message----- > >> From: linux-cluster-bounces at redhat.com > >> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer > >> Sent: Thursday, July 28, 2011 2:51 AM > >> To: linux clustering > >> Subject: Re: [Linux-cluster] service startup order > >> > >> Parallel services will be started simultaneously. Services > >> configured as > >> service trees will start in the order that they are defined > >> (and stopped > >> in reverse order). > >> > >> This covers the start order well: > >> - https://fedorahosted.org/cluster/wiki/ResourceTrees > >> > > The referred to wiki article only seems to treat > > starting/stopping order and hierarchy (parent-child vs. sibling) > > of resources within one service aka resource group. > > That sounds pretty clear. > > But what about ordering and possible dependencies between > > separate services? > > > > You mentioned service trees. You didn't actually mean resource > > trees? > > If however your wording was deliberate (what I assume) I wonder > > if one can nest service tag blocks as one can nest resource tags > > within a single service block to express dependencies or > > hierarchy and hence starting order between and of services? > > Because all the sample configuration snippets I have seen so far > > in various docs lack such nesting of services. > > > > The reason that interests me is because I have such a case where > > a customer requires such a dependency between two distinct > > services that during normal operation (i.e. no node has left the > > cluster) are hosted on different nodes. > > I told them, from what I have perceived of HA clustering and RHCS > > in particular so far, that if they wish to express such an > > interdependency that they would have to put all resources which > > are now split up in two services, in a nested manner that would > > map the intended hierarchy, in a single service. > > > > Because they insisted on their layout I searched a little and > > discovered the, in the official RHCS Admin doc not mentioned, > > service tag attributes "depend" and "depend_mode". > > > > However, their usage at first seemed pretty useless because the > > clusterware seemed to completely ignore them and start/stop > > services in sometimes unpredictable ways and even restarted them > > at random. > > Until I, more by accident, discovered that additionally the "rm" > > tag's attribute "central_processing" needed to be defined and > > assigned to "1" or "true" for this feature to work approximately. > > I say apprimately here because we still have issues with this > > cluster that require futher testing. 
> > I hardly dare mentioning, that unfortunately this system already > > went into production, now of course lacking any HA, > > why we had to defer further testing. > > > > > > Regards > > Ralph > > > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From laszlo.budai at gmail.com Thu Jul 28 08:06:55 2011 From: laszlo.budai at gmail.com (Budai Laszlo) Date: Thu, 28 Jul 2011 11:06:55 +0300 Subject: [Linux-cluster] service startup order In-Reply-To: References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com> <4E310A6E.1090206@gmail.com> Message-ID: <4E31189F.3020805@gmail.com> Hi Ralph, thank you for your quick answer. That Knowledge base article indeed presents that dependencies possibility. Unfortunately I do not have the required version of rgmanager :( kind regards, Laszlo On 07/28/2011 10:34 AM, Ralph.Grothe at itdz-berlin.de wrote: > > >> -----Original Message----- >> From: linux-cluster-bounces at redhat.com >> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Budai > Laszlo >> Sent: Thursday, July 28, 2011 9:06 AM >> To: linux-cluster at redhat.com >> Subject: Re: [Linux-cluster] service startup order >> >> @Ralph: could you please provide me some references where >> have you found >> those attributes of the service tag (depend, depend_mode) and >> for the RM >> tag (central_processing)? Thank you. > Even Digimer is metioning this attribute in her excellent wiki > > http://wiki.alteeve.com/index.php/RHCS_v2_cluster.conf#central_pr > ocessing > > > If you have a login account at RHN you may find this article in > their knowledge base > > https://access.redhat.com/kb/docs/DOC-26981 > > > Apart from that the only text where this attrib was used that I > have come across so far was an RH doc treating deployment of SAP > on RHCS > But it only appears there in a sample config snippet without > further explanation. > I guess that they needed to activate it because they were using > the RHCS event processing interface? > > http://www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf > >> Kind regards, >> Laszlo >> >> >> On 07/28/2011 09:24 AM, Ralph.Grothe at itdz-berlin.de wrote: >>> Hi Digimer, hi Lazlo, >>> >>> sorry, for intruding your thread but this is something that I > am >>> also interested in and which I haven't fully fathomed yet. >>> >>>> -----Original Message----- >>>> From: linux-cluster-bounces at redhat.com >>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of > Digimer >>>> Sent: Thursday, July 28, 2011 2:51 AM >>>> To: linux clustering >>>> Subject: Re: [Linux-cluster] service startup order >>>> >>>> Parallel services will be started simultaneously. Services >>>> configured as >>>> service trees will start in the order that they are defined >>>> (and stopped >>>> in reverse order). >>>> >>>> This covers the start order well: >>>> - https://fedorahosted.org/cluster/wiki/ResourceTrees >>>> >>> The referred to wiki article only seems to treat >>> starting/stopping order and hierarchy (parent-child vs. > sibling) >>> of resources within one service aka resource group. >>> That sounds pretty clear. >>> But what about ordering and possible dependencies between >>> separate services? >>> >>> You mentioned service trees. You didn't actually mean > resource >>> trees? 
>>> If however your wording was deliberate (what I assume) I > wonder >>> if one can nest service tag blocks as one can nest resource > tags >>> within a single service block to express dependencies or >>> hierarchy and hence starting order between and of services? >>> Because all the sample configuration snippets I have seen so > far >>> in various docs lack such nesting of services. >>> >>> The reason that interests me is because I have such a case > where >>> a customer requires such a dependency between two distinct >>> services that during normal operation (i.e. no node has left > the >>> cluster) are hosted on different nodes. >>> I told them, from what I have perceived of HA clustering and > RHCS >>> in particular so far, that if they wish to express such an >>> interdependency that they would have to put all resources > which >>> are now split up in two services, in a nested manner that > would >>> map the intended hierarchy, in a single service. >>> >>> Because they insisted on their layout I searched a little and >>> discovered the, in the official RHCS Admin doc not mentioned, >>> service tag attributes "depend" and "depend_mode". >>> >>> However, their usage at first seemed pretty useless because > the >>> clusterware seemed to completely ignore them and start/stop >>> services in sometimes unpredictable ways and even restarted > them >>> at random. >>> Until I, more by accident, discovered that additionally the > "rm" >>> tag's attribute "central_processing" needed to be defined and >>> assigned to "1" or "true" for this feature to work > approximately. >>> I say apprimately here because we still have issues with this >>> cluster that require futher testing. >>> I hardly dare mentioning, that unfortunately this system > already >>> went into production, now of course lacking any HA, >>> why we had to defer further testing. >>> >>> >>> Regards >>> Ralph >>> >>> >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From Ralph.Grothe at itdz-berlin.de Thu Jul 28 09:01:44 2011 From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de) Date: Thu, 28 Jul 2011 11:01:44 +0200 Subject: [Linux-cluster] service startup order In-Reply-To: <4E31189F.3020805@gmail.com> References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com> <4E310A6E.1090206@gmail.com> <4E31189F.3020805@gmail.com> Message-ID: > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Budai Laszlo > Sent: Thursday, July 28, 2011 10:07 AM > To: linux-cluster at redhat.com > Subject: Re: [Linux-cluster] service startup order > > Hi Ralph, > > thank you for your quick answer. > That Knowledge base article indeed presents that dependencies > possibility. Unfortunately I do not have the required version of > rgmanager :( > Lazlo, As you wrote, in RHEL 6.X as a tech preview one now has the choice to install pacemaker and corosync which however isn't yet officially supported by RH. But it looks that the RHCS will be moving to Pacemaker in forthcoming releases. With Pacemaker, as you remarked, one can configure dependencies between services. I'm in a similar dilemma like you. 
Because we have support contracts with RH their current RHCS under RHEL 5.6 is what we "sell" as a service to our customers. So, how much I would like to shift to Pacemaker, I am tied to the RHCS version that we have support for. Good Luck Ralph From linux at alteeve.com Thu Jul 28 11:36:45 2011 From: linux at alteeve.com (Digimer) Date: Thu, 28 Jul 2011 07:36:45 -0400 Subject: [Linux-cluster] service startup order In-Reply-To: References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com> Message-ID: <4E3149CD.4010006@alteeve.com> On 07/28/2011 02:24 AM, Ralph.Grothe at itdz-berlin.de wrote: > Hi Digimer, hi Lazlo, > > sorry, for intruding your thread but this is something that I am > also interested in and which I haven't fully fathomed yet. > >> -----Original Message----- >> From: linux-cluster-bounces at redhat.com >> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer >> Sent: Thursday, July 28, 2011 2:51 AM >> To: linux clustering >> Subject: Re: [Linux-cluster] service startup order >> >> Parallel services will be started simultaneously. Services >> configured as >> service trees will start in the order that they are defined >> (and stopped >> in reverse order). >> >> This covers the start order well: >> - https://fedorahosted.org/cluster/wiki/ResourceTrees >> > > The referred to wiki article only seems to treat > starting/stopping order and hierarchy (parent-child vs. sibling) > of resources within one service aka resource group. > That sounds pretty clear. > But what about ordering and possible dependencies between > separate services? > > You mentioned service trees. You didn't actually mean resource > trees? > If however your wording was deliberate (what I assume) I wonder > if one can nest service tag blocks as one can nest resource tags > within a single service block to express dependencies or > hierarchy and hence starting order between and of services? > Because all the sample configuration snippets I have seen so far > in various docs lack such nesting of services. > > The reason that interests me is because I have such a case where > a customer requires such a dependency between two distinct > services that during normal operation (i.e. no node has left the > cluster) are hosted on different nodes. > I told them, from what I have perceived of HA clustering and RHCS > in particular so far, that if they wish to express such an > interdependency that they would have to put all resources which > are now split up in two services, in a nested manner that would > map the intended hierarchy, in a single service. > > Because they insisted on their layout I searched a little and > discovered the, in the official RHCS Admin doc not mentioned, > service tag attributes "depend" and "depend_mode". > > However, their usage at first seemed pretty useless because the > clusterware seemed to completely ignore them and start/stop > services in sometimes unpredictable ways and even restarted them > at random. > Until I, more by accident, discovered that additionally the "rm" > tag's attribute "central_processing" needed to be defined and > assigned to "1" or "true" for this feature to work approximately. > I say apprimately here because we still have issues with this > cluster that require futher testing. > I hardly dare mentioning, that unfortunately this system already > went into production, now of course lacking any HA, > why we had to defer further testing. > > > Regards > Ralph I am late in returning to the thread, my apologies. 
:) I did not choose my words carefully, and I did mean resources, not services. When I used services, I was thinking about ordered daemon starting (that is, a resource group of system services/init.d scripts). As far as I understand, and as has been mentioned already further down this thread, rgmanager is limited in it's ability for complex dependency checking. That is why Pacemaker is so attractive and why it will eventually replace rgmanager as the primary cluster manager eventually (EL7?). -- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "At what point did we forget that the Space Shuttle was, essentially, a program that strapped human beings to an explosion and tried to stab through the sky with fire and math?" From linux at alteeve.com Thu Jul 28 11:41:08 2011 From: linux at alteeve.com (Digimer) Date: Thu, 28 Jul 2011 07:41:08 -0400 Subject: [Linux-cluster] service startup order In-Reply-To: References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com> <4E310A6E.1090206@gmail.com> Message-ID: <4E314AD4.5010809@alteeve.com> On 07/28/2011 03:34 AM, Ralph.Grothe at itdz-berlin.de wrote: > > >> -----Original Message----- >> From: linux-cluster-bounces at redhat.com >> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Budai > Laszlo >> Sent: Thursday, July 28, 2011 9:06 AM >> To: linux-cluster at redhat.com >> Subject: Re: [Linux-cluster] service startup order >> >> @Ralph: could you please provide me some references where >> have you found >> those attributes of the service tag (depend, depend_mode) and >> for the RM >> tag (central_processing)? Thank you. > > Even Digimer is metioning this attribute in her excellent wiki > > http://wiki.alteeve.com/index.php/RHCS_v2_cluster.conf#central_pr > ocessing > > > If you have a login account at RHN you may find this article in > their knowledge base > > https://access.redhat.com/kb/docs/DOC-26981 > > > Apart from that the only text where this attrib was used that I > have come across so far was an RH doc treating deployment of SAP > on RHCS > But it only appears there in a sample config snippet without > further explanation. > I guess that they needed to activate it because they were using > the RHCS event processing interface? > > http://www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf Eek, that cluster.conf article is in a poor shape. As it stands now, it is little more that a wiki'fied dump of the cluster.ng xmllint file. The lack of documentation of many options is why there are so many /no info/ entries. I had planned to bug the devs to get better definitions of the options, and may still do that someday. However, if working on that article, I came to realize that Pacemaker is the future of resource management, so for now, I'm focusing on learning Pacemaker. As a general note of caution; Though the extra arguments may exist, unless you find documentation from Red Hat directly on their use, I'd hesitate to use the options in production. I am unclear on Red Hat's policy towards maintaining the functionality of those undocumented/minimally documented attributes. If you have a Red Hat contract, I'd strongly urge you to speak to your rep about using those attributes. 
-- Digimer E-Mail: digimer at alteeve.com Freenode handle: digimer Papers and Projects: http://alteeve.com Node Assassin: http://nodeassassin.org "At what point did we forget that the Space Shuttle was, essentially, a program that strapped human beings to an explosion and tried to stab through the sky with fire and math?" From cos at aaaaa.org Thu Jul 28 21:39:24 2011 From: cos at aaaaa.org (Ofer Inbar) Date: Thu, 28 Jul 2011 17:39:24 -0400 Subject: [Linux-cluster] RHCS resource agent: status interval vs. monitor interval Message-ID: <20110728213924.GD341@mip.aaaaa.org> In the section of a RHCS resource agent's meta-data, there are nodes for both action name="status" and action name="monitor". Both of them have an interval and a timeout. For example, in ip.sh: I assume that one of them controls how often rgmanager runs the resource agent to check the resource status, but which one, and what's the point of the other one? I tried to find the answer in: https://fedorahosted.org/cluster/wiki/ResourceActions http://www.opencf.org/cgi-bin/viewcvs.cgi/*checkout*/specs/ra/resource-agent-api.txt?rev=1.10 Neither of them explain why there are separate "status" and "monitor" actions. -- Cos From Ralph.Grothe at itdz-berlin.de Fri Jul 29 05:57:01 2011 From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de) Date: Fri, 29 Jul 2011 07:57:01 +0200 Subject: [Linux-cluster] RHCS resource agent: status interval vs. monitorinterval In-Reply-To: <20110728213924.GD341@mip.aaaaa.org> References: <20110728213924.GD341@mip.aaaaa.org> Message-ID: I'm not sure and may be wrong. But to my understanding the "monitor" action adheres to the OCF RA API http://www.linux-ha.org/doc/dev-guides/_resource_agent_actions.ht ml while the status action seems to be purely RHCS specific. I assume they chose "status" to stay in tradition with the LSB init scripts' invocation parameters. So on an RHCS cluster "status" should apply. > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Ofer Inbar > Sent: Thursday, July 28, 2011 11:39 PM > To: linux-cluster at redhat.com > Subject: [Linux-cluster] RHCS resource agent: status interval > vs. monitorinterval > > In the section of a RHCS resource agent's meta-data, > there are nodes for both action name="status" and action > name="monitor". > Both of them have an interval and a timeout. For example, in ip.sh: > > > > > > > > > > I assume that one of them controls how often rgmanager runs the > resource agent to check the resource status, but which one, and > what's the point of the other one? > > I tried to find the answer in: > https://fedorahosted.org/cluster/wiki/ResourceActions > > http://www.opencf.org/cgi-bin/viewcvs.cgi/*checkout*/specs/ra/ resource-agent-api.txt?rev=1.10 > > Neither of them explain why there are separate "status" and > "monitor" actions. 
> -- Cos > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From jordir at fib.upc.edu Fri Jul 29 10:10:05 2011 From: jordir at fib.upc.edu (Jordi Renye) Date: Fri, 29 Jul 2011 12:10:05 +0200 Subject: [Linux-cluster] samba share partition of nfs mounted Message-ID: <4E3286FD.3090602@fib.upc.edu> Hi, Due to performance problems with GFS2 in two cluster node, we would like change directions to next architecture: - node A: partition EXT3 (before GFS2) - share partition directly via samba to pc clients - node B: mount partition of node A via NFS - share this nfs mounted partition through samba to pc clients We have 300 clients of samba (between windows and linux) that mount remote partition using it as HOME directory. We are making load balancing, between windows and linux clients: windows mount from node A, and linux from node B. Problems with GFS2 were: - long time to take backup. - problems with some applications as Eclipse. - with this two problems, we have not yet interactive tests with samba. We would like to think that EXT3 resolve this problems. Do you see, any problems with this configuration ? First thing I see, it's difficult to add high availability in case Node A going down. Thanks in advanced, Jordi Renye UPC