From yamato at redhat.com  Fri Jul  1 02:56:13 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Fri, 01 Jul 2011 11:56:13 +0900 (JST)
Subject: [Linux-cluster] [PATCH]
 /config/dlm/<cluster>/comms/<comm>/addr_list
In-Reply-To: <20110630213414.GC16480@redhat.com>
References: <20110609140546.GA30732@redhat.com>
	<20110630.213710.655406242398069789.yamato@redhat.com>
	<20110630213414.GC16480@redhat.com>
Message-ID: <20110701.115613.319846100234085929.yamato@redhat.com>

On Thu, 30 Jun 2011 17:34:14 -0400, David Teigland <teigland at redhat.com> wrote:
> On Thu, Jun 30, 2011 at 09:37:10PM +0900, Masatake YAMATO wrote:
>> Added addr_list. Could you try my patch?
>> 
>> Signed-off-by: Masatake YAMATO <yamato at redhat.com>
> 
> Thanks, it looks good, I'll push it to the next branch.  Do you use this
> mainly for debugging?  or is there some other reason that I should note in
> the commit message?
> Dave
> 

For understanding dlm and for debugging my cluster.conf:)

Masatake YAMATO



From yamato at redhat.com  Fri Jul  1 07:26:58 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Fri, 01 Jul 2011 16:26:58 +0900 (JST)
Subject: [Linux-cluster] [PATCH] dumping the unknown address when got a
 connect from non cluster node
Message-ID: <20110701.162658.1002310074831263303.yamato@redhat.com>

Another patch useful for debugging cluster.conf and network configuration.

This is useful when you build a cluster with nodes connected each others with
a software bridge(virbrN). If you install wrong iptabels configuration, dlm
cannot establish connections. You will just see 

       dlm: connect from non cluster node

in demsg. It is difficult to understand what happens quickly.
This patch dumps the address of the non cluster node.


Signed-off-by: Masatake YAMATO <yamato at redhat.com>
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index bffa1e7..90c1c2e 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -748,7 +748,12 @@ static int tcp_accept_from_sock(struct connection *con)
 	/* Get the new node's NODEID */
 	make_sockaddr(&peeraddr, 0, &len);
 	if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) {
+		int i;
+		unsigned char *b=(unsigned char *)&peeraddr;
 		log_print("connect from non cluster node");
+		for (i=0; i<sizeof(struct sockaddr_storage);i++)
+			printk("%02x ", b[i]);
+		printk("\n");
 		sock_release(newsock);
 		mutex_unlock(&con->sock_mutex);
 		return -1;



From yamato at redhat.com  Fri Jul  1 08:45:56 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Fri, 01 Jul 2011 17:45:56 +0900 (JST)
Subject: [Linux-cluster]  [PATCH] trivial fix
Message-ID: <20110701.174556.1041000521202229132.yamato@redhat.com>

Fix a typo.

Signed-off-by: Masatake YAMATO

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index bffa1e7..f0d4855 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -932,7 +932,7 @@ static void tcp_connect_to_sock(struct connection *con)
 	int one = 1;
 
 	if (con->nodeid == 0) {
-		log_print("attempt to connect sock 0 foiled");
+		log_print("attempt to connect sock 0 failed");
 		return;
 	}
 



From sklemer at gmail.com  Fri Jul  1 09:03:39 2011
From: sklemer at gmail.com (=?UTF-8?B?16nXnNeV150g16fXnNee16g=?=)
Date: Fri, 1 Jul 2011 12:03:39 +0300
Subject: [Linux-cluster] fence_ipmilan fails to reboot
In-Reply-To: <BANLkTims0ecxMwOuBifRrTxD6d5UJve0cg@mail.gmail.com>
References: <BANLkTims0ecxMwOuBifRrTxD6d5UJve0cg@mail.gmail.com>
Message-ID: <BANLkTi=V54gX-w5k=jy2zzck1HnNHwJZhQ@mail.gmail.com>

Hi.

I think you need to add the power_wait"10" & lanplus="1"

Try this line:

fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="xx.xx.xx.xx"
lanplus="1" login="xxxt" name="node1_ilo" passwd="yyy


Regards

Shalom.

On Thu, Jun 30, 2011 at 1:03 PM, Parvez Shaikh <parvez.h.shaikh at gmail.com>wrote:

> Hi all,
>
> I am on RHEL 5.5; and I have two rack mounted servers with IPMI configured.
>
> When I run command from the prompt to reboot the server through
> fence_ipmilan, it shutsdown the server fine but it fails to power it on
>
> # fence_ipmilan -a <IPMI IP Address> -l admin -p password -o reboot
>>
> Rebooting machine @ IPMI:<IPMI IP Address>...Failed
>>
>
> But I can power it on or power off just fine
>
>>
>> # fence_ipmilan -a <IPMI IP Address> -l admin -p password -o on
>>
> Powering on machine @ IPMI:<IPMI IP Address>...Done
>>
>
> Due to this my fencing is failing and failover is not happening.
>
> I have questions around this -
>
> 1. Can we provide action (off or reboot) in cluster.conf for ipmi lan
> fencing?
> 2. Is there anything wrong in my configuration? Cluster.conf file is pasted
> below
> 3. Is this a known issue which is fixed in newer versions
>
> Here is how my cluster.conf looks like -
>
> <?xml version="1.0"?>
> <cluster config_version="4" name="Cluster">
>  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>  <clusternodes>
>   <clusternode name="blade1.domain" nodeid="1" votes="1">
>    <fence>
>     <method name="1">
>      <device lanplus="" name="IPMI_1"/>
>     </method>
>    </fence>
>   </clusternode>
>   <clusternode name="blade2.domain" nodeid="2" votes="1">
>    <fence>
>     <method name="1">
>      <device lanplus="" name="IPMI_2"/>
>     </method>
>    </fence>
>   </clusternode>
>  </clusternodes>
>  <cman expected_votes="1" two_node="1"/>
>  <fencedevices>
>   <fencedevice agent="fence_ipmilan" auth="none" ipaddr="<IMPI 1 IP
> Address>" login="admin" name="IPMI_1" passwd="password"/>
>   <fencedevice agent="fence_ipmilan" auth="none" ipaddr="<IMPI 2 IP
> Address>" login="admin" name="IPMI_2" passwd="password"/>
>  </fencedevices>
>  <rm>
>   <failoverdomains>
>    <failoverdomain name="FailoveDomain" ordered="1" restricted="1">
>     <failoverdomainnode name="blade1.domain" priority="2"/>
>     <failoverdomainnode name="blade2.domain" priority="1"/>
>    </failoverdomain>
>   </failoverdomains>
>   <resources/>
>   <service autostart="1" name="service" recovery="relocate"/>
>  </rm>
> </cluster>
>
> Thanks,
> Parvez
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110701/58c9593e/attachment.htm>

From ccaulfie at redhat.com  Fri Jul  1 10:21:26 2011
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Fri, 01 Jul 2011 11:21:26 +0100
Subject: [Linux-cluster] [PATCH] trivial fix
In-Reply-To: <20110701.174556.1041000521202229132.yamato@redhat.com>
References: <20110701.174556.1041000521202229132.yamato@redhat.com>
Message-ID: <4E0D9FA6.4040001@redhat.com>

On 01/07/11 09:45, Masatake YAMATO wrote:
> Fix a typo.
>
> Signed-off-by: Masatake YAMATO
>
> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> index bffa1e7..f0d4855 100644
> --- a/fs/dlm/lowcomms.c
> +++ b/fs/dlm/lowcomms.c
> @@ -932,7 +932,7 @@ static void tcp_connect_to_sock(struct connection *con)
>   	int one = 1;
>
>   	if (con->nodeid == 0) {
> -		log_print("attempt to connect sock 0 foiled");
> +		log_print("attempt to connect sock 0 failed");
>   		return;
>   	}
>

That's not a typo. I did mean "foiled" and not "failed". Read the code 
and it will make sense ;-)

Chrissie



From parvez.h.shaikh at gmail.com  Fri Jul  1 10:24:54 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Fri, 1 Jul 2011 15:54:54 +0530
Subject: [Linux-cluster] fence_ipmilan fails to reboot - SOLVED
Message-ID: <BANLkTikQ-KwSttebMLmw7B_60AXnnwH1ew@mail.gmail.com>

Hi all,

Thanks for your responses, after providing auth=password; fencing succeeded

                <fencedevice agent="fence_ipmilan"
*auth="password"*ipaddr="IP" login="admin" name="IPMI_1"
passwd="password"/>

Thanks,
Parvez


On Fri, Jul 1, 2011 at 2:33 PM, ???? ???? <sklemer at gmail.com> wrote:

> Hi.
>
> I think you need to add the power_wait"10" & lanplus="1"
>
> Try this line:
>
> fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="xx.xx.xx.xx"
> lanplus="1" login="xxxt" name="node1_ilo" passwd="yyy
>
>
> Regards
>
> Shalom.
>
> On Thu, Jun 30, 2011 at 1:03 PM, Parvez Shaikh <parvez.h.shaikh at gmail.com>wrote:
>
>> Hi all,
>>
>> I am on RHEL 5.5; and I have two rack mounted servers with IPMI
>> configured.
>>
>> When I run command from the prompt to reboot the server through
>> fence_ipmilan, it shutsdown the server fine but it fails to power it on
>>
>> # fence_ipmilan -a <IPMI IP Address> -l admin -p password -o reboot
>>>
>> Rebooting machine @ IPMI:<IPMI IP Address>...Failed
>>>
>>
>> But I can power it on or power off just fine
>>
>>>
>>> # fence_ipmilan -a <IPMI IP Address> -l admin -p password -o on
>>>
>> Powering on machine @ IPMI:<IPMI IP Address>...Done
>>>
>>
>> Due to this my fencing is failing and failover is not happening.
>>
>> I have questions around this -
>>
>> 1. Can we provide action (off or reboot) in cluster.conf for ipmi lan
>> fencing?
>> 2. Is there anything wrong in my configuration? Cluster.conf file is
>> pasted below
>> 3. Is this a known issue which is fixed in newer versions
>>
>> Here is how my cluster.conf looks like -
>>
>> <?xml version="1.0"?>
>> <cluster config_version="4" name="Cluster">
>>  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>  <clusternodes>
>>   <clusternode name="blade1.domain" nodeid="1" votes="1">
>>    <fence>
>>     <method name="1">
>>      <device lanplus="" name="IPMI_1"/>
>>     </method>
>>    </fence>
>>   </clusternode>
>>   <clusternode name="blade2.domain" nodeid="2" votes="1">
>>    <fence>
>>     <method name="1">
>>      <device lanplus="" name="IPMI_2"/>
>>     </method>
>>    </fence>
>>   </clusternode>
>>  </clusternodes>
>>  <cman expected_votes="1" two_node="1"/>
>>  <fencedevices>
>>   <fencedevice agent="fence_ipmilan" auth="none" ipaddr="<IMPI 1 IP
>> Address>" login="admin" name="IPMI_1" passwd="password"/>
>>   <fencedevice agent="fence_ipmilan" auth="none" ipaddr="<IMPI 2 IP
>> Address>" login="admin" name="IPMI_2" passwd="password"/>
>>  </fencedevices>
>>  <rm>
>>   <failoverdomains>
>>    <failoverdomain name="FailoveDomain" ordered="1" restricted="1">
>>     <failoverdomainnode name="blade1.domain" priority="2"/>
>>     <failoverdomainnode name="blade2.domain" priority="1"/>
>>    </failoverdomain>
>>   </failoverdomains>
>>   <resources/>
>>   <service autostart="1" name="service" recovery="relocate"/>
>>  </rm>
>> </cluster>
>>
>> Thanks,
>> Parvez
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110701/9461a1a9/attachment.htm>

From teigland at redhat.com  Fri Jul  1 17:41:36 2011
From: teigland at redhat.com (David Teigland)
Date: Fri, 1 Jul 2011 13:41:36 -0400
Subject: [Linux-cluster] [PATCH] dumping the unknown address when got a
 connect from non cluster node
In-Reply-To: <20110701.162658.1002310074831263303.yamato@redhat.com>
References: <20110701.162658.1002310074831263303.yamato@redhat.com>
Message-ID: <20110701174136.GC23008@redhat.com>

On Fri, Jul 01, 2011 at 04:26:58PM +0900, Masatake YAMATO wrote:
> Another patch useful for debugging cluster.conf and network configuration.
> 
> This is useful when you build a cluster with nodes connected each others with
> a software bridge(virbrN). If you install wrong iptabels configuration, dlm
> cannot establish connections. You will just see 
> 
>        dlm: connect from non cluster node
> 
> in demsg. It is difficult to understand what happens quickly.
> This patch dumps the address of the non cluster node.
> 
> 
> Signed-off-by: Masatake YAMATO <yamato at redhat.com>
> 
> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> index bffa1e7..90c1c2e 100644
> --- a/fs/dlm/lowcomms.c
> +++ b/fs/dlm/lowcomms.c
> @@ -748,7 +748,12 @@ static int tcp_accept_from_sock(struct connection *con)
>  	/* Get the new node's NODEID */
>  	make_sockaddr(&peeraddr, 0, &len);
>  	if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) {
> +		int i;
> +		unsigned char *b=(unsigned char *)&peeraddr;
>  		log_print("connect from non cluster node");
> +		for (i=0; i<sizeof(struct sockaddr_storage);i++)
> +			printk("%02x ", b[i]);
> +		printk("\n");
>  		sock_release(newsock);
>  		mutex_unlock(&con->sock_mutex);
>  		return -1;

Could you use print_hex_dump_bytes instead?
Dave



From yamato at redhat.com  Mon Jul  4 03:11:12 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Mon, 04 Jul 2011 12:11:12 +0900 (JST)
Subject: [Linux-cluster] [PATCH] trivial fix
In-Reply-To: <4E0D9FA6.4040001@redhat.com>
References: <20110701.174556.1041000521202229132.yamato@redhat.com>
	<4E0D9FA6.4040001@redhat.com>
Message-ID: <20110704.121112.149598301835655010.yamato@redhat.com>

> On 01/07/11 09:45, Masatake YAMATO wrote:
>> Fix a typo.
>>
>> Signed-off-by: Masatake YAMATO
>>
>> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
>> index bffa1e7..f0d4855 100644
>> --- a/fs/dlm/lowcomms.c
>> +++ b/fs/dlm/lowcomms.c
>> @@ -932,7 +932,7 @@ static void tcp_connect_to_sock(struct connection
>> *con)
>>   	int one = 1;
>>
>>   	if (con->nodeid == 0) {
>> -		log_print("attempt to connect sock 0 foiled");
>> +		log_print("attempt to connect sock 0 failed");
>>   		return;
>>   	}
>>
> 
> That's not a typo. I did mean "foiled" and not "failed". Read the code
> and it will make sense ;-)

Oh, sorry.
 
> Chrissie
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From yamato at redhat.com  Mon Jul  4 03:25:51 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Mon, 04 Jul 2011 12:25:51 +0900 (JST)
Subject: [Linux-cluster] [PATCH V2] dumping the unknown address when got a
 connect from non cluster node
In-Reply-To: <20110701174136.GC23008@redhat.com>
References: <20110701.162658.1002310074831263303.yamato@redhat.com>
	<20110701174136.GC23008@redhat.com>
Message-ID: <20110704.122551.868642282278092140.yamato@redhat.com>

>> Another patch useful for debugging cluster.conf and network configuration.
>> 
>> This is useful when you build a cluster with nodes connected each others with
>> a software bridge(virbrN). If you install wrong iptabels configuration, dlm
>> cannot establish connections. You will just see 
>> 
>>        dlm: connect from non cluster node
>> 
>> in demsg. It is difficult to understand what happens quickly.
>> This patch dumps the address of the non cluster node.
>> 
>> 
>> Signed-off-by: Masatake YAMATO <yamato at redhat.com>
>> 
>> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
>> index bffa1e7..90c1c2e 100644
>> --- a/fs/dlm/lowcomms.c
>> +++ b/fs/dlm/lowcomms.c
>> @@ -748,7 +748,12 @@ static int tcp_accept_from_sock(struct connection *con)
>>  	/* Get the new node's NODEID */
>>  	make_sockaddr(&peeraddr, 0, &len);
>>  	if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) {
>> +		int i;
>> +		unsigned char *b=(unsigned char *)&peeraddr;
>>  		log_print("connect from non cluster node");
>> +		for (i=0; i<sizeof(struct sockaddr_storage);i++)
>> +			printk("%02x ", b[i]);
>> +		printk("\n");
>>  		sock_release(newsock);
>>  		mutex_unlock(&con->sock_mutex);
>>  		return -1;
> 
> Could you use print_hex_dump_bytes instead?
> Dave

Here is the revised version.


This patch is useful when you build a cluster with nodes connected
each others with a software bridge(virbrN). If you install wrong
iptabels configuration, dlm cannot establish connections. You will
just see

       dlm: connect from non cluster node

in demsg. It is difficult to understand what happens quickly.  This
patch dumps the address of the non cluster node with print_hex_dump_bytes
function:


       dlm: connect from non cluster node
       ss: 02 00 00 00 c0 a8 97 01 00 00 00 00 00 00 00 00  ................
       ....


Using print_hex_dump_bytes is sugested by David Teigland.


Signed-off-by: Masatake YAMATO <yamato at redhat.com>

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index bffa1e7..a762e9f 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -512,12 +512,10 @@ static void process_sctp_notification(struct connection *con,
 			}
 			make_sockaddr(&prim.ssp_addr, 0, &addr_len);
 			if (dlm_addr_to_nodeid(&prim.ssp_addr, &nodeid)) {
-				int i;
 				unsigned char *b=(unsigned char *)&prim.ssp_addr;
 				log_print("reject connect from unknown addr");
-				for (i=0; i<sizeof(struct sockaddr_storage);i++)
-					printk("%02x ", b[i]);
-				printk("\n");
+				print_hex_dump_bytes("ss: ", DUMP_PREFIX_NONE, 
+						     b, sizeof(struct sockaddr_storage));
 				sctp_send_shutdown(prim.ssp_assoc_id);
 				return;
 			}
@@ -748,7 +746,10 @@ static int tcp_accept_from_sock(struct connection *con)
 	/* Get the new node's NODEID */
 	make_sockaddr(&peeraddr, 0, &len);
 	if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) {
+		unsigned char *b=(unsigned char *)&peeraddr;
 		log_print("connect from non cluster node");
+		print_hex_dump_bytes("ss: ", DUMP_PREFIX_NONE, 
+				     b, sizeof(struct sockaddr_storage));
 		sock_release(newsock);
 		mutex_unlock(&con->sock_mutex);
 		return -1;



From parvez.h.shaikh at gmail.com  Tue Jul  5 11:32:34 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Tue, 5 Jul 2011 17:02:34 +0530
Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster
Message-ID: <CAKrd531m7E-o-cYLSquF9MeRcdygpxmNHY3ZNpJLKkfkwep3FA@mail.gmail.com>

Hi all,

I was trying to find out how much time does it take for RHCS to detect
failure and recover from it. I found the link -
http://www.redhat.com/whitepapers/rha/RHA_ClusterSuiteWPPDF.pdf

It says that network polling interval is 2 seconds and 6 retries are
attempted before declaring a node as failed. I want to know can we tune this
or configure it, say instead of 6 retries I want only 3 retries. Also
reducing network polling time from 2 seconds to say 1 second (can it be less
than 1 second, which I think would consume more CPU)?

Also I have a script resource and I see it invoked with status argument
after every 30 seconds, can we configure that as well?

Failover also involve fencing, any pointers on how can we control /
configure fencing time would also be useful,I use bladecenter fencing, IPMI
fencing as well as UCS fencing.

Thanks,
Parvez
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110705/68244a55/attachment.htm>

From ccaulfie at redhat.com  Tue Jul  5 12:28:40 2011
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 05 Jul 2011 13:28:40 +0100
Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster
In-Reply-To: <CAKrd531m7E-o-cYLSquF9MeRcdygpxmNHY3ZNpJLKkfkwep3FA@mail.gmail.com>
References: <CAKrd531m7E-o-cYLSquF9MeRcdygpxmNHY3ZNpJLKkfkwep3FA@mail.gmail.com>
Message-ID: <4E130378.1090408@redhat.com>

That's a *very* old document. it's from 2003 and refers to RHEL2.1 .. 
which I sincerely hope you weren't planning to implement.

Before you do anything more I recommend you read the documentation for 
the actual version of clustering you are going to install

https://access.redhat.com/knowledge/docs/Red_Hat_Enterprise_Linux/

Chrissie

On 05/07/11 12:32, Parvez Shaikh wrote:
> Hi all,
>
> I was trying to find out how much time does it take for RHCS to detect
> failure and recover from it. I found the link -
> http://www.redhat.com/whitepapers/rha/RHA_ClusterSuiteWPPDF.pdf
>
> It says that network polling interval is 2 seconds and 6 retries are
> attempted before declaring a node as failed. I want to know can we tune
> this or configure it, say instead of 6 retries I want only 3 retries.
> Also reducing network polling time from 2 seconds to say 1 second (can
> it be less than 1 second, which I think would consume more CPU)?
>
> Also I have a script resource and I see it invoked with status argument
> after every 30 seconds, can we configure that as well?
>
> Failover also involve fencing, any pointers on how can we control /
> configure fencing time would also be useful,I use bladecenter fencing,
> IPMI fencing as well as UCS fencing.
>
> Thanks,
> Parvez
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From parvez.h.shaikh at gmail.com  Tue Jul  5 12:43:02 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Tue, 5 Jul 2011 18:13:02 +0530
Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster
In-Reply-To: <4E130378.1090408@redhat.com>
References: <CAKrd531m7E-o-cYLSquF9MeRcdygpxmNHY3ZNpJLKkfkwep3FA@mail.gmail.com>
	<4E130378.1090408@redhat.com>
Message-ID: <CAKrd531n66Ovu26K5QWCadiT3W9Gz_JrTPnOb1BOK04gc8L-Tg@mail.gmail.com>

Hello Christine,

Thanks for the link enlisting various documents, I have RHC running over
RHEL 5.5 and has been working fine. However I would greatly appreciate, some
document or pointers which help me in estimate failover time or adjust it;
if that is possible.

I have been through Administration Guide and could not find how I can adjust
it.

Thanks,
Parvez

On Tue, Jul 5, 2011 at 5:58 PM, Christine Caulfield <ccaulfie at redhat.com>wrote:

> That's a *very* old document. it's from 2003 and refers to RHEL2.1 .. which
> I sincerely hope you weren't planning to implement.
>
> Before you do anything more I recommend you read the documentation for the
> actual version of clustering you are going to install
>
> https://access.redhat.com/**knowledge/docs/Red_Hat_**Enterprise_Linux/<https://access.redhat.com/knowledge/docs/Red_Hat_Enterprise_Linux/>
>
> Chrissie
>
>
> On 05/07/11 12:32, Parvez Shaikh wrote:
>
>> Hi all,
>>
>> I was trying to find out how much time does it take for RHCS to detect
>> failure and recover from it. I found the link -
>> http://www.redhat.com/**whitepapers/rha/RHA_**ClusterSuiteWPPDF.pdf<http://www.redhat.com/whitepapers/rha/RHA_ClusterSuiteWPPDF.pdf>
>>
>> It says that network polling interval is 2 seconds and 6 retries are
>> attempted before declaring a node as failed. I want to know can we tune
>> this or configure it, say instead of 6 retries I want only 3 retries.
>> Also reducing network polling time from 2 seconds to say 1 second (can
>> it be less than 1 second, which I think would consume more CPU)?
>>
>> Also I have a script resource and I see it invoked with status argument
>> after every 30 seconds, can we configure that as well?
>>
>> Failover also involve fencing, any pointers on how can we control /
>> configure fencing time would also be useful,I use bladecenter fencing,
>> IPMI fencing as well as UCS fencing.
>>
>> Thanks,
>> Parvez
>>
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/**mailman/listinfo/linux-cluster<https://www.redhat.com/mailman/listinfo/linux-cluster>
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/**mailman/listinfo/linux-cluster<https://www.redhat.com/mailman/listinfo/linux-cluster>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110705/30a916ce/attachment.htm>

From ccaulfie at redhat.com  Tue Jul  5 13:20:47 2011
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 05 Jul 2011 14:20:47 +0100
Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster
In-Reply-To: <CAKrd531n66Ovu26K5QWCadiT3W9Gz_JrTPnOb1BOK04gc8L-Tg@mail.gmail.com>
References: <CAKrd531m7E-o-cYLSquF9MeRcdygpxmNHY3ZNpJLKkfkwep3FA@mail.gmail.com>	<4E130378.1090408@redhat.com>
	<CAKrd531n66Ovu26K5QWCadiT3W9Gz_JrTPnOb1BOK04gc8L-Tg@mail.gmail.com>
Message-ID: <4E130FAF.1060600@redhat.com>

Hiya,

I don't have the URL to have but I'm pretty sure there's something in 
the Red Hat knowledge base about calculating failover times. You'll need 
to have paid support to get at it.

Failing that here's a document I wrote that talks about configuring the 
insider bits of openais and cman. "man 5 openais.conf" is also helpful.

I hope this helps :-)

Chrissie

On 05/07/11 13:43, Parvez Shaikh wrote:
> Hello Christine,
>
> Thanks for the link enlisting various documents, I have RHC running over
> RHEL 5.5 and has been working fine. However I would greatly appreciate,
> some document or pointers which help me in estimate failover time or
> adjust it; if that is possible.
>
> I have been through Administration Guide and could not find how I can
> adjust it.
>
> Thanks,
> Parvez
>
> On Tue, Jul 5, 2011 at 5:58 PM, Christine Caulfield <ccaulfie at redhat.com
> <mailto:ccaulfie at redhat.com>> wrote:
>
>     That's a *very* old document. it's from 2003 and refers to RHEL2.1
>     .. which I sincerely hope you weren't planning to implement.
>
>     Before you do anything more I recommend you read the documentation
>     for the actual version of clustering you are going to install
>
>     https://access.redhat.com/__knowledge/docs/Red_Hat___Enterprise_Linux/
>     <https://access.redhat.com/knowledge/docs/Red_Hat_Enterprise_Linux/>
>
>     Chrissie
>
>
>     On 05/07/11 12:32, Parvez Shaikh wrote:
>
>         Hi all,
>
>         I was trying to find out how much time does it take for RHCS to
>         detect
>         failure and recover from it. I found the link -
>         http://www.redhat.com/__whitepapers/rha/RHA___ClusterSuiteWPPDF.pdf
>         <http://www.redhat.com/whitepapers/rha/RHA_ClusterSuiteWPPDF.pdf>
>
>         It says that network polling interval is 2 seconds and 6 retries are
>         attempted before declaring a node as failed. I want to know can
>         we tune
>         this or configure it, say instead of 6 retries I want only 3
>         retries.
>         Also reducing network polling time from 2 seconds to say 1
>         second (can
>         it be less than 1 second, which I think would consume more CPU)?
>
>         Also I have a script resource and I see it invoked with status
>         argument
>         after every 30 seconds, can we configure that as well?
>
>         Failover also involve fencing, any pointers on how can we control /
>         configure fencing time would also be useful,I use bladecenter
>         fencing,
>         IPMI fencing as well as UCS fencing.
>
>         Thanks,
>         Parvez
>
>
>
>         --
>         Linux-cluster mailing list
>         Linux-cluster at redhat.com <mailto:Linux-cluster at redhat.com>
>         https://www.redhat.com/__mailman/listinfo/linux-cluster
>         <https://www.redhat.com/mailman/listinfo/linux-cluster>
>
>
>     --
>     Linux-cluster mailing list
>     Linux-cluster at redhat.com <mailto:Linux-cluster at redhat.com>
>     https://www.redhat.com/__mailman/listinfo/linux-cluster
>     <https://www.redhat.com/mailman/listinfo/linux-cluster>
>
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From ccaulfie at redhat.com  Tue Jul  5 15:10:56 2011
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 05 Jul 2011 16:10:56 +0100
Subject: [Linux-cluster] Configuring failover time with Red Hat Cluster
In-Reply-To: <4E130FAF.1060600@redhat.com>
References: <CAKrd531m7E-o-cYLSquF9MeRcdygpxmNHY3ZNpJLKkfkwep3FA@mail.gmail.com>	<4E130378.1090408@redhat.com>	<CAKrd531n66Ovu26K5QWCadiT3W9Gz_JrTPnOb1BOK04gc8L-Tg@mail.gmail.com>
	<4E130FAF.1060600@redhat.com>
Message-ID: <4E132980.2050409@redhat.com>

I forgot to paste the URL, sorry!

http://people.redhat.com/ccaulfie/docs/CmanYinYang.pdf

Chrissie

On 05/07/11 14:20, Christine Caulfield wrote:
> Hiya,
>
> I don't have the URL to have but I'm pretty sure there's something in
> the Red Hat knowledge base about calculating failover times. You'll need
> to have paid support to get at it.
>
> Failing that here's a document I wrote that talks about configuring the
> insider bits of openais and cman. "man 5 openais.conf" is also helpful.
>
> I hope this helps :-)
>
> Chrissie
>
> On 05/07/11 13:43, Parvez Shaikh wrote:
>> Hello Christine,
>>
>> Thanks for the link enlisting various documents, I have RHC running over
>> RHEL 5.5 and has been working fine. However I would greatly appreciate,
>> some document or pointers which help me in estimate failover time or
>> adjust it; if that is possible.
>>
>> I have been through Administration Guide and could not find how I can
>> adjust it.
>>
>> Thanks,
>> Parvez
>>
>> On Tue, Jul 5, 2011 at 5:58 PM, Christine Caulfield <ccaulfie at redhat.com
>> <mailto:ccaulfie at redhat.com>> wrote:
>>
>> That's a *very* old document. it's from 2003 and refers to RHEL2.1
>> .. which I sincerely hope you weren't planning to implement.
>>
>> Before you do anything more I recommend you read the documentation
>> for the actual version of clustering you are going to install
>>
>> https://access.redhat.com/__knowledge/docs/Red_Hat___Enterprise_Linux/
>> <https://access.redhat.com/knowledge/docs/Red_Hat_Enterprise_Linux/>
>>
>> Chrissie
>>
>>
>> On 05/07/11 12:32, Parvez Shaikh wrote:
>>
>> Hi all,
>>
>> I was trying to find out how much time does it take for RHCS to
>> detect
>> failure and recover from it. I found the link -
>> http://www.redhat.com/__whitepapers/rha/RHA___ClusterSuiteWPPDF.pdf
>> <http://www.redhat.com/whitepapers/rha/RHA_ClusterSuiteWPPDF.pdf>
>>
>> It says that network polling interval is 2 seconds and 6 retries are
>> attempted before declaring a node as failed. I want to know can
>> we tune
>> this or configure it, say instead of 6 retries I want only 3
>> retries.
>> Also reducing network polling time from 2 seconds to say 1
>> second (can
>> it be less than 1 second, which I think would consume more CPU)?
>>
>> Also I have a script resource and I see it invoked with status
>> argument
>> after every 30 seconds, can we configure that as well?
>>
>> Failover also involve fencing, any pointers on how can we control /
>> configure fencing time would also be useful,I use bladecenter
>> fencing,
>> IPMI fencing as well as UCS fencing.
>>
>> Thanks,
>> Parvez
>>
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com <mailto:Linux-cluster at redhat.com>
>> https://www.redhat.com/__mailman/listinfo/linux-cluster
>> <https://www.redhat.com/mailman/listinfo/linux-cluster>
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com <mailto:Linux-cluster at redhat.com>
>> https://www.redhat.com/__mailman/listinfo/linux-cluster
>> <https://www.redhat.com/mailman/listinfo/linux-cluster>
>>
>>
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From swap_project at yahoo.com  Wed Jul  6 03:21:02 2011
From: swap_project at yahoo.com (Srija)
Date: Tue, 5 Jul 2011 20:21:02 -0700 (PDT)
Subject: [Linux-cluster] Cluster node  issue
In-Reply-To: <BANLkTikQ-KwSttebMLmw7B_60AXnnwH1ew@mail.gmail.com>
Message-ID: <1309922462.81116.YahooMailClassic@web112808.mail.gq1.yahoo.com>

Hi,

We have 16 nodes cluster.  Recently facing issues with the nodes.  The problem is,

occassionaly find one of the nodes is not accessable through ssh. The node is up

and running, the zen guests on the nodes are also pingable .  But the node, and

the guests on the nodes are not able to accessable. Very recently it happened to 

one of the node again.

The  nodes are of rhel5.5, 

kernel 2.6.18-194.3.1.el5xen #1 SMP Sun May 2 04:26:43 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

when it happens that node becomes detached from the cluster.  

If anybody can give some hints  that will be really appreciated. Not sure it is the 

kernel but or not...Here is the few line of the log file, when it happened last time.

Thanks in advance..

____________________________________________________________

Jul  1 17:11:03 server crond[11715]: (root) CMD (python /usr/share/rhn/virtualization/poller.py)
Jul  1 17:11:03 server crond[11716]: (root) CMD (python /usr/share/rhn/virtualization/poller.py)
Jul  1 17:11:01 server crond[11685]: (root) error: Job execution of per-minute job scheduled for 17:10 delayed into subsequent minute 17:11. Skipping job run.
Jul  1 17:11:03 server crond[11685]: CRON (root) ERROR: cannot set security context
Jul  1 17:17:13 server xinetd[6778]: START: pblocald pid=11896 from=xxx.xx.222.4
Jul  1 17:21:01 server crond[11852]: (root) error: Job execution of per-minute job scheduled for 17:15 delayed into subsequent minute 17:21. Skipping job run.
Jul  1 17:21:01 server crond[11852]: CRON (root) ERROR: cannot set security context
Jul  1 17:21:05 server crond[12031]: (root) CMD (python /usr/share/rhn/virtualization/poller.py)
Jul  1 17:23:34 server INFO: task cmahealthd:7492 blocked for more than 120 seconds.
Jul  1 17:23:37 server "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul  1 17:23:37 server cmahealthd    D 0000000000000180     0  7492      1          7507  7430 (NOTLB)
Jul  1 17:23:37 server ffff880065f29b18  0000000000000282  0000000000000000  0000000000000000
Jul  1 17:23:37 server 0000000000000009  ffff880065c35040  ffff88007f6720c0  000000000001d982
Jul  1 17:23:37 server ffff880065c35228  ffff88007e16b400
Jul  1 17:23:37 server Call Trace:
Jul  1 17:23:37 server [<ffffffff80287795>] __wake_up_common+0x3e/0x68
Jul  1 17:23:37 server [<ffffffff802d81f3>] base_probe+0x0/0x36
Jul  1 17:23:37 server [<ffffffff80262fb3>] wait_for_completion+0x7d/0xaa
Jul  1 17:23:52 server [<ffffffff80288f86>] default_wake_function+0x0/0xe
Jul  1 17:23:52 server [<ffffffff80298e84>] call_usermodehelper_keys+0xe3/0xf8
Jul  1 17:23:52 server [<ffffffff80298e99>] __call_usermodehelper+0x0/0x4f
Jul  1 17:23:52 server [<ffffffff802071b2>] find_get_page+0x4d/0x55
Jul  1 17:23:52 server [<ffffffff80299275>] request_module+0x139/0x14d
Jul  1 17:23:52 server [<ffffffff8022cf67>] mntput_no_expire+0x19/0x89
Jul  1 17:23:52 server [<ffffffff8020edda>] link_path_walk+0xa6/0xb2
Jul  1 17:23:52 server [<ffffffff80263914>] mutex_lock+0xd/0x1d
Jul  1 17:23:52 server [<ffffffff802d8211>] base_probe+0x1e/0x36
Jul  1 17:23:52 server [<ffffffff803af5c9>] kobj_lookup+0x132/0x19b
Jul  1 17:31:24 server xinetd[6778]: START: pblocald pid=12151 from=xxx.xx.222.4
Jul  1 17:28:16 server openais[6172]: [TOTEM] entering GATHER state from 12.
Jul  1 17:28:41 server openais[6172]: [TOTEM] Creating commit token because I am the rep.
Jul  1 17:28:41 server openais[6172]: [TOTEM] Saving state aru 20a high seq received 20a
Jul  1 17:28:41 server openais[6172]: [TOTEM] Storing new sequence id for ring 2e90
Jul  1 17:28:49 server openais[6172]: [TOTEM] entering COMMIT state.
Jul  1 17:31:30 server openais[6172]: [TOTEM] Creating commit token because I am the rep.
Jul  1 17:31:30 server openais[6172]: [TOTEM] Storing new sequence id for ring 2e94
Jul  1 17:31:30 server openais[6172]: [TOTEM] entering COMMIT state.
Jul  1 17:31:30 server openais[6172]: [TOTEM] entering GATHER state from 13.
Jul  1 17:31:30 server openais[6172]: [TOTEM] Creating commit token because I am the rep.
Jul  1 17:33:30 server [<ffffffff8024b204>] chrdev_open+0x53/0x183
Jul  1 17:33:30 server [<ffffffff8024b1b1>] chrdev_open+0x0/0x183
Jul  1 17:33:30 server [<ffffffff8021edc8>] __dentry_open+0xd9/0x1dc
Jul  1 17:33:30 server [<ffffffff80227bca>] do_filp_open+0x2a/0x38
Jul  1 17:33:30 server [<ffffffff8021a270>] do_sys_open+0x44/0xbe
Jul  1 17:33:30 server [<ffffffff8026168d>] ia32_sysret+0x0/0x5
Jul  1 17:33:30 server
Jul  1 17:31:30 server openais[6172]: [TOTEM] Storing new sequence id for ring 2e98
Jul  1 17:31:30 server openais[6172]: [TOTEM] entering COMMIT state.
Jul  1 17:31:30 server openais[6172]: [TOTEM] entering RECOVERY state.
Jul  1 17:31:30 server openais[6172]: [TOTEM] position [0] member 192.168.xxx.9:
Jul  1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11916 rep 192.168.xxx.9
Jul  1 17:31:30 server openais[6172]: [TOTEM] aru 20a high delivered 20a received flag 1
Jul  1 17:31:30 server openais[6172]: [TOTEM] position [1] member 192.168.xxx.10:
Jul  1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10
Jul  1 17:31:30 server openais[6172]: [TOTEM] aru e2 high delivered e2 received flag 1
Jul  1 17:31:30 server openais[6172]: [TOTEM] position [2] member 192.168.xxx.11:
Jul  1 17:33:30 server INFO: task cmahealthd:7492 blocked for more than 120 seconds.
Jul  1 17:33:30 server "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul  1 17:33:30 server cmahealthd    D 0000000000000180     0  7492      1          7507  7430 (NOTLB)
Jul  1 17:33:30 server ffff880065f29b18  0000000000000282  0000000000000000  0000000000000000
Jul  1 17:33:30 server 0000000000000009  ffff880065c35040  ffff88007f6720c0  000000000001d982
Jul  1 17:33:30 server ffff880065c35228  ffff88007e16b400
Jul  1 17:33:30 server Call Trace:
Jul  1 17:33:30 server [<ffffffff80287795>] __wake_up_common+0x3e/0x68
Jul  1 17:33:30 server [<ffffffff802d81f3>] base_probe+0x0/0x36
Jul  1 17:33:30 server [<ffffffff80262fb3>] wait_for_completion+0x7d/0xaa
Jul  1 17:33:29 server dlm_controld[6272]: cluster is down, exiting
Jul  1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10
Jul  1 17:31:30 server openais[6172]: [TOTEM] aru e2 high delivered e2 received flag 1
Jul  1 17:31:30 server openais[6172]: [TOTEM] position [3] member 192.168.xxx.12:
Jul  1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10
Jul  1 17:31:30 server openais[6172]: [TOTEM] aru e2 high delivered e2 received flag 1
Jul  1 17:31:30 server openais[6172]: [TOTEM] position [4] member 192.168.xxx.13:
Jul  1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10
Jul  1 17:31:30 server openais[6172]: [TOTEM] aru e2 high delivered e2 received flag 1
Jul  1 17:31:30 server openais[6172]: [TOTEM] position [5] member 192.168.xxx.14:
Jul  1 17:31:30 server openais[6172]: [TOTEM] previous ring seq 11924 rep 192.168.xxx.10
Jul  1 17:33:30 server [<ffffffff80288f86>] default_wake_function+0x0/0xe
Jul  1 17:33:30 server [<ffffffff80298e84>] call_usermodehelper_keys+0xe3/0xf8
Jul  1 17:33:30 server [<ffffffff80298e99>] __call_usermodehelper+0x0/0x4f
Jul  1 17:33:30 server [<ffffffff802071b2>] find_get_page+0x4d/0x55
Jul  1 17:33:30 server [<ffffffff80299275>] request_module+0x139/0x14d
Jul  1 17:33:30 server [<ffffffff8022cf67>] mntput_no_expire+0x19/0x89
Jul  1 17:33:30 server [<ffffffff8020edda>] link_path_walk+0xa6/0xb2
Jul  1 17:33:30 server [<ffffffff80263914>] mutex_lock+0xd/0x1d
Jul  1 17:33:30 server [<ffffffff802d8211>] base_probe+0x1e/0x36
Jul  1 17:33:30 server [<ffffffff803af5c9>] kobj_lookup+0x132/0x19b
Jul  1 17:33:30 server gfs_controld[6280]: cluster is down, exiting



From member at linkedin.com  Wed Jul  6 12:11:42 2011
From: member at linkedin.com (Arif Bhai Surat via LinkedIn)
Date: Wed, 6 Jul 2011 12:11:42 +0000 (UTC)
Subject: [Linux-cluster] Invitation to connect on LinkedIn
Message-ID: <122461066.15294886.1309954302233.JavaMail.app@ela4-bed77.prod>

LinkedIn
------------




    Arif Bhai Surat requested to add you as a connection on LinkedIn:
  
------------------------------------------

Marian,

I'd like to add you to my professional network on LinkedIn.

- Arif Bhai

Accept invitation from Arif Bhai Surat
http://www.linkedin.com/e/-odgn7o-gps8ywhk-6a/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2940492630_2/1BpC5vrmRLoRZcjkkZt5YCpnlOt3RApnhMpmdzgmhxrSNBszYOnP0Pdz8Vd30Qej99bRhepD95rkhIbP0Td3gMdjsQd3cLrCBxbOYWrSlI/EML_comm_afe/

View invitation from Arif Bhai Surat
http://www.linkedin.com/e/-odgn7o-gps8ywhk-6a/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2940492630_2/39vc3cSczAQc3gVcAALqnpPbOYWrSlI/svi/ 

------------------------------------------
DID YOU KNOW you can use your LinkedIn profile as your website? Select a vanity URL and then promote this address on your business cards, email signatures, website, etc
http://www.linkedin.com/e/-odgn7o-gps8ywhk-6a/ewp/inv-21/


 
-- 
(c) 2011, LinkedIn Corporation
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110706/6d712b71/attachment.htm>

From helen_heath at fastmail.fm  Wed Jul  6 12:13:38 2011
From: helen_heath at fastmail.fm (Helen Heath)
Date: Wed, 06 Jul 2011 13:13:38 +0100
Subject: [Linux-cluster] how to disable one node
Message-ID: <1309954418.14584.2148764421@webmail.messagingengine.com>

Hi all -

I hope someone can shed some light on this.  I have a 2-node
cluster running on RedHat 3 which has a shared /clust1 filesystem
and is connected to a network power switch.  There is something
very wrong with the cluster, as every day currently it is
rebooting whichever is the primary node, for no reason I can
track down.  No hardware faults anywhere in the cluster, no
failures of any kind logging in any log files, etc etc.   It
started out well over a year ago rebooting the primary node every
other week, then across time it progressed to once a week, then
once a day.  I logged a call with RedHat way back when it first
started; nothing was ever found to be the problem, and of course
in time, RedHat v3 went out of support and they would no longer
assist in troubleshooting the issue.  Prior to this problem
starting the cluster had been running happily with no issues for
about 5 years.

Now this cluster is shortly being replaced with new hardware and
RedHat 5, so hopefully whatever is the problem will as
mysteriously vanish as it appeared.  However, I need to stop this
daily reboot as it is playing havoc with the application that
runs on this system (a heavily-utilised database) and having
tried everything I can think of, I decided to 'break' the
cluster; ie, take down one node so that only one node remains
running the application.

I cannot find a way to do this that persists across a reboot of
the node that should be out of the cluster.  I've run
"/sbin/chkconfig --del clumanager" and it did take the service
out of chkconfig (I verified this).  The RedHat document
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html
/Cluster_Administration/s1-admin-disable.html seems to indicate
this should persist across a reboot - ie, you reboot the node and
it does not attempt to rejoin the cluster; however, this didn't
work!  The primary node cluster monitoring software saw that the
secondary node was down, STONITH kicked in, the NPS powered the
port this node is connected to off and back on, the secondary
node rebooted and rejoined the cluster!

Does anyone know how to either temporarily remove the secondary
node from the cluster in such a way that persists across reboots
but can be easily brought back into the cluster when needed, or
else (and preferably) how to temporarily stop the cluster
monitoring software running on the primary node from even looking
out for the secondary node - as in, it doesn't care whether the
secondary node is up or not?  I've checked for the period the
secondary node is down that the primary node is quite happy to
carry on processing as usual but as soon as the cluster
monitoring software on the primary node realises the secondary
node is down, it reboots it, and I'm back to square one!

This is now really urgent (I've been trying to find an answer to
this for some weeks now) as I go on holiday on Friday and I
really don't want to leave my second-in-command with a mess on
his hands!

thanks
-- 
  Helen Heath
  helen_heath at fastmail.fm

=*=
Everything that has a beginning has an ending. Make your peace with that and all will be well.
-- Buddhist saying

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110706/5c68a6a7/attachment.htm>

From anprice at redhat.com  Wed Jul  6 13:55:07 2011
From: anprice at redhat.com (Andrew Price)
Date: Wed, 06 Jul 2011 14:55:07 +0100
Subject: [Linux-cluster] gfs2-utils 3.1.2 Released
Message-ID: <4E14693B.7090401@redhat.com>

Hi,

gfs2-utils 3.1.2 has been released. This version features various bug 
fixes, compression for gfs2_edit savemeta, and improved translation 
infrastructure. See below for a full list of changes.

The source tarball is available from:

    https://fedorahosted.org/released/gfs2-utils/gfs2-utils-3.1.2.tar.gz

To report bugs or issues, please use:

    https://bugzilla.redhat.com/

Regards,

Andy Price
Red Hat File Systems


Changes since 3.1.1:

Abhijith Das (2):
       gfs2_convert: exits with success without doing anything
       gfs2_convert exits with success without doing anything

Andrew Price (5):
       gfs2_edit: Add compression to savemeta and restoremeta
       gfs2_utils: More error handling improvements
       gfs2-utils: quieten some new build warnings
       gfs2_edit: Fix savemeta compression for older zlibs
       gfs2-utils: Fix up make-tarball.sh

Benjamin Marzinski (1):
       gfs2_grow: write one rindex entry and then the rest

Bob Peterson (2):
       gfs2_edit savemeta was not saving some directory info
       fsck.gfs2 only rebuilds one missing journal at a time

Carlos Maiolino (5):
       Add i18n support to gfs2-utils
       Track translatable files
       gfs2_convert: Add i18n support
       gfs2_convert: set translatable strings
       i18n support: Add gfs2_convert to translatable list

Steven Whitehouse (1):
       Remove last traces of unlinked file from gfs2-utils



From pradhanparas at gmail.com  Wed Jul  6 16:15:00 2011
From: pradhanparas at gmail.com (Paras pradhan)
Date: Wed, 6 Jul 2011 11:15:00 -0500
Subject: [Linux-cluster] DR node in a cluster
Message-ID: <CADyt5gnTRWFW3BBs7-3nfE-tsWuMCZm5LtdbnQWp2=_KVkxXKQ@mail.gmail.com>

Hi,

My GFS2 linux cluster has three nodes. Two at the data center and one at the
DR site. If the nodes at DR site break/turnoff, all the services move to DR
node. But if the 2 nodes at the data center lost communication with the DR
node, I am not sure how does the cluster handles the split brain. So I am
looking for some recommendation in this kind of scenario. I am usig Qdisk
votes (=3) in this case.

--
Here is the cman_tool status output.


-
Version: 6.2.0
Config Version: 74
Cluster Name: vrprd
Cluster Id: 3304
Cluster Member: Yes
Cluster Generation: 1720
Membership state: Cluster-Member
Nodes: 3
Expected votes: 6
Quorum device votes: 3
Total votes: 6
Quorum: 4
Active subsystems: 10
Flags: Dirty
Ports Bound: 0 11 177
Node name: vrprd1.hostmy.com
Node ID: 2
Multicast addresses: x.x.x.244
Node addresses: x.x.x.96
--

Thanks!
Paras.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110706/a7c320b1/attachment.htm>

From swhiteho at redhat.com  Wed Jul  6 16:28:34 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 06 Jul 2011 17:28:34 +0100
Subject: [Linux-cluster] DR node in a cluster
In-Reply-To: <CADyt5gnTRWFW3BBs7-3nfE-tsWuMCZm5LtdbnQWp2=_KVkxXKQ@mail.gmail.com>
References: <CADyt5gnTRWFW3BBs7-3nfE-tsWuMCZm5LtdbnQWp2=_KVkxXKQ@mail.gmail.com>
Message-ID: <1309969714.2739.16.camel@menhir>

Hi,

On Wed, 2011-07-06 at 11:15 -0500, Paras pradhan wrote:
> Hi,
> 
> 
> My GFS2 linux cluster has three nodes. Two at the data center and one
> at the DR site. If the nodes at DR site break/turnoff, all the
> services move to DR node. But if the 2 nodes at the data center lost
> communication with the DR node, I am not sure how does the cluster
> handles the split brain. So I am looking for some recommendation in
> this kind of scenario. I am usig Qdisk votes (=3) in this case.
> 
> 
Using GFS2 in stretched clusters like this is not something that we
support or recommend. It might work in some circumstances, but it is
very complicated to ensure that recovery will work correctly in all
cases. If you don't have enough nodes at a site to allow quorum to be
established, then when communication fails between sites you must fence
those nodes or risk data corruption when communication is
re-established,

Steve.

> --
> Here is the cman_tool status output.
> 
> 
> 
> 
> -
> Version: 6.2.0
> Config Version: 74
> Cluster Name: vrprd
> Cluster Id: 3304
> Cluster Member: Yes
> Cluster Generation: 1720
> Membership state: Cluster-Member
> Nodes: 3
> Expected votes: 6
> Quorum device votes: 3
> Total votes: 6
> Quorum: 4  
> Active subsystems: 10
> Flags: Dirty 
> Ports Bound: 0 11 177  
> Node name: vrprd1.hostmy.com
> Node ID: 2
> Multicast addresses: x.x.x.244 
> Node addresses: x.x.x.96 
> --
> 
> 
> Thanks!
> Paras.
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From Chris.Jankowski at hp.com  Wed Jul  6 16:46:04 2011
From: Chris.Jankowski at hp.com (Jankowski, Chris)
Date: Wed, 6 Jul 2011 16:46:04 +0000
Subject: [Linux-cluster] DR node in a cluster
In-Reply-To: <CADyt5gnTRWFW3BBs7-3nfE-tsWuMCZm5LtdbnQWp2=_KVkxXKQ@mail.gmail.com>
References: <CADyt5gnTRWFW3BBs7-3nfE-tsWuMCZm5LtdbnQWp2=_KVkxXKQ@mail.gmail.com>
Message-ID: <036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net>

Paras,

A curiosity question:

How do you make sure that your storage will survive failure of *either* of your site without loss of data and continuity of service?
What storage configuration are you using?

Thanks and regards,

Chris

From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan
Sent: Thursday, 7 July 2011 02:15
To: linux clustering
Subject: [Linux-cluster] DR node in a cluster

Hi,

My GFS2 linux cluster has three nodes. Two at the data center and one at the DR site. If the nodes at DR site break/turnoff, all the services move to DR node. But if the 2 nodes at the data center lost communication with the DR node, I am not sure how does the cluster handles the split brain. So I am looking for some recommendation in this kind of scenario. I am usig Qdisk votes (=3) in this case.

--
Here is the cman_tool status output.


-
Version: 6.2.0
Config Version: 74
Cluster Name: vrprd
Cluster Id: 3304
Cluster Member: Yes
Cluster Generation: 1720
Membership state: Cluster-Member
Nodes: 3
Expected votes: 6
Quorum device votes: 3
Total votes: 6
Quorum: 4
Active subsystems: 10
Flags: Dirty
Ports Bound: 0 11 177
Node name: vrprd1.hostmy.com<http://vrprd1.hostmy.com>
Node ID: 2
Multicast addresses: x.x.x.244
Node addresses: x.x.x.96
--

Thanks!
Paras.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110706/5777eade/attachment.htm>

From pradhanparas at gmail.com  Wed Jul  6 17:16:39 2011
From: pradhanparas at gmail.com (Paras pradhan)
Date: Wed, 6 Jul 2011 12:16:39 -0500
Subject: [Linux-cluster] DR node in a cluster
In-Reply-To: <036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net>
References: <CADyt5gnTRWFW3BBs7-3nfE-tsWuMCZm5LtdbnQWp2=_KVkxXKQ@mail.gmail.com>
	<036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net>
Message-ID: <CADyt5g=sq45dwQE=Ouxuv3YPA7MBcXOFqF=Sm1tEd6UHDtoUGQ@mail.gmail.com>

Chris,

All the nodes are connected to a single SAN at this moment through fibre.


@steven:

--
 If you don't have enough nodes at a site to allow quorum to be
established, then when communication fails between sites you must fence
those nodes or risk data corruption when communication is
re-established,
-----

Yes true, but in this case a single node can made the cluster quorate.
(qdisk vote=3 ,node votes=3, total=6) which is not recommened I guess (?).


Steve

On Wed, Jul 6, 2011 at 11:46 AM, Jankowski, Chris <Chris.Jankowski at hp.com>wrote:

> Paras,****
>
> ** **
>
> A curiosity question:****
>
> ** **
>
> How do you make sure that your storage will survive failure of **either**
> of your site without loss of data and continuity of service?****
>
> What storage configuration are you using?****
>
> ** **
>
> Thanks and regards,****
>
>
> Chris****
>
> ** **
>
> *From:* linux-cluster-bounces at redhat.com [mailto:
> linux-cluster-bounces at redhat.com] *On Behalf Of *Paras pradhan
> *Sent:* Thursday, 7 July 2011 02:15
> *To:* linux clustering
> *Subject:* [Linux-cluster] DR node in a cluster****
>
> ** **
>
> Hi,****
>
> ** **
>
> My GFS2 linux cluster has three nodes. Two at the data center and one at
> the DR site. If the nodes at DR site break/turnoff, all the services move to
> DR node. But if the 2 nodes at the data center lost communication with the
> DR node, I am not sure how does the cluster handles the split brain. So I am
> looking for some recommendation in this kind of scenario. I am usig Qdisk
> votes (=3) in this case.****
>
> ** **
>
> --****
>
> Here is the cman_tool status output.****
>
> ** **
>
> ** **
>
> -****
>
> Version: 6.2.0****
>
> Config Version: 74****
>
> Cluster Name: vrprd****
>
> Cluster Id: 3304****
>
> Cluster Member: Yes****
>
> Cluster Generation: 1720****
>
> Membership state: Cluster-Member****
>
> Nodes: 3****
>
> Expected votes: 6****
>
> Quorum device votes: 3****
>
> Total votes: 6****
>
> Quorum: 4  ****
>
> Active subsystems: 10****
>
> Flags: Dirty ****
>
> Ports Bound: 0 11 177  ****
>
> Node name: vrprd1.hostmy.com****
>
> Node ID: 2****
>
> Multicast addresses: x.x.x.244 ****
>
> Node addresses: x.x.x.96 ****
>
> --****
>
> ** **
>
> Thanks!****
>
> Paras.****
>
> ** **
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110706/02a2fe08/attachment.htm>

From rossnick-lists at cybercat.ca  Wed Jul  6 17:08:13 2011
From: rossnick-lists at cybercat.ca (Nicolas Ross)
Date: Wed, 6 Jul 2011 13:08:13 -0400
Subject: [Linux-cluster] Cluster and remote location
Message-ID: <829B72D777C94E15A5ED9D1CF15E5E94@versa>

Hi !

In our curent setup we have an 8-node cluster at site A. In the near future, 
we will have a different cluster at site B. Both site will be bridged with a 
lan-extension, and we plan on bridging the "service" vlan, the one that that 
cluster services operetes on. The "totem-ring" vlan will remain private on 
both sides.

For some services, we may overlap the ips in both cluster, so that this 
service could only be run from one cluster at the time.

So is there anything I should pay attention to ?

I must stress out that they will be different cluster at each site and they 
will have separate fibre-channel network and disks, separate totem cluster 
network and shared service network.

Thanks, 



From fdinitto at redhat.com  Thu Jul  7 05:29:42 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Thu, 07 Jul 2011 07:29:42 +0200
Subject: [Linux-cluster] Cluster and remote location
In-Reply-To: <829B72D777C94E15A5ED9D1CF15E5E94@versa>
References: <829B72D777C94E15A5ED9D1CF15E5E94@versa>
Message-ID: <4E154446.6040707@redhat.com>

On 07/06/2011 07:08 PM, Nicolas Ross wrote:
> Hi !
> 
> In our curent setup we have an 8-node cluster at site A. In the near
> future, we will have a different cluster at site B. Both site will be
> bridged with a lan-extension, and we plan on bridging the "service"
> vlan, the one that that cluster services operetes on. The "totem-ring"
> vlan will remain private on both sides.
> 
> For some services, we may overlap the ips in both cluster, so that this
> service could only be run from one cluster at the time.
> 
> So is there anything I should pay attention to ?

Well yes..

what you describe is a multi-site cluster (as we agreed to call it at
LPC 2010 http://etherpad.osuosl.org/lpc2010-high-availability-clustering).

There is no infrastructure to support this setup yet and to avoid
services to be running at the same time on cluster A and B.

you can still do a setup that involves cluster A to be active and B in
hot-standby (for example cluster B would have no rgmanager running but
everything else can be ready).

Manual failover has to be done by sysadmin between clusters. Protection
against nodes failing within the same cluster is still operational as-is.

Fabio



From Chris.Jankowski at hp.com  Thu Jul  7 08:01:43 2011
From: Chris.Jankowski at hp.com (Jankowski, Chris)
Date: Thu, 7 Jul 2011 08:01:43 +0000
Subject: [Linux-cluster] DR node in a cluster
In-Reply-To: <CADyt5g=sq45dwQE=Ouxuv3YPA7MBcXOFqF=Sm1tEd6UHDtoUGQ@mail.gmail.com>
References: <CADyt5gnTRWFW3BBs7-3nfE-tsWuMCZm5LtdbnQWp2=_KVkxXKQ@mail.gmail.com>
	<036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net>
	<CADyt5g=sq45dwQE=Ouxuv3YPA7MBcXOFqF=Sm1tEd6UHDtoUGQ@mail.gmail.com>
Message-ID: <036B68E61A28CA49AC2767596576CD596F69021EEC@GVW1113EXC.americas.hpqcorp.net>

Paras,

With your SAN on one site, what is the point of having a stretched cluster?
If your datacenter, where the SAN is located, burns down, you've lost all your data.
The DR servers in the DR datacenter are kind of useless without the data on shared storage.

Regards,

Chris


From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan
Sent: Thursday, 7 July 2011 03:17
To: linux clustering
Subject: Re: [Linux-cluster] DR node in a cluster

Chris,

All the nodes are connected to a single SAN at this moment through fibre.


@steven:

--
 If you don't have enough nodes at a site to allow quorum to be
established, then when communication fails between sites you must fence
those nodes or risk data corruption when communication is
re-established,
-----

Yes true, but in this case a single node can made the cluster quorate. (qdisk vote=3 ,node votes=3, total=6) which is not recommened I guess (?).


Steve
On Wed, Jul 6, 2011 at 11:46 AM, Jankowski, Chris <Chris.Jankowski at hp.com<mailto:Chris.Jankowski at hp.com>> wrote:
Paras,

A curiosity question:

How do you make sure that your storage will survive failure of *either* of your site without loss of data and continuity of service?
What storage configuration are you using?

Thanks and regards,

Chris

From: linux-cluster-bounces at redhat.com<mailto:linux-cluster-bounces at redhat.com> [mailto:linux-cluster-bounces at redhat.com<mailto:linux-cluster-bounces at redhat.com>] On Behalf Of Paras pradhan
Sent: Thursday, 7 July 2011 02:15
To: linux clustering
Subject: [Linux-cluster] DR node in a cluster

Hi,

My GFS2 linux cluster has three nodes. Two at the data center and one at the DR site. If the nodes at DR site break/turnoff, all the services move to DR node. But if the 2 nodes at the data center lost communication with the DR node, I am not sure how does the cluster handles the split brain. So I am looking for some recommendation in this kind of scenario. I am usig Qdisk votes (=3) in this case.

--
Here is the cman_tool status output.


-
Version: 6.2.0
Config Version: 74
Cluster Name: vrprd
Cluster Id: 3304
Cluster Member: Yes
Cluster Generation: 1720
Membership state: Cluster-Member
Nodes: 3
Expected votes: 6
Quorum device votes: 3
Total votes: 6
Quorum: 4
Active subsystems: 10
Flags: Dirty
Ports Bound: 0 11 177
Node name: vrprd1.hostmy.com<http://vrprd1.hostmy.com>
Node ID: 2
Multicast addresses: x.x.x.244
Node addresses: x.x.x.96
--

Thanks!
Paras.


--
Linux-cluster mailing list
Linux-cluster at redhat.com<mailto:Linux-cluster at redhat.com>
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110707/ea640eea/attachment.htm>

From pradhanparas at gmail.com  Thu Jul  7 17:26:26 2011
From: pradhanparas at gmail.com (Paras pradhan)
Date: Thu, 7 Jul 2011 12:26:26 -0500
Subject: [Linux-cluster] DR node in a cluster
In-Reply-To: <036B68E61A28CA49AC2767596576CD596F69021EEC@GVW1113EXC.americas.hpqcorp.net>
References: <CADyt5gnTRWFW3BBs7-3nfE-tsWuMCZm5LtdbnQWp2=_KVkxXKQ@mail.gmail.com>
	<036B68E61A28CA49AC2767596576CD596F69021BC1@GVW1113EXC.americas.hpqcorp.net>
	<CADyt5g=sq45dwQE=Ouxuv3YPA7MBcXOFqF=Sm1tEd6UHDtoUGQ@mail.gmail.com>
	<036B68E61A28CA49AC2767596576CD596F69021EEC@GVW1113EXC.americas.hpqcorp.net>
Message-ID: <CADyt5gk5euADH0Cq=sBM3eqyYYAEpB+d2Pk0731_RtPdAtnY5A@mail.gmail.com>

Yes because of the licensing issue we are now limited to a single San
but not in the future.

Thanks guys for the replies
Paras

On Thursday, July 7, 2011, Jankowski, Chris <Chris.Jankowski at hp.com> wrote:
> Paras,?With your SAN on one site, what is the point of having a stretched cluster?If your datacenter, where the SAN is located, burns down, you?ve lost all your data.The DR servers in the DR datacenter are kind of useless without the data on shared storage.?Regards,?Chris??From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan
> Sent: Thursday, 7 July 2011 03:17
> To: linux clustering
> Subject: Re: [Linux-cluster] DR node in a cluster?Chris,?All the nodes are connected to a single SAN at this moment through fibre.??@steven:?--?If you don't have enough nodes at a site to allow quorum to be
> established, then when communication fails between sites you must fence
> those nodes or risk data corruption when communication is
> re-established,
> -----?Yes true, but in this case a single node can made the cluster quorate. (qdisk vote=3 ,node votes=3, total=6) which is not recommened I guess (?).
> SteveOn Wed, Jul 6, 2011 at 11:46 AM, Jankowski, Chris <Chris.Jankowski at hp.com> wrote:Paras,?A curiosity question:?How do you make sure that your storage will survive failure of *either* of your site without loss of data and continuity of service?What storage configuration are you using??Thanks and regards,
> Chris?From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Paras pradhan
> Sent: Thursday, 7 July 2011 02:15
> To: linux clustering
> Subject: [Linux-cluster] DR node in a cluster?Hi,?My GFS2 linux cluster has three nodes. Two at the data center and one at the DR site. If the nodes at DR site break/turnoff, all the services move to DR node. But if the 2 nodes at the data center lost communication with the DR node, I am not sure how does the cluster handles the split brain. So I am looking for some recommendation in this kind of scenario. I am usig Qdisk votes (=3) in this case.?--Here is the cman_tool stat



From teigland at redhat.com  Thu Jul  7 20:39:45 2011
From: teigland at redhat.com (David Teigland)
Date: Thu, 7 Jul 2011 16:39:45 -0400
Subject: [Linux-cluster] [PATCH V2] dumping the unknown address when got
 a connect from non cluster node
In-Reply-To: <20110704.122551.868642282278092140.yamato@redhat.com>
References: <20110701.162658.1002310074831263303.yamato@redhat.com>
	<20110701174136.GC23008@redhat.com>
	<20110704.122551.868642282278092140.yamato@redhat.com>
Message-ID: <20110707203944.GA9863@redhat.com>

On Mon, Jul 04, 2011 at 12:25:51PM +0900, Masatake YAMATO wrote:
> >> Another patch useful for debugging cluster.conf and network configuration.
> >> 
> >> This is useful when you build a cluster with nodes connected each others with
> >> a software bridge(virbrN). If you install wrong iptabels configuration, dlm
> >> cannot establish connections. You will just see 

Thanks, I've pushed both patches to the next branch in dlm.git, if you'd
like to try them out.
Dave



From fdinitto at redhat.com  Fri Jul  8 12:28:09 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Fri, 08 Jul 2011 14:28:09 +0200
Subject: [Linux-cluster] fence-agents-3.1.5 stable release
Message-ID: <4E16F7D9.9040903@redhat.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Welcome to the fence-agents 3.1.5 release.

This release includes a few minor bug fixes and support for more devices
(Eaton Switched ePDU).

The new source tarball can be downloaded here:

https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-3.1.5.tar.xz

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

Happy clustering,
Fabio

Under the hood (from 3.1.4):

Arnaud Quette (1):
      eaton_snmp: add support for Eaton Switched ePDU

Fabio M. Di Nitto (3):
      relaxng: ship bits required to build the schema at runtime
      relaxng: drop <grammar> definition
      relaxng: drop static agents definitions

Lon Hohberger (1):
      Make fence_ack_manual 'usage' more accessible

Marek 'marx' Grac (3):
      fence_drac5: Fix support for Dell DRAC CMC
      fence_bladecenter: Reboot operation did not work correctly with
- --missing-
      fence_drac5: Incorrect output of 'list' operation on Drac 5

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBCAAGBQJOFvfXAAoJEFA6oBJjVJ+OjbsP/1pP6I2W86x++n+OJpOUvfNK
9ZXFBMvz3hphqHcYkACgoKE5qmQDYgCdH5EmJXrO4iQ0fCZ16/vyH2USqG3CqO7h
3VozL78IHjM8YDlssoWjD/vbzK5w6KbaC4Qpy2vcW73ARi1Ot0vsJN9InexyKorl
XAT4MIBqdJNwSfI2wGT9pe1duoY0AqNrz+UlRssXjYgaxlxq/5LtVmbRnRpbypqO
yBLtspTK+fSy0ofD2hxsOpTHDSgmMaj8REeN49iP924JbKbGE0FBl5yHh4Kd7UGI
6Q5gdbm7PetZAz9jubbJdH2yKRV5C0btzlvH7/LJsL4AK8qjA49cz7erygRU0sRv
1HOKFJ+xIWiNvp/I5AYhjIjMc0r1Eafrmpg7FgyhG9bNM6avmh7KiTSAxXfwOjRK
H22uakA/cxwOGWMMwLfRsSqwh5H9QTDRrXbF2xMHuenq/qDVvgHxveFoLqWQSdGd
HFsbcjWtfcPNer/+Fawk/7FoDkh2K0+EuhaPJdNplk+NkyTjw+EE67z6tdnislWB
0Ocx8YwX3ID7QjwySxPCNoayuUKUjDJiOgugEiHf4PrbGoqczQHug6O49IUwb2ZR
oT2DwpJnMptQluh1f5Xkw7OLB+tVA8elbnDjqzejB/5aj4LGZy2r/3OrQkOGfdnK
Stv3UdSiSk0mdv9SPF6F
=omcd
-----END PGP SIGNATURE-----



From jpolo at wtransnet.com  Fri Jul  8 15:41:08 2011
From: jpolo at wtransnet.com (Javi Polo)
Date: Fri, 08 Jul 2011 17:41:08 +0200
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL
	pointer deference
Message-ID: <4E172514.7060806@wtransnet.com>

Hello everyone!

I've set up a cluster in order to use GFS2. The cluster works really well ;)
Then, I've exported the GFS2 filesystem via NFS to share with machines 
outside the cluster, and in a read fashion it works OK, but as soon as I 
try to write in it, the filesystem seems to hang:

root at file03:~# mount filepro01:/mnt/gfs /mnt/tmp -o soft
root at file03:~# ls /mnt/tmp/
algo  caca  caca2  testa
root at file03:~# mkdir /mnt/tmp/otracosa

at this point, the NFS stopped working. I can see in the nfs client:

[11132241.127470] nfs: server filepro01 not responding, timed out

however, the directory was indeed created, and the other node can 
continue using the gfs2 filesystem (locally)
On the NFS server (filepro01) looking at the logs I found some nasty 
things. This first part is mounting the filesystem, which is OK:

[6234925.738508] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state 
recovery directory
[6234925.787305] NFSD: starting 90-second grace period
[6234925.825811] GFS2 (built Feb  7 2011 16:11:33) installed
[6234925.826698] GFS2: fsid=: Trying to join cluster "lock_dlm", 
"wtn_cluster:file01"
[6234925.886991] GFS2: fsid=wtn_cluster:file01.0: Joined cluster. Now 
mounting FS...
[6234925.975113] GFS2: fsid=wtn_cluster:file01.0: jid=0, already locked 
for use
[6234925.975116] GFS2: fsid=wtn_cluster:file01.0: jid=0: Looking at 
journal...
[6234926.075105] GFS2: fsid=wtn_cluster:file01.0: jid=0: Acquiring the 
transaction lock...
[6234926.075152] GFS2: fsid=wtn_cluster:file01.0: jid=0: Replaying 
journal...
[6234926.076200] GFS2: fsid=wtn_cluster:file01.0: jid=0: Replayed 8 of 9 
blocks
[6234926.076204] GFS2: fsid=wtn_cluster:file01.0: jid=0: Found 1 revoke tags
[6234926.076649] GFS2: fsid=wtn_cluster:file01.0: jid=0: Journal 
replayed in 1s
[6234926.076800] GFS2: fsid=wtn_cluster:file01.0: jid=0: Done
[6234926.076945] GFS2: fsid=wtn_cluster:file01.0: jid=1: Trying to 
acquire journal lock...
[6234926.078723] GFS2: fsid=wtn_cluster:file01.0: jid=1: Looking at 
journal...
[6234926.257645] GFS2: fsid=wtn_cluster:file01.0: jid=1: Done
[6234926.258187] GFS2: fsid=wtn_cluster:file01.0: jid=2: Trying to 
acquire journal lock...
[6234926.260966] GFS2: fsid=wtn_cluster:file01.0: jid=2: Looking at 
journal...
[6234926.549636] GFS2: fsid=wtn_cluster:file01.0: jid=2: Done
[6234930.789787] ipmi message handler version 39.2

and when we try to write from nfs client, bang:

[6235083.656954] BUG: unable to handle kernel NULL pointer dereference 
at 00000024
[6235083.656973] IP: [<ee2d6c1e>] gfs2_drevalidate+0xe/0x200 [gfs2]
[6235083.656992] *pdpt = 0000000001831027 *pde = 0000000000000000
[6235083.657003] Oops: 0000 [#1] SMP
[6235083.657012] last sysfs file: /sys/module/dlm/initstate
[6235083.657018] Modules linked in: ipmi_msghandler xenfs gfs2 ib_iser 
rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp 
libiscsi scsi_transport_iscsi dlm configfs nfsd e
xportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc drbd lru_cache lp 
parport [last unloaded: scsi_transport_iscsi]
[6235083.657090]
[6235083.657095] Pid: 1497, comm: nfsd Tainted: G        W   
2.6.38-2-virtual #29~lucid1-Ubuntu /
[6235083.657103] EIP: 0061:[<ee2d6c1e>] EFLAGS: 00010282 CPU: 0
[6235083.657115] EIP is at gfs2_drevalidate+0xe/0x200 [gfs2]
[6235083.657120] EAX: eb9d7180 EBX: eb9d7180 ECX: ee2ec000 EDX: 00000000
[6235083.657127] ESI: eb924580 EDI: 00000000 EBP: c1dc5c68 ESP: c1dc5c20
[6235083.657133]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
[6235083.657139] Process nfsd (pid: 1497, ti=c1dc4000 task=c1b18ca0 
task.ti=c1dc4000)
[6235083.657145] Stack:
[6235083.657150]  c1dc5c28 c0627afd c1dc5c68 c0242314 00000000 c1dc5c7c 
ee2dba0c ee2c02d0
[6235083.657170]  00000001 eb924580 c1a47038 c1dc5cb0 eb9d7188 00000004 
14a2fc97 eb9d7180
[6235083.657190]  eb924580 00000000 c1dc5c7c c023a18f eb9d7180 eb924580 
eb925000 c1dc5ca0
[6235083.657210] Call Trace:
[6235083.657220]  [<c0627afd>] ? _raw_spin_lock+0xd/0x10
[6235083.657230]  [<c0242314>] ? __d_lookup+0xf4/0x150
[6235083.657242]  [<ee2dba0c>] ? gfs2_permission+0xcc/0x120 [gfs2]
[6235083.657253]  [<ee2c02d0>] ? gfs2_check_acl+0x0/0x80 [gfs2]
[6235083.657263]  [<c023a18f>] d_revalidate+0x1f/0x60
[6235083.657271]  [<c023a2e2>] __lookup_hash+0xa2/0x180
[6235083.657284]  [<edd8e266>] ? encode_post_op_attr+0x86/0x90 [nfsd]
[6235083.657292]  [<c023a4c3>] lookup_one_len+0x43/0x80
[6235083.657303]  [<edd8d13f>] compose_entry_fh+0x9f/0xe0 [nfsd]
[6235083.657315]  [<edd8e491>] encode_entryplus_baggage+0x51/0xb0 [nfsd]
[6235083.657327]  [<edd8e795>] encode_entry+0x2a5/0x2f0 [nfsd]
[6235083.657338]  [<edd8e820>] nfs3svc_encode_entry_plus+0x40/0x50 [nfsd]
[6235083.657349]  [<edd8366d>] nfsd_buffered_readdir+0xfd/0x1a0 [nfsd]
[6235083.657361]  [<edd8e7e0>] ? nfs3svc_encode_entry_plus+0x0/0x50 [nfsd]
[6235083.657372]  [<edd852a0>] nfsd_readdir+0x70/0xb0 [nfsd]
[6235083.657383]  [<edd8bd58>] nfsd3_proc_readdirplus+0xd8/0x200 [nfsd]
[6235083.657394]  [<edd8e7e0>] ? nfs3svc_encode_entry_plus+0x0/0x50 [nfsd]
[6235083.657405]  [<edd7f3a3>] nfsd_dispatch+0xd3/0x210 [nfsd]
[6235083.657423]  [<edd0fd83>] svc_process_common+0x2e3/0x590 [sunrpc]
[6235083.657438]  [<edd1c86d>] ? svc_xprt_received+0x2d/0x40 [sunrpc]
[6235083.657452]  [<edd1cd0b>] ? svc_recv+0x48b/0x750 [sunrpc]
[6235083.657465]  [<edd1010c>] svc_process+0xdc/0x140 [sunrpc]
[6235083.657474]  [<c0627010>] ? down_read+0x10/0x20
[6235083.657483]  [<edd7fa54>] nfsd+0xb4/0x140 [nfsd]
[6235083.657493]  [<c0143b9e>] ? complete+0x4e/0x60
[6235083.657503]  [<edd7f9a0>] ? nfsd+0x0/0x140 [nfsd]
[6235083.657513]  [<c0173354>] kthread+0x74/0x80
[6235083.657520]  [<c01732e0>] ? kthread+0x0/0x80
[6235083.657528]  [<c010af3e>] kernel_thread_helper+0x6/0x10
[6235083.657533] Code: 8b 53 08 e8 75 d4 0a d2 f7 d0 89 03 31 c0 5b 5d 
c3 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 57 56 53 83 ec 3c 3e 8d 
74 26 00 <f6> 42 24 40 89 c3 b8 f6 ff ff ff 74 0d 83 c4 3c 5b 5e 5f 5d c3
[6235083.657652] EIP: [<ee2d6c1e>] gfs2_drevalidate+0xe/0x200 [gfs2] 
SS:ESP 0069:c1dc5c20
[6235083.865070] CR2: 0000000000000024
[6235083.865077] ---[ end trace 2dfc9195648a185b ]---
[6235099.205542] dlm: connecting to 2


Is this a bug?
Is it known?
Are there any workarounds?

The gfs2+nfs server is a xen client, with ubuntu 10.04 and kernel 
2.6.38-2-virtual
# gfs2_tool version
gfs2_tool 3.0.12 (built Jul  5 2011 16:52:20)
Copyright (C) Red Hat, Inc.  2004-2010  All rights reserved.
# cman_tool version
6.2.0 config 2011070805

Here's also the cluster.conf file, just in case ;)
<?xml version="1.0"?>
<cluster name="wtn_cluster" config_version="2011070805">
<quorumd interval="5" tko="6" label="filepro-qdisk" votes="1"/>
<cman expected_votes="3" two_node="0"/>
<totem consensus="72000" token="60000"/>
<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
<clusternode name="filepro01" votes="1" nodeid="1">
<fence>
<method name="xen">
<device name="xen" nodename="filepro01" U="abcdefghijk" action="reboot"/>
</method>
</fence>
</clusternode>
<clusternode name="filepro02" votes="1" nodeid="2">
<fence>
<method name="xen">
<device name="xen" nodename="filepro02" U="qwertyuiop" action="reboot"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="xen" agent="fence_xen"/>
</fencedevices>
</cluster>

Thanks in advance :)

-- 
Javi Polo
Administrador de Sistemas
Tel  93 734 97 70
Fax 93 734 97 71
jpolo at wtransnet.com



From swhiteho at redhat.com  Fri Jul  8 16:22:18 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Fri, 08 Jul 2011 17:22:18 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <4E172514.7060806@wtransnet.com>
References: <4E172514.7060806@wtransnet.com>
Message-ID: <1310142138.2705.37.camel@menhir>

Hi,

On Fri, 2011-07-08 at 17:41 +0200, Javi Polo wrote:
> Hello everyone!
> 
> I've set up a cluster in order to use GFS2. The cluster works really well ;)
> Then, I've exported the GFS2 filesystem via NFS to share with machines 
> outside the cluster, and in a read fashion it works OK, but as soon as I 
> try to write in it, the filesystem seems to hang:
> 
> root at file03:~# mount filepro01:/mnt/gfs /mnt/tmp -o soft
> root at file03:~# ls /mnt/tmp/
> algo  caca  caca2  testa
> root at file03:~# mkdir /mnt/tmp/otracosa
> 
> at this point, the NFS stopped working. I can see in the nfs client:
> 
> [11132241.127470] nfs: server filepro01 not responding, timed out
> 
> however, the directory was indeed created, and the other node can 
> continue using the gfs2 filesystem (locally)
> On the NFS server (filepro01) looking at the logs I found some nasty 
> things. This first part is mounting the filesystem, which is OK:
> 
Currently we don't recommend using NFS on a GFS2 filesystem which is
also being used locally. That will hopefully change in the future,
however, in the mean time I'd suggest using the localflocks mount option
on all the mounts (and be aware the fcntl/flock locking is then node
local) to avoid problems that you are otherwise likely to hit during
recovery. Also...

> [6234925.738508] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state 
> recovery directory
> [6234925.787305] NFSD: starting 90-second grace period
> [6234925.825811] GFS2 (built Feb  7 2011 16:11:33) installed
> [6234925.826698] GFS2: fsid=: Trying to join cluster "lock_dlm", 
> "wtn_cluster:file01"
> [6234925.886991] GFS2: fsid=wtn_cluster:file01.0: Joined cluster. Now 
> mounting FS...
> [6234925.975113] GFS2: fsid=wtn_cluster:file01.0: jid=0, already locked 
> for use
> [6234925.975116] GFS2: fsid=wtn_cluster:file01.0: jid=0: Looking at 
> journal...
> [6234926.075105] GFS2: fsid=wtn_cluster:file01.0: jid=0: Acquiring the 
> transaction lock...
> [6234926.075152] GFS2: fsid=wtn_cluster:file01.0: jid=0: Replaying 
> journal...
> [6234926.076200] GFS2: fsid=wtn_cluster:file01.0: jid=0: Replayed 8 of 9 
> blocks
> [6234926.076204] GFS2: fsid=wtn_cluster:file01.0: jid=0: Found 1 revoke tags
> [6234926.076649] GFS2: fsid=wtn_cluster:file01.0: jid=0: Journal 
> replayed in 1s
> [6234926.076800] GFS2: fsid=wtn_cluster:file01.0: jid=0: Done
> [6234926.076945] GFS2: fsid=wtn_cluster:file01.0: jid=1: Trying to 
> acquire journal lock...
> [6234926.078723] GFS2: fsid=wtn_cluster:file01.0: jid=1: Looking at 
> journal...
> [6234926.257645] GFS2: fsid=wtn_cluster:file01.0: jid=1: Done
> [6234926.258187] GFS2: fsid=wtn_cluster:file01.0: jid=2: Trying to 
> acquire journal lock...
> [6234926.260966] GFS2: fsid=wtn_cluster:file01.0: jid=2: Looking at 
> journal...
> [6234926.549636] GFS2: fsid=wtn_cluster:file01.0: jid=2: Done
> [6234930.789787] ipmi message handler version 39.2
> 
That all looks ok, but...

> and when we try to write from nfs client, bang:
> 
> [6235083.656954] BUG: unable to handle kernel NULL pointer dereference 
> at 00000024
> [6235083.656973] IP: [<ee2d6c1e>] gfs2_drevalidate+0xe/0x200 [gfs2]
> [6235083.656992] *pdpt = 0000000001831027 *pde = 0000000000000000
> [6235083.657003] Oops: 0000 [#1] SMP
> [6235083.657012] last sysfs file: /sys/module/dlm/initstate
> [6235083.657018] Modules linked in: ipmi_msghandler xenfs gfs2 ib_iser 
> rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp 
> libiscsi scsi_transport_iscsi dlm configfs nfsd e
> xportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc drbd lru_cache lp 
> parport [last unloaded: scsi_transport_iscsi]
> [6235083.657090]
> [6235083.657095] Pid: 1497, comm: nfsd Tainted: G        W   
> 2.6.38-2-virtual #29~lucid1-Ubuntu /
> [6235083.657103] EIP: 0061:[<ee2d6c1e>] EFLAGS: 00010282 CPU: 0
> [6235083.657115] EIP is at gfs2_drevalidate+0xe/0x200 [gfs2]

this should not happen. It looks like we are trying to look up something
that is 24 (hex) bytes into a structure. Does the fs have posix acls
enabled or selinux or something else using xattrs?

Steve.




From ajb2 at mssl.ucl.ac.uk  Fri Jul  8 16:46:01 2011
From: ajb2 at mssl.ucl.ac.uk (Alan Brown)
Date: Fri, 8 Jul 2011 17:46:01 +0100 (BST)
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <1310142138.2705.37.camel@menhir>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
Message-ID: <Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk>

On Fri, 8 Jul 2011, Steven Whitehouse wrote:

> Currently we don't recommend using NFS on a GFS2 filesystem which is
> also being used locally.

After much dealing with NFS internals, I would recommend NOT using it on
any filesystem where the files are accessed locally.

NFSv2/3 doesn't play nice with anything else which may access the
underlaying disk (including Samba. The only "safe" method is to export
your samba shares from a NFS client elsewhere on the network).

YMMV. NFSv4 is supposedly better behaved. I've not tested it.






From jpolo at wtransnet.com  Fri Jul  8 16:59:50 2011
From: jpolo at wtransnet.com (Javi Polo)
Date: Fri, 08 Jul 2011 18:59:50 +0200
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <1310142138.2705.37.camel@menhir>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
Message-ID: <4E173786.1060507@wtransnet.com>

thx for the fast reply!

El 07/08/11 18:22, Steven Whitehouse escribi?:

> Currently we don't recommend using NFS on a GFS2 filesystem which is
> also being used locally. That will hopefully change in the future,
> however, in the mean time I'd suggest using the localflocks mount option
> on all the mounts (and be aware the fcntl/flock locking is then node
> local) to avoid problems that you are otherwise likely to hit during
> recovery. Also...
>
I dont think I really understood you. You mean that a host wich uses 
GFS2 locally is not recommended to export the filesystem via NFS, but if 
the host just uses it as NFS export, and who access the filesystem are 
just the nfs clients, it is allright
:?

> [6235083.656954] BUG: unable to handle kernel NULL pointer dereference
>> this should not happen. It looks like we are trying to look up something
>> that is 24 (hex) bytes into a structure. Does the fs have posix acls
>> enabled or selinux or something else using xattrs?

Nope, at least as far as I know. As I dont usually use ubuntu, I have 
checked to see if it had selinux enabled by default, or some ACLs 
related thing, but it seems it's not ....

-- 
Javi Polo
Administrador de Sistemas
Tel  93 734 97 70
Fax 93 734 97 71
jpolo at wtransnet.com



From Colin.Simpson at iongeo.com  Fri Jul  8 17:04:56 2011
From: Colin.Simpson at iongeo.com (Colin Simpson)
Date: Fri, 08 Jul 2011 18:04:56 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
	NULL pointer deference
In-Reply-To: <Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
	<Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk>
Message-ID: <1310144696.3779.4.camel@bhac.iouk.ioroot.tld>

That's not ideal either when Samba isn't too happy working over NFS, and
that is not recommended by the Samba people as being a sensible config.


Colin

On Fri, 2011-07-08 at 17:46 +0100, Alan Brown wrote:
> On Fri, 8 Jul 2011, Steven Whitehouse wrote:
> 
> > Currently we don't recommend using NFS on a GFS2 filesystem which is
> > also being used locally.
> 
> After much dealing with NFS internals, I would recommend NOT using it
> on
> any filesystem where the files are accessed locally.
> 
> NFSv2/3 doesn't play nice with anything else which may access the
> underlaying disk (including Samba. The only "safe" method is to export
> your samba shares from a NFS client elsewhere on the network).
> 
> YMMV. NFSv4 is supposedly better behaved. I've not tested it.
> 
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 

This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed.  If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original.





From ajb2 at mssl.ucl.ac.uk  Fri Jul  8 17:36:53 2011
From: ajb2 at mssl.ucl.ac.uk (Alan Brown)
Date: Fri, 8 Jul 2011 18:36:53 +0100 (BST)
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <1310144696.3779.4.camel@bhac.iouk.ioroot.tld>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
	<Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk>
	<1310144696.3779.4.camel@bhac.iouk.ioroot.tld>
Message-ID: <Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>

On Fri, 8 Jul 2011, Colin Simpson wrote:

> That's not ideal either when Samba isn't too happy working over NFS, and
> that is not recommended by the Samba people as being a sensible config.

I know but there's a real (and demonstrable) risk of data corruption for
NFS vs _anything_ if NFS clients and local processes (or clients of other
services such as a samba server) happen to grab the same file for writing
at the same time.

Apart from that, the 1 second granularity of NFS timestamps can (and has)
result in writes made by non-nfs processes to cause NFS clients which have
that file opened read/write to see "stale filehandle" errors due to the
inode having changed when they weren't expecting it.

We (should) all know NFS was a kludge. What's surprising is how much
kludge stll remains in the current v2/3 code (which is surprisingly opaque
and incredibly crufty, much of it dates from the early 1990s or earlier)

As I said earlier, V4 is supposed to play a lot nicer but I haven't tested
it - as as far as I know it's not suported on GFS systems anyway (That was
the RH official line when I tried to get it working last time..)

I'd love to get v4 running properly in active/active/active setup from
multiple GFS-mounted fileservers to the clients. If anyone knows how to
reliably do it on EL5.6 systems then I'm open to trying again as I believe
that this would solve a number of issues being seen locally (including
various crash bugs).

On the other hand, v2/3 aren't going away anytime soon and some effort
really needs to be put into making them work properly.

On the gripping hand, I'd also like to see viable alternatives to NFS when
it comes to feeding 100+ desktop clients

Making them mount the filesystems using GFS might sound like an
alternative until you consider what happens if any of them crash/reboot
during the day. Batch processes can wait all day, but users with frozen
desktops get irate - quickly.





From Colin.Simpson at iongeo.com  Fri Jul  8 18:36:45 2011
From: Colin.Simpson at iongeo.com (Colin Simpson)
Date: Fri, 08 Jul 2011 19:36:45 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
	NULL pointer deference
In-Reply-To: <Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>
References: <4E172514.7060806@wtransnet.com>
	<1310142138.2705.37.camel@menhir><Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk><1310144696.3779.4.camel@bhac.iouk.ioroot.tld>
	<Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>
Message-ID: <1310150205.3779.44.camel@bhac.iouk.ioroot.tld>

Very interesting. 

Certainly in our application it would be highly unlikely that samba and
NFS would try to write to the same file simultaneously that would very
much be an edge case (and users would know the result would be
undefined). I can certainly personally live with that level of potential
file corruption, though I can see others may not.

But I guess you are also telling me that file locking between the two
wouldn't be helping here either? (I rule out NFSv2 as something we have
thankfully eliminated). NFSv3 could be gone for us if we are lucky by
2012 (when RHEL 4 goes EOL and if RHEL5's NFSv4 is robust enough).

Currently by default RHEL6 clusters export (with the standard RA) on
NFSv4 and RHEL6 (and Fedora etc) mount these as NFSv4. So I'd hope
supported.....I haven't as yet tried to wrap any security round these,
from a discussion here a while back that looks like hard work. I'd
certainly love to have pNFS to allow multiple active nodes. 

OT: My main NFS issue I have just now is supporting laptops with the
automounter. NFS is just so undynamic. Once a mount is in place the
client changing IP will leave the mount hung. And laptops do this all
the time (on, off wired, wireless VPN etc). We have some nasty scripts
that clean up the mounts when laptops move network (lots of forcing
kills of autofs and umount -fl's). Mostly works ok. Again if a user
disconnects a laptop during an ongoing file operation they can expect
undefined file contents. It's better than the alternative of hung
mounts, lots of things hate that. We aren't talking complex file formats
or operations here, copying source files, data, docs to the local disk,
so no nasty binary file corruption issues. Maybe not such a great thing
to do, but users like to have a consistent file system view that matches
the office based systems. 

Sadly it looks like NFS is the least dynamic network component left in a
Linux distro. I posted a longer version of this problem to the linux-nfs
mailing list, I heard from someone that basically said the NFS committee
and developers (not just Linux) are largely targeting NFS as Enterprise
Storage protocol. I presume he means storage servers using NFS to share
to say front end web servers. So less interested in certainly my use
case. Possibly the best bet (in a while) for desktop network file
sharing will be the Samba, they seem to be trying to target cifs (with
full Unix extension) as being a solution for this. 

Thanks

Colin

On Fri, 2011-07-08 at 18:36 +0100, Alan Brown wrote:
> On Fri, 8 Jul 2011, Colin Simpson wrote:
> 
> > That's not ideal either when Samba isn't too happy working over NFS,
> and
> > that is not recommended by the Samba people as being a sensible
> config.
> 
> I know but there's a real (and demonstrable) risk of data corruption
> for
> NFS vs _anything_ if NFS clients and local processes (or clients of
> other
> services such as a samba server) happen to grab the same file for
> writing
> at the same time.
> 
> Apart from that, the 1 second granularity of NFS timestamps can (and
> has)
> result in writes made by non-nfs processes to cause NFS clients which
> have
> that file opened read/write to see "stale filehandle" errors due to
> the
> inode having changed when they weren't expecting it.
> 
> We (should) all know NFS was a kludge. What's surprising is how much
> kludge stll remains in the current v2/3 code (which is surprisingly
> opaque
> and incredibly crufty, much of it dates from the early 1990s or
> earlier)
> 
> As I said earlier, V4 is supposed to play a lot nicer but I haven't
> tested
> it - as as far as I know it's not suported on GFS systems anyway (That
> was
> the RH official line when I tried to get it working last time..)
> 
> I'd love to get v4 running properly in active/active/active setup from
> multiple GFS-mounted fileservers to the clients. If anyone knows how
> to
> reliably do it on EL5.6 systems then I'm open to trying again as I
> believe
> that this would solve a number of issues being seen locally (including
> various crash bugs).
> 
> On the other hand, v2/3 aren't going away anytime soon and some effort
> really needs to be put into making them work properly.
> 
> On the gripping hand, I'd also like to see viable alternatives to NFS
> when
> it comes to feeding 100+ desktop clients
> 
> Making them mount the filesystems using GFS might sound like an
> alternative until you consider what happens if any of them
> crash/reboot
> during the day. Batch processes can wait all day, but users with
> frozen
> desktops get irate - quickly.
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 

This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed.  If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original.





From fdinitto at redhat.com  Fri Jul  8 18:57:32 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Fri, 08 Jul 2011 20:57:32 +0200
Subject: [Linux-cluster] cluster 3.1.4 release
Message-ID: <4E17531C.7070305@redhat.com>

Welcome to the cluster 3.1.4 release.

This release fixes a few bugs and adds a new dynamic relaxng schema
creation.
In order to run this version of cman/cluster, it is strictly required to
have fence-agents at least in version 3.1.5 and resource-agents in
version 3.9.2. Alternatively you have to disable cluster.conf validation
(see ccs_config_validate.8 for information).

The new source tarball can be downloaded here:

https://fedorahosted.org/releases/c/l/cluster/cluster-3.1.4.tar.xz

ChangeLog:

https://fedorahosted.org/releases/c/l/cluster/Changelog-3.1.4

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

Happy clustering,
Fabio



From ajb2 at mssl.ucl.ac.uk  Fri Jul  8 19:46:45 2011
From: ajb2 at mssl.ucl.ac.uk (Alan Brown)
Date: Fri, 08 Jul 2011 20:46:45 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <1310150205.3779.44.camel@bhac.iouk.ioroot.tld>
References: <4E172514.7060806@wtransnet.com>	<1310142138.2705.37.camel@menhir><Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk><1310144696.3779.4.camel@bhac.iouk.ioroot.tld>	<Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>
	<1310150205.3779.44.camel@bhac.iouk.ioroot.tld>
Message-ID: <4E175EA5.10305@mssl.ucl.ac.uk>

Colin Simpson wrote:
> But I guess you are also telling me that file locking between the two
> wouldn't be helping here either? 

Correct.

NFSd (v2/3) doesn't pass client locks to the filesystem, nor does it 
respect locks set by other processes.

It has a number of other foibles - try setting up a large number of 
services where you have one NFS export per service (eg, multiple disk 
mounts) and there's a good chance the exports will fail at startup 
because they all try to run at once and end up scribbling all over each 
other's export list (There's a name for this kind of failure mode, which 
I can't remember)





From bfields at fieldses.org  Fri Jul  8 21:09:05 2011
From: bfields at fieldses.org (J. Bruce Fields)
Date: Fri, 8 Jul 2011 17:09:05 -0400
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
	<Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk>
	<1310144696.3779.4.camel@bhac.iouk.ioroot.tld>
	<Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>
Message-ID: <20110708210905.GD13886@fieldses.org>

On Fri, Jul 08, 2011 at 06:36:53PM +0100, Alan Brown wrote:
> On Fri, 8 Jul 2011, Colin Simpson wrote:
> 
> > That's not ideal either when Samba isn't too happy working over NFS, and
> > that is not recommended by the Samba people as being a sensible config.
> 
> I know but there's a real (and demonstrable) risk of data corruption for
> NFS vs _anything_ if NFS clients and local processes (or clients of other
> services such as a samba server) happen to grab the same file for writing
> at the same time.

With default mount options, the linux NFS client (like most NFS clients)
assumes that a file has a most one writer at a time.  (Applications that
need to do write-sharing over NFS need to use file locking.)

Note this issue isn't special to local process--the same restriction
applies to two NFS clients writing to the same file.

> Apart from that, the 1 second granularity of NFS timestamps

The NFS protocol supports higher granularity timestamps.  The limitation
is the exported filesystem.  If you're using something other than
ext2/3, you're probably getting higher granularity.

> can (and has)
> result in writes made by non-nfs processes to cause NFS clients which have
> that file opened read/write to see "stale filehandle" errors due to the
> inode having changed when they weren't expecting it.

Changing file data or attributes won't result in stale filehandle
errors.  (Bug reports welcome if you've seen otherwise.)  Stale
filehandle errors should only happen when a client attempts to use a
file which no longer exists on the server.  (E.g. if another client
deletes a file while your client has it open.)  (This can also happen if
you rename a file across directories on a filesystem exported with the
subtree_check option.  The subtree_check option is deprecated, for that
reason.)

> We (should) all know NFS was a kludge. What's surprising is how much
> kludge stll remains in the current v2/3 code (which is surprisingly opaque
> and incredibly crufty, much of it dates from the early 1990s or earlier)

Details welcome.

> As I said earlier, V4 is supposed to play a lot nicer

V4 has a number of improvements, but what I've described above applies
across versions (module some technical details about timestamps vs.
change attributes).

--b.

> but I haven't tested
> it - as as far as I know it's not suported on GFS systems anyway (That was
> the RH official line when I tried to get it working last time..)
> 
> I'd love to get v4 running properly in active/active/active setup from
> multiple GFS-mounted fileservers to the clients. If anyone knows how to
> reliably do it on EL5.6 systems then I'm open to trying again as I believe
> that this would solve a number of issues being seen locally (including
> various crash bugs).
> 
> On the other hand, v2/3 aren't going away anytime soon and some effort
> really needs to be put into making them work properly.
> 
> On the gripping hand, I'd also like to see viable alternatives to NFS when
> it comes to feeding 100+ desktop clients
> 
> Making them mount the filesystems using GFS might sound like an
> alternative until you consider what happens if any of them crash/reboot
> during the day. Batch processes can wait all day, but users with frozen
> desktops get irate - quickly.
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From ajb2 at mssl.ucl.ac.uk  Mon Jul 11 08:30:11 2011
From: ajb2 at mssl.ucl.ac.uk (Alan Brown)
Date: Mon, 11 Jul 2011 09:30:11 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <20110708210905.GD13886@fieldses.org>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
	<Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk>
	<1310144696.3779.4.camel@bhac.iouk.ioroot.tld>
	<Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>
	<20110708210905.GD13886@fieldses.org>
Message-ID: <4E1AB493.7060302@mssl.ucl.ac.uk>

On 08/07/11 22:09, J. Bruce Fields wrote:

> With default mount options, the linux NFS client (like most NFS clients)
> assumes that a file has a most one writer at a time.  (Applications that
> need to do write-sharing over NFS need to use file locking.)

The problem is that file locking on V3 isn't passed back down to the 
filesystem - hence the issues with nfs vs samba (or local disk 
access(*)) on the same server.

(*) Local disk access includes anything running on other nodes in a 
GFS/GFS2 environment. This precludes exporting the same GFS(2) 
filesystem on multiple cluster nodes.


> The NFS protocol supports higher granularity timestamps.  The limitation
> is the exported filesystem.  If you're using something other than
> ext2/3, you're probably getting higher granularity.

GFS/GFS2 in this case...

>> can (and has)
>> result in writes made by non-nfs processes to cause NFS clients which have
>> that file opened read/write to see "stale filehandle" errors due to the
>> inode having changed when they weren't expecting it.
>
> Changing file data or attributes won't result in stale filehandle
> errors.  (Bug reports welcome if you've seen otherwise.)

I'll have to try and repeat the issue, but it's a race condition with a 
narrow window at the best of times.

> Stale
> filehandle errors should only happen when a client attempts to use a
> file which no longer exists on the server.  (E.g. if another client
> deletes a file while your client has it open.)

It's possible this has happened. I have no idea what user batch scripts 
are trying to do on the compute nodes, but in the case that was brought 
to my attention the file was edited on one node while another had it open.

>  (This can also happen if
> you rename a file across directories on a filesystem exported with the
> subtree_check option.  The subtree_check option is deprecated, for that
> reason.)

All our FSes are exported no_subtree_check and at the root of the FS.

>> We (should) all know NFS was a kludge. What's surprising is how much
>> kludge stll remains in the current v2/3 code (which is surprisingly opaque
>> and incredibly crufty, much of it dates from the early 1990s or earlier)
>
> Details welcome.

The non-parallelisation in exportfs (leading to race conditions) for 
starters. We had to insert flock statements in every call to it in 
/usr/share/cluster/nfsclient.sh in order to have reliable service startups

There are a number of RH Bugzilla tickets revolving around NFS behaviour 
which would be worth looking at.

>> As I said earlier, V4 is supposed to play a lot nicer
>
> V4 has a number of improvements, but what I've described above applies
> across versions (module some technical details about timestamps vs.
> change attributes).

Thanks for the input.

NFS has been a major pain point in our organisation for years. If you 
have ideas for doing things better then I'm very interested.

Alan




From swhiteho at redhat.com  Mon Jul 11 10:43:58 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Mon, 11 Jul 2011 11:43:58 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <4E1AB493.7060302@mssl.ucl.ac.uk>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
	<Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk>
	<1310144696.3779.4.camel@bhac.iouk.ioroot.tld>
	<Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>
	<20110708210905.GD13886@fieldses.org> <4E1AB493.7060302@mssl.ucl.ac.uk>
Message-ID: <1310381038.2766.9.camel@menhir>

Hi,

On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote:
> On 08/07/11 22:09, J. Bruce Fields wrote:
> 
> > With default mount options, the linux NFS client (like most NFS clients)
> > assumes that a file has a most one writer at a time.  (Applications that
> > need to do write-sharing over NFS need to use file locking.)
> 
> The problem is that file locking on V3 isn't passed back down to the 
> filesystem - hence the issues with nfs vs samba (or local disk 
> access(*)) on the same server.
> 
> (*) Local disk access includes anything running on other nodes in a 
> GFS/GFS2 environment. This precludes exporting the same GFS(2) 
> filesystem on multiple cluster nodes.
> 
Well the locks are kind of passed down, but there is not enough info to
make it work correctly, hence we require the localflocks mount option to
prevent this information from being passed down at all.

> 
> > The NFS protocol supports higher granularity timestamps.  The limitation
> > is the exported filesystem.  If you're using something other than
> > ext2/3, you're probably getting higher granularity.
> 
> GFS/GFS2 in this case...
> 
GFS supports second resolution time stamps
GFS2 supports nanosecond resolution time stamps

> >> can (and has)
> >> result in writes made by non-nfs processes to cause NFS clients which have
> >> that file opened read/write to see "stale filehandle" errors due to the
> >> inode having changed when they weren't expecting it.
> >
> > Changing file data or attributes won't result in stale filehandle
> > errors.  (Bug reports welcome if you've seen otherwise.)
> 
> I'll have to try and repeat the issue, but it's a race condition with a 
> narrow window at the best of times.
> 
GFS2 doesn't do anything odd with filehandles, they shouldn't be coming
up as stale unless the inode has been removed.

> > Stale
> > filehandle errors should only happen when a client attempts to use a
> > file which no longer exists on the server.  (E.g. if another client
> > deletes a file while your client has it open.)
> 
> It's possible this has happened. I have no idea what user batch scripts 
> are trying to do on the compute nodes, but in the case that was brought 
> to my attention the file was edited on one node while another had it open.
> 
That probably means the editor made a copy of it and then moved it back
over the top of the original file, thus unlinking the original file.

> >  (This can also happen if
> > you rename a file across directories on a filesystem exported with the
> > subtree_check option.  The subtree_check option is deprecated, for that
> > reason.)
> 
> All our FSes are exported no_subtree_check and at the root of the FS.
> 
> >> We (should) all know NFS was a kludge. What's surprising is how much
> >> kludge stll remains in the current v2/3 code (which is surprisingly opaque
> >> and incredibly crufty, much of it dates from the early 1990s or earlier)
> >
> > Details welcome.
> 
> The non-parallelisation in exportfs (leading to race conditions) for 
> starters. We had to insert flock statements in every call to it in 
> /usr/share/cluster/nfsclient.sh in order to have reliable service startups
> 
> There are a number of RH Bugzilla tickets revolving around NFS behaviour 
> which would be worth looking at.
> 
> >> As I said earlier, V4 is supposed to play a lot nicer
> >
> > V4 has a number of improvements, but what I've described above applies
> > across versions (module some technical details about timestamps vs.
> > change attributes).
> 
> Thanks for the input.
> 
> NFS has been a major pain point in our organisation for years. If you 
> have ideas for doing things better then I'm very interested.
> 
> Alan
> 
NFS and GFS2 is an area in which we are trying to gradually increase the
supportable use cases. It is also a rather complex area, so it will take
some time to do this,

Steve.




From bfields at fieldses.org  Mon Jul 11 12:05:46 2011
From: bfields at fieldses.org (J. Bruce Fields)
Date: Mon, 11 Jul 2011 08:05:46 -0400
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <1310381038.2766.9.camel@menhir>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
	<Pine.LNX.4.64.1107081743340.18031@msslac.mssl.ucl.ac.uk>
	<1310144696.3779.4.camel@bhac.iouk.ioroot.tld>
	<Pine.LNX.4.64.1107081823360.18031@msslac.mssl.ucl.ac.uk>
	<20110708210905.GD13886@fieldses.org>
	<4E1AB493.7060302@mssl.ucl.ac.uk> <1310381038.2766.9.camel@menhir>
Message-ID: <20110711120546.GA26712@fieldses.org>

On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse wrote:
> Hi,
> 
> On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote:
> > On 08/07/11 22:09, J. Bruce Fields wrote:
> > 
> > > With default mount options, the linux NFS client (like most NFS clients)
> > > assumes that a file has a most one writer at a time.  (Applications that
> > > need to do write-sharing over NFS need to use file locking.)
> > 
> > The problem is that file locking on V3 isn't passed back down to the 
> > filesystem - hence the issues with nfs vs samba (or local disk 
> > access(*)) on the same server.

The NFS server *does* acquire locks on the exported filesystem (and does
it the same way for v2, v3, and v4).

For local filesystems (ext3, xfs, btrfs), this is sufficient.

For exports of cluster filesystems like gfs2, there are more complicated
problems that, as Steve says, will require some work to do to fix.

Samba is a more complicated issue due to the imperfect match between
Windows and Linux locking semantics, but depending on how it's
configured Samba will also acquire locks on the exported filesystem.

--b.



From Ralph.Grothe at itdz-berlin.de  Mon Jul 11 13:11:44 2011
From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de)
Date: Mon, 11 Jul 2011 15:11:44 +0200
Subject: [Linux-cluster] how to disable one node
In-Reply-To: <1309954418.14584.2148764421@webmail.messagingengine.com>
References: <1309954418.14584.2148764421@webmail.messagingengine.com>
Message-ID: <A789DDB53ED7E94396E842EE2AC9B5FF01432977@itdzex101.ITDZ.verwalt-berlin.de>

I'm not sure if you can access this doc (I think it requires a
login account at RHN),
and if this addresses your issue?

In RHN Knowledge base there is this article entitled 
"How do I disable the cluster software on a member system in Red
Hat Enterprise Linux?"

 
https://access.redhat.com/kb/docs/DOC-5695


Good Luck
Ralph
________________________________

	From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Helen
Heath
	Sent: Wednesday, July 06, 2011 2:14 PM
	To: linux-cluster at redhat.com
	Subject: [Linux-cluster] how to disable one node
	
	
	Hi all -
	 
	I hope someone can shed some light on this.  I have a
2-node cluster running on RedHat 3 which has a shared /clust1
filesystem and is connected to a network power switch.  There is
something very wrong with the cluster, as every day currently it
is rebooting whichever is the primary node, for no reason I can
track down.  No hardware faults anywhere in the cluster, no
failures of any kind logging in any log files, etc etc.   It
started out well over a year ago rebooting the primary node every
other week, then across time it progressed to once a week, then
once a day.  I logged a call with RedHat way back when it first
started; nothing was ever found to be the problem, and of course
in time, RedHat v3 went out of support and they would no longer
assist in troubleshooting the issue.  Prior to this problem
starting the cluster had been running happily with no issues for
about 5 years.
	 
	Now this cluster is shortly being replaced with new
hardware and RedHat 5, so hopefully whatever is the problem will
as mysteriously vanish as it appeared.  However, I need to stop
this daily reboot as it is playing havoc with the application
that runs on this system (a heavily-utilised database) and having
tried everything I can think of, I decided to 'break' the
cluster; ie, take down one node so that only one node remains
running the application.
	 
	I cannot find a way to do this that persists across a
reboot of the node that should be out of the cluster.  I've run
"/sbin/chkconfig --del clumanager" and it did take the service
out of chkconfig (I verified this).  The RedHat document
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html
/Cluster_Administration/s1-admin-disable.html seems to indicate
this should persist across a reboot - ie, you reboot the node and
it does not attempt to rejoin the cluster; however, this didn't
work!  The primary node cluster monitoring software saw that the
secondary node was down, STONITH kicked in, the NPS powered the
port this node is connected to off and back on, the secondary
node rebooted and rejoined the cluster!
	 
	Does anyone know how to either temporarily remove the
secondary node from the cluster in such a way that persists
across reboots but can be easily brought back into the cluster
when needed, or else (and preferably) how to temporarily stop the
cluster monitoring software running on the primary node from even
looking out for the secondary node - as in, it doesn't care
whether the secondary node is up or not?  I've checked for the
period the secondary node is down that the primary node is quite
happy to carry on processing as usual but as soon as the cluster
monitoring software on the primary node realises the secondary
node is down, it reboots it, and I'm back to square one!
	 
	This is now really urgent (I've been trying to find an
answer to this for some weeks now) as I go on holiday on Friday
and I really don't want to leave my second-in-command with a mess
on his hands!
	 
	thanks
	 
	-- 
	Helen Heath
	helen_heath at fastmail.fm
	=*=
	Everything that has a beginning has an ending. Make your
peace with that and all will be well.
	-- Buddhist saying




From linux at alteeve.com  Tue Jul 12 00:07:51 2011
From: linux at alteeve.com (Digimer)
Date: Mon, 11 Jul 2011 20:07:51 -0400
Subject: [Linux-cluster] Detecting Windows Xen VM crash in RHCS2 (EL5) w/
	rgmanager
Message-ID: <4E1B9057.9010906@alteeve.com>

Hi all,

   Doing some testing, I found that rgmanager detects a crash in my *nix 
VMs and properly restarts them. However, when I BSOD a Windows domU, the 
VM is left alone.

   Is it possible to have RGManager tell when a windows VM dies? If so, 
what would the magical incantation be?

   All packages from the stock repos:
EL5.6
Xen 3.1
CMAN+RGmanager

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"I feel confined, only free to expand myself within boundaries."



From Colin.Simpson at iongeo.com  Tue Jul 12 01:29:56 2011
From: Colin.Simpson at iongeo.com (Colin Simpson)
Date: Tue, 12 Jul 2011 02:29:56 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
	NULL pointer deference
In-Reply-To: <20110711120546.GA26712@fieldses.org>
References: <20110711120546.GA26712@fieldses.org>
Message-ID: <1310434196.11870.17.camel@shyster>

OK, so my question is, is there any other reason apart from the risk of
individual file corruption from locking being incompatible between
local/samba vs NFS that may lead to issues i.e. we aren't really
interested in locking working between NFS and local/samba access just
that it works consistently in NFS when accessing files that way (with a
single node server) and locally/samba when accessing files that way.

I mean I'm thinking of, for example, I have a build that generates
source code via NFS then some time later a PC comes in via Samba and
accesses these files for building on that environment. The two systems
aren't requiring locking to work cross platform/protocol, just need to
be exported to the two systems. But locking on each one separately is
useful. 

If there are and we should be using all access via NFS on NFS exported
filesystems, one issue that also springs to mind is commercial backup
systems that support GFS2 but don't support backing up via NFS.

Is there anything else I should know about GFS2 limitations?
Is there a book "GFS: The Missing Manual"? :)

Thanks

Colin

On Mon, 2011-07-11 at 13:05 +0100, J. Bruce Fields wrote:
> On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse wrote:
> > Hi,
> >
> > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote:
> > > On 08/07/11 22:09, J. Bruce Fields wrote:
> > >
> > > > With default mount options, the linux NFS client (like most NFS
> clients)
> > > > assumes that a file has a most one writer at a time.
> (Applications that
> > > > need to do write-sharing over NFS need to use file locking.)
> > >
> > > The problem is that file locking on V3 isn't passed back down to
> the
> > > filesystem - hence the issues with nfs vs samba (or local disk
> > > access(*)) on the same server.
> 
> The NFS server *does* acquire locks on the exported filesystem (and
> does
> it the same way for v2, v3, and v4).
> 
> For local filesystems (ext3, xfs, btrfs), this is sufficient.
> 
> For exports of cluster filesystems like gfs2, there are more
> complicated
> problems that, as Steve says, will require some work to do to fix.
> 
> Samba is a more complicated issue due to the imperfect match between
> Windows and Linux locking semantics, but depending on how it's
> configured Samba will also acquire locks on the exported filesystem.
> 
> --b.
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 

This email and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed.  If you are not the original recipient or the person responsible for delivering the email to the intended recipient, be advised that you have received this email in error, and that any use, dissemination, forwarding, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately notify the sender and delete the original.





From jpolo at wtransnet.com  Tue Jul 12 12:49:54 2011
From: jpolo at wtransnet.com (Javi Polo)
Date: Tue, 12 Jul 2011 14:49:54 +0200
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <4E173786.1060507@wtransnet.com>
References: <4E172514.7060806@wtransnet.com> <1310142138.2705.37.camel@menhir>
	<4E173786.1060507@wtransnet.com>
Message-ID: <4E1C42F2.6020609@wtransnet.com>

El 07/08/11 18:59, Javi Polo escribi?:
>
>> [6235083.656954] BUG: unable to handle kernel NULL pointer dereference
>>> this should not happen. It looks like we are trying to look up 
>>> something
>>> that is 24 (hex) bytes into a structure. Does the fs have posix acls
>>> enabled or selinux or something else using xattrs?
>
> Nope, at least as far as I know. As I dont usually use ubuntu, I have 
> checked to see if it had selinux enabled by default, or some ACLs 
> related thing, but it seems it's not ....
>
Anyone could hint me with this matter?
I'm not using selinux nor xattrs nor posix acls, just a plain gfs2 
filesystem with 3 journals ...

thx

-- 
Javi Polo
Administrador de Sistemas
Tel  93 734 97 70
Fax 93 734 97 71
jpolo at wtransnet.com



From Colin.Simpson at iongeo.com  Tue Jul 12 18:52:51 2011
From: Colin.Simpson at iongeo.com (Colin Simpson)
Date: Tue, 12 Jul 2011 19:52:51 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle
	kernelNULL pointer deference
In-Reply-To: <1310434196.11870.17.camel@shyster>
References: <1310434196.11870.17.camel@shyster>
Message-ID: <1310496771.15833.86.camel@bhac.iouk.ioroot.tld>

I just ask this as I have a cluster where we wish to share a project
directories and home dirs and have them accessible by Linux clients via
NFS and PC's via Samba. As I say the locking cross OS doesn't matter. 

And using 2.6.32-71.24.1.el6.x86_64 kernel we are seeing the kernel
often panicing (every week or so) on one node. Could this be the cause?

It's hard to catch as the fencing has stopped me so far getting a good
core (and the change to crashkernel param which changed in 6.1 the new
param doesn't play with the old kernel) . Plus I guess I need to see if
it happens on the latest kernels, but they are worse for me due to
BZ#712139. I guess the first thing I'll get from support is try the
latest hotfix kernel (which I can only get once I've tested the test
kernel). Also plus long fence intervals aren't great to capture. 

So is it time for me to look at going back to ext4 for an HA file
server.

Can anyone from RH tell me if I'm wasting my time even trying this on
GFS2 (that I will get instability and kernel crashes)? 

Really unfortunate if so, as I really like the setup when it's
working.....

Also, after a node crashes some GFS mounts aren't too happy, they take a
long time to mount back on the original failed node. The filesystems are
dirty when we fsck them lots of 

Ondisk and fsck bitmaps differ at block 109405952 (0x6856700) 
Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free)
Metadata type is 0 (free)

Some differences in free space etc

Can anyone from RH tell me if I'm wasting my time even trying this on
GFS2 (that I will get GFS2 instability and kernel crashes)? 

Thanks

Colin

On Tue, 2011-07-12 at 02:29 +0100, Colin Simpson wrote:
> OK, so my question is, is there any other reason apart from the risk
> of
> individual file corruption from locking being incompatible between
> local/samba vs NFS that may lead to issues i.e. we aren't really
> interested in locking working between NFS and local/samba access just
> that it works consistently in NFS when accessing files that way (with
> a
> single node server) and locally/samba when accessing files that way.
> 
> I mean I'm thinking of, for example, I have a build that generates
> source code via NFS then some time later a PC comes in via Samba and
> accesses these files for building on that environment. The two systems
> aren't requiring locking to work cross platform/protocol, just need to
> be exported to the two systems. But locking on each one separately is
> useful.
> 
> If there are and we should be using all access via NFS on NFS exported
> filesystems, one issue that also springs to mind is commercial backup
> systems that support GFS2 but don't support backing up via NFS.
> 
> Is there anything else I should know about GFS2 limitations?
> Is there a book "GFS: The Missing Manual"? :)
> 
> Thanks
> 
> Colin
> 
> On Mon, 2011-07-11 at 13:05 +0100, J. Bruce Fields wrote:
> > On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse wrote:
> > > Hi,
> > >
> > > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote:
> > > > On 08/07/11 22:09, J. Bruce Fields wrote:
> > > >
> > > > > With default mount options, the linux NFS client (like most
> NFS
> > clients)
> > > > > assumes that a file has a most one writer at a time.
> > (Applications that
> > > > > need to do write-sharing over NFS need to use file locking.)
> > > >
> > > > The problem is that file locking on V3 isn't passed back down to
> > the
> > > > filesystem - hence the issues with nfs vs samba (or local disk
> > > > access(*)) on the same server.
> >
> > The NFS server *does* acquire locks on the exported filesystem (and
> > does
> > it the same way for v2, v3, and v4).
> >
> > For local filesystems (ext3, xfs, btrfs), this is sufficient.
> >
> > For exports of cluster filesystems like gfs2, there are more
> > complicated
> > problems that, as Steve says, will require some work to do to fix.
> >
> > Samba is a more complicated issue due to the imperfect match between
> > Windows and Linux locking semantics, but depending on how it's
> > configured Samba will also acquire locks on the exported filesystem.
> >
> > --b.
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> 
> This email and any files transmitted with it are confidential and are
> intended solely for the use of the individual or entity to whom they
> are addressed.  If you are not the original recipient or the person
> responsible for delivering the email to the intended recipient, be
> advised that you have received this email in error, and that any use,
> dissemination, forwarding, printing, or copying of this email is
> strictly prohibited. If you received this email in error, please
> immediately notify the sender and delete the original.
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 




From swap_project at yahoo.com  Tue Jul 12 19:07:26 2011
From: swap_project at yahoo.com (Srija)
Date: Tue, 12 Jul 2011 12:07:26 -0700 (PDT)
Subject: [Linux-cluster] totem token parameter setting
In-Reply-To: <20110704.122551.868642282278092140.yamato@redhat.com>
Message-ID: <1310497646.27199.YahooMailClassic@web112816.mail.gq1.yahoo.com>

Hi,

In a  sixteen node cluster , I need to add  <totem token="20000"/>. 
It needs the cluster nodes  to be rebooted. 

In the test lab ( three node clusers) I modify  this parameter rebooting one node  at a time. Each node has zen guests. So before rebooting , moving guests to different node. In that way the guests are not being effected for the change over.  

Is it ok to reboot the server  one at a time or it needs the whole cluster nodes  down then bring up  one by one? Heard  that  for this parameters, it is suggested to bring down the whole cluster node, then bring up. 

Pl. advice.

Thanks  again.



From fdinitto at redhat.com  Wed Jul 13 08:26:19 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Wed, 13 Jul 2011 10:26:19 +0200
Subject: [Linux-cluster] cluster 3.1.5 release
Message-ID: <4E1D56AB.9080907@redhat.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Welcome to the cluster 3.1.5 release.

This release addresses two issues in ccs_update_schema and relax the
requirements on fence-agents and resource-agents that were erroneously
introduced in the 3.1.4 release. It is still highly recommended to
update both fence-agents and resource-agents.

The new source tarball can be downloaded here:

https://fedorahosted.org/releases/c/l/cluster/cluster-3.1.5.tar.xz

ChangeLog:

https://fedorahosted.org/releases/c/l/cluster/Changelog-3.1.5

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

Happy clustering,
Fabio
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBCAAGBQJOHVaqAAoJEFA6oBJjVJ+Oq7IP/izCXBx94D7j1rqpAiB1wJca
5oX86vBJYOWGRA3iac7qx/RLJnd6yO60UFZ3X98vgx+pid1HLEf9Z5OYVmyVfhck
agWchsxL995PbDo3Yc/uGZjtmAmy2XEbF9tAb/UtXvt/6/YEi/vDQiRCeBgX67pB
NHTj9Pl3R68VgRlKve/68VrB7zmkkzJWfy8Xz2UARZ27A+qU2wWw/4i9ee/wtUc0
vFQJfFEVy5YtqZL+P2sga0G6ZxJOHugY0fbzgQMLjv/k+aeAZV/wVBmEoyvshGqZ
YyH3EMzaxPxvgk8XKF8dRvq5PtbMpz0GOYvpsIj59FYUAcJ7ElGpsvIlWWn+VJZ9
RsHHddUvu+iZkH7Xmyz39AjZyiFhwwR/7qD1mnPWFMBQGdbwZU/k5+VeImw1Getn
HTVI2J0g7r0waWDJodm9hTXl97yLvkQwvreyl2UzdueS3sqD7J7+Z5GhHoVc3xEt
9WmvU/TY/oaClcZGBnPQuLhxAwOkb3v0W+gEUEUZ+TCSkH0o5yQoIkJL/Qo0DtHQ
AbxEc2xzXuzXKYLBH8Ce8gntJDDIojg0e19YJvtt/FRjA4L6S/CyqzafsJMVa3Pv
0s3AIYc7BdgmxCnx1MqxG13VKf1/huICwR8bj0VEZ4cdTxKSzgsXnivw9H2OFNn6
VL7+BR6SX6Bht1oo0xqe
=j+xc
-----END PGP SIGNATURE-----



From swhiteho at redhat.com  Wed Jul 13 15:53:59 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 13 Jul 2011 16:53:59 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle
 kernelNULL pointer deference
In-Reply-To: <1310496771.15833.86.camel@bhac.iouk.ioroot.tld>
References: <1310434196.11870.17.camel@shyster>
	<1310496771.15833.86.camel@bhac.iouk.ioroot.tld>
Message-ID: <1310572439.3237.49.camel@menhir>

Hi,

On Tue, 2011-07-12 at 19:52 +0100, Colin Simpson wrote:
> I just ask this as I have a cluster where we wish to share a project
> directories and home dirs and have them accessible by Linux clients via
> NFS and PC's via Samba. As I say the locking cross OS doesn't matter. 
> 
If it doesn't matter, then you can set the "localflocks" mount option on
GFS2 and each mount will act as if it was a single node filesystem so
far as locking is concerned. From a support perspective that config
(active/active NFS and Samba) is not supported (by Red Hat), because we
don't test it, because generally you do need locking in order to make it
safe wrt to accesses between NFS/Samba.

It is something where we'd like to expand our support in future though,
and the more requests we receive the better idea we get of exactly what
use cases people require, and thus where to spend our time.

> And using 2.6.32-71.24.1.el6.x86_64 kernel we are seeing the kernel
> often panicing (every week or so) on one node. Could this be the cause?
> 
It shouldn't be. If the set up you've described causes problems then
they will be in terms of coherency between the NFS and Samba exports, if
you've got a panic then thats something else entirely.

> It's hard to catch as the fencing has stopped me so far getting a good
> core (and the change to crashkernel param which changed in 6.1 the new
> param doesn't play with the old kernel) . Plus I guess I need to see if
> it happens on the latest kernels, but they are worse for me due to
> BZ#712139. I guess the first thing I'll get from support is try the
> latest hotfix kernel (which I can only get once I've tested the test
> kernel). Also plus long fence intervals aren't great to capture. 
> 
Do you not get messages in syslog? Thats the first thing to look at,
getting a core is helpful, but often not essential in kernel debugging.

> So is it time for me to look at going back to ext4 for an HA file
> server.
> 
> Can anyone from RH tell me if I'm wasting my time even trying this on
> GFS2 (that I will get instability and kernel crashes)? 
> 
> Really unfortunate if so, as I really like the setup when it's
> working.....
> 
> Also, after a node crashes some GFS mounts aren't too happy, they take a
> long time to mount back on the original failed node. The filesystems are
> dirty when we fsck them lots of 
> 
> Ondisk and fsck bitmaps differ at block 109405952 (0x6856700) 
> Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free)
> Metadata type is 0 (free)
> 
> Some differences in free space etc
> 
> Can anyone from RH tell me if I'm wasting my time even trying this on
> GFS2 (that I will get GFS2 instability and kernel crashes)? 
> 
I suspect that it will not work exactly as you expect due to potential
coherency issues, but you still should not be getting kernel crashes
either way,

Steve.

> Thanks
> 
> Colin
> 
> On Tue, 2011-07-12 at 02:29 +0100, Colin Simpson wrote:
> > OK, so my question is, is there any other reason apart from the risk
> > of
> > individual file corruption from locking being incompatible between
> > local/samba vs NFS that may lead to issues i.e. we aren't really
> > interested in locking working between NFS and local/samba access just
> > that it works consistently in NFS when accessing files that way (with
> > a
> > single node server) and locally/samba when accessing files that way.
> > 
> > I mean I'm thinking of, for example, I have a build that generates
> > source code via NFS then some time later a PC comes in via Samba and
> > accesses these files for building on that environment. The two systems
> > aren't requiring locking to work cross platform/protocol, just need to
> > be exported to the two systems. But locking on each one separately is
> > useful.
> > 
> > If there are and we should be using all access via NFS on NFS exported
> > filesystems, one issue that also springs to mind is commercial backup
> > systems that support GFS2 but don't support backing up via NFS.
> > 
> > Is there anything else I should know about GFS2 limitations?
> > Is there a book "GFS: The Missing Manual"? :)
> > 
> > Thanks
> > 
> > Colin
> > 
> > On Mon, 2011-07-11 at 13:05 +0100, J. Bruce Fields wrote:
> > > On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse wrote:
> > > > Hi,
> > > >
> > > > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote:
> > > > > On 08/07/11 22:09, J. Bruce Fields wrote:
> > > > >
> > > > > > With default mount options, the linux NFS client (like most
> > NFS
> > > clients)
> > > > > > assumes that a file has a most one writer at a time.
> > > (Applications that
> > > > > > need to do write-sharing over NFS need to use file locking.)
> > > > >
> > > > > The problem is that file locking on V3 isn't passed back down to
> > > the
> > > > > filesystem - hence the issues with nfs vs samba (or local disk
> > > > > access(*)) on the same server.
> > >
> > > The NFS server *does* acquire locks on the exported filesystem (and
> > > does
> > > it the same way for v2, v3, and v4).
> > >
> > > For local filesystems (ext3, xfs, btrfs), this is sufficient.
> > >
> > > For exports of cluster filesystems like gfs2, there are more
> > > complicated
> > > problems that, as Steve says, will require some work to do to fix.
> > >
> > > Samba is a more complicated issue due to the imperfect match between
> > > Windows and Linux locking semantics, but depending on how it's
> > > configured Samba will also acquire locks on the exported filesystem.
> > >
> > > --b.
> > >
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
> > >
> > >
> > 
> > This email and any files transmitted with it are confidential and are
> > intended solely for the use of the individual or entity to whom they
> > are addressed.  If you are not the original recipient or the person
> > responsible for delivering the email to the intended recipient, be
> > advised that you have received this email in error, and that any use,
> > dissemination, forwarding, printing, or copying of this email is
> > strictly prohibited. If you received this email in error, please
> > immediately notify the sender and delete the original.
> > 
> > 
> > 
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> > 
> > 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From laszlo.budai at gmail.com  Wed Jul 13 16:07:22 2011
From: laszlo.budai at gmail.com (Budai Laszlo)
Date: Wed, 13 Jul 2011 19:07:22 +0300
Subject: [Linux-cluster] clurgmgrd[XXXX]: <err> Error storing ip: Duplicate
Message-ID: <4E1DC2BA.8040701@gmail.com>

Hello everyone,


I was asked to investigate why the rgmanager is not running on a red hat
cluster. The cluster is on RHEL 5.3.
#rpm -q cman rgmanager
cman-2.0.98-1.el5
rgmanager-2.0.46-1.el5

Currently cman is running, rgmanager not.

# clustat
Cluster Status for prod-clust1 @ Wed Jul 13 15:42:16 2011
Member Status: Quorate

 Member Name                                     ID   Status
 ------ ----                                     ---- ------
 pnl-p                                               1 Online
 psd-p                                               2 Online, Local


# cman_tool status
Version: 6.1.0
Config Version: 14
Cluster Name: prod-clust1
Cluster Id: 3382
Cluster Member: Yes
Cluster Generation: 1136
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Quorum: 1
Active subsystems: 7
Flags: 2node Dirty
Ports Bound: 0
Node name: pnl-p
Node ID: 1
Multicast addresses: 224.0.0.1
Node addresses: 10.0.0.2


# cman_tool services
type             level name     id       state
fence            0     default  00010002 none


# service rgmanager status
clurgmgrd dead but pid file exists

this is the situation on both nodes.

for one of the nodes I cannot see any message from rgmanager, and it was
confirmed that the error is older then the oldest log file.
on the other node I can see the messages when rgmanager was started
(after reboot) and here they are:

messages.3:Jun 26 02:00:13 node-pnl-01 clurgmgrd[8720]: <notice>
Resource Group Manager Starting
messages.3:Jun 26 02:00:13 node-pnl-01 clurgmgrd[8720]: <err> Error
storing ip: Duplicate



my question is what the second line means and what are the consequences?
Is it possible that a duplicate IP would shut down the rgmanager?
because after a few seconds (25 seconds as we can see) I can see the
following:

messages.3:Jun 26 02:00:35 node-pnl-01 clurgmgrd[8720]: <notice>
Shutting down

and later on:

messages.3:Jun 26 02:00:59 node-pnl-01 clurgmgrd[8720]: <notice>
Shutdown complete, exiting

Right now it is not an option to start the rgmanager and test. I have to
figure it out only from the log files.

Thank you in advance for any ideas.

Laszlo



From Colin.Simpson at iongeo.com  Wed Jul 13 16:41:33 2011
From: Colin.Simpson at iongeo.com (Colin Simpson)
Date: Wed, 13 Jul 2011 17:41:33 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle
	kernelNULL pointer deference
In-Reply-To: <1310572439.3237.49.camel@menhir>
References: <1310434196.11870.17.camel@shyster><1310496771.15833.86.camel@bhac.iouk.ioroot.tld>
	<1310572439.3237.49.camel@menhir>
Message-ID: <1310575293.32040.39.camel@bhac.iouk.ioroot.tld>

Hi

I'd be looking at wanting to do single node NFS (active/passive,
failover) but we are currently running with CTDB Samba on both nodes for
this same directory. Would that work with "localflocks" and/or be
supported in such a config? 

I'm thinking the clustered samba would also have to go in such a config
using "localflocks".

Sadly the messages file says nothing at all, apart from one node reports
the other node isn't responding and it fences it. There are kdump disks
on the nodes, but the RHEL 6.1 update changed the kernel param to
autocrashkernel=auto and this doesn't work with the 6.0 kernel (just  we
are currently running due to another bug in 6.1's latest kernel. I seem
to remember I used "crashkernel=512M-2G:64M,2G-:128M" with the older
kernels, but I can't remember. Maybe I should try that again but the
only was I know to get a kdump is to set a large fence delay. 

Thanks

Colin

On Wed, 2011-07-13 at 16:53 +0100, Steven Whitehouse wrote:
> Hi,
> 
> On Tue, 2011-07-12 at 19:52 +0100, Colin Simpson wrote:
> > I just ask this as I have a cluster where we wish to share a project
> > directories and home dirs and have them accessible by Linux clients
> via
> > NFS and PC's via Samba. As I say the locking cross OS doesn't
> matter.
> >
> If it doesn't matter, then you can set the "localflocks" mount option
> on
> GFS2 and each mount will act as if it was a single node filesystem so
> far as locking is concerned. From a support perspective that config
> (active/active NFS and Samba) is not supported (by Red Hat), because
> we
> don't test it, because generally you do need locking in order to make
> it
> safe wrt to accesses between NFS/Samba.
> 
> It is something where we'd like to expand our support in future
> though,
> and the more requests we receive the better idea we get of exactly
> what
> use cases people require, and thus where to spend our time.
> 
> > And using 2.6.32-71.24.1.el6.x86_64 kernel we are seeing the kernel
> > often panicing (every week or so) on one node. Could this be the
> cause?
> >
> It shouldn't be. If the set up you've described causes problems then
> they will be in terms of coherency between the NFS and Samba exports,
> if
> you've got a panic then thats something else entirely.
> 
> > It's hard to catch as the fencing has stopped me so far getting a
> good
> > core (and the change to crashkernel param which changed in 6.1 the
> new
> > param doesn't play with the old kernel) . Plus I guess I need to see
> if
> > it happens on the latest kernels, but they are worse for me due to
> > BZ#712139. I guess the first thing I'll get from support is try the
> > latest hotfix kernel (which I can only get once I've tested the test
> > kernel). Also plus long fence intervals aren't great to capture.
> >
> Do you not get messages in syslog? Thats the first thing to look at,
> getting a core is helpful, but often not essential in kernel
> debugging.
> 
> > So is it time for me to look at going back to ext4 for an HA file
> > server.
> >
> > Can anyone from RH tell me if I'm wasting my time even trying this
> on
> > GFS2 (that I will get instability and kernel crashes)?
> >
> > Really unfortunate if so, as I really like the setup when it's
> > working.....
> >
> > Also, after a node crashes some GFS mounts aren't too happy, they
> take a
> > long time to mount back on the original failed node. The filesystems
> are
> > dirty when we fsck them lots of
> >
> > Ondisk and fsck bitmaps differ at block 109405952 (0x6856700)
> > Ondisk status is 1 (Data) but FSCK thinks it should be 0 (Free)
> > Metadata type is 0 (free)
> >
> > Some differences in free space etc
> >
> > Can anyone from RH tell me if I'm wasting my time even trying this
> on
> > GFS2 (that I will get GFS2 instability and kernel crashes)?
> >
> I suspect that it will not work exactly as you expect due to potential
> coherency issues, but you still should not be getting kernel crashes
> either way,
> 
> Steve.
> 
> > Thanks
> >
> > Colin
> >
> > On Tue, 2011-07-12 at 02:29 +0100, Colin Simpson wrote:
> > > OK, so my question is, is there any other reason apart from the
> risk
> > > of
> > > individual file corruption from locking being incompatible between
> > > local/samba vs NFS that may lead to issues i.e. we aren't really
> > > interested in locking working between NFS and local/samba access
> just
> > > that it works consistently in NFS when accessing files that way
> (with
> > > a
> > > single node server) and locally/samba when accessing files that
> way.
> > >
> > > I mean I'm thinking of, for example, I have a build that generates
> > > source code via NFS then some time later a PC comes in via Samba
> and
> > > accesses these files for building on that environment. The two
> systems
> > > aren't requiring locking to work cross platform/protocol, just
> need to
> > > be exported to the two systems. But locking on each one separately
> is
> > > useful.
> > >
> > > If there are and we should be using all access via NFS on NFS
> exported
> > > filesystems, one issue that also springs to mind is commercial
> backup
> > > systems that support GFS2 but don't support backing up via NFS.
> > >
> > > Is there anything else I should know about GFS2 limitations?
> > > Is there a book "GFS: The Missing Manual"? :)
> > >
> > > Thanks
> > >
> > > Colin
> > >
> > > On Mon, 2011-07-11 at 13:05 +0100, J. Bruce Fields wrote:
> > > > On Mon, Jul 11, 2011 at 11:43:58AM +0100, Steven Whitehouse
> wrote:
> > > > > Hi,
> > > > >
> > > > > On Mon, 2011-07-11 at 09:30 +0100, Alan Brown wrote:
> > > > > > On 08/07/11 22:09, J. Bruce Fields wrote:
> > > > > >
> > > > > > > With default mount options, the linux NFS client (like
> most
> > > NFS
> > > > clients)
> > > > > > > assumes that a file has a most one writer at a time.
> > > > (Applications that
> > > > > > > need to do write-sharing over NFS need to use file
> locking.)
> > > > > >
> > > > > > The problem is that file locking on V3 isn't passed back
> down to
> > > > the
> > > > > > filesystem - hence the issues with nfs vs samba (or local
> disk
> > > > > > access(*)) on the same server.
> > > >
> > > > The NFS server *does* acquire locks on the exported filesystem
> (and
> > > > does
> > > > it the same way for v2, v3, and v4).
> > > >
> > > > For local filesystems (ext3, xfs, btrfs), this is sufficient.
> > > >
> > > > For exports of cluster filesystems like gfs2, there are more
> > > > complicated
> > > > problems that, as Steve says, will require some work to do to
> fix.
> > > >
> > > > Samba is a more complicated issue due to the imperfect match
> between
> > > > Windows and Linux locking semantics, but depending on how it's
> > > > configured Samba will also acquire locks on the exported
> filesystem.
> > > >
> > > > --b.
> > > >
> > > > --
> > > > Linux-cluster mailing list
> > > > Linux-cluster at redhat.com
> > > > https://www.redhat.com/mailman/listinfo/linux-cluster
> > > >
> > > >
> > >
> > > This email and any files transmitted with it are confidential and
> are
> > > intended solely for the use of the individual or entity to whom
> they
> > > are addressed.  If you are not the original recipient or the
> person
> > > responsible for delivering the email to the intended recipient, be
> > > advised that you have received this email in error, and that any
> use,
> > > dissemination, forwarding, printing, or copying of this email is
> > > strictly prohibited. If you received this email in error, please
> > > immediately notify the sender and delete the original.
> > >
> > >
> > >
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
> > >
> > >
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 




From member at linkedin.com  Thu Jul 14 01:19:33 2011
From: member at linkedin.com (Paul Morgan via LinkedIn)
Date: Thu, 14 Jul 2011 01:19:33 +0000 (UTC)
Subject: [Linux-cluster] Invitation to connect on LinkedIn
Message-ID: <366543265.17198061.1310606373490.JavaMail.app@ela4-bed79.prod>

LinkedIn
------------




    Paul Morgan requested to add you as a connection on LinkedIn:
  
------------------------------------------

Marian,

I'd like to add you to my professional network on LinkedIn.

- Paul

Accept invitation from Paul Morgan
http://www.linkedin.com/e/-odgn7o-gq3171wt-58/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2959652231_2/1BpC5vrmRLoRZcjkkZt5YCpnlOt3RApnhMpmdzgmhxrSNBszYOnP4Pcz8RdzARej99bPxEpkISiz91bP4Nd3sUe34PdjcLrCBxbOYWrSlI/EML_comm_afe/

View invitation from Paul Morgan
http://www.linkedin.com/e/-odgn7o-gq3171wt-58/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2959652231_2/39vcjcOczkSejkVcAALqnpPbOYWrSlI/svi/ 
------------------------------------------

DID YOU KNOW you can showcase your professional knowledge on LinkedIn to receive job/consulting offers and enhance your professional reputation? Posting replies to questions on LinkedIn Answers puts you in front of the world's professional community.
http://www.linkedin.com/e/-odgn7o-gq3171wt-58/abq/inv-24/

 
-- 
(c) 2011, LinkedIn Corporation
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110714/0c089233/attachment.htm>

From laszlo.budai at gmail.com  Thu Jul 14 15:42:55 2011
From: laszlo.budai at gmail.com (Budai Laszlo)
Date: Thu, 14 Jul 2011 18:42:55 +0300
Subject: [Linux-cluster] question about failoverdomains
Message-ID: <4E1F0E7F.5080405@gmail.com>

Hi all,

Please somebody tell me what is the behavior for the following
failoverdomains configuration:

<failoverdomains>
  <failoverdomain name="dom1" ordered="0" restricted="0">
      <failoverdomainnode name="pnl-p" priority="1"/>
   </failoverdomain>
   <failoverdomain name="dom2">
      <failoverdomainnode name="psd-p" priority="1"/>
   </failoverdomain>
</failoverdomains>


This is a two node cluster. the idea is to create a configuration where
some of the services prefer to run on one node, and other on the other node.
For this I would have created two prioritized failover domains with the
priorities set in such a way that in one domain the one node has higher
priority than the other.

My question related to this setup is:
if a node fails (let suppose pnl-p fails) then the services which are
configured to use dom1 will migrate to node psd-p. What will happen when
the node recovers? will that service fail back to that node?

Thank you,
Laszlo




From ajb2 at mssl.ucl.ac.uk  Thu Jul 14 15:52:44 2011
From: ajb2 at mssl.ucl.ac.uk (Alan Brown)
Date: Thu, 14 Jul 2011 16:52:44 +0100
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to
 handle	kernelNULL pointer deference
In-Reply-To: <1310575293.32040.39.camel@bhac.iouk.ioroot.tld>
References: <1310434196.11870.17.camel@shyster><1310496771.15833.86.camel@bhac.iouk.ioroot.tld>	<1310572439.3237.49.camel@menhir>
	<1310575293.32040.39.camel@bhac.iouk.ioroot.tld>
Message-ID: <4E1F10CC.40406@mssl.ucl.ac.uk>


 > Maybe I should try that again but the
> only was I know to get a kdump is to set a large fence delay. 

This is what I'd expect. We also found the fence delay has to be long 
enough to allow the crashdump to be written out.

The only alternatives to speed this up are to use _very_ fast disk for 
the vmcore or dump to another machine over 10Gb links. Either way I 
suspect it'll max out around 30Mb/s if you have compression turned on. :(







From linux at alteeve.com  Thu Jul 14 15:52:49 2011
From: linux at alteeve.com (Digimer)
Date: Thu, 14 Jul 2011 11:52:49 -0400
Subject: [Linux-cluster] question about failoverdomains
In-Reply-To: <4E1F0E7F.5080405@gmail.com>
References: <4E1F0E7F.5080405@gmail.com>
Message-ID: <4E1F10D1.4020900@alteeve.com>

On 07/14/2011 11:42 AM, Budai Laszlo wrote:
> Hi all,
> 
> Please somebody tell me what is the behavior for the following
> failoverdomains configuration:
> 
> <failoverdomains>
>   <failoverdomain name="dom1" ordered="0" restricted="0">
>       <failoverdomainnode name="pnl-p" priority="1"/>
>    </failoverdomain>
>    <failoverdomain name="dom2">
>       <failoverdomainnode name="psd-p" priority="1"/>
>    </failoverdomain>
> </failoverdomains>
> 
> 
> This is a two node cluster. the idea is to create a configuration where
> some of the services prefer to run on one node, and other on the other node.
> For this I would have created two prioritized failover domains with the
> priorities set in such a way that in one domain the one node has higher
> priority than the other.
> 
> My question related to this setup is:
> if a node fails (let suppose pnl-p fails) then the services which are
> configured to use dom1 will migrate to node psd-p. What will happen when
> the node recovers? will that service fail back to that node?
> 
> Thank you,
> Laszlo

These two domains will not offer failover. The domain contains only one
node, so any services set to use either domain will only run on that
given node. This is useful the cluster to starting local service (ie:
clvmd, gfs2, etc).

To get an ordered domain, define both nodes within a given domain, set
'ordered="1"' and then set different 'priority="<int>"'. The node with
the lower priority will be the "preferred node" and the other will be
the fall-back.

Pay attention to the "nofailback" option as well.

See: https://fedorahosted.org/cluster/wiki/RGManager

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"I feel confined, only free to expand myself within boundaries."



From laszlo.budai at gmail.com  Thu Jul 14 20:38:50 2011
From: laszlo.budai at gmail.com (Budai Laszlo)
Date: Thu, 14 Jul 2011 23:38:50 +0300
Subject: [Linux-cluster] question about failoverdomains
In-Reply-To: <4E1F10D1.4020900@alteeve.com>
References: <4E1F0E7F.5080405@gmail.com> <4E1F10D1.4020900@alteeve.com>
Message-ID: <4E1F53DA.3070000@gmail.com>

Thank you for the link. I've found the answer to my question there.
Actually on this page: https://fedorahosted.org/cluster/wiki/FailoverDomains
And the answer is: Yes, the service will migrate back (fail back) to the
node which is member of its own domain.
On the other side, you are wrong about the fact that these domains does
not offer failover. That would be true if the domains were restricted.

Kind regards,
Laszlo

On 07/14/2011 06:52 PM, Digimer wrote:
> On 07/14/2011 11:42 AM, Budai Laszlo wrote:
>> Hi all,
>>
>> Please somebody tell me what is the behavior for the following
>> failoverdomains configuration:
>>
>> <failoverdomains>
>>   <failoverdomain name="dom1" ordered="0" restricted="0">
>>       <failoverdomainnode name="pnl-p" priority="1"/>
>>    </failoverdomain>
>>    <failoverdomain name="dom2">
>>       <failoverdomainnode name="psd-p" priority="1"/>
>>    </failoverdomain>
>> </failoverdomains>
>>
>>
>> This is a two node cluster. the idea is to create a configuration where
>> some of the services prefer to run on one node, and other on the other node.
>> For this I would have created two prioritized failover domains with the
>> priorities set in such a way that in one domain the one node has higher
>> priority than the other.
>>
>> My question related to this setup is:
>> if a node fails (let suppose pnl-p fails) then the services which are
>> configured to use dom1 will migrate to node psd-p. What will happen when
>> the node recovers? will that service fail back to that node?
>>
>> Thank you,
>> Laszlo
> These two domains will not offer failover. The domain contains only one
> node, so any services set to use either domain will only run on that
> given node. This is useful the cluster to starting local service (ie:
> clvmd, gfs2, etc).
>
> To get an ordered domain, define both nodes within a given domain, set
> 'ordered="1"' and then set different 'priority="<int>"'. The node with
> the lower priority will be the "preferred node" and the other will be
> the fall-back.
>
> Pay attention to the "nofailback" option as well.
>
> See: https://fedorahosted.org/cluster/wiki/RGManager
>



From ifetch at du.edu  Sat Jul 16 05:06:19 2011
From: ifetch at du.edu (Ivan Fetch)
Date: Fri, 15 Jul 2011 23:06:19 -0600
Subject: [Linux-cluster] Multi-homing in rhel5
In-Reply-To: <AANLkTimAesyx5mz4eRrcVKj9O0ZeeoUxHEo-kBMge0A3@mail.gmail.com>
References: <AANLkTimAesyx5mz4eRrcVKj9O0ZeeoUxHEo-kBMge0A3@mail.gmail.com>
Message-ID: <8961ACA1-6DB3-4987-8281-664CF84C6957@du.edu>

Hi COrey,

On Feb 3, 2011, at 3:49 AM, Corey Kovacs wrote:

> The cluster2 docs outline a procedure for multihoming which is
> unsupported by redhat.
> 
Which multihoming method is that? 

> Is anyone actually using this method or are people more inclined to
> use configs in which secondary interfaces are given names by which the
> cluster then uses them as primary config nodes.
> 
> For example, on my cluster I have eth0 as the primary interface for
> all normal system traffic, and eth1 as my cluster interconnect.
> 
> eth0 - nodename
> eth1 - nodename-clu <-- cluster config points to this as nodes....
> 
> clients access the cluster services via eth0.
> 
> I've seen other configs where people configure the cluster to use eth0
> for cluster coms so that ricci/luci work correctly, but I don't use
> those.
> 
> Is there an advantage of one method over the other ?


I am just getting started with RHCS, from Sun Cluster. I was planning to use private node host names as the primary cluster communication, using an ethernet bond of two NICs. I was also planning to have Luci available, until I am more adept at knowing what to add to cluster.conf by hand. Does Luci not function when the primary cluster communication is over a private node interconnect?


Thanks,

Ivan.























.



From post at michael-neubert.de  Mon Jul 18 11:57:58 2011
From: post at michael-neubert.de (Michael Neubert)
Date: Mon, 18 Jul 2011 13:57:58 +0200
Subject: [Linux-cluster] clean_start in combination with post_join_delay
Message-ID: <7c425127a2c120a67cfaa665751e1110.squirrel@www.michael-neubert.de>

Hello,

at the moment I am using a GFS2 setup with the following values for the
fencing daemon in the cluster config:

clean_start="1"
>> Documentation: "prevent any startup fencing the daemon might do"

post_join_delay="60"
>> Documentation: "number of seconds the daemon will wait before fencing
any victims after a node joins the domain"

If I set clean_start to 1 the post_join_delay paramter will have no sense
in my opinion. Or is there another reason I do not see, why to use
post_join_delay even though clean_start is set to 1?

Thanks for every reply.

Best wishes
Michael



From jpolo at wtransnet.com  Tue Jul 19 12:18:55 2011
From: jpolo at wtransnet.com (Javi Polo)
Date: Tue, 19 Jul 2011 14:18:55 +0200
Subject: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel
 NULL pointer deference
In-Reply-To: <4E1C42F2.6020609@wtransnet.com>
References: <4E172514.7060806@wtransnet.com>
	<1310142138.2705.37.camel@menhir>	<4E173786.1060507@wtransnet.com>
	<4E1C42F2.6020609@wtransnet.com>
Message-ID: <4E25762F.3070900@wtransnet.com>

El 07/12/11 14:49, Javi Polo escribi?:
>>> [6235083.656954] BUG: unable to handle kernel NULL pointer dereference
>>>> this should not happen. It looks like we are trying to look up 
>>>> something
>>>> that is 24 (hex) bytes into a structure. Does the fs have posix acls
>>>> enabled or selinux or something else using xattrs?
>>
>> Nope, at least as far as I know. As I dont usually use ubuntu, I have 
>> checked to see if it had selinux enabled by default, or some ACLs 
>> related thing, but it seems it's not ....
>>
>
> Anyone could hint me with this matter?
> I'm not using selinux nor xattrs nor posix acls, just a plain gfs2 
> filesystem with 3 journals ...
>

I finally manage to get it working by rolling back to an ubuntu 
linux-2.6.32-33-virtual kernel. I guess there's a problem in nfs 
modules, not in gfs2, because it crashed pretty much the same while 
testing with OCFS2

thanks you all :)

-- 
Javi Polo
Administrador de Sistemas
Tel  93 734 97 70
Fax 93 734 97 71
jpolo at wtransnet.com



From mcaubet at pic.es  Tue Jul 19 14:19:17 2011
From: mcaubet at pic.es (Marc Caubet)
Date: Tue, 19 Jul 2011 16:19:17 +0200
Subject: [Linux-cluster] Linux Cluster + KVM + LVM2
Message-ID: <CAPrERe2ghp-VnJc1VVjv=nM96OEnPD7f7WkwfJY5tTsNebTx+A@mail.gmail.com>

Hi,

we are testing RedHat Cluster to build a KVM virtualization infrastructure.
This is the first time we use the Linux Cluster so we are a little bit lost.
Hope someone can help.

Actually we have 2 hypervisors connected via Fiber Channel to a shared
storage (both servers see the same 15TB device /dev/mapper/mpathb).

Our idea is a shared storage to hold KVM virtual machines by using LVM2.
Both server should be able to run Virtual Machines from the same storage,
but we should be able to migrate or start virtual machines on the other
server node on crash.

So the plan is:

- Virtual Machine image = Logical Volume
- CLVM2 cluster: only one server node at the same time will be able to
manage the volume group-
- KVM Virtual Machine High Availability. Machines will run on one server
node. If for some reason the server node crashes, the second will start /
migrate the virtual machine.

Basically we woul like to know:

- How can we create a cluster for the LVM2 shared storage (when we create
it, it does not work since both server nodes have the VG as Active)
- How can we create a cluster service for a virtual machine (we guess it has
to be done 1 by 1)
- Since we have 2 server nodes, how to increase the number of votes for
quorum (qdisk over a heartbeat logical volume partition?)

Thanks and best regards,
-- 
Marc Caubet Serrabou
PIC (Port d'Informaci? Cient?fica)
Campus UAB, Edificio D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 22
Fax: +34 93 581 41 10
http://www.pic.es
Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110719/b452ccd0/attachment.htm>

From linux at alteeve.com  Tue Jul 19 14:50:52 2011
From: linux at alteeve.com (Digimer)
Date: Tue, 19 Jul 2011 10:50:52 -0400
Subject: [Linux-cluster] Linux Cluster + KVM + LVM2
In-Reply-To: <CAPrERe2ghp-VnJc1VVjv=nM96OEnPD7f7WkwfJY5tTsNebTx+A@mail.gmail.com>
References: <CAPrERe2ghp-VnJc1VVjv=nM96OEnPD7f7WkwfJY5tTsNebTx+A@mail.gmail.com>
Message-ID: <4E2599CC.4010503@alteeve.com>

On 07/19/2011 10:19 AM, Marc Caubet wrote:
> Hi,
> 
> we are testing RedHat Cluster to build a KVM virtualization
> infrastructure. This is the first time we use the Linux Cluster so we
> are a little bit lost. Hope someone can help.
> 
> Actually we have 2 hypervisors connected via Fiber Channel to a shared
> storage (both servers see the same 15TB device /dev/mapper/mpathb).
> 
> Our idea is a shared storage to hold KVM virtual machines by using LVM2.
> Both server should be able to run Virtual Machines from the same
> storage, but we should be able to migrate or start virtual machines on
> the other server node on crash.
> 
> So the plan is:
> 
> - Virtual Machine image = Logical Volume
> - CLVM2 cluster: only one server node at the same time will be able to
> manage the volume group-
> - KVM Virtual Machine High Availability. Machines will run on one server
> node. If for some reason the server node crashes, the second will start
> / migrate the virtual machine.
> 
> Basically we woul like to know:
> 
> - How can we create a cluster for the LVM2 shared storage (when we
> create it, it does not work since both server nodes have the VG as Active)
> - How can we create a cluster service for a virtual machine (we guess it
> has to be done 1 by 1)
> - Since we have 2 server nodes, how to increase the number of votes for
> quorum (qdisk over a heartbeat logical volume partition?)
> 
> Thanks and best regards,

You need to setup a cluster with fencing, which will let you then use
clustered LVM, clvmd, which in turn uses the distributed lock manager,
dlm. This will allow for the same LVs to be seen and used across cluster
nodes.

Then you will simply add the VMs as resources to rgmanager, which uses
(and sits on top of) corosync, which is itself the core of the cluster.

I'm guessing that you are using RHEL 6, so this may not map perfectly,
but I described how to build a similar Xen-based HA VM cluster on EL5.
The main differences are; Corosync instead of OpenAIS (changes nothing,
configuration wise), ignore DRBD as you have a proper SAN and replace
Xen with KVM. The small GFS2 partition is still recommended for central
storage of the VM definitions (needed for migration and recovery).
However, if you don't have a GFS2 license, you can manually keep the
configs in sync on matching local directories on the nodes.

See if this helps at all:

http://wiki.alteeve.com/index.php/Red_Hat_Cluster_Service_2_Tutorial

Best of luck. :)

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"At what point did we forget that the Space Shuttle was, essentially,
a program that strapped human beings to an explosion and tried to stab
through the sky with fire and math?"



From Ralph.Grothe at itdz-berlin.de  Wed Jul 20 07:06:42 2011
From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de)
Date: Wed, 20 Jul 2011 09:06:42 +0200
Subject: [Linux-cluster] Linux Cluster + KVM + LVM2
In-Reply-To: <CAPrERe2ghp-VnJc1VVjv=nM96OEnPD7f7WkwfJY5tTsNebTx+A@mail.gmail.com>
References: <CAPrERe2ghp-VnJc1VVjv=nM96OEnPD7f7WkwfJY5tTsNebTx+A@mail.gmail.com>
Message-ID: <A789DDB53ED7E94396E842EE2AC9B5FF01432996@itdzex101.ITDZ.verwalt-berlin.de>

Hi Marc,


though Digimer's RHCS tutorial is an excellent introducion to the
RHCS cluster with a thourough step by step reference to setting
up a Xen cluster, and I'd highly recommend you read it, you
probably also would like to have look at these articles in the
cluster wiki which focus a little more condensed on your
questions.

Here is described how to set up so called HA-LVM which avoids the
clmvd overhead and is for settings like yours where you only
require active/passive VG activation (i.e. a shared storage VG is
only activated on a single cluster node at any time). 
This is achieved by tagging of the affected VGs/LVs.

https://fedorahosted.org/cluster/wiki/LVMFailover

Unfortunately, this article lacks mentioning of the required
locking_type setting in lvm.conf.

But if you have access to RHN this article on HA-LVM does, and it
also outlines both methods, i.e. the so called "preferred clvmd"
method and the so called "original" (or what I'd call "tagging")
method which doesn't require clvmd:

https://access.redhat.com/kb/docs/DOC-3068


Should you on the other hand require active/active VGs (i.e.
simultaneous activation of the same shared VG on more than one
cluster node), which I consider not a requirement for a KVM
cluster (but I lack any experience in this field) the recommended
procedure is described here:

https://access.redhat.com/kb/docs/DOC-17651


Important aside, after you've edited the lvm.conf you are
required to make a new initial ramdisk (or at least touch the
mtime of the current initrd) or cluster services won't start on
that node (watch entries in messages or whereever syslogd logs
clulog stuff to).



In this article you may find something treating KVM VMs'
migration.

https://fedorahosted.org/cluster/wiki/KvmMigration



Regards
Ralph
(an RHCS newbie himself)
________________________________

	From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Marc
Caubet
	Sent: Tuesday, July 19, 2011 4:19 PM
	To: linux-cluster at redhat.com
	Subject: [Linux-cluster] Linux Cluster + KVM + LVM2
	
	
	Hi,
	
	we are testing RedHat Cluster to build a KVM
virtualization infrastructure. This is the first time we use the
Linux Cluster so we are a little bit lost. Hope someone can help.
	
	Actually we have 2 hypervisors connected via Fiber
Channel to a shared storage (both servers see the same 15TB
device /dev/mapper/mpathb).
	
	Our idea is a shared storage to hold KVM virtual machines
by using LVM2. Both server should be able to run Virtual Machines
from the same storage, but we should be able to migrate or start
virtual machines on the other server node on crash.
	
	So the plan is:
	
	- Virtual Machine image = Logical Volume
	- CLVM2 cluster: only one server node at the same time
will be able to manage the volume group-
	- KVM Virtual Machine High Availability. Machines will
run on one server node. If for some reason the server node
crashes, the second will start / migrate the virtual machine.
	
	Basically we woul like to know:
	
	- How can we create a cluster for the LVM2 shared storage
(when we create it, it does not work since both server nodes have
the VG as Active)
	- How can we create a cluster service for a virtual
machine (we guess it has to be done 1 by 1)
	- Since we have 2 server nodes, how to increase the
number of votes for quorum (qdisk over a heartbeat logical volume
partition?)
	
	Thanks and best regards,
	-- 
	Marc Caubet Serrabou
	PIC (Port d'Informaci? Cient?fica)
	Campus UAB, Edificio D
	E-08193 Bellaterra, Barcelona
	Tel: +34 93 581 33 22
	Fax: +34 93 581 41 10
	http://www.pic.es
	Avis - Aviso - Legal Notice:
http://www.ifae.es/legal.html
	




From mcaubet at pic.es  Wed Jul 20 09:29:41 2011
From: mcaubet at pic.es (Marc Caubet)
Date: Wed, 20 Jul 2011 11:29:41 +0200
Subject: [Linux-cluster] Linux Cluster + KVM + LVM2
In-Reply-To: <4E2599CC.4010503@alteeve.com>
References: <CAPrERe2ghp-VnJc1VVjv=nM96OEnPD7f7WkwfJY5tTsNebTx+A@mail.gmail.com>
	<4E2599CC.4010503@alteeve.com>
Message-ID: <CAPrERe1YS4XAA249bF54y94mJGvTJKVhT=zeyEqf8p6kb0uEig@mail.gmail.com>

Hi,

thanks a lot for your reply.

You need to setup a cluster with fencing, which will let you then use
> clustered LVM, clvmd, which in turn uses the distributed lock manager,
> dlm. This will allow for the same LVs to be seen and used across cluster
> nodes.
>

Ok. I will try this.


>
> Then you will simply add the VMs as resources to rgmanager, which uses
> (and sits on top of) corosync, which is itself the core of the cluster.
>

So each virtual machine will be a resource, is it right?


>
> I'm guessing that you are using RHEL 6, so this may not map perfectly,
> but I described how to build a similar Xen-based HA VM cluster on EL5.
> The main differences are; Corosync instead of OpenAIS (changes nothing,
> configuration wise), ignore DRBD as you have a proper SAN and replace
> Xen with KVM. The small GFS2 partition is still recommended for central
> storage of the VM definitions (needed for migration and recovery).
> However, if you don't have a GFS2 license, you can manually keep the
> configs in sync on matching local directories on the nodes.
>
> See if this helps at all:
>
> http://wiki.alteeve.com/index.php/Red_Hat_Cluster_Service_2_Tutorial
>

Actually we are using SL6 but we probably will migrate to RHEL6 if this
environment will be used as production infrastructure in the future. So we
will consider GFS2.

Thanks a lot for your answer.

Marc



>
> Best of luck. :)
>
> --
> Digimer
> E-Mail:              digimer at alteeve.com
> Freenode handle:     digimer
> Papers and Projects: http://alteeve.com
> Node Assassin:       http://nodeassassin.org
> "At what point did we forget that the Space Shuttle was, essentially,
> a program that strapped human beings to an explosion and tried to stab
> through the sky with fire and math?"
>




-- 
Marc Caubet Serrabou
PIC (Port d'Informaci? Cient?fica)
Campus UAB, Edificio D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 22
Fax: +34 93 581 41 10
http://www.pic.es
Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110720/abf60f23/attachment.htm>

From mcaubet at pic.es  Wed Jul 20 09:40:10 2011
From: mcaubet at pic.es (Marc Caubet)
Date: Wed, 20 Jul 2011 11:40:10 +0200
Subject: [Linux-cluster] Linux Cluster + KVM + LVM2
In-Reply-To: <A789DDB53ED7E94396E842EE2AC9B5FF01432996@itdzex101.ITDZ.verwalt-berlin.de>
References: <CAPrERe2ghp-VnJc1VVjv=nM96OEnPD7f7WkwfJY5tTsNebTx+A@mail.gmail.com>
	<A789DDB53ED7E94396E842EE2AC9B5FF01432996@itdzex101.ITDZ.verwalt-berlin.de>
Message-ID: <CAPrERe2kW1XfL2n8ytBAVVrdzOfjfkjTDL-0SgWkJKHgkn7Jxg@mail.gmail.com>

Hi,

thanks for your answer.

On 20 July 2011 09:06, <Ralph.Grothe at itdz-berlin.de> wrote:

> Hi Marc,
>
>
> though Digimer's RHCS tutorial is an excellent introducion to the
> RHCS cluster with a thourough step by step reference to setting
> up a Xen cluster, and I'd highly recommend you read it, you
> probably also would like to have look at these articles in the
> cluster wiki which focus a little more condensed on your
> questions.
>
> Here is described how to set up so called HA-LVM which avoids the
> clmvd overhead and is for settings like yours where you only
> require active/passive VG activation (i.e. a shared storage VG is
> only activated on a single cluster node at any time).
> This is achieved by tagging of the affected VGs/LVs.
>
> https://fedorahosted.org/cluster/wiki/LVMFailover
>
> Unfortunately, this article lacks mentioning of the required
> locking_type setting in lvm.conf.
>
> But if you have access to RHN this article on HA-LVM does, and it
> also outlines both methods, i.e. the so called "preferred clvmd"
> method and the so called "original" (or what I'd call "tagging")
> method which doesn't require clvmd:
>
> https://access.redhat.com/kb/docs/DOC-3068
>
>
> Should you on the other hand require active/active VGs (i.e.
> simultaneous activation of the same shared VG on more than one
> cluster node), which I consider not a requirement for a KVM
> cluster (but I lack any experience in this field) the recommended
> procedure is described here:
>
> https://access.redhat.com/kb/docs/DOC-17651
>

Great links, this is what we were looking for. We'll try this before testing
GFS2 because we are preferably want to work directly over Logical Volumes.

Thanks a lot for your reply,
Marc



>
>
> Important aside, after you've edited the lvm.conf you are
> required to make a new initial ramdisk (or at least touch the
> mtime of the current initrd) or cluster services won't start on
> that node (watch entries in messages or whereever syslogd logs
> clulog stuff to).
>
>
>
> In this article you may find something treating KVM VMs'
> migration.
>
> https://fedorahosted.org/cluster/wiki/KvmMigration






>
>
> Regards
> Ralph
> (an RHCS newbie himself)
> ________________________________
>
>        From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Marc
> Caubet
>        Sent: Tuesday, July 19, 2011 4:19 PM
>        To: linux-cluster at redhat.com
>        Subject: [Linux-cluster] Linux Cluster + KVM + LVM2
>
>
>        Hi,
>
>        we are testing RedHat Cluster to build a KVM
> virtualization infrastructure. This is the first time we use the
> Linux Cluster so we are a little bit lost. Hope someone can help.
>
>        Actually we have 2 hypervisors connected via Fiber
> Channel to a shared storage (both servers see the same 15TB
> device /dev/mapper/mpathb).
>
>        Our idea is a shared storage to hold KVM virtual machines
> by using LVM2. Both server should be able to run Virtual Machines
> from the same storage, but we should be able to migrate or start
> virtual machines on the other server node on crash.
>
>        So the plan is:
>
>        - Virtual Machine image = Logical Volume
>        - CLVM2 cluster: only one server node at the same time
> will be able to manage the volume group-
>        - KVM Virtual Machine High Availability. Machines will
> run on one server node. If for some reason the server node
> crashes, the second will start / migrate the virtual machine.
>
>        Basically we woul like to know:
>
>        - How can we create a cluster for the LVM2 shared storage
> (when we create it, it does not work since both server nodes have
> the VG as Active)
>        - How can we create a cluster service for a virtual
> machine (we guess it has to be done 1 by 1)
>        - Since we have 2 server nodes, how to increase the
> number of votes for quorum (qdisk over a heartbeat logical volume
> partition?)
>
>        Thanks and best regards,
>        --
>        Marc Caubet Serrabou
>        PIC (Port d'Informaci? Cient?fica)
>        Campus UAB, Edificio D
>        E-08193 Bellaterra, Barcelona
>        Tel: +34 93 581 33 22
>        Fax: +34 93 581 41 10
>        http://www.pic.es
>        Avis - Aviso - Legal Notice:
> http://www.ifae.es/legal.html
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



-- 
Marc Caubet Serrabou
PIC (Port d'Informaci? Cient?fica)
Campus UAB, Edificio D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 22
Fax: +34 93 581 41 10
http://www.pic.es
Avis - Aviso - Legal Notice: http://www.ifae.es/legal.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110720/7ba4e5ee/attachment.htm>

From linux at alteeve.com  Wed Jul 20 13:02:21 2011
From: linux at alteeve.com (Digimer)
Date: Wed, 20 Jul 2011 09:02:21 -0400
Subject: [Linux-cluster] Linux Cluster + KVM + LVM2
In-Reply-To: <CAPrERe1YS4XAA249bF54y94mJGvTJKVhT=zeyEqf8p6kb0uEig@mail.gmail.com>
References: <CAPrERe2ghp-VnJc1VVjv=nM96OEnPD7f7WkwfJY5tTsNebTx+A@mail.gmail.com>	<4E2599CC.4010503@alteeve.com>
	<CAPrERe1YS4XAA249bF54y94mJGvTJKVhT=zeyEqf8p6kb0uEig@mail.gmail.com>
Message-ID: <4E26D1DD.4040203@alteeve.com>

On 07/20/2011 05:29 AM, Marc Caubet wrote:
> Hi,
> 
> thanks a lot for your reply.
> 
>     You need to setup a cluster with fencing, which will let you then use
>     clustered LVM, clvmd, which in turn uses the distributed lock manager,
>     dlm. This will allow for the same LVs to be seen and used across cluster
>     nodes.
> 
> 
> Ok. I will try this.
>  
> 
> 
>     Then you will simply add the VMs as resources to rgmanager, which uses
>     (and sits on top of) corosync, which is itself the core of the cluster.
> 
> 
> So each virtual machine will be a resource, is it right?

If you wish, yes. Having the resource management means that recovery of
VMs lost to a host node failure will be automated. It is not, in itself,
a requirement.

>     I'm guessing that you are using RHEL 6, so this may not map perfectly,
>     but I described how to build a similar Xen-based HA VM cluster on EL5.
>     The main differences are; Corosync instead of OpenAIS (changes nothing,
>     configuration wise), ignore DRBD as you have a proper SAN and replace
>     Xen with KVM. The small GFS2 partition is still recommended for central
>     storage of the VM definitions (needed for migration and recovery).
>     However, if you don't have a GFS2 license, you can manually keep the
>     configs in sync on matching local directories on the nodes.
> 
>     See if this helps at all:
> 
>     http://wiki.alteeve.com/index.php/Red_Hat_Cluster_Service_2_Tutorial
> 
> 
> Actually we are using SL6 but we probably will migrate to RHEL6 if this
> environment will be used as production infrastructure in the future. So
> we will consider GFS2.
> 
> Thanks a lot for your answer.
> 
> Marc

SL6 is based on RHEL6, so the cluster stack will be the same.

A couple of notes;

Ralph is right, of course, and those RHEL docs are well worth reading.
They are certainly more authoritative than my wiki. There are a few
things to consider though, if you proceed without a cluster;

* Live migration of VMs (as opposed to cold recovery), requires the new
and old hosts to simultaneously write to the same LV, iirc. Assuming I
am right, then you need to make sure you LV is ACTIVE on both nodes at
the same time. I do not know if that is (safely) possible without
clustered LVM (and it's use of DLM).

* Without a cluster, VM recovery and what not will not be automatic, I
don't believe.

* Without the cluster's fencing, if a node is (accidentally) flagged as
ACTIVE on two nodes, there is nothing preventing corruption of the LV.
For example, let's say that a node hangs... After a time, you (or a
script) recovers the VM on another node. After, the original node
unblocks and goes back to writing to the LV. Suddenly, you've got the
same VM running twice on the same block device. Fencing puts the hung
node into a known safe state by forcing it to shut down. Only then,
after confirmation that the node is actually gone, with another node
recover the resources.

Building a minimal cluster with fencing is not that hard. It does
require some reading and some patience, but it's effectively;

* Setup shared SSH keys between nodes.
* Edit /etc/cluster/cluster.conf
** define the nodes and how to fence them (device, port)
** define the fence device(s) (ip, user/pass, etc).
* Start the cluster
* Start clvmd
* Create clustered LVs (lvcreate -c y...)

Done!

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"At what point did we forget that the Space Shuttle was, essentially,
a program that strapped human beings to an explosion and tried to stab
through the sky with fire and math?"



From jordir at fib.upc.edu  Fri Jul 22 10:08:59 2011
From: jordir at fib.upc.edu (Jordi Renye)
Date: Fri, 22 Jul 2011 12:08:59 +0200
Subject: [Linux-cluster] rhel 6.1 gfs2  performance tests
Message-ID: <4E294C3B.4000704@fib.upc.edu>

Hi,

We have configured redhat cluster RHEL 6.1  with two nodes.
We have seen that performance of GFS2 on writing  is
half of ext3 partition.

For example, time of commands:

time cp -Rp /usr /gfs2partition/usr
0.681u 47.082s 7:01.80 11.3%    0+0k 561264+2994832io 0pf+0w

whereas

  cp -R /usr /ext3partition/usr
0.543u 24.041s 4:16.86 9.5%     0+0k 2728584+3166184io 2pf+0w

With  ping_pong tool from Samba.org we've got next results:

Los resultados son los siguientes:

ping_pong /gfs2partition/pingpongtestfile 3
1582 locks/sec

With ping_pong test r/w:

ping_pong -rw /gfs2partition/pingpongtestfile 3
data increment = 2
4 locks/sec

Do you think we can get better performance? Do you think
are "normal" and "good" results ?

Which recommendations do you tell us to get better performance?

For example, we don't have a heartbeat network exclusively, but
we have only one networks interface for application network and cluster 
network.
Could we get better performance with one dedicated cluster network( for 
dlm,heartbeath,...).

Thanks in advanced,

Jordi Renye Capel
LCFIB
Laboratori de C?lcul
Facultat d'Inform?tica de Barcelona
Universitat Polit?cnica de Catalunya - Barcelona Tech





From swhiteho at redhat.com  Fri Jul 22 10:32:30 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Fri, 22 Jul 2011 11:32:30 +0100
Subject: [Linux-cluster] rhel 6.1 gfs2  performance tests
In-Reply-To: <4E294C3B.4000704@fib.upc.edu>
References: <4E294C3B.4000704@fib.upc.edu>
Message-ID: <1311330750.2804.10.camel@menhir>

Hi,

On Fri, 2011-07-22 at 12:08 +0200, Jordi Renye wrote:
> Hi,
> 
> We have configured redhat cluster RHEL 6.1  with two nodes.
> We have seen that performance of GFS2 on writing  is
> half of ext3 partition.
> 
> For example, time of commands:
> 
> time cp -Rp /usr /gfs2partition/usr
> 0.681u 47.082s 7:01.80 11.3%    0+0k 561264+2994832io 0pf+0w
> 
> whereas
> 
>   cp -R /usr /ext3partition/usr
> 0.543u 24.041s 4:16.86 9.5%     0+0k 2728584+3166184io 2pf+0w
> 
> With  ping_pong tool from Samba.org we've got next results:
> 
> Los resultados son los siguientes:
> 
> ping_pong /gfs2partition/pingpongtestfile 3
> 1582 locks/sec
> 
> With ping_pong test r/w:
> 
> ping_pong -rw /gfs2partition/pingpongtestfile 3
> data increment = 2
> 4 locks/sec
> 
> Do you think we can get better performance? Do you think
> are "normal" and "good" results ?
> 
> Which recommendations do you tell us to get better performance?
> 
> For example, we don't have a heartbeat network exclusively, but
> we have only one networks interface for application network and cluster 
> network.
> Could we get better performance with one dedicated cluster network( for 
> dlm,heartbeath,...).
> 
> Thanks in advanced,
> 
It depends what you are trying to optimise for... what is the actual
application that you want to run?

cp doesn't use fcntl locks to the best of my knowledge, so I doubt that
will have any particular effect on the performance. Also it would be
quite unusual for fcntl locks to have any effect on the performance of
the fs as a whole.

Usually the most important factor is how the workload is balances
between nodes. Also, did you mount with noatime, nodiratime?

Steve.




From jordir at fib.upc.edu  Fri Jul 22 10:41:30 2011
From: jordir at fib.upc.edu (Jordi Renye)
Date: Fri, 22 Jul 2011 12:41:30 +0200
Subject: [Linux-cluster] rhel 6.1 gfs2  performance tests
In-Reply-To: <1311330750.2804.10.camel@menhir>
References: <4E294C3B.4000704@fib.upc.edu> <1311330750.2804.10.camel@menhir>
Message-ID: <4E2953DA.8090404@fib.upc.edu>


We are sharing gfs2 partition through samba
to three hundred clients aprox.

Partition GFS2 is mounted in two nodes of
cluster.

Clients can boot in linux and windows.

There is one share for home folder, another
for profiles, another for shared applications and
data: there is 5 shares.

>>  Also, did you mount with noatime, nodiratime?

Yes, I'm  mounting with these options.

Jordi Renye
LCFIB - UPC



El 22/07/2011 12:32, Steven Whitehouse escribi?:
> Hi,
>
> On Fri, 2011-07-22 at 12:08 +0200, Jordi Renye wrote:
>> Hi,
>>
>> We have configured redhat cluster RHEL 6.1  with two nodes.
>> We have seen that performance of GFS2 on writing  is
>> half of ext3 partition.
>>
>> For example, time of commands:
>>
>> time cp -Rp /usr /gfs2partition/usr
>> 0.681u 47.082s 7:01.80 11.3%    0+0k 561264+2994832io 0pf+0w
>>
>> whereas
>>
>>    cp -R /usr /ext3partition/usr
>> 0.543u 24.041s 4:16.86 9.5%     0+0k 2728584+3166184io 2pf+0w
>>
>> With  ping_pong tool from Samba.org we've got next results:
>>
>> Los resultados son los siguientes:
>>
>> ping_pong /gfs2partition/pingpongtestfile 3
>> 1582 locks/sec
>>
>> With ping_pong test r/w:
>>
>> ping_pong -rw /gfs2partition/pingpongtestfile 3
>> data increment = 2
>> 4 locks/sec
>>
>> Do you think we can get better performance? Do you think
>> are "normal" and "good" results ?
>>
>> Which recommendations do you tell us to get better performance?
>>
>> For example, we don't have a heartbeat network exclusively, but
>> we have only one networks interface for application network and cluster
>> network.
>> Could we get better performance with one dedicated cluster network( for
>> dlm,heartbeath,...).
>>
>> Thanks in advanced,
>>
> It depends what you are trying to optimise for... what is the actual
> application that you want to run?
>
> cp doesn't use fcntl locks to the best of my knowledge, so I doubt that
> will have any particular effect on the performance. Also it would be
> quite unusual for fcntl locks to have any effect on the performance of
> the fs as a whole.
>
> Usually the most important factor is how the workload is balances
> between nodes. Also, did you mount with noatime, nodiratime?
>
> Steve.
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


-- 

        Jordi Renye Capel
o o o  T?cnic de Sistemes N1
o o o  Laboratori de C?lcul
o o o  Facultat d'Inform?tica de Barcelona
U P C  Universitat Polit?cnica de Catalunya - Barcelona Tech

        E-mail : jordir at fib.upc.edu
        Tel.   : 16943
        Web    : http://www.fib.upc.edu/

======================================================================

Abans  d'imprimir aquest missatge, si us plau, assegureu-vos que sigui
necessari. El medi ambient ?s cosa de tots.

--[ http://www.fib.upc.edu/disclaimer/ ]------------------------------

ADVERTIMENT  /  TEXT  LEGAL:  Aquest  missatge pot contenir informaci?
confidencial  o  legalment protegida i est? exclusivament adre?at a la
persona  o entitat destinat?ria. Si vost? no es el destinatari final o
persona  encarregada  de  recollir-lo, no est? autoritzat a llegir-lo,
retenir-lo, modificar-lo, distribuir-lo, copiar-lo ni a revelar el seu
contingut.  Si ha rebut aquest correu electr?nic per error, li preguem
que  informi  al  remitent  i elimini del seu sistema el missatge i el
material annex que pugui contenir. Gr?cies per la seva col?laboraci?.




From carlopmart at gmail.com  Mon Jul 25 12:38:53 2011
From: carlopmart at gmail.com (carlopmart)
Date: Mon, 25 Jul 2011 14:38:53 +0200
Subject: [Linux-cluster] Corosync goes cpu to 95-99%
In-Reply-To: <4E04B61B.9070208@cybercat.ca>
References: <4DD29D03.9080901@gmail.com>	<4DD2BAC3.50509@redhat.com>	<4DD2BD7D.5070704@gmail.com>	<4DD2CA90.6090802@redhat.com>	<3B50BA7445114813AE429BEE51A2BA52@versa>	<4DD78908.2030801@gmail.com>	<0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com>	<4DD873C7.8080402@cybercat.ca>	<22E7D11CD5E64E338A66811F31F06238@versa>	<4DE545D7.1080703@redhat.com>	<4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca>	<4DE75602.1000408@gmail.com>
	<51BB988BCCF547E69BF222BDAF34C4DE@versa>
	<4E04B61B.9070208@cybercat.ca>
Message-ID: <4E2D63DD.4050007@gmail.com>

On 06/24/2011 06:06 PM, Nicolas Ross wrote:
>
>> Thanks for that, that'll prevent me from modifying a system file...
>>
>> And yes, I find it a little disapointing. We're now at 6.1, and our
>> setup is exactly what RHCS was designed for... A GFS over fiber, httpd
>> running content from that gfs...
>
> Two thing I need to mention in this issue. One, support doesn't think
> anymore that it's a coro-sync specific issue, they are searching to a
> driver issue or other source for this problem.
>
> Second, I downgraded my kernel to 2.6.32-71.29.1.el6 (pre-6.1, or 6.0),
> for another issue, and since I did, I don't think I saw that issue
> again. I saw spikes in my cpu graphs, but I'm not 100% sure that they
> are caused by this issue.
>
> So, as a temporary work-around for this time, woule be (at your own
> risks) to downgrade to 2.6.32-71.29.1.el6 kernel :
>
> yum install kernel-2.6.32-71.29.1.el6.x86_64
>
> Regards,

Hi Steven and Nicolas,

  Is this bug resolved in RHEL6.1 with all updates applied?? Do I need 
to use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1?

Thanks.

-- 
CL Martinez
carlopmart {at} gmail {d0t} com



From sdake at redhat.com  Mon Jul 25 13:44:09 2011
From: sdake at redhat.com (Steven Dake)
Date: Mon, 25 Jul 2011 06:44:09 -0700
Subject: [Linux-cluster] Corosync goes cpu to 95-99%
In-Reply-To: <4E2D63DD.4050007@gmail.com>
References: <4DD29D03.9080901@gmail.com>	<4DD2BAC3.50509@redhat.com>	<4DD2BD7D.5070704@gmail.com>	<4DD2CA90.6090802@redhat.com>	<3B50BA7445114813AE429BEE51A2BA52@versa>	<4DD78908.2030801@gmail.com>	<0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com>	<4DD873C7.8080402@cybercat.ca>	<22E7D11CD5E64E338A66811F31F06238@versa>	<4DE545D7.1080703@redhat.com>	<4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca>	<4DE75602.1000408@gmail.com>
	<51BB988BCCF547E69BF222BDAF34C4DE@versa>
	<4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com>
Message-ID: <4E2D7329.6050607@redhat.com>

On 07/25/2011 05:38 AM, carlopmart wrote:
> On 06/24/2011 06:06 PM, Nicolas Ross wrote:
>>
>>> Thanks for that, that'll prevent me from modifying a system file...
>>>
>>> And yes, I find it a little disapointing. We're now at 6.1, and our
>>> setup is exactly what RHCS was designed for... A GFS over fiber, httpd
>>> running content from that gfs...
>>
>> Two thing I need to mention in this issue. One, support doesn't think
>> anymore that it's a coro-sync specific issue, they are searching to a
>> driver issue or other source for this problem.
>>
>> Second, I downgraded my kernel to 2.6.32-71.29.1.el6 (pre-6.1, or 6.0),
>> for another issue, and since I did, I don't think I saw that issue
>> again. I saw spikes in my cpu graphs, but I'm not 100% sure that they
>> are caused by this issue.
>>
>> So, as a temporary work-around for this time, woule be (at your own
>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel :
>>
>> yum install kernel-2.6.32-71.29.1.el6.x86_64
>>
>> Regards,
> 
> Hi Steven and Nicolas,
> 
>  Is this bug resolved in RHEL6.1 with all updates applied?? Do I need to
> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1?
> 
> Thanks.
> 

the corosync portion is going through QE.  The kernel portion remains open.

Regards
-steve



From carlopmart at gmail.com  Mon Jul 25 13:48:21 2011
From: carlopmart at gmail.com (carlopmart)
Date: Mon, 25 Jul 2011 15:48:21 +0200
Subject: [Linux-cluster] Corosync goes cpu to 95-99%
In-Reply-To: <4E2D7329.6050607@redhat.com>
References: <4DD29D03.9080901@gmail.com>	<4DD2BAC3.50509@redhat.com>	<4DD2BD7D.5070704@gmail.com>	<4DD2CA90.6090802@redhat.com>	<3B50BA7445114813AE429BEE51A2BA52@versa>	<4DD78908.2030801@gmail.com>	<0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com>	<4DD873C7.8080402@cybercat.ca>	<22E7D11CD5E64E338A66811F31F06238@versa>	<4DE545D7.1080703@redhat.com>	<4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca>	<4DE75602.1000408@gmail.com>
	<51BB988BCCF547E69BF222BDAF34C4DE@versa>
	<4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com>
	<4E2D7329.6050607@redhat.com>
Message-ID: <4E2D7425.4070801@gmail.com>

On 07/25/2011 03:44 PM, Steven Dake wrote:
> On 07/25/2011 05:38 AM, carlopmart wrote:
>> On 06/24/2011 06:06 PM, Nicolas Ross wrote:
>>>
>>>> Thanks for that, that'll prevent me from modifying a system file...
>>>>
>>>> And yes, I find it a little disapointing. We're now at 6.1, and our
>>>> setup is exactly what RHCS was designed for... A GFS over fiber, httpd
>>>> running content from that gfs...
>>>
>>> Two thing I need to mention in this issue. One, support doesn't think
>>> anymore that it's a coro-sync specific issue, they are searching to a
>>> driver issue or other source for this problem.
>>>
>>> Second, I downgraded my kernel to 2.6.32-71.29.1.el6 (pre-6.1, or 6.0),
>>> for another issue, and since I did, I don't think I saw that issue
>>> again. I saw spikes in my cpu graphs, but I'm not 100% sure that they
>>> are caused by this issue.
>>>
>>> So, as a temporary work-around for this time, woule be (at your own
>>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel :
>>>
>>> yum install kernel-2.6.32-71.29.1.el6.x86_64
>>>
>>> Regards,
>>
>> Hi Steven and Nicolas,
>>
>>   Is this bug resolved in RHEL6.1 with all updates applied?? Do I need to
>> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1?
>>
>> Thanks.
>>
>
> the corosync portion is going through QE.  The kernel portion remains open.
>
> Regards
> -steve
>

Thanks Steve, then, Can I use last corosync version provided with 
RHEL6.1 and last RHEL6.0's kernel version without problems??



-- 
CL Martinez
carlopmart {at} gmail {d0t} com



From swhiteho at redhat.com  Mon Jul 25 13:51:11 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Mon, 25 Jul 2011 14:51:11 +0100
Subject: [Linux-cluster] rhel 6.1 gfs2  performance tests
In-Reply-To: <4E2953DA.8090404@fib.upc.edu>
References: <4E294C3B.4000704@fib.upc.edu> <1311330750.2804.10.camel@menhir>
	<4E2953DA.8090404@fib.upc.edu>
Message-ID: <1311601871.2697.7.camel@menhir>

Hi,

On Fri, 2011-07-22 at 12:41 +0200, Jordi Renye wrote:
> We are sharing gfs2 partition through samba
> to three hundred clients aprox.
> 
> Partition GFS2 is mounted in two nodes of
> cluster.
> 
> Clients can boot in linux and windows.
> 
> There is one share for home folder, another
> for profiles, another for shared applications and
> data: there is 5 shares.
> 
> >>  Also, did you mount with noatime, nodiratime?
> 
> Yes, I'm  mounting with these options.
> 
> Jordi Renye
> LCFIB - UPC
> 
> 
Were the tests being run directly on gfs2, or via Samba in this case?

Steve.

> 
> El 22/07/2011 12:32, Steven Whitehouse escribi?:
> > Hi,
> >
> > On Fri, 2011-07-22 at 12:08 +0200, Jordi Renye wrote:
> >> Hi,
> >>
> >> We have configured redhat cluster RHEL 6.1  with two nodes.
> >> We have seen that performance of GFS2 on writing  is
> >> half of ext3 partition.
> >>
> >> For example, time of commands:
> >>
> >> time cp -Rp /usr /gfs2partition/usr
> >> 0.681u 47.082s 7:01.80 11.3%    0+0k 561264+2994832io 0pf+0w
> >>
> >> whereas
> >>
> >>    cp -R /usr /ext3partition/usr
> >> 0.543u 24.041s 4:16.86 9.5%     0+0k 2728584+3166184io 2pf+0w
> >>
> >> With  ping_pong tool from Samba.org we've got next results:
> >>
> >> Los resultados son los siguientes:
> >>
> >> ping_pong /gfs2partition/pingpongtestfile 3
> >> 1582 locks/sec
> >>
> >> With ping_pong test r/w:
> >>
> >> ping_pong -rw /gfs2partition/pingpongtestfile 3
> >> data increment = 2
> >> 4 locks/sec
> >>
> >> Do you think we can get better performance? Do you think
> >> are "normal" and "good" results ?
> >>
> >> Which recommendations do you tell us to get better performance?
> >>
> >> For example, we don't have a heartbeat network exclusively, but
> >> we have only one networks interface for application network and cluster
> >> network.
> >> Could we get better performance with one dedicated cluster network( for
> >> dlm,heartbeath,...).
> >>
> >> Thanks in advanced,
> >>
> > It depends what you are trying to optimise for... what is the actual
> > application that you want to run?
> >
> > cp doesn't use fcntl locks to the best of my knowledge, so I doubt that
> > will have any particular effect on the performance. Also it would be
> > quite unusual for fcntl locks to have any effect on the performance of
> > the fs as a whole.
> >
> > Usually the most important factor is how the workload is balances
> > between nodes. Also, did you mount with noatime, nodiratime?
> >
> > Steve.
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 




From sdake at redhat.com  Mon Jul 25 15:42:03 2011
From: sdake at redhat.com (Steven Dake)
Date: Mon, 25 Jul 2011 08:42:03 -0700
Subject: [Linux-cluster] Corosync goes cpu to 95-99%
In-Reply-To: <4E2D7425.4070801@gmail.com>
References: <4DD29D03.9080901@gmail.com>	<4DD2BAC3.50509@redhat.com>	<4DD2BD7D.5070704@gmail.com>	<4DD2CA90.6090802@redhat.com>	<3B50BA7445114813AE429BEE51A2BA52@versa>	<4DD78908.2030801@gmail.com>	<0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com>	<4DD873C7.8080402@cybercat.ca>	<22E7D11CD5E64E338A66811F31F06238@versa>	<4DE545D7.1080703@redhat.com>	<4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca>	<4DE75602.1000408@gmail.com>
	<51BB988BCCF547E69BF222BDAF34C4DE@versa>
	<4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com>
	<4E2D7329.6050607@redhat.com> <4E2D7425.4070801@gmail.com>
Message-ID: <4E2D8ECB.6020305@redhat.com>

On 07/25/2011 06:48 AM, carlopmart wrote:
> On 07/25/2011 03:44 PM, Steven Dake wrote:
>> On 07/25/2011 05:38 AM, carlopmart wrote:
>>> On 06/24/2011 06:06 PM, Nicolas Ross wrote:
>>>>
>>>>> Thanks for that, that'll prevent me from modifying a system file...
>>>>>
>>>>> And yes, I find it a little disapointing. We're now at 6.1, and our
>>>>> setup is exactly what RHCS was designed for... A GFS over fiber, httpd
>>>>> running content from that gfs...
>>>>
>>>> Two thing I need to mention in this issue. One, support doesn't think
>>>> anymore that it's a coro-sync specific issue, they are searching to a
>>>> driver issue or other source for this problem.
>>>>
>>>> Second, I downgraded my kernel to 2.6.32-71.29.1.el6 (pre-6.1, or 6.0),
>>>> for another issue, and since I did, I don't think I saw that issue
>>>> again. I saw spikes in my cpu graphs, but I'm not 100% sure that they
>>>> are caused by this issue.
>>>>
>>>> So, as a temporary work-around for this time, woule be (at your own
>>>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel :
>>>>
>>>> yum install kernel-2.6.32-71.29.1.el6.x86_64
>>>>
>>>> Regards,
>>>
>>> Hi Steven and Nicolas,
>>>
>>>   Is this bug resolved in RHEL6.1 with all updates applied?? Do I
>>> need to
>>> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1?
>>>
>>> Thanks.
>>>
>>
>> the corosync portion is going through QE.  The kernel portion remains
>> open.
>>
>> Regards
>> -steve
>>
> 
> Thanks Steve, then, Can I use last corosync version provided with
> RHEL6.1 and last RHEL6.0's kernel version without problems??
> 
> 
> 

I recommend not mixing without a support signoff.

Regards
-steve



From carlopmart at gmail.com  Mon Jul 25 15:45:11 2011
From: carlopmart at gmail.com (carlopmart)
Date: Mon, 25 Jul 2011 17:45:11 +0200
Subject: [Linux-cluster] Corosync goes cpu to 95-99%
In-Reply-To: <4E2D8ECB.6020305@redhat.com>
References: <4DD29D03.9080901@gmail.com>	<4DD2BAC3.50509@redhat.com>	<4DD2BD7D.5070704@gmail.com>	<4DD2CA90.6090802@redhat.com>	<3B50BA7445114813AE429BEE51A2BA52@versa>	<4DD78908.2030801@gmail.com>	<0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com>	<4DD873C7.8080402@cybercat.ca>	<22E7D11CD5E64E338A66811F31F06238@versa>	<4DE545D7.1080703@redhat.com>	<4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca>	<4DE75602.1000408@gmail.com>
	<51BB988BCCF547E69BF222BDAF34C4DE@versa>
	<4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com>
	<4E2D7329.6050607@redhat.com> <4E2D7425.4070801@gmail.com>
	<4E2D8ECB.6020305@redhat.com>
Message-ID: <4E2D8F87.30508@gmail.com>

On 07/25/2011 05:42 PM, Steven Dake wrote:
>>>>> are caused by this issue.
>>>>>
>>>>> So, as a temporary work-around for this time, woule be (at your own
>>>>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel :
>>>>>
>>>>> yum install kernel-2.6.32-71.29.1.el6.x86_64
>>>>>
>>>>> Regards,
>>>>
>>>> Hi Steven and Nicolas,
>>>>
>>>>    Is this bug resolved in RHEL6.1 with all updates applied?? Do I
>>>> need to
>>>> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1?
>>>>
>>>> Thanks.
>>>>
>>>
>>> the corosync portion is going through QE.  The kernel portion remains
>>> open.
>>>
>>> Regards
>>> -steve
>>>
>>
>> Thanks Steve, then, Can I use last corosync version provided with
>> RHEL6.1 and last RHEL6.0's kernel version without problems??
>>
>>
>>
>
> I recommend not mixing without a support signoff.
>

Then, how can I install rhcs under rhel6.x and prevent this bug??


-- 
CL Martinez
carlopmart {at} gmail {d0t} com



From sdake at redhat.com  Mon Jul 25 16:04:27 2011
From: sdake at redhat.com (Steven Dake)
Date: Mon, 25 Jul 2011 09:04:27 -0700
Subject: [Linux-cluster] Corosync goes cpu to 95-99%
In-Reply-To: <4E2D8F87.30508@gmail.com>
References: <4DD29D03.9080901@gmail.com>	<4DD2BAC3.50509@redhat.com>	<4DD2BD7D.5070704@gmail.com>	<4DD2CA90.6090802@redhat.com>	<3B50BA7445114813AE429BEE51A2BA52@versa>	<4DD78908.2030801@gmail.com>	<0B1965C8-9807-42B6-9453-01BE0C0B1DCB@cybercat.ca><4DD80D5D.10004@gmail.com>	<4DD873C7.8080402@cybercat.ca>	<22E7D11CD5E64E338A66811F31F06238@versa>	<4DE545D7.1080703@redhat.com>	<4DE69786.5010204@gmail.com><4DE6CAF6.4000002@cybercat.ca>	<4DE75602.1000408@gmail.com>
	<51BB988BCCF547E69BF222BDAF34C4DE@versa>
	<4E04B61B.9070208@cybercat.ca> <4E2D63DD.4050007@gmail.com>
	<4E2D7329.6050607@redhat.com> <4E2D7425.4070801@gmail.com>
	<4E2D8ECB.6020305@redhat.com> <4E2D8F87.30508@gmail.com>
Message-ID: <4E2D940B.5020803@redhat.com>

On 07/25/2011 08:45 AM, carlopmart wrote:
> On 07/25/2011 05:42 PM, Steven Dake wrote:
>>>>>> are caused by this issue.
>>>>>>
>>>>>> So, as a temporary work-around for this time, woule be (at your own
>>>>>> risks) to downgrade to 2.6.32-71.29.1.el6 kernel :
>>>>>>
>>>>>> yum install kernel-2.6.32-71.29.1.el6.x86_64
>>>>>>
>>>>>> Regards,
>>>>>
>>>>> Hi Steven and Nicolas,
>>>>>
>>>>>    Is this bug resolved in RHEL6.1 with all updates applied?? Do I
>>>>> need to
>>>>> use some specific kernel version 2.6.32-131.2.1 or 2.6.32-131.6.1?
>>>>>
>>>>> Thanks.
>>>>>
>>>>
>>>> the corosync portion is going through QE.  The kernel portion remains
>>>> open.
>>>>
>>>> Regards
>>>> -steve
>>>>
>>>
>>> Thanks Steve, then, Can I use last corosync version provided with
>>> RHEL6.1 and last RHEL6.0's kernel version without problems??
>>>
>>>
>>>
>>
>> I recommend not mixing without a support signoff.
>>
> 
> Then, how can I install rhcs under rhel6.x and prevent this bug??
> 
> 
get a support signoff.  Also the corosync updates have not finished
through our validation process.  Only hot fixes (from support) are available

Regards
-steve



From jordir at fib.upc.edu  Tue Jul 26 10:36:55 2011
From: jordir at fib.upc.edu (Jordi Renye)
Date: Tue, 26 Jul 2011 12:36:55 +0200
Subject: [Linux-cluster] rhel 6.1 gfs2  performance tests
In-Reply-To: <1311601871.2697.7.camel@menhir>
References: <4E294C3B.4000704@fib.upc.edu>
	<1311330750.2804.10.camel@menhir>	<4E2953DA.8090404@fib.upc.edu>
	<1311601871.2697.7.camel@menhir>
Message-ID: <4E2E98C7.6020100@fib.upc.edu>


Tests  run directly on gfs2. Soon, we would like
testing through samba clients.

Jordi Renye
LCFIB - UPC


El 25/07/2011 15:51, Steven Whitehouse escribi?:
> Hi,
>
> On Fri, 2011-07-22 at 12:41 +0200, Jordi Renye wrote:
>> We are sharing gfs2 partition through samba
>> to three hundred clients aprox.
>>
>> Partition GFS2 is mounted in two nodes of
>> cluster.
>>
>> Clients can boot in linux and windows.
>>
>> There is one share for home folder, another
>> for profiles, another for shared applications and
>> data: there is 5 shares.
>>
>>>>   Also, did you mount with noatime, nodiratime?
>> Yes, I'm  mounting with these options.
>>
>> Jordi Renye
>> LCFIB - UPC
>>
>>
> Were the tests being run directly on gfs2, or via Samba in this case?
>
> Steve.
>
>> El 22/07/2011 12:32, Steven Whitehouse escribi?:
>>> Hi,
>>>
>>> On Fri, 2011-07-22 at 12:08 +0200, Jordi Renye wrote:
>>>> Hi,
>>>>
>>>> We have configured redhat cluster RHEL 6.1  with two nodes.
>>>> We have seen that performance of GFS2 on writing  is
>>>> half of ext3 partition.
>>>>
>>>> For example, time of commands:
>>>>
>>>> time cp -Rp /usr /gfs2partition/usr
>>>> 0.681u 47.082s 7:01.80 11.3%    0+0k 561264+2994832io 0pf+0w
>>>>
>>>> whereas
>>>>
>>>>     cp -R /usr /ext3partition/usr
>>>> 0.543u 24.041s 4:16.86 9.5%     0+0k 2728584+3166184io 2pf+0w
>>>>
>>>> With  ping_pong tool from Samba.org we've got next results:
>>>>
>>>> Los resultados son los siguientes:
>>>>
>>>> ping_pong /gfs2partition/pingpongtestfile 3
>>>> 1582 locks/sec
>>>>
>>>> With ping_pong test r/w:
>>>>
>>>> ping_pong -rw /gfs2partition/pingpongtestfile 3
>>>> data increment = 2
>>>> 4 locks/sec
>>>>
>>>> Do you think we can get better performance? Do you think
>>>> are "normal" and "good" results ?
>>>>
>>>> Which recommendations do you tell us to get better performance?
>>>>
>>>> For example, we don't have a heartbeat network exclusively, but
>>>> we have only one networks interface for application network and cluster
>>>> network.
>>>> Could we get better performance with one dedicated cluster network( for
>>>> dlm,heartbeath,...).
>>>>
>>>> Thanks in advanced,
>>>>
>>> It depends what you are trying to optimise for... what is the actual
>>> application that you want to run?
>>>
>>> cp doesn't use fcntl locks to the best of my knowledge, so I doubt that
>>> will have any particular effect on the performance. Also it would be
>>> quite unusual for fcntl locks to have any effect on the performance of
>>> the fs as a whole.
>>>
>>> Usually the most important factor is how the workload is balances
>>> between nodes. Also, did you mount with noatime, nodiratime?
>>>
>>> Steve.
>>>
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


-- 

        Jordi Renye Capel
o o o  T?cnic de Sistemes N1
o o o  Laboratori de C?lcul
o o o  Facultat d'Inform?tica de Barcelona
U P C  Universitat Polit?cnica de Catalunya - Barcelona Tech

        E-mail : jordir at fib.upc.edu
        Tel.   : 16943
        Web    : http://www.fib.upc.edu/

======================================================================

Abans  d'imprimir aquest missatge, si us plau, assegureu-vos que sigui
necessari. El medi ambient ?s cosa de tots.

--[ http://www.fib.upc.edu/disclaimer/ ]------------------------------

ADVERTIMENT  /  TEXT  LEGAL:  Aquest  missatge pot contenir informaci?
confidencial  o  legalment protegida i est? exclusivament adre?at a la
persona  o entitat destinat?ria. Si vost? no es el destinatari final o
persona  encarregada  de  recollir-lo, no est? autoritzat a llegir-lo,
retenir-lo, modificar-lo, distribuir-lo, copiar-lo ni a revelar el seu
contingut.  Si ha rebut aquest correu electr?nic per error, li preguem
que  informi  al  remitent  i elimini del seu sistema el missatge i el
material annex que pugui contenir. Gr?cies per la seva col?laboraci?.




From ifetch at du.edu  Wed Jul 27 05:47:03 2011
From: ifetch at du.edu (Ivan Fetch)
Date: Tue, 26 Jul 2011 23:47:03 -0600
Subject: [Linux-cluster] RAID1 in RHCS / RHEL5
Message-ID: <33A35733-136D-41A6-9EAA-11D106E14583@du.edu>

	Hello,

I am not finding a lot of docs or commentary about accomplishing RAID1 of two shared storage (FC SAN) LUNs, in RHCS. I wlikd like to RAID1 two mpath devices (using the device multipathing with RHEL5). The mpath devices (mpath0, mpath1) do not match up on all nodes, but I don't believe this technically matters, since md uses UUIDs to detects it's disk devices.

I could just configure an md device, but it seems like there should be a notion of which node owns the md. WHat happens when the active node is resyncing a mirror, and that node dies or gets fenced? WHat happens if someone tries to operate on the md from the inactive node?

I would very much appreciate hearing from those who are accomplishing this, how they did it, gotchas and lessons learned.

Thanks,

- Ivan























.



From gordan at bobich.net  Wed Jul 27 08:05:16 2011
From: gordan at bobich.net (Gordan Bobic)
Date: Wed, 27 Jul 2011 09:05:16 +0100
Subject: [Linux-cluster] RAID1 in RHCS / RHEL5
In-Reply-To: <33A35733-136D-41A6-9EAA-11D106E14583@du.edu>
References: <33A35733-136D-41A6-9EAA-11D106E14583@du.edu>
Message-ID: <f8b89e6834e655b58d2d0cba86a553e7@mail.shatteredsilicon.net>

 I don't think this has anything to do with clustering, but what you are 
 probably looking for is DRBD:
 http://www.drbd.org/

 Gordan

 On Tue, 26 Jul 2011 23:47:03 -0600, Ivan Fetch <ifetch at du.edu> wrote:
> Hello,
>
> I am not finding a lot of docs or commentary about accomplishing
> RAID1 of two shared storage (FC SAN) LUNs, in RHCS. I wlikd like to
> RAID1 two mpath devices (using the device multipathing with RHEL5).
> The mpath devices (mpath0, mpath1) do not match up on all nodes, but 
> I
> don't believe this technically matters, since md uses UUIDs to 
> detects
> it's disk devices.
>
> I could just configure an md device, but it seems like there should
> be a notion of which node owns the md. WHat happens when the active
> node is resyncing a mirror, and that node dies or gets fenced? WHat
> happens if someone tries to operate on the md from the inactive node?
>
> I would very much appreciate hearing from those who are accomplishing
> this, how they did it, gotchas and lessons learned.
>
> Thanks,
>
> - Ivan
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> .
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From ifetch at du.edu  Wed Jul 27 15:14:08 2011
From: ifetch at du.edu (Ivan Fetch)
Date: Wed, 27 Jul 2011 09:14:08 -0600
Subject: [Linux-cluster] RAID1 in RHCS / RHEL5
In-Reply-To: <f8b89e6834e655b58d2d0cba86a553e7@mail.shatteredsilicon.net>
References: <33A35733-136D-41A6-9EAA-11D106E14583@du.edu>
	<f8b89e6834e655b58d2d0cba86a553e7@mail.shatteredsilicon.net>
Message-ID: <E81BD0E7-C58B-47DB-9732-FF093AE44F90@du.edu>

Hi GOrdan,

Thanks for your reply.

I had thought about DRBd, but would rather not mirror over network links, when I can mirror over the SAN. I am also not sure how DRBD would handle our running a Postgresql database on top of it, having never used DRBD in production. Do you have any experience with databases on DRBD?


Thanks,

Ivan.

On Jul 27, 2011, at 2:05 AM, Gordan Bobic wrote:

> I don't think this has anything to do with clustering, but what you are 
> probably looking for is DRBD:
> http://www.drbd.org/
> 
> Gordan
> 
> On Tue, 26 Jul 2011 23:47:03 -0600, Ivan Fetch <ifetch at du.edu> wrote:
>> Hello,
>> 
>> I am not finding a lot of docs or commentary about accomplishing
>> RAID1 of two shared storage (FC SAN) LUNs, in RHCS. I wlikd like to
>> RAID1 two mpath devices (using the device multipathing with RHEL5).
>> The mpath devices (mpath0, mpath1) do not match up on all nodes, but 
>> I
>> don't believe this technically matters, since md uses UUIDs to 
>> detects
>> it's disk devices.
>> 
>> I could just configure an md device, but it seems like there should
>> be a notion of which node owns the md. WHat happens when the active
>> node is resyncing a mirror, and that node dies or gets fenced? WHat
>> happens if someone tries to operate on the md from the inactive node?
>> 
>> I would very much appreciate hearing from those who are accomplishing
>> this, how they did it, gotchas and lessons learned.
>> 
>> Thanks,
>> 
>> - Ivan
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> .
>> 
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster























.



From gordan at bobich.net  Wed Jul 27 15:39:58 2011
From: gordan at bobich.net (Gordan Bobic)
Date: Wed, 27 Jul 2011 16:39:58 +0100
Subject: [Linux-cluster] RAID1 in RHCS / RHEL5
In-Reply-To: <E81BD0E7-C58B-47DB-9732-FF093AE44F90@du.edu>
References: "<33A35733-136D-41A6-9EAA-11D106E14583@du.edu>"
	<f8b89e6834e655b58d2d0cba86a553e7@mail.shatteredsilicon.net>
	<E81BD0E7-C58B-47DB-9732-FF093AE44F90@du.edu>
Message-ID: <d37f870ccc2b0a923daa96ff6768133c@mail.shatteredsilicon.net>

 On Wed, 27 Jul 2011 09:14:08 -0600, Ivan Fetch <ifetch at du.edu> wrote:

> I had thought about DRBd, but would rather not mirror over network
> links, when I can mirror over the SAN.

 Then I guess that is something you need to bring up with your SAN 
 vendor.

> I am also not sure how DRBD
> would handle our running a Postgresql database on top of it, having
> never used DRBD in production. Do you have any experience with
> databases on DRBD?

 Yes, I have been using DRBD for backing databases in fail-over 
 scenarios
 for years. Provided that your fencing works properely and that you 
 don't
 try to mount the device on both sides at the same time, it'll work 
 fine.

 Gordan



From laszlo.budai at gmail.com  Thu Jul 28 00:19:38 2011
From: laszlo.budai at gmail.com (Budai Laszlo)
Date: Thu, 28 Jul 2011 03:19:38 +0300
Subject: [Linux-cluster] service startup order
Message-ID: <4E30AB1A.1080102@gmail.com>

Hello everybody,

I would like to know how the Red Hat cluster starts up services in RHEL 4.5.
I'm curious about the ordering of services. Does the cluster starts the
services in the order as they appears in the service section of the
cluster.conf?
Is it starting one service at a time, or it starts the services in parallel?

rgmanager-1.9.*68-1*

Thank you,
Laszlo
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110728/9387400f/attachment.htm>

From linux at alteeve.com  Thu Jul 28 00:50:38 2011
From: linux at alteeve.com (Digimer)
Date: Wed, 27 Jul 2011 20:50:38 -0400
Subject: [Linux-cluster] service startup order
In-Reply-To: <4E30AB1A.1080102@gmail.com>
References: <4E30AB1A.1080102@gmail.com>
Message-ID: <4E30B25E.5010302@alteeve.com>

On 07/27/2011 08:19 PM, Budai Laszlo wrote:
> Hello everybody,
> 
> I would like to know how the Red Hat cluster starts up services in RHEL 4.5.
> I'm curious about the ordering of services. Does the cluster starts the
> services in the order as they appears in the service section of the
> cluster.conf?
> Is it starting one service at a time, or it starts the services in parallel?
> 
> rgmanager-1.9.*68-1*
> 
> Thank you,
> Laszlo

I'm not familiar with the intricacies of RHCS/rgmanager on EL4, but I
suspect the rgmanager start order is the same.

Parallel services will be started simultaneously. Services configured as
service trees will start in the order that they are defined (and stopped
in reverse order).

This covers the start order well:
- https://fedorahosted.org/cluster/wiki/ResourceTrees

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"At what point did we forget that the Space Shuttle was, essentially,
a program that strapped human beings to an explosion and tried to stab
through the sky with fire and math?"



From Ralph.Grothe at itdz-berlin.de  Thu Jul 28 06:24:48 2011
From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de)
Date: Thu, 28 Jul 2011 08:24:48 +0200
Subject: [Linux-cluster] service startup order
In-Reply-To: <4E30B25E.5010302@alteeve.com>
References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com>
Message-ID: <A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>

Hi Digimer, hi Lazlo,

sorry, for intruding your thread but this is something that I am
also interested in and which I haven't fully fathomed yet.

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer
> Sent: Thursday, July 28, 2011 2:51 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] service startup order
> 
> Parallel services will be started simultaneously. Services 
> configured as
> service trees will start in the order that they are defined 
> (and stopped
> in reverse order).
> 
> This covers the start order well:
> - https://fedorahosted.org/cluster/wiki/ResourceTrees
> 

The referred to wiki article only seems to treat
starting/stopping order and hierarchy (parent-child vs. sibling)
of resources within one service aka resource group.
That sounds pretty clear.
But what about ordering and possible dependencies between
separate services?

You mentioned service trees. You didn't actually mean resource
trees?
If however your wording was deliberate (what I assume) I wonder
if one can nest service tag blocks as one can nest resource tags
within a single service block to express dependencies or
hierarchy and hence starting order between and of services?
Because all the sample configuration snippets I have seen so far
in various docs lack such nesting of services.

The reason that interests me is because I have such a case where
a customer requires such a dependency between two distinct
services that during normal operation (i.e. no node has left the
cluster) are hosted on different nodes.
I told them, from what I have perceived of HA clustering and RHCS
in particular so far, that if they wish to express such an
interdependency that they would have to put all resources which
are now split up in two services, in a nested manner that would
map the intended hierarchy, in a single service.

Because they insisted on their layout I searched a little and
discovered the, in the official RHCS Admin doc not mentioned,
service tag attributes "depend" and "depend_mode".

However, their usage at first seemed pretty useless because the
clusterware seemed to completely ignore them and start/stop
services in sometimes unpredictable ways and even restarted them
at random.
Until I, more by accident, discovered that additionally the "rm"
tag's attribute "central_processing" needed to be defined and
assigned to "1" or "true" for this feature to work approximately.
I say apprimately here because we still have issues with this
cluster that require futher testing.
I hardly dare mentioning, that unfortunately this system already
went into production, now of course lacking any HA,
why we had to defer further testing.


Regards
Ralph

 



From laszlo.budai at gmail.com  Thu Jul 28 07:06:22 2011
From: laszlo.budai at gmail.com (Budai Laszlo)
Date: Thu, 28 Jul 2011 10:06:22 +0300
Subject: [Linux-cluster] service startup order
In-Reply-To: <A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>
References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com>
	<A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>
Message-ID: <4E310A6E.1090206@gmail.com>

Hello everybody,

@Digimer: thank you for that link. I was aware of it, and I knew about
Resource trees in a service. But as Ralph has mentioned I was talking
about services not resources.
For instance the Solaris cluster allows one to define resource
dependencies between resources even if they are members of different
resource groups (a.k.a. services), and also allows for specifying
resource groups dependencies. But as far as I know Red Hat Cluster Suite
does not provides these features, or those are not documented enough (if
at all).
In RHEL6 there is an other resource group manager: Pacemaker which has a
richer portfolio of dependencies, but right now it is still in the
unsupported technology preview phase. Anyway it is not my case with RHEL
4.5 :(

@Ralph: could you please provide me some references where have you found
those attributes of the service tag (depend, depend_mode) and for the RM
tag (central_processing)? Thank you.

Kind regards,
Laszlo


On 07/28/2011 09:24 AM, Ralph.Grothe at itdz-berlin.de wrote:
> Hi Digimer, hi Lazlo,
>
> sorry, for intruding your thread but this is something that I am
> also interested in and which I haven't fully fathomed yet.
>
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com 
>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer
>> Sent: Thursday, July 28, 2011 2:51 AM
>> To: linux clustering
>> Subject: Re: [Linux-cluster] service startup order
>>
>> Parallel services will be started simultaneously. Services 
>> configured as
>> service trees will start in the order that they are defined 
>> (and stopped
>> in reverse order).
>>
>> This covers the start order well:
>> - https://fedorahosted.org/cluster/wiki/ResourceTrees
>>
> The referred to wiki article only seems to treat
> starting/stopping order and hierarchy (parent-child vs. sibling)
> of resources within one service aka resource group.
> That sounds pretty clear.
> But what about ordering and possible dependencies between
> separate services?
>
> You mentioned service trees. You didn't actually mean resource
> trees?
> If however your wording was deliberate (what I assume) I wonder
> if one can nest service tag blocks as one can nest resource tags
> within a single service block to express dependencies or
> hierarchy and hence starting order between and of services?
> Because all the sample configuration snippets I have seen so far
> in various docs lack such nesting of services.
>
> The reason that interests me is because I have such a case where
> a customer requires such a dependency between two distinct
> services that during normal operation (i.e. no node has left the
> cluster) are hosted on different nodes.
> I told them, from what I have perceived of HA clustering and RHCS
> in particular so far, that if they wish to express such an
> interdependency that they would have to put all resources which
> are now split up in two services, in a nested manner that would
> map the intended hierarchy, in a single service.
>
> Because they insisted on their layout I searched a little and
> discovered the, in the official RHCS Admin doc not mentioned,
> service tag attributes "depend" and "depend_mode".
>
> However, their usage at first seemed pretty useless because the
> clusterware seemed to completely ignore them and start/stop
> services in sometimes unpredictable ways and even restarted them
> at random.
> Until I, more by accident, discovered that additionally the "rm"
> tag's attribute "central_processing" needed to be defined and
> assigned to "1" or "true" for this feature to work approximately.
> I say apprimately here because we still have issues with this
> cluster that require futher testing.
> I hardly dare mentioning, that unfortunately this system already
> went into production, now of course lacking any HA,
> why we had to defer further testing.
>
>
> Regards
> Ralph
>
>  
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From Ralph.Grothe at itdz-berlin.de  Thu Jul 28 07:34:50 2011
From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de)
Date: Thu, 28 Jul 2011 09:34:50 +0200
Subject: [Linux-cluster] service startup order
In-Reply-To: <4E310A6E.1090206@gmail.com>
References: <4E30AB1A.1080102@gmail.com>
	<4E30B25E.5010302@alteeve.com><A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>
	<4E310A6E.1090206@gmail.com>
Message-ID: <A789DDB53ED7E94396E842EE2AC9B5FF014329C4@itdzex101.ITDZ.verwalt-berlin.de>

 

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Budai
Laszlo
> Sent: Thursday, July 28, 2011 9:06 AM
> To: linux-cluster at redhat.com
> Subject: Re: [Linux-cluster] service startup order
> 
> @Ralph: could you please provide me some references where 
> have you found
> those attributes of the service tag (depend, depend_mode) and 
> for the RM
> tag (central_processing)? Thank you.

Even Digimer is metioning this attribute in her excellent wiki

http://wiki.alteeve.com/index.php/RHCS_v2_cluster.conf#central_pr
ocessing


If you have a login account at RHN you may find this article in
their knowledge base

https://access.redhat.com/kb/docs/DOC-26981


Apart from that the only text where this attrib was used that I
have come across so far was an RH doc treating deployment of SAP
on RHCS
But it only appears there in a sample config snippet without
further explanation.
I guess that they needed to activate it because they were using
the RHCS event processing interface?

http://www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf

> 
> Kind regards,
> Laszlo
> 
> 
> On 07/28/2011 09:24 AM, Ralph.Grothe at itdz-berlin.de wrote:
> > Hi Digimer, hi Lazlo,
> >
> > sorry, for intruding your thread but this is something that I
am
> > also interested in and which I haven't fully fathomed yet.
> >
> >> -----Original Message-----
> >> From: linux-cluster-bounces at redhat.com 
> >> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of
Digimer
> >> Sent: Thursday, July 28, 2011 2:51 AM
> >> To: linux clustering
> >> Subject: Re: [Linux-cluster] service startup order
> >>
> >> Parallel services will be started simultaneously. Services 
> >> configured as
> >> service trees will start in the order that they are defined 
> >> (and stopped
> >> in reverse order).
> >>
> >> This covers the start order well:
> >> - https://fedorahosted.org/cluster/wiki/ResourceTrees
> >>
> > The referred to wiki article only seems to treat
> > starting/stopping order and hierarchy (parent-child vs.
sibling)
> > of resources within one service aka resource group.
> > That sounds pretty clear.
> > But what about ordering and possible dependencies between
> > separate services?
> >
> > You mentioned service trees. You didn't actually mean
resource
> > trees?
> > If however your wording was deliberate (what I assume) I
wonder
> > if one can nest service tag blocks as one can nest resource
tags
> > within a single service block to express dependencies or
> > hierarchy and hence starting order between and of services?
> > Because all the sample configuration snippets I have seen so
far
> > in various docs lack such nesting of services.
> >
> > The reason that interests me is because I have such a case
where
> > a customer requires such a dependency between two distinct
> > services that during normal operation (i.e. no node has left
the
> > cluster) are hosted on different nodes.
> > I told them, from what I have perceived of HA clustering and
RHCS
> > in particular so far, that if they wish to express such an
> > interdependency that they would have to put all resources
which
> > are now split up in two services, in a nested manner that
would
> > map the intended hierarchy, in a single service.
> >
> > Because they insisted on their layout I searched a little and
> > discovered the, in the official RHCS Admin doc not mentioned,
> > service tag attributes "depend" and "depend_mode".
> >
> > However, their usage at first seemed pretty useless because
the
> > clusterware seemed to completely ignore them and start/stop
> > services in sometimes unpredictable ways and even restarted
them
> > at random.
> > Until I, more by accident, discovered that additionally the
"rm"
> > tag's attribute "central_processing" needed to be defined and
> > assigned to "1" or "true" for this feature to work
approximately.
> > I say apprimately here because we still have issues with this
> > cluster that require futher testing.
> > I hardly dare mentioning, that unfortunately this system
already
> > went into production, now of course lacking any HA,
> > why we had to defer further testing.
> >
> >
> > Regards
> > Ralph
> >
> >  
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 



From laszlo.budai at gmail.com  Thu Jul 28 08:06:55 2011
From: laszlo.budai at gmail.com (Budai Laszlo)
Date: Thu, 28 Jul 2011 11:06:55 +0300
Subject: [Linux-cluster] service startup order
In-Reply-To: <A789DDB53ED7E94396E842EE2AC9B5FF014329C4@itdzex101.ITDZ.verwalt-berlin.de>
References: <4E30AB1A.1080102@gmail.com>	<4E30B25E.5010302@alteeve.com><A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>	<4E310A6E.1090206@gmail.com>
	<A789DDB53ED7E94396E842EE2AC9B5FF014329C4@itdzex101.ITDZ.verwalt-berlin.de>
Message-ID: <4E31189F.3020805@gmail.com>

Hi Ralph,

thank you for your quick answer.
That Knowledge base article indeed presents that dependencies
possibility. Unfortunately I do not have the required version of
rgmanager :(

kind regards,
Laszlo

On 07/28/2011 10:34 AM, Ralph.Grothe at itdz-berlin.de wrote:
>  
>
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com 
>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Budai
> Laszlo
>> Sent: Thursday, July 28, 2011 9:06 AM
>> To: linux-cluster at redhat.com
>> Subject: Re: [Linux-cluster] service startup order
>>
>> @Ralph: could you please provide me some references where 
>> have you found
>> those attributes of the service tag (depend, depend_mode) and 
>> for the RM
>> tag (central_processing)? Thank you.
> Even Digimer is metioning this attribute in her excellent wiki
>
> http://wiki.alteeve.com/index.php/RHCS_v2_cluster.conf#central_pr
> ocessing
>
>
> If you have a login account at RHN you may find this article in
> their knowledge base
>
> https://access.redhat.com/kb/docs/DOC-26981
>
>
> Apart from that the only text where this attrib was used that I
> have come across so far was an RH doc treating deployment of SAP
> on RHCS
> But it only appears there in a sample config snippet without
> further explanation.
> I guess that they needed to activate it because they were using
> the RHCS event processing interface?
>
> http://www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf
>
>> Kind regards,
>> Laszlo
>>
>>
>> On 07/28/2011 09:24 AM, Ralph.Grothe at itdz-berlin.de wrote:
>>> Hi Digimer, hi Lazlo,
>>>
>>> sorry, for intruding your thread but this is something that I
> am
>>> also interested in and which I haven't fully fathomed yet.
>>>
>>>> -----Original Message-----
>>>> From: linux-cluster-bounces at redhat.com 
>>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of
> Digimer
>>>> Sent: Thursday, July 28, 2011 2:51 AM
>>>> To: linux clustering
>>>> Subject: Re: [Linux-cluster] service startup order
>>>>
>>>> Parallel services will be started simultaneously. Services 
>>>> configured as
>>>> service trees will start in the order that they are defined 
>>>> (and stopped
>>>> in reverse order).
>>>>
>>>> This covers the start order well:
>>>> - https://fedorahosted.org/cluster/wiki/ResourceTrees
>>>>
>>> The referred to wiki article only seems to treat
>>> starting/stopping order and hierarchy (parent-child vs.
> sibling)
>>> of resources within one service aka resource group.
>>> That sounds pretty clear.
>>> But what about ordering and possible dependencies between
>>> separate services?
>>>
>>> You mentioned service trees. You didn't actually mean
> resource
>>> trees?
>>> If however your wording was deliberate (what I assume) I
> wonder
>>> if one can nest service tag blocks as one can nest resource
> tags
>>> within a single service block to express dependencies or
>>> hierarchy and hence starting order between and of services?
>>> Because all the sample configuration snippets I have seen so
> far
>>> in various docs lack such nesting of services.
>>>
>>> The reason that interests me is because I have such a case
> where
>>> a customer requires such a dependency between two distinct
>>> services that during normal operation (i.e. no node has left
> the
>>> cluster) are hosted on different nodes.
>>> I told them, from what I have perceived of HA clustering and
> RHCS
>>> in particular so far, that if they wish to express such an
>>> interdependency that they would have to put all resources
> which
>>> are now split up in two services, in a nested manner that
> would
>>> map the intended hierarchy, in a single service.
>>>
>>> Because they insisted on their layout I searched a little and
>>> discovered the, in the official RHCS Admin doc not mentioned,
>>> service tag attributes "depend" and "depend_mode".
>>>
>>> However, their usage at first seemed pretty useless because
> the
>>> clusterware seemed to completely ignore them and start/stop
>>> services in sometimes unpredictable ways and even restarted
> them
>>> at random.
>>> Until I, more by accident, discovered that additionally the
> "rm"
>>> tag's attribute "central_processing" needed to be defined and
>>> assigned to "1" or "true" for this feature to work
> approximately.
>>> I say apprimately here because we still have issues with this
>>> cluster that require futher testing.
>>> I hardly dare mentioning, that unfortunately this system
> already
>>> went into production, now of course lacking any HA,
>>> why we had to defer further testing.
>>>
>>>
>>> Regards
>>> Ralph
>>>
>>>  
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From Ralph.Grothe at itdz-berlin.de  Thu Jul 28 09:01:44 2011
From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de)
Date: Thu, 28 Jul 2011 11:01:44 +0200
Subject: [Linux-cluster] service startup order
In-Reply-To: <4E31189F.3020805@gmail.com>
References: <4E30AB1A.1080102@gmail.com>	<4E30B25E.5010302@alteeve.com><A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>	<4E310A6E.1090206@gmail.com><A789DDB53ED7E94396E842EE2AC9B5FF014329C4@itdzex101.ITDZ.verwalt-berlin.de>
	<4E31189F.3020805@gmail.com>
Message-ID: <A789DDB53ED7E94396E842EE2AC9B5FF014329C6@itdzex101.ITDZ.verwalt-berlin.de>

 

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Budai
Laszlo
> Sent: Thursday, July 28, 2011 10:07 AM
> To: linux-cluster at redhat.com
> Subject: Re: [Linux-cluster] service startup order
> 
> Hi Ralph,
> 
> thank you for your quick answer.
> That Knowledge base article indeed presents that dependencies
> possibility. Unfortunately I do not have the required version
of
> rgmanager :(
> 

Lazlo, 

As you wrote, in RHEL 6.X as a tech preview one now has the
choice to install pacemaker and corosync
which however isn't yet officially supported by RH.

But it looks that the RHCS will be moving to Pacemaker in
forthcoming releases.

With Pacemaker, as you remarked, one can configure dependencies
between services.

I'm in a similar dilemma like you.
Because we have support contracts with RH their current RHCS
under RHEL 5.6 is what we "sell" as a service to our customers.
So, how much I would like to shift to Pacemaker, I am tied to the
RHCS version that we have support for.


Good Luck
Ralph



From linux at alteeve.com  Thu Jul 28 11:36:45 2011
From: linux at alteeve.com (Digimer)
Date: Thu, 28 Jul 2011 07:36:45 -0400
Subject: [Linux-cluster] service startup order
In-Reply-To: <A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>
References: <4E30AB1A.1080102@gmail.com> <4E30B25E.5010302@alteeve.com>
	<A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>
Message-ID: <4E3149CD.4010006@alteeve.com>

On 07/28/2011 02:24 AM, Ralph.Grothe at itdz-berlin.de wrote:
> Hi Digimer, hi Lazlo,
> 
> sorry, for intruding your thread but this is something that I am
> also interested in and which I haven't fully fathomed yet.
> 
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com 
>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer
>> Sent: Thursday, July 28, 2011 2:51 AM
>> To: linux clustering
>> Subject: Re: [Linux-cluster] service startup order
>>
>> Parallel services will be started simultaneously. Services 
>> configured as
>> service trees will start in the order that they are defined 
>> (and stopped
>> in reverse order).
>>
>> This covers the start order well:
>> - https://fedorahosted.org/cluster/wiki/ResourceTrees
>>
> 
> The referred to wiki article only seems to treat
> starting/stopping order and hierarchy (parent-child vs. sibling)
> of resources within one service aka resource group.
> That sounds pretty clear.
> But what about ordering and possible dependencies between
> separate services?
> 
> You mentioned service trees. You didn't actually mean resource
> trees?
> If however your wording was deliberate (what I assume) I wonder
> if one can nest service tag blocks as one can nest resource tags
> within a single service block to express dependencies or
> hierarchy and hence starting order between and of services?
> Because all the sample configuration snippets I have seen so far
> in various docs lack such nesting of services.
> 
> The reason that interests me is because I have such a case where
> a customer requires such a dependency between two distinct
> services that during normal operation (i.e. no node has left the
> cluster) are hosted on different nodes.
> I told them, from what I have perceived of HA clustering and RHCS
> in particular so far, that if they wish to express such an
> interdependency that they would have to put all resources which
> are now split up in two services, in a nested manner that would
> map the intended hierarchy, in a single service.
> 
> Because they insisted on their layout I searched a little and
> discovered the, in the official RHCS Admin doc not mentioned,
> service tag attributes "depend" and "depend_mode".
> 
> However, their usage at first seemed pretty useless because the
> clusterware seemed to completely ignore them and start/stop
> services in sometimes unpredictable ways and even restarted them
> at random.
> Until I, more by accident, discovered that additionally the "rm"
> tag's attribute "central_processing" needed to be defined and
> assigned to "1" or "true" for this feature to work approximately.
> I say apprimately here because we still have issues with this
> cluster that require futher testing.
> I hardly dare mentioning, that unfortunately this system already
> went into production, now of course lacking any HA,
> why we had to defer further testing.
> 
> 
> Regards
> Ralph

I am late in returning to the thread, my apologies. :)

I did not choose my words carefully, and I did mean resources, not
services. When I used services, I was thinking about ordered daemon
starting (that is, a resource group of system services/init.d scripts).

As far as I understand, and as has been mentioned already further down
this thread, rgmanager is limited in it's ability for complex dependency
checking. That is why Pacemaker is so attractive and why it will
eventually replace rgmanager as the primary cluster manager eventually
(EL7?).

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"At what point did we forget that the Space Shuttle was, essentially,
a program that strapped human beings to an explosion and tried to stab
through the sky with fire and math?"



From linux at alteeve.com  Thu Jul 28 11:41:08 2011
From: linux at alteeve.com (Digimer)
Date: Thu, 28 Jul 2011 07:41:08 -0400
Subject: [Linux-cluster] service startup order
In-Reply-To: <A789DDB53ED7E94396E842EE2AC9B5FF014329C4@itdzex101.ITDZ.verwalt-berlin.de>
References: <4E30AB1A.1080102@gmail.com>
	<4E30B25E.5010302@alteeve.com><A789DDB53ED7E94396E842EE2AC9B5FF014329C3@itdzex101.ITDZ.verwalt-berlin.de>
	<4E310A6E.1090206@gmail.com>
	<A789DDB53ED7E94396E842EE2AC9B5FF014329C4@itdzex101.ITDZ.verwalt-berlin.de>
Message-ID: <4E314AD4.5010809@alteeve.com>

On 07/28/2011 03:34 AM, Ralph.Grothe at itdz-berlin.de wrote:
>  
> 
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com 
>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Budai
> Laszlo
>> Sent: Thursday, July 28, 2011 9:06 AM
>> To: linux-cluster at redhat.com
>> Subject: Re: [Linux-cluster] service startup order
>>
>> @Ralph: could you please provide me some references where 
>> have you found
>> those attributes of the service tag (depend, depend_mode) and 
>> for the RM
>> tag (central_processing)? Thank you.
> 
> Even Digimer is metioning this attribute in her excellent wiki
> 
> http://wiki.alteeve.com/index.php/RHCS_v2_cluster.conf#central_pr
> ocessing
> 
> 
> If you have a login account at RHN you may find this article in
> their knowledge base
> 
> https://access.redhat.com/kb/docs/DOC-26981
> 
> 
> Apart from that the only text where this attrib was used that I
> have come across so far was an RH doc treating deployment of SAP
> on RHCS
> But it only appears there in a sample config snippet without
> further explanation.
> I guess that they needed to activate it because they were using
> the RHCS event processing interface?
> 
> http://www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf

Eek, that cluster.conf article is in a poor shape. As it stands now, it
is little more that a wiki'fied dump of the cluster.ng xmllint file. The
lack of documentation of many options is why there are so many /no info/
entries.

I had planned to bug the devs to get better definitions of the options,
and may still do that someday. However, if working on that article, I
came to realize that Pacemaker is the future of resource management, so
for now, I'm focusing on learning Pacemaker.

As a general note of caution; Though the extra arguments may exist,
unless you find documentation from Red Hat directly on their use, I'd
hesitate to use the options in production. I am unclear on Red Hat's
policy towards maintaining the functionality of those
undocumented/minimally documented attributes. If you have a Red Hat
contract, I'd strongly urge you to speak to your rep about using those
attributes.

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"At what point did we forget that the Space Shuttle was, essentially,
a program that strapped human beings to an explosion and tried to stab
through the sky with fire and math?"



From cos at aaaaa.org  Thu Jul 28 21:39:24 2011
From: cos at aaaaa.org (Ofer Inbar)
Date: Thu, 28 Jul 2011 17:39:24 -0400
Subject: [Linux-cluster] RHCS resource agent: status interval vs. monitor
	interval
Message-ID: <20110728213924.GD341@mip.aaaaa.org>

In the <actions> section of a RHCS resource agent's meta-data,
there are nodes for both action name="status" and action name="monitor".
Both of them have an interval and a timeout.  For example, in ip.sh:

        <!-- Checks to see if the IP is up and (optionally) the link is
             working -->
        <action name="status" interval="20" timeout="10"/>
        <action name="monitor" interval="20" timeout="10"/>

        <!-- Checks to see if we can ping the IP address locally -->
        <action name="status" depth="10" interval="60" timeout="20"/>
        <action name="monitor" depth="10" interval="60" timeout="20"/>

I assume that one of them controls how often rgmanager runs the
resource agent to check the resource status, but which one, and
what's the point of the other one?

I tried to find the answer in:
  https://fedorahosted.org/cluster/wiki/ResourceActions
  http://www.opencf.org/cgi-bin/viewcvs.cgi/*checkout*/specs/ra/resource-agent-api.txt?rev=1.10

Neither of them explain why there are separate "status" and "monitor" actions.
  -- Cos



From Ralph.Grothe at itdz-berlin.de  Fri Jul 29 05:57:01 2011
From: Ralph.Grothe at itdz-berlin.de (Ralph.Grothe at itdz-berlin.de)
Date: Fri, 29 Jul 2011 07:57:01 +0200
Subject: [Linux-cluster] RHCS resource agent: status interval vs.
	monitorinterval
In-Reply-To: <20110728213924.GD341@mip.aaaaa.org>
References: <20110728213924.GD341@mip.aaaaa.org>
Message-ID: <A789DDB53ED7E94396E842EE2AC9B5FF014329CC@itdzex101.ITDZ.verwalt-berlin.de>

I'm not sure and may be wrong.
But to my understanding the "monitor" action adheres to the OCF
RA API
http://www.linux-ha.org/doc/dev-guides/_resource_agent_actions.ht
ml
while the status action seems to be purely RHCS specific.
I assume they chose "status" to stay in tradition with the LSB
init scripts' invocation parameters.
So on an RHCS cluster "status" should apply.


> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Ofer
Inbar
> Sent: Thursday, July 28, 2011 11:39 PM
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] RHCS resource agent: status interval 
> vs. monitorinterval
> 
> In the <actions> section of a RHCS resource agent's meta-data,
> there are nodes for both action name="status" and action 
> name="monitor".
> Both of them have an interval and a timeout.  For example, in
ip.sh:
> 
>         <!-- Checks to see if the IP is up and (optionally) 
> the link is
>              working -->
>         <action name="status" interval="20" timeout="10"/>
>         <action name="monitor" interval="20" timeout="10"/>
> 
>         <!-- Checks to see if we can ping the IP address
locally -->
>         <action name="status" depth="10" interval="60"
timeout="20"/>
>         <action name="monitor" depth="10" interval="60"
timeout="20"/>
> 
> I assume that one of them controls how often rgmanager runs the
> resource agent to check the resource status, but which one, and
> what's the point of the other one?
> 
> I tried to find the answer in:
>   https://fedorahosted.org/cluster/wiki/ResourceActions
>   
> http://www.opencf.org/cgi-bin/viewcvs.cgi/*checkout*/specs/ra/
resource-agent-api.txt?rev=1.10
> 
> Neither of them explain why there are separate "status" and 
> "monitor" actions.
>   -- Cos
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 



From jordir at fib.upc.edu  Fri Jul 29 10:10:05 2011
From: jordir at fib.upc.edu (Jordi Renye)
Date: Fri, 29 Jul 2011 12:10:05 +0200
Subject: [Linux-cluster] samba share  partition of nfs mounted
Message-ID: <4E3286FD.3090602@fib.upc.edu>

Hi,

Due to performance problems with GFS2 in two cluster node, we would like
change directions to next architecture:

- node A: partition EXT3 (before GFS2)
- share partition directly via samba to pc clients

- node B: mount partition of node A via NFS
- share this nfs mounted partition  through samba to pc clients

We have 300 clients of samba (between windows and  linux)
that mount remote partition  using it as HOME directory.

We are making load balancing, between windows and linux clients:
windows mount from node A, and linux from node B.

Problems with GFS2 were:
- long time to take backup.
- problems with some applications as Eclipse.
- with this two problems, we have not yet interactive tests with samba.


We would like to think that EXT3 resolve this problems. Do you see, any
problems with this configuration ?

First thing I see, it's difficult to add high availability  in case Node 
A going down.

Thanks in advanced,

Jordi Renye
UPC