From dma+linuxcluster at witbe.net  Tue Apr  1 09:32:18 2008
From: dma+linuxcluster at witbe.net (Daniel Maher)
Date: Tue, 1 Apr 2008 11:32:18 +0200
Subject: [Linux-cluster] (newbie) mirrored data / cluster ?
In-Reply-To: <47F138E7.4000903@cmiware.com>
References: <20080331194027.101fdf09@danstation>
	<9F633DE6C0E04F4691DCB713AC44C94B066E4C5B@EXCHANGE.SHSU.EDU>
	<47F138E7.4000903@cmiware.com>
Message-ID: <20080401113218.45dd3296@danstation>

On Mon, 31 Mar 2008 14:17:59 -0500 Chris Harms <chris at cmiware.com>
wrote:

> The non-SAN option would be to use DRBD (http://www.drbd.org) and put 
> NFS, Samba, etc on top of the DRBD partition.

Thank you for your reply.

On this topic, consider this paper by Lars Ellenberg :
http://www.drbd.org/fileadmin/drbd/publications/drbd8.linux-conf.eu.2007.pdf

Where he notes the following :
"The most inconvenient limitations is currently that DRBD supports only
two nodes natively."

While this is not a problem in my theoretical two-server setup, should
we wish to add a third server in the future (which i find highly
likely), then DRBD will no longer be an appropriate solution.

Furthermore, that same paper seems to suggest that DRBD is best used in
a primary / secondary relationship, whereas i'm suggesting an
"all-primary" sort of setup.


-- 
Daniel Maher <dma AT witbe.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080401/a63b0a3c/attachment.sig>

From dma+linuxcluster at witbe.net  Tue Apr  1 09:41:22 2008
From: dma+linuxcluster at witbe.net (Daniel Maher)
Date: Tue, 1 Apr 2008 11:41:22 +0200
Subject: [Linux-cluster] (newbie) mirrored data / cluster ?
In-Reply-To: <9F633DE6C0E04F4691DCB713AC44C94B066E4C5B@EXCHANGE.SHSU.EDU>
References: <20080331194027.101fdf09@danstation>
	<9F633DE6C0E04F4691DCB713AC44C94B066E4C5B@EXCHANGE.SHSU.EDU>
Message-ID: <20080401114122.24df5244@danstation>

On Mon, 31 Mar 2008 13:57:46 -0500 "MARTI, ROBERT JESSE"
<RJM002 at shsu.edu> wrote:

> You don't have to have a mirrored LVM to do what youre trying to do.
> You just need a common mountable share - typically a SAN or NAS.  It
> shouldn't be too hard to configure (and I've already done it).  You
> don't even *have* to have cluster suite - if you have a load balancer.
> My brain isn't fast enough today to figure out how to share a load
> without a load balanced VIP or a DNS round robin (which should be easy
> to do as well).

Thank you for your reply.  As for your suggestion of having a common
mountable share - well, yes, that's exactly what i'm trying to do.
I want to take to servers, and create a NAS device from them.  I don't
already have a load balancer, but using RRDNS is straightforward
enough.

The other aspect of this initiative is to gain some useful applicative
experience with cluster suite, as we'd like to clusterise our front-end
web servers down the road as well.


-- 
Daniel Maher <dma AT witbe.net>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080401/08b7a888/attachment.sig>

From gordan at bobich.net  Tue Apr  1 09:50:10 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Tue, 1 Apr 2008 10:50:10 +0100 (BST)
Subject: [Linux-cluster] (newbie) mirrored data / cluster ?
In-Reply-To: <20080401113218.45dd3296@danstation>
References: <20080331194027.101fdf09@danstation>
	<9F633DE6C0E04F4691DCB713AC44C94B066E4C5B@EXCHANGE.SHSU.EDU>
	<47F138E7.4000903@cmiware.com> <20080401113218.45dd3296@danstation>
Message-ID: <alpine.LRH.1.10.0804011040050.4154@skynet.shatteredsilicon.net>

On Tue, 1 Apr 2008, Daniel Maher wrote:

>> The non-SAN option would be to use DRBD (http://www.drbd.org) and put
>> NFS, Samba, etc on top of the DRBD partition.
>
> On this topic, consider this paper by Lars Ellenberg :
> http://www.drbd.org/fileadmin/drbd/publications/drbd8.linux-conf.eu.2007.pdf
>
> Where he notes the following :
> "The most inconvenient limitations is currently that DRBD supports only
> two nodes natively."

I'm not 100% sure, but I think this limit is increased in latest 8.0 and 
8.2 releases.

> While this is not a problem in my theoretical two-server setup, should
> we wish to add a third server in the future (which i find highly
> likely), then DRBD will no longer be an appropriate solution.

I'd double check that this is still a limitation. Ask on the DRBD list.

> Furthermore, that same paper seems to suggest that DRBD is best used in
> a primary / secondary relationship, whereas i'm suggesting an
> "all-primary" sort of setup.

That is the way it has been used traditionally with DRBD <= 7.x, but for a 
while now primary/primary operation has been fully supported (obviously, 
you need to use a FS that is aware of such things, such as GFS(2) or 
OCFS(2)).

Gordan



From Danny.Wall at health-first.org  Tue Apr  1 16:45:28 2008
From: Danny.Wall at health-first.org (Danny Wall)
Date: Tue, 01 Apr 2008 12:45:28 -0400
Subject: [Linux-cluster] Using GFS and DLM without RHCS
Message-ID: <47F22E68020000C800008B81@mail-int.health-first.org>

Danny Wall wrote:
> I was wondering if it is possible to run GFS on several machines with
a
> shared GFS LUN, but not use full clustering like RHCS. From the FAQs:

> First of all, what's the problem with having RHCS running? It doesn't 
> mean you have to use it to handle resources failing over. You can run
> it all in active/active setup with load balancing in front.

I was looking to minimize everything as much as possible, so if it is
not needed, do not install it. This would reduce problems with updates
and overall management. Having said that, your solution is still a
better alternative for my needs, and options like this are what I am
looking for. Thanks

> If this is not an acceptable solution for you and you still cannot be 
> bothered to create cluster.conf (and that is all that is required), 
> you can always use OCFS2. This doesn't have a cluster component (it's 
> totally unrelated to RHCS), but you still have to create the 
> equivalent config, so you won't be saving yourself any effort.

> Gordan

OCFS is out of he question. OCFS can not handle the number of files and
directories on these servers. 

I don't technically need a cluster, but the cluster filesystem allows me
to have multiple servers with access to the storage at the same time,
reducing downtime, and allowing for processes like backups to run on a
different server and not overload the server used by the end users. If I
can implement this without the cluster, it will reduce complexity. Some
of the problems I have seen recently include the cluster failing to
relocate resources, and not properly fencing a node.

Thanks
Danny




#####################################
This message is for the named person's use only.  It may 
contain private, proprietary, or legally privileged information.  
No privilege is waived or lost by any mistransmission.  If you 
receive this message in error, please immediately delete it and 
all copies of it from your system, destroy any hard copies of it, 
and notify the sender.  You must not, directly or indirectly, use, 
disclose, distribute, print, or copy any part of this message if you 
are not the intended recipient.  Health First reserves the right to 
monitor all e-mail communications through its networks.  Any views 
or opinions expressed in this message are solely those of the 
individual sender, except (1) where the message states such views 
or opinions are on behalf of a particular entity;  and (2) the sender 
is authorized by the entity to give such views or opinions.
#####################################




From garromo at us.ibm.com  Tue Apr  1 17:38:10 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Tue, 1 Apr 2008 11:38:10 -0600
Subject: [Linux-cluster] VIP's on mixed subnets
Message-ID: <OF36CA8B0B.F6241E8B-ON8725741E.0060B0D3-8725741E.0060BDDB@us.ibm.com>

In my cluster all of my servers NICs are bonded.
Up until recently all of my VIPs (for resources/services) were in the same 
subnet.
Is it ok that VIPs be in mixed subnets?  Thanks.

Gary Romo
IBM Global Technology Services
303.458.4415
Email: garromo at us.ibm.com
Pager:1.877.552.9264
Text message: gromo at skytel.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080401/69ca4065/attachment.htm>

From tsucharz at poczta.onet.pl  Tue Apr  1 19:29:35 2008
From: tsucharz at poczta.onet.pl (Tomasz Sucharzewski)
Date: Tue, 1 Apr 2008 21:29:35 +0200
Subject: [Linux-cluster] (newbie) mirrored data / cluster ?
In-Reply-To: <47F138E7.4000903@cmiware.com>
References: <20080331194027.101fdf09@danstation>
	<9F633DE6C0E04F4691DCB713AC44C94B066E4C5B@EXCHANGE.SHSU.EDU>
	<47F138E7.4000903@cmiware.com>
Message-ID: <20080401212935.726ee726.tsucharz@poczta.onet.pl>

Hello,

BTW do you know any software solution that supports asynchronous replication on Linux like AVS on Solaris ?

Best regards,
Tomek 

On Mon, 31 Mar 2008 14:17:59 -0500
Chris Harms <chris at cmiware.com> wrote:

> The non-SAN option would be to use DRBD (http://www.drbd.org) and put 
> NFS, Samba, etc on top of the DRBD partition.
> 
> Chris
> 
> MARTI, ROBERT JESSE wrote:
> > You don't have to have a mirrored LVM to do what youre trying to do.
> > You just need a common mountable share - typically a SAN or NAS.  It
> > shouldn't be too hard to configure (and I've already done it).  You
> > don't even *have* to have cluster suite - if you have a load balancer.
> > My brain isn't fast enough today to figure out how to share a load
> > without a load balanced VIP or a DNS round robin (which should be easy
> > to do as well).
> >
> > Rob Marti
> > Systems Analyst II
> > Sam Houston State University
> >
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Daniel Maher
> > Sent: Monday, March 31, 2008 12:40 PM
> > To: linux-cluster at redhat.com
> > Subject: [Linux-cluster] (newbie) mirrored data / cluster ?
> >
> > Hello all,
> >
> > I have spent the day reading through the mailing list archives, Redhat
> > documentation, and CentOS forums, and - to be frank - my head is now
> > swimming with information.
> >
> > My scenario seems reasonably straightforward : I would like to have two
> > file servers which mirror each others' data, then i'd like those two
> > servers to act as a cluster, whereby they serve said data as if they
> > were one machine.  If one of the servers suffers a critical failure, the
> > other will stay up, and the data will continue to be accessible to the
> > rest of the network.
> >
> > I note with some trepidation that this might not be possible, as per
> > this document :
> > http://www.redhat.com/docs/manuals/enterprise/RHEL-5-manual/en-US/RHEL51
> > 0/Cluster_Logical_Volume_Manager/mirrored_volumes.html
> >
> > However, i don't know if that document relates to the same scenario i've
> > described above.  I would very much appreciate any and all feedback,
> > links to further documentation, and any other information that you might
> > like to share.
> >
> > Thank you !
> >
> >
> > --
> > Daniel Maher <dma AT witbe.net>
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >   
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


-- 
Tomasz Sucharzewski <tsucharz at poczta.onet.pl>



From anujhere at gmail.com  Tue Apr  1 21:23:37 2008
From: anujhere at gmail.com (=?UTF-8?Q?=E0=A4=85=E0=A4=A8=E0=A5=81=E0=A4=9C_Anuj_Singh?=)
Date: Wed, 2 Apr 2008 02:53:37 +0530
Subject: [Linux-cluster] distributed file system... can we achieve
	effectively using linux
Message-ID: <3120c9e30804011423u19670332lba8bfe82b4543066@mail.gmail.com>

Hi,

How can we create a common Q drive using linux that meets the following
needs ?

It should be possible to logically the common Q drive into smaller
partitions, each managed by a custodian

The custodian of a partition, should be able to monitor and control the
usage of a partition.

Presently, Q drives are used as shared folders at different locations over
WAN. (so the network traffic and server load will be a factor, not all the
files of Q - drive is required among locations.)

Present Q drives are on windows platform.

Do we have a better option over microsoft's DFS?

Thanks and Regards
Anuj
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/365eecfd/attachment.htm>

From andrew at ntsg.umt.edu  Wed Apr  2 00:19:15 2008
From: andrew at ntsg.umt.edu (Andrew A. Neuschwander)
Date: Tue, 1 Apr 2008 18:19:15 -0600 (MDT)
Subject: [Linux-cluster] dlm high cpu on latest stock centos 5.1 kernel
Message-ID: <33710.10.8.105.69.1207095555.squirrel@secure.ntsg.umt.edu>

I have a GFS cluster with one node serving files via smb and nfs. Under
fairly light usage (5-10 users) the cpu is getting pounded by dlm. I am
using CentOS5.1 with the included kernel (2.6.18-53.1.14.el5). This sounds
like the dlm issue mentioned back in March of last year
(https://www.redhat.com/archives/linux-cluster/2007-March/msg00068.html)
that was resolved in 2.6.21.

Has (or will) this fix be back ported to the current el5 kernel? Will it
be in RHEL5.2? What is the easiest way for me to get this fix?

Also, if I try a newer kernel on this node, will there be any harm in the
other nodes using their current kernel?

Thanks,
-Andrew
-- 
Andrew A. Neuschwander, RHCE
Linux Systems Administrator
Numerical Terradynamic Simulation Group
College of Forestry and Conservation
The University of Montana
http://www.ntsg.umt.edu
andrew at ntsg.umt.edu - 406.243.6310



From david at eciad.ca  Wed Apr  2 00:30:56 2008
From: david at eciad.ca (David Ayre)
Date: Tue, 1 Apr 2008 17:30:56 -0700
Subject: [Linux-cluster] dlm high cpu on latest stock centos 5.1 kernel
In-Reply-To: <33710.10.8.105.69.1207095555.squirrel@secure.ntsg.umt.edu>
References: <33710.10.8.105.69.1207095555.squirrel@secure.ntsg.umt.edu>
Message-ID: <EE95BF49-125C-4A49-99D1-E7459BDE6E80@eciad.ca>

What do you mean by pounded exactly ?

We have an ongoing issue, similar... when we have about a dozen users  
using both smb/nfs, and at some seemingly random point in time our  
dlm_senddd chews up 100% of the CPU... then dies down at on its own  
after quite a while.  Killing SMB processes, shutting down SMB didn't  
seem to have any affect... only a reboot cures it.  I've seen this  
described (if this is the same issue) as a "soft lockup" as it does  
seem to come back to life:

http://lkml.org/lkml/2007/10/4/137

We've been assuming its a kernel/dlm version as we are running  
2.6.9-55.0.6.ELsmp with dlm-kernel 2.6.9-46.16.0.8

we were going to try a kernel update this week... but you seem to be  
using a later version and still have this problem ?

Could you elaborate on "getting pounded by dlm" ?  I've posted about  
this on this list in the past but received no assistance.




On 1-Apr-08, at 5:19 PM, Andrew A. Neuschwander wrote:

> I have a GFS cluster with one node serving files via smb and nfs.  
> Under
> fairly light usage (5-10 users) the cpu is getting pounded by dlm. I  
> am
> using CentOS5.1 with the included kernel (2.6.18-53.1.14.el5). This  
> sounds
> like the dlm issue mentioned back in March of last year
> (https://www.redhat.com/archives/linux-cluster/2007-March/msg00068.html 
> )
> that was resolved in 2.6.21.
>
> Has (or will) this fix be back ported to the current el5 kernel?  
> Will it
> be in RHEL5.2? What is the easiest way for me to get this fix?
>
> Also, if I try a newer kernel on this node, will there be any harm  
> in the
> other nodes using their current kernel?
>
> Thanks,
> -Andrew
> -- 
> Andrew A. Neuschwander, RHCE
> Linux Systems Administrator
> Numerical Terradynamic Simulation Group
> College of Forestry and Conservation
> The University of Montana
> http://www.ntsg.umt.edu
> andrew at ntsg.umt.edu - 406.243.6310
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~
David Ayre
Programmer/Analyst - Information Technlogy Services
Emily Carr Institute of Art and Design
Vancouver, B.C.   Canada
604-844-3875 /  david at eciad.ca

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080401/8067c024/attachment.htm>

From andrew at ntsg.umt.edu  Wed Apr  2 00:51:01 2008
From: andrew at ntsg.umt.edu (Andrew A. Neuschwander)
Date: Tue, 1 Apr 2008 18:51:01 -0600 (MDT)
Subject: [Linux-cluster] dlm high cpu on latest stock centos 5.1 kernel
In-Reply-To: <EE95BF49-125C-4A49-99D1-E7459BDE6E80@eciad.ca>
References: <33710.10.8.105.69.1207095555.squirrel@secure.ntsg.umt.edu>
	<EE95BF49-125C-4A49-99D1-E7459BDE6E80@eciad.ca>
Message-ID: <47567.10.8.105.69.1207097461.squirrel@secure.ntsg.umt.edu>

My symptoms are similar. dlm_send sits on all of the cpu. Top shows the
cpu spending nearly all of it's time in sys or interrupt handling. Disk
and network I/O isn't very high (as seen via iostat and iptraf). But
SMB/NFS throughput and latency are horrible. Context switches per second
as seen by vmstat are in the 20,000+ range (I don't now if this is high
though, I haven't really paid attention to this in the past). Nothing
crashes, and it is still able to serve data (very slowly), and eventually
the load and latency recovers.

As an aside, does anyone know how to _view_ the resource group size after
file system creation on GFS?

Thanks,
-Andrew


On Tue, April 1, 2008 6:30 pm, David Ayre wrote:
> What do you mean by pounded exactly ?
>
> We have an ongoing issue, similar... when we have about a dozen users
> using both smb/nfs, and at some seemingly random point in time our
> dlm_senddd chews up 100% of the CPU... then dies down at on its own
> after quite a while.  Killing SMB processes, shutting down SMB didn't
> seem to have any affect... only a reboot cures it.  I've seen this
> described (if this is the same issue) as a "soft lockup" as it does
> seem to come back to life:
>
> http://lkml.org/lkml/2007/10/4/137
>
> We've been assuming its a kernel/dlm version as we are running
> 2.6.9-55.0.6.ELsmp with dlm-kernel 2.6.9-46.16.0.8
>
> we were going to try a kernel update this week... but you seem to be
> using a later version and still have this problem ?
>
> Could you elaborate on "getting pounded by dlm" ?  I've posted about
> this on this list in the past but received no assistance.
>
>
>
>
> On 1-Apr-08, at 5:19 PM, Andrew A. Neuschwander wrote:
>
>> I have a GFS cluster with one node serving files via smb and nfs.
>> Under
>> fairly light usage (5-10 users) the cpu is getting pounded by dlm. I
>> am
>> using CentOS5.1 with the included kernel (2.6.18-53.1.14.el5). This
>> sounds
>> like the dlm issue mentioned back in March of last year
>> (https://www.redhat.com/archives/linux-cluster/2007-March/msg00068.html
>> )
>> that was resolved in 2.6.21.
>>
>> Has (or will) this fix be back ported to the current el5 kernel?
>> Will it
>> be in RHEL5.2? What is the easiest way for me to get this fix?
>>
>> Also, if I try a newer kernel on this node, will there be any harm
>> in the
>> other nodes using their current kernel?
>>
>> Thanks,
>> -Andrew
>> --
>> Andrew A. Neuschwander, RHCE
>> Linux Systems Administrator
>> Numerical Terradynamic Simulation Group
>> College of Forestry and Conservation
>> The University of Montana
>> http://www.ntsg.umt.edu
>> andrew at ntsg.umt.edu - 406.243.6310
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> ~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~
> David Ayre
> Programmer/Analyst - Information Technlogy Services
> Emily Carr Institute of Art and Design
> Vancouver, B.C.   Canada
> 604-844-3875 /  david at eciad.ca
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From sajithks at cdactvm.in  Wed Apr  2 04:57:59 2008
From: sajithks at cdactvm.in (sajith)
Date: Wed, 2 Apr 2008 10:27:59 +0530
Subject: [Linux-cluster] linux cluster on rhel5 without using gfs and shared
	storage
Message-ID: <200804020441.m324fXZ6024516@cdactvm.in>

Hai all,

	I am new to linux cluster. I want to set up a two node cluster using
rhcs. In my application I am using tomcat and mysql as database. My aim is
to configure both servers in active-passive configuration. I have tested the
failover of ip and process using conga. But I am stuck with the
configuration of mysql failover. I am confused with how to make the data
files redundant for mysql. If I am using nfs for data files the files are
not accessible if nfs server is down. How can I create an online back up of
my data files so that if my main server is down then also I can access my
data from the online backup? I don't have gfs and SAN storage. My data files
and application reside in same machine. Please help

Regards,
Sajith K.S


______________________________________
Scanned and protected by Email scanner



From ebaydaan at gmail.com  Tue Apr  1 19:42:37 2008
From: ebaydaan at gmail.com (Daan Biere)
Date: Tue, 1 Apr 2008 21:42:37 +0200
Subject: [Linux-cluster] (newbie) mirrored data / cluster ?
References: <20080331194027.101fdf09@danstation><9F633DE6C0E04F4691DCB713AC44C94B066E4C5B@EXCHANGE.SHSU.EDU><47F138E7.4000903@cmiware.com>
	<20080401212935.726ee726.tsucharz@poczta.onet.pl>
Message-ID: <F164DC64BCD24597B008518A26A3E05E@coreduo>

Hi,
i think the closest solution to AVS will be drbd: http://www.drbd.org/

DRBD takes over the data, writes it to the local disk and sends it to the 
other host. On the other host, it takes it to the disk there.



The other components needed are a cluster membership service, which is 
supposed to be heartbeat, and some kind of application that works on top of 
a block device.



Examples:

A filesystem & fsck.

A journaling FS.

A database with recovery capabilities.



----- Original Message ----- 
From: "Tomasz Sucharzewski" <tsucharz at poczta.onet.pl>
To: <linux-cluster at redhat.com>
Sent: Tuesday, April 01, 2008 9:29 PM
Subject: Re: [Linux-cluster] (newbie) mirrored data / cluster ?


> Hello,
>
> BTW do you know any software solution that supports asynchronous 
> replication on Linux like AVS on Solaris ?
>
> Best regards,
> Tomek
>
> On Mon, 31 Mar 2008 14:17:59 -0500
> Chris Harms <chris at cmiware.com> wrote:
>
>> The non-SAN option would be to use DRBD (http://www.drbd.org) and put
>> NFS, Samba, etc on top of the DRBD partition.
>>
>> Chris
>>
>> MARTI, ROBERT JESSE wrote:
>> > You don't have to have a mirrored LVM to do what youre trying to do.
>> > You just need a common mountable share - typically a SAN or NAS.  It
>> > shouldn't be too hard to configure (and I've already done it).  You
>> > don't even *have* to have cluster suite - if you have a load balancer.
>> > My brain isn't fast enough today to figure out how to share a load
>> > without a load balanced VIP or a DNS round robin (which should be easy
>> > to do as well).
>> >
>> > Rob Marti
>> > Systems Analyst II
>> > Sam Houston State University
>> >
>> > -----Original Message-----
>> > From: linux-cluster-bounces at redhat.com
>> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Daniel Maher
>> > Sent: Monday, March 31, 2008 12:40 PM
>> > To: linux-cluster at redhat.com
>> > Subject: [Linux-cluster] (newbie) mirrored data / cluster ?
>> >
>> > Hello all,
>> >
>> > I have spent the day reading through the mailing list archives, Redhat
>> > documentation, and CentOS forums, and - to be frank - my head is now
>> > swimming with information.
>> >
>> > My scenario seems reasonably straightforward : I would like to have two
>> > file servers which mirror each others' data, then i'd like those two
>> > servers to act as a cluster, whereby they serve said data as if they
>> > were one machine.  If one of the servers suffers a critical failure, 
>> > the
>> > other will stay up, and the data will continue to be accessible to the
>> > rest of the network.
>> >
>> > I note with some trepidation that this might not be possible, as per
>> > this document :
>> > http://www.redhat.com/docs/manuals/enterprise/RHEL-5-manual/en-US/RHEL51
>> > 0/Cluster_Logical_Volume_Manager/mirrored_volumes.html
>> >
>> > However, i don't know if that document relates to the same scenario 
>> > i've
>> > described above.  I would very much appreciate any and all feedback,
>> > links to further documentation, and any other information that you 
>> > might
>> > like to share.
>> >
>> > Thank you !
>> >
>> >
>> > --
>> > Daniel Maher <dma AT witbe.net>
>> >
>> > --
>> > Linux-cluster mailing list
>> > Linux-cluster at redhat.com
>> > https://www.redhat.com/mailman/listinfo/linux-cluster
>> >
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
> -- 
> Tomasz Sucharzewski <tsucharz at poczta.onet.pl>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster 



From maciej.bogucki at artegence.com  Wed Apr  2 08:54:42 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Wed, 02 Apr 2008 10:54:42 +0200
Subject: [Linux-cluster] linux cluster on rhel5 without using gfs and
	shared	storage
In-Reply-To: <200804020441.m324fXZ6024516@cdactvm.in>
References: <200804020441.m324fXZ6024516@cdactvm.in>
Message-ID: <47F349D2.9030203@artegence.com>

sajith napisa?(a):
> Hai all,
> 
> 	I am new to linux cluster. I want to set up a two node cluster using
> rhcs. In my application I am using tomcat and mysql as database. My aim is
> to configure both servers in active-passive configuration. I have tested the
> failover of ip and process using conga. But I am stuck with the
> configuration of mysql failover. I am confused with how to make the data
> files redundant for mysql. If I am using nfs for data files the files are
> not accessible if nfs server is down. How can I create an online back up of
> my data files so that if my main server is down then also I can access my
> data from the online backup? I don't have gfs and SAN storage. My data files
> and application reside in same machine. Please help

Hello,

You can use drbd[1] to mirror block device via network. You also need
some automatic failover mechanism fe. RHCS or heratbeat[2].

[1] - http://www.drbd.org/
[2] - http://www.linux-ha.org/

Best Regards
Maciej Bogucki



From swhiteho at redhat.com  Wed Apr  2 09:53:34 2008
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 02 Apr 2008 10:53:34 +0100
Subject: [Linux-cluster] About GFS1 and I/O barriers.
In-Reply-To: <20080331151622.1360a2cb@mathieu.toulouse>
References: <20080328153458.45fc6e13@mathieu.toulouse>
	<20080331124651.3f0d2428@mathieu.toulouse>
	<1206960860.3635.126.camel@quoit>
	<20080331151622.1360a2cb@mathieu.toulouse>
Message-ID: <1207130014.3310.24.camel@localhost.localdomain>

Hi,

On Mon, 2008-03-31 at 15:16 +0200, Mathieu Avila wrote:
> Le Mon, 31 Mar 2008 11:54:20 +0100,
> Steven Whitehouse <swhiteho at redhat.com> a ?crit :
> 
> > Hi,
> > 
> 
> Hi,
> 
> > Both GFS1 and GFS2 are safe from this problem since neither of them
> > use barriers. Instead we do a flush at the critical points to ensure
> > that all data is on disk before proceeding with the next stage.
> > 
> 
> I don't think this solves the problem.
> 
> Consider a cheap iSCSI disk (no NVRAM, no UPS) accessed by all my GFS
> nodes; this disk has a write cache enabled, which means it will reply
> that write requests are performed even if they are not really written
> on the platters. The disk (like most disks nowadays) has some logic
> that allows it to optimize writes by re-scheduling them. It is possible
> that all writes are ACK'd before the power failure, but only a fraction
> of them were really performed : some are before the flush, some are
> after the flush. 
> --Not all blocks writes before the flush were performed but other
> blocks after the flush are written -> the FS is corrupted.--
> So, after the power failure all data in the disk's write cache are
> forgotten. If the journal data was in the disk cache, the journal was
> not written to disk, but other metadata have been written, so there are
> metadata inconsistencies.
> 
I don't agree that write caching implies that I/O must be acked before
it has hit disk. It might well be reordered (which is ok), but if we
wait for all outstanding I/O completions, then we ought to be able to be
sure that all I/O is actually on disk, or at the very least that further
I/O will not be reordered with already ACKed data. If devices are
sending ACKs in advance of the I/O hitting disk then I think thats
broken behaviour.

Consider what happens if a device was to send an ACK for a write and
then it discovers an uncorrectable error during the write - how would it
then be able to report it since it had already sent an "ok"? So far as I
can see the only reason for having the drive send an I/O completion back
is to report the success or otherwise of the operation, and if that
operation hasn't been completed, then we might just as well not wait for
ACKs.

> This is the problem that I/O barriers try to solve, by really forcing
> the block device (and the block layer) to have all blocks issued before
> the barrier to be written before any other after the barrier starts
> begin written.
> 
> The other solution is to completely disable the write cache of the
> disks, but this leads to dramatically bad performances.
> 
If its a choice between poor performance thats correct and good
performance which might lose data, then I know which I would choose :-)
Not all devices support barriers, so it always has to be an option; ext3
uses the barrier=1 mount option for this reason, and if it fails (e.g.
if the underlying device doesn't support barriers) it falls back to the
same technique which we are using in gfs1/2.

The other thing to bear in mind is that barriers, as currently
implemented are not really that great either. It would be nice to
replace them with something that allows better performance with (for
example) mirrors where the only current method of implementing the
barrier is to wait for all the I/O completions from all the disks in the
mirror set (and thus we are back to waiting for outstanding I/O again).

Steve.





From Bennie_R_Thomas at raytheon.com  Wed Apr  2 14:05:27 2008
From: Bennie_R_Thomas at raytheon.com (Bennie Thomas)
Date: Wed, 02 Apr 2008 09:05:27 -0500
Subject: [Linux-cluster] linux cluster on rhel5 without using gfs and
	shared	storage
In-Reply-To: <47F349D2.9030203@artegence.com>
References: <200804020441.m324fXZ6024516@cdactvm.in>
	<47F349D2.9030203@artegence.com>
Message-ID: <47F392A7.7060704@raytheon.com>

You can attach a network disk device and have it fail over with the 
active system and make
tomcat and mysql dependant of the disk resource. This is the simple 
route. when dealing with
clusters you should keep the "KISS" approach in-mind.

Regards,

Bennie

Any views or opinions presented are solely those of the author and do 
not necessarily represent those
of Raytheon unless specifically stated. Electronic communications 
including email might be monitored
by Raytheon. for operational or business reasons.


Maciej Bogucki wrote:
> sajith napisa?(a):
>   
>> Hai all,
>>
>> 	I am new to linux cluster. I want to set up a two node cluster using
>> rhcs. In my application I am using tomcat and mysql as database. My aim is
>> to configure both servers in active-passive configuration. I have tested the
>> failover of ip and process using conga. But I am stuck with the
>> configuration of mysql failover. I am confused with how to make the data
>> files redundant for mysql. If I am using nfs for data files the files are
>> not accessible if nfs server is down. How can I create an online back up of
>> my data files so that if my main server is down then also I can access my
>> data from the online backup? I don't have gfs and SAN storage. My data files
>> and application reside in same machine. Please help
>>     
>
> Hello,
>
> You can use drbd[1] to mirror block device via network. You also need
> some automatic failover mechanism fe. RHCS or heratbeat[2].
>
> [1] - http://www.drbd.org/
> [2] - http://www.linux-ha.org/
>
> Best Regards
> Maciej Bogucki
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/573cdb58/attachment.htm>

From Bennie_R_Thomas at raytheon.com  Wed Apr  2 14:13:13 2008
From: Bennie_R_Thomas at raytheon.com (Bennie Thomas)
Date: Wed, 02 Apr 2008 09:13:13 -0500
Subject: [Linux-cluster] linux cluster on rhel5 without using gfs and
	shared	storage
In-Reply-To: <47F392A7.7060704@raytheon.com>
References: <200804020441.m324fXZ6024516@cdactvm.in>	<47F349D2.9030203@artegence.com>
	<47F392A7.7060704@raytheon.com>
Message-ID: <47F39479.9050203@raytheon.com>

I guess ignore my last reply; I just read the title in it's entirety. He 
does not want to use shared storage.
Sorry !!!



Bennie Thomas wrote:
> You can attach a network disk device and have it fail over with the 
> active system and make
> tomcat and mysql dependant of the disk resource. This is the simple 
> route. when dealing with
> clusters you should keep the "KISS" approach in-mind.
>
> Regards,
>
> Bennie
>
> Any views or opinions presented are solely those of the author and do 
> not necessarily represent those
> of Raytheon unless specifically stated. Electronic communications 
> including email might be monitored
> by Raytheon. for operational or business reasons.
>
>
> Maciej Bogucki wrote:
>> sajith napisa?(a):
>>   
>>> Hai all,
>>>
>>> 	I am new to linux cluster. I want to set up a two node cluster using
>>> rhcs. In my application I am using tomcat and mysql as database. My aim is
>>> to configure both servers in active-passive configuration. I have tested the
>>> failover of ip and process using conga. But I am stuck with the
>>> configuration of mysql failover. I am confused with how to make the data
>>> files redundant for mysql. If I am using nfs for data files the files are
>>> not accessible if nfs server is down. How can I create an online back up of
>>> my data files so that if my main server is down then also I can access my
>>> data from the online backup? I don't have gfs and SAN storage. My data files
>>> and application reside in same machine. Please help
>>>     
>>
>> Hello,
>>
>> You can use drbd[1] to mirror block device via network. You also need
>> some automatic failover mechanism fe. RHCS or heratbeat[2].
>>
>> [1] - http://www.drbd.org/
>> [2] - http://www.linux-ha.org/
>>
>> Best Regards
>> Maciej Bogucki
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>   
>
> ------------------------------------------------------------------------
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/21e59517/attachment.htm>

From s.wendy.cheng at gmail.com  Wed Apr  2 14:26:58 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Wed, 2 Apr 2008 10:26:58 -0400
Subject: [Linux-cluster] About GFS1 and I/O barriers.
In-Reply-To: <1207130014.3310.24.camel@localhost.localdomain>
References: <20080328153458.45fc6e13@mathieu.toulouse>
	<20080331124651.3f0d2428@mathieu.toulouse>
	<1206960860.3635.126.camel@quoit>
	<20080331151622.1360a2cb@mathieu.toulouse>
	<1207130014.3310.24.camel@localhost.localdomain>
Message-ID: <1a2a6dd60804020726g20d77419k47298eb000c431ec@mail.gmail.com>

On Wed, Apr 2, 2008 at 5:53 AM, Steven Whitehouse <swhiteho at redhat.com>
wrote:

> Hi,
>
> On Mon, 2008-03-31 at 15:16 +0200, Mathieu Avila wrote:
> > Le Mon, 31 Mar 2008 11:54:20 +0100,
> > Steven Whitehouse <swhiteho at redhat.com> a ?crit :
> >
> > > Hi,
> > >
> >
> > Hi,
> >
> > > Both GFS1 and GFS2 are safe from this problem since neither of them
> > > use barriers. Instead we do a flush at the critical points to ensure
> > > that all data is on disk before proceeding with the next stage.
> > >
> >
> > I don't think this solves the problem.
> >
> > Consider a cheap iSCSI disk (no NVRAM, no UPS) accessed by all my GFS
> > nodes; this disk has a write cache enabled, which means it will reply
> > that write requests are performed even if they are not really written
> > on the platters. The disk (like most disks nowadays) has some logic
> > that allows it to optimize writes by re-scheduling them. It is possible
> > that all writes are ACK'd before the power failure, but only a fraction
> > of them were really performed : some are before the flush, some are
> > after the flush.
> > --Not all blocks writes before the flush were performed but other
> > blocks after the flush are written -> the FS is corrupted.--
> > So, after the power failure all data in the disk's write cache are
> > forgotten. If the journal data was in the disk cache, the journal was
> > not written to disk, but other metadata have been written, so there are
> > metadata inconsistencies.
> >
> I don't agree that write caching implies that I/O must be acked before
> it has hit disk. It might well be reordered (which is ok), but if we
> wait for all outstanding I/O completions, then we ought to be able to be
> sure that all I/O is actually on disk, or at the very least that further
> I/O will not be reordered with already ACKed data. If devices are
> sending ACKs in advance of the I/O hitting disk then I think thats
> broken behaviour.


You seem to assume when disk subsystem acks back, the data is surely on
disk. That is not correct . You may consider it a brokoen behavior, mostly
from firmware bugs, but it occurs more often than you would expect. The
problem is extremely difficult to debug from host side. So I think the
proposal here is how the filesystem should protect itself from this
situation (though I'm fuzzy about what the actual proposal is without
looking into other subsystems, particularly volume manager, that are
involved)  You can not say "oh, then I don't have the responsibility. Please
go to talk to disk vendors". Serious implementations have been trying to
find good ways to solve this issue.

-- Wendy

Consider what happens if a device was to send an ACK for a write and
> then it discovers an uncorrectable error during the write - how would it
> then be able to report it since it had already sent an "ok"? So far as I
> can see the only reason for having the drive send an I/O completion back
> is to report the success or otherwise of the operation, and if that
> operation hasn't been completed, then we might just as well not wait for
> ACKs.
>
> > This is the problem that I/O barriers try to solve, by really forcing
> > the block device (and the block layer) to have all blocks issued before
> > the barrier to be written before any other after the barrier starts
> > begin written.
> >
> > The other solution is to completely disable the write cache of the
> > disks, but this leads to dramatically bad performances.
> >
> If its a choice between poor performance thats correct and good
> performance which might lose data, then I know which I would choose :-)
> Not all devices support barriers, so it always has to be an option; ext3
> uses the barrier=1 mount option for this reason, and if it fails (e.g.
> if the underlying device doesn't support barriers) it falls back to the
> same technique which we are using in gfs1/2.
>
> The other thing to bear in mind is that barriers, as currently
> implemented are not really that great either. It would be nice to
> replace them with something that allows better performance with (for
> example) mirrors where the only current method of implementing the
> barrier is to wait for all the I/O completions from all the disks in the
> mirror set (and thus we are back to waiting for outstanding I/O again).
>
> Steve.
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/70dac5ef/attachment.htm>

From cluster at defuturo.co.uk  Wed Apr  2 14:02:58 2008
From: cluster at defuturo.co.uk (Robert Clark)
Date: Wed, 02 Apr 2008 15:02:58 +0100
Subject: [Linux-cluster] clvmd hang
Message-ID: <1207144978.2373.34.camel@rutabaga.defuturo.co.uk>

  I'm having some problems with clvmd hanging on our 8-node cluster.
Once hung, any lvm commands wait indefinitely. This normally happens
when starting up the cluster or if multiple nodes reboot. After some
experimentation I've managed to reproduce it consistently on a smaller
3-node test cluster by stopping clvmd on one node and then running
vgscan on another. The vgscan will hang together with clvmd. Restarting
clvmd on the stopped node doesn't wake it up.

  Once hung, an strace shows 3 clvmd threads, 2 waiting on futexes and
one trying to read from /dev/misc/dlm_clvmd. All 3 threads wait
indefinitely on these system calls. Here's the last part of the strace:

[pid  2951] select(1024, [4 6], NULL, NULL, {90, 0}) = 1 (in [4], left {56, 190000})
[pid  2951] accept(4, {sa_family=AF_FILE, path=@}, [2]) = 5
[pid  2951] ioctl(6, 0x7805, 0)         = 1
[pid  2951] select(1024, [4 5 6], NULL, NULL, {90, 0}) = 1 (in [5], left {90, 0})
[pid  2951] read(5, "3\0\0\0\0\0\0\0\0\0\0\0\v\0\0\0\0\4\4P_global\0\0", 4096) = 29
[pid  2951] futex(0x84d64f4, FUTEX_WAIT, 2, NULL <unfinished ...>

  P_global doesn't show up in /proc/cluster/dlm_locks at this point.
Here's what I can get from dlm_debug:

clvmd rebuilt 5 resources
clvmd purge requests
clvmd purged 0 requests
clvmd mark waiting requests
clvmd marked 0 requests
clvmd purge locks of departed nodes
clvmd purged 0 locks
clvmd update remastered resources
clvmd updated 0 resources
clvmd rebuild locks
clvmd rebuilt 0 locks
clvmd recover event 22 done
clvmd move flags 0,0,1 ids 11,22,22
clvmd process held requests
clvmd processed 0 requests
clvmd resend marked requests
clvmd resent 0 requests
clvmd recover event 22 finished
clvmd move flags 1,0,0 ids 22,22,22
clvmd move flags 0,1,0 ids 22,23,22
clvmd move use event 23
clvmd recover event 23
clvmd add node 1
clvmd total nodes 3
clvmd rebuild resource directory
clvmd rebuilt 5 resources
clvmd purge requests
clvmd purged 0 requests
clvmd mark waiting requests
clvmd marked 0 requests
clvmd recover event 23 done
clvmd move flags 0,0,1 ids 22,23,23
clvmd process held requests
clvmd processed 0 requests
clvmd resend marked requests
clvmd resent 0 requests
clvmd recover event 23 finished

  I'm running 4.6 with kernel-hugemem-2.6.9-67.0.7.EL,
lvm2-cluster-2.02.27-2.el4_6.2 & dlm-kernel-hugemem-2.6.9-52.5. Has
anyone else seen anything like this?

	Thanks,

		Robert



From ccaulfie at redhat.com  Wed Apr  2 14:43:22 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Wed, 02 Apr 2008 15:43:22 +0100
Subject: [Linux-cluster] clvmd hang
In-Reply-To: <1207144978.2373.34.camel@rutabaga.defuturo.co.uk>
References: <1207144978.2373.34.camel@rutabaga.defuturo.co.uk>
Message-ID: <47F39B8A.3040107@redhat.com>

Robert Clark wrote:
>   I'm having some problems with clvmd hanging on our 8-node cluster.
> Once hung, any lvm commands wait indefinitely. This normally happens
> when starting up the cluster or if multiple nodes reboot. After some
> experimentation I've managed to reproduce it consistently on a smaller
> 3-node test cluster by stopping clvmd on one node and then running
> vgscan on another. The vgscan will hang together with clvmd. Restarting
> clvmd on the stopped node doesn't wake it up.
> 
>   Once hung, an strace shows 3 clvmd threads, 2 waiting on futexes and
> one trying to read from /dev/misc/dlm_clvmd. All 3 threads wait
> indefinitely on these system calls. Here's the last part of the strace:
> 
> [pid  2951] select(1024, [4 6], NULL, NULL, {90, 0}) = 1 (in [4], left {56, 190000})
> [pid  2951] accept(4, {sa_family=AF_FILE, path=@}, [2]) = 5
> [pid  2951] ioctl(6, 0x7805, 0)         = 1
> [pid  2951] select(1024, [4 5 6], NULL, NULL, {90, 0}) = 1 (in [5], left {90, 0})
> [pid  2951] read(5, "3\0\0\0\0\0\0\0\0\0\0\0\v\0\0\0\0\4\4P_global\0\0", 4096) = 29
> [pid  2951] futex(0x84d64f4, FUTEX_WAIT, 2, NULL <unfinished ...>
> 
>   P_global doesn't show up in /proc/cluster/dlm_locks at this point.
> Here's what I can get from dlm_debug:
> 
> clvmd rebuilt 5 resources
> clvmd purge requests
> clvmd purged 0 requests
> clvmd mark waiting requests
> clvmd marked 0 requests
> clvmd purge locks of departed nodes
> clvmd purged 0 locks
> clvmd update remastered resources
> clvmd updated 0 resources
> clvmd rebuild locks
> clvmd rebuilt 0 locks
> clvmd recover event 22 done
> clvmd move flags 0,0,1 ids 11,22,22
> clvmd process held requests
> clvmd processed 0 requests
> clvmd resend marked requests
> clvmd resent 0 requests
> clvmd recover event 22 finished
> clvmd move flags 1,0,0 ids 22,22,22
> clvmd move flags 0,1,0 ids 22,23,22
> clvmd move use event 23
> clvmd recover event 23
> clvmd add node 1
> clvmd total nodes 3
> clvmd rebuild resource directory
> clvmd rebuilt 5 resources
> clvmd purge requests
> clvmd purged 0 requests
> clvmd mark waiting requests
> clvmd marked 0 requests
> clvmd recover event 23 done
> clvmd move flags 0,0,1 ids 22,23,23
> clvmd process held requests
> clvmd processed 0 requests
> clvmd resend marked requests
> clvmd resent 0 requests
> clvmd recover event 23 finished
> 
>   I'm running 4.6 with kernel-hugemem-2.6.9-67.0.7.EL,
> lvm2-cluster-2.02.27-2.el4_6.2 & dlm-kernel-hugemem-2.6.9-52.5. Has
> anyone else seen anything like this?
> 

Yes, we seem to have collected quite a few bugzillas on the subject! The
fix is in CVS for LVM2. Packages are on their way I believe.

-- 

Chrissie



From tiagocruz at forumgdh.net  Wed Apr  2 15:08:53 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Wed, 02 Apr 2008 12:08:53 -0300
Subject: [Linux-cluster] Why my cluster stop to work when one node down?
Message-ID: <1207148933.27447.6.camel@tuxkiller.ig.com.br>

Hello guys,

I have one cluster with two machines, running RHEL 5.1 x86_64.
The Storage device has imported using GNDB and formated using GFS, to
mount on both nodes:

[root at teste-spo-la-v1 ~]# gnbd_import -v -l
Device name : cluster
----------------------
    Minor # : 0
 sysfs name : /block/gnbd0
     Server : gnbdserv
       Port : 14567
      State : Open Connected Clear
   Readonly : No
    Sectors : 20971520

# gfs2_mkfs -p lock_dlm -t mycluster:export1 -j 2 /dev/gnbd/cluster
# mount /dev/gnbd/cluster /mnt/

Everything works graceful, until one node get out (shutdown, network
stop, xm destroy...)


teste-spo-la-v1 clurgmgrd[3557]: <emerg> #1: Quorum Dissolved Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering GATHER state from 0. 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Creating commit token because I am the rep. 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Saving state aru 46 high seq received 46 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Storing new sequence id for ring 4c 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering COMMIT state. 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering RECOVERY state. 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] position [0] member 10.25.0.251: 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] previous ring seq 72 rep 10.25.0.251 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] aru 46 high delivered 46 received flag 1 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Did not need to originate any messages in recovery. 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Sending initial ORF token 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] CLM CONFIGURATION CHANGE 
Apr  2 12:00:07 teste-spo-la-v1 kernel: dlm: closing connection to node 3
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] New Configuration: 
Apr  2 12:00:07 teste-spo-la-v1 clurgmgrd[3557]: <emerg> #1: Quorum Dissolved 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.251)  
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Left: 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.252)  
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Joined: 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CMAN ] quorum lost, blocking activity 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] CLM CONFIGURATION CHANGE 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] New Configuration: 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.251)  
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Left: 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Joined: 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [SYNC ] This node is within the primary component and will provide service. 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering OPERATIONAL state. 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] got nodejoin message 10.25.0.251 
Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CPG  ] got joinlist message from node 2 
Apr  2 12:00:12 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection. 
Apr  2 12:00:12 teste-spo-la-v1 ccsd[1539]: Error while processing connect: Connection refused 
Apr  2 12:00:16 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection. 
Apr  2 12:00:17 teste-spo-la-v1 ccsd[1539]: Error while processing connect: Connection refused 
Apr  2 12:00:22 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection. 


So then, my GFS mount point has broken... the terminal freeze when I try
to access the directory "/mnt" and just come back when the second node
has back again to the cluster.


Follow the cluster.conf:

<?xml version="1.0"?>
<cluster name="mycluster" config_version="2">
	
<cman expected_votes="1">
</cman>

<fence_daemon post_join_delay="60">
</fence_daemon>

<clusternodes>
<clusternode name="node1.mycluster.com" nodeid="2">
	<fence>
		<method name="single">
			<device name="gnbd" ipaddr="10.25.0.251"/>
		</method>
	</fence>
</clusternode>
<clusternode name="node2.mycluster.com" nodeid="3">
	<fence>
		<method name="single">
			<device name="gnbd" ipaddr="10.25.0.252"/>
		</method>
	</fence>
</clusternode>
</clusternodes>

<fencedevices>
	<fencedevice name="gnbd" agent="fence_gnbd"/>
</fencedevices>
</cluster>


Thanks!

-- 
Tiago Cruz
http://everlinux.com
Linux User #282636




From gordan at bobich.net  Wed Apr  2 15:16:16 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Wed, 2 Apr 2008 16:16:16 +0100 (BST)
Subject: [Linux-cluster] Why my cluster stop to work when one node
 down?
In-Reply-To: <1207148933.27447.6.camel@tuxkiller.ig.com.br>
References: <1207148933.27447.6.camel@tuxkiller.ig.com.br>
Message-ID: <alpine.LRH.1.10.0804021615080.13468@skynet.shatteredsilicon.net>

Replace:

<cman expected_votes="1">
</cman>

with

<cman two_node="1" expected_votes="1"/>

in cluster.conf.

Gordan

On Wed, 2 Apr 2008, Tiago Cruz wrote:

> Hello guys,
>
> I have one cluster with two machines, running RHEL 5.1 x86_64.
> The Storage device has imported using GNDB and formated using GFS, to
> mount on both nodes:
>
> [root at teste-spo-la-v1 ~]# gnbd_import -v -l
> Device name : cluster
> ----------------------
>    Minor # : 0
> sysfs name : /block/gnbd0
>     Server : gnbdserv
>       Port : 14567
>      State : Open Connected Clear
>   Readonly : No
>    Sectors : 20971520
>
> # gfs2_mkfs -p lock_dlm -t mycluster:export1 -j 2 /dev/gnbd/cluster
> # mount /dev/gnbd/cluster /mnt/
>
> Everything works graceful, until one node get out (shutdown, network
> stop, xm destroy...)
>
>
> teste-spo-la-v1 clurgmgrd[3557]: <emerg> #1: Quorum Dissolved Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering GATHER state from 0.
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Creating commit token because I am the rep.
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Saving state aru 46 high seq received 46
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Storing new sequence id for ring 4c
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering COMMIT state.
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering RECOVERY state.
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] position [0] member 10.25.0.251:
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] previous ring seq 72 rep 10.25.0.251
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] aru 46 high delivered 46 received flag 1
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Did not need to originate any messages in recovery.
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Sending initial ORF token
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] CLM CONFIGURATION CHANGE
> Apr  2 12:00:07 teste-spo-la-v1 kernel: dlm: closing connection to node 3
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] New Configuration:
> Apr  2 12:00:07 teste-spo-la-v1 clurgmgrd[3557]: <emerg> #1: Quorum Dissolved
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.251)
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Left:
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.252)
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Joined:
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CMAN ] quorum lost, blocking activity
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] CLM CONFIGURATION CHANGE
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] New Configuration:
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.251)
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Left:
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Joined:
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [SYNC ] This node is within the primary component and will provide service.
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering OPERATIONAL state.
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] got nodejoin message 10.25.0.251
> Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CPG  ] got joinlist message from node 2
> Apr  2 12:00:12 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection.
> Apr  2 12:00:12 teste-spo-la-v1 ccsd[1539]: Error while processing connect: Connection refused
> Apr  2 12:00:16 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection.
> Apr  2 12:00:17 teste-spo-la-v1 ccsd[1539]: Error while processing connect: Connection refused
> Apr  2 12:00:22 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection.
>
>
> So then, my GFS mount point has broken... the terminal freeze when I try
> to access the directory "/mnt" and just come back when the second node
> has back again to the cluster.
>
>
> Follow the cluster.conf:
>
> <?xml version="1.0"?>
> <cluster name="mycluster" config_version="2">
>
> <cman expected_votes="1">
> </cman>
>
> <fence_daemon post_join_delay="60">
> </fence_daemon>
>
> <clusternodes>
> <clusternode name="node1.mycluster.com" nodeid="2">
> 	<fence>
> 		<method name="single">
> 			<device name="gnbd" ipaddr="10.25.0.251"/>
> 		</method>
> 	</fence>
> </clusternode>
> <clusternode name="node2.mycluster.com" nodeid="3">
> 	<fence>
> 		<method name="single">
> 			<device name="gnbd" ipaddr="10.25.0.252"/>
> 		</method>
> 	</fence>
> </clusternode>
> </clusternodes>
>
> <fencedevices>
> 	<fencedevice name="gnbd" agent="fence_gnbd"/>
> </fencedevices>
> </cluster>
>
>
> Thanks!
>
> -- 
> Tiago Cruz
> http://everlinux.com
> Linux User #282636
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From swhiteho at redhat.com  Wed Apr  2 15:17:08 2008
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 02 Apr 2008 16:17:08 +0100
Subject: [Linux-cluster] About GFS1 and I/O barriers.
In-Reply-To: <1a2a6dd60804020726g20d77419k47298eb000c431ec@mail.gmail.com>
References: <20080328153458.45fc6e13@mathieu.toulouse>
	<20080331124651.3f0d2428@mathieu.toulouse>
	<1206960860.3635.126.camel@quoit>
	<20080331151622.1360a2cb@mathieu.toulouse>
	<1207130014.3310.24.camel@localhost.localdomain>
	<1a2a6dd60804020726g20d77419k47298eb000c431ec@mail.gmail.com>
Message-ID: <1207149428.3635.151.camel@quoit>

Hi,

On Wed, 2008-04-02 at 10:26 -0400, Wendy Cheng wrote:
> 
> 
> On Wed, Apr 2, 2008 at 5:53 AM, Steven Whitehouse
> <swhiteho at redhat.com> wrote:
>         Hi,
>         
>         On Mon, 2008-03-31 at 15:16 +0200, Mathieu Avila wrote:
>         > Le Mon, 31 Mar 2008 11:54:20 +0100,
>         > Steven Whitehouse <swhiteho at redhat.com> a ?crit :
>         >
>         > > Hi,
>         > >
>         >
>         > Hi,
>         >
>         > > Both GFS1 and GFS2 are safe from this problem since
>         neither of them
>         > > use barriers. Instead we do a flush at the critical points
>         to ensure
>         > > that all data is on disk before proceeding with the next
>         stage.
>         > >
>         >
>         > I don't think this solves the problem.
>         >
>         > Consider a cheap iSCSI disk (no NVRAM, no UPS) accessed by
>         all my GFS
>         > nodes; this disk has a write cache enabled, which means it
>         will reply
>         > that write requests are performed even if they are not
>         really written
>         > on the platters. The disk (like most disks nowadays) has
>         some logic
>         > that allows it to optimize writes by re-scheduling them. It
>         is possible
>         > that all writes are ACK'd before the power failure, but only
>         a fraction
>         > of them were really performed : some are before the flush,
>         some are
>         > after the flush.
>         > --Not all blocks writes before the flush were performed but
>         other
>         > blocks after the flush are written -> the FS is corrupted.--
>         > So, after the power failure all data in the disk's write
>         cache are
>         > forgotten. If the journal data was in the disk cache, the
>         journal was
>         > not written to disk, but other metadata have been written,
>         so there are
>         > metadata inconsistencies.
>         >
>         
>         I don't agree that write caching implies that I/O must be
>         acked before
>         it has hit disk. It might well be reordered (which is ok), but
>         if we
>         wait for all outstanding I/O completions, then we ought to be
>         able to be
>         sure that all I/O is actually on disk, or at the very least
>         that further
>         I/O will not be reordered with already ACKed data. If devices
>         are
>         sending ACKs in advance of the I/O hitting disk then I think
>         thats
>         broken behaviour.
> 
> You seem to assume when disk subsystem acks back, the data is surely
> on disk. That is not correct . You may consider it a brokoen behavior,
> mostly from firmware bugs, but it occurs more often than you would
> expect. The problem is extremely difficult to debug from host side. So
> I think the proposal here is how the filesystem should protect itself
> from this situation (though I'm fuzzy about what the actual proposal
> is without looking into other subsystems, particularly volume manager,
> that are involved)  You can not say "oh, then I don't have the
> responsibility. Please go to talk to disk vendors". Serious
> implementations have been trying to find good ways to solve this
> issue.
> 
> -- Wendy
> 
If the data is not physically on disk when the ACK it sent back, then
there is no way for the fs to know whether the data has (at a later
date) not been written due to some error or other. Even ignoring that
for the moment and assuming that such errors never occur, I don't think
its too unreasonable to expect at a minimum that all acknowledged I/O
will never be reordered with unacknowledged I/O. That is all that is
required for correct operation of gfs1/2 provided that no media errors
occur on write.

The message on lkml which Mathieu referred to suggested that there were
three kinds of devices, but it seems to be that type 2 (flushable)
doesn't exist so far as the fs is concerned since blkdev_issue_flush()
just issues a BIO with only a barrier in it. A device driver might
support the barrier request by either waiting for all outstanding I/O
and issuing a flush command (if required) or by passing the barrier down
to the device, assuming that it supports such a thing directly.

Further down the message (the url is http://lkml.org/lkml/2007/5/25/71
btw) there is a list of dm/md implementation status and it seems that
for a good number of the common targets there is little or no support
for barriers anyway at the moment.

Now I agree that it would be nice to support barriers in GFS2, but it
won't solve any problems relating to ordering of I/O unless all of the
underlying device supports them too. See also Alasdair's response to the
thread: http://lkml.org/lkml/2007/5/28/81

So although I'd like to see barrier support in GFS2, it won't solve any
problems for most people and really its a device/block layer issue at
the moment.

Steve.




From siddiqut at gmail.com  Wed Apr  2 15:30:31 2008
From: siddiqut at gmail.com (Tajdar Siddiqui)
Date: Wed, 2 Apr 2008 11:30:31 -0400
Subject: [Linux-cluster] writing to GFS from multiple JVM's concurrently
Message-ID: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>

Hi,

We are evaluating GFS for use as a highly concurrent distributed file
system.

What I have observed:

When 2 JVM's  (multiple Threads per Java Virtual Machine) are writing to the
same directory on GFS, on of the JVM doesn't see the files it writes on the
GFS.
The Writer Threads on JVM think they're done, but the files don't show up on
"ls" etc.
The other JVM works fine.

This problem goes away if the 2 JVM's write to different directories on GFS

OR

Only one JVM is writing at a time.

Any ideas on this.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/91c502e9/attachment.htm>

From gordan at bobich.net  Wed Apr  2 15:38:50 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Wed, 2 Apr 2008 16:38:50 +0100 (BST)
Subject: [Linux-cluster] writing to GFS from multiple JVM's
 concurrently
In-Reply-To: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
References: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
Message-ID: <alpine.LRH.1.10.0804021637010.13468@skynet.shatteredsilicon.net>

On Wed, 2 Apr 2008, Tajdar Siddiqui wrote:

> When 2 JVM's? (multiple Threads per Java Virtual Machine) are writing to the same directory
> on GFS, on of the JVM doesn't see the files it writes on the GFS.
> The Writer Threads on JVM think they're done, but the files don't show up on "ls" etc.
> The other JVM works fine.
> 
> This problem goes away if the 2 JVM's write to different directories on GFS
> 
> OR
> 
> Only one JVM is writing at a time.
> 
> Any ideas on this.

This may sound like a daft question, but did you test it on ext3? Are the 
JVMs on the same node? What locking protocol are you using?

Gordan

From cluster at defuturo.co.uk  Wed Apr  2 15:48:06 2008
From: cluster at defuturo.co.uk (Robert Clark)
Date: Wed, 02 Apr 2008 16:48:06 +0100
Subject: [Linux-cluster] clvmd hang
In-Reply-To: <47F39B8A.3040107@redhat.com>
References: <1207144978.2373.34.camel@rutabaga.defuturo.co.uk>
	<47F39B8A.3040107@redhat.com>
Message-ID: <1207151286.2373.45.camel@rutabaga.defuturo.co.uk>

On Wed, 2008-04-02 at 15:43 +0100, Christine Caulfield wrote:

> > Has anyone else seen anything like this?

> Yes, we seem to have collected quite a few bugzillas on the subject! The
> fix is in CVS for LVM2. Packages are on their way I believe.

  Ah yes. I searched BZ for dlm bugs but forgot to check for
lvm2-cluster ones... I'll test again after bz#435491 is closed.

	Thanks,

		Robert



From s.wendy.cheng at gmail.com  Wed Apr  2 15:57:44 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Wed, 2 Apr 2008 11:57:44 -0400
Subject: [Linux-cluster] About GFS1 and I/O barriers.
In-Reply-To: <1207149428.3635.151.camel@quoit>
References: <20080328153458.45fc6e13@mathieu.toulouse>
	<20080331124651.3f0d2428@mathieu.toulouse>
	<1206960860.3635.126.camel@quoit>
	<20080331151622.1360a2cb@mathieu.toulouse>
	<1207130014.3310.24.camel@localhost.localdomain>
	<1a2a6dd60804020726g20d77419k47298eb000c431ec@mail.gmail.com>
	<1207149428.3635.151.camel@quoit>
Message-ID: <1a2a6dd60804020857w16ddeaebs41f916f4a01792c3@mail.gmail.com>

On Wed, Apr 2, 2008 at 11:17 AM, Steven Whitehouse <swhiteho at redhat.com>
wrote:

>
> Now I agree that it would be nice to support barriers in GFS2, but it
> won't solve any problems relating to ordering of I/O unless all of the
> underlying device supports them too. See also Alasdair's response to the
> thread: http://lkml.org/lkml/2007/5/28/81


I'm not suggesting GFS1/2 should take this patch, considering their current
states. However, you can't give people an impression, as your original reply
implying, that GFS1/2 would not have this problem.

<http://lkml.org/lkml/2007/5/28/81>
>
> So although I'd like to see barrier support in GFS2, it won't solve any
> problems for most people and really its a device/block layer issue at
> the moment.


This part I agree ... better to attack this issue from volume manager than
from filesystem.

-- Wendy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/73b3fb5e/attachment.htm>

From tiagocruz at forumgdh.net  Wed Apr  2 15:59:37 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Wed, 02 Apr 2008 12:59:37 -0300
Subject: [Linux-cluster] Why my cluster stop to work when one node down?
In-Reply-To: <alpine.LRH.1.10.0804021615080.13468@skynet.shatteredsilicon.net>
References: <1207148933.27447.6.camel@tuxkiller.ig.com.br>
	<alpine.LRH.1.10.0804021615080.13468@skynet.shatteredsilicon.net>
Message-ID: <1207151977.27447.10.camel@tuxkiller.ig.com.br>

Nice ?Gordan!!!

It works now!! :-p

"?Quorum" its the number minimum of nodes on the cluster?

[root at teste-spo-la-v1 ~]# cman_tool status
Version: 6.0.1
Config Version: 3
Cluster Name: mycluster
Cluster Id: 56756
Cluster Member: Yes
Cluster Generation: 140
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Quorum: 1  
Active subsystems: 8
Flags: 2node 
Ports Bound: 0 11 177  
Node name: node1.mycluster.com
Node ID: 1
Multicast addresses: 239.192.221.146 
Node addresses: 10.25.0.251 


?Many thanks!!



On Wed, 2008-04-02 at 16:16 +0100, gordan at bobich.net wrote:
> Replace:
> 
> <cman expected_votes="1">
> </cman>
> 
> with
> 
> <cman two_node="1" expected_votes="1"/>
> 
> in cluster.conf.
> 
> Gordan
> 
> On Wed, 2 Apr 2008, Tiago Cruz wrote:
> 
> > Hello guys,
> >
> > I have one cluster with two machines, running RHEL 5.1 x86_64.
> > The Storage device has imported using GNDB and formated using GFS, to
> > mount on both nodes:
> >
> > [root at teste-spo-la-v1 ~]# gnbd_import -v -l
> > Device name : cluster
> > ----------------------
> >    Minor # : 0
> > sysfs name : /block/gnbd0
> >     Server : gnbdserv
> >       Port : 14567
> >      State : Open Connected Clear
> >   Readonly : No
> >    Sectors : 20971520
> >
> > # gfs2_mkfs -p lock_dlm -t mycluster:export1 -j 2 /dev/gnbd/cluster
> > # mount /dev/gnbd/cluster /mnt/
> >
> > Everything works graceful, until one node get out (shutdown, network
> > stop, xm destroy...)
> >
> >
> > teste-spo-la-v1 clurgmgrd[3557]: <emerg> #1: Quorum Dissolved Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering GATHER state from 0.
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Creating commit token because I am the rep.
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Saving state aru 46 high seq received 46
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Storing new sequence id for ring 4c
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering COMMIT state.
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering RECOVERY state.
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] position [0] member 10.25.0.251:
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] previous ring seq 72 rep 10.25.0.251
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] aru 46 high delivered 46 received flag 1
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Did not need to originate any messages in recovery.
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] Sending initial ORF token
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] CLM CONFIGURATION CHANGE
> > Apr  2 12:00:07 teste-spo-la-v1 kernel: dlm: closing connection to node 3
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] New Configuration:
> > Apr  2 12:00:07 teste-spo-la-v1 clurgmgrd[3557]: <emerg> #1: Quorum Dissolved
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.251)
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Left:
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.252)
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Joined:
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CMAN ] quorum lost, blocking activity
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] CLM CONFIGURATION CHANGE
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] New Configuration:
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] 	r(0) ip(10.25.0.251)
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Left:
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] Members Joined:
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [SYNC ] This node is within the primary component and will provide service.
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [TOTEM] entering OPERATIONAL state.
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CLM  ] got nodejoin message 10.25.0.251
> > Apr  2 12:00:07 teste-spo-la-v1 openais[1545]: [CPG  ] got joinlist message from node 2
> > Apr  2 12:00:12 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection.
> > Apr  2 12:00:12 teste-spo-la-v1 ccsd[1539]: Error while processing connect: Connection refused
> > Apr  2 12:00:16 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection.
> > Apr  2 12:00:17 teste-spo-la-v1 ccsd[1539]: Error while processing connect: Connection refused
> > Apr  2 12:00:22 teste-spo-la-v1 ccsd[1539]: Cluster is not quorate.  Refusing connection.
> >
> >
> > So then, my GFS mount point has broken... the terminal freeze when I try
> > to access the directory "/mnt" and just come back when the second node
> > has back again to the cluster.
> >
> >
> > Follow the cluster.conf:
> >
> > <?xml version="1.0"?>
> > <cluster name="mycluster" config_version="2">
> >
> > <cman expected_votes="1">
> > </cman>
> >
> > <fence_daemon post_join_delay="60">
> > </fence_daemon>
> >
> > <clusternodes>
> > <clusternode name="node1.mycluster.com" nodeid="2">
> > 	<fence>
> > 		<method name="single">
> > 			<device name="gnbd" ipaddr="10.25.0.251"/>
> > 		</method>
> > 	</fence>
> > </clusternode>
> > <clusternode name="node2.mycluster.com" nodeid="3">
> > 	<fence>
> > 		<method name="single">
> > 			<device name="gnbd" ipaddr="10.25.0.252"/>
> > 		</method>
> > 	</fence>
> > </clusternode>
> > </clusternodes>
> >
> > <fencedevices>
> > 	<fencedevice name="gnbd" agent="fence_gnbd"/>
> > </fencedevices>
> > </cluster>
> >
> >
> > Thanks!
> >
> > -- 
> > Tiago Cruz
> > http://everlinux.com
> > Linux User #282636
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
-- 
Tiago Cruz
http://everlinux.com
Linux User #282636




From gordan at bobich.net  Wed Apr  2 16:13:16 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Wed, 2 Apr 2008 17:13:16 +0100 (BST)
Subject: [Linux-cluster] Why my cluster stop to work when one node
 down?
In-Reply-To: <1207151977.27447.10.camel@tuxkiller.ig.com.br>
References: <1207148933.27447.6.camel@tuxkiller.ig.com.br>
	<alpine.LRH.1.10.0804021615080.13468@skynet.shatteredsilicon.net>
	<1207151977.27447.10.camel@tuxkiller.ig.com.br>
Message-ID: <alpine.LRH.1.10.0804021710220.13892@skynet.shatteredsilicon.net>


> Nice ?Gordan!!!
>
> It works now!! :-p

You're welcome. :)

> "?Quorum" its the number minimum of nodes on the cluster?

Yes, it's the minimum number of nodes required for the cluster to start. 
This is (n+1)/2, round up number of nodes defined in cluster.conf. This 
ensures that the cluster can't split-brain. In the 2-node case this needs 
to be adjusted which is what the two_node parameter does. There's higher 
risk of splitbrain, though, but you can use tie-breakers of some sort.

Gordan

From rohara at redhat.com  Wed Apr  2 16:23:43 2008
From: rohara at redhat.com (Ryan O'Hara)
Date: Wed, 02 Apr 2008 11:23:43 -0500
Subject: [Linux-cluster] SCSI reservation conflicts after update
In-Reply-To: <47ED9556.9050000@amnh.org>
References: <47ED9556.9050000@amnh.org>
Message-ID: <47F3B30F.2050502@redhat.com>


I went back and investigated why this might happen. Seems that I had 
seen it before but could not recall how this sort of thing happens.

For 4.6, the scsi_reserve script should only be run if you intend to use 
SCSI reservations as a fence mechanism, as you correctly pointed out at 
the end of your message. I believe in 4.6 scsi_reserve was incorrectly 
enabled by default.

The real problem is that the keys used for scsi reservations are based 
on node ID. For this reason, it is required that nodeid be defined in 
the cluster.conf file for all nodes. Without this, the nodeid can change 
from node to node between cluster restarts, etc. The scsi_reserve and 
fence_scsi scripts require consistent nodeid (ie. they do not change).

So I think the problem we are seeing is that running 'scsi_reserve stop' 
cannot work since that will attempt to remove that node's key from the 
devices. If that key has changed (the node ID changed), it will not find 
a matching registration key on the device and thus fail.

The best bet is to disable scsi_reserve and to clear all scsi 
reservations. As you mentioned, the sg_persist command with the -C 
option should do the trick. I am guessing that the reason that failed 
for you is that you must supply the device name AND the key being used 
for that I_T nexus. You can use sg_persist to list the keys registered 
with a particular device, but since nodeid's may have changed you might 
have to guess the key for a particular node (ie. the node you run the 
sg_persist -C command on). The good news is that when you identify the 
correct key it will clear all the keys.

Ryan

Sajesh Singh wrote:
> After updating my GFS cluster to the latest packages (as of 3/28/08) on 
> an Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp)  I 
> am receiving scsi reservation errors whenever the nodes are rebooted. 
> The node is then subsequently rebooted at varying intervals without any 
> intervention. I have tried to disable the scsi_reserve script from 
> startup, but it does not seem to have any effect. I have also tried to 
> use the sg_persist command to clear all reservations with the -C option 
> to no avail. I first noticed something was wrong when the 2nd node of 
> the 2 node cluster was being updated. That was the first sign of the 
> scsi reservation errors on the console.
> 
>  From my understanding persistent SCSI reservations are only needed if I 
> am using the fence_scsi module.
> 
> I would appreciate any guidance.
> 
> Regards,
> 
> Sajesh Singh
> 
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From ssingh at amnh.org  Wed Apr  2 16:36:11 2008
From: ssingh at amnh.org (Sajesh Singh)
Date: Wed, 02 Apr 2008 12:36:11 -0400
Subject: [Linux-cluster] SCSI reservation conflicts after update
In-Reply-To: <47F3B30F.2050502@redhat.com>
References: <47ED9556.9050000@amnh.org> <47F3B30F.2050502@redhat.com>
Message-ID: <47F3B5FB.3090107@amnh.org>

Ryan and all else that have answered,
       Thank you for the info on scsi_reserve. I have disabled the 
script and all seems okay. What is a little confusing is that the 
script/service was enabled before the upgrade, but did not cause any 
scsi reservation conflicts.

-Sajesh-


Ryan O'Hara wrote:
>
> I went back and investigated why this might happen. Seems that I had 
> seen it before but could not recall how this sort of thing happens.
>
> For 4.6, the scsi_reserve script should only be run if you intend to 
> use SCSI reservations as a fence mechanism, as you correctly pointed 
> out at the end of your message. I believe in 4.6 scsi_reserve was 
> incorrectly enabled by default.
>
> The real problem is that the keys used for scsi reservations are based 
> on node ID. For this reason, it is required that nodeid be defined in 
> the cluster.conf file for all nodes. Without this, the nodeid can 
> change from node to node between cluster restarts, etc. The 
> scsi_reserve and fence_scsi scripts require consistent nodeid (ie. 
> they do not change).
>
> So I think the problem we are seeing is that running 'scsi_reserve 
> stop' cannot work since that will attempt to remove that node's key 
> from the devices. If that key has changed (the node ID changed), it 
> will not find a matching registration key on the device and thus fail.
>
> The best bet is to disable scsi_reserve and to clear all scsi 
> reservations. As you mentioned, the sg_persist command with the -C 
> option should do the trick. I am guessing that the reason that failed 
> for you is that you must supply the device name AND the key being used 
> for that I_T nexus. You can use sg_persist to list the keys registered 
> with a particular device, but since nodeid's may have changed you 
> might have to guess the key for a particular node (ie. the node you 
> run the sg_persist -C command on). The good news is that when you 
> identify the correct key it will clear all the keys.
>
> Ryan
>
> Sajesh Singh wrote:
>> After updating my GFS cluster to the latest packages (as of 3/28/08) 
>> on an Enterprise Linux 4.6 cluster (kernel version 
>> 2.6.9-67.0.7.ELsmp)  I am receiving scsi reservation errors whenever 
>> the nodes are rebooted. The node is then subsequently rebooted at 
>> varying intervals without any intervention. I have tried to disable 
>> the scsi_reserve script from startup, but it does not seem to have 
>> any effect. I have also tried to use the sg_persist command to clear 
>> all reservations with the -C option to no avail. I first noticed 
>> something was wrong when the 2nd node of the 2 node cluster was being 
>> updated. That was the first sign of the scsi reservation errors on 
>> the console.
>>
>>  From my understanding persistent SCSI reservations are only needed 
>> if I am using the fence_scsi module.
>>
>> I would appreciate any guidance.
>>
>> Regards,
>>
>> Sajesh Singh
>>
>> -- 
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From RJM002 at shsu.edu  Wed Apr  2 16:37:04 2008
From: RJM002 at shsu.edu (MARTI, ROBERT JESSE)
Date: Wed, 2 Apr 2008 11:37:04 -0500
Subject: [Linux-cluster] Why my cluster stop to work when one node down?
In-Reply-To: <alpine.LRH.1.10.0804021710220.13892@skynet.shatteredsilicon.net>
References: <1207148933.27447.6.camel@tuxkiller.ig.com.br><alpine.LRH.1.10.0804021615080.13468@skynet.shatteredsilicon.net><1207151977.27447.10.camel@tuxkiller.ig.com.br>
	<alpine.LRH.1.10.0804021710220.13892@skynet.shatteredsilicon.net>
Message-ID: <9F633DE6C0E04F4691DCB713AC44C94B066E4C68@EXCHANGE.SHSU.EDU>

Speaking of...

If I already have a cluster set up that split brained itself (but the services are still running on one, and it wont un-split brain with the other box up...) how hard would it be to add a quorum disk?

I guess I could post my whole problem and let smarter people figure out what I broke.

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of gordan at bobich.net
Sent: Wednesday, April 02, 2008 11:13 AM
To: linux clustering
Subject: Re: [Linux-cluster] Why my cluster stop to work when one node down?


> Nice ?Gordan!!!
>
> It works now!! :-p

You're welcome. :)

> "?Quorum" its the number minimum of nodes on the cluster?

Yes, it's the minimum number of nodes required for the cluster to start. 
This is (n+1)/2, round up number of nodes defined in cluster.conf. This ensures that the cluster can't split-brain. In the 2-node case this needs to be adjusted which is what the two_node parameter does. There's higher risk of splitbrain, though, but you can use tie-breakers of some sort.

Gordan



From paolom at prisma-eng.it  Wed Apr  2 17:20:58 2008
From: paolom at prisma-eng.it (Paolo Marini)
Date: Wed, 02 Apr 2008 19:20:58 +0200
Subject: [Linux-cluster] Problems with SAMBA server on Centos 51 virtual xen
 guest with iSCSI SAN
Message-ID: <47F3C07A.8090709@prisma-eng.it>

I have implemented a cluster of a few xen guest with a shared GFS 
filesystem residing on a SAN build with openfiler to support iSCSI storage.

Physical servers are 3 machines implementing a physical cluster, each 
one equipped with quad xeon and 4 G RAM. The network interface is based 
on channel bonding with LACP (on the physical hosts) having an aggregate 
of 2 gigabits ethernet per physical host, the switch supports LACP and 
has been configured accordingly.

Virtual servers are based on xen nodes on top of the physical server 
with shared storage on iSCSI and GFS.

The networking is based on a cluster private network (for cluster 
heartbeat and cluster communication + iSCSI) and an ethernet alias for 
the LAN to which the users are connected.

One of the cluster xen nodes is used for implementing a samba PDC (no 
failover of the service, plain samba, single samba server on the LAN) 
plus ldap server; samba works with ldap for users authentication. 
Storage for the samba server is on the SAN.

I continue to receive complaints from my users due to the fact that 
sometimes copying file generates errors, plus problems related to office 
usage (we still use the old Office 97 on some machines). The samba 
configuration is more or less the same as that correctly working on the 
previous physical machine, on which those problems were not present.

The problems generate these log entries on /var/log/samba/smbd:

[2008/04/02 19:00:50, 0] lib/util_sock.c:get_peer_addr(1232)
  getpeername failed. Error was Transport endpoint is not connected
[2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
  getpeername failed. Error was Transport endpoint is not connected
[2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
  getpeername failed. Error was Transport endpoint is not connected

And on the client machine log also on /var/log/samba

[2008/04/02 19:04:34, 0] lib/util_sock.c:read_data(534)
  read_data: read failure for 4 bytes to client 192.168.13.240. Error = 
Connection reset by peer
[2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
  amhwq53p (192.168.13.240) closed connection to service tmp
[2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
  amhwq53p (192.168.13.240) closed connection to service stock
[2008/04/02 19:04:34, 0] lib/util_sock.c:write_data(562)
  write_data: write failure in writing to client 192.168.13.240. Error 
Broken pipe
[2008/04/02 19:04:34, 0] lib/util_sock.c:send_smb(769)
  Error writing 75 bytes to client. -1. (Broken pipe)
[2008/04/02 19:04:34, 1] smbd/service.c:make_connection_snum(1033)

They seem similar to problems related to poor connectivity or problem in 
the network; however, these problems are new and were never found before 
switching to the clustered architecture. Also no problem have been found 
so far on the other xen nodes serving the same GFS filesystem (different 
dirs !) for NFS or other services.

Also putting the option

posix locking = no

on the smb.conf file did not help.

Any idea from someone else facing the same problems ?

thanks, Paolo



From jamesc at exa.com  Wed Apr  2 18:50:13 2008
From: jamesc at exa.com (James Chamberlain)
Date: Wed, 2 Apr 2008 14:50:13 -0400
Subject: [Linux-cluster] Unformatting a GFS cluster disk
In-Reply-To: <1206482065.2741.83.camel@technetium.msp.redhat.com>
References: <OFB60FBDB0.E8EF177D-ON80257417.0066F992-80257417.00673763@amnesty.org>
	<1206482065.2741.83.camel@technetium.msp.redhat.com>
Message-ID: <0018D44C-35B7-4496-88F6-C8AB8F12FA84@exa.com>


On Mar 25, 2008, at 5:54 PM, Bob Peterson wrote:

> If it were my file system, and I didn't have a backup, and I had
> data on it that I absolutely needed to get back, I personally would
> use the gfs2_edit tool (assuming RHEL5, Centos5 or similar) which can
> mostly operate on gfs1 file systems.  The "gfs_edit" tool will also
> work, but it is much more primitive than gfs2_edit (but at least it
> exists on RHEL4, Centos4 and similar).

Any idea what RPM gfs_edit would be in for RHEL4/CentOS 4?  I've got  
CentOS 4.6, and I'm not finding it anywhere.

Thanks,

James



From siddiqut at gmail.com  Wed Apr  2 19:34:15 2008
From: siddiqut at gmail.com (Tajdar Siddiqui)
Date: Wed, 2 Apr 2008 15:34:15 -0400
Subject: [Linux-cluster] Re: writing to GFS from multiple JVM's concurrently
In-Reply-To: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
References: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
Message-ID: <3abaa1ce0804021234v6eb5b425hd591f6cf2c5f3caa@mail.gmail.com>

Hi Gordon,

Thanx for your reply.

Yes, this test works fine on an ext3 filesystem.

The JVM's are on different nodes.

The files being written/read on the 2 JVM's are different (file-names).
Where does locking come into play here ?

A JVM is only reading the files it creates, so there is no cross.

-Tajdar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/a92bcd9b/attachment.htm>

From jruemker at redhat.com  Wed Apr  2 20:10:23 2008
From: jruemker at redhat.com (John Ruemker)
Date: Wed, 02 Apr 2008 16:10:23 -0400
Subject: [Linux-cluster] Problems with SAMBA server on Centos 51 virtual
	xen guest with iSCSI SAN
In-Reply-To: <47F3C07A.8090709@prisma-eng.it>
References: <47F3C07A.8090709@prisma-eng.it>
Message-ID: <47F3E82F.6030600@redhat.com>

Paolo Marini wrote:
> I have implemented a cluster of a few xen guest with a shared GFS 
> filesystem residing on a SAN build with openfiler to support iSCSI 
> storage.
>
> Physical servers are 3 machines implementing a physical cluster, each 
> one equipped with quad xeon and 4 G RAM. The network interface is 
> based on channel bonding with LACP (on the physical hosts) having an 
> aggregate of 2 gigabits ethernet per physical host, the switch 
> supports LACP and has been configured accordingly.
>
> Virtual servers are based on xen nodes on top of the physical server 
> with shared storage on iSCSI and GFS.
>
> The networking is based on a cluster private network (for cluster 
> heartbeat and cluster communication + iSCSI) and an ethernet alias for 
> the LAN to which the users are connected.
>
> One of the cluster xen nodes is used for implementing a samba PDC (no 
> failover of the service, plain samba, single samba server on the LAN) 
> plus ldap server; samba works with ldap for users authentication. 
> Storage for the samba server is on the SAN.
>
> I continue to receive complaints from my users due to the fact that 
> sometimes copying file generates errors, plus problems related to 
> office usage (we still use the old Office 97 on some machines). The 
> samba configuration is more or less the same as that correctly working 
> on the previous physical machine, on which those problems were not 
> present.
>
> The problems generate these log entries on /var/log/samba/smbd:
>
> [2008/04/02 19:00:50, 0] lib/util_sock.c:get_peer_addr(1232)
>  getpeername failed. Error was Transport endpoint is not connected
> [2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
>  getpeername failed. Error was Transport endpoint is not connected
> [2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
>  getpeername failed. Error was Transport endpoint is not connected
>
> And on the client machine log also on /var/log/samba
>
> [2008/04/02 19:04:34, 0] lib/util_sock.c:read_data(534)
>  read_data: read failure for 4 bytes to client 192.168.13.240. Error = 
> Connection reset by peer
> [2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
>  amhwq53p (192.168.13.240) closed connection to service tmp
> [2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
>  amhwq53p (192.168.13.240) closed connection to service stock
> [2008/04/02 19:04:34, 0] lib/util_sock.c:write_data(562)
>  write_data: write failure in writing to client 192.168.13.240. Error 
> Broken pipe
> [2008/04/02 19:04:34, 0] lib/util_sock.c:send_smb(769)
>  Error writing 75 bytes to client. -1. (Broken pipe)
> [2008/04/02 19:04:34, 1] smbd/service.c:make_connection_snum(1033)
>
> They seem similar to problems related to poor connectivity or problem 
> in the network; however, these problems are new and were never found 
> before switching to the clustered architecture. Also no problem have 
> been found so far on the other xen nodes serving the same GFS 
> filesystem (different dirs !) for NFS or other services.
>
> Also putting the option
>
> posix locking = no
>
> on the smb.conf file did not help.
>
> Any idea from someone else facing the same problems ?
>
> thanks, Paolo
>
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster 
Those errors are explained in

     http://kbase.redhat.com/faq/FAQ_45_5274.shtm

John



From gordan at bobich.net  Wed Apr  2 21:19:11 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Wed, 02 Apr 2008 22:19:11 +0100
Subject: [Linux-cluster] Re: writing to GFS from multiple JVM's
	concurrently
In-Reply-To: <3abaa1ce0804021234v6eb5b425hd591f6cf2c5f3caa@mail.gmail.com>
References: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
	<3abaa1ce0804021234v6eb5b425hd591f6cf2c5f3caa@mail.gmail.com>
Message-ID: <47F3F84F.5020309@bobich.net>

Tajdar Siddiqui wrote:

> Yes, this test works fine on an ext3 filesystem.
> 
> The JVM's are on different nodes.
> 
> The files being written/read on the 2 JVM's are different (file-names). 
> Where does locking come into play here ?
>
 > A JVM is only reading the files it creates, so there is no cross.

Writing files to a directory requires a directory lock. This lock needs 
to be bounced back between the nodes. This will slow things down at the 
very least. If you can arrange your files/application so that the nodes 
are writing to separate directory trees, then that will undoubtedly give 
you better performance.

What version of GFS are you using? If GFS2, try GFS1. GFS2 isn't 
entirely stable yet.

Gordan



From jruemker at redhat.com  Wed Apr  2 20:38:30 2008
From: jruemker at redhat.com (John Ruemker)
Date: Wed, 02 Apr 2008 16:38:30 -0400
Subject: [Linux-cluster] Unformatting a GFS cluster disk
In-Reply-To: <0018D44C-35B7-4496-88F6-C8AB8F12FA84@exa.com>
References: <OFB60FBDB0.E8EF177D-ON80257417.0066F992-80257417.00673763@amnesty.org>	<1206482065.2741.83.camel@technetium.msp.redhat.com>
	<0018D44C-35B7-4496-88F6-C8AB8F12FA84@exa.com>
Message-ID: <47F3EEC6.4010105@redhat.com>

James Chamberlain wrote:
>
> On Mar 25, 2008, at 5:54 PM, Bob Peterson wrote:
>
>> If it were my file system, and I didn't have a backup, and I had
>> data on it that I absolutely needed to get back, I personally would
>> use the gfs2_edit tool (assuming RHEL5, Centos5 or similar) which can
>> mostly operate on gfs1 file systems.  The "gfs_edit" tool will also
>> work, but it is much more primitive than gfs2_edit (but at least it
>> exists on RHEL4, Centos4 and similar).
>
> Any idea what RPM gfs_edit would be in for RHEL4/CentOS 4?  I've got 
> CentOS 4.6, and I'm not finding it anywhere.
>
AFAIK its not provided in RHEL4.  In RHEL5 it would be in gfs-utils 
package. 

John

> Thanks,
>
> James
>
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From siddiqut at gmail.com  Wed Apr  2 20:58:45 2008
From: siddiqut at gmail.com (Tajdar Siddiqui)
Date: Wed, 2 Apr 2008 16:58:45 -0400
Subject: [Linux-cluster] Re: writing to GFS from multiple JVM's concurrently
In-Reply-To: <3abaa1ce0804021234v6eb5b425hd591f6cf2c5f3caa@mail.gmail.com>
References: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
	<3abaa1ce0804021234v6eb5b425hd591f6cf2c5f3caa@mail.gmail.com>
Message-ID: <3abaa1ce0804021358m56553dbfs9dc309d4c8bb32bd@mail.gmail.com>

Hi Gordan (apologize i misspelled your name last time),

Thanx for your help so far. A lame question probably: How do i figure out
the gfs version:

$ rpm -qa | grep GFS
--returns nothing


$ rpm -qa | grep gfs
gfs2-utils-0.1.38-1.el5
kmod-gfs-0.1.19-7.el5_1.1
kmod-gfs-0.1.16-5.2.6.18_8.el5
gfs-utils-0.1.12-1.el5

Not sure how to figure it out.

-Tajdar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/142605d2/attachment.htm>

From garromo at us.ibm.com  Wed Apr  2 21:17:55 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Wed, 2 Apr 2008 15:17:55 -0600
Subject: [Linux-cluster] SCSI reservation conflicts after update
In-Reply-To: <47F3B30F.2050502@redhat.com>
Message-ID: <OFC16731B0.3775690D-ON8725741F.0074F188-8725741F.0074DB45@us.ibm.com>

We had a similar issue and we just removed sg3utils (orsomething like 
that), if your not going to use it.

Gary Romo
IBM Global Technology Services
303.458.4415
Email: garromo at us.ibm.com
Pager:1.877.552.9264
Text message: gromo at skytel.com



"Ryan O'Hara" <rohara at redhat.com> 
Sent by: linux-cluster-bounces at redhat.com
04/02/2008 10:23 AM
Please respond to
linux clustering <linux-cluster at redhat.com>


To
ssingh at amnh.org, linux clustering <linux-cluster at redhat.com>
cc

Subject
Re: [Linux-cluster] SCSI reservation conflicts after update







I went back and investigated why this might happen. Seems that I had 
seen it before but could not recall how this sort of thing happens.

For 4.6, the scsi_reserve script should only be run if you intend to use 
SCSI reservations as a fence mechanism, as you correctly pointed out at 
the end of your message. I believe in 4.6 scsi_reserve was incorrectly 
enabled by default.

The real problem is that the keys used for scsi reservations are based 
on node ID. For this reason, it is required that nodeid be defined in 
the cluster.conf file for all nodes. Without this, the nodeid can change 
from node to node between cluster restarts, etc. The scsi_reserve and 
fence_scsi scripts require consistent nodeid (ie. they do not change).

So I think the problem we are seeing is that running 'scsi_reserve stop' 
cannot work since that will attempt to remove that node's key from the 
devices. If that key has changed (the node ID changed), it will not find 
a matching registration key on the device and thus fail.

The best bet is to disable scsi_reserve and to clear all scsi 
reservations. As you mentioned, the sg_persist command with the -C 
option should do the trick. I am guessing that the reason that failed 
for you is that you must supply the device name AND the key being used 
for that I_T nexus. You can use sg_persist to list the keys registered 
with a particular device, but since nodeid's may have changed you might 
have to guess the key for a particular node (ie. the node you run the 
sg_persist -C command on). The good news is that when you identify the 
correct key it will clear all the keys.

Ryan

Sajesh Singh wrote:
> After updating my GFS cluster to the latest packages (as of 3/28/08) on 
> an Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp)  I 
> am receiving scsi reservation errors whenever the nodes are rebooted. 
> The node is then subsequently rebooted at varying intervals without any 
> intervention. I have tried to disable the scsi_reserve script from 
> startup, but it does not seem to have any effect. I have also tried to 
> use the sg_persist command to clear all reservations with the -C option 
> to no avail. I first noticed something was wrong when the 2nd node of 
> the 2 node cluster was being updated. That was the first sign of the 
> scsi reservation errors on the console.
> 
>  From my understanding persistent SCSI reservations are only needed if I 

> am using the fence_scsi module.
> 
> I would appreciate any guidance.
> 
> Regards,
> 
> Sajesh Singh
> 
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/34bbe732/attachment.htm>

From rpeterso at redhat.com  Wed Apr  2 21:40:19 2008
From: rpeterso at redhat.com (Bob Peterson)
Date: Wed, 02 Apr 2008 16:40:19 -0500
Subject: [Linux-cluster] Unformatting a GFS cluster disk
In-Reply-To: <47F3EEC6.4010105@redhat.com>
References: <OFB60FBDB0.E8EF177D-ON80257417.0066F992-80257417.00673763@amnesty.org>
	<1206482065.2741.83.camel@technetium.msp.redhat.com>
	<0018D44C-35B7-4496-88F6-C8AB8F12FA84@exa.com>
	<47F3EEC6.4010105@redhat.com>
Message-ID: <1207172419.24927.37.camel@technetium.msp.redhat.com>

On Wed, 2008-04-02 at 16:38 -0400, John Ruemker wrote:
> James Chamberlain wrote:
> >
> > On Mar 25, 2008, at 5:54 PM, Bob Peterson wrote:
> >
> >> If it were my file system, and I didn't have a backup, and I had
> >> data on it that I absolutely needed to get back, I personally would
> >> use the gfs2_edit tool (assuming RHEL5, Centos5 or similar) which can
> >> mostly operate on gfs1 file systems.  The "gfs_edit" tool will also
> >> work, but it is much more primitive than gfs2_edit (but at least it
> >> exists on RHEL4, Centos4 and similar).
> >
> > Any idea what RPM gfs_edit would be in for RHEL4/CentOS 4?  I've got 
> > CentOS 4.6, and I'm not finding it anywhere.
> >
> AFAIK its not provided in RHEL4.  In RHEL5 it would be in gfs-utils 
> package. 
> 
> John
> 
> > Thanks,
> >
> > James

That's true, but I very recently (last week) built a gfs2_edit
program that runs on RHEL4.6.  (Yes, I know gfs2 won't run on
4.X but the gfs2_edit tool can still be used to work on gfs1
file systems.)

I'm trying to get a people page worked out, and if that
goes through, I'll post it there.  I can send a tar/zip version of
the source tree for this if anyone wants it.  It's basically the
same cluster source tree as 5.2, but I've modified it slightly so
that it will compile on 4.6 without too much hassle.
I can also easily send a x86_64 binary for RHEL4.6 gfs2_edit if that
helps.  A 32-bit version would be hard for me to build at the moment.

The original gfs_edit is pretty primitive compared to gfs2_edit.

Regards,

Bob Peterson
Red Hat Clustering & GFS




From gordan at bobich.net  Thu Apr  3 00:42:10 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 03 Apr 2008 01:42:10 +0100
Subject: [Linux-cluster] Re: writing to GFS from multiple JVM's
	concurrently
In-Reply-To: <3abaa1ce0804021358m56553dbfs9dc309d4c8bb32bd@mail.gmail.com>
References: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>	<3abaa1ce0804021234v6eb5b425hd591f6cf2c5f3caa@mail.gmail.com>
	<3abaa1ce0804021358m56553dbfs9dc309d4c8bb32bd@mail.gmail.com>
Message-ID: <47F427E2.50706@bobich.net>

Tajdar Siddiqui wrote:
> Thanx for your help so far. A lame question probably: How do i figure 
> out the gfs version:
> 
> $ rpm -qa | grep gfs
> gfs2-utils-0.1.38-1.el5
> kmod-gfs-0.1.19-7.el5_1.1
> kmod-gfs-0.1.16-5.2.6.18_8.el5
> gfs-utils-0.1.12-1.el5
> 
> Not sure how to figure it out.

Did you make the FS with mkfs.gfs or mkfs.gfs2? What does mount say for 
the FS type?

Gordan



From siddiqut at gmail.com  Thu Apr  3 01:10:20 2008
From: siddiqut at gmail.com (Tajdar Siddiqui)
Date: Wed, 2 Apr 2008 21:10:20 -0400
Subject: [Linux-cluster] Re: writing to GFS from multiple JVM's concurrently
In-Reply-To: <3abaa1ce0804021358m56553dbfs9dc309d4c8bb32bd@mail.gmail.com>
References: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
	<3abaa1ce0804021234v6eb5b425hd591f6cf2c5f3caa@mail.gmail.com>
	<3abaa1ce0804021358m56553dbfs9dc309d4c8bb32bd@mail.gmail.com>
Message-ID: <3abaa1ce0804021810q64f6d0afg1126a9b981876bc1@mail.gmail.com>

Unfortunately, I did not create this FS so not sure what command params were
used.

Output of df -T :

$ df -T /gfs
Filesystem    Type   1K-blocks      Used Available Use% Mounted on
/dev/mapper/vggfs01-lvol00
               gfs   104551424   8120236  96431188   8% /gfs

Output of mount:

$ mount
/dev/mapper/vggfs01-lvol00 on /gfs type gfs
(rw,hostdata=jid=1:id=196610:first=0)


My guess this is GFS1?

Thanx,
Tajdar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080402/d0908daf/attachment.htm>

From pmshehzad at yahoo.com  Thu Apr  3 08:05:40 2008
From: pmshehzad at yahoo.com (Mshehzad Pankhawala)
Date: Thu, 3 Apr 2008 01:05:40 -0700 (PDT)
Subject: [Linux-cluster] How to use manual fencing in Redhat Cluster Suit
Message-ID: <468783.25520.qm@web45809.mail.sp1.yahoo.com>

We are using desktop PCs at my institute. 
We have tested  DRBD and Heartbeat,
Now we want to use RHCS. So how do we configure manual fencing.

Suggestions are welcome,
Thank you..

       
---------------------------------
You rock. That's why Blockbuster's offering you one month of Blockbuster Total Access, No Cost.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080403/13dfa36b/attachment.htm>

From maciej.bogucki at artegence.com  Thu Apr  3 08:01:51 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Thu, 03 Apr 2008 10:01:51 +0200
Subject: [Linux-cluster] writing to GFS from multiple JVM's concurrently
In-Reply-To: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
References: <3abaa1ce0804020830k30bce4eey2127b8687f14d912@mail.gmail.com>
Message-ID: <47F48EEF.4070609@artegence.com>

Tajdar Siddiqui napisa?(a):
> Hi,
> 
> We are evaluating GFS for use as a highly concurrent distributed file
> system.
> 
> What I have observed:
> 
> When 2 JVM's  (multiple Threads per Java Virtual Machine) are writing to
> the same directory on GFS, on of the JVM doesn't see the files it writes
> on the GFS.
> The Writer Threads on JVM think they're done, but the files don't show
> up on "ls" etc.
> The other JVM works fine.
> 
> This problem goes away if the 2 JVM's write to different directories on GFS
> 
> OR
> 
> Only one JVM is writing at a time.
> 
> Any ideas on this.
> 
> 

http://sourceware.org/cluster/faq.html#gfs_samefile

Best Regards
Maciej Bogucki



From maciej.bogucki at artegence.com  Thu Apr  3 08:30:28 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Thu, 03 Apr 2008 10:30:28 +0200
Subject: [Linux-cluster] How to use manual fencing in Redhat Cluster Suit
In-Reply-To: <468783.25520.qm@web45809.mail.sp1.yahoo.com>
References: <468783.25520.qm@web45809.mail.sp1.yahoo.com>
Message-ID: <47F495A4.6030400@artegence.com>

Mshehzad Pankhawala napisa?(a):
> We are using desktop PCs at my institute.
> We have tested  DRBD and Heartbeat,
> Now we want to use RHCS. So how do we configure manual fencing.
> 
> Suggestions are welcome,

<clusternode name="node1" votes="1">
  <fence>
    <method name="1">
      <device name="manual" nodename="node1"/>
    </method>
  </fence>
</clusternode>
<clusternode name="node2" votes="1">
  <fence>
    <method name="1">
      <device name="manual" nodename="node2"/>
    </method>
  </fence>
</clusternode>


<fencedevices>
  <fencedevice agent="fence_manual" name="manual"/>
</fencedevices>

Best Regards
Maciej Bogucki



From nkhare.lists at gmail.com  Thu Apr  3 08:52:37 2008
From: nkhare.lists at gmail.com (Neependra Khare)
Date: Thu, 03 Apr 2008 14:22:37 +0530
Subject: [Linux-cluster] How to use manual fencing in Redhat Cluster Suit
In-Reply-To: <47F495A4.6030400@artegence.com>
References: <468783.25520.qm@web45809.mail.sp1.yahoo.com>
	<47F495A4.6030400@artegence.com>
Message-ID: <47F49AD5.9080702@gmail.com>

Maciej Bogucki wrote:
> Mshehzad Pankhawala napisa?(a):
>   
>> We are using desktop PCs at my institute.
>> We have tested  DRBD and Heartbeat,
>> Now we want to use RHCS. So how do we configure manual fencing.
>>
>> Suggestions are welcome,
>>     
>
> <clusternode name="node1" votes="1">
>   <fence>
>     <method name="1">
>       <device name="manual" nodename="node1"/>
>     </method>
>   </fence>
> </clusternode>
> <clusternode name="node2" votes="1">
>   <fence>
>     <method name="1">
>       <device name="manual" nodename="node2"/>
>     </method>
>   </fence>
> </clusternode>
>
>
> <fencedevices>
>   <fencedevice agent="fence_manual" name="manual"/>
>   
To complete the fencing you have to run "fence_ack_manual" command manually.
For more information have a look at man page of "fence_ack_manual" .

Neependra






From gordan at bobich.net  Thu Apr  3 08:56:29 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 3 Apr 2008 09:56:29 +0100 (BST)
Subject: [Linux-cluster] How to use manual fencing in Redhat Cluster Suit
In-Reply-To: <468783.25520.qm@web45809.mail.sp1.yahoo.com>
References: <468783.25520.qm@web45809.mail.sp1.yahoo.com>
Message-ID: <alpine.LRH.1.10.0804030955150.22826@skynet.shatteredsilicon.net>

As far as I'm aware, there is no way to use manual fencing in an automated 
way. You'll have to manually acknowledge that the machine has been fenced.

On Thu, 3 Apr 2008, Mshehzad Pankhawala wrote:

> We are using desktop PCs at my institute.
> We have tested? DRBD and Heartbeat,
> Now we want to use RHCS. So how do we configure manual fencing.

From denisb+gmane at gmail.com  Thu Apr  3 09:39:30 2008
From: denisb+gmane at gmail.com (denis)
Date: Thu, 03 Apr 2008 11:39:30 +0200
Subject: [Linux-cluster] Re: fence_manual missing /tmp/fence_manual.fifo
In-Reply-To: <47F0FD50.3030500@artegence.com>
References: <fsfnut$6i6$1@ger.gmane.org>
	<47EBCEE8.7090905@artegence.com>	<fsidga$tum$1@ger.gmane.org>
	<47F0FD50.3030500@artegence.com>
Message-ID: <ft28jp$f2j$1@ger.gmane.org>

Maciej Bogucki wrote:
> denis napisa?(a):
>> Maciej Bogucki wrote:
>>>> Are you certain you want to continue? [yN] y
>>>> can't open /tmp/fence_manual.fifo: No such file or directory
> Do You run fence_ack_manual on the node which is master in the cluster[1]?
> [1] - http://www.mail-archive.com/linux-cluster at redhat.com/msg02173.html

Thanks for the reply. Well, yes this is a two node cluster, so I am
doing it on the remaining node, which will then be the master node.

I will however double check the next time manual fencing hits in (which
isn't often anymore as the bladecenter fencing works smoothly now.

Regards
--
Denis



From ben.yarwood at juno.co.uk  Thu Apr  3 11:16:23 2008
From: ben.yarwood at juno.co.uk (Ben Yarwood)
Date: Thu, 3 Apr 2008 12:16:23 +0100
Subject: [Linux-cluster] gfs_fsck memory allocation
Message-ID: <02f401c8957c$2b9d1fa0$82d75ee0$@yarwood@juno.co.uk>

I'm trying to run gfs_fsck on a 16TB file system and I keep getting the following message

Initializing fsck
Unable to allocate bitmap of size 520093697
This system doesn't have enough memory + swap space to fsck this file system.
Additional memory needed is approximately: 5952MB
Please increase your swap space by that amount and run gfs_fsck again.

I have increased the swap size to 16GB but I still keep getting the message.  Does anyone have any suggestions?
 





From npf-mlists at eurotux.com  Thu Apr  3 14:25:25 2008
From: npf-mlists at eurotux.com (Nuno Fernandes)
Date: Thu, 3 Apr 2008 15:25:25 +0100
Subject: [Linux-cluster] Problem in clvmd and iscsi-target
Message-ID: <200804031525.25980.npf-mlists@eurotux.com>

Hi,

There is a race condition in iscsi-target and clvmd that does not allow me to 
export a volume by iscsi and use it localy in clvmd.

I have two servers "black" and "gray". "Gray" has two drives hda (for 
filesystem) and hdb (that is going to be exported through iscsi to "black"). 
Then both machines are part of a 2 node cluster to use clvmd.

root gray ~ # /etc/init.d/iscsi-target start
Starting iSCSI target service:                             [  OK  ]
root gray ~ # /etc/init.d/clvmd start
Starting clvmd:                                            [  OK  ]
Activating VGs:   No volume groups found
                                                           [  OK  ]
root gray ~ # pvcreate /dev/hdb
  Can't open /dev/hdb exclusively.  Mounted filesystem?
root gray ~ # /etc/init.d/iscsi-target stop
Stopping iSCSI target service:                             [  OK  ]
root gray ~ # pvcreate /dev/hdb
  Physical volume "/dev/hdb" successfully created
root gray ~ #

I cannot use clvmd and iscsi-target on the same machine? If i create a logical 
volume is is activated in "black"

lvdisplay -C
  LV     VG    Attr   LSize   Origin Snap%  Move Log Copy%
  teste2 teste -wi-a-   1.00G

and in "gray" is disabled:

root gray ~ # lvdisplay -C
  LV     VG    Attr   LSize Origin Snap%  Move Log Copy%
  teste2 teste -wi-d- 1.00G

Any ideas?

Thanks
Nuno Fernandes



From rpeterso at redhat.com  Thu Apr  3 15:19:20 2008
From: rpeterso at redhat.com (Bob Peterson)
Date: Thu, 03 Apr 2008 10:19:20 -0500
Subject: [Linux-cluster] gfs_fsck memory allocation
In-Reply-To: <02f401c8957c$2b9d1fa0$82d75ee0$@yarwood@juno.co.uk>
References: <02f401c8957c$2b9d1fa0$82d75ee0$@yarwood@juno.co.uk>
Message-ID: <1207235960.24927.72.camel@technetium.msp.redhat.com>

On Thu, 2008-04-03 at 12:16 +0100, Ben Yarwood wrote:
> I'm trying to run gfs_fsck on a 16TB file system and I keep getting the following message
> 
> Initializing fsck
> Unable to allocate bitmap of size 520093697
> This system doesn't have enough memory + swap space to fsck this file system.
> Additional memory needed is approximately: 5952MB
> Please increase your swap space by that amount and run gfs_fsck again.
> 
> I have increased the swap size to 16GB but I still keep getting the message.  Does anyone have any suggestions?

Hi Ben,

The gfs_fsck needs one byte per block in each bitmap.  That message
indicates that it tried to allocate a chunk of 520MB of memory and got
an error on it.  IIRC, the biggest RG size is 2GB, and would therefore
require at most a chunk of 512K.  (Assuming 4K blocks and assuming I did
the math correctly, which I won't promise!)  a 520MB chunk is big enough
to hold an entire RG; much bigger than a bitmap for one.

So this error is most likely caused by corruption in your system rindex
file.  Perhaps you should do gfs_tool rindex and look for anomalies.

I'm planning to do some fixes to gfs_fsck to handle more cases
like this, but that will take some time to resolve.  If you send in your
metadata (gfs2_tool savemeta), that might help me in this task.

Regards,

Bob Peterson
Red Hat Clustering & GFS




From theophanis_kontogiannis at yahoo.gr  Thu Apr  3 08:44:56 2008
From: theophanis_kontogiannis at yahoo.gr (Theophanis Kontogiannis)
Date: Thu, 3 Apr 2008 11:44:56 +0300
Subject: [Linux-cluster] GFS2 error?
Message-ID: <00e201c89567$0c88ef50$9601a8c0@corp.netone.gr>

Hi all,

Anybody knows what this is?


GFS2: fsid=tweety:gfs0.0: fatal: invalid metadata block
GFS2: fsid=tweety:gfs0.0:   bh = 183677 (magic number)
GFS2: fsid=tweety:gfs0.0:   function = gfs2_meta_indirect_buffer, file =
fs/gfs2/meta_io.c, line = 438
GFS2: fsid=tweety:gfs0.0: about to withdraw this file system
GFS2: fsid=tweety:gfs0.0: telling LM to withdraw
GFS2: fsid=tweety:gfs0.0: withdrawn


I have the GFS2 fs on DRBD but DRBD reported no errors.

Thank you all

Theophanis Kontogiannis

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080403/dbd7a2d9/attachment.htm>

From npf-mlists at eurotux.com  Thu Apr  3 17:11:37 2008
From: npf-mlists at eurotux.com (Nuno Fernandes)
Date: Thu, 3 Apr 2008 18:11:37 +0100
Subject: [Linux-cluster] Re: [Iscsitarget-devel] Problem in clvmd and
	iscsi-target
In-Reply-To: <E2BB8074E5500C42984D980D4BD78EF9022A708F@MFG-NYC-EXCH2.mfg.prv>
References: <E2BB8074E5500C42984D980D4BD78EF9022A708F@MFG-NYC-EXCH2.mfg.prv>
Message-ID: <200804031811.38493.npf-mlists@eurotux.com>

On Thursday 03 April 2008 15:56:52 Ross S. W. Walker wrote:
> You have given lots of LVM info, but no iscsi-target info.
>
> What version? A copy of your ietd.conf. Output of /proc/net/iet/volume
Sorry :)

iscsitarget-0.4.16-1

ietd.conf
Target iqn.2007-04.com.eurotux.dc.gray:storage.disk.hdb
        Lun 0 Path=/dev/hdb,Type=blockio,ScsiId=95afc14465efeb27
        Alias Test
        InitialR2T              No
        ImmediateData           Yes
        MaxRecvDataSegmentLength 16384
        MaxXmitDataSegmentLength 16384
        #MaxBurstLength         262144
        #FirstBurstLength       65536
        MaxOutstandingR2T       8
        Wthreads                8

Output of proc data:
tid:1 name:iqn.2007-04.com.eurotux.dc.gray:storage.disk.hdb
        lun:0 state:0 iotype:blockio iomode:wt path:/dev/hdb

Meanwhile i narrow it down to a iscsi-target problem. Doing strace i saw it 
requires exclusive open. So solve it it created an sort of an hack.. :)

from hdb and using device-mapper i create a linear mapping to 2 devices:

hdb-int
hdb-ext

Next i put hdb-ext in ietd.conf and hdb-int in lvm.conf. iscsi-target still 
opens in exclusive mode but it only opens hdb-ext device.
clvmd uses hdb-int that has no exclusive lock.

Another way that we rejected is to put "gray" machine exporting the iscsi 
volume to it self also and using that device in clvmd. This option also 
worked but has less performance as all local contend has to be encoded to 
iscsi and decoded in the same machine.

The problem of iscsi-target opening the device excluse remais.

Thanks,
Nuno Fernandes


>
> If you export /dev/hdb via iscsi it will not be accessible for anything
> else.
>
> -Ross
>
>
> ----- Original Message -----
> From: iscsitarget-devel-bounces at lists.sourceforge.net
> <iscsitarget-devel-bounces at lists.sourceforge.net> To: 'linux clustering'
> <linux-cluster at redhat.com>; iscsitarget-devel at lists.sourceforge.net
> <iscsitarget-devel at lists.sourceforge.net> Sent: Thu Apr 03 10:25:25 2008
> Subject: [Iscsitarget-devel] Problem in clvmd and iscsi-target
>
> Hi,
>
> There is a race condition in iscsi-target and clvmd that does not allow me
> to export a volume by iscsi and use it localy in clvmd.
>
> I have two servers "black" and "gray". "Gray" has two drives hda (for
> filesystem) and hdb (that is going to be exported through iscsi to
> "black"). Then both machines are part of a 2 node cluster to use clvmd.
>
> root gray ~ # /etc/init.d/iscsi-target start
> Starting iSCSI target service:                             [  OK  ]
> root gray ~ # /etc/init.d/clvmd start
> Starting clvmd:                                            [  OK  ]
> Activating VGs:   No volume groups found
>                                                            [  OK  ]
> root gray ~ # pvcreate /dev/hdb
>   Can't open /dev/hdb exclusively.  Mounted filesystem?
> root gray ~ # /etc/init.d/iscsi-target stop
> Stopping iSCSI target service:                             [  OK  ]
> root gray ~ # pvcreate /dev/hdb
>   Physical volume "/dev/hdb" successfully created
> root gray ~ #
>
> I cannot use clvmd and iscsi-target on the same machine? If i create a
> logical volume is is activated in "black"
>
> lvdisplay -C
>   LV     VG    Attr   LSize   Origin Snap%  Move Log Copy%
>   teste2 teste -wi-a-   1.00G
>
> and in "gray" is disabled:
>
> root gray ~ # lvdisplay -C
>   LV     VG    Attr   LSize Origin Snap%  Move Log Copy%
>   teste2 teste -wi-d- 1.00G
>
> Any ideas?
>
> Thanks
> Nuno Fernandes
>
> -------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplac
>e _______________________________________________
> Iscsitarget-devel mailing list
> Iscsitarget-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/iscsitarget-devel
>
> ______________________________________________________________________
> This e-mail, and any attachments thereto, is intended only for use by
> the addressee(s) named herein and may contain legally privileged
> and/or confidential information. If you are not the intended recipient
> of this e-mail, you are hereby notified that any dissemination,
> distribution or copying of this e-mail, and any attachments thereto,
> is strictly prohibited. If you have received this e-mail in error,
> please immediately notify the sender and permanently delete the
> original and any copy or printout thereof.




From mcse47 at hotmail.com  Thu Apr  3 18:43:00 2008
From: mcse47 at hotmail.com (Tracey Flanders)
Date: Thu, 3 Apr 2008 14:43:00 -0400
Subject: [Linux-cluster] Is there a fencing agent I can use for iscsi ?(GFS
	and iSCSI)
Message-ID: <BAY123-W3F23BA8A504F52CDB20EAD4F70@phx.gbl>


I have a 2 node cluster serverA and serverB using iSCSI for the shared disk.
Both mount to ServerC which hosts the iscsi target. They both mount to the same iscsi target using GFS as the filesystem.  
I have setup RHCS and everything is working great except I do not have a proper fencing agent. Basically both nodes have to be online in order for the cluster to come up.

I was wondering if anyone has written a iscsi fencing agent that I could use. I saw one written in perl that ssh'd into the node and added an iptables entry in order to fence the server from the iscsi target. It was from 2004 and didn't run correctly on my machine.  Does anyone have any ideas? Or should I try and salvage the one I found and fix it up? Thanks.

Tracey Flanders

_________________________________________________________________
Get in touch in an instant. Get Windows Live Messenger now.
http://www.windowslive.com/messenger/overview.html?ocid=TXT_TAGLM_WL_Refresh_getintouch_042008



From lhh at redhat.com  Thu Apr  3 19:19:52 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 03 Apr 2008 15:19:52 -0400
Subject: [Linux-cluster] VIP's on mixed subnets
In-Reply-To: <OF36CA8B0B.F6241E8B-ON8725741E.0060B0D3-8725741E.0060BDDB@us.ibm.com>
References: <OF36CA8B0B.F6241E8B-ON8725741E.0060B0D3-8725741E.0060BDDB@us.ibm.com>
Message-ID: <1207250392.15132.46.camel@ayanami.boston.devel.redhat.com>

On Tue, 2008-04-01 at 11:38 -0600, Gary Romo wrote:
> 
> In my cluster all of my servers NICs are bonded. 
> Up until recently all of my VIPs (for resources/services) were in the
> same subnet. 
> Is it ok that VIPs be in mixed subnets?  Thanks. 

Yes, as long as you have an IP on the subnet already.

e.g. if you have eth1 on 192.168.1.0/24 and eth2 on 10.0.0.0/8, you can
put IPs for those two subnets in cluster.conf.

-- Lon



From lhh at redhat.com  Thu Apr  3 19:27:43 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 03 Apr 2008 15:27:43 -0400
Subject: [Linux-cluster] Using GFS and DLM without RHCS
In-Reply-To: <47F16E7F.3000903@bobich.net>
References: <47F113C5020000C800008B0C@mail-int.health-first.org>
	<47F16E7F.3000903@bobich.net>
Message-ID: <1207250863.15132.51.camel@ayanami.boston.devel.redhat.com>

On Tue, 2008-04-01 at 00:06 +0100, Gordan Bobic wrote:
> Danny Wall wrote:
> > I was wondering if it is possible to run GFS on several machines with a
> > shared GFS LUN, but not use full clustering like RHCS. From the FAQs:
> 
> First of all, what's the problem with having RHCS running? It doesn't 
> mean you have to use it to handle resources failing over. You can run it 
> all in active/active setup with load balancing in front.

Well, the direct answer is you don't need rgmanager (failover stuff) at
all to run GFS.  GFS is a client of the cluster infrastructure, and
really, it's a peer of rgmanager (though you can use rgmanager to mount
it if you want).

You do, however, need to run the cluster infrastructure stack
(openais/cman/dlm/fencing/etc.) to run GFS on multiple nodes if those
nodes are accessing the same software.

-- Lon





From garromo at us.ibm.com  Thu Apr  3 20:56:22 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Thu, 3 Apr 2008 14:56:22 -0600
Subject: [Linux-cluster] VIP's on mixed subnets
In-Reply-To: <1207250392.15132.46.camel@ayanami.boston.devel.redhat.com>
Message-ID: <OFC77778FE.8CC6D9BB-ON87257420.00727E16-87257420.0072E259@us.ibm.com>

Thanks Lon.  In my case I do not have another IP with another subnet.
So, the only way to do this would be by obtaining another NIC with the 
other SUBNET on it, correct?

e.g. 
My eth0 and eth1 are bonded with the same IP address, on the same subnet 
(192.168.0./24)
I would need to order a new NIC (eth2), which would be on the other subnet 
(10.0.0.0/8), in order to use VIP from the 10.0 subnet?

Correct?

-Gary




Lon Hohberger <lhh at redhat.com> 
Sent by: linux-cluster-bounces at redhat.com
04/03/2008 01:19 PM
Please respond to
linux clustering <linux-cluster at redhat.com>


To
linux clustering <linux-cluster at redhat.com>
cc

Subject
Re: [Linux-cluster] VIP's on mixed subnets






On Tue, 2008-04-01 at 11:38 -0600, Gary Romo wrote:
> 
> In my cluster all of my servers NICs are bonded. 
> Up until recently all of my VIPs (for resources/services) were in the
> same subnet. 
> Is it ok that VIPs be in mixed subnets?  Thanks. 

Yes, as long as you have an IP on the subnet already.

e.g. if you have eth1 on 192.168.1.0/24 and eth2 on 10.0.0.0/8, you can
put IPs for those two subnets in cluster.conf.

-- Lon

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080403/73172f70/attachment.htm>

From david at eciad.ca  Thu Apr  3 21:29:55 2008
From: david at eciad.ca (David Ayre)
Date: Thu, 3 Apr 2008 14:29:55 -0700
Subject: [Linux-cluster] dlm high cpu on latest stock centos 5.1 kernel
In-Reply-To: <47567.10.8.105.69.1207097461.squirrel@secure.ntsg.umt.edu>
References: <33710.10.8.105.69.1207095555.squirrel@secure.ntsg.umt.edu>
	<EE95BF49-125C-4A49-99D1-E7459BDE6E80@eciad.ca>
	<47567.10.8.105.69.1207097461.squirrel@secure.ntsg.umt.edu>
Message-ID: <7E877A62-DDED-46F8-AB00-10B396AA9791@eciad.ca>

Some progress...

We had another dlm_sendd lockup yesterday which prompted us to do some  
reworking of our file sharing.  Previously we had both SMB and NFS  
services competing for GFS resources on this particular node.   We  
thought perhaps it was this combination which may have provoked the  
lockups... so, we moved things around with the help of another server  
in our GFS cluster.

Previously we had:

Machine A (nfs and smb services sitting on top of gfs)
NFS  SMB
GFS

And switched things around to this:

Machine A
SMB
NFS -> Machine B

Machine B
NFS
GFS

Basically we moved all NFS mounts to machine B.... NFS is the only  
file sharing service using GFS on this machine, and changed Machine A  
to use an NFS mount to machine B.   This way we don't have any nodes  
with both SMB and NFS services running on top of GFS.

Previously we had 1-2 lockups a day, but today nothing... so far so  
good.   Not sure if this configuration will work for you... let me  
know if you need any further clarification.

d


On 1-Apr-08, at 5:51 PM, Andrew A. Neuschwander wrote:

> My symptoms are similar. dlm_send sits on all of the cpu. Top shows  
> the
> cpu spending nearly all of it's time in sys or interrupt handling.  
> Disk
> and network I/O isn't very high (as seen via iostat and iptraf). But
> SMB/NFS throughput and latency are horrible. Context switches per  
> second
> as seen by vmstat are in the 20,000+ range (I don't now if this is  
> high
> though, I haven't really paid attention to this in the past). Nothing
> crashes, and it is still able to serve data (very slowly), and  
> eventually
> the load and latency recovers.
>
> As an aside, does anyone know how to _view_ the resource group size  
> after
> file system creation on GFS?
>
> Thanks,
> -Andrew
>
>
> On Tue, April 1, 2008 6:30 pm, David Ayre wrote:
>> What do you mean by pounded exactly ?
>>
>> We have an ongoing issue, similar... when we have about a dozen users
>> using both smb/nfs, and at some seemingly random point in time our
>> dlm_senddd chews up 100% of the CPU... then dies down at on its own
>> after quite a while.  Killing SMB processes, shutting down SMB didn't
>> seem to have any affect... only a reboot cures it.  I've seen this
>> described (if this is the same issue) as a "soft lockup" as it does
>> seem to come back to life:
>>
>> http://lkml.org/lkml/2007/10/4/137
>>
>> We've been assuming its a kernel/dlm version as we are running
>> 2.6.9-55.0.6.ELsmp with dlm-kernel 2.6.9-46.16.0.8
>>
>> we were going to try a kernel update this week... but you seem to be
>> using a later version and still have this problem ?
>>
>> Could you elaborate on "getting pounded by dlm" ?  I've posted about
>> this on this list in the past but received no assistance.
>>
>>
>>
>>
>> On 1-Apr-08, at 5:19 PM, Andrew A. Neuschwander wrote:
>>
>>> I have a GFS cluster with one node serving files via smb and nfs.
>>> Under
>>> fairly light usage (5-10 users) the cpu is getting pounded by dlm. I
>>> am
>>> using CentOS5.1 with the included kernel (2.6.18-53.1.14.el5). This
>>> sounds
>>> like the dlm issue mentioned back in March of last year
>>> (https://www.redhat.com/archives/linux-cluster/2007-March/msg00068.html
>>> )
>>> that was resolved in 2.6.21.
>>>
>>> Has (or will) this fix be back ported to the current el5 kernel?
>>> Will it
>>> be in RHEL5.2? What is the easiest way for me to get this fix?
>>>
>>> Also, if I try a newer kernel on this node, will there be any harm
>>> in the
>>> other nodes using their current kernel?
>>>
>>> Thanks,
>>> -Andrew
>>> --
>>> Andrew A. Neuschwander, RHCE
>>> Linux Systems Administrator
>>> Numerical Terradynamic Simulation Group
>>> College of Forestry and Conservation
>>> The University of Montana
>>> http://www.ntsg.umt.edu
>>> andrew at ntsg.umt.edu - 406.243.6310
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>> ~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~
>> David Ayre
>> Programmer/Analyst - Information Technlogy Services
>> Emily Carr Institute of Art and Design
>> Vancouver, B.C.   Canada
>> 604-844-3875 /  david at eciad.ca
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~
David Ayre
Programmer/Analyst - Information Technlogy Services
Emily Carr Institute of Art and Design
Vancouver, B.C.   Canada
604-844-3875 /  david at eciad.ca



From andrew at ntsg.umt.edu  Thu Apr  3 22:35:41 2008
From: andrew at ntsg.umt.edu (Andrew A. Neuschwander)
Date: Thu, 3 Apr 2008 16:35:41 -0600 (MDT)
Subject: [Linux-cluster] dlm high cpu on latest stock centos 5.1 kernel
In-Reply-To: <7E877A62-DDED-46F8-AB00-10B396AA9791@eciad.ca>
References: <33710.10.8.105.69.1207095555.squirrel@secure.ntsg.umt.edu>
	<EE95BF49-125C-4A49-99D1-E7459BDE6E80@eciad.ca>
	<47567.10.8.105.69.1207097461.squirrel@secure.ntsg.umt.edu>
	<7E877A62-DDED-46F8-AB00-10B396AA9791@eciad.ca>
Message-ID: <36955.10.8.105.69.1207262141.squirrel@secure.ntsg.umt.edu>

Dave,

Thanks for the update. I had considered that and I'm setup to be able to
do it. Now that someone else has tried with positive results, I think I'll
give it a try.

Thanks,
-Andrew

On Thu, April 3, 2008 3:29 pm, David Ayre wrote:
> Some progress...
>
> We had another dlm_sendd lockup yesterday which prompted us to do some
> reworking of our file sharing.  Previously we had both SMB and NFS
> services competing for GFS resources on this particular node.   We
> thought perhaps it was this combination which may have provoked the
> lockups... so, we moved things around with the help of another server
> in our GFS cluster.
>
> Previously we had:
>
> Machine A (nfs and smb services sitting on top of gfs)
> NFS  SMB
> GFS
>
> And switched things around to this:
>
> Machine A
> SMB
> NFS -> Machine B
>
> Machine B
> NFS
> GFS
>
> Basically we moved all NFS mounts to machine B.... NFS is the only
> file sharing service using GFS on this machine, and changed Machine A
> to use an NFS mount to machine B.   This way we don't have any nodes
> with both SMB and NFS services running on top of GFS.
>
> Previously we had 1-2 lockups a day, but today nothing... so far so
> good.   Not sure if this configuration will work for you... let me
> know if you need any further clarification.
>
> d
>
>
> On 1-Apr-08, at 5:51 PM, Andrew A. Neuschwander wrote:
>
>> My symptoms are similar. dlm_send sits on all of the cpu. Top shows
>> the
>> cpu spending nearly all of it's time in sys or interrupt handling.
>> Disk
>> and network I/O isn't very high (as seen via iostat and iptraf). But
>> SMB/NFS throughput and latency are horrible. Context switches per
>> second
>> as seen by vmstat are in the 20,000+ range (I don't now if this is
>> high
>> though, I haven't really paid attention to this in the past). Nothing
>> crashes, and it is still able to serve data (very slowly), and
>> eventually
>> the load and latency recovers.
>>
>> As an aside, does anyone know how to _view_ the resource group size
>> after
>> file system creation on GFS?
>>
>> Thanks,
>> -Andrew
>>
>>
>> On Tue, April 1, 2008 6:30 pm, David Ayre wrote:
>>> What do you mean by pounded exactly ?
>>>
>>> We have an ongoing issue, similar... when we have about a dozen users
>>> using both smb/nfs, and at some seemingly random point in time our
>>> dlm_senddd chews up 100% of the CPU... then dies down at on its own
>>> after quite a while.  Killing SMB processes, shutting down SMB didn't
>>> seem to have any affect... only a reboot cures it.  I've seen this
>>> described (if this is the same issue) as a "soft lockup" as it does
>>> seem to come back to life:
>>>
>>> http://lkml.org/lkml/2007/10/4/137
>>>
>>> We've been assuming its a kernel/dlm version as we are running
>>> 2.6.9-55.0.6.ELsmp with dlm-kernel 2.6.9-46.16.0.8
>>>
>>> we were going to try a kernel update this week... but you seem to be
>>> using a later version and still have this problem ?
>>>
>>> Could you elaborate on "getting pounded by dlm" ?  I've posted about
>>> this on this list in the past but received no assistance.
>>>
>>>
>>>
>>>
>>> On 1-Apr-08, at 5:19 PM, Andrew A. Neuschwander wrote:
>>>
>>>> I have a GFS cluster with one node serving files via smb and nfs.
>>>> Under
>>>> fairly light usage (5-10 users) the cpu is getting pounded by dlm. I
>>>> am
>>>> using CentOS5.1 with the included kernel (2.6.18-53.1.14.el5). This
>>>> sounds
>>>> like the dlm issue mentioned back in March of last year
>>>> (https://www.redhat.com/archives/linux-cluster/2007-March/msg00068.html
>>>> )
>>>> that was resolved in 2.6.21.
>>>>
>>>> Has (or will) this fix be back ported to the current el5 kernel?
>>>> Will it
>>>> be in RHEL5.2? What is the easiest way for me to get this fix?
>>>>
>>>> Also, if I try a newer kernel on this node, will there be any harm
>>>> in the
>>>> other nodes using their current kernel?
>>>>
>>>> Thanks,
>>>> -Andrew
>>>> --
>>>> Andrew A. Neuschwander, RHCE
>>>> Linux Systems Administrator
>>>> Numerical Terradynamic Simulation Group
>>>> College of Forestry and Conservation
>>>> The University of Montana
>>>> http://www.ntsg.umt.edu
>>>> andrew at ntsg.umt.edu - 406.243.6310
>>>>
>>>> --
>>>> Linux-cluster mailing list
>>>> Linux-cluster at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>>> ~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~
>>> David Ayre
>>> Programmer/Analyst - Information Technlogy Services
>>> Emily Carr Institute of Art and Design
>>> Vancouver, B.C.   Canada
>>> 604-844-3875 /  david at eciad.ca
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> ~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~_~
> David Ayre
> Programmer/Analyst - Information Technlogy Services
> Emily Carr Institute of Art and Design
> Vancouver, B.C.   Canada
> 604-844-3875 /  david at eciad.ca
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>


-- 
Andrew A. Neuschwander, RHCE
Linux Systems Administrator
Numerical Terradynamic Simulation Group
College of Forestry and Conservation
The University of Montana
http://www.ntsg.umt.edu
andrew at ntsg.umt.edu - 406.243.6310



From andrew at ntsg.umt.edu  Thu Apr  3 23:06:22 2008
From: andrew at ntsg.umt.edu (Andrew A. Neuschwander)
Date: Thu, 3 Apr 2008 17:06:22 -0600 (MDT)
Subject: [Linux-cluster] gfs mount options and tuneables
Message-ID: <42586.10.8.105.69.1207263982.squirrel@secure.ntsg.umt.edu>

How important is it that all members of a gfs cluster have the same mount
options and tunable values set? Is having different options safe in
general or is the safety dependent on the specific option in question?

Thanks,
-Andrew
-- 
Andrew A. Neuschwander, RHCE
Linux Systems Administrator
Numerical Terradynamic Simulation Group
College of Forestry and Conservation
The University of Montana
http://www.ntsg.umt.edu
andrew at ntsg.umt.edu - 406.243.6310



From theophanis_kontogiannis at yahoo.gr  Fri Apr  4 00:10:52 2008
From: theophanis_kontogiannis at yahoo.gr (Theophanis Kontogiannis)
Date: Fri, 4 Apr 2008 03:10:52 +0300
Subject: [Linux-cluster] fsck.gfs2 seg fault
Message-ID: <000501c895e8$5f13e5f0$9601a8c0@corp.netone.gr>

Hello all

I have RHEL 5.1 with 2.6.18 kernel and fsck.gfs2 (GFS2 fsck 0.1.38) always
seg faults at 99% with:

fsck.gfs2[8245]: segfault at 0000000000000018 rip 00000000004047db rsp
00007ffffbabecb0 error 4

Any ideas on that?

Thank you all for your time
Theophanis Kontogiannis

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080404/63ca0c82/attachment.htm>

From theophanis_kontogiannis at yahoo.gr  Thu Apr  3 23:15:54 2008
From: theophanis_kontogiannis at yahoo.gr (Theophanis Kontogiannis)
Date: Fri, 4 Apr 2008 02:15:54 +0300
Subject: [Linux-cluster] Problem with 2 node cluster and GFS2
Message-ID: <000001c895e0$ac2aacf0$9601a8c0@corp.netone.gr>

	Hello all,

Any ideas on what the following is?

GFS2: fsid=: Trying to join cluster "lock_dlm", "tweety:gfs0"
GFS2: fsid=tweety:gfs0.0: Joined cluster. Now mounting FS...
GFS2: fsid=tweety:gfs0.0: jid=0, already locked for use
GFS2: fsid=tweety:gfs0.0: jid=0: Looking at journal...
GFS2: fsid=tweety:gfs0.0: jid=0: Acquiring the transaction lock...
GFS2: fsid=tweety:gfs0.0: jid=0: Replaying journal...
GFS2: fsid=tweety:gfs0.0: jid=0: Replayed 4 of 4 blocks
GFS2: fsid=tweety:gfs0.0: jid=0: Found 0 revoke tags
GFS2: fsid=tweety:gfs0.0: jid=0: Journal replayed in 1s
GFS2: fsid=tweety:gfs0.0: jid=0: Done
GFS2: fsid=tweety:gfs0.0: jid=1: Trying to acquire journal lock...
GFS2: fsid=tweety:gfs0.0: jid=1: Looking at journal...
GFS2: fsid=tweety:gfs0.0: jid=1: Done
GFS2: fsid=tweety:gfs0.0: jid=2: Trying to acquire journal lock...
GFS2: fsid=tweety:gfs0.0: jid=2: Looking at journal...
GFS2: fsid=tweety:gfs0.0: jid=2: Done
GFS2: fsid=tweety:gfs0.0: jid=3: Trying to acquire journal lock...
GFS2: fsid=tweety:gfs0.0: jid=3: Looking at journal...
GFS2: fsid=tweety:gfs0.0: jid=3: Done
GFS2: fsid=tweety:gfs0.0: fatal: invalid metadata block
GFS2: fsid=tweety:gfs0.0:   bh = 162602 (magic number)
GFS2: fsid=tweety:gfs0.0:   function = gfs2_meta_indirect_buffer, file =
fs/gfs2/meta_io.c, line = 438
GFS2: fsid=tweety:gfs0.0: about to withdraw this file system
GFS2: fsid=tweety:gfs0.0: telling LM to withdraw
GFS2: fsid=tweety:gfs0.0: withdrawn

Call Trace:
 [<ffffffff8865215e>] :gfs2:gfs2_lm_withdraw+0xc1/0xd0
 [<ffffffff80014cca>] sync_buffer+0x0/0x3f
 [<ffffffff80062a3f>] out_of_line_wait_on_bit+0x6c/0x78
 [<ffffffff8009ba34>] wake_bit_function+0x0/0x23
 [<ffffffff8866395b>] :gfs2:gfs2_meta_check_ii+0x2c/0x38
 [<ffffffff88655dbb>] :gfs2:gfs2_meta_indirect_buffer+0x1e3/0x284
 [<ffffffff88650a88>] :gfs2:gfs2_inode_refresh+0x22/0x2b9
 [<ffffffff8864feff>] :gfs2:inode_go_lock+0x29/0x57
 [<ffffffff8864f08f>] :gfs2:glock_wait_internal+0x1e3/0x259
 [<ffffffff8864f2b3>] :gfs2:gfs2_glock_nq+0x1ae/0x1d4
 [<ffffffff88659e35>] :gfs2:gfs2_getattr+0x7d/0xc3
 [<ffffffff88659e2d>] :gfs2:gfs2_getattr+0x75/0xc3
 [<ffffffff8000de0b>] vfs_getattr+0x2d/0xa9
 [<ffffffff8003e6ff>] vfs_lstat_fd+0x2f/0x47
 [<ffffffff8002a6ab>] sys_newlstat+0x19/0x31
 [<ffffffff8005c229>] tracesys+0x71/0xe0
 [<ffffffff8005c28d>] tracesys+0xd5/0xe0



I see it on both nodes when I try to access a particular folder from either
node

Thank you all
Theophanis Kontogiannis

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080404/ed95315c/attachment.htm>

From Christopher.Barry at qlogic.com  Fri Apr  4 03:23:27 2008
From: Christopher.Barry at qlogic.com (christopher barry)
Date: Thu, 03 Apr 2008 23:23:27 -0400
Subject: [Linux-cluster] Unformatting a GFS cluster disk
In-Reply-To: <47EFEFE9.9040903@gmail.com>
References: <OF6D2E49A0.025F62C0-ON80257418.0036C9A3-80257418.00376E11@amnesty.org>
	<47EA6EB7.1090108@gmail.com> <1206545696.5336.48.camel@localhost>
	<20080326205842.GA22083@nlxdcldnl2.cl.intel.com>
	<1206646016.29968.60.camel@localhost>
	<20080328144225.GA12231@nlxdcldnl2.cl.intel.com>
	<1206733914.5433.39.camel@localhost>  <47EFEFE9.9040903@gmail.com>
Message-ID: <1207279407.6233.6.camel@localhost>

On Sun, 2008-03-30 at 14:54 -0500, Wendy Cheng wrote:
snip...
> In general, GFS backup from Linux side during run time has been a pain, 
> mostly because of its slowness and the process has to walk thru the 
> whole filesystem to read every single file that ends up accumulating 
> non-trivial amount of cached glocks and memory. For a sizable filesystem 
> (say in TBs range like yours), past experiences have shown that after 
> backup(s), the filesystem latency can go up to an unacceptable level 
> unless its glocks are trimmed. There is a tunable specifically written 
> for this purpose (glock_purge - introduced via RHEL 4.5 ) though.

What should I be setting glock_purge to?

snip...
> The thinking here is to leverage the embedded Netapp copy-on-write 
> feature to speed up the backup process with reasonable disk space 
> requirement. The snapshot volume and the cloned lun shouldn't take much 
> disk space and we can turn on gfs readahead and glock_purge tunables 
> with minimum interruptions to the original gfs volume. The caveat here 
> is GFS-mounting the cloned lun - for one, gfs itself at this moment 
> doesn't allow mounting of multiple devices that have the same filesystem 
> identifiers (the -t value you use during mkfs time e.g. 
> "cluster-name:filesystem-name") on the same node - but it can be fixed 
> (by rewriting the filesystem ID and lock protocol - I will start to test 
> out the described backup script and a gfs kernel patch next week). Also 
> as any tape backup from linux host, you should not expect an image of 
> gfs mountable device (when retrieving from tape) - it is basically a 
> collection of all files residing on the gfs filesystem when the backup 
> events take places.
> 
> Will the above serve your need ? Maybe other folks have (other) better 
> ideas ?

This sounds exactly like what I can use - and it's got to be useful for
everyone with a NetApp and gfs. Thanks for doing this! Let me know how I
can help.



Regards,
-C



From Alain.Moulle at bull.net  Fri Apr  4 06:32:36 2008
From: Alain.Moulle at bull.net (Alain Moulle)
Date: Fri, 04 Apr 2008 08:32:36 +0200
Subject: [Linux-cluster] CS5/ best way to monitor other eth networks
Message-ID: <47F5CB84.3070901@bull.net>

Hi

Which is the best way to monitor other eth networks than the one
for heart-beat ?

I don't think it's possible with heuristics if we have already
set two ping on the same network to check and avoid dual-fencing.

would it be to create a service with a status target which will
ping the eth network to be monitored (quite same way as both
pings in heuristics for heart-beat network) ?

or would it be a cron to check periodically the network and
to do poweroff -f in case of failure ?

Thanks for piece of advice.
Regards
Alain Moull?



From maciej.bogucki at artegence.com  Fri Apr  4 07:50:20 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Fri, 04 Apr 2008 09:50:20 +0200
Subject: [Linux-cluster] Problem in clvmd and iscsi-target
In-Reply-To: <200804031525.25980.npf-mlists@eurotux.com>
References: <200804031525.25980.npf-mlists@eurotux.com>
Message-ID: <47F5DDBC.1040307@artegence.com>

Nuno Fernandes napisa?(a):
> Hi,
> 
> There is a race condition in iscsi-target and clvmd that does not allow me to 
> export a volume by iscsi and use it localy in clvmd.
> 
> I have two servers "black" and "gray". "Gray" has two drives hda (for 
> filesystem) and hdb (that is going to be exported through iscsi to "black"). 
> Then both machines are part of a 2 node cluster to use clvmd.
> 
> root gray ~ # /etc/init.d/iscsi-target start
> Starting iSCSI target service:                             [  OK  ]
> root gray ~ # /etc/init.d/clvmd start
> Starting clvmd:                                            [  OK  ]
> Activating VGs:   No volume groups found
>                                                            [  OK  ]
> root gray ~ # pvcreate /dev/hdb
>   Can't open /dev/hdb exclusively.  Mounted filesystem?
> root gray ~ # /etc/init.d/iscsi-target stop
> Stopping iSCSI target service:                             [  OK  ]
> root gray ~ # pvcreate /dev/hdb
>   Physical volume "/dev/hdb" successfully created
> root gray ~ #
> 
> I cannot use clvmd and iscsi-target on the same machine? If i create a logical 
> volume is is activated in "black"
> 
> lvdisplay -C
>   LV     VG    Attr   LSize   Origin Snap%  Move Log Copy%
>   teste2 teste -wi-a-   1.00G
> 
> and in "gray" is disabled:
> 
> root gray ~ # lvdisplay -C
>   LV     VG    Attr   LSize Origin Snap%  Move Log Copy%
>   teste2 teste -wi-d- 1.00G
> 
> Any ideas?

Hello,

This is strange to me. I have naver done it but I'm sure it is possible
to do! There is nothing special in /etc/init.d/iscsi-target what could
wrong. Could You send Your /etc/ietd.conf. I think that You have
exported /dev/hdb via iscsi, ant here is Your problem.

Best Regards
Maciej Bogucki




From maciej.bogucki at artegence.com  Fri Apr  4 08:02:40 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Fri, 04 Apr 2008 10:02:40 +0200
Subject: [Linux-cluster] Is there a fencing agent I can use for iscsi
	?(GFS	and iSCSI)
In-Reply-To: <BAY123-W3F23BA8A504F52CDB20EAD4F70@phx.gbl>
References: <BAY123-W3F23BA8A504F52CDB20EAD4F70@phx.gbl>
Message-ID: <47F5E0A0.9080709@artegence.com>

> Or should I try and salvage the one I found and fix it up? Thanks.

Hello,

I think that fixing this script is the best way to do it with linux
iscsi-target.

Best Regards
Maciej Bogucki



From pmshehzad at yahoo.com  Fri Apr  4 08:28:32 2008
From: pmshehzad at yahoo.com (Mshehzad Pankhawala)
Date: Fri, 4 Apr 2008 01:28:32 -0700 (PDT)
Subject: [Linux-cluster] How to use manual fencing in Redhat Cluster
	Suit
Message-ID: <553769.13176.qm@web45812.mail.sp1.yahoo.com>

Thanks for your kind reply,
I have successfully configured two node cluster using system-config-cluster with given settings.
Now trying to use drbd with RHCS.
Regards
MohammedShehzad

       
---------------------------------
You rock. That's why Blockbuster's offering you one month of Blockbuster Total Access, No Cost.
       
---------------------------------
You rock. That's why Blockbuster's offering you one month of Blockbuster Total Access, No Cost.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080404/b9c99a1d/attachment.htm>

From johannes.russek at io-consulting.net  Fri Apr  4 08:41:00 2008
From: johannes.russek at io-consulting.net (jr)
Date: Fri, 04 Apr 2008 10:41:00 +0200
Subject: [Linux-cluster] Is there a fencing agent I can use for iscsi
	?(GFS and iSCSI)
In-Reply-To: <BAY123-W3F23BA8A504F52CDB20EAD4F70@phx.gbl>
References: <BAY123-W3F23BA8A504F52CDB20EAD4F70@phx.gbl>
Message-ID: <1207298460.15409.39.camel@admc.win-rar.local>


> I was wondering if anyone has written a iscsi fencing agent that I could use. I saw one written in perl that ssh'd into the node and added an iptables entry in order to fence the server from the iscsi target. It was from 2004 and didn't run correctly on my machine.  Does anyone have any ideas? Or should I try and salvage the one I found and fix it up? Thanks.

if you need to use it (as suggested in that other reply), i'd make sure
it doesn't connect to a node but to the iSCSI target and adds the
firewall rules there :) or even better if you have a managed switch in
between where you can simply disable the ethernet port (or even better,
have iSCSI on a separate vlan and remove the port from that vlan) via an
ssh script or maybe snmp or whatever.
enjoy,
johannes 

> Tracey Flanders
> 
> _________________________________________________________________
> Get in touch in an instant. Get Windows Live Messenger now.
> http://www.windowslive.com/messenger/overview.html?ocid=TXT_TAGLM_WL_Refresh_getintouch_042008
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From npf-mlists at eurotux.com  Fri Apr  4 09:06:35 2008
From: npf-mlists at eurotux.com (Nuno Fernandes)
Date: Fri, 4 Apr 2008 10:06:35 +0100
Subject: [Linux-cluster] Re: [Iscsitarget-devel] Problem in clvmd and
	iscsi-target
In-Reply-To: <E2BB8074E5500C42984D980D4BD78EF9022A7090@MFG-NYC-EXCH2.mfg.prv>
References: <E2BB8074E5500C42984D980D4BD78EF9022A708F@MFG-NYC-EXCH2.mfg.prv>
	<200804031811.38493.npf-mlists@eurotux.com>
	<E2BB8074E5500C42984D980D4BD78EF9022A7090@MFG-NYC-EXCH2.mfg.prv>
Message-ID: <200804041006.36217.npf-mlists@eurotux.com>

> > Meanwhile i narrow it down to a iscsi-target problem. Doing strace i saw
> > it requires exclusive open. So solve it it created an sort of an
> > hack.. :)
>
> Ok, you cannot have local services and remote services accessing
> the same raw data sectors at the same time.
>
> I expressedly made blockio open the device exclusively to avoid
> just what you are trying to do.
Why? Did you have any problems in this scenario? Using blockio iscsi-target 
does not use any cache, right?


> If you really need access to the iSCSI target locally, then
> you will need to install an iSCSI initiator on the server and
> connect to the target and use the new disk volume that is
> created.
Ok.. and is the following scenario workable?

HP package cluster with MSA500 (basically 2 servers connected to a scsi shared 
storage)

both servers  export the same msa500 volume
on clients i use iscsi-initiator + multipath to support the failure of one 
server

In this scenario there are two iscsi-target server accessing the same data but 
on different machines. Basically it's the same thing as another process 
accessing the exported volume... Do you agree?

> > from hdb and using device-mapper i create a linear mapping to 2 devices:
> >
> > hdb-int
> > hdb-ext
> >
> > Next i put hdb-ext in ietd.conf and hdb-int in lvm.conf. iscsi-target
> > still opens in exclusive mode but it only opens hdb-ext device.
> > clvmd uses hdb-int that has no exclusive lock.
>
> You are heading down a path here that is fully unsupported and
> discouraged, if you loose your data it will be completely and
> totally because of this.

:(


> > Another way that we rejected is to put "gray" machine exporting the iscsi
> > volume to it self also and using that device in clvmd. This option also
> > worked but has less performance as all local contend has to be encoded to
> > iscsi and decoded in the same machine.
>
> iSCSI encoding/decoding is minimal and if you are using a loopback
> to connect should be as fast as the machine can do it. Do not try
> this with fileio unit types though or the page cache will deadlock
> between iSCSI target/initiator.

IMHO encoding and decoding it locally would be a performance penalty that 
would not be required.. I dind't used fileio exactly because it caches 
content.

> > The problem of iscsi-target opening the device excluse remais.
>
> And it will continue to remain.
>
> Why not try to explain what it is your are trying to accomplish
> and a valid solution or two can be given.
What i'm trying to do is the following:

2 servers connected to msa500 exporting a volume through iscsi-target
6 servers with iscsi-initiator + multipath accessing the volume

all 8 servers with clvmd accessing the volume so i can create LVs in any of 
the 8 servers.

The LVs are to be used by Xen virtual machines in any machine in the cluster

I didn't want to use iscsi-target + iscsi-initiator on the msa connected 
machines as it would be a performance penalty.

Thanks,
Nuno Fernandes



From maciej.bogucki at artegence.com  Fri Apr  4 10:28:53 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Fri, 04 Apr 2008 12:28:53 +0200
Subject: [Linux-cluster] Is there a fencing agent I can use for iscsi
	?(GFS and iSCSI)
In-Reply-To: <1207298460.15409.39.camel@admc.win-rar.local>
References: <BAY123-W3F23BA8A504F52CDB20EAD4F70@phx.gbl>
	<1207298460.15409.39.camel@admc.win-rar.local>
Message-ID: <47F602E5.9040306@artegence.com>

jr napisa?(a):
>> I was wondering if anyone has written a iscsi fencing agent that I could use. I saw one written in perl that ssh'd into the node and added an iptables entry in order to fence the server from the iscsi target. It was from 2004 and didn't run correctly on my machine.  Does anyone have any ideas? Or should I try and salvage the one I found and fix it up? Thanks.
> 
> if you need to use it (as suggested in that other reply), i'd make sure
> it doesn't connect to a node but to the iSCSI target and adds the
> firewall rules there :) or even better if you have a managed switch in
> between where you can simply disable the ethernet port (or even better,
> have iSCSI on a separate vlan and remove the port from that vlan) via an
> ssh script or maybe snmp or whatever.
> enjoy,

Another option is fencing via power device fe. fence_apc, fence_apc_snmp
but You would need tu but APC hardware. Fenceing via fence_ilo,
fence_rsa. fence_ipmilan is the option if You would have IBM, Dell or HP
servers. You could also try fence_scsi without any costs, but it doesn't
works if You had multipath configuration.

Best Regards
Maciej Bogucki



From rpeterso at redhat.com  Fri Apr  4 13:37:55 2008
From: rpeterso at redhat.com (Bob Peterson)
Date: Fri, 04 Apr 2008 08:37:55 -0500
Subject: [Linux-cluster] fsck.gfs2 seg fault
In-Reply-To: <000501c895e8$5f13e5f0$9601a8c0@corp.netone.gr>
References: <000501c895e8$5f13e5f0$9601a8c0@corp.netone.gr>
Message-ID: <1207316275.2740.10.camel@technetium.msp.redhat.com>

On Fri, 2008-04-04 at 03:10 +0300, Theophanis Kontogiannis wrote:
> Hello all
> 
> I have RHEL 5.1 with 2.6.18 kernel and fsck.gfs2 (GFS2 fsck 0.1.38)
> always seg faults at 99% with:
> 
> fsck.gfs2[8245]: segfault at 0000000000000018 rip 00000000004047db rsp
> 00007ffffbabecb0 error 4
> 
> Any ideas on that?
> 
> 
> Thank you all for your time
> 
> Theophanis Kontogiannis

Hi Theophanis,

That's not enough information to tell what's going on.
It would probably be more helpful if I could get the full call trace,
and the last several lines of output from fsck.gfs2 -vv <device>.

It would even be better if you built gfs2.fsck from the latest source
tree, changed the Makefile to use "-g" rather than "-O2", compiled it,
and then ran it from gdb.  Then if/when it segfaults, you can do "bt"
to get a better backtrace.

Also, be aware that GFS2 is not ready for production in RHEL5.1.

Regards,

Bob Peterson
Red Hat Clustering & GFS




From johannes.russek at io-consulting.net  Fri Apr  4 14:15:43 2008
From: johannes.russek at io-consulting.net (jr)
Date: Fri, 04 Apr 2008 16:15:43 +0200
Subject: [Linux-cluster] Nagios check
Message-ID: <1207318543.15409.42.camel@admc.win-rar.local>

Hi Everybody,
i wonder if i'm the first with the need to check the status of GFS /
cman with nagios.
Did anyone maybe already write a check script i did not find yet?
i found one via google, but it basically just did an ls -l on the GFS
share, and that seems to be a little bit too less for monitoring..
thanks in advance,
regards,
johannes



From rotsen at gmail.com  Fri Apr  4 14:41:43 2008
From: rotsen at gmail.com (=?ISO-8859-1?Q?N=E9stor?=)
Date: Fri, 4 Apr 2008 07:41:43 -0700
Subject: [Linux-cluster] How to use manual fencing in Redhat Cluster
	Suit
In-Reply-To: <553769.13176.qm@web45812.mail.sp1.yahoo.com>
References: <553769.13176.qm@web45812.mail.sp1.yahoo.com>
Message-ID: <a304e2d60804040741o2dde7d14lb068f4fd288aae6c@mail.gmail.com>

Mohammed,

I need to do DRBD on RHEL5.  Let me know wha tyou find out while doing your
project
I am sure I can use your experiences when I start my DRBD.

Good Luck,


N?stor :-)

2008/4/4 Mshehzad Pankhawala <pmshehzad at yahoo.com>:

> Thanks for your kind reply,
> I have successfully configured two node cluster using
> system-config-cluster with given settings.
> Now trying to use drbd with RHCS.
> Regards
> MohammedShehzad
>
> ------------------------------
> You rock. That's why Blockbuster's offering you one month of Blockbuster
> Total Access<http://us.rd.yahoo.com/evt=47523/*http://tc.deals.yahoo.com/tc/blockbuster/text5.com>,
> No Cost.
>
> ------------------------------
> You rock. That's why Blockbuster's offering you one month of Blockbuster
> Total Access<http://us.rd.yahoo.com/evt=47523/*http://tc.deals.yahoo.com/tc/blockbuster/text5.com>,
> No Cost.
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080404/6ef25a2c/attachment.htm>

From tiagocruz at forumgdh.net  Fri Apr  4 14:46:25 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Fri, 04 Apr 2008 11:46:25 -0300
Subject: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?
Message-ID: <1207320385.19045.19.camel@tuxkiller.ig.com.br>

If I'm using one LVM device, export it using GNDB, and format using GFS
(1 or 2).... I will can add or remove some GB on this filesystem?

Thanks
-- 
Tiago Cruz
http://everlinux.com
Linux User #282636




From npf-mlists at eurotux.com  Fri Apr  4 15:04:11 2008
From: npf-mlists at eurotux.com (Nuno Fernandes)
Date: Fri, 4 Apr 2008 16:04:11 +0100
Subject: [Linux-cluster] Re: [Iscsitarget-devel] Problem in clvmd and
	iscsi-target
In-Reply-To: <E2BB8074E5500C42984D980D4BD78EF9022A7095@MFG-NYC-EXCH2.mfg.prv>
References: <E2BB8074E5500C42984D980D4BD78EF9022A708F@MFG-NYC-EXCH2.mfg.prv>
	<200804041006.36217.npf-mlists@eurotux.com>
	<E2BB8074E5500C42984D980D4BD78EF9022A7095@MFG-NYC-EXCH2.mfg.prv>
Message-ID: <200804041604.11396.npf-mlists@eurotux.com>

Hello,

First of all i would like to thank your patience...

> There is a lot of confusion by newcomers to iSCSI storage.
>
> A lot of the time they think of iSCSI as yet another
> file sharing method, which it isn't, it is a disk sharing
> method, and if you allow 2 hosts to access the same disk
> without putting special controls in place to make sure
> that either 1) only 1 host at a time can access a given
> disk, or 2) install a clustering file system that allows
> multiple hosts to access the same disk at the same time,
> then they will experience data corruption as there is
> nothing preventing any two hosts from writing data on
> top of each other.
I understant.. iscsi has nothing to do with files or filesystems. Iscsi (and 
scsi for that matter) only work with blocks. If you try to put several 
machines accessing the same filesystem that is not cluster-aware you'll have 
lots of corruptions..

> The performance penalty you speak of with blockio being accessed
> through a local iSCSI connection should really not be noticed
> except for extreme high-end processing, which if that is the
> case you are picking the wrong technology.
We have bladecenter with FC storage for that :)
What we are trying to do is remove "unecessary" load in the msa connected 
machines as they will be used for virtual machines also.

> When you mount an iscsi target locally the open-iscsi initiator
> does agressive caching of io, then the file system of the OS
> does agressive caching itself, so it's not as if all io becomes
> synchronous in this scenario.
You are correct but that also happens with 2 open-iscsi initiators accessing 
the same exported volume in different machines. The only difference is that 
instead of the msa500 volume being exported directly by iscsi-target  there 
is a middleware (device-mapper) between msa500 volume and the iscsi-target.
Device-mapper does not do cache. When we do an fsync in a guest machine it 
goes:

virtual machine fsync -> clvmd/lvm -> iscsi-initiator -> iscsi-target -> 
device-mapper -> msa500

when the virtual machine is running in the msa500 connected hardware we get

virtual machine fsync -> clvmd/lvm -> device-mapper (linear) -> msa500

> Now you can use clvm between the iSCSI targets to manage
> how the MSA500 storage is allocated for the creation of
> iSCSI targets, but once exported by iSCSI, these servers
> should not care about what the initiators put into it
> or how they manage it.
That would require us to be changing all the time the iscsi-target and 
initiators confs as well as iscsi discovers and multipath in all the 
iscsi-initiators machines.
When we create a volume to a virtual machine we would have to do:

1 - create volume in clvmd that manages the storage
2 - change ietd.conf to allow it to be exported
3 - discover the new device in initiators
4 - change multipath in initiators including the new volume

Drawbacks:
1 - lots of changes in conf files, restarting services :)
2 - Multipath has a patchchecker that checks if a path is alive (usually 
readblock0). That would give me lots and lots of readblock0..

total checks in msa500 = num client machines * num multipath devices * num 
iscsi-target machines

With 8 machines and 40 volumes we would have:

8 * 40 * 2 = 640 IO checks

>
>                +--------+ <-> |- initiator1
>
>                | iSCSI1 |     |
>
> +--------+ <-> +--------+ <-> |- initiator2
>
> | MSA500 | (2)     (3)    (4) |     (5)
>
> +--------+ <-> +--------+ <-> |- initiator3
>     (1)        | iSCSI2 |     |
>                +--------+ <-> |- initiator4
>
> 1) MSA500 provides volume1, volume2 to
>    fiber hosts iSCSI1/iSCSI2
> 2) iSCSI1/iSCSI2 fiber connect to MSA500
> 3) iSCSI1/iSCSI2 use clvm to divvy up
>    volume1 and volume2 into target1, target2
>    target3, target4, target5 to iSCSI network
> 4) iSCSI1/iSCSI2 provide targets to iSCSI
>    network through bonded pairs
> 5) initiators use clvm to divvy up target1,
>    target2, target3... storage for use by Xen
>    domains.

> I hope that helps.
We are doing stress tests (bonnie++, ctcs) with our "hack" and so far it never 
had any problems. We even shutdown one of the iscsi-target nodes there's a 
small hiccup (as one path failed) but it continues shortly after.

We've changed

node.session.timeo.replacement_timeout 
node.conn[0].timeo.noop_out_interval 
node.conn[0].timeo.noop_out_timeout 

to increase the speed of the failover..

Thanks again,
Nuno Fernandes



From RJM002 at shsu.edu  Fri Apr  4 15:26:46 2008
From: RJM002 at shsu.edu (MARTI, ROBERT JESSE)
Date: Fri, 4 Apr 2008 10:26:46 -0500
Subject: [Linux-cluster] Nagios check
References: <1207318543.15409.42.camel@admc.win-rar.local>
Message-ID: <9F633DE6C0E04F4691DCB713AC44C94B066B56C7@EXCHANGE.SHSU.EDU>

IIRC, theres a cluster snmp package - I would see what I can pull from there.

Rob Marti
Sam Houston State University
Systems Analyst II
936-294-3804 // rjm002 at shsu.edu



-----Original Message-----
From: linux-cluster-bounces at redhat.com on behalf of jr
Sent: Fri 4/4/2008 09:15
To: linux clustering
Subject: [Linux-cluster] Nagios check
 
Hi Everybody,
i wonder if i'm the first with the need to check the status of GFS /
cman with nagios.
Did anyone maybe already write a check script i did not find yet?
i found one via google, but it basically just did an ls -l on the GFS
share, and that seems to be a little bit too less for monitoring..
thanks in advance,
regards,
johannes

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From garromo at us.ibm.com  Fri Apr  4 15:50:26 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Fri, 4 Apr 2008 09:50:26 -0600
Subject: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?
In-Reply-To: <1207320385.19045.19.camel@tuxkiller.ig.com.br>
Message-ID: <OFC66CCC8D.582C4145-ON87257421.0056E373-87257421.0056DF74@us.ibm.com>

You cannot shrink/reduce the size of a GFS file system.

Gary Romo
IBM Global Technology Services
303.458.4415
Email: garromo at us.ibm.com
Pager:1.877.552.9264
Text message: gromo at skytel.com



Tiago Cruz <tiagocruz at forumgdh.net> 
Sent by: linux-cluster-bounces at redhat.com
04/04/2008 08:46 AM
Please respond to
linux clustering <linux-cluster at redhat.com>


To
linux clustering <linux-cluster at redhat.com>
cc

Subject
[Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?






If I'm using one LVM device, export it using GNDB, and format using GFS
(1 or 2).... I will can add or remove some GB on this filesystem?

Thanks
-- 
Tiago Cruz
http://everlinux.com
Linux User #282636


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080404/d4f25b67/attachment.htm>

From johannes.russek at io-consulting.net  Fri Apr  4 16:22:52 2008
From: johannes.russek at io-consulting.net (jr)
Date: Fri, 04 Apr 2008 18:22:52 +0200
Subject: [Linux-cluster] Nagios check
In-Reply-To: <9F633DE6C0E04F4691DCB713AC44C94B066B56C7@EXCHANGE.SHSU.EDU>
References: <1207318543.15409.42.camel@admc.win-rar.local>
	<9F633DE6C0E04F4691DCB713AC44C94B066B56C7@EXCHANGE.SHSU.EDU>
Message-ID: <1207326172.15409.45.camel@admc.win-rar.local>

good idea!
if only i wouldn't run rhel5 x86_64 (centos in this case) which still
maintains a bug of snmpd that causes it to lock up and stay in an
infinite loop with 99% cpu usage :/

regards,
johannes


Am Freitag, den 04.04.2008, 10:26 -0500 schrieb MARTI, ROBERT JESSE:
> IIRC, theres a cluster snmp package - I would see what I can pull from there.
> 
> Rob Marti
> Sam Houston State University
> Systems Analyst II
> 936-294-3804 // rjm002 at shsu.edu
> 
> 
> 
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com on behalf of jr
> Sent: Fri 4/4/2008 09:15
> To: linux clustering
> Subject: [Linux-cluster] Nagios check
>  
> Hi Everybody,
> i wonder if i'm the first with the need to check the status of GFS /
> cman with nagios.
> Did anyone maybe already write a check script i did not find yet?
> i found one via google, but it basically just did an ls -l on the GFS
> share, and that seems to be a little bit too less for monitoring..
> thanks in advance,
> regards,
> johannes
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From alex.kompel at 23andme.com  Fri Apr  4 17:29:09 2008
From: alex.kompel at 23andme.com (Alex Kompel)
Date: Fri, 4 Apr 2008 10:29:09 -0700
Subject: [Linux-cluster] Is there a fencing agent I can use for iscsi
	?(GFS and iSCSI)
In-Reply-To: <47F602E5.9040306@artegence.com>
References: <BAY123-W3F23BA8A504F52CDB20EAD4F70@phx.gbl>
	<1207298460.15409.39.camel@admc.win-rar.local>
	<47F602E5.9040306@artegence.com>
Message-ID: <68a019b50804041029n37f52dc8od34c597a23b928c3@mail.gmail.com>

2008/4/4 Maciej Bogucki <maciej.bogucki at artegence.com>:

> jr napisa?(a):
> >> I was wondering if anyone has written a iscsi fencing agent that I
> could use. I saw one written in perl that ssh'd into the node and added an
> iptables entry in order to fence the server from the iscsi target. It was
> from 2004 and didn't run correctly on my machine.  Does anyone have any
> ideas? Or should I try and salvage the one I found and fix it up? Thanks.
> >
> > if you need to use it (as suggested in that other reply), i'd make sure
> > it doesn't connect to a node but to the iSCSI target and adds the
> > firewall rules there :) or even better if you have a managed switch in
> > between where you can simply disable the ethernet port (or even better,
> > have iSCSI on a separate vlan and remove the port from that vlan) via an
> > ssh script or maybe snmp or whatever.
> > enjoy,
>
> Another option is fencing via power device fe. fence_apc, fence_apc_snmp
> but You would need tu but APC hardware. Fenceing via fence_ilo,
> fence_rsa. fence_ipmilan is the option if You would have IBM, Dell or HP
> servers. You could also try fence_scsi without any costs, but it doesn't
> works if You had multipath configuration.
>
I second that: fence_scsi should work pretty well if your target supports
SCSI-3 persistent reservations. It does not make much sense to use multipath
I/O  for iSCSI since channel bonding provides the same functionality
nowadays.

Also, if you have 2-node cluster then you can configure quorum disk on iSCSI
volume as a tiebreaker .

-Alex
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080404/45072d1b/attachment.htm>

From jgarrity at qualcomm.com  Fri Apr  4 19:03:28 2008
From: jgarrity at qualcomm.com (John Garrity)
Date: Fri, 04 Apr 2008 12:03:28 -0700
Subject: [Linux-cluster] iptables rules for LVS-DR cluster
In-Reply-To: <68a019b50804041029n37f52dc8od34c597a23b928c3@mail.gmail.co
 m>
References: <BAY123-W3F23BA8A504F52CDB20EAD4F70@phx.gbl>
	<1207298460.15409.39.camel@admc.win-rar.local>
	<47F602E5.9040306@artegence.com>
	<68a019b50804041029n37f52dc8od34c597a23b928c3@mail.gmail.com>
Message-ID: <200804041903.m34J3UQX017586@msgtransport04.qualcomm.com>

I'm trying to get ftp working in a LVS DR cluster. I think it's the iptables rules that might be giving me a problem. I have http services working well. Can someone who has ftp working share their ip tables rules? I'm new at this so please go easy on me. Thanks! 



From sghosh at redhat.com  Fri Apr  4 19:06:30 2008
From: sghosh at redhat.com (Subhendu Ghosh)
Date: Fri, 04 Apr 2008 15:06:30 -0400
Subject: [Linux-cluster] Nagios check
In-Reply-To: <1207326172.15409.45.camel@admc.win-rar.local>
References: <1207318543.15409.42.camel@admc.win-rar.local>	<9F633DE6C0E04F4691DCB713AC44C94B066B56C7@EXCHANGE.SHSU.EDU>
	<1207326172.15409.45.camel@admc.win-rar.local>
Message-ID: <47F67C36.20502@redhat.com>

If you would like this in the standard plugins distribution, let me know. 
There is a lot of back end work happening with the plugins.

-regards
Subhendu

jr wrote:
> good idea!
> if only i wouldn't run rhel5 x86_64 (centos in this case) which still
> maintains a bug of snmpd that causes it to lock up and stay in an
> infinite loop with 99% cpu usage :/
> 
> regards,
> johannes
> 
> 
> Am Freitag, den 04.04.2008, 10:26 -0500 schrieb MARTI, ROBERT JESSE:
>> IIRC, theres a cluster snmp package - I would see what I can pull from there.
>>
>> Rob Marti
>> Sam Houston State University
>> Systems Analyst II
>> 936-294-3804 // rjm002 at shsu.edu
>>
>>
>>
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com on behalf of jr
>> Sent: Fri 4/4/2008 09:15
>> To: linux clustering
>> Subject: [Linux-cluster] Nagios check
>>  
>> Hi Everybody,
>> i wonder if i'm the first with the need to check the status of GFS /
>> cman with nagios.
>> Did anyone maybe already write a check script i did not find yet?
>> i found one via google, but it basically just did an ls -l on the GFS
>> share, and that seems to be a little bit too less for monitoring..
>> thanks in advance,
>> regards,
>> johannes
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 


-- 
Red Hat Summit Boston |  June 18-20, 2008
Learn more: http://www.redhat.com/summit

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sghosh.vcf
Type: text/x-vcard
Size: 266 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080404/b5bd488f/attachment.vcf>

From tiagocruz at forumgdh.net  Fri Apr  4 19:04:49 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Fri, 04 Apr 2008 16:04:49 -0300
Subject: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?
In-Reply-To: <OFC66CCC8D.582C4145-ON87257421.0056E373-87257421.0056DF74@us.ibm.com>
References: <OFC66CCC8D.582C4145-ON87257421.0056E373-87257421.0056DF74@us.ibm.com>
Message-ID: <1207335889.19045.37.camel@tuxkiller.ig.com.br>

Hum.... so, how can I increase/grow one filesystem with GFS? :-)

Its stable and secure does this?

Many thanks

On Fri, 2008-04-04 at 09:50 -0600, Gary Romo wrote:
> 
> You cannot shrink/reduce the size of a GFS file system. 
> 
> Gary Romo
> IBM Global Technology Services
> 303.458.4415
> Email: garromo at us.ibm.com
> Pager:1.877.552.9264
> Text message: gromo at skytel.com 
> 
> 
> Tiago Cruz
> <tiagocruz at forumgdh.net> 
> Sent by:
> linux-cluster-bounces at redhat.com 
> 
> 04/04/2008 08:46 AM 
>          Please respond to
>          linux clustering
>     <linux-cluster at redhat.com>
> 
> 
> 
> 
>                To
> linux clustering
> <linux-cluster at redhat.com> 
>                cc
> 
>           Subject
> [Linux-cluster]
> Can I shrink/grow
> one GFS{12}
> FileSystem?
> 
> 
> 
> 
> 
> 
> 
> 
> If I'm using one LVM device, export it using GNDB, and format using
> GFS
> (1 or 2).... I will can add or remove some GB on this filesystem?
> 
> Thanks
> -- 
> Tiago Cruz
> http://everlinux.com
> Linux User #282636
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-- 
Tiago Cruz
http://everlinux.com
Linux User #282636




From Derek.Anderson at compellent.com  Fri Apr  4 19:22:56 2008
From: Derek.Anderson at compellent.com (Derek Anderson)
Date: Fri, 4 Apr 2008 14:22:56 -0500
Subject: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?
In-Reply-To: <1207335889.19045.37.camel@tuxkiller.ig.com.br>
References: <OFC66CCC8D.582C4145-ON87257421.0056E373-87257421.0056DF74@us.ibm.com>
	<1207335889.19045.37.camel@tuxkiller.ig.com.br>
Message-ID: <99E0F1976E2DA2499F3E6EB18B25F036040C7974@honeywheat.Beer.Town>

Tiago,

You can grow the filesystem with gfs_grow, once the underlying device
has been expanded.  See gfs_grow(8) for more information.

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Tiago Cruz
Sent: Friday, April 04, 2008 2:05 PM
To: linux clustering
Cc: linux-cluster-bounces at redhat.com
Subject: Re: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?

Hum.... so, how can I increase/grow one filesystem with GFS? :-)

Its stable and secure does this?

Many thanks

On Fri, 2008-04-04 at 09:50 -0600, Gary Romo wrote:
> 
> You cannot shrink/reduce the size of a GFS file system. 
> 
> Gary Romo
> IBM Global Technology Services
> 303.458.4415
> Email: garromo at us.ibm.com
> Pager:1.877.552.9264
> Text message: gromo at skytel.com 
> 
> 
> Tiago Cruz
> <tiagocruz at forumgdh.net> 
> Sent by:
> linux-cluster-bounces at redhat.com 
> 
> 04/04/2008 08:46 AM 
>          Please respond to
>          linux clustering
>     <linux-cluster at redhat.com>
> 
> 
> 
> 
>                To
> linux clustering
> <linux-cluster at redhat.com> 
>                cc
> 
>           Subject
> [Linux-cluster]
> Can I shrink/grow
> one GFS{12}
> FileSystem?
> 
> 
> 
> 
> 
> 
> 
> 
> If I'm using one LVM device, export it using GNDB, and format using
> GFS
> (1 or 2).... I will can add or remove some GB on this filesystem?
> 
> Thanks
> -- 
> Tiago Cruz
> http://everlinux.com
> Linux User #282636
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-- 
Tiago Cruz
http://everlinux.com
Linux User #282636


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From chawkins at veracitynetworks.com  Fri Apr  4 19:31:45 2008
From: chawkins at veracitynetworks.com (Christopher Hawkins)
Date: Fri, 4 Apr 2008 15:31:45 -0400
Subject: [Linux-cluster] iptables rules for LVS-DR cluster
In-Reply-To: <200804041903.m34J3UQX017586@msgtransport04.qualcomm.com>
Message-ID: <200804041931.m34JVYad009708@mail2.ontariocreditcorp.com>

Never had to load balance it myself, but have heard of FTP over LVS issues
due to lack of persistence (make sure it's on) and due to port 21 and 20
getting sent to different servers. The solution was to remove port 20 from
LVS. With LVS NAT there is a special FTP module you can load, but it should
not be required in LVS DR. Or are you sure the issue is iptables?

Also I would suggest the LVS mailing list if someone here can't solve this
quickly.  ;-) 

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of John Garrity
Sent: Friday, April 04, 2008 3:03 PM
To: linux clustering
Subject: [Linux-cluster] iptables rules for LVS-DR cluster

I'm trying to get ftp working in a LVS DR cluster. I think it's the iptables
rules that might be giving me a problem. I have http services working well.
Can someone who has ftp working share their ip tables rules? I'm new at this
so please go easy on me. Thanks! 

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From garromo at us.ibm.com  Fri Apr  4 19:35:42 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Fri, 4 Apr 2008 13:35:42 -0600
Subject: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?
In-Reply-To: <1207335889.19045.37.camel@tuxkiller.ig.com.br>
Message-ID: <OF109EFE3E.D6A78173-ON87257421.006B7BE8-87257421.006B7F37@us.ibm.com>

extend the VG:  vgextend
extend the LV: lvextend

Grow the GFS file system:  gfs_grow -v device

Consider if space is also needed for additional journals.


Gary Romo
IBM Global Technology Services
303.458.4415
Email: garromo at us.ibm.com
Pager:1.877.552.9264
Text message: gromo at skytel.com



Tiago Cruz <tiagocruz at forumgdh.net> 
Sent by: linux-cluster-bounces at redhat.com
04/04/2008 01:04 PM
Please respond to
linux clustering <linux-cluster at redhat.com>


To
linux clustering <linux-cluster at redhat.com>
cc
linux-cluster-bounces at redhat.com
Subject
Re: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?






Hum.... so, how can I increase/grow one filesystem with GFS? :-)

Its stable and secure does this?

Many thanks

On Fri, 2008-04-04 at 09:50 -0600, Gary Romo wrote:
> 
> You cannot shrink/reduce the size of a GFS file system. 
> 
> Gary Romo
> IBM Global Technology Services
> 303.458.4415
> Email: garromo at us.ibm.com
> Pager:1.877.552.9264
> Text message: gromo at skytel.com 
> 
> 
> Tiago Cruz
> <tiagocruz at forumgdh.net> 
> Sent by:
> linux-cluster-bounces at redhat.com 
> 
> 04/04/2008 08:46 AM 
>          Please respond to
>          linux clustering
>     <linux-cluster at redhat.com>
> 
> 
> 
> 
>                To
> linux clustering
> <linux-cluster at redhat.com> 
>                cc
> 
>           Subject
> [Linux-cluster]
> Can I shrink/grow
> one GFS{12}
> FileSystem?
> 
> 
> 
> 
> 
> 
> 
> 
> If I'm using one LVM device, export it using GNDB, and format using
> GFS
> (1 or 2).... I will can add or remove some GB on this filesystem?
> 
> Thanks
> -- 
> Tiago Cruz
> http://everlinux.com
> Linux User #282636
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-- 
Tiago Cruz
http://everlinux.com
Linux User #282636


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080404/29bb11cd/attachment.htm>

From tiagocruz at forumgdh.net  Fri Apr  4 19:59:15 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Fri, 04 Apr 2008 16:59:15 -0300
Subject: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?
In-Reply-To: <99E0F1976E2DA2499F3E6EB18B25F036040C7974@honeywheat.Beer.Town>
References: <OFC66CCC8D.582C4145-ON87257421.0056E373-87257421.0056DF74@us.ibm.com>
	<1207335889.19045.37.camel@tuxkiller.ig.com.br>
	<99E0F1976E2DA2499F3E6EB18B25F036040C7974@honeywheat.Beer.Town>
Message-ID: <1207339155.19045.48.camel@tuxkiller.ig.com.br>

?Anderson and ?Gary,

Many thanks for your attention!

I've did one test here, and works perfectly!

I just have one problem, because I need to restart my gnbd_serv, and
re-export the GNDB device for nodes, because I've got this error
messages:

Apr  4 16:40:15 xen-7 gnbd_serv[23817]: ERROR size of the exported file /dev/Vol_LVM/mycluster has changed, aborting 
Apr  4 16:40:15 xen-7 gnbd_serv[23817]: server process 2956 exited because of signal 11 
Apr  4 16:40:15 xen-7 kernel: gnbd_serv[2956]: segfault at 000000000000000c rip 0000000000405ab0 rsp 00007fff36dcb450 error 4
Apr  4 16:41:10 xen-7 gnbd_serv[2970]: startup succeeded 
Apr  4 16:41:17 xen-7 gnbd_serv[2970]: got local command 0x1 
Apr  4 16:41:17 xen-7 gnbd_serv[2970]: gnbd device 'cluster' serving /dev/Vol_LVM/mycluster exported with 41943040 sectors 


But I did another test, and this time I've just "restart" the export
using:
# gnbd_export -R -O
# gnbd_export -c -d /dev/Vol_LVM/mycluster -e cluster

And sounds like fine on the nodes... but I don't know if this process
it's recommend, or if this force (-O  Force unexport) can be dangerous
for the filesystem...

Thanks


On Fri, 2008-04-04 at 14:22 -0500, Derek Anderson wrote:

> You can grow the filesystem with gfs_grow, once the underlying device
> has been expanded.  See gfs_grow(8) for more information.




From Derek.Anderson at compellent.com  Fri Apr  4 20:36:13 2008
From: Derek.Anderson at compellent.com (Derek Anderson)
Date: Fri, 4 Apr 2008 15:36:13 -0500
Subject: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?
In-Reply-To: <1207339155.19045.48.camel@tuxkiller.ig.com.br>
References: <OFC66CCC8D.582C4145-ON87257421.0056E373-87257421.0056DF74@us.ibm.com><1207335889.19045.37.camel@tuxkiller.ig.com.br><99E0F1976E2DA2499F3E6EB18B25F036040C7974@honeywheat.Beer.Town>
	<1207339155.19045.48.camel@tuxkiller.ig.com.br>
Message-ID: <99E0F1976E2DA2499F3E6EB18B25F036040C7AFD@honeywheat.Beer.Town>

Sorry, I missed the part about you using GNBD.

It's been awhile, and I haven't tested it, but I think the safest procedure might be:
1. Unmount from gnbd clients.
2. Un-import from gnbd clients.
3. Un-export gnbd devices from the server _without_ the Override option.
4. Extend the VG and LV.
5. Re-export the gnbd device from the server.
6. Re-import gnbd devices from clients.
7. Re-mount gnbd devices on clients.
8. Run gfs_grow from one client.

Hopefully, gnbd_serv won't complain about the device size changing underneath it if that device is not currently exported.  If  it does you will probably need to stop and restart it around step 4.

Overriding the unexport of gnbd devices can be hazardous.  See the warning in gnbd_export(8).

Good luck:
- Derek

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Tiago Cruz
Sent: Friday, April 04, 2008 2:59 PM
To: linux clustering
Subject: RE: [Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?

Anderson and ?Gary,

Many thanks for your attention!

I've did one test here, and works perfectly!

I just have one problem, because I need to restart my gnbd_serv, and
re-export the GNDB device for nodes, because I've got this error
messages:

Apr  4 16:40:15 xen-7 gnbd_serv[23817]: ERROR size of the exported file /dev/Vol_LVM/mycluster has changed, aborting 
Apr  4 16:40:15 xen-7 gnbd_serv[23817]: server process 2956 exited because of signal 11 
Apr  4 16:40:15 xen-7 kernel: gnbd_serv[2956]: segfault at 000000000000000c rip 0000000000405ab0 rsp 00007fff36dcb450 error 4
Apr  4 16:41:10 xen-7 gnbd_serv[2970]: startup succeeded 
Apr  4 16:41:17 xen-7 gnbd_serv[2970]: got local command 0x1 
Apr  4 16:41:17 xen-7 gnbd_serv[2970]: gnbd device 'cluster' serving /dev/Vol_LVM/mycluster exported with 41943040 sectors 


But I did another test, and this time I've just "restart" the export
using:
# gnbd_export -R -O
# gnbd_export -c -d /dev/Vol_LVM/mycluster -e cluster

And sounds like fine on the nodes... but I don't know if this process
it's recommend, or if this force (-O  Force unexport) can be dangerous
for the filesystem...

Thanks


On Fri, 2008-04-04 at 14:22 -0500, Derek Anderson wrote:

> You can grow the filesystem with gfs_grow, once the underlying device
> has been expanded.  See gfs_grow(8) for more information.


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From linux-cluster at merctech.com  Fri Apr  4 22:41:56 2008
From: linux-cluster at merctech.com (linux-cluster at merctech.com)
Date: Fri, 04 Apr 2008 18:41:56 -0400
Subject: [Linux-cluster] anyone modified fence_mcdata to use ssh instead of
	telnet?
Message-ID: <29376.1207348916@localhost>


Telnet is fundamentally insecure. We've known this for about 20 years. Finally,
network switches, fibre switches, appliances, etc., have begun to recognize this
truth. For example, the McData fibre switches give you the choice of telnet
(evil) or ssh (good). Note that this is a choice between them...you cannot have
both protocols enabled at once (at least not with the switch hardware and
firmware rev I'm using).


So, like a good sysadmin, I enable ssh on my McData Sphereon 4400. I can ssh
into the switch and configure it via the command line. Happiness. Unfortunately,
the fence_mcdata script assumes that the only way to connect to the switch is
via (evil) telnet.


Before I start hacking the fence_mcdata script...has anyone already modified 
this to make it more secure? If not, this would be a simple product 
enhancement (hint, hint).

Thanks,

Mark




From johannes.russek at io-consulting.net  Sat Apr  5 00:20:38 2008
From: johannes.russek at io-consulting.net (Johannes Russek)
Date: Sat, 05 Apr 2008 02:20:38 +0200
Subject: [Linux-cluster] iptables rules for LVS-DR cluster
In-Reply-To: <200804041931.m34JVYad009708@mail2.ontariocreditcorp.com>
References: <200804041931.m34JVYad009708@mail2.ontariocreditcorp.com>
Message-ID: <47F6C5D6.8030502@io-consulting.net>

we use this together with firewall mark rule in lvs-DR (piranha) and 
scheduler "rr" and persistent = 20:

-A PREROUTING -d $VIP-i eth0 -p tcp -m tcp --dport 10000:20000 -j MARK 
--set-mark 0x14
-A PREROUTING -d $VIP -i eth0 -p tcp -m tcp --dport 20 -j MARK 
--set-mark 0x14
-A PREROUTING -d $VIP -i eth0 -p tcp -m tcp --dport 21 -j MARK 
--set-mark 0x14

also vsftpd.conf is configured with

pasv_min_port=10000
pasv_max_port=20000

hope this helps?
regards,
johannes

p.s.: of course the main firewall has to open the appropiate ports as well

Christopher Hawkins schrieb:
> Never had to load balance it myself, but have heard of FTP over LVS issues
> due to lack of persistence (make sure it's on) and due to port 21 and 20
> getting sent to different servers. The solution was to remove port 20 from
> LVS. With LVS NAT there is a special FTP module you can load, but it should
> not be required in LVS DR. Or are you sure the issue is iptables?
>
> Also I would suggest the LVS mailing list if someone here can't solve this
> quickly.  ;-) 
>
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of John Garrity
> Sent: Friday, April 04, 2008 3:03 PM
> To: linux clustering
> Subject: [Linux-cluster] iptables rules for LVS-DR cluster
>
> I'm trying to get ftp working in a LVS DR cluster. I think it's the iptables
> rules that might be giving me a problem. I have http services working well.
> Can someone who has ftp working share their ip tables rules? I'm new at this
> so please go easy on me. Thanks! 
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>   



From jan.gerrit at kootstra.org.uk  Sat Apr  5 07:21:20 2008
From: jan.gerrit at kootstra.org.uk (Jan Gerrit Kootstra)
Date: Sat, 05 Apr 2008 09:21:20 +0200
Subject: [Linux-cluster] Linux-cluster] Re: Can I shrink/grow one GFS{12}
	FileSystem?
Message-ID: <47F72870.8000308@kootstra.org.uk>

You cannot shrink/reduce the size of a GFS file system.

Gary Romo
IBM Global Technology Services
303.458.4415
Email: garromo at us.ibm.com
Pager:1.877.552.9264
Text message: gromo at skytel.com


Tiago Cruz <tiagocruz at forumgdh.net>
Sent by: linux-cluster-bounces at redhat.com

04/04/2008 08:46 AM
Please respond to
linux clustering <linux-cluster at redhat.com>


	
To
	linux clustering <linux-cluster at redhat.com>
cc
	
Subject
	[Linux-cluster] Can I shrink/grow one GFS{12} FileSystem?



	





If I'm using one LVM device, export it using GNDB, and format using GFS
(1 or 2).... I will can add or remove some GB on this filesystem?

Thanks
-- 
Tiago Cruz
http://everlinux.com
Linux User #282636


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



Gary,


You are right about reducing/shrinking a GFS filesystem.

Tiago also asked about expanding/growing, for GFS(2) this can be done 
with gfs(2)_grow. Both commands run only on mounted file systems.


Kind regards,


Jan Gerrit Kootstra

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080405/f0c69cac/attachment.htm>

From jgarrity at qualcomm.com  Sat Apr  5 18:04:33 2008
From: jgarrity at qualcomm.com (John Garrity)
Date: Sat, 05 Apr 2008 11:04:33 -0700
Subject: [Linux-cluster] iptables rules for LVS-DR cluster
In-Reply-To: <47F6C5D6.8030502@io-consulting.net>
References: <200804041931.m34JVYad009708@mail2.ontariocreditcorp.com>
	<47F6C5D6.8030502@io-consulting.net>
Message-ID: <200804051804.m35I4XDs019407@hamtaro.qualcomm.com>

Question: how did you set the scheduler to "n"? 

I don't see a choice for "none" in Piranha and I tried manually editing /etc/sysconfig/ha/lvs.cf with no luck. Even when I commented out the scheduler field it seems to default to wlc. 

Basically, I'm not sure that it's my iptables rules that are giving me a problem. Maybe it's what Christopher mentions below? How would I remove port 20 from LVS? 

I tried using a firewall mark of 20 and have Piranha configured to use 21 as the application port. I can ftp to the real servers using their real IPs but ftps to the VIP fail with the error on the ftp client "An existing connection was forcibly closed by the remote host."

Persistence is set to 20

Here are the iptables rules I'm using

# service iptables status
Table: mangle
Chain PREROUTING (policy ACCEPT)
num  target     prot opt source               destination         
1    MARK       tcp  --  0.0.0.0/0            VIP         tcp dpts:10000:20000 MARK set 0x14 
2    MARK       tcp  --  0.0.0.0/0            VIP         tcp dpt:20 MARK set 0x14 
3    MARK       tcp  --  0.0.0.0/0            VIP         tcp dpt:21 MARK set 0x14 

Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
num  target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
num  target     prot opt source               destination         

Chain POSTROUTING (policy ACCEPT)
num  target     prot opt source               destination         

Table: filter
Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination         
1    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0             tcp spts:1:65535 dpts:1:65535 

Chain FORWARD (policy ACCEPT)
num  target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
num  target     prot opt source               destination 

At 05:20 PM 4/4/2008, Johannes Russek wrote:
>we use this together with firewall mark rule in lvs-DR (piranha) and scheduler "rr" and persistent = 20:
>
>-A PREROUTING -d $VIP-i eth0 -p tcp -m tcp --dport 10000:20000 -j MARK --set-mark 0x14
>-A PREROUTING -d $VIP -i eth0 -p tcp -m tcp --dport 20 -j MARK --set-mark 0x14
>-A PREROUTING -d $VIP -i eth0 -p tcp -m tcp --dport 21 -j MARK --set-mark 0x14
>
>also vsftpd.conf is configured with
>
>pasv_min_port=10000
>pasv_max_port=20000
>
>hope this helps?
>regards,
>johannes
>
>p.s.: of course the main firewall has to open the appropiate ports as well
>
>Christopher Hawkins schrieb:
>>Never had to load balance it myself, but have heard of FTP over LVS issues
>>due to lack of persistence (make sure it's on) and due to port 21 and 20
>>getting sent to different servers. The solution was to remove port 20 from
>>LVS. With LVS NAT there is a special FTP module you can load, but it should
>>not be required in LVS DR. Or are you sure the issue is iptables?
>>
>>Also I would suggest the LVS mailing list if someone here can't solve this
>>quickly.  ;-) 
>>-----Original Message-----
>>From: linux-cluster-bounces at redhat.com
>>[mailto:linux-cluster-bounces at redhat.com] On Behalf Of John Garrity
>>Sent: Friday, April 04, 2008 3:03 PM
>>To: linux clustering
>>Subject: [Linux-cluster] iptables rules for LVS-DR cluster
>>
>>I'm trying to get ftp working in a LVS DR cluster. I think it's the iptables
>>rules that might be giving me a problem. I have http services working well.
>>Can someone who has ftp working share their ip tables rules? I'm new at this
>>so please go easy on me. Thanks! 
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>https://www.redhat.com/mailman/listinfo/linux-cluster
>>  
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>https://www.redhat.com/mailman/listinfo/linux-cluster



From johannes.russek at io-consulting.net  Sun Apr  6 01:11:58 2008
From: johannes.russek at io-consulting.net (Johannes Russek)
Date: Sun, 06 Apr 2008 03:11:58 +0200
Subject: [Linux-cluster] iptables rules for LVS-DR cluster
In-Reply-To: <200804051804.m35I4XDs019407@hamtaro.qualcomm.com>
References: <200804041931.m34JVYad009708@mail2.ontariocreditcorp.com>	<47F6C5D6.8030502@io-consulting.net>
	<200804051804.m35I4XDs019407@hamtaro.qualcomm.com>
Message-ID: <47F8235E.4000509@io-consulting.net>

John Garrity schrieb:
> Question: how did you set the scheduler to "n"? 
>   

i didn't.
it's "rr", double-R for round-robin.

> I don't see a choice for "none" in Piranha and I tried manually editing /etc/sysconfig/ha/lvs.cf with no luck. Even when I commented out the scheduler field it seems to default to wlc. 
>
> Basically, I'm not sure that it's my iptables rules that are giving me a problem. Maybe it's what Christopher mentions below? How would I remove port 20 from LVS? 
>   

i don't think you have to do that with persistency. as i said, it works 
pretty good here.
without much knowledge about your network, i would say it's an issue 
with the direct routing setup. i would suggest digging a little deeper 
into your network setup and checking tcpdump for the reason of the 
connection reset. (stateful filtering at the wrong point in the setup 
comes to mind).
maybe you should ask at that LVS mailing list for help!
good luck.
johannes

> I tried using a firewall mark of 20 and have Piranha configured to use 21 as the application port. I can ftp to the real servers using their real IPs but ftps to the VIP fail with the error on the ftp client "An existing connection was forcibly closed by the remote host."
>
> Persistence is set to 20
>
>   



From jgarrity at qualcomm.com  Sun Apr  6 21:08:26 2008
From: jgarrity at qualcomm.com (John Garrity)
Date: Sun, 06 Apr 2008 14:08:26 -0700
Subject: [Linux-cluster] iptables rules for LVS-DR cluster
In-Reply-To: <47F8235E.4000509@io-consulting.net>
References: <200804041931.m34JVYad009708@mail2.ontariocreditcorp.com>
	<47F6C5D6.8030502@io-consulting.net>
	<200804051804.m35I4XDs019407@hamtaro.qualcomm.com>
	<47F8235E.4000509@io-consulting.net>
Message-ID: <200804062108.m36L8RJZ010147@hamtaro.qualcomm.com>

At 06:11 PM 4/5/2008, you wrote:
it's "rr", double-R for round-robin.

d'oh, that's what i get for not wearing my glasses! 

i don't think you have to do that with persistency. as i said, it works pretty good here.
>without much knowledge about your network, i would say it's an issue with the direct routing setup. i would suggest digging a little deeper into your network setup and checking tcpdump for the reason of the connection reset. (stateful filtering at the wrong point in the setup comes to mind).

yeah, the output from ipvsadm is good for http

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  l423-lvs.qualcomm.com:http rr
  -> l423-cn1.qualcomm.com:http   Route   1      0          0         
  -> l423-cn2.qualcomm.com:http   Route   2      0          0         
FWM  20 rr persistent 20

but no good for ftp

[root at l423-lb1 ~]# ipvsadm
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
FWM  20 rr persistent 20

I signed up for the LVS mail list and will probably post there next week if I can't make any more progress on my own...



>maybe you should ask at that LVS mailing list for help!
>good luck.
>johannes
>
>>I tried using a firewall mark of 20 and have Piranha configured to use 21 as the application port. I can ftp to the real servers using their real IPs but ftps to the VIP fail with the error on the ftp client "An existing connection was forcibly closed by the remote host."
>>
>>Persistence is set to 20
>>
>>  
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>https://www.redhat.com/mailman/listinfo/linux-cluster



From jakub.suchy at enlogit.cz  Sun Apr  6 21:12:39 2008
From: jakub.suchy at enlogit.cz (Jakub Suchy)
Date: Sun, 6 Apr 2008 23:12:39 +0200
Subject: [Linux-cluster] Virtual service without GFS
Message-ID: <20080406211239.GC32651@localhost>

Hi,
is it possible to run a virtual service on a cluster (XEN host) without
using GFS? I know I can create an ext3 partition, but it is not possible
to add a resource to virtual service, so I can't join ext3 to it.

Thanks,
Jakub Suchy



From maciej.bogucki at artegence.com  Mon Apr  7 08:23:06 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Mon, 07 Apr 2008 10:23:06 +0200
Subject: [Linux-cluster] Nagios check
In-Reply-To: <1207318543.15409.42.camel@admc.win-rar.local>
References: <1207318543.15409.42.camel@admc.win-rar.local>
Message-ID: <47F9D9EA.50006@artegence.com>

jr napisa?(a):
> Hi Everybody,
> i wonder if i'm the first with the need to check the status of GFS /
> cman with nagios.
> Did anyone maybe already write a check script i did not find yet?
> i found one via google, but it basically just did an ls -l on the GFS
> share, and that seems to be a little bit too less for monitoring..
> thanks in advance,

Here [1] is some tool to monitoring GFS. And below You have my own script.

---cut---
#!/bin/bash

ok() {
	echo "OK - $*"; exit 0
}
warning() {
	echo "WARNING - $*"; exit 1
}
critical() {
	echo "CRITICAL - $*"; exit 2
}
unknown() {
	echo "UNKNOWN - $*"; exit 3
}

procfsf=/proc/cluster/services

if [ ! -f $procfsf ] ; then
	critical "RHCS not running"
fi

procfss=$(cat /proc/cluster/services)
check_clvmd=$(echo "$procfss"|grep "^DLM Lock Space"|grep "clvmd"|head
-1|awk '{print $7}')
check_dlm=$(echo "$procfss"|grep "^DLM Lock Space"|grep -v "clvmd"|head
-1|awk '{print $7}')
check_fenced=$(echo "$procfss"|grep "^Fence Domain"|head -1|awk '{print
$6}')
check_gfs=$(echo "$procfss"|grep "^GFS Mount Group"|head -1|awk '{print
$7}')

if [ -z "$check_clvmd" ] ; then
	critical "CLVM not running"
fi

if [ -z "$check_dlm" ] ; then
	critical "DLM not running"
fi

if [ -z "$check_fenced" ] ; then
	critical "FENCED not running"
fi

if [ -z "$check_gfs" ] ; then
	critical "GFS not running"
fi

if [ "$check_clvmd" != "run" ] ; then
	warning "CLVM in state $check_clvmd"
fi

if [ "$check_dlm" != "run" ] ; then
	warning "DLM in state $check_dlm"
fi

if [ "$check_fenced" != "run" ] ; then
	warning "FENCED in state $check_fenced"
fi

if [ "$check_gfs" != "run" ] ; then
	warning "GFS in state $check_gfs"
fi

gfs_res=$(echo "$procfss"|grep "^GFS Mount Group"|awk '{print $4}'|xargs
echo)

if [ -z "$gfs_res" ] ; then
	critical "RHCS is running without any active resources"
fi

ok "RHCS is running ($gfs_res)"

---cut---

[1] -
http://www.nagiosexchange.org/cgi-bin/page.cgi?g=Detailed%2F2442.html;d=1


Best Regards
Maciej Bogucki



From johannes.russek at io-consulting.net  Mon Apr  7 09:32:25 2008
From: johannes.russek at io-consulting.net (jr)
Date: Mon, 07 Apr 2008 11:32:25 +0200
Subject: [Linux-cluster] Nagios check
In-Reply-To: <47F9D9EA.50006@artegence.com>
References: <1207318543.15409.42.camel@admc.win-rar.local>
	<47F9D9EA.50006@artegence.com>
Message-ID: <1207560745.15409.54.camel@admc.win-rar.local>


Hi Maciej,
thanks a lot for your script. If you look at [1], i'm the guy that had
commented on that ;)
I don't seem to have /proc/cluster? is that a rhel4 specific thing
maybe? or do i need to load something first?

johannes


Am Montag, den 07.04.2008, 10:23 +0200 schrieb Maciej Bogucki:
> jr napisa?(a):
> > Hi Everybody,
> > i wonder if i'm the first with the need to check the status of GFS /
> > cman with nagios.
> > Did anyone maybe already write a check script i did not find yet?
> > i found one via google, but it basically just did an ls -l on the GFS
> > share, and that seems to be a little bit too less for monitoring..
> > thanks in advance,
> 
> Here [1] is some tool to monitoring GFS. And below You have my own script.
> 
> ---cut---
> #!/bin/bash
> 
> ok() {
> 	echo "OK - $*"; exit 0
> }
> warning() {
> 	echo "WARNING - $*"; exit 1
> }
> critical() {
> 	echo "CRITICAL - $*"; exit 2
> }
> unknown() {
> 	echo "UNKNOWN - $*"; exit 3
> }
> 
> procfsf=/proc/cluster/services
> 
> if [ ! -f $procfsf ] ; then
> 	critical "RHCS not running"
> fi
> 
> procfss=$(cat /proc/cluster/services)
> check_clvmd=$(echo "$procfss"|grep "^DLM Lock Space"|grep "clvmd"|head
> -1|awk '{print $7}')
> check_dlm=$(echo "$procfss"|grep "^DLM Lock Space"|grep -v "clvmd"|head
> -1|awk '{print $7}')
> check_fenced=$(echo "$procfss"|grep "^Fence Domain"|head -1|awk '{print
> $6}')
> check_gfs=$(echo "$procfss"|grep "^GFS Mount Group"|head -1|awk '{print
> $7}')
> 
> if [ -z "$check_clvmd" ] ; then
> 	critical "CLVM not running"
> fi
> 
> if [ -z "$check_dlm" ] ; then
> 	critical "DLM not running"
> fi
> 
> if [ -z "$check_fenced" ] ; then
> 	critical "FENCED not running"
> fi
> 
> if [ -z "$check_gfs" ] ; then
> 	critical "GFS not running"
> fi
> 
> if [ "$check_clvmd" != "run" ] ; then
> 	warning "CLVM in state $check_clvmd"
> fi
> 
> if [ "$check_dlm" != "run" ] ; then
> 	warning "DLM in state $check_dlm"
> fi
> 
> if [ "$check_fenced" != "run" ] ; then
> 	warning "FENCED in state $check_fenced"
> fi
> 
> if [ "$check_gfs" != "run" ] ; then
> 	warning "GFS in state $check_gfs"
> fi
> 
> gfs_res=$(echo "$procfss"|grep "^GFS Mount Group"|awk '{print $4}'|xargs
> echo)
> 
> if [ -z "$gfs_res" ] ; then
> 	critical "RHCS is running without any active resources"
> fi
> 
> ok "RHCS is running ($gfs_res)"
> 
> ---cut---
> 
> [1] -
> http://www.nagiosexchange.org/cgi-bin/page.cgi?g=Detailed%2F2442.html;d=1
> 
> 
> Best Regards
> Maciej Bogucki
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From maciej.bogucki at artegence.com  Mon Apr  7 09:49:41 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Mon, 07 Apr 2008 11:49:41 +0200
Subject: [Linux-cluster] Nagios check
In-Reply-To: <1207560745.15409.54.camel@admc.win-rar.local>
References: <1207318543.15409.42.camel@admc.win-rar.local>	<47F9D9EA.50006@artegence.com>
	<1207560745.15409.54.camel@admc.win-rar.local>
Message-ID: <47F9EE35.5030204@artegence.com>

jr napisa?(a):
> Hi Maciej,
> thanks a lot for your script. If you look at [1], i'm the guy that had
> commented on that ;)
:)))

> I don't seem to have /proc/cluster? is that a rhel4 specific thing
> maybe? or do i need to load something first?

Yes, it is for rhel4. I don't have any rhel5 with GFS, so I can't help You.

Best Regards
Maciej Bogucki



From maciej.bogucki at artegence.com  Mon Apr  7 10:01:09 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Mon, 07 Apr 2008 12:01:09 +0200
Subject: [Linux-cluster] Nagios check
In-Reply-To: <47F9EE35.5030204@artegence.com>
References: <1207318543.15409.42.camel@admc.win-rar.local>	<47F9D9EA.50006@artegence.com>	<1207560745.15409.54.camel@admc.win-rar.local>
	<47F9EE35.5030204@artegence.com>
Message-ID: <47F9F0E5.9010308@artegence.com>

>> thanks a lot for your script. If you look at [1], i'm the guy that had
>> commented on that ;)
> :)))
> 
>> I don't seem to have /proc/cluster? is that a rhel4 specific thing
>> maybe? or do i need to load something first?
> 
> Yes, it is for rhel4. I don't have any rhel5 with GFS, so I can't help You.

Hello,

Here is the patch, now It should work with rhel5.
--- check_rhcs.original        2006-12-22 13:47:36.000000000 +0100
+++ check_rhcs    2008-04-07 11:59:23.000000000 +0200
@@ -27,13 +27,13 @@
        echo "UNKNOWN - $*"; exit 3
 }

-procfsf=/proc/cluster/services
+#procfsf="cman_tool services"

 if [ ! -f $procfsf ] ; then
        critical "RHCS not running"
 fi

-procfss=$(cat /proc/cluster/services)
+procfss=`cman_tool services`
 check_clvmd=$(echo "$procfss"|grep "^DLM Lock Space"|grep "clvmd"|head
-1|awk '{print $7}')
 check_dlm=$(echo "$procfss"|grep "^DLM Lock Space"|grep -v "clvmd"|head
-1|awk '{print $7}')
 check_fenced=$(echo "$procfss"|grep "^Fence Domain"|head -1|awk '{print
$6}')


Best Regards
Maciej Bogucki



From teemu.m2 at luukku.com  Mon Apr  7 10:29:11 2008
From: teemu.m2 at luukku.com (m.. mm..)
Date: Mon, 7 Apr 2008 13:29:11 +0300 (EEST)
Subject: [Linux-cluster] SCSI-fence configure
Message-ID: <1207564151655.teemu.m2.27811.rpZq244jT_oNCRlQjmOPCA@luukku.com>

Hi Ryan or somebody else..

I have one question about your documentation about RedHat and scsi_reservation fence what you have write.

About this Storage Requimenents:
You write like this. "all shared storage must use LVM2 cluster volumes"
If i have 2 cluster-nodes and shared /data1 mount which are this shared volume, with no scsi-resevation bit on, in active-passive mode service.
And i want SCSI-fence like: I configure 2 * 2Gb shared storage more with SCSI-3 reservation bit on and create lvm2-partition, should this fence work, or must i still convert this allready configured /data1 to lvm2, for not preventing data-corruption in some situations. Or can i leave this partition unchanged


------------cut starts:--------------
3.2 - Storage Requirements

In order to use SCSI persistent reservations as a fencing method, all
shared storage must use LVM2 cluster volumes. In addition, all devices
within these volumes must be SPC-3 compliant. If you are unsure if
your cluster and shared storage environment meets these requirements,
a script is available to determine if your shared storage devices are
capable of using SCSI persistent reservations. See section x.x.

------------ cut ends-------------------------------




...................................................................
Luukku Plus paketilla p??set eroon tila- ja turvallisuusongelmista.
Hanki Luukku Plus ja helpotat el?m??si. http://www.mtv3.fi/luukku




From maciej.bogucki at artegence.com  Mon Apr  7 12:07:59 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Mon, 07 Apr 2008 14:07:59 +0200
Subject: [Linux-cluster] SCSI-fence configure
In-Reply-To: <1207564151655.teemu.m2.27811.rpZq244jT_oNCRlQjmOPCA@luukku.com>
References: <1207564151655.teemu.m2.27811.rpZq244jT_oNCRlQjmOPCA@luukku.com>
Message-ID: <47FA0E9F.5010302@artegence.com>

m.. mm.. napisa?(a):
> Hi Ryan or somebody else..
> 
> I have one question about your documentation about RedHat and scsi_reservation fence what you have write.
> 
> About this Storage Requimenents:
> You write like this. "all shared storage must use LVM2 cluster volumes"
> If i have 2 cluster-nodes and shared /data1 mount which are this shared volume, with no scsi-resevation bit on, in active-passive mode service.
> And i want SCSI-fence like: I configure 2 * 2Gb shared storage more with SCSI-3 reservation bit on and create lvm2-partition, should this fence work, or must i still convert this allready configured /data1 to lvm2, for not preventing data-corruption in some situations. Or can i leave this partition unchanged

Hello,

As Ryan said, actual version of fence_scsi works only with LVM2. It is
only bash script, so You could change it, if You want.
I can't understand the term "active-passive mode service"? Do you use
GFS filesystem?

Best Regards
Maciej Bogucki



From Alain.Moulle at bull.net  Mon Apr  7 13:00:23 2008
From: Alain.Moulle at bull.net (Alain Moulle)
Date: Mon, 07 Apr 2008 15:00:23 +0200
Subject: [Linux-cluster] timers tuning (contd)
Message-ID: <47FA1AE7.2050600@bull.net>

Hi

Is there a similar rule with CS5 ? I mean if we
increase the heart-beat timeout, is there some
other parameters to adjust together ?

Thanks
Regards
Alain Moull?

Alain Moulle wrote:


>>>> Hi
>>>>
>>>> is there a rule to follow between the DLM lock_timeout
>>>> and the deadnode_timeout value ?
>>>> Meaning for example that the first one must be always lesser than
>>>> the second one ?
>>>>
>>>> And if so, could we have a deadnode_timeout=60s and the
>>>> /proc/cluster/config/dlm/lock_timeout at 70s ? or are
>>>> there some upper limits not to exceed ?



The DLM's lock_timeout should always be greater than cman's
deadnode_timeout. A sensible minimum is about 1.5 times the cman value,
but it can go as high as you like.
-- Chrissie




From rohara at redhat.com  Mon Apr  7 15:09:03 2008
From: rohara at redhat.com (Ryan O'Hara)
Date: Mon, 07 Apr 2008 10:09:03 -0500
Subject: [Linux-cluster] SCSI-fence configure
In-Reply-To: <1207564151655.teemu.m2.27811.rpZq244jT_oNCRlQjmOPCA@luukku.com>
References: <1207564151655.teemu.m2.27811.rpZq244jT_oNCRlQjmOPCA@luukku.com>
Message-ID: <47FA390F.7060707@redhat.com>

m.. mm.. wrote:
> Hi Ryan or somebody else..
> 
> I have one question about your documentation about RedHat and scsi_reservation fence what you have write.
> 
> About this Storage Requimenents:
> You write like this. "all shared storage must use LVM2 cluster volumes"
> If i have 2 cluster-nodes and shared /data1 mount which are this shared volume, with no scsi-resevation bit on, in active-passive mode service.
> And i want SCSI-fence like: I configure 2 * 2Gb shared storage more with SCSI-3 reservation bit on and create lvm2-partition, should this fence work, or must i still convert this allready configured /data1 to lvm2, for not preventing data-corruption in some situations. Or can i leave this partition unchanged

This means you must use LVM2 to setup your shared storage. In other 
words, you must be using clvmd (lvm2-cluster package).

Also note that active-passive arrays are not officially support. I have 
tested as active-passive array with RHEL5 and it seems to work due to a 
change in device-mapper-multipath. The reason is that SCSI reservations 
work by using an ioctl call to the devices and both the active and 
passive paths must get this ioctl (to create the registration and/or 
reservation). RHEL5 appears to include a fix that passes this ioctl to 
all devices. This is not present in RHEL4. I believe the documentation 
states that active-passive (multipath) arrays are not supported. I will 
update this when I do more testing.

I'm not sure what you mean by "no scsi-resevation bit on". Can you explain?

-Ryan


> ------------cut starts:--------------
> 3.2 - Storage Requirements
> 
> In order to use SCSI persistent reservations as a fencing method, all
> shared storage must use LVM2 cluster volumes. In addition, all devices
> within these volumes must be SPC-3 compliant. If you are unsure if
> your cluster and shared storage environment meets these requirements,
> a script is available to determine if your shared storage devices are
> capable of using SCSI persistent reservations. See section x.x.
> 
> ------------ cut ends-------------------------------
> 
> 
> 
> 
> ...................................................................
> Luukku Plus paketilla p??set eroon tila- ja turvallisuusongelmista.
> Hanki Luukku Plus ja helpotat el?m??si. http://www.mtv3.fi/luukku
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From lhh at redhat.com  Mon Apr  7 21:23:29 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 07 Apr 2008 17:23:29 -0400
Subject: [Linux-cluster] timers tuning (contd)
In-Reply-To: <47FA1AE7.2050600@bull.net>
References: <47FA1AE7.2050600@bull.net>
Message-ID: <1207603409.2927.34.camel@localhost.localdomain>


On Mon, 2008-04-07 at 15:00 +0200, Alain Moulle wrote:
> Hi
> 
> Is there a similar rule with CS5 ? I mean if we
> increase the heart-beat timeout, is there some
> other parameters to adjust together ?

qdisk timeout should be about a hair more than cman's timeout, if you're
using it.

-- Lon



From lhh at redhat.com  Mon Apr  7 21:25:38 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 07 Apr 2008 17:25:38 -0400
Subject: [Linux-cluster] Virtual service without GFS
In-Reply-To: <20080406211239.GC32651@localhost>
References: <20080406211239.GC32651@localhost>
Message-ID: <1207603538.2927.37.camel@localhost.localdomain>


On Sun, 2008-04-06 at 23:12 +0200, Jakub Suchy wrote:
> Hi,
> is it possible to run a virtual service on a cluster (XEN host) without
> using GFS? I know I can create an ext3 partition, but it is not possible
> to add a resource to virtual service, so I can't join ext3 to it.

Yes, but ...

 * the same rules apply as if you were using GFS
   * The storage used by the Xen virtual machine must be accessible from
all cluster members.
   * fencing is required

For example, you can store the guest images on raw SAN partitions
without using GFS if you wish.

-- Lon



From Christopher.Barry at qlogic.com  Tue Apr  8 02:36:46 2008
From: Christopher.Barry at qlogic.com (christopher barry)
Date: Mon, 07 Apr 2008 22:36:46 -0400
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get a
	coffee first ; )>
Message-ID: <1207622206.5259.66.camel@localhost>

Hi everyone,

I have a couple of questions about the tuning the dlm and gfs that
hopefully someone can help me with.

my setup:
6 rh4.5 nodes, gfs1 v6.1, behind redundant LVS directors. I know it's
not new stuff, but corporate standards dictated the rev of rhat.

The cluster is a developer build cluster, where developers login, and
are balanced across nodes and edit and compile code. They can access via
vnc, XDMCP, ssh and telnet, and nodes external to the cluster can mount
the gfs home via nfs, balanced through the director. Their homes are on
the gfs, and accessible on all nodes.

I'm noticing huge differences in compile times - or any home file access
really - when doing stuff in the same home directory on the gfs on
different nodes. For instance, the same compile on one node is ~12
minutes - on another it's 18 minutes or more (not running concurrently).
I'm also seeing weird random pauses in writes, like saving a file in vi,
what would normally take less than a second, may take up to 10 seconds.

* From reading, I see that the first node to access a directory will be
the lock master for that directory. How long is that node the master? If
the user is no longer 'on' that node, is it still the master? If
continued accesses are remote, will the master state migrate to the node
that is primarily accessing it? I've set LVS persistence for ssh and
telnet for 5 minutes, to allow multiple xterms fired up in a script to
land on the same node, but new ones later will land on a different node
- by design really. Do I need to make this persistence way longer to
keep people only on the first node they hit? That kind of horks my load
balancing design if so. How can I see which node is master for which
directories? Is there a table I can read somehow?

* I've bumped the wake times for gfs_scand and gfs_inoded to 30 secs, I
mount noatime,noquota,nodiratime, and David Teigland recommended I set
dlm_dropcount to '0' today on irc, which I did, and I see an improvement
in speed on the node that appears to be master for say 'find' command
runs on the second and subsequent runs of the command if I restart them
immediately, but on the other nodes the speed is awful - worse than nfs
would be. On the first run of a find, or If I wait >10 seconds to start
another run after the last run completes, the time to run is
unbelievably slower than the same command on a standalone box with ext3.
e.g. <9 secs on the standalone, compared to 46 secs on the cluster - on
a different node it can take over 2 minutes! Yet an immediate re-run on
the cluster, on what I think must be the master is sub-second. How can I
speed up the first access time, and how can I keep the speed up similar
to immediate subsequent runs. I've got a ton of memory - I just do not
know which knobs to turn.

Am I expecting too much from gfs? Did I oversell it when I literally
fought to use it rather than nfs off the NetApp filer, insisting that
the performance of gfs smoked nfs? Or, more likely, do I just not
understand how to optimize it fully for my application?


Regards and Thanks,
-C



From bevan.broun at ardec.com.au  Tue Apr  8 04:31:31 2008
From: bevan.broun at ardec.com.au (Bevan Broun)
Date: Tue, 8 Apr 2008 13:31:31 +0900 (EIT)
Subject: [Linux-cluster] strange requirements:non reboot of failed node,
 both shared and non-shared storage on SAN.
In-Reply-To: <1207622206.5259.66.camel@localhost>
References: <1207622206.5259.66.camel@localhost>
Message-ID: <22207.210.9.69.226.1207629091.squirrel@webmail.ardec.com.au>

Hi All

I have a strange set of requirements:

A two node cluster:
services running on cluster nodes are not shared (ie not clustered).
cluster is only there for two GFS file systems on a SAN.
The same storage system hosts non GFS luns for individual use by the
cluster members.
The nodes run two applications, the critical app does NOT use the GFS. The
non critical ap uses the GFS.
The critical application uses storage from the SAN for ext3 file systems.

The requirement is that a failure of the cluster should not interupt the
critical application.
This means the failed node cannot be power cycled. Also the failed node must
continue to have access to it's non GFS luns on the storage.

The Storage are two HP EVAs. Each EVA has two controllers. There are two
brocade FC switches.

Fencing is required for GFS.

The only solution I can think of is:
GFS LUNs presented down one HBA only, while ext3 luns are presented down
both.
Use SAN fencing to block access by the fenced host to GFS luns by blocking
access to the controller that is handling this LUN.

repairing the cluster will be a manual operation that may involve a reboot.

does this look workable?

Thanks




From mathieu.avila at seanodes.com  Tue Apr  8 08:47:53 2008
From: mathieu.avila at seanodes.com (Mathieu Avila)
Date: Tue, 8 Apr 2008 10:47:53 +0200
Subject: [Linux-cluster] About GFS1 and I/O barriers.
In-Reply-To: <1207149428.3635.151.camel@quoit>
References: <20080328153458.45fc6e13@mathieu.toulouse>
	<20080331124651.3f0d2428@mathieu.toulouse>
	<1206960860.3635.126.camel@quoit>
	<20080331151622.1360a2cb@mathieu.toulouse>
	<1207130014.3310.24.camel@localhost.localdomain>
	<1a2a6dd60804020726g20d77419k47298eb000c431ec@mail.gmail.com>
	<1207149428.3635.151.camel@quoit>
Message-ID: <20080408104753.14db9ed3@mathieu.toulouse>

Le Wed, 02 Apr 2008 16:17:08 +0100,
Steven Whitehouse <swhiteho at redhat.com> a ?crit :

> Hi,
> 
> 
> If the data is not physically on disk when the ACK it sent back, then
> there is no way for the fs to know whether the data has (at a later
> date) not been written due to some error or other. Even ignoring that
> for the moment and assuming that such errors never occur, I don't
> think its too unreasonable to expect at a minimum that all
> acknowledged I/O will never be reordered with unacknowledged I/O.
> That is all that is required for correct operation of gfs1/2 provided
> that no media errors occur on write.

If I understand correctly your statement, I think you misinterpret what
a ACK on write means.
For the SCSI protocol, ACKing a write doesn't mean it has reached the
platters.
>From here:
http://t10.org/ftp/t10/drafts/sbc3/sbc3r14.pdf
4.11 Caches - 5th paragraph
"
During write operations, the device server uses the cache to store data
that is to be written to the medium at a later time. This is called
write-back caching. The command may complete prior to logical blocks
being written to the medium. As a result of using a write-back caching
there is a period of time when the data may be lost if power to the
SCSI target device is lost and a volatile cache is being used or a
hardware failure occurs. There is also the possibility of an error
occurring during the subsequent write operation. If an error occurred
during the write operation, it may be reported as a deferred error on a
later command. 
"

If you want some WRITEs to hit the persistent media, you must issue
special commands, like  "synchronize cache", or a write with
"FUA" (force unit acccess) bit set. All this is correctly (or at
least, it should be) handled by the kernel's barriers, if the device
supports it. In the case where no barriers are used, there is no
guarantee on reordering of WRITEs, so log corruption can occur.

>From where I understand the code, ext3 allows to activate barriers with
an option on mount, so when the device does not support them, it is
still possible to disable the option by remounting the device.
For XFS, barriers will be automatically disabled when the device
doesn't support them. 
(well, this is also what i've observed, but take those statements with
caution)

> 
> The message on lkml which Mathieu referred to suggested that there
> were three kinds of devices, but it seems to be that type 2
> (flushable) doesn't exist so far as the fs is concerned since
> blkdev_issue_flush() just issues a BIO with only a barrier in it. A
> device driver might support the barrier request by either waiting for
> all outstanding I/O and issuing a flush command (if required) or by
> passing the barrier down to the device, assuming that it supports
> such a thing directly.
> 
> Further down the message (the url is http://lkml.org/lkml/2007/5/25/71
> btw) there is a list of dm/md implementation status and it seems that
> for a good number of the common targets there is little or no support
> for barriers anyway at the moment.
> 
> Now I agree that it would be nice to support barriers in GFS2, but it
> won't solve any problems relating to ordering of I/O unless all of the
> underlying device supports them too. See also Alasdair's response to
> the thread: http://lkml.org/lkml/2007/5/28/81
> 
> So although I'd like to see barrier support in GFS2, it won't solve
> any problems for most people and really its a device/block layer
> issue at the moment.
> 
> Steve.
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From s.wendy.cheng at gmail.com  Tue Apr  8 09:13:52 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Tue, 8 Apr 2008 04:13:52 -0500
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get a
	coffee first ; )>
In-Reply-To: <1207622206.5259.66.camel@localhost>
References: <1207622206.5259.66.camel@localhost>
Message-ID: <1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>

On Mon, Apr 7, 2008 at 9:36 PM, christopher barry <
Christopher.Barry at qlogic.com> wrote:

> Hi everyone,
>
> I have a couple of questions about the tuning the dlm and gfs that
> hopefully someone can help me with.



There are lots to say about this configuration.. It is not a simple tuning
issue.


>
> my setup:
> 6 rh4.5 nodes, gfs1 v6.1, behind redundant LVS directors. I know it's
> not new stuff, but corporate standards dictated the rev of rhat.



Putting a load balancer in front of cluster filesystem is tricky to get it
right (to say the least). This is particularly true between GFS and LVS,
mostly because LVS is a general purpose load balancer that is difficult to
tune to work with the existing GFS locking overhead.


The cluster is a developer build cluster, where developers login, and
> are balanced across nodes and edit and compile code. They can access via
> vnc, XDMCP, ssh and telnet, and nodes external to the cluster can mount
> the gfs home via nfs, balanced through the director. Their homes are on
> the gfs, and accessible on all nodes.



Direct login into GFS nodes (via vnc, ssh, telnet, etc) is ok but nfs client
access in this setup will have locking issues. It is *not* only a
performance issue. It is *also* a function issue - that is, before 2.6.19
Linux kernel, NLM locking (used by NFS client) doesn't get propagated into
clustered NFS servers. You'll have file corruption if different NFS clients
do file lockings and expect the lockings can be honored across different
clustered NFS servers. In general, people needs to think *very* carefully to
put a load balancer before a group of linux NFS servers using any
before-2.6.19 kernel. It is not going to work if there are multiple clients
that invoke either posix locks and/or flocks on files that are expected to
get accessed across different linux NFS servers on top  *any* cluster
filesystem (not only GFS). .


>
>
> I'm noticing huge differences in compile times - or any home file access
> really - when doing stuff in the same home directory on the gfs on
> different nodes. For instance, the same compile on one node is ~12
> minutes - on another it's 18 minutes or more (not running concurrently).
> I'm also seeing weird random pauses in writes, like saving a file in vi,
> what would normally take less than a second, may take up to 10 seconds.
>
> * From reading, I see that the first node to access a directory will be
> the lock master for that directory. How long is that node the master? If
> the user is no longer 'on' that node, is it still the master? If
> continued accesses are remote, will the master state migrate to the node
> that is primarily accessing it?



Cluster locking is expensive. As the result, GFS caches its glocks and there
is an one-to-one correspondence between GFS glock and DLM locks. Even an
user is no longer "on" that node, the lock stays on that node unless:

1. some other node requests an exclusive access of this lock (file write);
or
2. the node has memory pressure that kicks off linux virtual memory manager
to reclaim idle filesystem structures (inode, dentries, etc); or
3. abnormal events such as crash, umount, etc.

Check out: ,
http://open-sharedroot.org/Members/marc/blog/blog-on-gfs/glock-trimming-patch/?searchterm=gfs
for details.


I've set LVS persistence for ssh and
> telnet for 5 minutes, to allow multiple xterms fired up in a script to
> land on the same node, but new ones later will land on a different node
> - by design really. Do I need to make this persistence way longer to
> keep people only on the first node they hit? That kind of horks my load
> balancing design if so. How can I see which node is master for which
> directories? Is there a table I can read somehow?



You did the right thing here (by making the connection persistence). There
is a gfs glock dump command that can print out all the lock info (name,
owner, etc) but I really don't want to recommend it - since automating this
process is not trivial and there is no way to do this by hand, i.e.
manually.


>
> * I've bumped the wake times for gfs_scand and gfs_inoded to 30 secs, I
> mount noatime,noquota,nodiratime, and David Teigland recommended I set
> dlm_dropcount to '0' today on irc, which I did, and I see an improvement
> in speed on the node that appears to be master for say 'find' command
> runs on the second and subsequent runs of the command if I restart them
> immediately, but on the other nodes the speed is awful - worse than nfs
> would be. On the first run of a find, or If I wait >10 seconds to start
> another run after the last run completes, the time to run is
> unbelievably slower than the same command on a standalone box with ext3.
> e.g. <9 secs on the standalone, compared to 46 secs on the cluster - on
> a different node it can take over 2 minutes! Yet an immediate re-run on
> the cluster, on what I think must be the master is sub-second. How can I
> speed up the first access time, and how can I keep the speed up similar
> to immediate subsequent runs. I've got a ton of memory - I just do not
> know which knobs to turn.


The more memory you have, the more gfs locks (and their associated gfs file
structures) will be cached in the node. It, in turns, will make both dlm and
gfs lock queries take longer. The glock_purge (on RHEL 4.6, not on RHEL 4.5)
should be able to help but its effects will be limited if you ping-pong the
locks quickly between different GFS nodes. Try to play around with this
tunable (start with 20%) to see how it goes (but please reset gfs_scand and
gfs_inoded back to their defaults while you are experimenting glock_purge).

So assume this is a build-compile cluster, implying large amount of small
files come and go, The tricks I can think of:

1. glock_purge ~ 20%
2. glock_inode shorter than default (not longer)
3. persistent LVS session if all possible

>
>
> Am I expecting too much from gfs? Did I oversell it when I literally
> fought to use it rather than nfs off the NetApp filer, insisting that
> the performance of gfs smoked nfs? Or, more likely, do I just not
> understand how to optimize it fully for my application?



GFS1 is very good on large sequential IO (such as vedio-on-demand) but works
poorly in the environment you try to setup. However, I'm in an awkward
position to do further comments  I'll stop here.

-- Wendy

>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080408/a4bb3a31/attachment.htm>

From gordan at bobich.net  Tue Apr  8 10:05:25 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Tue, 8 Apr 2008 11:05:25 +0100 (BST)
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a	coffee first ; )>
In-Reply-To: <1207622206.5259.66.camel@localhost>
References: <1207622206.5259.66.camel@localhost>
Message-ID: <alpine.LRH.1.10.0804081047550.23969@skynet.shatteredsilicon.net>



> my setup:
> 6 rh4.5 nodes, gfs1 v6.1, behind redundant LVS directors. I know it's
> not new stuff, but corporate standards dictated the rev of rhat.
[...]
> I'm noticing huge differences in compile times - or any home file access
> really - when doing stuff in the same home directory on the gfs on
> different nodes. For instance, the same compile on one node is ~12
> minutes - on another it's 18 minutes or more (not running concurrently).
> I'm also seeing weird random pauses in writes, like saving a file in vi,
> what would normally take less than a second, may take up to 10 seconds.
>
> * From reading, I see that the first node to access a directory will be
> the lock master for that directory. How long is that node the master? If
> the user is no longer 'on' that node, is it still the master? If
> continued accesses are remote, will the master state migrate to the node
> that is primarily accessing it? I've set LVS persistence for ssh and
> telnet for 5 minutes, to allow multiple xterms fired up in a script to
> land on the same node, but new ones later will land on a different node
> - by design really. Do I need to make this persistence way longer to
> keep people only on the first node they hit? That kind of horks my load
> balancing design if so. How can I see which node is master for which
> directories? Is there a table I can read somehow?
>
> * I've bumped the wake times for gfs_scand and gfs_inoded to 30 secs, I
> mount noatime,noquota,nodiratime, and David Teigland recommended I set
> dlm_dropcount to '0' today on irc, which I did, and I see an improvement
> in speed on the node that appears to be master for say 'find' command
> runs on the second and subsequent runs of the command if I restart them
> immediately, but on the other nodes the speed is awful - worse than nfs
> would be. On the first run of a find, or If I wait >10 seconds to start
> another run after the last run completes, the time to run is
> unbelievably slower than the same command on a standalone box with ext3.
> e.g. <9 secs on the standalone, compared to 46 secs on the cluster - on
> a different node it can take over 2 minutes! Yet an immediate re-run on
> the cluster, on what I think must be the master is sub-second. How can I
> speed up the first access time, and how can I keep the speed up similar
> to immediate subsequent runs. I've got a ton of memory - I just do not
> know which knobs to turn.

It sounds like bumping up lock trimming might help, but I don't think 
the feature accessibility through /sys has been back-ported to RHEL4, so 
if you're stuck with RHEL4, you may have to rebuild the latest versions of 
the tools and kernel modules from RHEL5, or you're out of luck.

> Am I expecting too much from gfs? Did I oversell it when I literally
> fought to use it rather than nfs off the NetApp filer, insisting that
> the performance of gfs smoked nfs? Or, more likely, do I just not
> understand how to optimize it fully for my application?

Probably a combination of all of the above. The main advantage of GFS 
isn't speed, it's the fact that it is a proper POSIX file system, unlike 
NFS or CIFS (e.g. file locking actually works on GFS). It also tends to 
stay consistent if a node fails, due to journalling.

Having said that, I've not seen speed differences as big as what you're 
describing, but I'm using RHEL5. I also have bandwidth charts for my 
DRBD/cluster interface, and the bandwidth usage on a lightly loaded system 
is not really signifficant unless lots of writes start happening. With 
mostly reads (which can all be served from the local DRBD mirror), the 
background "noise" traffic of combined DRBD and RHCS is > 200Kb/s 
(25KB/s). Since the ping times are < 0.1ms, in theory, this should make 
locks take < 1ms to resolve/migrate. Of course, if your find goes over 
50,000 files, the a 50 second delay to migrate all the locks may well be 
in a reasonable ball-park. You may find that things have moved on quite a 
bit since RHEL4...

Gordan



From paolom at prisma-eng.it  Tue Apr  8 12:51:01 2008
From: paolom at prisma-eng.it (Paolo Marini)
Date: Tue, 08 Apr 2008 14:51:01 +0200
Subject: [Linux-cluster] Problems with SAMBA server on Centos 51 virtual
	xen guest with iSCSI SAN
In-Reply-To: <47F3E82F.6030600@redhat.com>
References: <47F3C07A.8090709@prisma-eng.it> <47F3E82F.6030600@redhat.com>
Message-ID: <47FB6A35.3030407@prisma-eng.it>

After some investigation, it seems that the problem is really related to 
samba and not to the cluster infrastructure which is working quite well.

Here some posting on the issue with samba, that was exploited with the 
upgrade to 3.0.25 included in the RH 5.1 update:

http://bugs.contribs.org/show_bug.cgi?id=3762
http://www.centos.org/modules/newbb/viewtopic.php?post_id=39829&topic_id=12152
https://bugzilla.redhat.com/show_bug.cgi?id=426244

What I did to solve the problem was to get the latest samba sources (3.0.28a) and rebuild the package updating the spec file. I commented out the patches from 115 onwards as they are already included in the samba 3.0.28a tarball.

After the upgrade, none of the problems mentioned by me and in the above reported links happened again.

Hope this helps other folks solve the same problem, and also convinces RH people to upgrade the sasmba package.

Paolo


John Ruemker ha scritto:
> Paolo Marini wrote:
>> I have implemented a cluster of a few xen guest with a shared GFS 
>> filesystem residing on a SAN build with openfiler to support iSCSI 
>> storage.
>>
>> Physical servers are 3 machines implementing a physical cluster, each 
>> one equipped with quad xeon and 4 G RAM. The network interface is 
>> based on channel bonding with LACP (on the physical hosts) having an 
>> aggregate of 2 gigabits ethernet per physical host, the switch 
>> supports LACP and has been configured accordingly.
>>
>> Virtual servers are based on xen nodes on top of the physical server 
>> with shared storage on iSCSI and GFS.
>>
>> The networking is based on a cluster private network (for cluster 
>> heartbeat and cluster communication + iSCSI) and an ethernet alias 
>> for the LAN to which the users are connected.
>>
>> One of the cluster xen nodes is used for implementing a samba PDC (no 
>> failover of the service, plain samba, single samba server on the LAN) 
>> plus ldap server; samba works with ldap for users authentication. 
>> Storage for the samba server is on the SAN.
>>
>> I continue to receive complaints from my users due to the fact that 
>> sometimes copying file generates errors, plus problems related to 
>> office usage (we still use the old Office 97 on some machines). The 
>> samba configuration is more or less the same as that correctly 
>> working on the previous physical machine, on which those problems 
>> were not present.
>>
>> The problems generate these log entries on /var/log/samba/smbd:
>>
>> [2008/04/02 19:00:50, 0] lib/util_sock.c:get_peer_addr(1232)
>>  getpeername failed. Error was Transport endpoint is not connected
>> [2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
>>  getpeername failed. Error was Transport endpoint is not connected
>> [2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
>>  getpeername failed. Error was Transport endpoint is not connected
>>
>> And on the client machine log also on /var/log/samba
>>
>> [2008/04/02 19:04:34, 0] lib/util_sock.c:read_data(534)
>>  read_data: read failure for 4 bytes to client 192.168.13.240. Error 
>> = Connection reset by peer
>> [2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
>>  amhwq53p (192.168.13.240) closed connection to service tmp
>> [2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
>>  amhwq53p (192.168.13.240) closed connection to service stock
>> [2008/04/02 19:04:34, 0] lib/util_sock.c:write_data(562)
>>  write_data: write failure in writing to client 192.168.13.240. Error 
>> Broken pipe
>> [2008/04/02 19:04:34, 0] lib/util_sock.c:send_smb(769)
>>  Error writing 75 bytes to client. -1. (Broken pipe)
>> [2008/04/02 19:04:34, 1] smbd/service.c:make_connection_snum(1033)
>>
>> They seem similar to problems related to poor connectivity or problem 
>> in the network; however, these problems are new and were never found 
>> before switching to the clustered architecture. Also no problem have 
>> been found so far on the other xen nodes serving the same GFS 
>> filesystem (different dirs !) for NFS or other services.
>>
>> Also putting the option
>>
>> posix locking = no
>>
>> on the smb.conf file did not help.
>>
>> Any idea from someone else facing the same problems ?
>>
>> thanks, Paolo
>>
>> -- 
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster 
> Those errors are explained in
>
>     http://kbase.redhat.com/faq/FAQ_45_5274.shtm
>
> John
>
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From s.wendy.cheng at gmail.com  Tue Apr  8 14:37:58 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Tue, 08 Apr 2008 09:37:58 -0500
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a	coffee first ; )>
In-Reply-To: <alpine.LRH.1.10.0804081047550.23969@skynet.shatteredsilicon.net>
References: <1207622206.5259.66.camel@localhost>
	<alpine.LRH.1.10.0804081047550.23969@skynet.shatteredsilicon.net>
Message-ID: <47FB8346.1070802@gmail.com>

gordan at bobich.net wrote:
>
>
>> my setup:
>> 6 rh4.5 nodes, gfs1 v6.1, behind redundant LVS directors. I know it's
>> not new stuff, but corporate standards dictated the rev of rhat.
> [...]
>> I'm noticing huge differences in compile times - or any home file access
>> really - when doing stuff in the same home directory on the gfs on
>> different nodes. For instance, the same compile on one node is ~12
>> minutes - on another it's 18 minutes or more (not running concurrently).
>> I'm also seeing weird random pauses in writes, like saving a file in vi,
>> what would normally take less than a second, may take up to 10 seconds.
>>
>> * From reading, I see that the first node to access a directory will be
>> the lock master for that directory. How long is that node the master? If
>> the user is no longer 'on' that node, is it still the master? If
>> continued accesses are remote, will the master state migrate to the node
>> that is primarily accessing it? I've set LVS persistence for ssh and
>> telnet for 5 minutes, to allow multiple xterms fired up in a script to
>> land on the same node, but new ones later will land on a different node
>> - by design really. Do I need to make this persistence way longer to
>> keep people only on the first node they hit? That kind of horks my load
>> balancing design if so. How can I see which node is master for which
>> directories? Is there a table I can read somehow?
>>
>> * I've bumped the wake times for gfs_scand and gfs_inoded to 30 secs, I
>> mount noatime,noquota,nodiratime, and David Teigland recommended I set
>> dlm_dropcount to '0' today on irc, which I did, and I see an improvement
>> in speed on the node that appears to be master for say 'find' command
>> runs on the second and subsequent runs of the command if I restart them
>> immediately, but on the other nodes the speed is awful - worse than nfs
>> would be. On the first run of a find, or If I wait >10 seconds to start
>> another run after the last run completes, the time to run is
>> unbelievably slower than the same command on a standalone box with ext3.
>> e.g. <9 secs on the standalone, compared to 46 secs on the cluster - on
>> a different node it can take over 2 minutes! Yet an immediate re-run on
>> the cluster, on what I think must be the master is sub-second. How can I
>> speed up the first access time, and how can I keep the speed up similar
>> to immediate subsequent runs. I've got a ton of memory - I just do not
>> know which knobs to turn.
>
> It sounds like bumping up lock trimming might help, but I don't think 
> the feature accessibility through /sys has been back-ported to RHEL4, 
> so if you're stuck with RHEL4, you may have to rebuild the latest 
> versions of the tools and kernel modules from RHEL5, or you're out of 
> luck.

Glock trimming patch was mostly written and tuned on top of RHEL 4. It 
doesn't use /sys interface. The original patch was field tested on 
several customer production sites. Upon CVS RHEL 4.5 check-in, it was 
revised to use a less aggressive approach and turned out to be not as 
effective as the original approach. So the original patch was re-checked 
into RHEL 4.6.

I wrote the patch.

-- Wendy






From garromo at us.ibm.com  Tue Apr  8 17:28:42 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Tue, 8 Apr 2008 11:28:42 -0600
Subject: [Linux-cluster] Tunable parameters
In-Reply-To: <1207603538.2927.37.camel@localhost.localdomain>
Message-ID: <OF2B607A60.FC20DD52-ON87257425.005FDBAB-87257425.005FDD80@us.ibm.com>

Anyone know where I can get details about these files?

# pwd
/proc/cluster/config/cman

# ls
deadnode_timeout  join_timeout      max_retries transition_restarts
hello_timer       joinwait_timeout  newcluster_timeout  transition_timeout
joinconf_timeout  max_nodes         sm_debug_size

I am looking for definitions and for the ability to modify (if necessary). 
 Thank you.

Gary Romo
IBM Global Technology Services
303.458.4415
Email: garromo at us.ibm.com
Pager:1.877.552.9264
Text message: gromo at skytel.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080408/f075882b/attachment.htm>

From garromo at us.ibm.com  Tue Apr  8 17:31:40 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Tue, 8 Apr 2008 11:31:40 -0600
Subject: [Linux-cluster] timers tuning (contd)
In-Reply-To: <1207603409.2927.34.camel@localhost.localdomain>
Message-ID: <OF415C6F45.4603F1A4-ON87257425.00603DF2-87257425.00602361@us.ibm.com>

How do you increase the hearbeat timeout?

Gary Romo



Lon Hohberger <lhh at redhat.com> 
Sent by: linux-cluster-bounces at redhat.com
04/07/2008 03:23 PM
Please respond to
linux clustering <linux-cluster at redhat.com>


To
linux clustering <linux-cluster at redhat.com>
cc

Subject
Re: [Linux-cluster] timers tuning (contd)







On Mon, 2008-04-07 at 15:00 +0200, Alain Moulle wrote:
> Hi
> 
> Is there a similar rule with CS5 ? I mean if we
> increase the heart-beat timeout, is there some
> other parameters to adjust together ?

qdisk timeout should be about a hair more than cman's timeout, if you're
using it.

-- Lon

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080408/dff1d9a9/attachment.htm>

From mcse47 at hotmail.com  Tue Apr  8 19:33:39 2008
From: mcse47 at hotmail.com (Tracey Flanders)
Date: Tue, 8 Apr 2008 15:33:39 -0400
Subject: [Linux-cluster] Can you shrink your GFS Volume?
In-Reply-To: <20080408160009.676B26192A4@hormel.redhat.com>
References: <20080408160009.676B26192A4@hormel.redhat.com>
Message-ID: <BAY123-W44B2C965AD511007D5BC86D4F20@phx.gbl>


I need to add another journal to my GFS volume to add another cluster node. But I noticed that I get an error about free blocks.
Here's the command I'm running and the message I receive:
gfs_jadd -v -j1 /mnt/gfs1
Requested size (32768 blocks) greater than available space (3 blocks)

This makes perfect sense since I don't have any free space outside the gfs formatted volume. Is it possible at all to shrink a GFS volume? Or do I need to add more space to my lvm volume?
Simply, can GFS filesystems be shrunk?
_________________________________________________________________
Use video conversation to talk face-to-face with Windows Live Messenger.
http://www.windowslive.com/messenger/connect_your_way.html?ocid=TXT_TAGLM_WL_Refresh_messenger_video_042008



From tiagocruz at forumgdh.net  Tue Apr  8 19:40:03 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Tue, 08 Apr 2008 16:40:03 -0300
Subject: [Linux-cluster] Can you shrink your GFS Volume?
In-Reply-To: <BAY123-W44B2C965AD511007D5BC86D4F20@phx.gbl>
References: <20080408160009.676B26192A4@hormel.redhat.com>
	<BAY123-W44B2C965AD511007D5BC86D4F20@phx.gbl>
Message-ID: <1207683603.15852.25.camel@tuxkiller.ig.com.br>

On Tue, 2008-04-08 at 15:33 -0400, Tracey Flanders wrote:

> Simply, can GFS filesystems be shrunk?

?Did you saw this?
https://www.redhat.com/archives/linux-cluster/2008-April/msg00076.html





From kadlec at sunserv.kfki.hu  Tue Apr  8 21:09:26 2008
From: kadlec at sunserv.kfki.hu (Kadlecsik Jozsef)
Date: Tue, 8 Apr 2008 23:09:26 +0200 (CEST)
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a coffee first ; )>
In-Reply-To: <1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>
References: <1207622206.5259.66.camel@localhost>
	<1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>

On Tue, 8 Apr 2008, Wendy Cheng wrote:

> The more memory you have, the more gfs locks (and their associated gfs file
> structures) will be cached in the node. It, in turns, will make both dlm and
> gfs lock queries take longer. The glock_purge (on RHEL 4.6, not on RHEL 4.5)
> should be able to help but its effects will be limited if you ping-pong the
> locks quickly between different GFS nodes. Try to play around with this
> tunable (start with 20%) to see how it goes (but please reset gfs_scand and
> gfs_inoded back to their defaults while you are experimenting glock_purge).
> 
> So assume this is a build-compile cluster, implying large amount of small
> files come and go, The tricks I can think of:
> 
> 1. glock_purge ~ 20%
> 2. glock_inode shorter than default (not longer)
> 3. persistent LVS session if all possible

What is glock_inode? Does it exist or something equivalent in 
cluster-2.01.00?
 
Isn't GFS_GL_HASH_SIZE too small for large amount of glocks? Being too 
small it results not only long linked lists but clashing at the same 
bucket will block otherwise parallel operations. Wouldn't it help 
increasing it from 8k to 65k?

Best regards,
Jozsef
--
E-mail : kadlec at mail.kfki.hu, kadlec at blackhole.kfki.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: KFKI Research Institute for Particle and Nuclear Physics
         H-1525 Budapest 114, POB. 49, Hungary



From robertofratelli at yahoo.com  Tue Apr  8 23:41:04 2008
From: robertofratelli at yahoo.com (Roberto Fratelli)
Date: Tue, 8 Apr 2008 16:41:04 -0700 (PDT)
Subject: [Linux-cluster] Node fencing without an apparent reason
Message-ID: <347183.45719.qm@web33302.mail.mud.yahoo.com>

Hello Everyone.

I've been reading about post_fail_delay option and i
would like to hear your thoughts. I have a 2 node
cluster using GFS mounts. I want to prevent a "not so
dead" node being fenced by the other node by
increasing post_fail_delay value. Nowdays, i have it
set to 0

I'm using DRAC as a fencing device, but ofter i saw
one node fencing the other one without an apparent
reason (no network / quorum disk failures) and i'm not
happy with that...

I've read about the risks of having the active node
replaying other's node GFS Journal and then having the
2nd node write on GFS again i can get GFS Metadata
corruption, but how long (seconds) this whole
procedure occurs ?  Is it safe to increase
post_fail_delay to something like 5 seconds ?


Thanks !

Roberto Fratelli



      ____________________________________________________________________________________
You rock. That's why Blockbuster's offering you one month of Blockbuster Total Access, No Cost.  
http://tc.deals.yahoo.com/tc/blockbuster/text5.com



From Alain.Moulle at bull.net  Wed Apr  9 06:57:21 2008
From: Alain.Moulle at bull.net (Alain Moulle)
Date: Wed, 09 Apr 2008 08:57:21 +0200
Subject: [Linux-cluster] CS5 / timers tuning (contd)
Message-ID: <47FC68D1.3000004@bull.net>

Hi Lon, and thanks for your answer, but :

about "qdisk timeout" , you mean "tko" parameter in cluster.conf ?
Because I can't see a qdisk "timeout" parameter in all your qdisk
parameter description ...
And if it is "tko", what is the value in seconds of one cyle, so
that we can adjust it a hair more than cman's timeout ?

Thanks for these details.
Regards
Alain Moull?

On Mon, 2008-04-07 at 15:00 +0200, Alain Moulle wrote:

>> Hi
>>
>> Is there a similar rule with CS5 ? I mean if we
>> increase the heart-beat timeout, is there some
>> other parameters to adjust together ?


qdisk timeout should be about a hair more than cman's timeout, if you're
using it.

-- Lon




From j.buzzard at dundee.ac.uk  Wed Apr  9 09:06:12 2008
From: j.buzzard at dundee.ac.uk (Jonathan Buzzard)
Date: Wed, 09 Apr 2008 10:06:12 +0100
Subject: [Linux-cluster] Writing new fencing agent
Message-ID: <1207731972.9236.51.camel@localhost.lifesci.dundee.ac.uk>


I am re-purposing an old cluster that used to run RHEL4 and IBM's GPFS.
The nodes are all HP NetServer LP1000r with 2GB RAM, and dual 1.4GHz
PIII's with an additional 1Gbps Intel NIC, and a local 73GB 10k RPM SCSI
disk. I have 48 of these nodes (and a couple spare).

As the GPFS and RedHat licenses have been transferred to new machines,
it is my intention to rebuild the nodes using CentOS 5 and use GFS. I
have a couple TB of iSCSI storage to go with it.

This is a low budget project and I need a fencing device. The nodes all
support something called "Alert on Lan v2", which seems to have been a
fore runner of IPMI. I have a separate "management" network, and have
turned AOL on in the BIOS on each node.

Googling turned up no documentation on how Alert on Lan works so some
time later with Wireshark and the windows client I have some C code that
sends magic packets of death to either power off, reset, or power cycle
(off wait 15 seconds then on) the nodes.

Testing shows that it is robust in that it works on a node that has
kernel panicked and is otherwise totally hung. It is also fast, once
magic packet of death received the node is off instantly. All that seems
to be required on the client side is for the management NIC to be up and
configured with an IP address. This is contrary to the suggestion that
client software is need according to the rather sketchy HP
documentation.

All good so far. However I am not sure what the requirements of a
fencing agent are. Can I rename my program fence_aol2 fiddle with
cluster.conf and it will work? Does the fencing agent have to return
specific exit codes? Should the fencing agent do something to test the
magic packet of death worked or is simply sending it enough? Does the
fencing agent need to be able to turn nodes on (I could use Wake On Lan
for this) as well as off?

Finally once I have a working and debugged AOL2 fencing agent, how does
one go about submitting for inclusion in cluster suite. Alternatively if
this is not wanted (Alert on Lan is a historical protocol and superseded
by IPMI) what is the best way of pointing other users to it's existance?


JAB.

-- 
Jonathan A. Buzzard                      Tel: +441382-386998
Storage Administrator, College of Life Sciences
University of Dundee, DD1 5EH



From pmshehzad at yahoo.com  Wed Apr  9 09:57:58 2008
From: pmshehzad at yahoo.com (Mshehzad Pankhawala)
Date: Wed, 9 Apr 2008 02:57:58 -0700 (PDT)
Subject: [Linux-cluster] How to configure Squid Server Failover in RHCS
Message-ID: <571371.33505.qm@web45802.mail.sp1.yahoo.com>

Thanks to everyone ,

I am testing redhat cluster suite,

I have successfully configured Apache Failover service using RHCS (in RHEL5).

Now i am testing a Squid server Failover.

One IP address (ex. 192.168.0.111) is allowed to connect to the internet directly.

I have two RHEL5 Server having Squid installed on it. and i want to configure Squid Fail over using RHCS on those two servers. But I can't find in Resource List Resource Like Squid Server (As like Apache Server).

After that I wrote a script to start the Squid Server And added it as Script resource but it failed. The proble is that I am getting requests from clients at 3128 port of Squid but they won't connect to internet at all.

I have desabled SELinux and Firewall from the begning of this testing.

Please Reply my with some alternatives

Thanks in Advance,
Regards.
MShehzad


 __________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080409/1919f13c/attachment.htm>

From mgrac at redhat.com  Wed Apr  9 14:32:21 2008
From: mgrac at redhat.com (Marek 'marx' Grac)
Date: Wed, 09 Apr 2008 16:32:21 +0200
Subject: [Linux-cluster] Writing new fencing agent
In-Reply-To: <1207731972.9236.51.camel@localhost.lifesci.dundee.ac.uk>
References: <1207731972.9236.51.camel@localhost.lifesci.dundee.ac.uk>
Message-ID: <47FCD375.5020807@redhat.com>

Hi,

Jonathan Buzzard wrote:
> All good so far. However I am not sure what the requirements of a
> fencing agent are. 

> Can I rename my program fence_aol2 fiddle with cluster.conf and it will work? 

You have to set 'agent' option:
<fencedevice name="apc" agent="fence_apc" ipaddr="1.1.1.1" login="apc" 
passwd="apc"/>

These options will be set to fencing agent on STDIN. Also there is a set 
of getopt arguments (look at existing code).

> Does the fencing agent have to return specific exit codes? 
You should return 0 when operation was finished succesfully.

> Should the fencing agent do something to test the magic packet of death worked or is simply sending it enough? 
All 'standard' fencing agents when rebooting are doing these actions:
1) power off
2) test if the plug/machine is powered off [sometimes it take few seconds]
3) power on

> Does the
> fencing agent need to be able to turn nodes on (I could use Wake On Lan
> for this) as well as off?
>
>   
yes, it should.

> Finally once I have a working and debugged AOL2 fencing agent, how does
> one go about submitting for inclusion in cluster suite. Alternatively if
> this is not wanted (Alert on Lan is a historical protocol and superseded
> by IPMI) what is the best way of pointing other users to it's existance?
>
>   
You can take a look at new fencing agents (available in git / master 
branch). They use a python module for common fencing tasks and it should 
not be a problem to write a new fencing agent (agent for APC devices has 
3kB).  If you will find any problem with new agents don't hesitate and 
contact me.

Marek Grac

-- 
Marek Grac
Red Hat Czech s.r.o.



From lhh at redhat.com  Wed Apr  9 14:58:48 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 09 Apr 2008 10:58:48 -0400
Subject: [Linux-cluster] CS5 / timers tuning (contd)
In-Reply-To: <47FC68D1.3000004@bull.net>
References: <47FC68D1.3000004@bull.net>
Message-ID: <1207753128.15132.89.camel@ayanami.boston.devel.redhat.com>

On Wed, 2008-04-09 at 08:57 +0200, Alain Moulle wrote:
> Hi Lon, and thanks for your answer, but :
> 
> about "qdisk timeout" , you mean "tko" parameter in cluster.conf ?
> Because I can't see a qdisk "timeout" parameter in all your qdisk
> parameter description ...

Right, interval + tko

interval * tko = qdisk timeout

See the qdisk man page for more details.

-- Lon



From lhh at redhat.com  Wed Apr  9 15:06:36 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 09 Apr 2008 11:06:36 -0400
Subject: [Linux-cluster] timers tuning (contd)
In-Reply-To: <OF415C6F45.4603F1A4-ON87257425.00603DF2-87257425.00602361@us.ibm.com>
References: <OF415C6F45.4603F1A4-ON87257425.00603DF2-87257425.00602361@us.ibm.com>
Message-ID: <1207753596.15132.93.camel@ayanami.boston.devel.redhat.com>

On Tue, 2008-04-08 at 11:31 -0600, Gary Romo wrote:
> 
> How do you increase the hearbeat timeout? 

On cluster2 (rhel5-ish):
  <totem token="x"/>

... where x is the number of _milliseconds_.  Default is 5000 (5
seconds).

On cluster1 (rhel4-ish), I don't recall if there's a way to do it from
cluster.conf; but in your cman initscript:

  echo x > /proc/cluster/config/cman/deadnode_timeout

... where x is the number of _seconds_.  Default is 21.

-- Lon




From s.wendy.cheng at gmail.com  Wed Apr  9 15:06:08 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Wed, 09 Apr 2008 10:06:08 -0500
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a coffee first ; )>
In-Reply-To: <Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>
References: <1207622206.5259.66.camel@localhost>	<1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>
	<Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>
Message-ID: <47FCDB60.8030807@gmail.com>

Kadlecsik Jozsef wrote:
>
> What is glock_inode? Does it exist or something equivalent in 
> cluster-2.01.00?
>   

Sorry, typo. What I mean is "inoded_secs" (gfs inode daemon wake-up 
time). This is the daemon that reclaims deleted inodes. Don't set it too 
small though.

>  
> Isn't GFS_GL_HASH_SIZE too small for large amount of glocks? Being too 
> small it results not only long linked lists but clashing at the same 
> bucket will block otherwise parallel operations. Wouldn't it help 
> increasing it from 8k to 65k?
>   

Worth a try.

However, the issues involved here are more than lock searching time. It 
also has to do with cache flushing. GFS currently accumulates too much 
dirty caches. When it starts to flush, it will pause the system for too 
long.  Glock trimming helps - since cache flush is part of glock 
releasing operation.

-- Wendy





From isplist at logicore.net  Wed Apr  9 17:18:00 2008
From: isplist at logicore.net (isplist at logicore.net)
Date: Wed, 9 Apr 2008 12:18:00 -0500
Subject: [Linux-cluster] SSC Console?
Message-ID: <20084912180.894658@leena>

Anyone know what an SSC console is? I'm trying to migrate some data between storage devices but the device requires that I enter the command from an SSC Console. 

EXAMPLE;
mylinuxsystem:~/tmp$ssc 192.168.43.70

Just like ssh, telnet, etc. Can't find this anywhere.

Since everyone here is into storage, thought I'd ask. 

Mike





From s.wendy.cheng at gmail.com  Wed Apr  9 17:54:27 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Wed, 09 Apr 2008 12:54:27 -0500
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a coffee first ; )>
In-Reply-To: <47FCDB60.8030807@gmail.com>
References: <1207622206.5259.66.camel@localhost>	<1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>
	<Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>
	<47FCDB60.8030807@gmail.com>
Message-ID: <47FD02D3.7010802@gmail.com>

Wendy Cheng wrote:
> Kadlecsik Jozsef wrote:
>>
>> What is glock_inode? Does it exist or something equivalent in 
>> cluster-2.01.00?
>>   
>
> Sorry, typo. What I mean is "inoded_secs" (gfs inode daemon wake-up 
> time). This is the daemon that reclaims deleted inodes. Don't set it 
> too small though.

Have been responding to this email from top of the head, based on folks' 
descriptions. Please be aware that they are just rough thoughts and the 
responses may not fit in general cases. The above is mostly for the 
original problem description where:

1. The system is designated for build-compile - my take is that there 
are many temporary and deleted files.
2. The gfs_inode tunable was changed (to 30, instead of default, 15).

>
>>  
>> Isn't GFS_GL_HASH_SIZE too small for large amount of glocks? Being 
>> too small it results not only long linked lists but clashing at the 
>> same bucket will block otherwise parallel operations. Wouldn't it 
>> help increasing it from 8k to 65k?
>>   
>
> Worth a try.

Now I remember .... we did experiment with different hash sizes when 
this latency issue was first reported two years ago. It didn't make much 
difference. The cache flushing, on the other hand, was more significant.

-- Wendy

>
> However, the issues involved here are more than lock searching time. 
> It also has to do with cache flushing. GFS currently accumulates too 
> much dirty caches. When it starts to flush, it will pause the system 
> for too long.  Glock trimming helps - since cache flush is part of 
> glock releasing operation.
>
>
>
>




From kadlec at sunserv.kfki.hu  Wed Apr  9 19:42:33 2008
From: kadlec at sunserv.kfki.hu (Kadlecsik Jozsef)
Date: Wed, 9 Apr 2008 21:42:33 +0200 (CEST)
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a coffee first ; )>
In-Reply-To: <47FD02D3.7010802@gmail.com>
References: <1207622206.5259.66.camel@localhost>
	<1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>
	<Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>
	<47FCDB60.8030807@gmail.com> <47FD02D3.7010802@gmail.com>
Message-ID: <Pine.LNX.4.64.0804092122001.23034@lxserv1.kfki.hu>

On Wed, 9 Apr 2008, Wendy Cheng wrote:

> Have been responding to this email from top of the head, based on folks'
> descriptions. Please be aware that they are just rough thoughts and the
> responses may not fit in general cases. The above is mostly for the original
> problem description where:
> 
> 1. The system is designated for build-compile - my take is that there are many
> temporary and deleted files.
> 2. The gfs_inode tunable was changed (to 30, instead of default, 15).

I'll take it into account when experimenting with the different settings.

> > > Isn't GFS_GL_HASH_SIZE too small for large amount of glocks? Being too
> > > small it results not only long linked lists but clashing at the same
> > > bucket will block otherwise parallel operations. Wouldn't it help
> > > increasing it from 8k to 65k?
> >
> > Worth a try.
> 
> Now I remember .... we did experiment with different hash sizes when this
> latency issue was first reported two years ago. It didn't make much
> difference. The cache flushing, on the other hand, was more significant.

What led me to suspect clashing in the hash (or some other lock-creating 
issue) was the simple test I made on our five node cluster: on one node I 
ran

find /gfs -type f -exec cat {} > /dev/null \;

and on another one just started an editor, naming a non-existent file.
It took multiple seconds while the editor "opened" the file. What else 
than creating the lock could delay the process so long?

> > However, the issues involved here are more than lock searching time. It also
> > has to do with cache flushing. GFS currently accumulates too much dirty
> > caches. When it starts to flush, it will pause the system for too long.
> > Glock trimming helps - since cache flush is part of glock releasing
> > operation.

But 'flushing when releasing glock' looks as a side effect. I mean, isn't 
there a more direct way to control the flushing?

I can easily be totally wrong, but on the one hand, it's good to keep as 
many locks cached as possible, because lock creation is expensive. But on 
the other hand, trimming locks triggers flushing, which helps to keep the 
systems running more smoothly. So a tunable to control flushing directly 
would be better than just trimming the locks, isn't it. But not knowing 
the deep internals of GFS, my reasoning can of course be bogus.

Best regards,
Jozsef
--
E-mail : kadlec at mail.kfki.hu, kadlec at blackhole.kfki.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: KFKI Research Institute for Particle and Nuclear Physics
         H-1525 Budapest 114, POB. 49, Hungary



From gordan at bobich.net  Wed Apr  9 20:08:39 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Wed, 9 Apr 2008 21:08:39 +0100 (BST)
Subject: [Linux-cluster] How to configure Squid Server Failover in RHCS
In-Reply-To: <571371.33505.qm@web45802.mail.sp1.yahoo.com>
References: <571371.33505.qm@web45802.mail.sp1.yahoo.com>
Message-ID: <alpine.LRH.1.10.0804092106270.32419@skynet.shatteredsilicon.net>

On Wed, 9 Apr 2008, Mshehzad Pankhawala wrote:

> Now i am testing a Squid server Failover.
> 
> One IP address (ex. 192.168.0.111) is allowed to connect to the internet directly.
> 
> I have two RHEL5 Server having Squid installed on it. and i want to configure Squid Fail over
> using RHCS on those two servers. But I can't find in Resource List Resource Like Squid Server
> (As like Apache Server).
> 
> After that I wrote a script to start the Squid Server And added it as Script resource but it
> failed. The proble is that I am getting requests from clients at 3128 port of Squid but they
> won't connect to internet at all.
> 
> I have desabled SELinux and Firewall from the begning of this testing.

Squid can be rather funny with what IP addresses/interfaces it binds to. 
I've found it works in a hot-failover setup (always runs on both servers) 
if you tell it to bind to a port without specifying the IPs (so it binds 
to all interfaces/IPs), and just fail over the IP - no need to bother 
specifying the fail-over service. Mind you, same holds true for Apache - 
you might as well have your servers as a load-balanced pair, rather than 
warm spare fail-over.

Gordan



From s.wendy.cheng at gmail.com  Wed Apr  9 20:41:37 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Wed, 09 Apr 2008 15:41:37 -0500
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a coffee first ; )>
In-Reply-To: <Pine.LNX.4.64.0804092122001.23034@lxserv1.kfki.hu>
References: <1207622206.5259.66.camel@localhost>	<1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>	<Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>	<47FCDB60.8030807@gmail.com>
	<47FD02D3.7010802@gmail.com>
	<Pine.LNX.4.64.0804092122001.23034@lxserv1.kfki.hu>
Message-ID: <47FD2A01.1030708@gmail.com>

Kadlecsik Jozsef wrote:
> On Wed, 9 Apr 2008, Wendy Cheng wrote:
>
>   
>> Have been responding to this email from top of the head, based on folks'
>> descriptions. Please be aware that they are just rough thoughts and the
>> responses may not fit in general cases. The above is mostly for the original
>> problem description where:
>>
>> 1. The system is designated for build-compile - my take is that there are many
>> temporary and deleted files.
>> 2. The gfs_inode tunable was changed (to 30, instead of default, 15).
>>     
>
> I'll take it into account when experimenting with the different settings.
>
>   
>>>> Isn't GFS_GL_HASH_SIZE too small for large amount of glocks? Being too
>>>> small it results not only long linked lists but clashing at the same
>>>> bucket will block otherwise parallel operations. Wouldn't it help
>>>> increasing it from 8k to 65k?
>>>>         
>>> Worth a try.
>>>       
>> Now I remember .... we did experiment with different hash sizes when this
>> latency issue was first reported two years ago. It didn't make much
>> difference. The cache flushing, on the other hand, was more significant.
>>     
>
> What led me to suspect clashing in the hash (or some other lock-creating 
> issue) was the simple test I made on our five node cluster: on one node I 
> ran
>
> find /gfs -type f -exec cat {} > /dev/null \;
>
> and on another one just started an editor, naming a non-existent file.
> It took multiple seconds while the editor "opened" the file. What else 
> than creating the lock could delay the process so long?
>   

Not knowing how "find" is implemented, I would guess this is caused by 
directory locks. Creating a file needs a directory lock. Your exclusive 
write lock (file create) can't be granted until the "find" releases the 
directory lock. It doesn't look like a lock query performance issue to me.

>   
>>> However, the issues involved here are more than lock searching time. It also
>>> has to do with cache flushing. GFS currently accumulates too much dirty
>>> caches. When it starts to flush, it will pause the system for too long.
>>> Glock trimming helps - since cache flush is part of glock releasing
>>> operation.
>>>       
>
> But 'flushing when releasing glock' looks as a side effect. I mean, isn't 
> there a more direct way to control the flushing?
>   
> I can easily be totally wrong, but on the one hand, it's good to keep as 
> many locks cached as possible, because lock creation is expensive. But on 
> the other hand, trimming locks triggers flushing, which helps to keep the 
> systems running more smoothly. So a tunable to control flushing directly 
> would be better than just trimming the locks, isn't it. 

To make long story short, I did submit a direct cache flush patch first, 
instead of this final version of lock trimming patch. Unfortunately, it 
was *rejected*.

-- Wendy




From federico.simoncelli at gmail.com  Thu Apr 10 08:57:44 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Thu, 10 Apr 2008 10:57:44 +0200
Subject: [Linux-cluster] Migration of VMs instead of relocation
Message-ID: <a01fe36d0804100157m56e13f5dg2af62873a566f492@mail.gmail.com>

Hi everybody. Shutting down a cluster node results in relocate the
services to another node (accomplished with stop and start). Is there
any way to change this behavior to "migrate" for the virtual machines?
It looks like this post should be related to my problem:
http://article.gmane.org/gmane.linux.redhat.cluster/10848
The rgmanager version I'm using is 2.0.31-1.
Thanks.

-- 
Federico.



From npf-mlists at eurotux.com  Thu Apr 10 09:35:34 2008
From: npf-mlists at eurotux.com (Nuno Fernandes)
Date: Thu, 10 Apr 2008 10:35:34 +0100
Subject: [Linux-cluster] RHEL cluster upgrade from 5.0 to 5.1
Message-ID: <200804101035.35425.npf-mlists@eurotux.com>

Hi,

With a cluster of only clvmd is it possible to do a rolling upgrade from 5.0 
to 5.1?

By rolling upgrade i mean,

1 - select 1 node
2 - leave the node from cluster
3 - upgrade to 5.1
4 - join to cluster
5 - go to step 1

Any info?
Thanks,
Nuno Fernandes



From Alain.Moulle at bull.net  Thu Apr 10 12:51:59 2008
From: Alain.Moulle at bull.net (Alain Moulle)
Date: Thu, 10 Apr 2008 14:51:59 +0200
Subject: [Linux-cluster] CS5 / timers tuning (contd)
Message-ID: <47FE0D6F.9060700@bull.net>

Hi Lon

Thans again, but that's strange because in the man , the recommended
values are :
intervall="1" tko="10" and so we have a result < 21s which is the
default value of heart-beat timer, so not a hair above like you
recommened in previous email ...
extract of man qddisk :

         interval="1"
            This is the frequency of read/write cycles, in seconds.

         tko="10"
            This  is  the  number  of  cycles  a node must miss in order to be
            declared dead.

?

PS : " don't recall if there's a way to do it from
cluster.conf"
 yes we can change the deadnode_timeout in cluster.conf :
<cman deadnode_timer="DEADNODETIMER_VALUE"/>

Thanks
Regards
Alain Moull?


>Hi Lon, and thanks for your answer, but :
>>
>> about "qdisk timeout" , you mean "tko" parameter in cluster.conf ?
>> Because I can't see a qdisk "timeout" parameter in all your qdisk
>> parameter description ...


Right, interval + tko

interval * tko = qdisk timeout

See the qdisk man page for more details.

-- Lon



From kadlec at sunserv.kfki.hu  Thu Apr 10 13:00:40 2008
From: kadlec at sunserv.kfki.hu (Kadlecsik Jozsef)
Date: Thu, 10 Apr 2008 15:00:40 +0200 (CEST)
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a coffee first ; )>
In-Reply-To: <47FD2A01.1030708@gmail.com>
References: <1207622206.5259.66.camel@localhost>
	<1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>
	<Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>
	<47FCDB60.8030807@gmail.com> <47FD02D3.7010802@gmail.com>
	<Pine.LNX.4.64.0804092122001.23034@lxserv1.kfki.hu>
	<47FD2A01.1030708@gmail.com>
Message-ID: <Pine.LNX.4.64.0804101349450.14375@lxserv1.kfki.hu>

On Wed, 9 Apr 2008, Wendy Cheng wrote:

> > What led me to suspect clashing in the hash (or some other lock-creating
> > issue) was the simple test I made on our five node cluster: on one node I
> > ran
> >
> > find /gfs -type f -exec cat {} > /dev/null \;
> >
> > and on another one just started an editor, naming a non-existent file.
> > It took multiple seconds while the editor "opened" the file. What else than
> > creating the lock could delay the process so long?
> >   
> 
> Not knowing how "find" is implemented, I would guess this is caused by
> directory locks. Creating a file needs a directory lock. Your exclusive write
> lock (file create) can't be granted until the "find" releases the directory
> lock. It doesn't look like a lock query performance issue to me.

As /gfs is a large directory structure with hundreds of user home 
directories, somehow I don't think I could pick the same directory which 
was just processed by "find".

But this is a good clue to what might bite us most! Our GFS cluster is an 
almost mail-only cluster for users with Maildir. When the users experience 
temporary hangups for several seconds (even when writing a new mail), it 
might be due to the concurrent scanning for a new mail on one node by the 
MUA and the delivery to the Maildir in another node by the MTA.

What is really strange (and distrurbing) that such "hangups" can take 
10-20 seconds which is just too much for the users.

In order to look at the possible tuning options and the side effects, I 
list what I have learned so far:

- Increasing glock_purge (percent, default 0) helps to trim back the 
  unused glocks by gfs_scand itself. Otherwise glocks can accumulate and 
  gfs_scand eats more and more time at scanning the larger and 
  larger table of glocks.
- gfs_scand wakes up every scand_secs (default 5s) to scan the glocks,  
  looking for work to do. By increasing scand_secs one can lessen the load 
  produced by gfs_scand, but it'll hurt because flushing data can be 
  delayed.
- Decreasing demote_secs (seconds, default 300) helps to flush cached data
  more often by moving write locks into less restricted states. Flushing 
  often helps to avoid burstiness *and* to prolong another nodes' 
  lock access. Question is, what are the side effects of small
  demote_secs values? (Probably there is no much point to choose
  smaller demote_secs value than scand_secs.)

Currently we are running with 'glock_purge = 20' and 'demote_secs = 30'.

> > But 'flushing when releasing glock' looks as a side effect. I mean, isn't
> > there a more direct way to control the flushing?
> 
> To make long story short, I did submit a direct cache flush patch first,
> instead of this final version of lock trimming patch. Unfortunately, it was
> *rejected*.

I see. Another question, just out of curiosity: why don't you use kernel 
timers for every glock instead of gfs_scand? The hash bucket id of the 
glock should be added to struct gfs_glock, but the timer function could be 
almost identical with scan_glock. As far as I see the only drawback were 
that it'd be equivalent with 'glock_purge = 100' and it'd be tricky to 
emulate glock_purge != 100 settings.

Best regards,
Jozsef
--
E-mail : kadlec at mail.kfki.hu, kadlec at blackhole.kfki.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: KFKI Research Institute for Particle and Nuclear Physics
         H-1525 Budapest 114, POB. 49, Hungary



From Alain.Moulle at bull.net  Thu Apr 10 14:01:45 2008
From: Alain.Moulle at bull.net (Alain Moulle)
Date: Thu, 10 Apr 2008 16:01:45 +0200
Subject: [Linux-cluster] CS5/ Little thing with clustat and the quorum disk
Message-ID: <47FE1DC9.8050103@bull.net>

Hi

Something strange with clustat :

if CS5 is launched with a valid Quorum Disk (that we can see
with mkqdisk -L) and if we break the quorum disk (i.e mkfs on
the device just to simulate a pb to reach the quorum disk), the
clustat command always displays the Quorum Disk "Online" :

#clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  xena140                               1 Online, Local, rgmanager
  xena141                               2 Offline
  /dev/sdb                              0 Online, Quorum Disk

Note that in this case, the cluster is not at all disturbed, there
are ony some messages in syslog like :
qdiskd[30709]: <warning> Error reading node ID block ...

but it just need to execute again a mkqdisk and no needs to
stop/start again the CS, all is then working fine.

Alain Moull?



From jruemker at redhat.com  Thu Apr 10 17:00:15 2008
From: jruemker at redhat.com (John Ruemker)
Date: Thu, 10 Apr 2008 13:00:15 -0400
Subject: [Linux-cluster] Migration of VMs instead of relocation
In-Reply-To: <a01fe36d0804100157m56e13f5dg2af62873a566f492@mail.gmail.com>
References: <a01fe36d0804100157m56e13f5dg2af62873a566f492@mail.gmail.com>
Message-ID: <47FE479F.9010900@redhat.com>

Federico Simoncelli wrote:
> Hi everybody. Shutting down a cluster node results in relocate the
> services to another node (accomplished with stop and start). Is there
> any way to change this behavior to "migrate" for the virtual machines?
> It looks like this post should be related to my problem:
> http://article.gmane.org/gmane.linux.redhat.cluster/10848
> The rgmanager version I'm using is 2.0.31-1.
> Thanks.
>
>   
I believe migration will be the default in RHEL5.2, but for now you can 
follow the instructions at http://kbase.redhat.com/faq/FAQ_51_11879

John



From federico.simoncelli at gmail.com  Thu Apr 10 17:52:58 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Thu, 10 Apr 2008 19:52:58 +0200
Subject: [Linux-cluster] Migration of VMs instead of relocation
In-Reply-To: <47FE479F.9010900@redhat.com>
References: <a01fe36d0804100157m56e13f5dg2af62873a566f492@mail.gmail.com>
	<47FE479F.9010900@redhat.com>
Message-ID: <a01fe36d0804101052m4ccc03e7l404d69ae145020e1@mail.gmail.com>

On Thu, Apr 10, 2008 at 7:00 PM, John Ruemker <jruemker at redhat.com> wrote:
>  I believe migration will be the default in RHEL5.2, but for now you can
> follow the instructions at http://kbase.redhat.com/faq/FAQ_51_11879
>
>  John

I already fixed that. (It is just to improve performances right? live
migration vs. migration)
My problem is quite different.
Imagine you have to shutdown a node for maintenance... you have to
manually migrate all the vms to other nodes before actually shut it
down.
If you don't, rgmanager takes care of relocating the vms using
"relocate" which will result in stopping the service (vm unclean
destroy) and starting it somewhere else.
I am trying to avoid this kind of behavior.

-- 
Federico.



From lpleiman at redhat.com  Thu Apr 10 17:56:53 2008
From: lpleiman at redhat.com (Leo Pleiman)
Date: Thu, 10 Apr 2008 13:56:53 -0400
Subject: [Linux-cluster] Migration of VMs instead of relocation
In-Reply-To: <a01fe36d0804101052m4ccc03e7l404d69ae145020e1@mail.gmail.com>
References: <a01fe36d0804100157m56e13f5dg2af62873a566f492@mail.gmail.com>	<47FE479F.9010900@redhat.com>
	<a01fe36d0804101052m4ccc03e7l404d69ae145020e1@mail.gmail.com>
Message-ID: <47FE54E5.8000005@redhat.com>

Isn't this fixed by adding --live to the migration line in the vm.sh script?

Federico Simoncelli wrote:
> On Thu, Apr 10, 2008 at 7:00 PM, John Ruemker <jruemker at redhat.com> wrote:
>   
>>  I believe migration will be the default in RHEL5.2, but for now you can
>> follow the instructions at http://kbase.redhat.com/faq/FAQ_51_11879
>>
>>  John
>>     
>
> I already fixed that. (It is just to improve performances right? live
> migration vs. migration)
> My problem is quite different.
> Imagine you have to shutdown a node for maintenance... you have to
> manually migrate all the vms to other nodes before actually shut it
> down.
> If you don't, rgmanager takes care of relocating the vms using
> "relocate" which will result in stopping the service (vm unclean
> destroy) and starting it somewhere else.
> I am trying to avoid this kind of behavior.
>
>   

-- 
Leo J Pleiman
Senior Consultant, GPS Federal
410-688-3873

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lpleiman.vcf
Type: text/x-vcard
Size: 194 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080410/839ef816/attachment.vcf>

From jruemker at redhat.com  Thu Apr 10 18:45:07 2008
From: jruemker at redhat.com (John Ruemker)
Date: Thu, 10 Apr 2008 14:45:07 -0400
Subject: [Linux-cluster] Migration of VMs instead of relocation
In-Reply-To: <47FE54E5.8000005@redhat.com>
References: <a01fe36d0804100157m56e13f5dg2af62873a566f492@mail.gmail.com>	<47FE479F.9010900@redhat.com>	<a01fe36d0804101052m4ccc03e7l404d69ae145020e1@mail.gmail.com>
	<47FE54E5.8000005@redhat.com>
Message-ID: <47FE6033.8030903@redhat.com>

Yes, which is the fix included in that kbase I posted. 

Leo Pleiman wrote:
> Isn't this fixed by adding --live to the migration line in the vm.sh 
> script?
>
> Federico Simoncelli wrote:
>> On Thu, Apr 10, 2008 at 7:00 PM, John Ruemker <jruemker at redhat.com> 
>> wrote:
>>  
>>>  I believe migration will be the default in RHEL5.2, but for now you 
>>> can
>>> follow the instructions at http://kbase.redhat.com/faq/FAQ_51_11879
>>>
>>>  John
>>>     
>>
>> I already fixed that. (It is just to improve performances right? live
>> migration vs. migration)
>> My problem is quite different.
>> Imagine you have to shutdown a node for maintenance... you have to
>> manually migrate all the vms to other nodes before actually shut it
>> down.
>> If you don't, rgmanager takes care of relocating the vms using
>> "relocate" which will result in stopping the service (vm unclean
>> destroy) and starting it somewhere else.
>> I am trying to avoid this kind of behavior.
>>
>>   
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From federico.simoncelli at gmail.com  Thu Apr 10 19:12:32 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Thu, 10 Apr 2008 21:12:32 +0200
Subject: [Linux-cluster] Migration of VMs instead of relocation
In-Reply-To: <47FE6033.8030903@redhat.com>
References: <a01fe36d0804100157m56e13f5dg2af62873a566f492@mail.gmail.com>
	<47FE479F.9010900@redhat.com>
	<a01fe36d0804101052m4ccc03e7l404d69ae145020e1@mail.gmail.com>
	<47FE54E5.8000005@redhat.com> <47FE6033.8030903@redhat.com>
Message-ID: <a01fe36d0804101212i64ba9bc4l7da8930eab0a12a5@mail.gmail.com>

On Thu, Apr 10, 2008 at 8:45 PM, John Ruemker <jruemker at redhat.com> wrote:
> Yes, which is the fix included in that kbase I posted.
>  Leo Pleiman wrote:
> >
> > Isn't this fixed by adding --live to the migration line in the vm.sh
> script?
> >

I agree with you that adding --live (or -l) would enable live
migration but only when you issue a migrate request with the command:

  # clusvcadm -M vm:clustered_guest -n cluster2-2
(taken from the kbase posted)

It works. Agreed.
The problem is that when you shutdown a node the rgmanager is stopped
(migration is not involved for as much as I know).

  # service rgmanager stop

This would stop rgmanager (as much as a node shoutdown would do) and
the services are stopped and started on available nodes.
I would like to automatically migrate the vms instead.
Can you try this and confirm that the fix in the kbase is not related
to this behavior?
Thank you.

-- 
Federico.



From isplist at logicore.net  Thu Apr 10 19:52:31 2008
From: isplist at logicore.net (isplist at logicore.net)
Date: Thu, 10 Apr 2008 14:52:31 -0500
Subject: [Linux-cluster] Hardware/Sofware for VM use
Message-ID: <2008410145231.870495@leena>

I've had to take a break from hardware for a while as software was calling my name. Now I need to get my act together on the hardware side so thought I would ask for thoughts. I've been wanting to consolidate machines for a long time as power and cooling and waste of resources are of concern.

I know I've touched on this before but the response was somewhat overwhelming to me since I have not yet touched VM environments so was not/am not aware of terminology yet.

Questions;

Hardware?

I have powerful 8/16-way servers which I could use as VM servers.
But, I also have dozens of small 1Ghz/512MB servers in the form of very reliable blade servers.

I had asked about the possibility of creating an SSI style cluster then creating VM's out of that. Are there any such methods being used? Where a cluster of parallel processing computers is an options today? Or, are folks just basically using the most powerful computers they have, running two or more with shared storage for redundancy?

Software;

I'd like to take advantage of VM but it's not clear to me what is already included in the Linux packages, what is not, etc. I see a lot of projects out there and I'd much prefer to use RPM as well. If I have to, I can go the compile route. I prefer RPM's because they are so easy to manage for me.

What would be a good starting path, one which would allow me to try VM, in a way that I can start migrating machines over to the new method.

Thanks very much for your input/help on this.

Mike





From Christopher.Barry at qlogic.com  Thu Apr 10 20:18:55 2008
From: Christopher.Barry at qlogic.com (christopher barry)
Date: Thu, 10 Apr 2008 16:18:55 -0400
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a	coffee first ; )>
In-Reply-To: <47FB8346.1070802@gmail.com>
References: <1207622206.5259.66.camel@localhost>
	<alpine.LRH.1.10.0804081047550.23969@skynet.shatteredsilicon.net>
	<47FB8346.1070802@gmail.com>
Message-ID: <1207858736.5188.77.camel@localhost>

On Tue, 2008-04-08 at 09:37 -0500, Wendy Cheng wrote:
> gordan at bobich.net wrote:
> >
> >
> >> my setup:
> >> 6 rh4.5 nodes, gfs1 v6.1, behind redundant LVS directors. I know it's
> >> not new stuff, but corporate standards dictated the rev of rhat.
> > [...]
> >> I'm noticing huge differences in compile times - or any home file access
> >> really - when doing stuff in the same home directory on the gfs on
> >> different nodes. For instance, the same compile on one node is ~12
> >> minutes - on another it's 18 minutes or more (not running concurrently).
> >> I'm also seeing weird random pauses in writes, like saving a file in vi,
> >> what would normally take less than a second, may take up to 10 seconds.

Anyway, thought I would re-connect to you all and let you know how this
worked out. We ended up scrapping gfs. Not because it's not a great fs,
but because I was using it in a way that was playing to it's weak
points. I had a lot of time and energy invested in it, and it was hard
to let it go. Turns out that connecting to the NetApp filer via nfs is
faster for this workload. I couldn't believe it either, as my bonnie and
dd type tests showed gfs to be faster. But for the use case of large
sets of very small files, and lots of stats going on, gfs simply cannot
compete with NetApp's nfs implementation. GFS is an excellent fs, and it
has it's place in the landscape - but for a development build system,
the NetApp is simply phenomenal.


Thanks all for your assistance in the many months I have sought and
received advice and help here.

Regards,
Christopher Barry 



From agspoon at gmail.com  Thu Apr 10 21:27:05 2008
From: agspoon at gmail.com (Craig Johnston)
Date: Thu, 10 Apr 2008 14:27:05 -0700
Subject: [Linux-cluster] Achieving a stable cluster with a 2.6.21 kernel
Message-ID: <f4f856dc0804101427r36e59d2o8c63fc70880c4637@mail.gmail.com>

We would like to achieve a stable GFS/GFS2 cluster configuration using
a non-Redhat distribution that is based on a 2.6.21 kernel.  Our first
attempt was to obtain the Fedora Core 7 source rpms for the various
components (cman, rgmanager, openais, etc.).  We were successful in
incorporating these packages into our distribution, and creating what
should be a working cluster configuration with multiple nodes sharing
a set of GFS2 file systems from an iSCSI SAN.

The problem is that it is all very unstable, takes forever to
start-up, and locks up under even small load.  We would like to move
to a more recent version of the cluster suite and update the kernel
gfs2 and dlm modules for a 2.6.21 kernel.  We need to stick with
2.6.21 for other reasons (vendor support mostly), and we figure if it
all can be back ported for RHEL5.1 (2.6.18) it should be doable for
2.6.21.  We just don't know where to start.

Any advice on how we might proceed on this process would be greatly appreciated.

Thanks,
Craig



From swhiteho at redhat.com  Fri Apr 11 08:08:49 2008
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Fri, 11 Apr 2008 09:08:49 +0100
Subject: [Linux-cluster] Achieving a stable cluster with a 2.6.21 kernel
In-Reply-To: <f4f856dc0804101427r36e59d2o8c63fc70880c4637@mail.gmail.com>
References: <f4f856dc0804101427r36e59d2o8c63fc70880c4637@mail.gmail.com>
Message-ID: <1207901329.3635.342.camel@quoit>

Hi,

On Thu, 2008-04-10 at 14:27 -0700, Craig Johnston wrote:
> We would like to achieve a stable GFS/GFS2 cluster configuration using
> a non-Redhat distribution that is based on a 2.6.21 kernel.  Our first
> attempt was to obtain the Fedora Core 7 source rpms for the various
> components (cman, rgmanager, openais, etc.).  We were successful in
> incorporating these packages into our distribution, and creating what
> should be a working cluster configuration with multiple nodes sharing
> a set of GFS2 file systems from an iSCSI SAN.
> 
> The problem is that it is all very unstable, takes forever to
> start-up, and locks up under even small load.  We would like to move
> to a more recent version of the cluster suite and update the kernel
> gfs2 and dlm modules for a 2.6.21 kernel.  We need to stick with
> 2.6.21 for other reasons (vendor support mostly), and we figure if it
> all can be back ported for RHEL5.1 (2.6.18) it should be doable for
> 2.6.21.  We just don't know where to start.
> 
> Any advice on how we might proceed on this process would be greatly appreciated.
> 
> Thanks,
> Craig
> 
If you want to use GFS2, then try F-8, or rawhide with the most uptodate
set of packages. I would not recommend using a kernel that old for GFS2,

Steve.

> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From gordan at bobich.net  Fri Apr 11 09:38:17 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Fri, 11 Apr 2008 10:38:17 +0100 (BST)
Subject: [Linux-cluster] Achieving a stable cluster with a 2.6.21
 kernel
In-Reply-To: <f4f856dc0804101427r36e59d2o8c63fc70880c4637@mail.gmail.com>
References: <f4f856dc0804101427r36e59d2o8c63fc70880c4637@mail.gmail.com>
Message-ID: <alpine.LRH.1.10.0804111033100.29827@skynet.shatteredsilicon.net>

On Thu, 10 Apr 2008, Craig Johnston wrote:

> We would like to achieve a stable GFS/GFS2 cluster configuration using
> a non-Redhat distribution that is based on a 2.6.21 kernel.  Our first
> attempt was to obtain the Fedora Core 7 source rpms for the various
> components (cman, rgmanager, openais, etc.).  We were successful in
> incorporating these packages into our distribution, and creating what
> should be a working cluster configuration with multiple nodes sharing
> a set of GFS2 file systems from an iSCSI SAN.
>
> The problem is that it is all very unstable, takes forever to
> start-up, and locks up under even small load.

Use GFS1. GFS2 still does that. FC6+ no longer ships with GFS1 support 
built in as standard. If you're going to stick to the tried path, use 
RHEL5 (based) distributions. If you don't, you may well be better of just 
building the lot from source. It comes down to what your time is worth to 
you.

Gordan



From kadlec at sunserv.kfki.hu  Fri Apr 11 11:05:08 2008
From: kadlec at sunserv.kfki.hu (Kadlecsik Jozsef)
Date: Fri, 11 Apr 2008 13:05:08 +0200 (CEST)
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a coffee first ; )>
In-Reply-To: <Pine.LNX.4.64.0804101349450.14375@lxserv1.kfki.hu>
References: <1207622206.5259.66.camel@localhost>
	<1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>
	<Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>
	<47FCDB60.8030807@gmail.com> <47FD02D3.7010802@gmail.com>
	<Pine.LNX.4.64.0804092122001.23034@lxserv1.kfki.hu>
	<47FD2A01.1030708@gmail.com>
	<Pine.LNX.4.64.0804101349450.14375@lxserv1.kfki.hu>
Message-ID: <Pine.LNX.4.64.0804111124230.26805@lxserv1.kfki.hu>

On Thu, 10 Apr 2008, Kadlecsik Jozsef wrote:

> But this is a good clue to what might bite us most! Our GFS cluster is an 
> almost mail-only cluster for users with Maildir. When the users experience 
> temporary hangups for several seconds (even when writing a new mail), it 
> might be due to the concurrent scanning for a new mail on one node by the 
> MUA and the delivery to the Maildir in another node by the MTA.
> 
> What is really strange (and distrurbing) that such "hangups" can take 
> 10-20 seconds which is just too much for the users.

Yesterday we started to monitor the number of locks/held locks on two of 
the machines. The results from the first day can be found at 
http://www.kfki.hu/~kadlec/gfs/.

It looks as Maildir is definitely a wrong choice for GFS and we should 
consider to convert to mailbox format: at least I cannot explain the 
spikes in another way.
 
> In order to look at the possible tuning options and the side effects, I 
> list what I have learned so far:
> 
> - Increasing glock_purge (percent, default 0) helps to trim back the 
>   unused glocks by gfs_scand itself. Otherwise glocks can accumulate and 
>   gfs_scand eats more and more time at scanning the larger and 
>   larger table of glocks.
> - gfs_scand wakes up every scand_secs (default 5s) to scan the glocks,  
>   looking for work to do. By increasing scand_secs one can lessen the load 
>   produced by gfs_scand, but it'll hurt because flushing data can be 
>   delayed.
> - Decreasing demote_secs (seconds, default 300) helps to flush cached data
>   more often by moving write locks into less restricted states. Flushing 
>   often helps to avoid burstiness *and* to prolong another nodes' 
>   lock access. Question is, what are the side effects of small
>   demote_secs values? (Probably there is no much point to choose
>   smaller demote_secs value than scand_secs.)
> 
> Currently we are running with 'glock_purge = 20' and 'demote_secs = 30'.

Best regards,
Jozsef
--
E-mail : kadlec at mail.kfki.hu, kadlec at blackhole.kfki.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: KFKI Research Institute for Particle and Nuclear Physics
         H-1525 Budapest 114, POB. 49, Hungary



From jmacfarland at nexatech.com  Fri Apr 11 14:07:41 2008
From: jmacfarland at nexatech.com (Jeff Macfarland)
Date: Fri, 11 Apr 2008 09:07:41 -0500
Subject: [Linux-cluster] SSC Console?
In-Reply-To: <20084912180.894658@leena>
References: <20084912180.894658@leena>
Message-ID: <47FF70AD.2020102@nexatech.com>

Sounds like sun storage console to me. google "storedge ssconsole"

isplist at logicore.net wrote:
> Anyone know what an SSC console is? I'm trying to migrate some data between storage devices but the device requires that I enter the command from an SSC Console. 
> 
> EXAMPLE;
> mylinuxsystem:~/tmp$ssc 192.168.43.70
> 
> Just like ssh, telnet, etc. Can't find this anywhere.
> 
> Since everyone here is into storage, thought I'd ask. 
> 
> Mike
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> ___________________________________________________________________________
> 
> Inbound Email has been scanned by Nexa Technologies Email Security Systems.
> ___________________________________________________________________________

-- 
Jeff Macfarland (jmacfarland at nexatech.com)
Nexa Technologies - 972.747.8879
Systems Administrator
GPG Key ID: 0x5F1CA61B
GPG Key Server: hkp://wwwkeys.pgp.net



From isplist at logicore.net  Fri Apr 11 14:32:08 2008
From: isplist at logicore.net (isplist at logicore.net)
Date: Fri, 11 Apr 2008 09:32:08 -0500
Subject: [Linux-cluster] SSC Console?
In-Reply-To: <47FF70AD.2020102@nexatech.com>
Message-ID: <20084119328.411953@leena>

It might be if it's something that can run on linux. I've come across symantec and sun but not much else giving this protocol away so far. It seems to run on default port 206.

EXAMPLE;
mylinuxsystem:~/tmp$ssc 192.168.43.70



On Fri, 11 Apr 2008 09:07:41 -0500, Jeff Macfarland wrote:
>?Sounds like sun storage console to me. google "storedge ssconsole"




From isplist at logicore.net  Fri Apr 11 14:40:08 2008
From: isplist at logicore.net (isplist at logicore.net)
Date: Fri, 11 Apr 2008 09:40:08 -0500
Subject: [Linux-cluster] Hardware/Sofware for VM use
In-Reply-To: <20080411120315.10707b89.pegasus@nerv.eu.org>
Message-ID: <20084119408.728182@leena>

I do use a plesk server and thanks for the lead on OpenVZ, looking at it right now. 

I am still wondering about the possible use of a number of slower servers being put to some good use. I wondered if perhaps redhat already has something along the lines of what I'm thinking about since they always have so many cool projects in the works.

Mike




From npf-mlists at eurotux.com  Fri Apr 11 14:58:46 2008
From: npf-mlists at eurotux.com (Nuno Fernandes)
Date: Fri, 11 Apr 2008 15:58:46 +0100
Subject: [Linux-cluster] RHEL cluster upgrade from 5.0 to 5.1
In-Reply-To: <200804101035.35425.npf-mlists@eurotux.com>
References: <200804101035.35425.npf-mlists@eurotux.com>
Message-ID: <200804111558.46449.npf-mlists@eurotux.com>

On Thursday 10 April 2008 10:35:34 Nuno Fernandes wrote:
> Hi,
>
> With a cluster of only clvmd is it possible to do a rolling upgrade from
> 5.0 to 5.1?
>
> By rolling upgrade i mean,
>
> 1 - select 1 node
> 2 - leave the node from cluster
> 3 - upgrade to 5.1
> 4 - join to cluster
> 5 - go to step 1
>
> Any info?
> Thanks,
> Nuno Fernandes
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

Does anyone know anything about this?

./npf



From teigland at redhat.com  Fri Apr 11 15:01:34 2008
From: teigland at redhat.com (David Teigland)
Date: Fri, 11 Apr 2008 10:01:34 -0500
Subject: [Linux-cluster] cluster-2.03.00
Message-ID: <20080411150134.GB7435@redhat.com>

A new source tarball of cluster code has been released: cluster-2.03.00
This has been taken from the STABLE2 branch in the cluster git tree.  It
is compatible with the current stable release of openais (0.80.3), and the
current stable release of the kernel (2.6.24).

  ftp://sources.redhat.com/pub/cluster/releases/cluster-2.03.00.tar.gz

To use gfs, a kernel patch is required to export three symbols from gfs2:
  ftp://sources.redhat.com/pub/cluster/releases/lockproto-exports.patch


Abhijith Das (3):
      gfs2_tool: remove 'gfs2_tool counters' as they aren't implemented anymore
      gfs-kernel: fix for bz 429343 gfs_glock_is_locked_by_me assertion
      gfs2_tool manpage: gfs2_tool counters doesn't exist anymore.

Andrew Price (1):
      [[BUILD] Warn and continue if CONFIG_KERNELVERSION is not found

Bob Peterson (9):
      Resolves: bz 435917: GFS2: mkfs.gfs2 default lock protocol
      Resolves: bz 421761: 'gfs_tool lockdump' wrongly says 'unknown
      Resolves: bz 431945: GFS: gfs-kernel should use device major:minor
      Update to prior commit for bz431945: I forgot that STABLE2
      Resolves: bz 436383: GFS filesystem size inconsistent
      Fix savemeta so it saves gfs-1 rg information properly
      Fix gfs2_edit print options (-p) to work properly for gfs-1
      gfs2_edit was not recalculating the max block size after it figured
      Fix some compiler warnings in gfs2_edit

Chris Feist (1):
      Added back in change to description line to make chkconfig work properly.

Christine Caulfield (5):
      [DLM] Don't segfault if lvbptr is NULL
      [CMAN] Free up any queued messages when someone disconnects
      [CMAN] Limit outstanding replies
      [CMAN] valid port number & don't use it before validation
      Remove references to broadcast.

David Teigland (4):
      doc: update usage.txt
      groupd: purge messages from dead nodes
      dlm_tool: print correct rq mode in lockdump
      libdlm: fix lvb copying 
Fabio M. Di Nitto (8):
      [BUILD] Fix configure script to handle releases
      [BUILD] Fix build system with openais whitetank
      [BUILD] Allow release version to contain padding 0's
      Add toplevel .gitignore
      [BUILD] Fix handling of version and libraries soname
      [BUILD] Fix man page install permission
      Revert "Fix help message to refer to script as 'fence_scsi_test'."
      Revert "fix bz277781 by accepting "nodename" as a synonym for "node""

Joel Becker (1):
      libdlm: Don't pass LKF_WAIT to the kernel

Jonathan Brassow (4):
      rgmanager/lvm.sh: Fix bug 438816
      rgmanager/lvm.sh:  Fix bug bz242798
      rgmanager/lvm.sh: change argument order of shell command
      rgmanager/lvm.sh:  Minor comment updates

Lon Hohberger (10):
      Add Sybase failover agent
      Update changelog
      Add / fix Oracle 10g failover agent
      [rgmanager] Make ip.sh check link states of non-ethernet devices
      [rgmanager] Set cloexec bit in msg_socket.c
      [rgmanager] Don't call quotaoff if quotas are not used
      [CMAN] Fix "Node X is undead" loop bug
      [rgmanager] Fix #432998
      [cman] Apply missing fix for #315711
      [CMAN] Make cman init script start qdiskd intelligently

Ryan McCabe (1):
      fix bz277781 by accepting "nodename" as a synonym for "node"

Ryan O'Hara (15):
      Variable should be quoted in conditional statement.
      Fix unregister code to report failure correctly.
      Remove "self" parameter. This was used to specify the name of the node
      Fix code to use get_key subroutine.
      Fix split calls to be consistent. Remove the optional LIMIT parameter.
      Replace /var/lock/subsys/${0##*/} with /var/lock/subsys/scsi_reserve.
      Fix success/failure reporting when registering devices at startup.
      Rewrite of get_scsi_devices function.
      Record devices that are successfully registered to /var/run/scsi_reserve.
      Allow 'stop' to release the reservation if and only if there are no other
      Attempt to register the node in the case where it must perform fence_scsi
      Fix help message to refer to script as 'fence_scsi_test'.
      BZ 248715
      BZ: 373491, 373511, 373531, 373541, 373571, 429033
      BZ 441323 : Redirect stderr to /dev/null when getting list of devices.


 .gitignore                            |    1 +
 cman/daemon/Makefile                  |    3 +-
 cman/daemon/cmanccs.c                 |   11 +-
 cman/daemon/cnxman-private.h          |    2 +-
 cman/daemon/commands.c                |    2 +-
 cman/daemon/daemon.c                  |   40 ++-
 cman/daemon/daemon.h                  |    3 +-
 cman/init.d/cman.in                   |   32 ++
 cman/init.d/qdiskd                    |   21 +-
 cman/lib/Makefile                     |   14 +-
 cman/man/cman_tool.8                  |   20 +-
 cman/qdisk/main.c                     |   34 +-
 configure                             |   87 +++-
 dlm/lib/Makefile                      |   26 +-
 dlm/lib/libdlm.c                      |   15 +-
 dlm/tool/main.c                       |    8 +-
 doc/usage.txt                         |   87 ++---
 fence/agents/scsi/fence_scsi.pl       |  248 ++++++++--
 fence/agents/scsi/fence_scsi_test.pl  |  171 ++++---
 fence/agents/scsi/scsi_reserve        |  300 ++++++++----
 gfs-kernel/src/gfs/glock.h            |   15 +-
 gfs-kernel/src/gfs/ops_address.c      |   29 +-
 gfs-kernel/src/gfs/proc.c             |    9 +-
 gfs/gfs_grow/main.c                   |    4 +-
 gfs/gfs_tool/util.c                   |   64 +--
 gfs2/edit/gfs2hex.c                   |   12 +-
 gfs2/edit/hexedit.c                   |  178 ++++++--
 gfs2/edit/hexedit.h                   |   32 ++
 gfs2/edit/savemeta.c                  |   38 +-
 gfs2/man/gfs2_tool.8                  |    4 -
 gfs2/man/mkfs.gfs2.8                  |    6 +-
 gfs2/tool/Makefile                    |    3 +-
 gfs2/tool/counters.c                  |  203 --------
 gfs2/tool/main.c                      |    5 -
 group/daemon/app.c                    |   25 +
 group/daemon/cpg.c                    |    1 +
 group/daemon/gd_internal.h            |    1 +
 group/dlm_controld/member_cman.c      |    8 +
 make/defines.mk.input                 |    1 +
 make/man.mk                           |    2 +-
 rgmanager/ChangeLog                   |    4 +
 rgmanager/src/clulib/msg_socket.c     |   12 +
 rgmanager/src/daemons/restree.c       |    2 +-
 rgmanager/src/resources/ASEHAagent.sh |  893 +++++++++++++++++++++++++++++++++
 rgmanager/src/resources/Makefile      |    3 +-
 rgmanager/src/resources/fs.sh         |   51 ++-
 rgmanager/src/resources/ip.sh         |   18 +-
 rgmanager/src/resources/lvm.metadata  |   13 +-
 rgmanager/src/resources/lvm.sh        |   14 +-
 rgmanager/src/resources/lvm_by_lv.sh  |   15 +-
 rgmanager/src/resources/lvm_by_vg.sh  |   22 +-
 rgmanager/src/resources/oracleas      |  792 -----------------------------
 rgmanager/src/resources/oracledb.sh   |  869 ++++++++++++++++++++++++++++++++
 53 files changed, 2954 insertions(+), 1519 deletions(-)



From s.wendy.cheng at gmail.com  Fri Apr 11 15:28:37 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Fri, 11 Apr 2008 10:28:37 -0500
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a	coffee first ; )>
In-Reply-To: <1207858736.5188.77.camel@localhost>
References: <1207622206.5259.66.camel@localhost>	<alpine.LRH.1.10.0804081047550.23969@skynet.shatteredsilicon.net>	<47FB8346.1070802@gmail.com>
	<1207858736.5188.77.camel@localhost>
Message-ID: <47FF83A5.6010806@gmail.com>

christopher barry wrote:
> On Tue, 2008-04-08 at 09:37 -0500, Wendy Cheng wrote:
>   
>> gordan at bobich.net wrote:
>>     
>>>       
>>>> my setup:
>>>> 6 rh4.5 nodes, gfs1 v6.1, behind redundant LVS directors. I know it's
>>>> not new stuff, but corporate standards dictated the rev of rhat.
>>>>         
>>> [...]
>>>       
>>>> I'm noticing huge differences in compile times - or any home file access
>>>> really - when doing stuff in the same home directory on the gfs on
>>>> different nodes. For instance, the same compile on one node is ~12
>>>> minutes - on another it's 18 minutes or more (not running concurrently).
>>>> I'm also seeing weird random pauses in writes, like saving a file in vi,
>>>> what would normally take less than a second, may take up to 10 seconds.
>>>>         
>
> Anyway, thought I would re-connect to you all and let you know how this
> worked out. We ended up scrapping gfs. Not because it's not a great fs,
> but because I was using it in a way that was playing to it's weak
> points. I had a lot of time and energy invested in it, and it was hard
> to let it go. Turns out that connecting to the NetApp filer via nfs is
> faster for this workload. I couldn't believe it either, as my bonnie and
> dd type tests showed gfs to be faster. But for the use case of large
> sets of very small files, and lots of stats going on, gfs simply cannot
> compete with NetApp's nfs implementation. GFS is an excellent fs, and it
> has it's place in the landscape - but for a development build system,
> the NetApp is simply phenomenal.
>   

Assuming you run both configurations (nfs-wafl vs. gfs-san) on the very 
same netapp box (?) ...

Both configurations have their pros and cons. The wafl-nfs runs on 
native mode that certainly has its advantages - you've made a good 
choice but the latter (gfs-on-netapp san) can work well in other 
situations. The biggest problem with your original configuration is the 
load-balancer. The round-robin (and its variants) scheduling will not 
work well if you have a write intensive workload that needs to fight for 
locks between multiple GFS nodes. IIRC, there are gfs customers running 
on build-compile development environment. They normally assign groups of 
users on different GFS nodes, say user id starting with a-e on node 1, 
f-j on node2, etc.

One encouraging news from this email is gfs-netapp-san runs well on 
bonnie. GFS1 has been struggling with bonnie (large amount of smaller 
files within one single node) for a very long time. One of the reasons 
is its block allocation tends to get spread across the disk whenever 
there are resource group contentions. It is very difficult for linux IO 
scheduler to merge these blocks within one single server. When the 
workload becomes IO-bound, the locks are subsequently stalled and 
everything start to snow-ball after that. Netapp SAN has one more layer 
of block allocation indirection within its firmware and its write speed 
is "phenomenal" (I'm borrowing your words ;) ), mostly to do with the 
NVRAM where it can aggressively cache write data - this helps GFS to 
relieve its small file issue quite well.

-- Wendy



From Christopher.Barry at qlogic.com  Fri Apr 11 15:47:16 2008
From: Christopher.Barry at qlogic.com (christopher barry)
Date: Fri, 11 Apr 2008 11:47:16 -0400
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a	coffee first ; )>
In-Reply-To: <47FF83A5.6010806@gmail.com>
References: <1207622206.5259.66.camel@localhost>
	<alpine.LRH.1.10.0804081047550.23969@skynet.shatteredsilicon.net>
	<47FB8346.1070802@gmail.com> <1207858736.5188.77.camel@localhost>
	<47FF83A5.6010806@gmail.com>
Message-ID: <1207928836.5229.33.camel@localhost>

On Fri, 2008-04-11 at 10:28 -0500, Wendy Cheng wrote:
> christopher barry wrote:
> > On Tue, 2008-04-08 at 09:37 -0500, Wendy Cheng wrote:
> >   
> >> gordan at bobich.net wrote:
> >>     
> >>>       
> >>>> my setup:
> >>>> 6 rh4.5 nodes, gfs1 v6.1, behind redundant LVS directors. I know it's
> >>>> not new stuff, but corporate standards dictated the rev of rhat.
> >>>>         
> >>> [...]
> >>>       
> >>>> I'm noticing huge differences in compile times - or any home file access
> >>>> really - when doing stuff in the same home directory on the gfs on
> >>>> different nodes. For instance, the same compile on one node is ~12
> >>>> minutes - on another it's 18 minutes or more (not running concurrently).
> >>>> I'm also seeing weird random pauses in writes, like saving a file in vi,
> >>>> what would normally take less than a second, may take up to 10 seconds.
> >>>>         
> >
> > Anyway, thought I would re-connect to you all and let you know how this
> > worked out. We ended up scrapping gfs. Not because it's not a great fs,
> > but because I was using it in a way that was playing to it's weak
> > points. I had a lot of time and energy invested in it, and it was hard
> > to let it go. Turns out that connecting to the NetApp filer via nfs is
> > faster for this workload. I couldn't believe it either, as my bonnie and
> > dd type tests showed gfs to be faster. But for the use case of large
> > sets of very small files, and lots of stats going on, gfs simply cannot
> > compete with NetApp's nfs implementation. GFS is an excellent fs, and it
> > has it's place in the landscape - but for a development build system,
> > the NetApp is simply phenomenal.
> >   
> 
> Assuming you run both configurations (nfs-wafl vs. gfs-san) on the very 
> same netapp box (?) ...

yes.

> 
> Both configurations have their pros and cons. The wafl-nfs runs on 
> native mode that certainly has its advantages - you've made a good 
> choice but the latter (gfs-on-netapp san) can work well in other 
> situations. The biggest problem with your original configuration is the 
> load-balancer. The round-robin (and its variants) scheduling will not 
> work well if you have a write intensive workload that needs to fight for 
> locks between multiple GFS nodes. IIRC, there are gfs customers running 
> on build-compile development environment. They normally assign groups of 
> users on different GFS nodes, say user id starting with a-e on node 1, 
> f-j on node2, etc.

exactly. I was about to implement the sh (source hash) scheduler in LVS,
which I believe would have accomplished the same thing, only
automatically, and in a statistically balanced way. Actually still
might. I've had some developers test out the nfs solution and for some
gfs is still better. I know that if users are pinned to a node - but can
still failover in the event of node failure - this would yield the best
possible performance.

The main reason the IT group wants to use nfs, is for all of the other
benefits, such as file-level snapshots, better backup performance, etc.
Now that they see a chink in the gfs performance armor, mainly because I
implemented the wrong load balancing algorithm, they're circling for the
kill. I'm interested how well the nfs will scale with users vs. the
gfs-san approach.

> 
> One encouraging news from this email is gfs-netapp-san runs well on 
> bonnie. GFS1 has been struggling with bonnie (large amount of smaller 
> files within one single node) for a very long time. One of the reasons 
> is its block allocation tends to get spread across the disk whenever 
> there are resource group contentions. It is very difficult for linux IO 
> scheduler to merge these blocks within one single server. When the 
> workload becomes IO-bound, the locks are subsequently stalled and 
> everything start to snow-ball after that. Netapp SAN has one more layer 
> of block allocation indirection within its firmware and its write speed 
> is "phenomenal" (I'm borrowing your words ;) ), mostly to do with the 
> NVRAM where it can aggressively cache write data - this helps GFS to 
> relieve its small file issue quite well.

Thanks for all of your input Wendy.

-C

> 
> -- Wendy
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From agspoon at gmail.com  Fri Apr 11 16:05:35 2008
From: agspoon at gmail.com (Craig Johnston)
Date: Fri, 11 Apr 2008 09:05:35 -0700
Subject: [Linux-cluster] Achieving a stable cluster with a 2.6.21 kernel
In-Reply-To: <1207901329.3635.342.camel@quoit>
References: <f4f856dc0804101427r36e59d2o8c63fc70880c4637@mail.gmail.com>
	<1207901329.3635.342.camel@quoit>
Message-ID: <f4f856dc0804110905x99bd236hb9c43083e8e9b819@mail.gmail.com>

On Fri, Apr 11, 2008 at 1:08 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:
> Hi,
>
>
>
>  On Thu, 2008-04-10 at 14:27 -0700, Craig Johnston wrote:
>  > We would like to achieve a stable GFS/GFS2 cluster configuration using
>  > a non-Redhat distribution that is based on a 2.6.21 kernel.  Our first
>  > attempt was to obtain the Fedora Core 7 source rpms for the various
>  > components (cman, rgmanager, openais, etc.).  We were successful in
>  > incorporating these packages into our distribution, and creating what
>  > should be a working cluster configuration with multiple nodes sharing
>  > a set of GFS2 file systems from an iSCSI SAN.
>  >
>  > The problem is that it is all very unstable, takes forever to
>  > start-up, and locks up under even small load.  We would like to move
>  > to a more recent version of the cluster suite and update the kernel
>  > gfs2 and dlm modules for a 2.6.21 kernel.  We need to stick with
>  > 2.6.21 for other reasons (vendor support mostly), and we figure if it
>  > all can be back ported for RHEL5.1 (2.6.18) it should be doable for
>  > 2.6.21.  We just don't know where to start.
>  >
>  > Any advice on how we might proceed on this process would be greatly appreciated.
>  >
>  > Thanks,
>  > Craig
>  >
>  If you want to use GFS2, then try F-8, or rawhide with the most uptodate
>  set of packages. I would not recommend using a kernel that old for GFS2,
>
>  Steve.

Do you think we could be successful in patching up the GFS2/DLM
modules in our 2.6.21 kernel to bring it up to a more recent version?
 How coupled is the GFS2/DLM code to the rest of the kernel?  We have
a number of machines running CentOS 5.1.  Does it seem feasible to
select the applicable patches from that distribution and apply them to
a 2.6.21 kernel (with some tweaks no doubt)?

Craig



From agspoon at gmail.com  Fri Apr 11 16:09:14 2008
From: agspoon at gmail.com (Craig Johnston)
Date: Fri, 11 Apr 2008 09:09:14 -0700
Subject: [Linux-cluster] Achieving a stable cluster with a 2.6.21 kernel
In-Reply-To: <alpine.LRH.1.10.0804111033100.29827@skynet.shatteredsilicon.net>
References: <f4f856dc0804101427r36e59d2o8c63fc70880c4637@mail.gmail.com>
	<alpine.LRH.1.10.0804111033100.29827@skynet.shatteredsilicon.net>
Message-ID: <f4f856dc0804110909t65ea317as9d6e44706e01b33a@mail.gmail.com>

On Fri, Apr 11, 2008 at 2:38 AM,  <gordan at bobich.net> wrote:
> On Thu, 10 Apr 2008, Craig Johnston wrote:
>
>
> > We would like to achieve a stable GFS/GFS2 cluster configuration using
> > a non-Redhat distribution that is based on a 2.6.21 kernel.  Our first
> > attempt was to obtain the Fedora Core 7 source rpms for the various
> > components (cman, rgmanager, openais, etc.).  We were successful in
> > incorporating these packages into our distribution, and creating what
> > should be a working cluster configuration with multiple nodes sharing
> > a set of GFS2 file systems from an iSCSI SAN.
> >
> > The problem is that it is all very unstable, takes forever to
> > start-up, and locks up under even small load.
> >
>
>  Use GFS1. GFS2 still does that. FC6+ no longer ships with GFS1 support
> built in as standard. If you're going to stick to the tried path, use RHEL5
> (based) distributions. If you don't, you may well be better of just building
> the lot from source. It comes down to what your time is worth to you.
>
>  Gordan

Yes, we thought that we might have better luck with GFS, but like you
say it is not really available in more recent distributions.  What
would we need to do to get GFS working?  It is only the gfs-tools that
are missing, or do we need kernel changes as well?  Do the recent
versions of the cluster tools work with GFS?

Craig



From swhiteho at redhat.com  Fri Apr 11 16:13:59 2008
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Fri, 11 Apr 2008 17:13:59 +0100
Subject: [Linux-cluster] Achieving a stable cluster with a 2.6.21 kernel
In-Reply-To: <f4f856dc0804110905x99bd236hb9c43083e8e9b819@mail.gmail.com>
References: <f4f856dc0804101427r36e59d2o8c63fc70880c4637@mail.gmail.com>
	<1207901329.3635.342.camel@quoit>
	<f4f856dc0804110905x99bd236hb9c43083e8e9b819@mail.gmail.com>
Message-ID: <1207930439.3635.352.camel@quoit>

Hi,

On Fri, 2008-04-11 at 09:05 -0700, Craig Johnston wrote:
> On Fri, Apr 11, 2008 at 1:08 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:
> > Hi,
> >
> >
> >
> >  On Thu, 2008-04-10 at 14:27 -0700, Craig Johnston wrote:
> >  > We would like to achieve a stable GFS/GFS2 cluster configuration using
> >  > a non-Redhat distribution that is based on a 2.6.21 kernel.  Our first
> >  > attempt was to obtain the Fedora Core 7 source rpms for the various
> >  > components (cman, rgmanager, openais, etc.).  We were successful in
> >  > incorporating these packages into our distribution, and creating what
> >  > should be a working cluster configuration with multiple nodes sharing
> >  > a set of GFS2 file systems from an iSCSI SAN.
> >  >
> >  > The problem is that it is all very unstable, takes forever to
> >  > start-up, and locks up under even small load.  We would like to move
> >  > to a more recent version of the cluster suite and update the kernel
> >  > gfs2 and dlm modules for a 2.6.21 kernel.  We need to stick with
> >  > 2.6.21 for other reasons (vendor support mostly), and we figure if it
> >  > all can be back ported for RHEL5.1 (2.6.18) it should be doable for
> >  > 2.6.21.  We just don't know where to start.
> >  >
> >  > Any advice on how we might proceed on this process would be greatly appreciated.
> >  >
> >  > Thanks,
> >  > Craig
> >  >
> >  If you want to use GFS2, then try F-8, or rawhide with the most uptodate
> >  set of packages. I would not recommend using a kernel that old for GFS2,
> >
> >  Steve.
> 
> Do you think we could be successful in patching up the GFS2/DLM
> modules in our 2.6.21 kernel to bring it up to a more recent version?
>  How coupled is the GFS2/DLM code to the rest of the kernel?  We have
> a number of machines running CentOS 5.1.  Does it seem feasible to
> select the applicable patches from that distribution and apply them to
> a 2.6.21 kernel (with some tweaks no doubt)?
> 
> Craig

Not easily. One of the bugs since then was solved by a change in the VFS
so that its not just a question of applying patches to gfs2 on its own.
The version of GFS2 in RHEL has a different fix for this problem though,
so you might be able to borrow that. Either way its not going to be an
easy task and using a more recent kernel would be a much quicker way of
getting a more stable GFS2,

Steve.

> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From lexi.herrera at gmail.com  Fri Apr 11 17:22:29 2008
From: lexi.herrera at gmail.com (Lexi Herrera)
Date: Fri, 11 Apr 2008 13:22:29 -0400
Subject: [Linux-cluster] rhel hpcc
Message-ID: <6c3ea40804111022w18e53040r75cde52ef3ba605a@mail.gmail.com>

hello to all, I need aid to install cluster hpcc with red hat el as r 4.5,
it has 4 node with 4 cpu each on, the computers are ibm blades.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080411/6d06bf0d/attachment.htm>

From s.wendy.cheng at gmail.com  Sat Apr 12 04:16:52 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Fri, 11 Apr 2008 23:16:52 -0500
Subject: [Linux-cluster] dlm and IO speed problem <er, might wanna get
	a coffee first ; )>
In-Reply-To: <Pine.LNX.4.64.0804111124230.26805@lxserv1.kfki.hu>
References: <1207622206.5259.66.camel@localhost>	<1a2a6dd60804080213x17dc2578s75fcf7a92ea35790@mail.gmail.com>	<Pine.LNX.4.64.0804082245130.6860@lxserv0.kfki.hu>	<47FCDB60.8030807@gmail.com>
	<47FD02D3.7010802@gmail.com>	<Pine.LNX.4.64.0804092122001.23034@lxserv1.kfki.hu>	<47FD2A01.1030708@gmail.com>	<Pine.LNX.4.64.0804101349450.14375@lxserv1.kfki.hu>
	<Pine.LNX.4.64.0804111124230.26805@lxserv1.kfki.hu>
Message-ID: <480037B4.8000406@gmail.com>

Kadlecsik Jozsef wrote:
> On Thu, 10 Apr 2008, Kadlecsik Jozsef wrote:
>
>   
>> But this is a good clue to what might bite us most! Our GFS cluster is an 
>> almost mail-only cluster for users with Maildir. When the users experience 
>> temporary hangups for several seconds (even when writing a new mail), it 
>> might be due to the concurrent scanning for a new mail on one node by the 
>> MUA and the delivery to the Maildir in another node by the MTA.
>>     

I personally don't know much about mail server. But if anyone can 
explain more about what these two processes (?) do, say, how does that 
"MTA" deliver its mail (by "rename" system call ?) and/or how mails are 
moved from which node to where, we may have a better chance to figure 
this puzzle out.

Note that "rename" system call is normally very expensive. Minimum 4 
exclusive locks are required (two directory locks, one file lock for 
unlink, one file lock for link), plus resource group lock if block 
allocation is required. There are numerous chances for deadlocks if not 
handled carefully. The issue is further worsen by the way GFS1 does its 
lock ordering - it obtains multiple locks based on lock name order. Most 
of the locknames are taken from inode number so their sequence always 
quite random. As soon as lock contention occurs, lock requests will be 
serialized to avoid deadlocks. So this may be a cause for these spikes 
where "rename"(s) are struggling to get lock order straight. But I don't 
know for sure unless someone explains how email server does its things. 
BTW, GFS2 has relaxed this lock order issue so it should work better.

I'm having a trip (away from internet) but I'm interested to know this 
story... Maybe by the time I get back on my laptop, someone has figured 
this out. But please do share the story :) ...

-- Wendy

>> What is really strange (and distrurbing) that such "hangups" can take 
>> 10-20 seconds which is just too much for the users.
>>     
>
> Yesterday we started to monitor the number of locks/held locks on two of 
> the machines. The results from the first day can be found at 
> http://www.kfki.hu/~kadlec/gfs/.
>
> It looks as Maildir is definitely a wrong choice for GFS and we should 
> consider to convert to mailbox format: at least I cannot explain the 
> spikes in another way.
>  
>   
>> In order to look at the possible tuning options and the side effects, I 
>> list what I have learned so far:
>>
>> - Increasing glock_purge (percent, default 0) helps to trim back the 
>>   unused glocks by gfs_scand itself. Otherwise glocks can accumulate and 
>>   gfs_scand eats more and more time at scanning the larger and 
>>   larger table of glocks.
>> - gfs_scand wakes up every scand_secs (default 5s) to scan the glocks,  
>>   looking for work to do. By increasing scand_secs one can lessen the load 
>>   produced by gfs_scand, but it'll hurt because flushing data can be 
>>   delayed.
>> - Decreasing demote_secs (seconds, default 300) helps to flush cached data
>>   more often by moving write locks into less restricted states. Flushing 
>>   often helps to avoid burstiness *and* to prolong another nodes' 
>>   lock access. Question is, what are the side effects of small
>>   demote_secs values? (Probably there is no much point to choose
>>   smaller demote_secs value than scand_secs.)
>>
>> Currently we are running with 'glock_purge = 20' and 'demote_secs = 30'.
>>     
>
> Best regards,
> Jozsef
> --
> E-mail : kadlec at mail.kfki.hu, kadlec at blackhole.kfki.hu
> PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
> Address: KFKI Research Institute for Particle and Nuclear Physics
>          H-1525 Budapest 114, POB. 49, Hungary
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>   




From lists at tangent.co.za  Sun Apr 13 16:43:05 2008
From: lists at tangent.co.za (Chris Picton)
Date: Sun, 13 Apr 2008 16:43:05 +0000 (UTC)
Subject: [Linux-cluster] DRBD and redhat cluster
Message-ID: <fttd6p$a7k$1@ger.gmane.org>

Hi all

I am planning a pair of machines using DRBD to export redundant block 
devices via gnbd

I will have about 10 other machines using gfs2 to access this data.

I have thought of a few different ways of accomplishing this, and would 
appreciate any feedback you may have.

I can run the drbd exports in one of three ways:

1. Single drbd device in single master mode.  The drbd primary resource, 
shared ip address and gndb export will fail over between the two nodes.

Advantages.  Simple setup
Disadvantages.  One machine always fairly idle

2. Two drbd devices on each server.  Server1 will be primary on the one 
device, server 2 will be primary on the other device.  As above, each 
drbd resource will have an associated IP address and gnbd export 
associated with it.  These resources will fail over between the two 
nodes.  The remaining cluster nodes will assemble these using a striped 
clvm.

Advantages.  The two nodes will be equally used.  Will get double the 
throughput from the pair of machines (assuming high speed bonded 
crossover between the two).  Also, will be able to add another pair of 
these machines later to increase bandwidth and storage space for the LVM.
Disadvantages: More complex

3. A drbd device in master/master mode, exported from both via gnbd.  
Cluster members will access this via dm-multipath.  

Advantages: not sure
Disadvantages: Will bandwidth be shared between the two machines?



In addition to the options above, I am also wondering about the cluster 
software I should use.  I will have to use RHCS to use gfs2 and 
fence_gnbd correctly, but is heartbeat a better choice than failover 
domains for the drbd exports?

Any insight will be helpful

Chris



From gordan at bobich.net  Mon Apr 14 12:19:22 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Mon, 14 Apr 2008 13:19:22 +0100 (BST)
Subject: [Linux-cluster] Fencing Driver API Requirements
Message-ID: <alpine.LRH.1.10.0804141308350.4486@skynet.shatteredsilicon.net>

Hi,

I remember that this was mentioned several times in the last few months, 
but has any documentation been put together on the API that the fencing 
drivers are supposed to cover?

I'm looking into writing a fencing driver based on disabling switch ports 
on a managed 3com switch via the telnet interface, and I'd like to make 
sure that it conforms to any speciffic requirements that might exist. If 
someone could point me at the relevant URL, that would be most 
appreciated.

Gordan



From arjuna.christensen at maxnet.co.nz  Mon Apr 14 13:31:25 2008
From: arjuna.christensen at maxnet.co.nz (Arjuna Christensen)
Date: Tue, 15 Apr 2008 01:31:25 +1200
Subject: [Linux-cluster] DRBD and redhat cluster
In-Reply-To: <fttd6p$a7k$1@ger.gmane.org>
References: <fttd6p$a7k$1@ger.gmane.org>
Message-ID: <6DD7CC182D1E154E9F5FF6301B077EFE72EB16@exchange01.office.maxnet.co.nz>

I've been using RHCS to control DRBD quite happily, but only in a active/passive scenario.

All it requires is a little script, and an rgmanager '<script../>' object:

#!/bin/bash
exec /etc/ha.d/resource.d/drbddisk <resourcename> $@

(/etc/ha.d/resource.d/drbddisk is installed by the DRBD package)

Regards,

Arjuna Christensen?|?Systems Engineer?
Maximum Internet Ltd
DDI: + 64 9?913 9683 | Ph: +64 9 915 1825 | Fax:: +64 9 300 7227
arjuna.christensen at maxnet.co.nz| www.maxnet.co.nz

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Picton
Sent: Monday, 14 April 2008 4:43 a.m.
To: linux-cluster at redhat.com
Subject: [Linux-cluster] DRBD and redhat cluster

Hi all

I am planning a pair of machines using DRBD to export redundant block 
devices via gnbd

I will have about 10 other machines using gfs2 to access this data.

I have thought of a few different ways of accomplishing this, and would 
appreciate any feedback you may have.

I can run the drbd exports in one of three ways:

1. Single drbd device in single master mode.  The drbd primary resource, 
shared ip address and gndb export will fail over between the two nodes.

Advantages.  Simple setup
Disadvantages.  One machine always fairly idle

2. Two drbd devices on each server.  Server1 will be primary on the one 
device, server 2 will be primary on the other device.  As above, each 
drbd resource will have an associated IP address and gnbd export 
associated with it.  These resources will fail over between the two 
nodes.  The remaining cluster nodes will assemble these using a striped 
clvm.

Advantages.  The two nodes will be equally used.  Will get double the 
throughput from the pair of machines (assuming high speed bonded 
crossover between the two).  Also, will be able to add another pair of 
these machines later to increase bandwidth and storage space for the LVM.
Disadvantages: More complex

3. A drbd device in master/master mode, exported from both via gnbd.  
Cluster members will access this via dm-multipath.  

Advantages: not sure
Disadvantages: Will bandwidth be shared between the two machines?



In addition to the options above, I am also wondering about the cluster 
software I should use.  I will have to use RHCS to use gfs2 and 
fence_gnbd correctly, but is heartbeat a better choice than failover 
domains for the drbd exports?

Any insight will be helpful

Chris

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From mgrac at redhat.com  Mon Apr 14 18:47:32 2008
From: mgrac at redhat.com (Marek 'marx' Grac)
Date: Mon, 14 Apr 2008 20:47:32 +0200
Subject: [Linux-cluster] Fencing Driver API Requirements
In-Reply-To: <alpine.LRH.1.10.0804141308350.4486@skynet.shatteredsilicon.net>
References: <alpine.LRH.1.10.0804141308350.4486@skynet.shatteredsilicon.net>
Message-ID: <4803A6C4.20706@redhat.com>

Hi,

gordan at bobich.net wrote:
>
> I remember that this was mentioned several times in the last few 
> months, but has any documentation been put together on the API that 
> the fencing drivers are supposed to cover?
>
> I'm looking into writing a fencing driver based on disabling switch 
> ports on a managed 3com switch via the telnet interface, and I'd like 
> to make sure 
> that it conforms to any speciffic requirements that might exist. If 
> someone could point me at the relevant URL, that would be most 
> appreciated.

There is a new python module in the git (master branch / 
cluster/gence/agents/lib/fencing.py) that should contain everything you 
should need to write a fence agent. This module was used to built 
several agents (they are just in the git tree) eg. apc/apc.py, 
drac/drac5.py, wti/wti.py. If you will find any problem with fencing.py, 
let me know and I will try to fix it.


marx,
-- 

Marek Grac
Red Hat Czech s.r.o.



From underscore_dot at yahoo.com  Mon Apr 14 21:40:10 2008
From: underscore_dot at yahoo.com (nch)
Date: Mon, 14 Apr 2008 14:40:10 -0700 (PDT)
Subject: [Linux-cluster] how can I share  a logical volume?
Message-ID: <831443.94855.qm@web32406.mail.mud.yahoo.com>

Hello, everybody.

I'm trying to run a cluster with 3 nodes. One of them would share storage with the other two using GFS and DLM (kernel 2.4.18-6).
I was able to start ccsd, cman, fenced and clvmd in all nodes. I've defined a logical volume in the storage node and was able to gfs_mkfs, activate it locally and mount it, but I don't know how to make it available/visible to the other two nodes.
Do you know how to do this?
I've followed instruction given in http://sources.redhat.com/cluster/doc/usage.txt (except for setting locking_type=2).

Many thanks.




      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080414/e8255015/attachment.htm>

From shawnlhood at gmail.com  Tue Apr 15 00:06:07 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Mon, 14 Apr 2008 20:06:07 -0400
Subject: [Linux-cluster] how can I share a logical volume?
In-Reply-To: <831443.94855.qm@web32406.mail.mud.yahoo.com>
References: <831443.94855.qm@web32406.mail.mud.yahoo.com>
Message-ID: <cfe2fc960804141706s52b6fde3x4f6c494f62b5137d@mail.gmail.com>

As far as I know, you should be able to at least SEE the logical
volume as long as there is a path to the physical volumes on the other
nodes.  Are you able to see the same block devices (eg /dev/sd?) on
the other nodes?

Shawn Hood



2008/4/14 nch <underscore_dot at yahoo.com>:
>
> Hello, everybody.
>
> I'm trying to run a cluster with 3 nodes. One of them would share storage
> with the other two using GFS and DLM (kernel 2.4.18-6).
> I was able to start ccsd, cman, fenced and clvmd in all nodes. I've defined
> a logical volume in the storage node and was able to gfs_mkfs, activate it
> locally and mount it, but I don't know how to make it available/visible to
> the other two nodes.
> Do you know how to do this?
> I've followed instruction given in
> http://sources.redhat.com/cluster/doc/usage.txt (except for setting
> locking_type=2).
>
> Many thanks.
>
>
> --
>  Linux-cluster mailing list
>  Linux-cluster at redhat.com
>  https://www.redhat.com/mailman/listinfo/linux-cluster
>



From underscore_dot at yahoo.com  Tue Apr 15 06:22:11 2008
From: underscore_dot at yahoo.com (nch)
Date: Mon, 14 Apr 2008 23:22:11 -0700 (PDT)
Subject: [Linux-cluster] how can I share a logical volume?
Message-ID: <694493.83478.qm@web32404.mail.mud.yahoo.com>

No, I can't see the logical volumes on the other nodes. vgscan doesn't show any, nor I can find any new devices in /dev.
As I couldn't find docs/examples on this particular point, I really don't know what to expect.
I'm trying with different types of logical volumes (stripped, mirrored), but didn't make a difference.
For the moment, I'm running all the stuff on virtual machines, could this be an issue?
For the moment I'm using a minimal cluster.conf, in which I just declare the nodes. Should I add specific configurations to it?

Lots of thanks.

----- Original Message ----
From: Shawn Hood <shawnlhood at gmail.com>
To: linux clustering <linux-cluster at redhat.com>
Sent: Tuesday, April 15, 2008 2:06:07 AM
Subject: Re: [Linux-cluster] how can I share a logical volume?

As far as I know, you should be able to at least SEE the logical
volume as long as there is a path to the physical volumes on the other
nodes.  Are you able to see the same block devices (eg /dev/sd?) on
the other nodes?

Shawn Hood



2008/4/14 nch <underscore_dot at yahoo.com>:
>
> Hello, everybody.
>
> I'm trying to run a cluster with 3 nodes. One of them would share storage
> with the other two using GFS and DLM (kernel 2.4.18-6).
> I was able to start ccsd, cman, fenced and clvmd in all nodes. I've defined
> a logical volume in the storage node and was able to gfs_mkfs, activate it
> locally and mount it, but I don't know how to make it available/visible to
> the other two nodes.
> Do you know how to do this?
> I've followed instruction given in
> http://sources.redhat.com/cluster/doc/usage.txt (except for setting
> locking_type=2).
>
> Many thanks.
>
>
> --
>  Linux-cluster mailing list
>  Linux-cluster at redhat.com
>  https://www.redhat.com/mailman/listinfo/linux-cluster
>

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster






      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080414/d07a6f0e/attachment.htm>

From Alain.Moulle at bull.net  Tue Apr 15 14:22:24 2008
From: Alain.Moulle at bull.net (Alain Moulle)
Date: Tue, 15 Apr 2008 16:22:24 +0200
Subject: [Linux-cluster] CS5 / timers tuning (contd)
Message-ID: <4804BA20.30709@bull.net>

Hi Lon

Thans again, but that's strange because in the man , the recommended
values are :
intervall="1" tko="10" and so we have a result < 21s which is the
default value of heart-beat timer, so not a hair above like you
recommened in previous email ...
extract of man qddisk :

         interval="1"
            This is the frequency of read/write cycles, in seconds.

         tko="10"
            This  is  the  number  of  cycles  a node must miss in order to be
            declared dead.

?

PS : " don't recall if there's a way to do it from
cluster.conf"
 yes we can change the deadnode_timeout in cluster.conf :
<cman deadnode_timer="DEADNODETIMER_VALUE"/>

Thanks
Regards
Alain Moull?



From andrew at ntsg.umt.edu  Tue Apr 15 15:06:30 2008
From: andrew at ntsg.umt.edu (Andrew A. Neuschwander)
Date: Tue, 15 Apr 2008 09:06:30 -0600 (MDT)
Subject: [Linux-cluster] how can I share a logical volume?
In-Reply-To: <694493.83478.qm@web32404.mail.mud.yahoo.com>
References: <694493.83478.qm@web32404.mail.mud.yahoo.com>
Message-ID: <41086.10.8.105.69.1208271990.squirrel@secure.ntsg.umt.edu>

Did you setup the node with the storage as a gnbd server and the other
nodes as gnbd clients? I think this is what you want in order for the
nodes to all have block level access to the storage for clvmd and dlm to
run on top of.

-A
-- 
Andrew A. Neuschwander, RHCE
Linux Systems/Software Engineer
College of Forestry and Conservation
The University of Montana
http://www.ntsg.umt.edu
andrew at ntsg.umt.edu - 406.243.6310

On Tue, April 15, 2008 12:22 am, nch wrote:
> No, I can't see the logical volumes on the other nodes. vgscan doesn't
> show any, nor I can find any new devices in /dev.
> As I couldn't find docs/examples on this particular point, I really don't
> know what to expect.
> I'm trying with different types of logical volumes (stripped, mirrored),
> but didn't make a difference.
> For the moment, I'm running all the stuff on virtual machines, could this
> be an issue?
> For the moment I'm using a minimal cluster.conf, in which I just declare
> the nodes. Should I add specific configurations to it?
>
> Lots of thanks.
>
> ----- Original Message ----
> From: Shawn Hood <shawnlhood at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Sent: Tuesday, April 15, 2008 2:06:07 AM
> Subject: Re: [Linux-cluster] how can I share a logical volume?
>
> As far as I know, you should be able to at least SEE the logical
> volume as long as there is a path to the physical volumes on the other
> nodes.  Are you able to see the same block devices (eg /dev/sd?) on
> the other nodes?
>
> Shawn Hood
>
>
>
> 2008/4/14 nch <underscore_dot at yahoo.com>:
>>
>> Hello, everybody.
>>
>> I'm trying to run a cluster with 3 nodes. One of them would share
>> storage
>> with the other two using GFS and DLM (kernel 2.4.18-6).
>> I was able to start ccsd, cman, fenced and clvmd in all nodes. I've
>> defined
>> a logical volume in the storage node and was able to gfs_mkfs, activate
>> it
>> locally and mount it, but I don't know how to make it available/visible
>> to
>> the other two nodes.
>> Do you know how to do this?
>> I've followed instruction given in
>> http://sources.redhat.com/cluster/doc/usage.txt (except for setting
>> locking_type=2).
>>
>> Many thanks.
>>
>>



From lexi.herrera at gmail.com  Tue Apr 15 16:49:08 2008
From: lexi.herrera at gmail.com (Lexi Herrera)
Date: Tue, 15 Apr 2008 12:49:08 -0400
Subject: [Linux-cluster] red hat enterprise
Message-ID: <6c3ea40804150949i45f3b50cs3569e96221a2ea90@mail.gmail.com>

hi everybody, i am new in my job and new in linux and not have enough
experience with this installations, i know solaris, but don't worry. I need
install a high performance cluster with red hat enterprise and the
information that i knows is this:



-         red hat enterprise as 4.5

-         torche

-         pvm

-         mpi

-         promax

-         focus

-         gpfs or better

-         ganglia or better



please i need a lot aid to make this installation.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080415/7997130f/attachment.htm>

From jruemker at redhat.com  Tue Apr 15 18:04:46 2008
From: jruemker at redhat.com (John Ruemker)
Date: Tue, 15 Apr 2008 14:04:46 -0400
Subject: [Linux-cluster] how can I share a logical volume?
In-Reply-To: <41086.10.8.105.69.1208271990.squirrel@secure.ntsg.umt.edu>
References: <694493.83478.qm@web32404.mail.mud.yahoo.com>
	<41086.10.8.105.69.1208271990.squirrel@secure.ntsg.umt.edu>
Message-ID: <4804EE3E.4000005@redhat.com>

When you say you want to share storage with the other 2 nodes, do you 
mean only one node is physically connected to the storage and exports it 
to the other 2?  Or do you mean that all three nodes are connected to 
the same storage and they share the device? 


If the former, gnbd is probably what you want.  If the latter, you 
should see the same devices (/dev/sdX) on each node.  If you do not, you 
have misconfigured your HBA or LUNs.  Once they all see the same 
devices, you should be able to start clvmd and all 3 will see the 
clustered volume group.


John

Andrew A. Neuschwander wrote:
> Did you setup the node with the storage as a gnbd server and the other
> nodes as gnbd clients? I think this is what you want in order for the
> nodes to all have block level access to the storage for clvmd and dlm to
> run on top of.
>
> -A
>   



From shawnlhood at gmail.com  Tue Apr 15 18:26:47 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Tue, 15 Apr 2008 14:26:47 -0400
Subject: [Linux-cluster] red hat enterprise
In-Reply-To: <6c3ea40804150949i45f3b50cs3569e96221a2ea90@mail.gmail.com>
References: <6c3ea40804150949i45f3b50cs3569e96221a2ea90@mail.gmail.com>
Message-ID: <cfe2fc960804151126m9d746ffs9b3f63cac11a2298@mail.gmail.com>

I hate to break it to you, but this kind of message isn't going to get
you anywhere.  I can assure you that many who read this message are
thinking RTFM (see http://en.wikipedia.org/wiki/RTFM).  You're going
to have to hit the books like the rest of us.

Shawn Hood

2008/4/15 Lexi Herrera <lexi.herrera at gmail.com>:
>
>
> hi everybody, i am new in my job and new in linux and not have enough
> experience with this installations, i know solaris, but don't worry. I need
> install a high performance cluster with red hat enterprise and the
> information that i knows is this:
>
>
>
> -         red hat enterprise as 4.5
>
> -         torche
>
> -         pvm
>
> -         mpi
>
> -         promax
>
> -         focus
>
> -         gpfs or better
>
> -         ganglia or better
>
>
>
> please i need a lot aid to make this installation.
> --
>  Linux-cluster mailing list
>  Linux-cluster at redhat.com
>  https://www.redhat.com/mailman/listinfo/linux-cluster
>



-- 
Shawn Hood
(910) 670-1819 Mobile



From underscore_dot at yahoo.com  Tue Apr 15 20:21:16 2008
From: underscore_dot at yahoo.com (nch)
Date: Tue, 15 Apr 2008 13:21:16 -0700 (PDT)
Subject: [Linux-cluster] how can I share a logical volume?
Message-ID: <757725.56266.qm@web32403.mail.mud.yahoo.com>


  The gnbd approach seems to fit, although don't quite undestand why this approach is better than the one I started with.
  However, I upgraded to kernel 2.6.24, cluster-2.03.00, ais, ... and followed instructions described in doc/min-gfs.txt in  ftp://sources.redhat.com/pub/cluster/releases/cluster-2.03.00.tar.gz.
  I reached the point in which I could gfs_mkfs, but I could not mount the new fs cause it complains about insufficient number of journals (I tried 2 and 4 journals) while having only one cluster node and the gnbd server.

  Kind regards

----- Original Message ----
From: Andrew A. Neuschwander <andrew at ntsg.umt.edu>
To: linux clustering <linux-cluster at redhat.com>
Sent: Tuesday, April 15, 2008 5:06:30 PM
Subject: Re: [Linux-cluster] how can I share a logical volume?

Did you setup the node with the storage as a gnbd server and the other
nodes as gnbd clients? I think this is what you want in order for the
nodes to all have block level access to the storage for clvmd and dlm to
run on top of.

-A
-- 
Andrew A. Neuschwander, RHCE
Linux Systems/Software Engineer
College of Forestry and Conservation
The University of Montana
http://www.ntsg.umt.edu
andrew at ntsg.umt.edu - 406.243.6310

On Tue, April 15, 2008 12:22 am, nch wrote:
> No, I can't see the logical volumes on the other nodes. vgscan doesn't
> show any, nor I can find any new devices in /dev.
> As I couldn't find docs/examples on this particular point, I really don't
> know what to expect.
> I'm trying with different types of logical volumes (stripped, mirrored),
> but didn't make a difference.
> For the moment, I'm running all the stuff on virtual machines, could this
> be an issue?
> For the moment I'm using a minimal cluster.conf, in which I just declare
> the nodes. Should I add specific configurations to it?
>
> Lots of thanks.
>
> ----- Original Message ----
> From: Shawn Hood <shawnlhood at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Sent: Tuesday, April 15, 2008 2:06:07 AM
> Subject: Re: [Linux-cluster] how can I share a logical volume?
>
> As far as I know, you should be able to at least SEE the logical
> volume as long as there is a path to the physical volumes on the other
> nodes.  Are you able to see the same block devices (eg /dev/sd?) on
> the other nodes?
>
> Shawn Hood
>
>
>
> 2008/4/14 nch <underscore_dot at yahoo.com>:
>>
>> Hello, everybody.
>>
>> I'm trying to run a cluster with 3 nodes. One of them would share
>> storage
>> with the other two using GFS and DLM (kernel 2.4.18-6).
>> I was able to start ccsd, cman, fenced and clvmd in all nodes. I've
>> defined
>> a logical volume in the storage node and was able to gfs_mkfs, activate
>> it
>> locally and mount it, but I don't know how to make it available/visible
>> to
>> the other two nodes.
>> Do you know how to do this?
>> I've followed instruction given in
>> http://sources.redhat.com/cluster/doc/usage.txt (except for setting
>> locking_type=2).
>>
>> Many thanks.
>>
>>

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster






      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080415/1a831655/attachment.htm>

From underscore_dot at yahoo.com  Tue Apr 15 20:38:49 2008
From: underscore_dot at yahoo.com (nch)
Date: Tue, 15 Apr 2008 13:38:49 -0700 (PDT)
Subject: [Linux-cluster] how can I share a logical volume?
Message-ID: <997501.91333.qm@web32401.mail.mud.yahoo.com>


Only one node is phisically connected, at this moment (this might change in the future).
I was able to create a logical volume and mount it locally, so I assume everything was correctly connected. Am I wrong?

Thank you.

----- Original Message ----
From: John Ruemker <jruemker at redhat.com>
To: linux clustering <linux-cluster at redhat.com>
Sent: Tuesday, April 15, 2008 8:04:46 PM
Subject: Re: [Linux-cluster] how can I share a logical volume?

When you say you want to share storage with the other 2 nodes, do you 
mean only one node is physically connected to the storage and exports it 
to the other 2?  Or do you mean that all three nodes are connected to 
the same storage and they share the device? 


If the former, gnbd is probably what you want.  If the latter, you 
should see the same devices (/dev/sdX) on each node.  If you do not, you 
have misconfigured your HBA or LUNs.  Once they all see the same 
devices, you should be able to start clvmd and all 3 will see the 
clustered volume group.


John

Andrew A. Neuschwander wrote:
> Did you setup the node with the storage as a gnbd server and the other
> nodes as gnbd clients? I think this is what you want in order for the
> nodes to all have block level access to the storage for clvmd and dlm to
> run on top of.
>
> -A
>   

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster






      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080415/c42a9288/attachment.htm>

From lexi.herrera at gmail.com  Tue Apr 15 20:47:05 2008
From: lexi.herrera at gmail.com (Lexi Herrera)
Date: Tue, 15 Apr 2008 16:47:05 -0400
Subject: [Linux-cluster] red hat enterprise
In-Reply-To: <cfe2fc960804151126m9d746ffs9b3f63cac11a2298@mail.gmail.com>
References: <6c3ea40804150949i45f3b50cs3569e96221a2ea90@mail.gmail.com>
	<cfe2fc960804151126m9d746ffs9b3f63cac11a2298@mail.gmail.com>
Message-ID: <6c3ea40804151347n3edb4624o410af2701996d2a3@mail.gmail.com>

thank you very much by its sincere and fast answer.

On Tue, Apr 15, 2008 at 2:26 PM, Shawn Hood <shawnlhood at gmail.com> wrote:

> I hate to break it to you, but this kind of message isn't going to get
> you anywhere.  I can assure you that many who read this message are
> thinking RTFM (see http://en.wikipedia.org/wiki/RTFM).  You're going
> to have to hit the books like the rest of us.
>
> Shawn Hood
>
> 2008/4/15 Lexi Herrera <lexi.herrera at gmail.com>:
> >
> >
> > hi everybody, i am new in my job and new in linux and not have enough
> > experience with this installations, i know solaris, but don't worry. I
> need
> > install a high performance cluster with red hat enterprise and the
> > information that i knows is this:
> >
> >
> >
> > -         red hat enterprise as 4.5
> >
> > -         torche
> >
> > -         pvm
> >
> > -         mpi
> >
> > -         promax
> >
> > -         focus
> >
> > -         gpfs or better
> >
> > -         ganglia or better
> >
> >
> >
> > please i need a lot aid to make this installation.
> > --
> >  Linux-cluster mailing list
> >  Linux-cluster at redhat.com
> >  https://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
>
>
> --
> Shawn Hood
> (910) 670-1819 Mobile
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080415/6d334046/attachment.htm>

From underscore_dot at yahoo.com  Wed Apr 16 10:00:42 2008
From: underscore_dot at yahoo.com (nch)
Date: Wed, 16 Apr 2008 03:00:42 -0700 (PDT)
Subject: [Linux-cluster] how can I share a logical volume?
Message-ID: <21508.16024.qm@web32401.mail.mud.yahoo.com>

This is the exact error message:

client1:# mount -t gfs /dev/gnbd/sharedvol /mnt
Trying to join cluster "lock_dlm", "testcluster:testfs"
dlm: Using TCP for communications
Joined cluster. Now mounting FS...
GFS: fsid=testcluster:testfs.4294867295: can't mount journal #4294867295
GFS: fsid=testcluster:testfs.4294867295: there are only 6 journals (0 - 5)

I can't find anyone else having issue. Can you figure out why is this happening?

Cheers

----- Original Message ----
From: John Ruemker <jruemker at redhat.com>
To: linux clustering <linux-cluster at redhat.com>
Sent: Tuesday, April 15, 2008 8:04:46 PM
Subject: Re: [Linux-cluster] how can I share a logical volume?

When you say you want to share storage with the other 2 nodes, do you 
mean only one node is physically connected to the storage and exports it 
to the other 2?  Or do you mean that all three nodes are connected to 
the same storage and they share the device? 


If the former, gnbd is probably what you want.  If the latter, you 
should see the same devices (/dev/sdX) on each node.  If you do not, you 
have misconfigured your HBA or LUNs.  Once they all see the same 
devices, you should be able to start clvmd and all 3 will see the 
clustered volume group.


John

Andrew A. Neuschwander wrote:
> Did you setup the node with the storage as a gnbd server and the other
> nodes as gnbd clients? I think this is what you want in order for the
> nodes to all have block level access to the storage for clvmd and dlm to
> run on top of.
>
> -A
>   

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster






      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080416/396f85eb/attachment.htm>

From punit_j at rediffmail.com  Wed Apr 16 12:05:14 2008
From: punit_j at rediffmail.com (punit_j)
Date: 16 Apr 2008 12:05:14 -0000
Subject: [Linux-cluster] errors for members joining cluster
Message-ID: <20080416120514.7225.qmail@f5mail-237-211.rediffmail.com>

 ?
Hi All,

I have created a 2 node active passive cluster. When i am starting the ccsd deamon and cman daemon i get following in the /var/log/messages :- 

Apr 16 11:46:38 wesnet-store ccsd[2395]: Cluster is not quorate.  Refusing connection.
Apr 16 11:46:38 wesnet-store ccsd[2395]: Error while processing connect: Connection refused
Apr 16 11:46:38 wesnet-store clurgmgrd[2788]: <crit> #5: Couldn't connect to ccsd!
Apr 16 11:46:38 wesnet-store clurgmgrd[2788]: <crit> #8: Couldn't initialize services

Apr 16 13:16:13 wesnet-store ccsd[2395]: Unable to connect to cluster infrastructure after 5370 seconds.
Apr 16 13:16:43 wesnet-store ccsd[2395]: Unable to connect to cluster infrastructure after 5400 seconds.
Apr 16 13:17:13 wesnet-store ccsd[2395]: Unable to connect to cluster infrastructure after 5430 seconds.
Apr 16 13:17:44 wesnet-store ccsd[2395]: Unable to connect to cluster infrastructure after 5460 seconds.
Apr 16 13:18:14 wesnet-store ccsd[2395]: Unable to connect to cluster infrastructure after 5490 seconds.
Apr 16 13:18:44 wesnet-store ccsd[2395]: Unable to connect to cluster infrastructure after 5520 seconds.
 
Moreover the strange thing is it is trying to connect to another cluster in same network. My cluster has a different name as compared to another cluster running in the network. Can anyone point out what can be the issue?

Regards,
-Punit


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080416/e57fa134/attachment.htm>

From denisb+gmane at gmail.com  Wed Apr 16 12:19:57 2008
From: denisb+gmane at gmail.com (denis)
Date: Wed, 16 Apr 2008 14:19:57 +0200
Subject: [Linux-cluster] GFS in RHCS on RHEL5.1 
Message-ID: <fu4qnq$gg2$1@ger.gmane.org>

Hi,

I was under the impression that installing Red Hat Cluster Suite with
GFS in RHEL5.1 was a "supported" solution, but a colleague informed me
that the GFS version in RHEL5.x is currently a technology preview?!

Is this correct, and if so, what is my best option for running a shared
filesystem between my clusternodes (i have a SAN available)?

Regards
--
Denis



From fog at t.is  Wed Apr 16 12:20:43 2008
From: fog at t.is (=?iso-8859-1?Q?Finnur_=D6rn_Gu=F0mundsson_-_TM_Software?=)
Date: Wed, 16 Apr 2008 12:20:43 -0000
Subject: [Linux-cluster] GFS in RHCS on RHEL5.1 
In-Reply-To: <fu4qnq$gg2$1@ger.gmane.org>
References: <fu4qnq$gg2$1@ger.gmane.org>
Message-ID: <3DDA6E3E456E144DA3BB0A62A7F7F77901F9E61F@SKYHQAMX08.klasi.is>

Hi,

GFS version 1 is supported, however version 2 is currently in a technical preview.

Bgrds,
Finnur

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of denis
Sent: 16. apr?l 2008 12:20
To: linux-cluster at redhat.com
Subject: [Linux-cluster] GFS in RHCS on RHEL5.1 

Hi,

I was under the impression that installing Red Hat Cluster Suite with
GFS in RHEL5.1 was a "supported" solution, but a colleague informed me
that the GFS version in RHEL5.x is currently a technology preview?!

Is this correct, and if so, what is my best option for running a shared
filesystem between my clusternodes (i have a SAN available)?

Regards
--
Denis

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From gordan at bobich.net  Wed Apr 16 12:27:48 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Wed, 16 Apr 2008 13:27:48 +0100 (BST)
Subject: [Linux-cluster] GFS in RHCS on RHEL5.1
In-Reply-To: <fu4qnq$gg2$1@ger.gmane.org>
References: <fu4qnq$gg2$1@ger.gmane.org>
Message-ID: <alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>



On Wed, 16 Apr 2008, denis wrote:

> Hi,
>
> I was under the impression that installing Red Hat Cluster Suite with
> GFS in RHEL5.1 was a "supported" solution, but a colleague informed me
> that the GFS version in RHEL5.x is currently a technology preview?!
>
> Is this correct, and if so, what is my best option for running a shared
> filesystem between my clusternodes (i have a SAN available)?

Your colleague is semi-misinformed.

GFS1 is stable (and has been for years) and available in RHEL5. GFS2 is 
tech preview. So if you are deploying a cluster right now, do it with 
GFS1. GFS2 is not yet recommended for production use.

Gordan



From denisb+gmane at gmail.com  Wed Apr 16 13:38:17 2008
From: denisb+gmane at gmail.com (denis)
Date: Wed, 16 Apr 2008 15:38:17 +0200
Subject: [Linux-cluster] Re: GFS in RHCS on RHEL5.1
In-Reply-To: <alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
Message-ID: <fu4vam$23n$1@ger.gmane.org>

gordan at bobich.net wrote:
>> I was under the impression that installing Red Hat Cluster Suite with
>> GFS in RHEL5.1 was a "supported" solution, but a colleague informed me
>> that the GFS version in RHEL5.x is currently a technology preview?!
> Your colleague is semi-misinformed.
> GFS1 is stable (and has been for years) and available in RHEL5. GFS2 is
> tech preview. So if you are deploying a cluster right now, do it with
> GFS1. GFS2 is not yet recommended for production use.

Thanks for the information. My follow-up question is whether GFS (1)
tolerates high performance situations (with lots of concurrent writes /
read access / high number of files)?

Regards
--
Denis



From gordan at bobich.net  Wed Apr 16 13:42:29 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Wed, 16 Apr 2008 14:42:29 +0100 (BST)
Subject: [Linux-cluster] Re: GFS in RHCS on RHEL5.1
In-Reply-To: <fu4vam$23n$1@ger.gmane.org>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
Message-ID: <alpine.LRH.1.10.0804161436390.21120@skynet.shatteredsilicon.net>

On Wed, 16 Apr 2008, denis wrote:

>>> I was under the impression that installing Red Hat Cluster Suite with
>>> GFS in RHEL5.1 was a "supported" solution, but a colleague informed me
>>> that the GFS version in RHEL5.x is currently a technology preview?!
>> Your colleague is semi-misinformed.
>> GFS1 is stable (and has been for years) and available in RHEL5. GFS2 is
>> tech preview. So if you are deploying a cluster right now, do it with
>> GFS1. GFS2 is not yet recommended for production use.
>
> Thanks for the information. My follow-up question is whether GFS (1)
> tolerates high performance situations (with lots of concurrent writes /
> read access / high number of files)?

As much as any other similar system does. If your heavy writes with lots 
of files are all in the same directory, then you will get contention and 
performance degradation, as writing to a directory (e.g. file creation) 
requires a directory lock. Something like Maildir with few user accounts 
won't perform brilliantly. Maildir with lots of user accounts, isn't too 
bad. There are also tuning parameters you can apply (e.g. lock pruning) 
that help in such cases.

The only way you will know for sure is to try it and see for your 
particular application. If GFS can't handle it despite optimizations, you 
could always try OCFS2 or GlusterFS (please do post back with your 
findings if you get that far, I've not seen a decent real-world 
comparison recently), but if you are after clusterable scalability, then 
your application will likely need to be made/modified in such a way that 
it doesn't trip over issues inherent in clustering (and there will be FS 
lock contention issues with _ANY_ scaleable cluster FS).

Gordan



From Harri.Paivaniemi at tietoenator.com  Thu Apr 17 05:40:24 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Thu, 17 Apr 2008 08:40:24 +0300
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
References: <fu4qnq$gg2$1@ger.gmane.org><alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>

Hi all,

Short introduction: My name is Harry, I'm working in Helsinki, Finland and have used RHCS from the beginning, we have currently 7 clusters mainly running MySQL/Oracle databases.

I tought I have some kind of knowledge about this clustering software and everything seemed to be ok until version 5. I don't have problems or severe bugging in any of RH4- clusters.

But....

Tryed to move --> 5.1 with 64-bit HP Blades. Cluster just won't work or it works but I don't have any kind of trust to it anymore. I have made about 20 different scenarios and there is totally too much problems, couple of those will prevent me to use this anymore. I have created 3 tickets to RH support and it seems to me that they don't know that little what I know. I have had to tell them 2 times to read the f...g manual, because they have spoken directly agains qdisk man-page. They just don't know how it should work... hard to believe but tru.

First, I asked how to change cman deadnode_timeout in 5, because /proc doesn't anymore have it and that parameter didn't work on my tests. Support said "you can't tune the timeout at all". I asked, how can I  use qdisk if man page says cman's timeout must be > than qdisk eviction timeout.... and told them to read the man-page... finally I found myself the correct parameter "totem token"

Second time, they said in my 2-node cluster I made a mistake when I gave 1 vote for the quorum disk... but man-page again tell's to do that and of course it is correct in 2-node cluster....

So, this is my sad history with ver 5. Do you use 64-bit ver 5 and what's your feeling?

My problems this time are:

1. 2-node cluster. Can't start only one node to get cluster services up - it hangs in fencing and waits until I start te second node and immediately after that, when both nodes are starting cman, the cluster comes up. So if I have lost one node, I can't get the cluster up, if I have to restart for seome reason the working node. It should work like before (both nodes are down, I start one, it fences another and comes up). Now it just waits... log says:

ccsd[25272]: Error while processing connect: Connection refused

This is so common error message, that it just tell's nothing to me....

2. qdisk doesn't work. 2- node cluster. Start it (both nodes at the same time) to get it up. Works ok, qdisk works, heuristic works. Everything works. If I stop cluster daemons on one node, that node can't join to cluster anymore without a complete reboot. It joins, another node says ok, the node itself says ok, quorum is registred and heuristic is up, but the node's quorum-disk stays offline and another node says this node is offline. If I reboot this machine, it joins to cluster ok.

3. Funny thing: heuristic ping didn't work at all in the beginning and support gave me a "ping-script" which make it to work... so this describes quite well how experimental this cluster is nowadays...

I have to tell you it is a FACT that basics are ok: fencing works ok in a normal situation, I don't have typos, configs are in sync,  everything is ok, but these problems still exists.

I have 2 times sent sosreports etc. so RH support. They hava spent 3 weeks and still can't say whats wrong...


Just if somebody has something in mind to help...

Thanks,

-hjp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 4384 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080417/64dd0e20/attachment.bin>

From p.elmers at gmx.de  Thu Apr 17 07:08:25 2008
From: p.elmers at gmx.de (Peter)
Date: Thu, 17 Apr 2008 09:08:25 +0200
Subject: [Linux-cluster] Meaning of Cluster Cycle and timeout problems
Message-ID: <BA222AAE-0F73-46CC-9F02-7BB78CE2AD37@gmx.de>

Hi!

In our Cluster we have the following entry in the "messages" logfile:

"qdiskd[4314]: <warning> qdisk cycle took more than 3 seconds to  
complete (3.890000)"

Theese messages are very frequent. I can not find anything except the  
source code via google and i am sorry to say that i am not so familar  
with c to get the point.


We also have sometimes a quorum timeout:

"kernel: CMAN: Quorum device /dev/sdh timed out"


Are theese two messages independent and what is the meaning of the  
first message?

Thanks for reading and answering :)


Peter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2209 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080417/e62b7878/attachment.p7s>

From denisb+gmane at gmail.com  Thu Apr 17 07:43:42 2008
From: denisb+gmane at gmail.com (denis)
Date: Thu, 17 Apr 2008 09:43:42 +0200
Subject: [Linux-cluster] Re: Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
References: <fu4qnq$gg2$1@ger.gmane.org><alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
Message-ID: <fu6uti$9su$1@ger.gmane.org>

Hi Harri,

Please consider using a smart mailclient which can wrap your mails.. It
makes things a lot easier.

I have a 2 node cluster RHEL5.1 on 64bit (IBM Blades) configured which
works well for me (so far).

Harri.Paivaniemi at tietoenator.com wrote:
> 1. 2-node cluster. Can't start only one node to get cluster services 
>up - it hangs in fencing and waits until I start te second node and
>immediately after that, when both nodes are starting cman, the cluster
>comes up.

For me fencing would hang for 3-4 minutes before I had properly
configured fencing (with manual fallback), after adding manual fallback
I no longer have this issue.

When a node is fenced / rebooted, every now and then it will fence the
other node on restart. And if I shutdown and restart both nodes
simultaneously they will fence like this (has happened 2 times on tests)

A boots quicker and establishes the cluster
B fences A when it starts
A fences B when it starts
B boots and joins the cluster..

This was before I configured qdisk, and I have not yet tested this
behaviour after qdisk was setup.

> 2. qdisk doesn't work. 2- node cluster. Start it (both nodes at the 

qdisk works well for me in a very similar setup. I have done manual
fencing tests / hardboot tests, which didn't produce anything like you
describe.

> 3. Funny thing: heuristic ping didn't work at all in the beginning 
>and support gave me a "ping-script" which make it to work... so this
>describes quite well how experimental this cluster is nowadays...

heuristic ping works fine for me out of the box..

Regards
--
Denis




From harri.paivaniemi at tietoenator.com  Thu Apr 17 07:58:22 2008
From: harri.paivaniemi at tietoenator.com (Harri =?ISO-8859-1?Q?P=E4iv=E4niemi?=)
Date: Thu, 17 Apr 2008 10:58:22 +0300
Subject: [Linux-cluster] Re: Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <fu6uti$9su$1@ger.gmane.org>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<fu6uti$9su$1@ger.gmane.org>
Message-ID: <1208419102.21043.15.camel@hjpsuse.tebit>

Sorry,

if my email structure was someway messed up - I had to use our webmail
(Outlook web access), normally I use Evolution in SuSE...

Anyway, thanks for your comments, Denis.



-hjp

On Thu, 2008-04-17 at 09:43 +0200, denis wrote:
> Hi Harri,
> 
> Please consider using a smart mailclient which can wrap your mails.. It
> makes things a lot easier.
> 
> I have a 2 node cluster RHEL5.1 on 64bit (IBM Blades) configured which
> works well for me (so far).
> 
> Harri.Paivaniemi at tietoenator.com wrote:
> > 1. 2-node cluster. Can't start only one node to get cluster services 
> >up - it hangs in fencing and waits until I start te second node and
> >immediately after that, when both nodes are starting cman, the cluster
> >comes up.
> 
> For me fencing would hang for 3-4 minutes before I had properly
> configured fencing (with manual fallback), after adding manual fallback
> I no longer have this issue.
> 
> When a node is fenced / rebooted, every now and then it will fence the
> other node on restart. And if I shutdown and restart both nodes
> simultaneously they will fence like this (has happened 2 times on tests)
> 
> A boots quicker and establishes the cluster
> B fences A when it starts
> A fences B when it starts
> B boots and joins the cluster..
> 
> This was before I configured qdisk, and I have not yet tested this
> behaviour after qdisk was setup.
> 
> > 2. qdisk doesn't work. 2- node cluster. Start it (both nodes at the 
> 
> qdisk works well for me in a very similar setup. I have done manual
> fencing tests / hardboot tests, which didn't produce anything like you
> describe.
> 
> > 3. Funny thing: heuristic ping didn't work at all in the beginning 
> >and support gave me a "ping-script" which make it to work... so this
> >describes quite well how experimental this cluster is nowadays...
> 
> heuristic ping works fine for me out of the box..
> 
> Regards
> --
> Denis
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From gordan at bobich.net  Thu Apr 17 08:17:30 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 17 Apr 2008 09:17:30 +0100
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
References: <fu4qnq$gg2$1@ger.gmane.org><alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
Message-ID: <4807079A.5060909@bobich.net>

Harri.Paivaniemi at tietoenator.com wrote:

> So, this is my sad history with ver 5. Do you use 64-bit ver 5 and what's your feeling?

I only started using it with v5, and I have to say that I haven't had 
any real problems. Some of my clusters have been 64-bit, some 32-bit, 
and I haven't seen any differences yet.

> My problems this time are:
> 
> 1. 2-node cluster. Can't start only one node to get cluster services up - it hangs in fencing and waits until I start te second node and immediately after that, when both nodes are starting cman, the cluster comes up. So if I have lost one node, I can't get the cluster up, if I have to restart for seome reason the working node. It should work like before (both nodes are down, I start one, it fences another and comes up). Now it just waits... log says:
> 
> ccsd[25272]: Error while processing connect: Connection refused
> 
> This is so common error message, that it just tell's nothing to me....

I have seen similar error messages before, and it has usually been 
caused by either the node names/interfaces/IPs not being listed 
correctly in /etc/hosts file, or iptables firewalling rules blocking 
communication between the nodes.

> 2. qdisk doesn't work. 2- node cluster. Start it (both nodes at the same time) to get it up. Works ok, qdisk works, heuristic works. Everything works. If I stop cluster daemons on one node, that node can't join to cluster anymore without a complete reboot. It joins, another node says ok, the node itself says ok, quorum is registred and heuristic is up, but the node's quorum-disk stays offline and another node says this node is offline. If I reboot this machine, it joins to cluster ok.

I believe it's supposed to work that way. When a node fails it needs to 
be fully restarted before it is allowed back into the cluster. I'm sure 
this has been mentioned on the list recently.

> 3. Funny thing: heuristic ping didn't work at all in the beginning and support gave me a "ping-script" which make it to work... so this describes quite well how experimental this cluster is nowadays...
> 
> I have to tell you it is a FACT that basics are ok: fencing works ok in a normal situation, I don't have typos, configs are in sync,  everything is ok, but these problems still exists.

I've been in similar situations before, but in the end it always turned 
out to be me doing something silly (see above re: host files and 
iptables as examples).

> I have 2 times sent sosreports etc. so RH support. They hava spent 3 weeks and still can't say whats wrong...

Sadly, that seems to be the quality of commercial support from any 
vendor. Support nowdays seems to have only one purpose - managerial 
back-covering exercise so they can pass the buck. I have always found 
that community support is several orders of magnitude better than 
commercial support in terms of both response speed and quality.

Gordan



From harri.paivaniemi at tietoenator.com  Thu Apr 17 08:30:30 2008
From: harri.paivaniemi at tietoenator.com (Harri =?ISO-8859-1?Q?P=E4iv=E4niemi?=)
Date: Thu, 17 Apr 2008 11:30:30 +0300
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <4807079A.5060909@bobich.net>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
Message-ID: <1208421030.21043.24.camel@hjpsuse.tebit>

No,

I strongly believe it should not work that way.

To my mind, it should work like this:

- 2 nodes up'n running, everything ok
- shutdown cluster daemons on node b
- node b tells node a "I'm going administrative down", node a is
decreasing cluster votes from 3 --> 2
- node a is happy, no fencing

- start node b's cluster daemons
- joins to cluster normally
- gains quorum device normally, cluster votes back --> 3

Of course it's different if node b fails, but this is not failing, it's
administrative shutdown and node a is informed.

If I halt node b, it's fenced ok by node a, as it should be, it reboots
and joins to cluster normally.



-hjp



On Thu, 2008-04-17 at 09:17 +0100, Gordan Bobic wrote:
> > 2. qdisk doesn't work. 2- node cluster. Start it (both nodes at the
> same time) to get it up. Works ok, qdisk works, heuristic works.
> Everything works. If I stop cluster daemons on one node, that node
> can't join to cluster anymore without a complete reboot. It joins,
> another node says ok, the node itself says ok, quorum is registred and
> heuristic is up, but the node's quorum-disk stays offline and another
> node says this node is offline. If I reboot this machine, it joins to
> cluster ok.
> 
> I believe it's supposed to work that way. When a node fails it needs
> to 
> be fully restarted before it is allowed back into the cluster. I'm
> sure 
> this has been mentioned on the list recently.



From johannes.russek at io-consulting.net  Thu Apr 17 09:18:51 2008
From: johannes.russek at io-consulting.net (jr)
Date: Thu, 17 Apr 2008 11:18:51 +0200
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <4807079A.5060909@bobich.net>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
Message-ID: <1208423931.27774.2.camel@admc.win-rar.local>

Am Donnerstag, den 17.04.2008, 09:17 +0100 schrieb Gordan Bobic:

> > 1. 2-node cluster. Can't start only one node to get cluster services up - it hangs in fencing and waits until I start te second node and immediately after that, when both nodes are starting cman, the cluster comes up. So if I have lost one node, I can't get the cluster up, if I have to restart for seome reason the working node. It should work like before (both nodes are down, I start one, it fences another and comes up). Now it just waits... log says:
> > 
> > ccsd[25272]: Error while processing connect: Connection refused
> > 
> > This is so common error message, that it just tell's nothing to me....
> 
> I have seen similar error messages before, and it has usually been 
> caused by either the node names/interfaces/IPs not being listed 
> correctly in /etc/hosts file, or iptables firewalling rules blocking 
> communication between the nodes.

or if the cluster isn't quorate, i believe cman refuses to accept any
connections.

regards,
johannes



From harri.paivaniemi at tietoenator.com  Thu Apr 17 09:28:47 2008
From: harri.paivaniemi at tietoenator.com (Harri =?ISO-8859-1?Q?P=E4iv=E4niemi?=)
Date: Thu, 17 Apr 2008 12:28:47 +0300
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208423931.27774.2.camel@admc.win-rar.local>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
Message-ID: <1208424527.21043.33.camel@hjpsuse.tebit>

Well,

I don't have any mistakes with firewalls, hosts, names, ip's etc. This
is a fact. Communication itself works. Maby it sounds strange when I say
I don't have mistakes, but this time it's true ;)

In this case cluster should gain quorum and start running services on
node a (it has 2 votes (node-vote + qdisk-vote).

It should fence node b first, because it doesn't know where it is.

So this behaviour is wrong.

-hjp


On Thu, 2008-04-17 at 11:18 +0200, jr wrote:
> Am Donnerstag, den 17.04.2008, 09:17 +0100 schrieb Gordan Bobic:
> 
> > > 1. 2-node cluster. Can't start only one node to get cluster services up - it hangs in fencing and waits until I start te second node and immediately after that, when both nodes are starting cman, the cluster comes up. So if I have lost one node, I can't get the cluster up, if I have to restart for seome reason the working node. It should work like before (both nodes are down, I start one, it fences another and comes up). Now it just waits... log says:
> > > 
> > > ccsd[25272]: Error while processing connect: Connection refused
> > > 
> > > This is so common error message, that it just tell's nothing to me....
> > 
> > I have seen similar error messages before, and it has usually been 
> > caused by either the node names/interfaces/IPs not being listed 
> > correctly in /etc/hosts file, or iptables firewalling rules blocking 
> > communication between the nodes.
> 
> or if the cluster isn't quorate, i believe cman refuses to accept any
> connections.
> 
> regards,
> johannes
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From johannes.russek at io-consulting.net  Thu Apr 17 09:32:43 2008
From: johannes.russek at io-consulting.net (jr)
Date: Thu, 17 Apr 2008 11:32:43 +0200
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208424527.21043.33.camel@hjpsuse.tebit>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
Message-ID: <1208424763.27774.6.camel@admc.win-rar.local>

Am Donnerstag, den 17.04.2008, 12:28 +0300 schrieb Harri P?iv?niemi:
> Well,
> 
> I don't have any mistakes with firewalls, hosts, names, ip's etc. This
> is a fact. Communication itself works. Maby it sounds strange when I say
> I don't have mistakes, but this time it's true ;)
> 
> In this case cluster should gain quorum and start running services on
> node a (it has 2 votes (node-vote + qdisk-vote).
> 
> It should fence node b first, because it doesn't know where it is.
> 
> So this behaviour is wrong.
> 
> -hjp

i think something is wrong here, like the expected votes or similiar. if
the one node had 2 votes and those were the expected votes, it would
maintain quorum and thus fence the other node. that connection refused
error seems to say that that node doesn't have the quorum nonetheless.
can you confirm that? (clustat should show you if that node is quorate
or not)
regards,
johannes




From Laurent.WEISLO at ext.ec.europa.eu  Thu Apr 17 09:38:52 2008
From: Laurent.WEISLO at ext.ec.europa.eu (Laurent.WEISLO at ext.ec.europa.eu)
Date: Thu, 17 Apr 2008 11:38:52 +0200
Subject: [Linux-cluster] How to separate inter-cluster/public network traffic
Message-ID: <867B2D3FDCEAE947AE9DEEF2D5BD4F0601E35ED4@S-DC-EXM22.net1.cec.eu.int>

Hi,

I'm running RedHat 5.2 Beta (Tikanga) and I'm trying to achieve this behavior:

- 2 NICs (eth0, eth1) for one bond0 device intended for public LAN XXX.XXX.XXX.XXX traffic

- 2 NICs (eth2, eth3) for one bond1 device intended for inter-cluster LAN YYY.YYY.YYY.YYY traffic

- NodeA and NodeB IP addresses are in LAN XXX.XXX.XXX.XXX

In cluster.conf:

<cman>

<multicast addr="239.192.NN.NN"/>

</cman>

<clusternode name="nodeA">

<multicast addr="239.192.NN.NN" interface="bond1"/>

<clusternode/>

<clusternode name="nodeB">

<multicast addr="239.192.NN.NN" interface="bond1"/>

<clusternode/>

Unfortunately all the cluster traffic is bound to bond0:

[root at NodeA]# ip maddr show bond0

...

inet 239.192.NN.NN

inet 224.0.0.1

...

[root at NodeA]# ip maddr show bond1

...

inet 224.0.0.1

...

Is it possible to do it like that (all the clusters have 2 VLAN with bonding each) ?

If not, should I put NodeA and NodeB into the LAN YYY.YYY.YYY.YYY in cluster.conf ?

Thx for your help !

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080417/c8b0988e/attachment.htm>

From harri.paivaniemi at tietoenator.com  Thu Apr 17 09:58:10 2008
From: harri.paivaniemi at tietoenator.com (Harri =?ISO-8859-1?Q?P=E4iv=E4niemi?=)
Date: Thu, 17 Apr 2008 12:58:10 +0300
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208424763.27774.6.camel@admc.win-rar.local>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
Message-ID: <1208426290.21043.48.camel@hjpsuse.tebit>

Yes, something is wrong.

I made little bit more research:

- stop cluster daemons on both nodes (node a & node b)

- start cluster on node a

It hangs 5 minutes on cman's fencing part like this:


Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... done
   Starting daemons... done
   Starting fencing...

... and in process list there is this:

/sbin/fence_tool -w -t 300 join


... so thats the 5 minutes.

Question is: why it waits there 54 minutes?

- after 5 minutes waiting, node a says:

Starting fencing... failed

                                                           [FAILED]
Starting the Quorum Disk Daemon:                           [  OK  ]
Starting Cluster Service Manager:                          [  OK  ]

... and then it loads qdiskd and after a while it has 2 votes and it
starts services normally and voila, I have a running cluster with one
node:

Node  Sts   Inc   Joined               Name
   0   M      0   2008-04-17 12:51:01  /dev/sda
   1   M   1356   2008-04-17 12:45:44  areenasql1
   2   X      0                        areenasql2



[root at areenasql1 ~]# cman_tool status
Version: 6.0.1
Config Version: 4
Cluster Name: areena_sql
Cluster Id: 39330
Cluster Member: Yes
Cluster Generation: 1356
Membership state: Cluster-Member
Nodes: 1
Expected votes: 3
Total votes: 2
Quorum: 2
Active subsystems: 8
Flags:
Ports Bound: 0 177
Node name: areenasql1
Node ID: 1
Multicast addresses: 239.192.153.60
Node addresses: 10.1.1.178



But log says nothing about that failed fencing. Fencing is configured
correctly, I use HP ILO and everything is ok. Fencing works in running
cluster ok, both nodes can fence eachother.

Node a should fence node b in this situation and maby it's trying to do
it somehow, but it logs nothing. It should log at least "fence failed
etc." if it's unable to fence node b...

And what's more important, if we think node a can't fence node b in this
startup situation, it should NOT start services but it starts....

-hjp














On Thu, 2008-04-17 at 11:32 +0200, jr wrote:
> Am Donnerstag, den 17.04.2008, 12:28 +0300 schrieb Harri P?iv?niemi:
> > Well,
> > 
> > I don't have any mistakes with firewalls, hosts, names, ip's etc. This
> > is a fact. Communication itself works. Maby it sounds strange when I say
> > I don't have mistakes, but this time it's true ;)
> > 
> > In this case cluster should gain quorum and start running services on
> > node a (it has 2 votes (node-vote + qdisk-vote).
> > 
> > It should fence node b first, because it doesn't know where it is.
> > 
> > So this behaviour is wrong.
> > 
> > -hjp
> 
> i think something is wrong here, like the expected votes or similiar. if
> the one node had 2 votes and those were the expected votes, it would
> maintain quorum and thus fence the other node. that connection refused
> error seems to say that that node doesn't have the quorum nonetheless.
> can you confirm that? (clustat should show you if that node is quorate
> or not)
> regards,
> johannes
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From johannes.russek at io-consulting.net  Thu Apr 17 10:12:57 2008
From: johannes.russek at io-consulting.net (jr)
Date: Thu, 17 Apr 2008 12:12:57 +0200
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208426290.21043.48.camel@hjpsuse.tebit>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
Message-ID: <1208427177.27774.8.camel@admc.win-rar.local>

do you mind sending your cluster.conf?

johannes



From harri.paivaniemi at tietoenator.com  Thu Apr 17 10:27:45 2008
From: harri.paivaniemi at tietoenator.com (Harri =?ISO-8859-1?Q?P=E4iv=E4niemi?=)
Date: Thu, 17 Apr 2008 13:27:45 +0300
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208427177.27774.8.camel@admc.win-rar.local>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
Message-ID: <1208428065.21043.60.camel@hjpsuse.tebit>

Yes,

Cluster.conf attached.


I just resolved 1 thing:

When node a & b are down (cluster daemons) and I start node a, it hangs
5 minutes in fencing becouse becouse...


man fence_tool says:

""Before  joining or leaving the fence domain, fence_tool waits for the
cluster be in a quorate state""

And in qdisk man- page it's said:

""CMAN  must  be running before the qdisk program can operate in full
capacity.  If CMAN is not running, qdisk will wait for it."

I started in this order: cman-qdiskd-rgmanager". In this case it hangs
because fence is waiting cluster to be quorate and it's not gonna be
because qdisk is not yet running ;)

Jihaa - so one problem solved. No I can start cluster node at a time.


The 2nd problem that still exists is:

When node a and b are running and everything is ok. I stop node b's
cluster daemons. when I start node b again, this situation stays
forever:

----------------
node a - clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  areenasql1                            1 Online, Local, rgmanager
  areenasql2                            2 Offline
  /dev/sda                              0 Online, Quorum Disk

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  service:areena       areenasql1                     started

-------------------

node b - clustat

Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  areenasql1                            1 Online, rgmanager
  areenasql2                            2 Online, Local, rgmanager
  /dev/sda                              0 Offline, Quorum Disk

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  service:areena       areenasql1                     started


So node b's quorum disk is offline, log says it's registred ok and
heuristic is UP... node a sees node b as offline. If I reboot node b, it
works ok and joins ok...

Both nodes sees:

Nodes: 2
Expected votes: 3
Total votes: 2
Quorum: 2



-hjp







On Thu, 2008-04-17 at 12:12 +0200, jr wrote:
> do you mind sending your cluster.conf?
> 
> johannes
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf
Type: application/xml
Size: 2638 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080417/ef17b50b/attachment.wsdl>

From johannes.russek at io-consulting.net  Thu Apr 17 10:57:34 2008
From: johannes.russek at io-consulting.net (jr)
Date: Thu, 17 Apr 2008 12:57:34 +0200
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208428065.21043.60.camel@hjpsuse.tebit>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
	<1208428065.21043.60.camel@hjpsuse.tebit>
Message-ID: <1208429854.27774.19.camel@admc.win-rar.local>

indeed, that seems like a good idea to start qdiskd before cman if qdiskd simply waits for cman to come up.
RHEL ships with the chkconfig configuration of S21 for cman and S22 for qdiskd.

question to the list: shouldn't this be changed?

johannes

Am Donnerstag, den 17.04.2008, 13:27 +0300 schrieb Harri P?iv?niemi:
> Yes,
> 
> Cluster.conf attached.
> 
> 
> I just resolved 1 thing:
> 
> When node a & b are down (cluster daemons) and I start node a, it hangs
> 5 minutes in fencing becouse becouse...
> 
> 
> man fence_tool says:
> 
> ""Before  joining or leaving the fence domain, fence_tool waits for the
> cluster be in a quorate state""
> 
> And in qdisk man- page it's said:
> 
> ""CMAN  must  be running before the qdisk program can operate in full
> capacity.  If CMAN is not running, qdisk will wait for it."
> 
> I started in this order: cman-qdiskd-rgmanager". In this case it hangs
> because fence is waiting cluster to be quorate and it's not gonna be
> because qdisk is not yet running ;)
> 
> Jihaa - so one problem solved. No I can start cluster node at a time.
> 
> 
> The 2nd problem that still exists is:
> 
> When node a and b are running and everything is ok. I stop node b's
> cluster daemons. when I start node b again, this situation stays
> forever:
> 
> ----------------
> node a - clustat
> Member Status: Quorate
> 
>   Member Name                        ID   Status
>   ------ ----                        ---- ------
>   areenasql1                            1 Online, Local, rgmanager
>   areenasql2                            2 Offline
>   /dev/sda                              0 Online, Quorum Disk
> 
>   Service Name         Owner (Last)                   State
>   ------- ----         ----- ------                   -----
>   service:areena       areenasql1                     started
> 
> -------------------
> 
> node b - clustat
> 
> Member Status: Quorate
> 
>   Member Name                        ID   Status
>   ------ ----                        ---- ------
>   areenasql1                            1 Online, rgmanager
>   areenasql2                            2 Online, Local, rgmanager
>   /dev/sda                              0 Offline, Quorum Disk
> 
>   Service Name         Owner (Last)                   State
>   ------- ----         ----- ------                   -----
>   service:areena       areenasql1                     started
> 
> 
> So node b's quorum disk is offline, log says it's registred ok and
> heuristic is UP... node a sees node b as offline. If I reboot node b, it
> works ok and joins ok...
> 
> Both nodes sees:
> 
> Nodes: 2
> Expected votes: 3
> Total votes: 2
> Quorum: 2
> 
> 
> 
> -hjp
> 
> 
> 
> 
> 
> 
> 
> On Thu, 2008-04-17 at 12:12 +0200, jr wrote:
> > do you mind sending your cluster.conf?
> > 
> > johannes
> > 
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From gordan at bobich.net  Thu Apr 17 11:21:02 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 17 Apr 2008 12:21:02 +0100 (BST)
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208428065.21043.60.camel@hjpsuse.tebit>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
	<1208428065.21043.60.camel@hjpsuse.tebit>
Message-ID: <alpine.LRH.1.10.0804171219160.28160@skynet.shatteredsilicon.net>

On Thu, 17 Apr 2008, Harri P?iv?niemi wrote:

> I just resolved 1 thing:
>
> When node a & b are down (cluster daemons) and I start node a, it hangs
> 5 minutes in fencing becouse becouse...
>
> man fence_tool says:
>
> ""Before  joining or leaving the fence domain, fence_tool waits for the
> cluster be in a quorate state""
>
> And in qdisk man- page it's said:
>
> ""CMAN  must  be running before the qdisk program can operate in full
> capacity.  If CMAN is not running, qdisk will wait for it."
>
> I started in this order: cman-qdiskd-rgmanager". In this case it hangs
> because fence is waiting cluster to be quorate and it's not gonna be
> because qdisk is not yet running ;)
>
> Jihaa - so one problem solved. No I can start cluster node at a time.

So - your config was not correct after all? ;)

Gordan

From harri.paivaniemi at tietoenator.com  Thu Apr 17 11:41:21 2008
From: harri.paivaniemi at tietoenator.com (Harri =?ISO-8859-1?Q?P=E4iv=E4niemi?=)
Date: Thu, 17 Apr 2008 14:41:21 +0300
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <alpine.LRH.1.10.0804171219160.28160@skynet.shatteredsilicon.net>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
	<1208428065.21043.60.camel@hjpsuse.tebit>
	<alpine.LRH.1.10.0804171219160.28160@skynet.shatteredsilicon.net>
Message-ID: <1208432481.21043.75.camel@hjpsuse.tebit>

Weeeel,

In this case my config was right after all - those 2 man- pages are not
in sync from the logical point of view... and maby my own logical
competence was also part of this confusion ;)

But still, I just can't solve this:

- 2 nodes up'n running, works totally ok
- stop another nodes cman, qdiskd and rgmanager
- start those again (qdiskd-cman-rgmanager OR cman-qdiskd-rgmanager OR
cman-rgmanager-qdiskd)

I'll end up to situation where

- restarted node sees quorum device offline and another node online. It
has 2 votes so it is quorate

- another node sees restarted node offline. It also has 2 votes so it is
quorate

Node reboot solves the problem.


-hjp



On Thu, 2008-04-17 at 12:21 +0100, gordan at bobich.net wrote:
> On Thu, 17 Apr 2008, Harri P?iv?niemi wrote:
> 
> > I just resolved 1 thing:
> >
> > When node a & b are down (cluster daemons) and I start node a, it hangs
> > 5 minutes in fencing becouse becouse...
> >
> > man fence_tool says:
> >
> > ""Before  joining or leaving the fence domain, fence_tool waits for the
> > cluster be in a quorate state""
> >
> > And in qdisk man- page it's said:
> >
> > ""CMAN  must  be running before the qdisk program can operate in full
> > capacity.  If CMAN is not running, qdisk will wait for it."
> >
> > I started in this order: cman-qdiskd-rgmanager". In this case it hangs
> > because fence is waiting cluster to be quorate and it's not gonna be
> > because qdisk is not yet running ;)
> >
> > Jihaa - so one problem solved. No I can start cluster node at a time.
> 
> So - your config was not correct after all? ;)
> 
> Gordan
> -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster



From gordan at bobich.net  Thu Apr 17 11:50:10 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 17 Apr 2008 12:50:10 +0100 (BST)
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208432481.21043.75.camel@hjpsuse.tebit>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
	<1208428065.21043.60.camel@hjpsuse.tebit>
	<alpine.LRH.1.10.0804171219160.28160@skynet.shatteredsilicon.net>
	<1208432481.21043.75.camel@hjpsuse.tebit>
Message-ID: <alpine.LRH.1.10.0804171248370.28441@skynet.shatteredsilicon.net>

On Thu, 17 Apr 2008, Harri P?iv?niemi wrote:

> But still, I just can't solve this:
>
> - 2 nodes up'n running, works totally ok
> - stop another nodes cman, qdiskd and rgmanager
> - start those again (qdiskd-cman-rgmanager OR cman-qdiskd-rgmanager OR
> cman-rgmanager-qdiskd)
>
> I'll end up to situation where
>
> - restarted node sees quorum device offline and another node online. It
> has 2 votes so it is quorate
>
> - another node sees restarted node offline. It also has 2 votes so it is
> quorate
>
> Node reboot solves the problem.

I'm sure I have seen a post on this list recently explaining why this (or 
a similar condition) is normal and expected. But for some reason I cannot 
seem to find it...

Gordan

From alacey at brynmawr.edu  Thu Apr 17 15:47:05 2008
From: alacey at brynmawr.edu (Andrew Lacey)
Date: Thu, 17 Apr 2008 11:47:05 -0400 (EDT)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
Message-ID: <1032.165.106.200.207.1208447225.squirrel@webmail>

I am doing some testing on a 2-node, active/standby RHEL 4 cluster with
non-GFS shared storage. I am using HP iLO for fencing. I don't have a
quorum disk set up. Both cluster nodes are connected to the same switch,
and that network path is used for cluster communication as well as general
network communication (including access to iLO). I've found that when the
switch goes down and comes back up, the result is not desirable. As soon
as the switch loses power, each node starts trying to fence the other.
Since the iLO is not reachable, this is unsuccessful, but the nodes keep
retrying the fence. When the switch comes back online, the "OK Corral"
scenario takes place -- both nodes fence each other simultaneously and
bring down the cluster.

I have seen some references to the concept of IP-based tie-breakers on a
Red Hat cluster, but I'm not sure how to set this up. What I would like is
a configuration whereby a node that cannot ping the switch will just sit
there in its current state and not attempt to fence the other node.
Fencing would only occur when a node can reach the switch but cannot reach
the other node. Is this something that can be done? Can someone direct me
to documentation? I have a ticket in with Red Hat on this same question,
so we'll see who answers first :-) Thanks,

-Andrew L



From gordan at bobich.net  Thu Apr 17 15:55:43 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 17 Apr 2008 16:55:43 +0100 (BST)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <1032.165.106.200.207.1208447225.squirrel@webmail>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
Message-ID: <alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>

On Thu, 17 Apr 2008, Andrew Lacey wrote:

> I am doing some testing on a 2-node, active/standby RHEL 4 cluster with
> non-GFS shared storage. I am using HP iLO for fencing. I don't have a
> quorum disk set up. Both cluster nodes are connected to the same switch,
> and that network path is used for cluster communication as well as general
> network communication (including access to iLO). I've found that when the
> switch goes down and comes back up, the result is not desirable. As soon
> as the switch loses power, each node starts trying to fence the other.
> Since the iLO is not reachable, this is unsuccessful, but the nodes keep
> retrying the fence. When the switch comes back online, the "OK Corral"
> scenario takes place -- both nodes fence each other simultaneously and
> bring down the cluster.

I had a similar issue, but the solution I went for is doctoring the 
fencing agent to put in a delay based on node's priority in to the fencing 
daemon. That way the nodes wouldn't try to fence simultaneously, but in a 
staggered fashion.

If you have a spare NIC, and the nodes are next to each other, you could 
make them use a cross-over cable for their cluster communication, so they 
would notice that they are both still up even when the switch dies. That's 
what I do.

Gordan



From Harri.Paivaniemi at tietoenator.com  Thu Apr 17 16:01:46 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Thu, 17 Apr 2008 19:01:46 +0300
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>

What Gordan said is true,

but you could also just tune deadnode_timeout to be different on both nodes: this results the behaviour Gordan told - the node that has smaller deadnode_timeout would fence first.

-hjp



-----Original Message-----
From: linux-cluster-bounces at redhat.com on behalf of Andrew Lacey
Sent: Thu 4/17/2008 18:47
To: Linux-cluster at redhat.com
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
 
I am doing some testing on a 2-node, active/standby RHEL 4 cluster with
non-GFS shared storage. I am using HP iLO for fencing. I don't have a
quorum disk set up. Both cluster nodes are connected to the same switch,
and that network path is used for cluster communication as well as general
network communication (including access to iLO). I've found that when the
switch goes down and comes back up, the result is not desirable. As soon
as the switch loses power, each node starts trying to fence the other.
Since the iLO is not reachable, this is unsuccessful, but the nodes keep
retrying the fence. When the switch comes back online, the "OK Corral"
scenario takes place -- both nodes fence each other simultaneously and
bring down the cluster.

I have seen some references to the concept of IP-based tie-breakers on a
Red Hat cluster, but I'm not sure how to set this up. What I would like is
a configuration whereby a node that cannot ping the switch will just sit
there in its current state and not attempt to fence the other node.
Fencing would only occur when a node can reach the switch but cannot reach
the other node. Is this something that can be done? Can someone direct me
to documentation? I have a ticket in with Red Hat on this same question,
so we'll see who answers first :-) Thanks,

-Andrew L

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 3678 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080417/b0a1e6b0/attachment.bin>

From j.buzzard at dundee.ac.uk  Thu Apr 17 16:17:05 2008
From: j.buzzard at dundee.ac.uk (Jonathan Buzzard)
Date: Thu, 17 Apr 2008 17:17:05 +0100
Subject: [Linux-cluster] Fencing Driver API Requirements
In-Reply-To: <4803A6C4.20706@redhat.com>
References: <alpine.LRH.1.10.0804141308350.4486@skynet.shatteredsilicon.net>
	<4803A6C4.20706@redhat.com>
Message-ID: <1208449025.31915.55.camel@localhost.lifesci.dundee.ac.uk>

On Mon, 2008-04-14 at 20:47 +0200, Marek 'marx' Grac wrote:
> Hi,
> 
> gordan at bobich.net wrote:
> >
> > I remember that this was mentioned several times in the last few 
> > months, but has any documentation been put together on the API that 
> > the fencing drivers are supposed to cover?
> >
> > I'm looking into writing a fencing driver based on disabling switch 
> > ports on a managed 3com switch via the telnet interface, and I'd like 
> > to make sure 
> > that it conforms to any speciffic requirements that might exist. If 
> > someone could point me at the relevant URL, that would be most 
> > appreciated.
> 
> There is a new python module in the git (master branch / 
> cluster/gence/agents/lib/fencing.py) that should contain everything you 
> should need to write a fence agent. This module was used to built 
> several agents (they are just in the git tree) eg. apc/apc.py, 
> drac/drac5.py, wti/wti.py. If you will find any problem with fencing.py, 
> let me know and I will try to fix it.
> 

The issue is that with such a critical component of a cluster (if the
fencing is not right bad things will happen) that in order to write a
new fencing agent one has to start reverse engineering from source to
work out what you need to do.

This is incredibly bad practice, and is bound to lead to improperly
implemented fencing agents that then lead to bad things happening on
clusters with these fencing agents.

There a loads of potential fencing devices out there that could be
supported, that are currently not. From my perspective trying to
implement a fencing agent for Alert On Lan 2, it was easier to reverse
engineer the magic packets of death using tcpdump and IDA pro as well as
implementing a C based Linux command tool to generate them, than it has
been to write a functioning fencing agent.

It would take a couple of hours tops for someone to write a spec for
what a fencing agent needs to do.


JAB.

-- 
Jonathan A. Buzzard                      Tel: +441382-386998
Storage Administrator, College of Life Sciences
University of Dundee, DD1 5EH



From gordan at bobich.net  Thu Apr 17 16:42:35 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 17 Apr 2008 17:42:35 +0100 (BST)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
Message-ID: <alpine.LRH.1.10.0804171731390.29947@skynet.shatteredsilicon.net>

On Thu, 17 Apr 2008, Harri.Paivaniemi at tietoenator.com wrote:

> What Gordan said is true,
>
> but you could also just tune deadnode_timeout to be different on both 
> nodes: this results the behaviour Gordan told - the node that has 
> smaller deadnode_timeout would fence first.

Now, why didn't I know about this before? Much neater than my hack of 
putting a different sleep in the two fencing agents. :-)

Thanks.

Gordan



From alacey at brynmawr.edu  Thu Apr 17 16:44:08 2008
From: alacey at brynmawr.edu (Andrew Lacey)
Date: Thu, 17 Apr 2008 12:44:08 -0400 (EDT)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
Message-ID: <1122.165.106.200.207.1208450648.squirrel@webmail>

> but you could also just tune deadnode_timeout to be different on both
> nodes: this results the behaviour Gordan told - the node that has smaller
> deadnode_timeout would fence first.

Would this work in a situation where the switch was down for a few
minutes? Suppose the deadnode_timeout is 30 seconds on one node and 60
seconds on the other. So, after 60 seconds of switch downtime, both nodes
would be trying to fence. If the switch comes up after being down for 5
minutes, they would still immediately fence each other. Or am I not
thinking about this correctly?

-Andrew L



From gordan at bobich.net  Thu Apr 17 16:46:41 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 17 Apr 2008 17:46:41 +0100 (BST)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <1122.165.106.200.207.1208450648.squirrel@webmail>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
	<1122.165.106.200.207.1208450648.squirrel@webmail>
Message-ID: <alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>

On Thu, 17 Apr 2008, Andrew Lacey wrote:

>> but you could also just tune deadnode_timeout to be different on both
>> nodes: this results the behaviour Gordan told - the node that has smaller
>> deadnode_timeout would fence first.
>
> Would this work in a situation where the switch was down for a few
> minutes? Suppose the deadnode_timeout is 30 seconds on one node and 60
> seconds on the other. So, after 60 seconds of switch downtime, both nodes
> would be trying to fence. If the switch comes up after being down for 5
> minutes, they would still immediately fence each other. Or am I not
> thinking about this correctly?

There's an argument that if your switch is down for 30 minutes, you 
have bigger problems. If you have a 30 minute switch outage, the chances 
are that you can live with the node power-up time on top of that.

Gordan



From alacey at brynmawr.edu  Thu Apr 17 16:49:28 2008
From: alacey at brynmawr.edu (Andrew Lacey)
Date: Thu, 17 Apr 2008 12:49:28 -0400 (EDT)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>
Message-ID: <1123.165.106.200.207.1208450968.squirrel@webmail>

> If you have a spare NIC, and the nodes are next to each other, you could
> make them use a cross-over cable for their cluster communication, so they
> would notice that they are both still up even when the switch dies. That's
> what I do.

I had considered this option but I haven't tried it. One thing I was
wondering is how the cluster knows which network interface should get the
cluster service IP address in that situation. Right now, I don't have
anything in my cluster.conf that specifies this, but it just seems to
work. I figured that if I tried to use a crossover cable, what I would
need to do is use /etc/hosts to create hostnames on this little private
network (consisting of just the 2 nodes connected by a cable) and use
those hostnames as the node hostnames in cluster.conf. If I did that,
would the cluster services try to assign the cluster service IP to the
interface with the crossover cable (when obviously what I want is to
assign it to the outward-facing interface)?

-Andrew L



From alacey at brynmawr.edu  Thu Apr 17 16:55:06 2008
From: alacey at brynmawr.edu (Andrew Lacey)
Date: Thu, 17 Apr 2008 12:55:06 -0400 (EDT)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
	<1122.165.106.200.207.1208450648.squirrel@webmail>
	<alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
Message-ID: <1127.165.106.200.207.1208451306.squirrel@webmail>

> There's an argument that if your switch is down for 30 minutes, you
> have bigger problems. If you have a 30 minute switch outage, the chances
> are that you can live with the node power-up time on top of that.

Point taken, but the problem is that if there is a switch outage and the
nodes kill each other, then somebody has to come in, power the nodes back
on and make sure everything comes up OK. It would be much easier if the
nodes would just detect that the switch is down and wait patiently without
doing anything (since there is really nothing wrong with the nodes at all,
and if they just wait for the switch to come back, everything will be
fine.)

We do have a history of flaky network here because we're a college...we
have a lot of machines on campus that we don't control (student-owned) and
we get weird traffic, rogue machines, etc. more frequently than a
locked-down corporate environment. I want to make sure that one of those
network events doesn't needlessly bring down our mail service, which is
what will be running on this cluster.

-Andrew L



From Harri.Paivaniemi at tietoenator.com  Thu Apr 17 17:03:52 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Thu, 17 Apr 2008 20:03:52 +0300
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
References: <1032.165.106.200.207.1208447225.squirrel@webmail><36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com><1122.165.106.200.207.1208450648.squirrel@webmail><alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
	<1127.165.106.200.207.1208451306.squirrel@webmail>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E20@dino.eu.tieto.com>

If you just want to have a cluster where client network can be down infinitely without cluster to take actions,

you have to run cluster heartbeat via cross-cable and deny cluster's link monitoring in client interface.

Or then start using qdisk and build heuristics.

Note, that in RHCS 5 deadnode_timeout doesn't exist anymore in /proc. It's totem token there, but havn't checked where it lives in /proc or maby it's in /sys nowadays.


-hjp



-----Original Message-----
From: linux-cluster-bounces at redhat.com on behalf of Andrew Lacey
Sent: Thu 4/17/2008 19:55
To: linux clustering
Subject: RE: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
 
> There's an argument that if your switch is down for 30 minutes, you
> have bigger problems. If you have a 30 minute switch outage, the chances
> are that you can live with the node power-up time on top of that.

Point taken, but the problem is that if there is a switch outage and the
nodes kill each other, then somebody has to come in, power the nodes back
on and make sure everything comes up OK. It would be much easier if the
nodes would just detect that the switch is down and wait patiently without
doing anything (since there is really nothing wrong with the nodes at all,
and if they just wait for the switch to come back, everything will be
fine.)

We do have a history of flaky network here because we're a college...we
have a lot of machines on campus that we don't control (student-owned) and
we get weird traffic, rogue machines, etc. more frequently than a
locked-down corporate environment. I want to make sure that one of those
network events doesn't needlessly bring down our mail service, which is
what will be running on this cluster.

-Andrew L

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 3922 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080417/6d0d54e9/attachment.bin>

From Bennie_R_Thomas at raytheon.com  Thu Apr 17 17:03:43 2008
From: Bennie_R_Thomas at raytheon.com (Bennie Thomas)
Date: Thu, 17 Apr 2008 12:03:43 -0500
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <1127.165.106.200.207.1208451306.squirrel@webmail>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>	<1122.165.106.200.207.1208450648.squirrel@webmail>	<alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
	<1127.165.106.200.207.1208451306.squirrel@webmail>
Message-ID: <480782EF.6070803@raytheon.com>

Turn port security on to rid rogue machines. As Gordan suggested use a 
private interface for the cluster communications
and that will resolve the issue with the switch going down. If you use 
the point-to-point nic then you will have to reconfigure
your cluster to use the new nodenames assigned to the private lan.



Andrew Lacey wrote:
>> There's an argument that if your switch is down for 30 minutes, you
>> have bigger problems. If you have a 30 minute switch outage, the chances
>> are that you can live with the node power-up time on top of that.
>>     
>
> Point taken, but the problem is that if there is a switch outage and the
> nodes kill each other, then somebody has to come in, power the nodes back
> on and make sure everything comes up OK. It would be much easier if the
> nodes would just detect that the switch is down and wait patiently without
> doing anything (since there is really nothing wrong with the nodes at all,
> and if they just wait for the switch to come back, everything will be
> fine.)
>
> We do have a history of flaky network here because we're a college...we
> have a lot of machines on campus that we don't control (student-owned) and
> we get weird traffic, rogue machines, etc. more frequently than a
> locked-down corporate environment. I want to make sure that one of those
> network events doesn't needlessly bring down our mail service, which is
> what will be running on this cluster.
>
> -Andrew L
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>   




From gordan at bobich.net  Thu Apr 17 17:06:13 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 17 Apr 2008 18:06:13 +0100 (BST)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <1123.165.106.200.207.1208450968.squirrel@webmail>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>
	<1123.165.106.200.207.1208450968.squirrel@webmail>
Message-ID: <alpine.LRH.1.10.0804171759410.29947@skynet.shatteredsilicon.net>



On Thu, 17 Apr 2008, Andrew Lacey wrote:

>> If you have a spare NIC, and the nodes are next to each other, you could
>> make them use a cross-over cable for their cluster communication, so they
>> would notice that they are both still up even when the switch dies. That's
>> what I do.
>
> I had considered this option but I haven't tried it. One thing I was
> wondering is how the cluster knows which network interface should get the
> cluster service IP address in that situation.

Whichever interface has the IPs on the right subnet. Your public interface 
has the public/fail-over IPs. The private cluster interface has a pair of 
private IPs on a network of their own. No resource groups should be 
assigned to that interface. It's there just for intra-cluster 
communication (e.g. dlm, san/drbd, etc.).

> Right now, I don't have
> anything in my cluster.conf that specifies this, but it just seems to
> work. I figured that if I tried to use a crossover cable, what I would
> need to do is use /etc/hosts to create hostnames on this little private
> network (consisting of just the 2 nodes connected by a cable) and use
> those hostnames as the node hostnames in cluster.conf.

That works.

> If I did that,
> would the cluster services try to assign the cluster service IP to the
> interface with the crossover cable (when obviously what I want is to
> assign it to the outward-facing interface)?

It will assign the IPs to whatever interface already has an IP on that 
subnet. i.e. if your private cluster interface (crossover one) is 
192.168.0.0/16 and your public interface is 10.0.0.0/8, you will have a 
resource group with IPs on the 10.0.0.0/8 subnet, not on the 
192.168.0.0/16 subnet.

You will probably want to add additional monitoring against switch port 
failures here, as otherwise if the switch port of the master node fails 
(it does happen, I've seen many a switch with just 1-2 dead ports), 
the backup will not notice as it can verify that the primary is up and 
responding, and it will not fence it and fail over to itself. You'd end up 
with a working cluster but unavailable service. IIRC there is a 
monitor_link option in the resource spec for this kind of thing.

Gordan



From gordan at bobich.net  Thu Apr 17 17:09:10 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 17 Apr 2008 18:09:10 +0100 (BST)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <1127.165.106.200.207.1208451306.squirrel@webmail>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
	<1122.165.106.200.207.1208450648.squirrel@webmail>
	<alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
	<1127.165.106.200.207.1208451306.squirrel@webmail>
Message-ID: <alpine.LRH.1.10.0804171806350.29947@skynet.shatteredsilicon.net>

On Thu, 17 Apr 2008, Andrew Lacey wrote:

>> There's an argument that if your switch is down for 30 minutes, you
>> have bigger problems. If you have a 30 minute switch outage, the chances
>> are that you can live with the node power-up time on top of that.
>
> Point taken, but the problem is that if there is a switch outage and the
> nodes kill each other, then somebody has to come in, power the nodes back
> on and make sure everything comes up OK. It would be much easier if the
> nodes would just detect that the switch is down and wait patiently without
> doing anything (since there is really nothing wrong with the nodes at all,
> and if they just wait for the switch to come back, everything will be
> fine.)

How do you propose to differentiate between a network outage that should 
instigate fencing and one that shouldn't?

> We do have a history of flaky network here because we're a college...we
> have a lot of machines on campus that we don't control (student-owned) and
> we get weird traffic, rogue machines, etc. more frequently than a
> locked-down corporate environment. I want to make sure that one of those
> network events doesn't needlessly bring down our mail service, which is
> what will be running on this cluster.

The cross-over cluster interface without a switch would probably be 
the best solution. That coupled with a varying fencing timeout should do 
most of what you seem to want to achieve.

Gordan



From gordan at bobich.net  Thu Apr 17 17:10:51 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Thu, 17 Apr 2008 18:10:51 +0100 (BST)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E20@dino.eu.tieto.com>
References: <1032.165.106.200.207.1208447225.squirrel@webmail><36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com><1122.165.106.200.207.1208450648.squirrel@webmail><alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
	<1127.165.106.200.207.1208451306.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E20@dino.eu.tieto.com>
Message-ID: <alpine.LRH.1.10.0804171809270.29947@skynet.shatteredsilicon.net>

On Thu, 17 Apr 2008, Harri.Paivaniemi at tietoenator.com wrote:

> If you just want to have a cluster where client network can be down 
> infinitely without cluster to take actions,
> you have to run cluster heartbeat via cross-cable and deny cluster's 
> link monitoring in client interface.
>
> Or then start using qdisk and build heuristics.

At that rate you might as well just not bother specifying a fencing device 
- the whole cluster will just lock up until the network comes back and it 
can re-connect and re-establish quorum.

> Note, that in RHCS 5 deadnode_timeout doesn't exist anymore in /proc. 
> It's totem token there, but havn't checked where it lives in /proc or 
> maby it's in /sys nowadays.

Thanks for that. :-)

Gordan



From Bennie_R_Thomas at raytheon.com  Thu Apr 17 17:10:29 2008
From: Bennie_R_Thomas at raytheon.com (Bennie Thomas)
Date: Thu, 17 Apr 2008 12:10:29 -0500
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <1123.165.106.200.207.1208450968.squirrel@webmail>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>
	<1123.165.106.200.207.1208450968.squirrel@webmail>
Message-ID: <48078485.40503@raytheon.com>

Let's say you have 2-nodes. "nodeA" and "nodeB" (this is how the public 
sees them). create "private" nodenames
like "nodeAe" and "nodeBe". Add the nodenames to both hosts files,  make 
sure the private interfaces are set up
with private addresses, reconfigure your cluster to use the "private" 
nodename. Then if you want a Cluster Alias IP address
that is known to the public you assign another public address as a 
resource then add it to a service.


Andrew Lacey wrote:
>> If you have a spare NIC, and the nodes are next to each other, you could
>> make them use a cross-over cable for their cluster communication, so they
>> would notice that they are both still up even when the switch dies. That's
>> what I do.
>>     
>
> I had considered this option but I haven't tried it. One thing I was
> wondering is how the cluster knows which network interface should get the
> cluster service IP address in that situation. Right now, I don't have
> anything in my cluster.conf that specifies this, but it just seems to
> work. I figured that if I tried to use a crossover cable, what I would
> need to do is use /etc/hosts to create hostnames on this little private
> network (consisting of just the 2 nodes connected by a cable) and use
> those hostnames as the node hostnames in cluster.conf. If I did that,
> would the cluster services try to assign the cluster service IP to the
> interface with the crossover cable (when obviously what I want is to
> assign it to the outward-facing interface)?
>
> -Andrew L
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>   




From alacey at brynmawr.edu  Thu Apr 17 17:18:55 2008
From: alacey at brynmawr.edu (Andrew Lacey)
Date: Thu, 17 Apr 2008 13:18:55 -0400 (EDT)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <alpine.LRH.1.10.0804171759410.29947@skynet.shatteredsilicon.net>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>
	<1123.165.106.200.207.1208450968.squirrel@webmail>
	<alpine.LRH.1.10.0804171759410.29947@skynet.shatteredsilicon.net>
Message-ID: <1470.165.106.200.207.1208452735.squirrel@webmail>

> It will assign the IPs to whatever interface already has an IP on that
> subnet. i.e. if your private cluster interface (crossover one) is
> 192.168.0.0/16 and your public interface is 10.0.0.0/8, you will have a
> resource group with IPs on the 10.0.0.0/8 subnet, not on the
> 192.168.0.0/16 subnet.

> You will probably want to add additional monitoring against switch port
> failures here, as otherwise if the switch port of the master node fails
> (it does happen, I've seen many a switch with just 1-2 dead ports),
> the backup will not notice as it can verify that the primary is up and
> responding, and it will not fence it and fail over to itself. You'd end up
> with a working cluster but unavailable service. IIRC there is a
> monitor_link option in the resource spec for this kind of thing.

Very informative post...thanks! The scenario you mentioned with a dead
switch port (or a single unplugged network cable, or whatever) is
something I had thought about, and I considered it to be a strike against
using a crossover cable. But, this "monitor_link" sounds like it might be
exactly what I've been looking for. I'll research that and see what I can
find.

You asked in your other post how I can tell the difference between a
network outage that should cause a fence and one that shouldn't. What I
wanted to do was set it up so that a node that can't reach the switch will
never try to fence the other node. That way, if the switch is down and
nobody can reach it, then nobody will fence. If there is a single port
failure and one node can still reach the switch, then it will fence the
other node and take over the services.

Thanks,

-Andrew L



From gordan at bobich.net  Thu Apr 17 18:25:26 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 17 Apr 2008 19:25:26 +0100
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <1470.165.106.200.207.1208452735.squirrel@webmail>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>	<1123.165.106.200.207.1208450968.squirrel@webmail>	<alpine.LRH.1.10.0804171759410.29947@skynet.shatteredsilicon.net>
	<1470.165.106.200.207.1208452735.squirrel@webmail>
Message-ID: <48079616.3050108@bobich.net>

Andrew Lacey wrote:

> Very informative post...thanks! The scenario you mentioned with a dead
> switch port (or a single unplugged network cable, or whatever) is
> something I had thought about, and I considered it to be a strike against
> using a crossover cable.

How does that follow? With a switch in the middle your points of failure 
are:
cable, switch, cable

With just a crossover cable (actually, it doesn't have to be crossover - 
99% of NICs made in the past few years auto-detect and auto-negotiate 
whether they need to cross-over or not, so you can just use a 
straight-through cable - but that's getting off topic), you only have a 
single cable as a point of failure. That is certainly better than the 
alternative.

> But, this "monitor_link" sounds like it might be
> exactly what I've been looking for. I'll research that and see what I can
> find.

You don't need that on your cluster interface though. If the NIC or 
cable die, cluster will lose the connection to the other node and fence 
it. If you have something like iLO on multiple interfaces, you can 
specify multiple fencing devices, to ensure that you manage to fence the 
other node, regardless of which interface fails. But the crossover 
interface connecting the nodes is arguably the most reliable part of 
your 2-node cluster because it has the fewest components.

> You asked in your other post how I can tell the difference between a
> network outage that should cause a fence and one that shouldn't. What I
> wanted to do was set it up so that a node that can't reach the switch will
> never try to fence the other node. That way, if the switch is down and
> nobody can reach it, then nobody will fence. If there is a single port
> failure and one node can still reach the switch, then it will fence the
> other node and take over the services.

Is your switch managed? If so, you can use this as a fencing device 
simply have a node disable the other node's port. That way any 
subsequent attempts by the other node, to fence or do anything else, 
will not get anywhere. You may need to write your own fencing agent for 
that, though. I asked for fencing agent API in a post earlier, and there 
appears to be no conclusive documentation for this. I've been meaning to 
implement a fencing agent for exactly this sort of thing (fencing by 
disabling the switch port) on a 3Com switch.

Gordan



From lhh at redhat.com  Thu Apr 17 18:42:33 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 17 Apr 2008 14:42:33 -0400
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <4807079A.5060909@bobich.net>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
Message-ID: <1208457753.6053.138.camel@ayanami.boston.devel.redhat.com>

On Thu, 2008-04-17 at 09:17 +0100, Gordan Bobic wrote:
> Harri.Paivaniemi at tietoenator.com wrote:

> > 1. 2-node cluster. Can't start only one node to get cluster services up - it hangs in fencing and waits until I start te second node and immediately after that, when both nodes are starting cman, the cluster comes up. So if I have lost one node, I can't get the cluster up, if I have to restart for seome reason the working node. It should work like before (both nodes are down, I start one, it fences another and comes up). Now it just waits... log says:
> > 
> > ccsd[25272]: Error while processing connect: Connection refused
> > 
> > This is so common error message, that it just tell's nothing to me....
> 
> I have seen similar error messages before, and it has usually been 
> caused by either the node names/interfaces/IPs not being listed 
> correctly in /etc/hosts file, or iptables firewalling rules blocking 
> communication between the nodes.

It's probably also partly the cluster not being quorate.  ccsd is very
verbose, and it logs errors perhaps when it shouldn't...

> > 2. qdisk doesn't work. 2- node cluster. Start it (both nodes at the same time) to get it up. Works ok, qdisk works, heuristic works. Everything works. If I stop cluster daemons on one node, that node can't join to cluster anymore without a complete reboot. It joins, another node says ok, the node itself says ok, quorum is registred and heuristic is up, but the node's quorum-disk stays offline and another node says this node is offline. If I reboot this machine, it joins to cluster ok.

> I believe it's supposed to work that way. When a node fails it needs to 
> be fully restarted before it is allowed back into the cluster. I'm sure 
> this has been mentioned on the list recently.

If you cleanly stop the cluster daemons, fencing shouldn't be needed
here.  If the node's not getting allowed in to the cluster, there's some
reason for it.  A way to tell if a node's being rejected is:

   cman_tool status

If you see 'DisallowedNodes' (I think?), the current "quorate" partition
thinks that the other node needs to be fenced.  I don't remember the
cases that lead to this situation, though.

Anyway, clean stop of the cluster should never require fencing.


> > 3. Funny thing: heuristic ping didn't work at all in the beginning and support gave me a "ping-script" which make it to work... so this describes quite well how experimental this cluster is nowadays...
> > 
> > I have to tell you it is a FACT that basics are ok: fencing works ok in a normal situation, I don't have typos, configs are in sync,  everything is ok, but these problems still exists.
> 
> I've been in similar situations before, but in the end it always turned 
> out to be me doing something silly (see above re: host files and 
> iptables as examples).

Need for the ping-script is definitely a bug.  It's because ping uses
signals to wake itself up, and qdiskd blocked those signals before fork
(and of course, ping doesn't unblock signals itself).  It's fixed in
current 4.6.z/5.1.z errata (IIRC) and definitely in 5.2 beta.


> > I have 2 times sent sosreports etc. so RH support. They hava spent 3 weeks and still can't say whats wrong...

> Sadly, that seems to be the quality of commercial support from any 
> vendor. Support nowdays seems to have only one purpose - managerial 
> back-covering exercise so they can pass the buck.

It's unfortunate that this is the conception.

-- Lon



From alacey at brynmawr.edu  Thu Apr 17 18:51:55 2008
From: alacey at brynmawr.edu (Andrew Lacey)
Date: Thu, 17 Apr 2008 14:51:55 -0400 (EDT)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <48079616.3050108@bobich.net>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>	<1123.165.106.200.207.1208450968.squirrel@webmail>	<alpine.LRH.1.10.0804171759410.29947@skynet.shatteredsilicon.net>
	<1470.165.106.200.207.1208452735.squirrel@webmail>
	<48079616.3050108@bobich.net>
Message-ID: <1752.165.106.200.207.1208458315.squirrel@webmail>

>> Very informative post...thanks! The scenario you mentioned with a dead
>> switch port (or a single unplugged network cable, or whatever) is
>> something I had thought about, and I considered it to be a strike
>> against
>> using a crossover cable.
>
> How does that follow? With a switch in the middle your points of failure
> are:
> cable, switch, cable

The potential problem with the crossover cable design is: Although the
cluster communication goes over the crossover cable, the path to the
switch is used for user connections to the cluster service. Suppose node 1
is active and node 2 is standby. Node 1 loses its connection to the switch
for whatever reason, but node 2 doesn't. Since the heartbeat goes across
the crossover cable, the nodes think nothing is wrong, so no failover
occurs and the service is not reachable to users. If the service had
failed over to node 2 (which can still talk to the switch), it would be
reachable to users.

Eliminating the crossover cable and sending the cluster traffic through
the switch eliminates this problem nicely -- both nodes try to fence, but
node 1 can't reach anything, so node 2 kills node 1 and the service is up
on node 2. But then, of course, you have the pathological case when
neither node can talk to the switch until the downed switch comes back up,
and boom, they both fence each other.

Maybe the monitor_link option in conjunction with the crossover-cable
heartbeat will fix this. I'm in the process of setting that up right now,
so I'll post back when I have a result.

-Andrew L



From lhh at redhat.com  Thu Apr 17 18:54:56 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 17 Apr 2008 14:54:56 -0400
Subject: [Linux-cluster] Meaning of Cluster Cycle and timeout problems
In-Reply-To: <BA222AAE-0F73-46CC-9F02-7BB78CE2AD37@gmx.de>
References: <BA222AAE-0F73-46CC-9F02-7BB78CE2AD37@gmx.de>
Message-ID: <1208458496.6053.150.camel@ayanami.boston.devel.redhat.com>

On Thu, 2008-04-17 at 09:08 +0200, Peter wrote:
> Hi!
> 
> In our Cluster we have the following entry in the "messages" logfile:
> 
> "qdiskd[4314]: <warning> qdisk cycle took more than 3 seconds to  
> complete (3.890000)"

It means it took more than 3 seconds for one qdiskd cycle to complete.
This is a whole lot:

   8192 bytes in 16 block reads
   some internal calculations
   512  bytes in 1 block write

(that's it...)


> Theese messages are very frequent. I can not find anything except the  
> source code via google and i am sorry to say that i am not so familar  
> with c to get the point.
> 
> 
> We also have sometimes a quorum timeout:
> 
> "kernel: CMAN: Quorum device /dev/sdh timed out"
> 
> 
> Are theese two messages independent and what is the meaning of the  
> first message?


No, they're 100% related.  It sounds like qdiskd is getting starved for
I/O to /dev/sdh, or possibly it's getting CPU-starved for some reason.
Being that it's more or less a real-time program which helps keep the
cluster running, that's bad!  In your case, it's getting hung up for
longer than the cluster failover time, so CMAN thinks qdiskd has died.
Not good.


(1) Turn *off* status_file if you have it enabled!  It's for debugging,
and under certain load patterns, it can really slow down qdiskd.


(2) If you think it's I/O, what you should try is (assuming you're using
cluster2/rhel5/centos5/etc. here):

  echo deadline > /sys/block/sdh/queue

If you had a default of 10 seconds (1 interval 10 tko), you should also
do:

  echo 2500 > /sys/block/sdh/queue/iosched/write_expire

... you've got at least 3 for interval, so I'm not sure this would apply
to you.

[NOTE: On rhel4/centos4/stable, I think you have to set the I/O
scheduler globally in the kernel command line at system boot.]


(3) If you think qdiskd is getting CPU starved, you can adjust the
'scheduler' and 'priority' values in cluster.conf to something
different.  I think the man page might be wrong; I think the highest
'priority' value for the 'rr' scheduler is 99, not 100.  See the
qdisk(5) man page for more information on those.

-- Lon



From lhh at redhat.com  Thu Apr 17 18:56:29 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 17 Apr 2008 14:56:29 -0400
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
	<1122.165.106.200.207.1208450648.squirrel@webmail>
	<alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
Message-ID: <1208458589.6053.152.camel@ayanami.boston.devel.redhat.com>

On Thu, 2008-04-17 at 17:46 +0100, gordan at bobich.net wrote:
> On Thu, 17 Apr 2008, Andrew Lacey wrote:
> 
> >> but you could also just tune deadnode_timeout to be different on both
> >> nodes: this results the behaviour Gordan told - the node that has smaller
> >> deadnode_timeout would fence first.
> >
> > Would this work in a situation where the switch was down for a few
> > minutes? Suppose the deadnode_timeout is 30 seconds on one node and 60
> > seconds on the other. So, after 60 seconds of switch downtime, both nodes
> > would be trying to fence. If the switch comes up after being down for 5
> > minutes, they would still immediately fence each other. Or am I not
> > thinking about this correctly?
> 
> There's an argument that if your switch is down for 30 minutes, you 
> have bigger problems. If you have a 30 minute switch outage, the chances 
> are that you can live with the node power-up time on top of that.

... or an argument that maybe the 'sleep' delay in a fencing agent on a
given node isn't necessarily a bad thing after all :D

-- Lon



From lhh at redhat.com  Thu Apr 17 19:03:08 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 17 Apr 2008 15:03:08 -0400
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <alpine.LRH.1.10.0804171809270.29947@skynet.shatteredsilicon.net>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>
	<1122.165.106.200.207.1208450648.squirrel@webmail>
	<alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>
	<1127.165.106.200.207.1208451306.squirrel@webmail>
	<36F4E74FA8263744A6B016E6A461EFF603317E20@dino.eu.tieto.com>
	<alpine.LRH.1.10.0804171809270.29947@skynet.shatteredsilicon.net>
Message-ID: <1208458988.6053.157.camel@ayanami.boston.devel.redhat.com>

On Thu, 2008-04-17 at 18:10 +0100, gordan at bobich.net wrote:
> On Thu, 17 Apr 2008, Harri.Paivaniemi at tietoenator.com wrote:
> 
> > If you just want to have a cluster where client network can be down 
> > infinitely without cluster to take actions,
> > you have to run cluster heartbeat via cross-cable and deny cluster's 
> > link monitoring in client interface.
> >
> > Or then start using qdisk and build heuristics.
> 
> At that rate you might as well just not bother specifying a fencing device 
> - the whole cluster will just lock up until the network comes back and it 
> can re-connect and re-establish quorum.
> 
> > Note, that in RHCS 5 deadnode_timeout doesn't exist anymore in /proc. 
> > It's totem token there, but havn't checked where it lives in /proc or 
> > maby it's in /sys nowadays.
> 
> Thanks for that. :-)

It's just cluster.conf at this point.  <totem token="X"/>

It's not possible to specify different timeouts on different nodes as
you can with deadnode_timeout.  Of course, I think doing different
deadnode_timeouts is kind of nuts :D

-- Lon



From gordan at bobich.net  Thu Apr 17 21:28:54 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 17 Apr 2008 22:28:54 +0100
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <1208458988.6053.157.camel@ayanami.boston.devel.redhat.com>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>	<36F4E74FA8263744A6B016E6A461EFF603317E1F@dino.eu.tieto.com>	<1122.165.106.200.207.1208450648.squirrel@webmail>	<alpine.LRH.1.10.0804171745550.29947@skynet.shatteredsilicon.net>	<1127.165.106.200.207.1208451306.squirrel@webmail>	<36F4E74FA8263744A6B016E6A461EFF603317E20@dino.eu.tieto.com>	<alpine.LRH.1.10.0804171809270.29947@skynet.shatteredsilicon.net>
	<1208458988.6053.157.camel@ayanami.boston.devel.redhat.com>
Message-ID: <4807C116.1000609@bobich.net>

Lon Hohberger wrote:

>>> Note, that in RHCS 5 deadnode_timeout doesn't exist anymore in /proc. 
>>> It's totem token there, but havn't checked where it lives in /proc or 
>>> maby it's in /sys nowadays.
>> Thanks for that. :-)
> 
> It's just cluster.conf at this point.  <totem token="X"/>
> 
> It's not possible to specify different timeouts on different nodes as
> you can with deadnode_timeout.  Of course, I think doing different
> deadnode_timeouts is kind of nuts :D

Not at all. It is _vital_ when you only have 2 nodes.

Gordan



From barbos at gmail.com  Fri Apr 18 01:10:47 2008
From: barbos at gmail.com (Alex Kompel)
Date: Thu, 17 Apr 2008 18:10:47 -0700
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
In-Reply-To: <1208428065.21043.60.camel@hjpsuse.tebit>
References: <fu4qnq$gg2$1@ger.gmane.org> <fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
	<1208428065.21043.60.camel@hjpsuse.tebit>
Message-ID: <3ae027040804171810p6dffb05dp13f1a7fb9a0ddbae@mail.gmail.com>

2008/4/17 Harri P?iv?niemi <harri.paivaniemi at tietoenator.com>:
>
> The 2nd problem that still exists is:
>
> When node a and b are running and everything is ok. I stop node b's
> cluster daemons. when I start node b again, this situation stays
> forever:
>
> ----------------
> node a - clustat
> Member Status: Quorate
>
>  Member Name                        ID   Status
>  ------ ----                        ---- ------
>  areenasql1                            1 Online, Local, rgmanager
>  areenasql2                            2 Offline
>  /dev/sda                              0 Online, Quorum Disk
>
>  Service Name         Owner (Last)                   State
>  ------- ----         ----- ------                   -----
>  service:areena       areenasql1                     started
>
> -------------------
>
> node b - clustat
>
> Member Status: Quorate
>
>  Member Name                        ID   Status
>  ------ ----                        ---- ------
>  areenasql1                            1 Online, rgmanager
>  areenasql2                            2 Online, Local, rgmanager
>  /dev/sda                              0 Offline, Quorum Disk
>
>  Service Name         Owner (Last)                   State
>  ------- ----         ----- ------                   -----
>  service:areena       areenasql1                     started
>
>
> So node b's quorum disk is offline, log says it's registred ok and
> heuristic is UP... node a sees node b as offline. If I reboot node b, it
> works ok and joins ok...

Now that you have mentioned it - I remember stumbling upon the similar
problem. It happens if you restart the cluster services before the
cluster realizes the node is dead. I guess it is a bug since the node
is in some sort of limbo state at that moment reporting itsefl being
part of the cluster while the cluster does not recognize it as a
member. If you wait 70 seconds ( cluster.conf: <totem token="70000"/>
) before starting the cluster services then it will come up fine. The
reboot works for you because it take longer than 70 sec (correct me if
I am wrong). So try stopping node b cluster services, wait 70 secs and
then start them back up.

-Alex



From Harri.Paivaniemi at tietoenator.com  Fri Apr 18 04:23:29 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Fri, 18 Apr 2008 07:23:29 +0300
Subject: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
References: <fu4qnq$gg2$1@ger.gmane.org> <fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
	<1208428065.21043.60.camel@hjpsuse.tebit>
	<3ae027040804171810p6dffb05dp13f1a7fb9a0ddbae@mail.gmail.com>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E21@dino.eu.tieto.com>

Oh my dear Alex,

It really goes that way! - I just can't believe - you are one hell of a genious.

I havn't had a clue about it could be something this simple. It really works. I feel stupid.

So, I was really driving grazy with this cluster ver 5 yesterday, but now it seems that both of my problems are solved:

1. unable to bring just one node up in 2-node cluster - hanging in fencing / fence failed

Reason: cman was told (by RH) to be started before qdisk and this is wrong way.
Qdisk have to be started first in this situation, so fence_tool is not wondering why cluster is not quorate ;)

2. restart of cluster daemons not succesfull

Reason: You have to wait "token timeout" before starting again ;)

Great.

Thanks for all you. RH support has been thinking these problems 3 weeks now without success.

-hjp


-----Original Message-----
From: linux-cluster-bounces at redhat.com on behalf of Alex Kompel
Sent: Fri 4/18/2008 4:10
To: linux clustering
Subject: Re: [Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1
 
2008/4/17 Harri P?iv?niemi <harri.paivaniemi at tietoenator.com>:
>
> The 2nd problem that still exists is:
>
> When node a and b are running and everything is ok. I stop node b's
> cluster daemons. when I start node b again, this situation stays
> forever:
>
> ----------------
> node a - clustat
> Member Status: Quorate
>
>  Member Name                        ID   Status
>  ------ ----                        ---- ------
>  areenasql1                            1 Online, Local, rgmanager
>  areenasql2                            2 Offline
>  /dev/sda                              0 Online, Quorum Disk
>
>  Service Name         Owner (Last)                   State
>  ------- ----         ----- ------                   -----
>  service:areena       areenasql1                     started
>
> -------------------
>
> node b - clustat
>
> Member Status: Quorate
>
>  Member Name                        ID   Status
>  ------ ----                        ---- ------
>  areenasql1                            1 Online, rgmanager
>  areenasql2                            2 Online, Local, rgmanager
>  /dev/sda                              0 Offline, Quorum Disk
>
>  Service Name         Owner (Last)                   State
>  ------- ----         ----- ------                   -----
>  service:areena       areenasql1                     started
>
>
> So node b's quorum disk is offline, log says it's registred ok and
> heuristic is UP... node a sees node b as offline. If I reboot node b, it
> works ok and joins ok...

Now that you have mentioned it - I remember stumbling upon the similar
problem. It happens if you restart the cluster services before the
cluster realizes the node is dead. I guess it is a bug since the node
is in some sort of limbo state at that moment reporting itsefl being
part of the cluster while the cluster does not recognize it as a
member. If you wait 70 seconds ( cluster.conf: <totem token="70000"/>
) before starting the cluster services then it will come up fine. The
reboot works for you because it take longer than 70 sec (correct me if
I am wrong). So try stopping node b cluster services, wait 70 secs and
then start them back up.

-Alex

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 4652 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080418/b027096d/attachment.bin>

From harri.paivaniemi at tietoenator.com  Fri Apr 18 06:26:34 2008
From: harri.paivaniemi at tietoenator.com (Harri =?ISO-8859-1?Q?P=E4iv=E4niemi?=)
Date: Fri, 18 Apr 2008 09:26:34 +0300
Subject: [Linux-cluster] Is heuristic really required?
In-Reply-To: <alpine.LRH.1.10.0804171219160.28160@skynet.shatteredsilicon.net>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
	<1208428065.21043.60.camel@hjpsuse.tebit>
	<alpine.LRH.1.10.0804171219160.28160@skynet.shatteredsilicon.net>
Message-ID: <1208499994.21043.81.camel@hjpsuse.tebit>

Hi,

Qdisk man says at least 1 heuristic is reguired.

Is it?

I have (accidentally) tested and to my mind it worked fine without
heuristics. I gave 1 vote to quorumd and no <heuristic>- tag at all and
it seemed to work normally like 3-vote cluster should...



-hjp







From harri.paivaniemi at tietoenator.com  Fri Apr 18 07:48:55 2008
From: harri.paivaniemi at tietoenator.com (Harri =?ISO-8859-1?Q?P=E4iv=E4niemi?=)
Date: Fri, 18 Apr 2008 10:48:55 +0300
Subject: [Linux-cluster] Is heuristic really required?
In-Reply-To: <1208499994.21043.81.camel@hjpsuse.tebit>
References: <fu4qnq$gg2$1@ger.gmane.org>
	<alpine.LRH.1.10.0804161323550.20621@skynet.shatteredsilicon.net>
	<fu4vam$23n$1@ger.gmane.org>
	<36F4E74FA8263744A6B016E6A461EFF603317E1B@dino.eu.tieto.com>
	<4807079A.5060909@bobich.net>
	<1208423931.27774.2.camel@admc.win-rar.local>
	<1208424527.21043.33.camel@hjpsuse.tebit>
	<1208424763.27774.6.camel@admc.win-rar.local>
	<1208426290.21043.48.camel@hjpsuse.tebit>
	<1208427177.27774.8.camel@admc.win-rar.local>
	<1208428065.21043.60.camel@hjpsuse.tebit>
	<alpine.LRH.1.10.0804171219160.28160@skynet.shatteredsilicon.net>
	<1208499994.21043.81.camel@hjpsuse.tebit>
Message-ID: <1208504935.21043.83.camel@hjpsuse.tebit>

I'll answer myself,

Of course to be sure I can put "exit 0" to heuristic so it's officially
disabled...

-hjp


On Fri, 2008-04-18 at 09:26 +0300, Harri P?iv?niemi wrote:
> Hi,
> 
> Qdisk man says at least 1 heuristic is reguired.
> 
> Is it?
> 
> I have (accidentally) tested and to my mind it worked fine without
> heuristics. I gave 1 vote to quorumd and no <heuristic>- tag at all and
> it seemed to work normally like 3-vote cluster should...
> 
> 
> 
> -hjp
> 
> 
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From admin.cluster at gmail.com  Fri Apr 18 13:21:28 2008
From: admin.cluster at gmail.com (Anthony)
Date: Fri, 18 Apr 2008 15:21:28 +0200
Subject: [Linux-cluster] Replacing a RAID 5 disk, then...
Message-ID: <4808A058.50704@gmail.com>

Hello,

i had a problem with one of my 5 disks of my sun v40z server,
one of the disks had gone out of service, and i had that Beep sound , 
notifying me of a raid 5 problem,
so i ordered a new one, and replaced it , the beep sound is still on, i 
think that the raid 5 is beeing re-constructed!?!?!
now i want to know, what are the RedHat commands to issue to see what is 
happennig, and what is the state of my raid-5.

i am under RedHat Enterprise Linux AS 4.2.

Regards,
Anthony.



From Derek.Anderson at compellent.com  Fri Apr 18 13:52:14 2008
From: Derek.Anderson at compellent.com (Derek Anderson)
Date: Fri, 18 Apr 2008 08:52:14 -0500
Subject: [Linux-cluster] Replacing a RAID 5 disk, then...
In-Reply-To: <4808A058.50704@gmail.com>
References: <4808A058.50704@gmail.com>
Message-ID: <99E0F1976E2DA2499F3E6EB18B25F036042E7301@honeywheat.Beer.Town>

Anthony,

When the failed disk was replaced a RAID rebuild should have been
initiated; that is a good thing.  The RAID device will continue to
operate in degraded mode until the rebuild has completed.  I assume
that's why it is still beeping at you.

And the fact that something is beeping at you makes it sound like your
RAID is managed in hardware, in which case operating system commands
aren't going to give you an indication of what is happening.  Is this
true?  Are your disks contained in an external disk enclosure?  If so,
what kind.

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Anthony
Sent: Friday, April 18, 2008 8:21 AM
To: linux clustering
Subject: [Linux-cluster] Replacing a RAID 5 disk, then...

Hello,

i had a problem with one of my 5 disks of my sun v40z server,
one of the disks had gone out of service, and i had that Beep sound , 
notifying me of a raid 5 problem,
so i ordered a new one, and replaced it , the beep sound is still on, i 
think that the raid 5 is beeing re-constructed!?!?!
now i want to know, what are the RedHat commands to issue to see what is

happennig, and what is the state of my raid-5.

i am under RedHat Enterprise Linux AS 4.2.

Regards,
Anthony.

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From matthias.schlarb at sap.com  Fri Apr 18 13:55:44 2008
From: matthias.schlarb at sap.com (Matthias Schlarb)
Date: Fri, 18 Apr 2008 15:55:44 +0200
Subject: [Linux-cluster] operation configuration: start timeout
Message-ID: <4808A860.4070300@sap.com>

Hello,

how can I add a start timeout for a cluster resource in cluster.conf? 
And what is the default there respectively how much time does a resource 
agent have to start his resource before the cluster assumes that the 
start has failed? The PDF and the system-config-cluster doesn't provide 
anything about that and I don't have a reference for the config xml.

Many thanks in advance and best regards,
-- 
Matthias Schlarb



From admin.cluster at gmail.com  Fri Apr 18 14:05:48 2008
From: admin.cluster at gmail.com (Anthony)
Date: Fri, 18 Apr 2008 16:05:48 +0200
Subject: [Linux-cluster] Replacing a RAID 5 disk, then...
In-Reply-To: <99E0F1976E2DA2499F3E6EB18B25F036042E7301@honeywheat.Beer.Town>
References: <4808A058.50704@gmail.com>
	<99E0F1976E2DA2499F3E6EB18B25F036042E7301@honeywheat.Beer.Town>
Message-ID: <4808AABC.5030201@gmail.com>

Dear Derek,

Thanks a lot for your answer, indeed, i have a Hardware Raid 5, it is an 
internal MegaRaid product.
i hope that the reconstruction will be ok.
Any ideas how much time the reconstruction will take (it is 5*146GB SCSI 
HD)?


Regards,
Anthony.


Derek Anderson wrote:
> Anthony,
>
> When the failed disk was replaced a RAID rebuild should have been
> initiated; that is a good thing.  The RAID device will continue to
> operate in degraded mode until the rebuild has completed.  I assume
> that's why it is still beeping at you.
>
> And the fact that something is beeping at you makes it sound like your
> RAID is managed in hardware, in which case operating system commands
> aren't going to give you an indication of what is happening.  Is this
> true?  Are your disks contained in an external disk enclosure?  If so,
> what kind.
>
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Anthony
> Sent: Friday, April 18, 2008 8:21 AM
> To: linux clustering
> Subject: [Linux-cluster] Replacing a RAID 5 disk, then...
>
> Hello,
>
> i had a problem with one of my 5 disks of my sun v40z server,
> one of the disks had gone out of service, and i had that Beep sound , 
> notifying me of a raid 5 problem,
> so i ordered a new one, and replaced it , the beep sound is still on, i 
> think that the raid 5 is beeing re-constructed!?!?!
> now i want to know, what are the RedHat commands to issue to see what is
>
> happennig, and what is the state of my raid-5.
>
> i am under RedHat Enterprise Linux AS 4.2.
>
> Regards,
> Anthony.
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>   



From isplist at logicore.net  Fri Apr 18 16:35:38 2008
From: isplist at logicore.net (isplist at logicore.net)
Date: Fri, 18 Apr 2008 11:35:38 -0500
Subject: [Linux-cluster] XEN and IBM x440
Message-ID: <2008418113538.293228@leena>

Anyone using the IBM x440 and has installed rhel50 xen on it? 
After install, it can't seem to find the root and other files so just keeps rebooting.

Mike




From alacey at brynmawr.edu  Fri Apr 18 16:55:22 2008
From: alacey at brynmawr.edu (Andrew Lacey)
Date: Fri, 18 Apr 2008 12:55:22 -0400 (EDT)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <48079616.3050108@bobich.net>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>	<1123.165.106.200.207.1208450968.squirrel@webmail>	<alpine.LRH.1.10.0804171759410.29947@skynet.shatteredsilicon.net>
	<1470.165.106.200.207.1208452735.squirrel@webmail>
	<48079616.3050108@bobich.net>
Message-ID: <4947.165.106.200.207.1208537722.squirrel@webmail>

Just an update...I set up the cluster to communicate over a crossover
cable, and to monitor_link for the public IP address. This works great for
the scenario where one node loses its public network link (services fail
over to the other node, without fencing), and reasonably well for the
scenario where both lose their public links (cluster services stop cleanly
after both nodes realize they lost their links, but nothing is fenced).

The only remaining thing I want to do is get the deadnode_timeout set up
so that if the crossover cable link is lost, one node will always fence
the other first (rather than both at the same time). I tried just changing
the value stored in /proc/cluster/config/cman/deadnode_timeout, but this
does not hold after a reboot (it changes back to default 21). Does anyone
know the right way to change this value on one node only?

-Andrew L



From gordan at bobich.net  Fri Apr 18 17:10:10 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Fri, 18 Apr 2008 18:10:10 +0100 (BST)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <4947.165.106.200.207.1208537722.squirrel@webmail>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>
	<1123.165.106.200.207.1208450968.squirrel@webmail>
	<alpine.LRH.1.10.0804171759410.29947@skynet.shatteredsilicon.net>
	<1470.165.106.200.207.1208452735.squirrel@webmail>
	<48079616.3050108@bobich.net>
	<4947.165.106.200.207.1208537722.squirrel@webmail>
Message-ID: <alpine.LRH.1.10.0804181808560.6349@skynet.shatteredsilicon.net>

On Fri, 18 Apr 2008, Andrew Lacey wrote:

> The only remaining thing I want to do is get the deadnode_timeout set up
> so that if the crossover cable link is lost, one node will always fence
> the other first (rather than both at the same time). I tried just changing
> the value stored in /proc/cluster/config/cman/deadnode_timeout, but this
> does not hold after a reboot (it changes back to default 21). Does anyone
> know the right way to change this value on one node only?

echo "100" > /proc/cluster/config/cman/deadnode_timeout

in /etc/rc.local?

Gordan



From jerlyon at gmail.com  Fri Apr 18 19:49:50 2008
From: jerlyon at gmail.com (Jeremy Lyon)
Date: Fri, 18 Apr 2008 13:49:50 -0600
Subject: [Linux-cluster] script resource start, stop and status timeouts
Message-ID: <779919740804181249k624a6a95le2ffe7c81319bf4@mail.gmail.com>

Hi,

I'm currently using cluster on RHEL 4.6 and will be soon using moving to
cluster on RHEL 5.1.  We are using some script resources and I'm trying to
find if there are timeouts on the start, stop and status functions.  If so,
what are the defaults and can they be tuned?

TIA
Jeremy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080418/2dd472d1/attachment.htm>

From alacey at brynmawr.edu  Fri Apr 18 20:00:54 2008
From: alacey at brynmawr.edu (Andrew Lacey)
Date: Fri, 18 Apr 2008 16:00:54 -0400 (EDT)
Subject: [Linux-cluster] IP-based tie-breaker on a 2-node cluster?
In-Reply-To: <alpine.LRH.1.10.0804181808560.6349@skynet.shatteredsilicon.net>
References: <1032.165.106.200.207.1208447225.squirrel@webmail>
	<alpine.LRH.1.10.0804171653160.29413@skynet.shatteredsilicon.net>
	<1123.165.106.200.207.1208450968.squirrel@webmail>
	<alpine.LRH.1.10.0804171759410.29947@skynet.shatteredsilicon.net>
	<1470.165.106.200.207.1208452735.squirrel@webmail>
	<48079616.3050108@bobich.net>
	<4947.165.106.200.207.1208537722.squirrel@webmail>
	<alpine.LRH.1.10.0804181808560.6349@skynet.shatteredsilicon.net>
Message-ID: <2519.165.106.200.207.1208548854.squirrel@webmail>

> echo "100" > /proc/cluster/config/cman/deadnode_timeout
>
> in /etc/rc.local?
>
> Gordan

That did it. Thanks!

-Andrew L



From Harri.Paivaniemi at tietoenator.com  Sat Apr 19 03:39:19 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Sat, 19 Apr 2008 06:39:19 +0300
Subject: [Linux-cluster] script resource start, stop and status timeouts
References: <779919740804181249k624a6a95le2ffe7c81319bf4@mail.gmail.com>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E23@dino.eu.tieto.com>

Those can be tuned,

/usr/share/cluster/script.sh

... is for normal init- scripts. There is also plenty of other scripts there for different purposes.

In that script:

<actions>
        <action name="start" timeout="0"/>
        <action name="stop" timeout="0"/>


.... tels it's infinite timeout by default.

-hjp



-----Original Message-----
From: linux-cluster-bounces at redhat.com on behalf of Jeremy Lyon
Sent: Fri 4/18/2008 22:49
To: linux-cluster at redhat.com
Subject: [Linux-cluster] script resource start, stop and status timeouts
 
Hi,

I'm currently using cluster on RHEL 4.6 and will be soon using moving to
cluster on RHEL 5.1.  We are using some script resources and I'm trying to
find if there are timeouts on the start, stop and status functions.  If so,
what are the defaults and can they be tuned?

TIA
Jeremy

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 2996 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080419/85645273/attachment.bin>

From Harri.Paivaniemi at tietoenator.com  Sat Apr 19 04:44:59 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Sat, 19 Apr 2008 07:44:59 +0300
Subject: [Linux-cluster] Howto multicast?
References: <4808A058.50704@gmail.com><99E0F1976E2DA2499F3E6EB18B25F036042E7301@honeywheat.Beer.Town>
	<4808AABC.5030201@gmail.com>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E24@dino.eu.tieto.com>

Please explain me,

I havn't use multicasting very much so now I have problems to understand this RHCS 5- communication.

I have tought it goes this way:

- cman uses either broadcast or multicast nowadays (multicast by default)
- openais uses always multicast, with it's default address

Is this right? What is the difference there - if we have openais communicating, why there is another thing also communicating? I have been told that openais is cman's communication channel so what the heck?

So, If I have 2 clusters in the same subnet, how to tell these things to be different?


D. Teigland says:

""
When openais is started by cman, the openais.conf file is not used.  Many of
the configuration parameters listed in openais.conf can be set in cluster.conf
instead.

See the openais.conf man page for the specific parameters that can be set in
these sections.  Note that settings in the <clusternodes> section will
override any comparable settings in the openais sections above (in particular,
bindnetaddr, mcastaddr, mcastport and nodeid will always be replaced by values
in <clusternodes>).
""

So, if I put in cluster.conf:

<clusternode....>
<multicast addr="239.192.204.100" interface="bond0"/>

... for each node, what it configures? Cman or openais? 

You see, if I do that, log says still:

--------
openais[15321]: [MAIN ] Using default multicast address of 239.192.204.175
----------

And then, I could also tell in <cman...>, that

<multicast addr="239.192.204.100" interface="bond0"/>

... but what it configures? Still netstat -g shows only that default address there...


So if somebody knows, how this goes, please tell me. Please...


-hjp





-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 3289 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080419/a92e0443/attachment.bin>

From ccaulfie at redhat.com  Mon Apr 21 07:17:36 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Mon, 21 Apr 2008 08:17:36 +0100
Subject: [Linux-cluster] Howto multicast?
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E24@dino.eu.tieto.com>
References: <4808A058.50704@gmail.com><99E0F1976E2DA2499F3E6EB18B25F036042E7301@honeywheat.Beer.Town>	<4808AABC.5030201@gmail.com>
	<36F4E74FA8263744A6B016E6A461EFF603317E24@dino.eu.tieto.com>
Message-ID: <480C3F90.4080203@redhat.com>

Harri.Paivaniemi at tietoenator.com wrote:
> Please explain me,
> 
> I havn't use multicasting very much so now I have problems to understand this RHCS 5- communication.
> 
> I have tought it goes this way:
> 
> - cman uses either broadcast or multicast nowadays (multicast by default)
> - openais uses always multicast, with it's default address
> 
> Is this right? What is the difference there - if we have openais communicating, why there is another thing also communicating? I have been told that openais is cman's communication channel so what the heck?

There are several "other thing"s communicating, as well as openais. In
particular tou might see ccsd or DLM traffic. Neither of which uses the
openais transports.


> So, If I have 2 clusters in the same subnet, how to tell these things to be different?

cman separates the cluster by cluster name. In fact it hashes the name
into a 16 bit cluster number and uses that to generate a multicast
address. Though this can be overriden in cluster.conf. If you have two
cluster son one subnet then they will probably use different multicast
addresses if the hash is different. If you're unlucky enough to have a
clash of hashes and they two clusters decide to use the same multicast
address, then you can override either the cluster ID or the multicast
address.

<cman cluster_id="xxx" />

or
<cman> <multicast addr="239.192.204.100/>
</cman>

> 
> D. Teigland says:
> 
> ""
> When openais is started by cman, the openais.conf file is not used.  Many of
> the configuration parameters listed in openais.conf can be set in cluster.conf
> instead.
> 
> See the openais.conf man page for the specific parameters that can be set in
> these sections.  Note that settings in the <clusternodes> section will
> override any comparable settings in the openais sections above (in particular,
> bindnetaddr, mcastaddr, mcastport and nodeid will always be replaced by values
> in <clusternodes>).
> ""
> 
> So, if I put in cluster.conf:
> 
> <clusternode....>
> <multicast addr="239.192.204.100" interface="bond0"/>
> 
> ... for each node, what it configures? Cman or openais? 

cman runs as a plugin to openais. so cman uses openais as its messaggn
and membership system


-- 

Chrissie



From p.elmers at gmx.de  Mon Apr 21 08:53:04 2008
From: p.elmers at gmx.de (Peter)
Date: Mon, 21 Apr 2008 10:53:04 +0200
Subject: [Linux-cluster] Meaning of Cluster Cycle and timeout problems -
	GFS 100% cpu utilization
In-Reply-To: <1208458496.6053.150.camel@ayanami.boston.devel.redhat.com>
References: <BA222AAE-0F73-46CC-9F02-7BB78CE2AD37@gmx.de>
	<1208458496.6053.150.camel@ayanami.boston.devel.redhat.com>
Message-ID: <3BB695F1-D2A9-4FDA-8373-9229C15838C2@gmx.de>

Hi,

Thanks for the fast response!

It looks like GFS causes 100% cpu utilization and therefore the qdiskd  
process has no processor time.

Is this a known problem and has anyone seen such behavior before?

We are using rhel 4.5 with the following packages:

ccs-1.0.11-1.x86_64.rpm
cman-1.0.17-0.x86_64.rpm
cman-kernel-2.6.9-53.5.x86_64.rpm
dlm-1.0.7-1.x86_64.rpm
dlm-kernel-2.6.9-52.2.x86_64.rpm
fence-1.32.50-2.x86_64.rpm
GFS-6.1.15-1.x86_64.rpm
GFS-kernel-2.6.9-75.9.x86_64.rpm
gulm-1.0.10-0.x86_64.rpm
iddev-2.0.0-4.x86_64.rpm
lvm2-cluster-2.02.27-2.el4.x86_64.rpm
magma-1.0.8-1.x86_64.rpm
magma-plugins-1.0.12-0.x86_64.rpm
perl-Net-Telnet-3.03-3.noarch.rpm
rgmanager-1.9.72-1.x86_64.rpm
system-config-cluster-1.0.51-2.0.noarch.rpm

The Kernel is 2.6.9-55.

Thanks for reading and answering,


Peter

Am 17.04.2008 um 20:54 schrieb Lon Hohberger:

> On Thu, 2008-04-17 at 09:08 +0200, Peter wrote:
>> Hi!
>>
>> In our Cluster we have the following entry in the "messages" logfile:
>>
>> "qdiskd[4314]: <warning> qdisk cycle took more than 3 seconds to
>> complete (3.890000)"
>
> It means it took more than 3 seconds for one qdiskd cycle to complete.
> This is a whole lot:
>
>   8192 bytes in 16 block reads
>   some internal calculations
>   512  bytes in 1 block write
>
> (that's it...)
>
>
>> Theese messages are very frequent. I can not find anything except the
>> source code via google and i am sorry to say that i am not so familar
>> with c to get the point.
>>
>>
>> We also have sometimes a quorum timeout:
>>
>> "kernel: CMAN: Quorum device /dev/sdh timed out"
>>
>>
>> Are theese two messages independent and what is the meaning of the
>> first message?
>
>
> No, they're 100% related.  It sounds like qdiskd is getting starved  
> for
> I/O to /dev/sdh, or possibly it's getting CPU-starved for some reason.
> Being that it's more or less a real-time program which helps keep the
> cluster running, that's bad!  In your case, it's getting hung up for
> longer than the cluster failover time, so CMAN thinks qdiskd has died.
> Not good.
>
>
> (1) Turn *off* status_file if you have it enabled!  It's for  
> debugging,
> and under certain load patterns, it can really slow down qdiskd.
>
>
> (2) If you think it's I/O, what you should try is (assuming you're  
> using
> cluster2/rhel5/centos5/etc. here):
>
>  echo deadline > /sys/block/sdh/queue
>
> If you had a default of 10 seconds (1 interval 10 tko), you should  
> also
> do:
>
>  echo 2500 > /sys/block/sdh/queue/iosched/write_expire
>
> ... you've got at least 3 for interval, so I'm not sure this would  
> apply
> to you.
>
> [NOTE: On rhel4/centos4/stable, I think you have to set the I/O
> scheduler globally in the kernel command line at system boot.]
>
>
> (3) If you think qdiskd is getting CPU starved, you can adjust the
> 'scheduler' and 'priority' values in cluster.conf to something
> different.  I think the man page might be wrong; I think the highest
> 'priority' value for the 'rr' scheduler is 99, not 100.  See the
> qdisk(5) man page for more information on those.
>
> -- Lon
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2209 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080421/fe336524/attachment.p7s>

From rpeterso at redhat.com  Mon Apr 21 12:08:07 2008
From: rpeterso at redhat.com (Bob Peterson)
Date: Mon, 21 Apr 2008 07:08:07 -0500
Subject: [Linux-cluster] Meaning of Cluster Cycle and timeout problems
	- GFS 100% cpu utilization
In-Reply-To: <3BB695F1-D2A9-4FDA-8373-9229C15838C2@gmx.de>
References: <BA222AAE-0F73-46CC-9F02-7BB78CE2AD37@gmx.de>
	<1208458496.6053.150.camel@ayanami.boston.devel.redhat.com>
	<3BB695F1-D2A9-4FDA-8373-9229C15838C2@gmx.de>
Message-ID: <1208779687.31105.55.camel@technetium.msp.redhat.com>

On Mon, 2008-04-21 at 10:53 +0200, Peter wrote:
> Hi,
> 
> Thanks for the fast response!
> 
> It looks like GFS causes 100% cpu utilization and therefore the qdiskd  
> process has no processor time.
> 
> Is this a known problem and has anyone seen such behavior before?

Hi Peter,

I'm not aware of any problems in GFS that cause this symptom.
Can you get a call trace with the magic sysrq key? (i.e. sysrq-t)

Regards,

Bob Peterson
Red Hat Clustering & GFS




From martin.fuerstenau at oce.com  Mon Apr 21 13:39:42 2008
From: martin.fuerstenau at oce.com (Martin Fuerstenau)
Date: Mon, 21 Apr 2008 15:39:42 +0200
Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error -16
Message-ID: <1208785182.15835.27.camel@lx002140.ops.de>

Hi there,

I am running a 2 node cluster with Centos 5. After a clusterswitch last
weekend I am not able to mount one of the filesystems on both nodes.
Read a lot last hours in the internet but unfortunately I found neither
a solution nor a hint where the error is.

When I try 

mount -t gfs /dev/VolGroup01/ClusterVol01 /mnt/tmp


I get 

/sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -16
/sbin/mount.gfs: error mounting lockproto lock_dlm

Has anybody an idea where to look for or what to do?

Thanx, Martin


Visit Oce at drupa! Register online now: <http://drupa.oce.com>

This message and attachment(s) are intended solely for use by the addressee and may contain information that is privileged, confidential or otherwise exempt from disclosure under applicable law.

If you are not the intended recipient or agent thereof responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited.

If you have received this communication in error, please notify the sender immediately by telephone and with a 'reply' message.

Thank you for your co-operation.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080421/b8c6ec83/attachment.htm>

From leo.pleiman at yahoo.com  Mon Apr 21 13:48:53 2008
From: leo.pleiman at yahoo.com (Leo Pleiman)
Date: Mon, 21 Apr 2008 06:48:53 -0700 (PDT)
Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error -16
Message-ID: <419691.64123.qm@web56910.mail.re3.yahoo.com>

I believe you will find that the nodes haven't properly joined the fence domain. Try the following commands:

ccs_tool lsfence
ccs_tool lsnode
cman_tool status
cman_tool services

...also check /var/log/messages
 
--Leo

----- Original Message ----
From: Martin Fuerstenau <martin.fuerstenau at oce.com>
To: linux-cluster at redhat.com
Sent: Monday, April 21, 2008 9:39:42 AM
Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error -16

    Hi there,

I am running a 2 node cluster with Centos 5. After a clusterswitch last weekend I am not able to mount one of the filesystems on both nodes. Read a lot last hours in the internet but unfortunately I found neither a solution nor a hint where the error is.

When I try 

mount -t gfs /dev/VolGroup01/ClusterVol01 /mnt/tmp


I get 

/sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -16
/sbin/mount.gfs: error mounting lockproto lock_dlm

Has anybody an idea where to look for or what to do?

Thanx, Martin
 
This message and attachment(s) are intended solely for use by the addressee and may contain information that is privileged, confidential or otherwise exempt from disclosure under applicable law.

If you are not the intended recipient or agent thereof responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited.

If you have received this communication in error, please notify the sender immediately by telephone and with a 'reply' message.

Thank you for your co-operation.




-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080421/d4390436/attachment.htm>

From lists at tangent.co.za  Mon Apr 21 14:00:04 2008
From: lists at tangent.co.za (Chris Picton)
Date: Mon, 21 Apr 2008 14:00:04 +0000 (UTC)
Subject: [Linux-cluster] GNBD speed
Message-ID: <fui6l2$h0i$1@ger.gmane.org>

Hi All

I am testing the following scenario:

A DRBD mirror between two servers, which heartbeat failover the drbd 
primary, gnbd export and ip address.

I am trying to find potential bottlenecks, and have done the following 
tests.

Network speed between the DRBD servers (A and B)
---------------------------------------------------------
(A) dd if=/dev/zero bs=1G count=1 | nc 10.100.1.2 5001 
(B) nc k -l 5001 | dd of=/dev/null
(A) reports:
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 7.6384 seconds, 141 MB/s

DRBD sync speed:
----------------------------------------------------------
dd if=/dev/zero bs=1G count=1 of=/dev/drbd0 oflag=sync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 10.7832 seconds, 99.6 MB/s

Network speed between GNBD export (A) and import (C)
-----------------------------------------------------------
(C) dd if=/dev/zero bs=1G count=1 | nc nfs1 5001
(A) nc -k -l 5001 | dd of=/dev/null
(C) reports:
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 10.4001 seconds, 103 MB/s

Network speed between GNBD import (C) and export (A)
-----------------------------------------------------------
(A) dd if=/dev/zero bs=1G count=1 | nc 10.200.3.10 5001
(C) nc -k -l 5001 | dd of=/dev/null
(A) reports:
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 10.4001 seconds, 93 MB/s


So I have established that writing to drbd directly is fast, and network 
speed is fast

However, using gnbd as follows:
on the drbd server:
gnbd_serv -n
/sbin/gnbd_export -c -e r0 -d /dev/drbd0

On the client:
gnbd_import -i 10.200.3.3

I try the speed tests over the gnbd devices:

Reading from GNBD:
------------------------------------------------------------
dd if=/dev/gnbd0 of=/dev/null bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 17.0842 seconds, 62.8 MB/s

Writing to GNBD (no sync flag)
------------------------------------------------------------
dd if=/dev/zero of=/dev/gnbd0 bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 54.4142 seconds, 19.7 MB/s

Writing to GNBD (sync flag)
------------------------------------------------------------
dd if=/dev/zero of=/dev/gnbd0 bs=1G count=1 oflag=sync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 53.3085 seconds, 20.1 MB/s



I am almost happy with the 62 Mb/s read speed, but the 20 MB/sec write 
speed seems a bit low, compared to the write rate to drbd, and the 
network speed.

Can anyone give any hints for optimising the gnbd write speed (and read 
speed)


Chris






From martin.fuerstenau at oce.com  Mon Apr 21 14:14:01 2008
From: martin.fuerstenau at oce.com (Martin Fuerstenau)
Date: Mon, 21 Apr 2008 16:14:01 +0200
Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error -16
In-Reply-To: <419691.64123.qm@web56910.mail.re3.yahoo.com>
References: <419691.64123.qm@web56910.mail.re3.yahoo.com>
Message-ID: <1208787241.19274.6.camel@lx002140.ops.de>

Hi there,

there is a main difference when I use "cman_tool services". On the
working node:

[root at node2 ~]# cman_tool services
type             level name          id       state
fence            0     default       00010001 none
[1 2]
dlm              1     clvmd         00020001 none
[1 2]
dlm              1     CfusterGFS01  00040001 none
[1 2]
dlm              1     ClusterGFS02  00060001 none
[2]
dlm              1     rgmanager     00070001 none
[1 2]
gfs              2     CfusterGFS01  00030001 none
[1 2]
gfs              2     ClusterGFS02  00050001 none
[2]


On the non-working node:
[root at node1 ~]# cman_tool services
type             level name          id       state
fence            0     default       00010001 none
[1 2]
dlm              1     clvmd         00020001 none
[1 2]
dlm              1     CfusterGFS01  00040001 none
[1 2]
dlm              1     rgmanager     00070001 none
[1 2]
gfs              2     CfusterGFS01  00030001 none
[1 2]

Well - but  CfusterGFS01 is not mounted on node1. Seems to me like dlm
(or gfs) is thinking that the filesystem is mounted. But it isn't. Has
anybody an idea how to kick it out?

Thx - Martin

On Mon, 2008-04-21 at 06:48 -0700, Leo Pleiman wrote:
> I believe you will find that the nodes haven't properly joined the
> fence domain. Try the following commands:
> 
> ccs_tool lsfence
> ccs_tool lsnode
> cman_tool status
> cman_tool services
> 
> ...also check /var/log/messages
>  
> --Leo
> 
> 
> ----- Original Message ----
> From: Martin Fuerstenau <martin.fuerstenau at oce.com>
> To: linux-cluster at redhat.com
> Sent: Monday, April 21, 2008 9:39:42 AM
> Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error
> -16
> 
> Hi there,
> 
> I am running a 2 node cluster with Centos 5. After a clusterswitch
> last weekend I am not able to mount one of the filesystems on both
> nodes. Read a lot last hours in the internet but unfortunately I found
> neither a solution nor a hint where the error is.
> 
> When I try 
> 
> mount -t gfs /dev/VolGroup01/ClusterVol01 /mnt/tmp
> 
> 
> I get 
> 
> /sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -16
> /sbin/mount.gfs: error mounting lockproto lock_dlm
> 
> Has anybody an idea where to look for or what to do?
> 
> Thanx, Martin 

> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

Visit Oce at drupa! Register online now: <http://drupa.oce.com>

This message and attachment(s) are intended solely for use by the addressee and may contain information that is privileged, confidential or otherwise exempt from disclosure under applicable law.

If you are not the intended recipient or agent thereof responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited.

If you have received this communication in error, please notify the sender immediately by telephone and with a 'reply' message.

Thank you for your co-operation.




From connerf at ncifcrf.gov  Mon Apr 21 15:03:10 2008
From: connerf at ncifcrf.gov (fred conner)
Date: Mon, 21 Apr 2008 11:03:10 -0400
Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error -16
In-Reply-To: <419691.64123.qm@web56910.mail.re3.yahoo.com>
References: <419691.64123.qm@web56910.mail.re3.yahoo.com>
Message-ID: <1208790190.3205.12.camel@norbert>

try running the command 
fence_tool join
on the node you are getting the error
then mount the filesystem

On Mon, 2008-04-21 at 06:48 -0700, Leo Pleiman wrote:
> I believe you will find that the nodes haven't properly joined the
> fence domain. Try the following commands:
> 
> ccs_tool lsfence
> ccs_tool lsnode
> cman_tool status
> cman_tool services
> 
> ...also check /var/log/messages
>  
> --Leo
> 
> 
> ----- Original Message ----
> From: Martin Fuerstenau <martin.fuerstenau at oce.com>
> To: linux-cluster at redhat.com
> Sent: Monday, April 21, 2008 9:39:42 AM
> Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error
> -16
> 
> Hi there,
> 
> I am running a 2 node cluster with Centos 5. After a clusterswitch
> last weekend I am not able to mount one of the filesystems on both
> nodes. Read a lot last hours in the internet but unfortunately I found
> neither a solution nor a hint where the error is.
> 
> When I try 
> 
> mount -t gfs /dev/VolGroup01/ClusterVol01 /mnt/tmp
> 
> 
> I get 
> 
> /sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -16
> /sbin/mount.gfs: error mounting lockproto lock_dlm
> 
> Has anybody an idea where to look for or what to do?
> 
> Thanx, Martin 
> Visit Oce at drupa! Register online now:
> 
> This message and attachment(s) are intended solely for use by the
> addressee and may contain information that is privileged, confidential
> or otherwise exempt from disclosure under applicable law. If you are
> not the intended recipient or agent thereof responsible for delivering
> this message to the intended recipient, you are hereby notified that
> any dissemination, distribution or copying of this communication is
> strictly prohibited. If you have received this communication in error,
> please notify the sender immediately by telephone and with a 'reply'
> message. Thank you for your co-operation.
> 
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-- 
Fred Conner [Contractor]




From lhh at redhat.com  Mon Apr 21 16:58:14 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 21 Apr 2008 12:58:14 -0400
Subject: [Linux-cluster] script resource start, stop and status timeouts
In-Reply-To: <779919740804181249k624a6a95le2ffe7c81319bf4@mail.gmail.com>
References: <779919740804181249k624a6a95le2ffe7c81319bf4@mail.gmail.com>
Message-ID: <1208797094.23820.1.camel@ayanami.boston.devel.redhat.com>

On Fri, 2008-04-18 at 13:49 -0600, Jeremy Lyon wrote:
> Hi,
> 
> I'm currently using cluster on RHEL 4.6 and will be soon using moving
> to cluster on RHEL 5.1.  We are using some script resources and I'm
> trying to find if there are timeouts on the start, stop and status
> functions.  If so, what are the defaults and can they be tuned?

Nope, they behave the same as on rhel4.6 right now.

-- Lon




From lhh at redhat.com  Mon Apr 21 16:58:37 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 21 Apr 2008 12:58:37 -0400
Subject: [Linux-cluster] script resource start, stop and status timeouts
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E23@dino.eu.tieto.com>
References: <779919740804181249k624a6a95le2ffe7c81319bf4@mail.gmail.com>
	<36F4E74FA8263744A6B016E6A461EFF603317E23@dino.eu.tieto.com>
Message-ID: <1208797117.23820.3.camel@ayanami.boston.devel.redhat.com>

On Sat, 2008-04-19 at 06:39 +0300, Harri.Paivaniemi at tietoenator.com
wrote:
> Those can be tuned,
> 
> /usr/share/cluster/script.sh
> 
> ... is for normal init- scripts. There is also plenty of other scripts there for different purposes.
> 
> In that script:
> 
> <actions>
>         <action name="start" timeout="0"/>
>         <action name="stop" timeout="0"/>
> 
> 
> .... tels it's infinite timeout by default.

True, but the timeouts aren't currently enforced.

-- Lon




From Samuel.Kielek at marriott.com  Mon Apr 21 17:22:21 2008
From: Samuel.Kielek at marriott.com (Kielek, Samuel)
Date: Mon, 21 Apr 2008 13:22:21 -0400
Subject: [Linux-cluster] Event in one failover domain affecting another
	separate failover domain
Message-ID: <140D865F4BA13C4B9D3AFEFEAD1EA532057BB5EE@HDQNCEXCL1V2.mihdq.marrcorp.marriott.com>

I have a 3 node RHEL 4.6 cluster with two failover domains. The idea is
that two of the nodes are primary for their respective services and the
remaining node is a shared failover node for both of the services. Here
is an example of how the two ordered domains are configured:

DOMAIN_ONE (service_one)
  server_a (priority=1)
  server_b (priority=2)

DOMAIN_TWO (service_two)
  server_c (priority=1)
  server_b (priority=2)

The issue I have observed is that when server_c (DOMAIN_TWO) had an
issue that led to it being fenced, the service running on server_a
(service_one) immediately stopped and relocated to server_b (the
recovery action is set to "relocate" for both services). What I don't
understand is how a failure in DOMAIN_TWO with service_two on server_c
would affect service_one running on server_a in DOMAIN_ONE. The logs do
not provide any obvious hints. Here is a snippet from the messages log
on server_a for this time period:

11:10:56 server_a fenced[11638]: fencing node "server_c" 
11:12:03 server_a fenced[11638]: fence "server_c" success 
11:12:04 server_a clurgmgrd[11776]: <info> Magma Event: Membership
Change 
11:12:04 server_a clurgmgrd[11776]: <info> State change: server_c DOWN 
11:12:04 server_a clurgmgrd[11776]: <notice> Stopping service
service_one 
11:12:04 server_a clurgmgrd: [11776]: <info> Executing
/etc/init.d/service_one stop

As you can see, there is no indication as to why service_one is being
stopped. The last two events in the above log should not have occurred.
Has anyone else ever had this sort of issue? I'm not sure if this is a
bug or a config problem.

Thanks,
Sam



From lhh at redhat.com  Mon Apr 21 17:29:50 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 21 Apr 2008 13:29:50 -0400
Subject: [Linux-cluster] operation configuration: start timeout
In-Reply-To: <4808A860.4070300@sap.com>
References: <4808A860.4070300@sap.com>
Message-ID: <1208798990.23820.10.camel@ayanami.boston.devel.redhat.com>

On Fri, 2008-04-18 at 15:55 +0200, Matthias Schlarb wrote:
> Hello,
> 
> how can I add a start timeout for a cluster resource in cluster.conf? 

Currently, these are present but not enforced.

A simple timeout to start is not a "clean" failure case; the stop phase
of the script would need to be called, and the stop phase would need to
complete successfully in order to allow starting the resource on a
different node in the cluster.

A timeout exceeded on "stop" would make the resource mostly
unrecoverable (requiring manual intervention).

A timeout exceeded on "status" would probably mean a failure (right?)

-- Lon



From lhh at redhat.com  Mon Apr 21 18:02:22 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 21 Apr 2008 14:02:22 -0400
Subject: [Linux-cluster] Event in one failover domain affecting another
	separate failover domain
In-Reply-To: <140D865F4BA13C4B9D3AFEFEAD1EA532057BB5EE@HDQNCEXCL1V2.mihdq.marrcorp.marriott.com>
References: <140D865F4BA13C4B9D3AFEFEAD1EA532057BB5EE@HDQNCEXCL1V2.mihdq.marrcorp.marriott.com>
Message-ID: <1208800942.23820.30.camel@ayanami.boston.devel.redhat.com>

On Mon, 2008-04-21 at 13:22 -0400, Kielek, Samuel wrote:

> The issue I have observed is that when server_c (DOMAIN_TWO) had an
> issue that led to it being fenced, the service running on server_a
> (service_one) immediately stopped and relocated to server_b (the
> recovery action is set to "relocate" for both services).

Your cluster.conf would be helpful.

Also, you can increase the log level to 'debug' which would tell you
more; see "Logging Configuration":

  http://sources.redhat.com/cluster/wiki/RGManager

...for more information.

-- Lon





From lhh at redhat.com  Mon Apr 21 18:04:21 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 21 Apr 2008 14:04:21 -0400
Subject: [Linux-cluster] Meaning of Cluster Cycle and timeout problems
	- GFS 100% cpu utilization
In-Reply-To: <3BB695F1-D2A9-4FDA-8373-9229C15838C2@gmx.de>
References: <BA222AAE-0F73-46CC-9F02-7BB78CE2AD37@gmx.de>
	<1208458496.6053.150.camel@ayanami.boston.devel.redhat.com>
	<3BB695F1-D2A9-4FDA-8373-9229C15838C2@gmx.de>
Message-ID: <1208801061.23820.31.camel@ayanami.boston.devel.redhat.com>

On Mon, 2008-04-21 at 10:53 +0200, Peter wrote:
> Hi,
> 
> Thanks for the fast response!
> 
> It looks like GFS causes 100% cpu utilization and therefore the qdiskd  
> process has no processor time.

Ouch!  That would do it.

It sounds like a bug.

> > (3) If you think qdiskd is getting CPU starved, you can adjust the
> > 'scheduler' and 'priority' values in cluster.conf to something
> > different.  I think the man page might be wrong; I think the highest
> > 'priority' value for the 'rr' scheduler is 99, not 100.  See the
> > qdisk(5) man page for more information on those.

^^ You can still set qdiskd's priority to 99 if you want ;)

-- Lon



From Samuel.Kielek at marriott.com  Mon Apr 21 18:45:21 2008
From: Samuel.Kielek at marriott.com (Kielek, Samuel)
Date: Mon, 21 Apr 2008 14:45:21 -0400
Subject: [Linux-cluster] Event in one failover domain affecting
	anotherseparate failover domain
In-Reply-To: <1208800942.23820.30.camel@ayanami.boston.devel.redhat.com>
References: <140D865F4BA13C4B9D3AFEFEAD1EA532057BB5EE@HDQNCEXCL1V2.mihdq.marrcorp.marriott.com>
	<1208800942.23820.30.camel@ayanami.boston.devel.redhat.com>
Message-ID: <140D865F4BA13C4B9D3AFEFEAD1EA532057BB685@HDQNCEXCL1V2.mihdq.marrcorp.marriott.com>

Ok, I've set the log level to debug so hopefully next time this happens
I can get more info. Of course this is a production cluster so there is
only so much I can do in terms of testing.. Here is the cluster.conf
(sanitized but otherwise accurate):

<?xml version="1.0"?>
<cluster alias="cluster_a" config_version="2" name="cluster_a">
        <quorumd device="/dev/mapper/mpath5p1" interval="3" tko="23"
votes="3"/>
        <cman deadnode_timeout="135" expected_votes="6">
                <multicast addr="239.0.0.10"/>
        </cman>
        <fence_daemon post_fail_delay="0" post_join_delay="30"/>
        <clusternodes>
                <clusternode name="server_a" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="server_a-ilo"/>
                                </method>
                        </fence>
                        <multicast addr="239.0.0.10" interface="bond0"/>
                </clusternode>
                <clusternode name="server_b" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="server_b-ilo"/>
                                </method>
                        </fence>
                        <multicast addr="239.0.0.10" interface="bond0"/>
                </clusternode>
                <clusternode name="server_c" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="server_c-ilo"/>
                                </method>
                        </fence>
                        <multicast addr="239.0.0.10" interface="bond0"/>
                </clusternode>
        </clusternodes>
        <fencedevices>
                <fencedevice agent="fence_ilo" hostname="server_a-ilo"
login="clu_user" name="server_a-ilo" passwd="..removed.."/>
                <fencedevice agent="fence_ilo" hostname="server_b-ilo"
login="clu_user" name="server_b-ilo" passwd="..removed.."/>
                <fencedevice agent="fence_ilo" hostname="server_c-ilo"
login="clu_user" name="server_c-ilo" passwd="..removed.."/>
        </fencedevices>
        <rm log_level="7">
                <failoverdomains>
                        <failoverdomain name="DOMAIN_ONE" ordered="1"
restricted="1">
                                <failoverdomainnode name="server_a"
priority="1"/>
                                <failoverdomainnode name="server_b"
priority="2"/>
                        </failoverdomain>
                        <failoverdomain name="DOMAIN_TWO" ordered="1"
restricted="1">
                                <failoverdomainnode name="server_c"
priority="1"/>
                                <failoverdomainnode name="server_b"
priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                </resources>
                <service autostart="1" domain="DOMAIN_ONE"
name="service_one" recovery="relocate">
                        <script file="/etc/init.d/service_one"
name="script-service_one"/>
                        <lvm lv_name="lvapp1" name="app1-lvm"
vg_name="cluvg-app1"/>
                        <ip address="xxx.xxx.xxx.100" monitor_link="1"/>
                        <fs device="/dev/cluvg-app1/lvapp1"
force_fsck="1" force_unmount="1" fsid="64050" fstype="ext3"
mountpoint="/app1" name="app1-fs" options="" self_fence="0"/>
                </service>
                <service autostart="1" domain="DOMAIN_TWO"
name="service_two" recovery="relocate">
                        <script file="/etc/init.d/service_two"
name="script-service_two"/>
                        <lvm lv_name="lvapp2" name="app2-lvm"
vg_name="cluvg-app2"/>
                        <lvm lv_name="lvapp2_data" name="app2-data-lvm"
vg_name="cluvg-app2-data"/>
                        <ip address="xxx.xxx.xxx.200" monitor_link="1"/>
                        <fs device="/dev/cluvg-app2/lvapp2"
force_fsck="1" force_unmount="1" fsid="45751" fstype="ext3"
mountpoint="/app2" name="app2-fs" options="" self_fence="0"/>
                        <fs device="/dev/cluvg-app2-data/lvapp2_data"
force_fsck="1" force_unmount="1" fsid="985" fstype="ext3"
mountpoint="/app2/data" name="app2-data-fs" options="" self_fence="0"/>
                </service>
        </rm>
</cluster>

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Lon Hohberger
Sent: Monday, April 21, 2008 2:02 PM
To: linux clustering
Subject: Re: [Linux-cluster] Event in one failover domain affecting
anotherseparate failover domain

On Mon, 2008-04-21 at 13:22 -0400, Kielek, Samuel wrote:

> The issue I have observed is that when server_c (DOMAIN_TWO) had an
> issue that led to it being fenced, the service running on server_a
> (service_one) immediately stopped and relocated to server_b (the
> recovery action is set to "relocate" for both services).

Your cluster.conf would be helpful.

Also, you can increase the log level to 'debug' which would tell you
more; see "Logging Configuration":

  http://sources.redhat.com/cluster/wiki/RGManager

...for more information.

-- Lon



--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From fdinitto at redhat.com  Mon Apr 21 20:06:56 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Mon, 21 Apr 2008 22:06:56 +0200 (CEST)
Subject: [Linux-cluster] Cluster 2.03.01 Released
Message-ID: <Pine.LNX.4.64.0804212144420.28030@trider-g7>


The cluster team and its vibrant community are proud to announce the 3rd 
release from the STABLE2 branch: 2.03.01.

The STABLE2 branch collects daily all bug fixes and the bare minimal 
changes required to run the cluster on top of the most recent Linux kernel 
(2.6.25) and rock solid openais (0.80.3 or higher).

The new source tarball can be downloaded here:

   ftp://sources.redhat.com/pub/cluster/releases/cluster-2.03.01.tar.gz

In order to use GFS1, the Linux kernel requires a minimal patch:

   ftp://sources.redhat.com/pub/cluster/releases/lockproto-exports.patch

To report bugs or issues:

   https://bugzilla.redhat.com/

Happy clustering,
Fabio

Under the hood (from 2.03.00):

Abhijith Das (3):
       gfs2_tool: Fix build warnings in misc.c bz 441636
       gfs2_tool manpage: Updates to the manpage for bz441636
       gfs2_tool: Fix build warnings in misc.c bz 441636

Andrew Price (2):
       [GFS2] gfs2_fsck: Fix operation on 'ptr' may be undefined warnings
       [GFS2] gfs2_edit: Remove duplicate linux_endian.h

Bob Peterson (4):
       bz440896/440897 GFS: gfs_fsck should repair gfs_grow corruption
       bz425421: gfs mount attempt hangs if no more journals available
       bz438762: gfs_tool: Cannot allocate memory
       bz295301: Need man page for gfs_edit

Christine Caulfield (4):
       [CMAN] Save the new expected_votes when a node is removed
       [MISC] Make it build with gcc 4.3
       [FENCE] Make it build with gcc 4.3
       [CMAN] Disallow a new dirty node from joining the cman cluster

David Teigland (2):
       gfs: don't cancel glocks when writing to hidden file
       gfs_controld: retry recovery for withdrawn journal

Fabio M. Di Nitto (6):
       [GFS2] Fix build warning
       [BUILD] Fix typo
       Revert "gfs2_tool: Fix build warnings in misc.c bz 441636"
       [RGMANAGER] Fix build with gcc4.3
       [GFS2] Fix build warning
       [KERNEL] Update modules to build with 2.6.25

Lon Hohberger (2):
       [fence] Fix #441737 - fence_node broken due to internal API change
       Revert "[fence] Fix #441737 - fence_node broken due to internal API change"

jparsons (1):
       Bump MAX_DEVICES in fenced from 4 to 8

  ccs/daemon/cnx_mgr.c                 |    1 +
  cman/daemon/cmanccs.c                |    3 +-
  cman/daemon/commands.c               |   12 +++--
  cman/lib/libcman.c                   |    1 +
  configure                            |    4 +-
  dlm/tool/main.c                      |    1 +
  fence/fence_tool/fence_tool.c        |    1 +
  fence/fenced/agent.c                 |    2 +-
  gfs-kernel/src/gfs/glock.c           |    3 +-
  gfs-kernel/src/gfs/glock.h           |    1 +
  gfs-kernel/src/gfs/ioctl.c           |    3 +-
  gfs-kernel/src/gfs/ops_fstype.c      |    1 +
  gfs-kernel/src/gfs/sys.c             |   25 +++++------
  gfs/gfs_fsck/rgrp.c                  |   10 ++++
  gfs/gfs_fsck/super.c                 |   40 +++++++++++++++++
  gfs/gfs_tool/df.c                    |    2 +-
  gfs/gfs_tool/tune.c                  |    2 +-
  gfs/man/Makefile                     |    1 +
  gfs2/edit/linux_endian.h             |   81 ----------------------------------
  gfs2/fsck/fs_recovery.c              |    9 +++-
  gfs2/man/gfs2_tool.8                 |   10 +---
  gfs2/tool/iflags.h                   |   53 ++++++++++++++++++++++
  gfs2/tool/main.c                     |    2 +-
  gfs2/tool/misc.c                     |   18 +++-----
  gnbd-kernel/src/gnbd.c               |   18 +++-----
  gnbd/tools/gnbd_export/gnbd_export.c |    1 +
  gnbd/tools/gnbd_import/gnbd_import.c |    1 +
  group/gfs_controld/lock_dlm.h        |    1 +
  group/gfs_controld/recover.c         |   19 ++++++++
  make/clean.mk                        |    2 +-
  rgmanager/ChangeLog                  |    3 +
  rgmanager/src/utils/clustat.c        |    2 +-
  32 files changed, 190 insertions(+), 143 deletions(-)



From rotsen at gmail.com  Mon Apr 21 20:37:57 2008
From: rotsen at gmail.com (=?ISO-8859-1?Q?N=E9stor?=)
Date: Mon, 21 Apr 2008 13:37:57 -0700
Subject: [Linux-cluster] Cluster 2.03.01 Released
In-Reply-To: <Pine.LNX.4.64.0804212144420.28030@trider-g7>
References: <Pine.LNX.4.64.0804212144420.28030@trider-g7>
Message-ID: <a304e2d60804211337u160d8d99y33820f96ebffc5d7@mail.gmail.com>

What is the difference between this cluster and the red hat cluster suite
that you
have to pay for?

Thanks,

Nestor :-)

On Mon, Apr 21, 2008 at 1:06 PM, Fabio M. Di Nitto <fdinitto at redhat.com>
wrote:

>
> The cluster team and its vibrant community are proud to announce the 3rd
> release from the STABLE2 branch: 2.03.01.
>
> The STABLE2 branch collects daily all bug fixes and the bare minimal
> changes required to run the cluster on top of the most recent Linux kernel
> (2.6.25) and rock solid openais (0.80.3 or higher).
>
> The new source tarball can be downloaded here:
>
>  ftp://sources.redhat.com/pub/cluster/releases/cluster-2.03.01.tar.gz
>
> In order to use GFS1, the Linux kernel requires a minimal patch:
>
>  ftp://sources.redhat.com/pub/cluster/releases/lockproto-exports.patch
>
> To report bugs or issues:
>
>  https://bugzilla.redhat.com/
>
> Happy clustering,
> Fabio
>
> Under the hood (from 2.03.00):
>
> Abhijith Das (3):
>      gfs2_tool: Fix build warnings in misc.c bz 441636
>      gfs2_tool manpage: Updates to the manpage for bz441636
>      gfs2_tool: Fix build warnings in misc.c bz 441636
>
> Andrew Price (2):
>      [GFS2] gfs2_fsck: Fix operation on 'ptr' may be undefined warnings
>      [GFS2] gfs2_edit: Remove duplicate linux_endian.h
>
> Bob Peterson (4):
>      bz440896/440897 GFS: gfs_fsck should repair gfs_grow corruption
>      bz425421: gfs mount attempt hangs if no more journals available
>      bz438762: gfs_tool: Cannot allocate memory
>      bz295301: Need man page for gfs_edit
>
> Christine Caulfield (4):
>      [CMAN] Save the new expected_votes when a node is removed
>      [MISC] Make it build with gcc 4.3
>      [FENCE] Make it build with gcc 4.3
>      [CMAN] Disallow a new dirty node from joining the cman cluster
>
> David Teigland (2):
>      gfs: don't cancel glocks when writing to hidden file
>      gfs_controld: retry recovery for withdrawn journal
>
> Fabio M. Di Nitto (6):
>      [GFS2] Fix build warning
>      [BUILD] Fix typo
>      Revert "gfs2_tool: Fix build warnings in misc.c bz 441636"
>      [RGMANAGER] Fix build with gcc4.3
>      [GFS2] Fix build warning
>      [KERNEL] Update modules to build with 2.6.25
>
> Lon Hohberger (2):
>      [fence] Fix #441737 - fence_node broken due to internal API change
>      Revert "[fence] Fix #441737 - fence_node broken due to internal API
> change"
>
> jparsons (1):
>      Bump MAX_DEVICES in fenced from 4 to 8
>
>  ccs/daemon/cnx_mgr.c                 |    1 +
>  cman/daemon/cmanccs.c                |    3 +-
>  cman/daemon/commands.c               |   12 +++--
>  cman/lib/libcman.c                   |    1 +
>  configure                            |    4 +-
>  dlm/tool/main.c                      |    1 +
>  fence/fence_tool/fence_tool.c        |    1 +
>  fence/fenced/agent.c                 |    2 +-
>  gfs-kernel/src/gfs/glock.c           |    3 +-
>  gfs-kernel/src/gfs/glock.h           |    1 +
>  gfs-kernel/src/gfs/ioctl.c           |    3 +-
>  gfs-kernel/src/gfs/ops_fstype.c      |    1 +
>  gfs-kernel/src/gfs/sys.c             |   25 +++++------
>  gfs/gfs_fsck/rgrp.c                  |   10 ++++
>  gfs/gfs_fsck/super.c                 |   40 +++++++++++++++++
>  gfs/gfs_tool/df.c                    |    2 +-
>  gfs/gfs_tool/tune.c                  |    2 +-
>  gfs/man/Makefile                     |    1 +
>  gfs2/edit/linux_endian.h             |   81
> ----------------------------------
>  gfs2/fsck/fs_recovery.c              |    9 +++-
>  gfs2/man/gfs2_tool.8                 |   10 +---
>  gfs2/tool/iflags.h                   |   53 ++++++++++++++++++++++
>  gfs2/tool/main.c                     |    2 +-
>  gfs2/tool/misc.c                     |   18 +++-----
>  gnbd-kernel/src/gnbd.c               |   18 +++-----
>  gnbd/tools/gnbd_export/gnbd_export.c |    1 +
>  gnbd/tools/gnbd_import/gnbd_import.c |    1 +
>  group/gfs_controld/lock_dlm.h        |    1 +
>  group/gfs_controld/recover.c         |   19 ++++++++
>  make/clean.mk                        |    2 +-
>  rgmanager/ChangeLog                  |    3 +
>  rgmanager/src/utils/clustat.c        |    2 +-
>  32 files changed, 190 insertions(+), 143 deletions(-)
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080421/22d0e06a/attachment.htm>

From christopher.barry at qlogic.com  Mon Apr 21 21:07:44 2008
From: christopher.barry at qlogic.com (Christopher Barry)
Date: Mon, 21 Apr 2008 16:07:44 -0500
Subject: [Linux-cluster] qdiskd restart question
Message-ID: <D158540CCC0AB54C8FD4818F823CCB2476DECA@EPEXCH1.qlogic.org>

Greets,

I've modified some of my heuristics scripts, and want to start using
them. Can I restart qdiskd on a running cluster without causing any
issues?

Thanks,
-C



From Harri.Paivaniemi at tietoenator.com  Tue Apr 22 03:00:57 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Tue, 22 Apr 2008 06:00:57 +0300
Subject: [Linux-cluster] qdiskd restart question
References: <D158540CCC0AB54C8FD4818F823CCB2476DECA@EPEXCH1.qlogic.org>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E2F@dino.eu.tieto.com>

Yes you can,

but in that case the node will loose your qdisk votes. Make just sure you have still enough votes to be quorate.

Qdiskd tells other nodes it goes down nicely, so nothing bad happens.

-hjp



-----Original Message-----
From: linux-cluster-bounces at redhat.com on behalf of Christopher Barry
Sent: Tue 4/22/2008 0:07
To: linux-cluster at redhat.com
Subject: [Linux-cluster] qdiskd restart question
 
Greets,

I've modified some of my heuristics scripts, and want to start using
them. Can I restart qdiskd on a running cluster without causing any
issues?

Thanks,
-C

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 2860 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080422/e5cd9bd8/attachment.bin>

From maciej.bogucki at artegence.com  Tue Apr 22 05:39:33 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Tue, 22 Apr 2008 07:39:33 +0200
Subject: [Linux-cluster] GNBD speed
In-Reply-To: <fui6l2$h0i$1@ger.gmane.org>
References: <fui6l2$h0i$1@ger.gmane.org>
Message-ID: <480D7A15.3080401@artegence.com>

Chris Picton napisa?(a):
> Hi All
> 
> I am testing the following scenario:
> 
> A DRBD mirror between two servers, which heartbeat failover the drbd 
> primary, gnbd export and ip address.
> 
> I am trying to find potential bottlenecks, and have done the following 
> tests.
> 
> Network speed between the DRBD servers (A and B)
> ---------------------------------------------------------
> (A) dd if=/dev/zero bs=1G count=1 | nc 10.100.1.2 5001 
> (B) nc k -l 5001 | dd of=/dev/null
> (A) reports:
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 7.6384 seconds, 141 MB/s
> 
> DRBD sync speed:
> ----------------------------------------------------------
> dd if=/dev/zero bs=1G count=1 of=/dev/drbd0 oflag=sync
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.7832 seconds, 99.6 MB/s
> 
> Network speed between GNBD export (A) and import (C)
> -----------------------------------------------------------
> (C) dd if=/dev/zero bs=1G count=1 | nc nfs1 5001
> (A) nc -k -l 5001 | dd of=/dev/null
> (C) reports:
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.4001 seconds, 103 MB/s
> 
> Network speed between GNBD import (C) and export (A)
> -----------------------------------------------------------
> (A) dd if=/dev/zero bs=1G count=1 | nc 10.200.3.10 5001
> (C) nc -k -l 5001 | dd of=/dev/null
> (A) reports:
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.4001 seconds, 93 MB/s
> 
> 
> So I have established that writing to drbd directly is fast, and network 
> speed is fast
> 
> However, using gnbd as follows:
> on the drbd server:
> gnbd_serv -n
> /sbin/gnbd_export -c -e r0 -d /dev/drbd0
> 
> On the client:
> gnbd_import -i 10.200.3.3
> 
> I try the speed tests over the gnbd devices:
> 
> Reading from GNBD:
> ------------------------------------------------------------
> dd if=/dev/gnbd0 of=/dev/null bs=1G count=1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 17.0842 seconds, 62.8 MB/s
> 
> Writing to GNBD (no sync flag)
> ------------------------------------------------------------
> dd if=/dev/zero of=/dev/gnbd0 bs=1G count=1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 54.4142 seconds, 19.7 MB/s
> 
> Writing to GNBD (sync flag)
> ------------------------------------------------------------
> dd if=/dev/zero of=/dev/gnbd0 bs=1G count=1 oflag=sync
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 53.3085 seconds, 20.1 MB/s
> 
> 
> 
> I am almost happy with the 62 Mb/s read speed, but the 20 MB/sec write 
> speed seems a bit low, compared to the write rate to drbd, and the 
> network speed.
> 
> Can anyone give any hints for optimising the gnbd write speed (and read 
> speed)
> 
> 

Hello,

I have done some tests with GNBD and 1Gbit network in the past, and
results are similar. You can try to use iSCSI instead of GNBD. From my
tests I can tell that for iSCSI write speed is faster than GNBD, but
read is slower than GNBD.

[root at blade02-1 ~]# bonnie++ -d /mnt -s 4g -u root
Version  1.03       ------Sequential Output------ --Sequential
Input---Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block--
--Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP
/sec %CP
blade02-1        4G 19724  50 20362  11 13391   7 31688  70 79717  16
360.0 0


Best Regards
Maciej Bogucki



From martin.fuerstenau at oce.com  Tue Apr 22 08:00:13 2008
From: martin.fuerstenau at oce.com (Martin Fuerstenau)
Date: Tue, 22 Apr 2008 10:00:13 +0200
Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error -16
In-Reply-To: <1208790190.3205.12.camel@norbert>
References: <419691.64123.qm@web56910.mail.re3.yahoo.com>
	<1208790190.3205.12.camel@norbert>
Message-ID: <1208851213.19274.8.camel@lx002140.ops.de>

Unfortunately that doesn't work either.

Martin

On Mon, 2008-04-21 at 11:03 -0400, fred conner wrote:

> try running the command 
> fence_tool join
> on the node you are getting the error
> then mount the filesystem
> 
> On Mon, 2008-04-21 at 06:48 -0700, Leo Pleiman wrote:
> > I believe you will find that the nodes haven't properly joined the
> > fence domain. Try the following commands:
> > 
> > ccs_tool lsfence
> > ccs_tool lsnode
> > cman_tool status
> > cman_tool services
> > 
> > ...also check /var/log/messages
> >  
> > --Leo
> > 
> > 
> > ----- Original Message ----
> > From: Martin Fuerstenau <martin.fuerstenau at oce.com>
> > To: linux-cluster at redhat.com
> > Sent: Monday, April 21, 2008 9:39:42 AM
> > Subject: [Linux-cluster] Problem lock_dlm_join gfs_controld join error
> > -16
> > 
> > Hi there,
> > 
> > I am running a 2 node cluster with Centos 5. After a clusterswitch
> > last weekend I am not able to mount one of the filesystems on both
> > nodes. Read a lot last hours in the internet but unfortunately I found
> > neither a solution nor a hint where the error is.
> > 
> > When I try 
> > 
> > mount -t gfs /dev/VolGroup01/ClusterVol01 /mnt/tmp
> > 
> > 
> > I get 
> > 
> > /sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -16
> > /sbin/mount.gfs: error mounting lockproto lock_dlm
> > 
> > Has anybody an idea where to look for or what to do?
> > 
> > Thanx, Martin 
> > Visit Oce at drupa! Register online now:
> > 
> > This message and attachment(s) are intended solely for use by the
> > addressee and may contain information that is privileged, confidential
> > or otherwise exempt from disclosure under applicable law. If you are
> > not the intended recipient or agent thereof responsible for delivering
> > this message to the intended recipient, you are hereby notified that
> > any dissemination, distribution or copying of this communication is
> > strictly prohibited. If you have received this communication in error,
> > please notify the sender immediately by telephone and with a 'reply'
> > message. Thank you for your co-operation.
> > 
> > 
> > 
> > 
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster


Visit Oce at drupa! Register online now: <http://drupa.oce.com>

This message and attachment(s) are intended solely for use by the addressee and may contain information that is privileged, confidential or otherwise exempt from disclosure under applicable law.

If you are not the intended recipient or agent thereof responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited.

If you have received this communication in error, please notify the sender immediately by telephone and with a 'reply' message.

Thank you for your co-operation.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080422/e6a1f7d7/attachment.htm>

From federico.simoncelli at gmail.com  Tue Apr 22 11:25:24 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Tue, 22 Apr 2008 13:25:24 +0200
Subject: [Linux-cluster] Monitoring the fencing device
Message-ID: <a01fe36d0804220425j6fe81e05t1942af80d6199b9b@mail.gmail.com>

Hi all. My two-nodes cluster is up and running with a wti ips-800
fencing device.
Is there any way I can monitor the network connection to that device?
What if the ips gets disconnected?
I might hardly notice it... except when fencing will be needed and it
won't work.

Ref:
  http://sources.redhat.com/cluster/wiki/FAQ/Fencing#fence_levels

I know I can use multiple fencing devices but I'm trying to keep the
solution as cheap as possible.
I am thinking to write a cron job (hourly?) to check the ips status
(fence_wti -T ...) and notify by mail if unreachable.
Any other idea?

Thanks.
-- 
Federico.



From mgrac at redhat.com  Tue Apr 22 15:09:24 2008
From: mgrac at redhat.com (Marek 'marx' Grac)
Date: Tue, 22 Apr 2008 17:09:24 +0200
Subject: [Linux-cluster] Fencing Driver API Requirements
In-Reply-To: <1208449025.31915.55.camel@localhost.lifesci.dundee.ac.uk>
References: <alpine.LRH.1.10.0804141308350.4486@skynet.shatteredsilicon.net>	<4803A6C4.20706@redhat.com>
	<1208449025.31915.55.camel@localhost.lifesci.dundee.ac.uk>
Message-ID: <480DFFA4.5020300@redhat.com>

Hi,

Jonathan Buzzard wrote:
> On Mon, 2008-04-14 at 20:47 +0200, Marek 'marx' Grac wrote:
>   
> The issue is that with such a critical component of a cluster (if the
> fencing is not right bad things will happen) that in order to write a
> new fencing agent one has to start reverse engineering from source to
> work out what you need to do.
>   
Those new agents with python module are available only in developer 
branch are not a part of any distribution yet. There will be a 
documentation soon. Supported fencing agents has their man pages are 
there is description of how they work as they can use both getopt and 
stdin arguments. These options does not have to have anything common, as 
they are taken from the cluster.conf. Unfortunately some of the existing 
fencing agents use different options, so there are no standard options 
[there is an attempt to have them in new fencing agents].

> This is incredibly bad practice, and is bound to lead to improperly
> implemented fencing agents that then lead to bad things happening on
> clusters with these fencing agents.
>
>   
I agree.

> There a loads of potential fencing devices out there that could be
> supported, that are currently not. From my perspective trying to
> implement a fencing agent for Alert On Lan 2, it was easier to reverse
> engineer the magic packets of death using tcpdump and IDA pro as well as
> implementing a C based Linux command tool to generate them, than it has
> been to write a functioning fencing agent.
>
> It would take a couple of hours tops for someone to write a spec for
> what a fencing agent needs to do.
>   



-- 
Marek Grac
Red Hat Czech s.r.o.



From Harri.Paivaniemi at tietoenator.com  Tue Apr 22 15:45:00 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Tue, 22 Apr 2008 18:45:00 +0300
Subject: [Linux-cluster] Fencing Driver API Requirements
References: <alpine.LRH.1.10.0804141308350.4486@skynet.shatteredsilicon.net>	<4803A6C4.20706@redhat.com>
	<1208449025.31915.55.camel@localhost.lifesci.dundee.ac.uk>
	<480DFFA4.5020300@redhat.com>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E37@dino.eu.tieto.com>

I agree also,

but my problem is much more basic: to my mind this whole cluster is so badly documented, that it's
really hard to believe we have talked for years about how linux can be business-critical platform...

>From a normal human being like myself it has taken incredible reverse-engineering just to find all pieces
of information, one piece here and one there and nothing from RH, to just understand how cluster works. 

Versions go on, things change and information just gets old just when I understand it.

Just an example: When I first used qdisk I leared that I have to tune deadnode_timeout. When moved to ver5
 /proc/cluster got lost... so had to figure out.... ahaa its totem token now... RH support didn' know
this. This kind of frustrating things happen to me all the time.

Information is splitted to man- pages, wiki, faq's, poor RH- manuals, different txt- files from the
deepnes of internet. I have had to use all my poor genetic power to trie to create theorys about this
cluster as an administrator.

-hjp


-----Original Message-----
From: linux-cluster-bounces at redhat.com on behalf of Marek 'marx' Grac
Sent: Tue 4/22/2008 18:09
To: linux clustering
Subject: Re: [Linux-cluster] Fencing Driver API Requirements
 
Hi,

Jonathan Buzzard wrote:
> On Mon, 2008-04-14 at 20:47 +0200, Marek 'marx' Grac wrote:
>   
> The issue is that with such a critical component of a cluster (if the
> fencing is not right bad things will happen) that in order to write a
> new fencing agent one has to start reverse engineering from source to
> work out what you need to do.
>   
Those new agents with python module are available only in developer 
branch are not a part of any distribution yet. There will be a 
documentation soon. Supported fencing agents has their man pages are 
there is description of how they work as they can use both getopt and 
stdin arguments. These options does not have to have anything common, as 
they are taken from the cluster.conf. Unfortunately some of the existing 
fencing agents use different options, so there are no standard options 
[there is an attempt to have them in new fencing agents].

> This is incredibly bad practice, and is bound to lead to improperly
> implemented fencing agents that then lead to bad things happening on
> clusters with these fencing agents.
>
>   
I agree.

> There a loads of potential fencing devices out there that could be
> supported, that are currently not. From my perspective trying to
> implement a fencing agent for Alert On Lan 2, it was easier to reverse
> engineer the magic packets of death using tcpdump and IDA pro as well as
> implementing a C based Linux command tool to generate them, than it has
> been to write a functioning fencing agent.
>
> It would take a couple of hours tops for someone to write a spec for
> what a fencing agent needs to do.
>   



-- 
Marek Grac
Red Hat Czech s.r.o.

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 4448 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080422/9595b981/attachment.bin>

From sdake at redhat.com  Tue Apr 22 16:22:28 2008
From: sdake at redhat.com (Steven Dake)
Date: Tue, 22 Apr 2008 09:22:28 -0700
Subject: [Linux-cluster] Fencing Driver API Requirements
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E37@dino.eu.tieto.com>
References: <alpine.LRH.1.10.0804141308350.4486@skynet.shatteredsilicon.net>
	<4803A6C4.20706@redhat.com>
	<1208449025.31915.55.camel@localhost.lifesci.dundee.ac.uk>
	<480DFFA4.5020300@redhat.com>
	<36F4E74FA8263744A6B016E6A461EFF603317E37@dino.eu.tieto.com>
Message-ID: <1208881348.3594.23.camel@balance>

On Tue, 2008-04-22 at 18:45 +0300, Harri.Paivaniemi at tietoenator.com
wrote:
> I agree also,
> 
> but my problem is much more basic: to my mind this whole cluster is so badly documented, that it's
> really hard to believe we have talked for years about how linux can be business-critical platform...
> 
> >From a normal human being like myself it has taken incredible reverse-engineering just to find all pieces
> of information, one piece here and one there and nothing from RH, to just understand how cluster works. 
> 
> Versions go on, things change and information just gets old just when I understand it.
> 
> Just an example: When I first used qdisk I leared that I have to tune deadnode_timeout. When moved to ver5
>  /proc/cluster got lost... so had to figure out.... ahaa its totem token now... RH support didn' know
> this. This kind of frustrating things happen to me all the time.
> 
> Information is splitted to man- pages, wiki, faq's, poor RH- manuals, different txt- files from the
> deepnes of internet. I have had to use all my poor genetic power to trie to create theorys about this
> cluster as an administrator.
> 
> -hjp
> 
> 

Harri,

Your complaints are valid and we are aware of them within the various
projects that make up the community cluster stack.  We are working
towards improving the documentation we produce as open source projects
and our feeding of that documentation to commercial distribution vendor
products like RHEL5.

On a positive note, the various open source communities don't plan to
make any significant user-interface-specific changes to any of the
cluster stack anytime soon or for a very very long time.  We have
learned through experience this is very painful on our open source
users, distribution vendors, various third party support, etc (the folks
that add value to the software the various open source communities
produce).  We have made changes to our infrastructure from previous
versions of the cluster stack to the latest versions for various reasons
1) reliability 2) remove all bits from kernel that are unnecessary 3)
downstream adoption by third parties.  I know as a user these things may
not be critical to "getting the thing to just work" but over time there
is significant value in having _more_ people working, supporting,
distributing the code base then less.

I'd ask that folks be patient with the communities.  We are coordinating
and working together for the first time since clusters were started on
Linux, have widespread distribution, good adoption, and in general our
development pace is accelerating, our user view is maturing, and our
third party support from various distributions is improving.  All of
these things lead to downstream distributions with a better product,
containing better documentation and support, then was ever available in
the past.

regards
-steve

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com on behalf of Marek 'marx' Grac
> Sent: Tue 4/22/2008 18:09
> To: linux clustering
> Subject: Re: [Linux-cluster] Fencing Driver API Requirements
>  
> Hi,
> 
> Jonathan Buzzard wrote:
> > On Mon, 2008-04-14 at 20:47 +0200, Marek 'marx' Grac wrote:
> >   
> > The issue is that with such a critical component of a cluster (if the
> > fencing is not right bad things will happen) that in order to write a
> > new fencing agent one has to start reverse engineering from source to
> > work out what you need to do.
> >   
> Those new agents with python module are available only in developer 
> branch are not a part of any distribution yet. There will be a 
> documentation soon. Supported fencing agents has their man pages are 
> there is description of how they work as they can use both getopt and 
> stdin arguments. These options does not have to have anything common, as 
> they are taken from the cluster.conf. Unfortunately some of the existing 
> fencing agents use different options, so there are no standard options 
> [there is an attempt to have them in new fencing agents].
> 
> > This is incredibly bad practice, and is bound to lead to improperly
> > implemented fencing agents that then lead to bad things happening on
> > clusters with these fencing agents.
> >
> >   
> I agree.
> 
> > There a loads of potential fencing devices out there that could be
> > supported, that are currently not. From my perspective trying to
> > implement a fencing agent for Alert On Lan 2, it was easier to reverse
> > engineer the magic packets of death using tcpdump and IDA pro as well as
> > implementing a C based Linux command tool to generate them, than it has
> > been to write a functioning fencing agent.
> >
> > It would take a couple of hours tops for someone to write a spec for
> > what a fencing agent needs to do.
> >   
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From lhh at redhat.com  Tue Apr 22 16:27:42 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 22 Apr 2008 12:27:42 -0400
Subject: [Linux-cluster] qdiskd restart question
In-Reply-To: <D158540CCC0AB54C8FD4818F823CCB2476DECA@EPEXCH1.qlogic.org>
References: <D158540CCC0AB54C8FD4818F823CCB2476DECA@EPEXCH1.qlogic.org>
Message-ID: <1208881662.9657.0.camel@ayanami.boston.devel.redhat.com>

On Mon, 2008-04-21 at 16:07 -0500, Christopher Barry wrote:
> Greets,
> 
> I've modified some of my heuristics scripts, and want to start using
> them. Can I restart qdiskd on a running cluster without causing any
> issues?

Yes.

-- Lon




From lhh at redhat.com  Tue Apr 22 16:28:29 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 22 Apr 2008 12:28:29 -0400
Subject: [Linux-cluster] Monitoring the fencing device
In-Reply-To: <a01fe36d0804220425j6fe81e05t1942af80d6199b9b@mail.gmail.com>
References: <a01fe36d0804220425j6fe81e05t1942af80d6199b9b@mail.gmail.com>
Message-ID: <1208881709.9657.2.camel@ayanami.boston.devel.redhat.com>

On Tue, 2008-04-22 at 13:25 +0200, Federico Simoncelli wrote:
> Hi all. My two-nodes cluster is up and running with a wti ips-800
> fencing device.
> Is there any way I can monitor the network connection to that device?
> What if the ips gets disconnected?
> I might hardly notice it... except when fencing will be needed and it
> won't work.

Currently, this isn't done.

-- Lon




From lhh at redhat.com  Tue Apr 22 20:08:33 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 22 Apr 2008 16:08:33 -0400
Subject: [Linux-cluster] Fencing Driver API Requirements
In-Reply-To: <480DFFA4.5020300@redhat.com>
References: <alpine.LRH.1.10.0804141308350.4486@skynet.shatteredsilicon.net>
	<4803A6C4.20706@redhat.com>
	<1208449025.31915.55.camel@localhost.lifesci.dundee.ac.uk>
	<480DFFA4.5020300@redhat.com>
Message-ID: <1208894913.9657.23.camel@ayanami.boston.devel.redhat.com>

On Tue, 2008-04-22 at 17:09 +0200, Marek 'marx' Grac wrote:
> Hi,
> 
> Jonathan Buzzard wrote:
> > On Mon, 2008-04-14 at 20:47 +0200, Marek 'marx' Grac wrote:
> >   
> > The issue is that with such a critical component of a cluster (if the
> > fencing is not right bad things will happen) that in order to write a
> > new fencing agent one has to start reverse engineering from source to
> > work out what you need to do.
> >   
> Those new agents with python module are available only in developer 
> branch are not a part of any distribution yet. There will be a 
> documentation soon. Supported fencing agents has their man pages are 
> there is description of how they work as they can use both getopt and 
> stdin arguments. These options does not have to have anything common, as 
> they are taken from the cluster.conf. Unfortunately some of the existing 
> fencing agents use different options, so there are no standard options 
> [there is an attempt to have them in new fencing agents].

Here's a start:

http://sources.redhat.com/cluster/wiki/FenceAgentAPI

-- Lon




From bjlist at westnet.com.au  Wed Apr 23 03:33:19 2008
From: bjlist at westnet.com.au (Ben J)
Date: Wed, 23 Apr 2008 11:33:19 +0800
Subject: [Linux-cluster] Cluster failing after rebooting a standby node
Message-ID: <480EADFF.6030306@westnet.com.au>

Hello all,

We've been encountering an issue with RHCS4 U6 (using the U5 version of 
system-config-cluster as U6 version is broken) that results in the 
cluster failing after rebooting one of the standby nodes with CMAN 
dieing after too many transition restarts.

We have a 7 node cluster, with 5 active nodes and 2 standby nodes.  We 
are running the cluster with broadcast mode for cluster communication 
(the default for CS4), changing to multicast isn't an option at the 
moment due to us using Cisco switching infrastructure.  The hardware 
we're running the cluster on are IBM HS21 blades within 2  IBM H series 
Bladechassis (3 within one chassis, 4 in another).  Each Bladechassis 
network switch module has dual gig uplinks to a Cisco switch.

We have done a lot of analysis of our network to ensure that the problem 
is not being caused by the underlying network preventing the cluster 
nodes from talking to one another, so we have ruled this out as a cause 
of the problem.

The cluster is currently a pre-production system that we are testing 
before putting into production so the nodes are basically sitting idle 
at the moment whilst we test things (i.e. the cluster).

What we have seen happening, is that we have the cluster operational for 
several days and when initiating a reboot of one of the standby nodes 
(that isn't running any clustered services at the time), the other 
cluster nodes start filling the logs with:

Apr 14 15:44:57 server01 kernel: CMAN: Initiating transition, generation 64
Apr 14 15:45:12 server01 kernel: CMAN: Initiating transition, generation 65

With the generation number increasing until CMAN dies with:

Apr 14 15:48:24 server01 kernel: CMAN: too many transition restarts - 
will die
Apr 14 15:48:24 server01 kernel: CMAN: we are leaving the cluster. 
Inconsistent cluster view
Apr 14 15:48:24 server01 kernel: SM: 01000004 sm_stop: SG still joined
Apr 14 15:48:24 server01 kernel: SM: 03000003 sm_stop: SG still joined
Apr 14 15:48:24 server01 clurgmgrd[22461]: <warning> #67: Shutting down 
uncleanly
Apr 14 15:48:24 server01 ccsd[7135]: Cluster manager shutdown. 
 Attemping to reconnect...
Apr 14 15:48:25 server01 ccsd[7135]: Cluster is not quorate.  Refusing 
connection.
Apr 14 15:48:25 server01 ccsd[7135]: Error while processing connect: 
Connection refused
Apr 14 15:48:25 server01 ccsd[7135]: Invalid descriptor specified (-111).
Apr 14 15:48:25 server01 ccsd[7135]: Someone may be attempting something 
evil.
Apr 14 15:48:25 server01 ccsd[7135]: Error while processing get: Invalid 
request descriptor
Apr 14 15:48:25 server01 ccsd[7135]: Invalid descriptor specified (-111).
Apr 14 15:48:25 server01 ccsd[7135]: Someone may be attempting something 
evil.
Apr 14 15:48:25 server01 ccsd[7135]: Error while processing get: Invalid 
request descriptor
Apr 14 15:48:25 server01 ccsd[7135]: Invalid descriptor specified (-21).
Apr 14 15:48:25 server01 ccsd[7135]: Someone may be attempting something 
evil.
Apr 14 15:48:25 server01 ccsd[7135]: Error while processing disconnect: 
Invalid request descriptor
Apr 14 15:48:25 server01 ccsd[7135]: Cluster is not quorate.  Refusing 
connection.

The interesting thing is that immediately after rebooting all of the 
nodes within the cluster and restarting the cluster services, the 
problem cannot be replicated.  Typically the cluster system has to have 
been running for 3-4 days untouched before we can then replicate the 
problem again (i.e. I reboot one of the standby nodes and it fails again).

I made a change yesterday to cluster.conf to increase the logging 
facility and logging level (set it to debug level - 7) and after using 
ccs_tool to apply the changes to the cluster online, once again I can't 
replicate the problem (even though immediately before this I could 
replicate the problem).

Has anyone experienced anything even remotely similar to this (I 
couldn't see anything similar reported in the list archives) and/or have 
any suggestions as to what might be causing the issue?

Cheers,

Ben



From dhongqian at 163.com  Wed Apr 23 06:08:53 2008
From: dhongqian at 163.com (hongqian)
Date: Wed, 23 Apr 2008 14:08:53 +0800 (CST)
Subject: [Linux-cluster] GFS1 caused kernel panic , could anyone help ?
Message-ID: <26924759.307571208930933270.JavaMail.coremail@bj163app46.163.com>

Hi,everyone:
I use cluster-2.03.00 in CentOS-4.6-Final.  Kernel comiled by myself is 2.6.24.4  .
 
I test it with two nodes.  
 
GFS2 runs fine. But GFS1 would cause kernel panic (40 dd write). the panic log is:
 
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000040
printing eip: c0120cf1 *pdpt = 0000000036015001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP 
Modules linked in: gfs lock_dlm dlm configfs gfs2 binfmt_misc dm_mirror dm_round_robin dm_multipath thermal processor fan containerd
Pid: 4031, comm: bash Not tainted (2.6.24.4.OWLsmp #1)
EIP: 0060:[<c0120cf1>] EFLAGS: 00010046 CPU: 3
EIP is at pick_next_task_fair+0x12/0x1f
EAX: 00000000 EBX: f783e030 ECX: c039c740 EDX: 00000000
ESI: 00000003 EDI: f6da2ac0 EBP: f6da2ac0 ESP: f7119e98
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process bash (pid: 4031, ti=f7118000 task=f5f898f0 task.ti=f7118000)
Stack: c0391bb2 c202bc80 f5f89a54 7fffffff f5c53000 f7119f1c bff5b72f c03921ae 
       f60c5c00 00000011 f554f000 c0239c5b 00000000 f73f7c00 f5c53000 00000246 
       c013917e f5c5300c f5c5300c f5c53000 c0238a7b f554f011 7fffffff 00000000 
Call Trace:
 [<c0391bb2>] __sched_text_start+0x11a/0x350
 [<c03921ae>] schedule_timeout+0x13/0x8d
 [<c0239c5b>] pty_write+0x2f/0x35
 [<c013917e>] add_wait_queue+0x12/0x30
 [<c0238a7b>] read_chan+0x28e/0x496
 [<c012341a>] default_wake_function+0x0/0x8
 [<c012341a>] default_wake_function+0x0/0x8
 [<c0234ab4>] tty_read+0x75/0xac
 [<c0175a12>] vfs_read+0x89/0x12e
 [<c0175d24>] sys_read+0x41/0x67
 [<c0104e7a>] sysenter_past_esp+0x6b/0xa1
 [<c0390000>] p9_create_tauth+0x1f/0xb2
 =======================
Code: da 77 11 72 04 39 c8 73 0b 5b 89 f8 5e 5f 5d e9 74 f7 ff ff 5b 5e 5f 5d c3 83 c0 34 31 d2 83 78 08 00 74 11 e8 4a fe ff ff 89 
EIP: [<c0120cf1>] pick_next_task_fair+0x12/0x1f SS:ESP 0068:f7119e98
---[ end trace 07f7603309b1b483 ]---
 
Could anyone help ?
 
 Thanks !
 
 
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080423/a35d98d1/attachment.htm>

From dhongqian at 163.com  Wed Apr 23 05:57:42 2008
From: dhongqian at 163.com (dhongqian)
Date: Wed, 23 Apr 2008 13:57:42 +0800
Subject: [Linux-cluster] GFS1 caused kernel panic , could anyone help ?
Message-ID: <200804231357413127413@163.com>

Hi,everyone:
I use cluster-2.03.00 in CentOS-4.6-Final.  Kernel comiled by myself is 2.6.24.4  .

I test it with two nodes.  

GFS2 runs fine. But GFS1 would cause kernel panic (40 dd write). the panic log is:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000040
printing eip: c0120cf1 *pdpt = 0000000036015001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP 
Modules linked in: gfs lock_dlm dlm configfs gfs2 binfmt_misc dm_mirror dm_round_robin dm_multipath thermal processor fan containerd
Pid: 4031, comm: bash Not tainted (2.6.24.4.OWLsmp #1)
EIP: 0060:[<c0120cf1>] EFLAGS: 00010046 CPU: 3
EIP is at pick_next_task_fair+0x12/0x1f
EAX: 00000000 EBX: f783e030 ECX: c039c740 EDX: 00000000
ESI: 00000003 EDI: f6da2ac0 EBP: f6da2ac0 ESP: f7119e98
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process bash (pid: 4031, ti=f7118000 task=f5f898f0 task.ti=f7118000)
Stack: c0391bb2 c202bc80 f5f89a54 7fffffff f5c53000 f7119f1c bff5b72f c03921ae 
       f60c5c00 00000011 f554f000 c0239c5b 00000000 f73f7c00 f5c53000 00000246 
       c013917e f5c5300c f5c5300c f5c53000 c0238a7b f554f011 7fffffff 00000000 
Call Trace:
 [<c0391bb2>] __sched_text_start+0x11a/0x350
 [<c03921ae>] schedule_timeout+0x13/0x8d
 [<c0239c5b>] pty_write+0x2f/0x35
 [<c013917e>] add_wait_queue+0x12/0x30
 [<c0238a7b>] read_chan+0x28e/0x496
 [<c012341a>] default_wake_function+0x0/0x8
 [<c012341a>] default_wake_function+0x0/0x8
 [<c0234ab4>] tty_read+0x75/0xac
 [<c0175a12>] vfs_read+0x89/0x12e
 [<c0175d24>] sys_read+0x41/0x67
 [<c0104e7a>] sysenter_past_esp+0x6b/0xa1
 [<c0390000>] p9_create_tauth+0x1f/0xb2
 =======================
Code: da 77 11 72 04 39 c8 73 0b 5b 89 f8 5e 5f 5d e9 74 f7 ff ff 5b 5e 5f 5d c3 83 c0 34 31 d2 83 78 08 00 74 11 e8 4a fe ff ff 89 
EIP: [<c0120cf1>] pick_next_task_fair+0x12/0x1f SS:ESP 0068:f7119e98
---[ end trace 07f7603309b1b483 ]---

Could anyone help ?

 Thanks !

2008-04-23 



dhongqian 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080423/2f166f9b/attachment.htm>

From maciej.bogucki at artegence.com  Wed Apr 23 08:09:41 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Wed, 23 Apr 2008 10:09:41 +0200
Subject: [Linux-cluster] Monitoring the fencing device
In-Reply-To: <a01fe36d0804220425j6fe81e05t1942af80d6199b9b@mail.gmail.com>
References: <a01fe36d0804220425j6fe81e05t1942af80d6199b9b@mail.gmail.com>
Message-ID: <480EEEC5.20009@artegence.com>

Federico Simoncelli napisa?(a):
> Hi all. My two-nodes cluster is up and running with a wti ips-800
> fencing device.
> Is there any way I can monitor the network connection to that device?
> What if the ips gets disconnected?
> I might hardly notice it... except when fencing will be needed and it
> won't work.
> 
> Ref:
>   http://sources.redhat.com/cluster/wiki/FAQ/Fencing#fence_levels
> 
> I know I can use multiple fencing devices but I'm trying to keep the
> solution as cheap as possible.
> I am thinking to write a cron job (hourly?) to check the ips status
> (fence_wti -T ...) and notify by mail if unreachable.
> Any other idea?

You could use nagios [1] to monitor connection between Your cluster and
the fance device.

[1] - http://www.nagios.org/

Best Regards
Maciej Bogucki



From lhh at redhat.com  Wed Apr 23 13:43:07 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 23 Apr 2008 09:43:07 -0400
Subject: [Linux-cluster] Cluster failing after rebooting a standby node
In-Reply-To: <480EADFF.6030306@westnet.com.au>
References: <480EADFF.6030306@westnet.com.au>
Message-ID: <1208958187.9657.43.camel@ayanami.boston.devel.redhat.com>

On Wed, 2008-04-23 at 11:33 +0800, Ben J wrote:

> What we have seen happening, is that we have the cluster operational for 
> several days and when initiating a reboot of one of the standby nodes 
> (that isn't running any clustered services at the time), the other 
> cluster nodes start filling the logs with:
> 
> Apr 14 15:44:57 server01 kernel: CMAN: Initiating transition, generation 64
> Apr 14 15:45:12 server01 kernel: CMAN: Initiating transition, generation 65
> 
> With the generation number increasing until CMAN dies with:
> 
> Apr 14 15:48:24 server01 kernel: CMAN: too many transition restarts - 
> will die
> Apr 14 15:48:24 server01 kernel: CMAN: we are leaving the cluster. 
> Inconsistent cluster view

^^^^ This is the problem.

vvvv These are all caused by that problem, and will 
     go away when the above is resolved.

> Apr 14 15:48:24 server01 kernel: SM: 01000004 sm_stop: SG still joined
> Apr 14 15:48:24 server01 kernel: SM: 03000003 sm_stop: SG still joined
> Apr 14 15:48:24 server01 clurgmgrd[22461]: <warning> #67: Shutting down 
> uncleanly
> Apr 14 15:48:24 server01 ccsd[7135]: Cluster manager shutdown. 
>  Attemping to reconnect...
> <snip>...
> Apr 14 15:48:25 server01 ccsd[7135]: Error while processing disconnect: 
> Invalid request descriptor
> Apr 14 15:48:25 server01 ccsd[7135]: Cluster is not quorate.  Refusing 
> connection.


> The interesting thing is that immediately after rebooting all of the 
> nodes within the cluster and restarting the cluster services, the 
> problem cannot be replicated.  Typically the cluster system has to have 
> been running for 3-4 days untouched before we can then replicate the 
> problem again (i.e. I reboot one of the standby nodes and it fails again).
> 
> I made a change yesterday to cluster.conf to increase the logging 
> facility and logging level (set it to debug level - 7) and after using 
> ccs_tool to apply the changes to the cluster online, once again I can't 
> replicate the problem (even though immediately before this I could 
> replicate the problem).

On RHEL4, there's some ugly arcane thing you need to do after this:

  cman_tool version -r <new_config_version>

I'm not sure this is the cause of the 'too many transitions' problem you
hit.  (Unfortunately, I'm not one of the people who fully understands
what causes 'too many transitions'...)

-- Lon



From lhh at redhat.com  Wed Apr 23 13:46:05 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 23 Apr 2008 09:46:05 -0400
Subject: [Linux-cluster] Monitoring the fencing device
In-Reply-To: <a01fe36d0804220425j6fe81e05t1942af80d6199b9b@mail.gmail.com>
References: <a01fe36d0804220425j6fe81e05t1942af80d6199b9b@mail.gmail.com>
Message-ID: <1208958365.9657.47.camel@ayanami.boston.devel.redhat.com>

On Tue, 2008-04-22 at 13:25 +0200, Federico Simoncelli wrote:

> I know I can use multiple fencing devices but I'm trying to keep the
> solution as cheap as possible.

> I am thinking to write a cron job (hourly?) to check the ips status
> (fence_wti -T ...) and notify by mail if unreachable.
> Any other idea?

You could do that - and it would be quite clean.  The other advantage is
that all nodes could do it at roughly the same time.

(Note the IPS doesn't allow >1 login at a time - but you can 'ping' it.)

-- Lon



From ccaulfie at redhat.com  Wed Apr 23 13:53:52 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Wed, 23 Apr 2008 14:53:52 +0100
Subject: [Linux-cluster] Cluster failing after rebooting a standby node
In-Reply-To: <480EADFF.6030306@westnet.com.au>
References: <480EADFF.6030306@westnet.com.au>
Message-ID: <480F3F70.3010004@redhat.com>

Ben J wrote:
> Hello all,
> 
> We've been encountering an issue with RHCS4 U6 (using the U5 version of
> system-config-cluster as U6 version is broken) that results in the
> cluster failing after rebooting one of the standby nodes with CMAN
> dieing after too many transition restarts.
> 
> We have a 7 node cluster, with 5 active nodes and 2 standby nodes.  We
> are running the cluster with broadcast mode for cluster communication
> (the default for CS4), changing to multicast isn't an option at the
> moment due to us using Cisco switching infrastructure.  The hardware
> we're running the cluster on are IBM HS21 blades within 2  IBM H series
> Bladechassis (3 within one chassis, 4 in another).  Each Bladechassis
> network switch module has dual gig uplinks to a Cisco switch.
> 
> We have done a lot of analysis of our network to ensure that the problem
> is not being caused by the underlying network preventing the cluster
> nodes from talking to one another, so we have ruled this out as a cause
> of the problem.
> 
> The cluster is currently a pre-production system that we are testing
> before putting into production so the nodes are basically sitting idle
> at the moment whilst we test things (i.e. the cluster).
> 
> What we have seen happening, is that we have the cluster operational for
> several days and when initiating a reboot of one of the standby nodes
> (that isn't running any clustered services at the time), the other
> cluster nodes start filling the logs with:
> 
> Apr 14 15:44:57 server01 kernel: CMAN: Initiating transition, generation 64
> Apr 14 15:45:12 server01 kernel: CMAN: Initiating transition, generation 65
> 
> With the generation number increasing until CMAN dies with:
> 
> Apr 14 15:48:24 server01 kernel: CMAN: too many transition restarts -
> will die
> Apr 14 15:48:24 server01 kernel: CMAN: we are leaving the cluster.
> Inconsistent cluster view
> Apr 14 15:48:24 server01 kernel: SM: 01000004 sm_stop: SG still joined
> Apr 14 15:48:24 server01 kernel: SM: 03000003 sm_stop: SG still joined
> Apr 14 15:48:24 server01 clurgmgrd[22461]: <warning> #67: Shutting down
> uncleanly
> Apr 14 15:48:24 server01 ccsd[7135]: Cluster manager shutdown. Attemping
> to reconnect...
> Apr 14 15:48:25 server01 ccsd[7135]: Cluster is not quorate.  Refusing
> connection.
> Apr 14 15:48:25 server01 ccsd[7135]: Error while processing connect:
> Connection refused
> Apr 14 15:48:25 server01 ccsd[7135]: Invalid descriptor specified (-111).
> Apr 14 15:48:25 server01 ccsd[7135]: Someone may be attempting something
> evil.
> Apr 14 15:48:25 server01 ccsd[7135]: Error while processing get: Invalid
> request descriptor
> Apr 14 15:48:25 server01 ccsd[7135]: Invalid descriptor specified (-111).
> Apr 14 15:48:25 server01 ccsd[7135]: Someone may be attempting something
> evil.
> Apr 14 15:48:25 server01 ccsd[7135]: Error while processing get: Invalid
> request descriptor
> Apr 14 15:48:25 server01 ccsd[7135]: Invalid descriptor specified (-21).
> Apr 14 15:48:25 server01 ccsd[7135]: Someone may be attempting something
> evil.
> Apr 14 15:48:25 server01 ccsd[7135]: Error while processing disconnect:
> Invalid request descriptor
> Apr 14 15:48:25 server01 ccsd[7135]: Cluster is not quorate.  Refusing
> connection.
> 
> The interesting thing is that immediately after rebooting all of the
> nodes within the cluster and restarting the cluster services, the
> problem cannot be replicated.  Typically the cluster system has to have
> been running for 3-4 days untouched before we can then replicate the
> problem again (i.e. I reboot one of the standby nodes and it fails again).
> 
> I made a change yesterday to cluster.conf to increase the logging
> facility and logging level (set it to debug level - 7) and after using
> ccs_tool to apply the changes to the cluster online, once again I can't
> replicate the problem (even though immediately before this I could
> replicate the problem).
> 
> Has anyone experienced anything even remotely similar to this (I
> couldn't see anything similar reported in the list archives) and/or have
> any suggestions as to what might be causing the issue?

I have heard of similar incidents but we have never managed to pin down
just what is happening. if you can reproduce it could you send me a
tcpdump of the conversations on port 6809 when it happens please ?

You might like to set up tcpdump to do a rolling capture so it doesn't
fill up a disk while you're waiting for it to happen!

The command is:
tcpdump -C 10 -W10 -w /tmp/port6809.dmp -xs0 port 6809

Every time we have seen this, there has been a potential for networking
troubles at the site. If you are confident that your network is fully
stable then it would be really helpful to get some debugging for this
problem.

Thanks,

Chrissie



From siddiqut at gmail.com  Wed Apr 23 15:31:47 2008
From: siddiqut at gmail.com (Tajdar Siddiqui)
Date: Wed, 23 Apr 2008 11:31:47 -0400
Subject: [Linux-cluster] gfs max file size
Message-ID: <3abaa1ce0804230831g4089448cm48adfdcd9fbf31bd@mail.gmail.com>

What is the max file size for gfs ? (2 Gigs ?).

Can someone please point me to an appropriate link on this.

Googling, I found this:

http://www.cyberciti.biz/howto/question/static/maximum-partition-size.php

Thanx for your help,
Tajdar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080423/3bb6329f/attachment.htm>

From Greg.Caetano at hp.com  Wed Apr 23 15:43:16 2008
From: Greg.Caetano at hp.com (Caetano, Greg)
Date: Wed, 23 Apr 2008 15:43:16 +0000
Subject: [Linux-cluster] gfs max file size
In-Reply-To: <3abaa1ce0804230831g4089448cm48adfdcd9fbf31bd@mail.gmail.com>
References: <3abaa1ce0804230831g4089448cm48adfdcd9fbf31bd@mail.gmail.com>
Message-ID: <A9A5F016DF81824B85B00B29529E479E311593EA26@G1W0485.americas.hpqcorp.net>

you can find the limits listed at


http://www.redhat.com/rhel/compare/

GFS is listed as 16TB/8EB for RHEL5

Greg Caetano
HP ISS Linux Virtualization Solutions Engineering
greg.caetano at hp.com
Red Hat Certified Engineer
RHCE#805007310328754

________________________________
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Tajdar Siddiqui
Sent: Wednesday, April 23, 2008 10:32 AM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] gfs max file size

What is the max file size for gfs ? (2 Gigs ?).

Can someone please point me to an appropriate link on this.

Googling, I found this:

http://www.cyberciti.biz/howto/question/static/maximum-partition-size.php

Thanx for your help,
Tajdar

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080423/32a43fa0/attachment.htm>

From gordan at bobich.net  Wed Apr 23 15:49:05 2008
From: gordan at bobich.net (gordan at bobich.net)
Date: Wed, 23 Apr 2008 16:49:05 +0100 (BST)
Subject: [Linux-cluster] gfs max file size
In-Reply-To: <3abaa1ce0804230831g4089448cm48adfdcd9fbf31bd@mail.gmail.com>
References: <3abaa1ce0804230831g4089448cm48adfdcd9fbf31bd@mail.gmail.com>
Message-ID: <alpine.LRH.1.10.0804231648300.2734@skynet.shatteredsilicon.net>

Check here, in the "Limits" table:
http://en.wikipedia.org/wiki/Comparison_of_file_systems

On Wed, 23 Apr 2008, Tajdar Siddiqui wrote:

> What is the max file size for gfs ? (2 Gigs ?).
> 
> Can someone please point me to an appropriate link on this.
> 
> Googling, I found this:
> 
> http://www.cyberciti.biz/howto/question/static/maximum-partition-size.php
> 
> Thanx for your help,
> Tajdar
> 
> 
>



From Jon.Swift at pwr.utc.com  Wed Apr 23 16:09:08 2008
From: Jon.Swift at pwr.utc.com (Swift, Jon S              PWR)
Date: Wed, 23 Apr 2008 09:09:08 -0700
Subject: [Linux-cluster] gfs max file size
In-Reply-To: <alpine.LRH.1.10.0804231648300.2734@skynet.shatteredsilicon.net>
References: <3abaa1ce0804230831g4089448cm48adfdcd9fbf31bd@mail.gmail.com>
	<alpine.LRH.1.10.0804231648300.2734@skynet.shatteredsilicon.net>
Message-ID: <5A3CA8FF800F1E418CC86A0FCC71A36B06D1039D@PWR-XCH-03.pwrutc.com>

Try,
	http://sourceware.org/cluster/faq.html#gfs_maxsize

Jon 

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of gordan at bobich.net
Sent: Wednesday, April 23, 2008 8:49 AM
To: linux clustering
Subject: Re: [Linux-cluster] gfs max file size

Check here, in the "Limits" table:
http://en.wikipedia.org/wiki/Comparison_of_file_systems

On Wed, 23 Apr 2008, Tajdar Siddiqui wrote:

> What is the max file size for gfs ? (2 Gigs ?).
> 
> Can someone please point me to an appropriate link on this.
> 
> Googling, I found this:
> 
> http://www.cyberciti.biz/howto/question/static/maximum-partition-size.
> php
> 
> Thanx for your help,
> Tajdar
> 
> 
>

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From charlieb-linux-cluster at e-smith.com  Wed Apr 23 20:11:09 2008
From: charlieb-linux-cluster at e-smith.com (Charlie Brady)
Date: Wed, 23 Apr 2008 16:11:09 -0400 (EDT)
Subject: [Linux-cluster] Is heuristic really required?
In-Reply-To: <1208499994.21043.81.camel@hjpsuse.tebit>
Message-ID: <Pine.LNX.4.44.0804231607530.18321-100000@allspice.nssg.mitel.com>


On Fri, 18 Apr 2008, HarriP?iv?niemi wrote:

> Qdisk man says at least 1 heuristic is reguired.
> 
> Is it?
> 
> I have (accidentally) tested and to my mind it worked fine without
> heuristics. I gave 1 vote to quorumd and no <heuristic>- tag at all and
> it seemed to work normally like 3-vote cluster should...

It does seem to work fine. However, one thing I've found is that if I/O to 
the quorum disk blocks (e.g. if the quorum disk is on iscsi and you break 
the connection), then qdiskd doesn't detect the timeout and doesn't reboot 
the node.

Another thing I've noticed (with RHEL4 RHC) is the the quorum disk doesn't 
show up in the node list of the first node set up in the cluster, and 
doesn't contribute a vote. Anyone have any ideas about that?

--
Charlie




From Alain.Moulle at bull.net  Thu Apr 24 11:12:39 2008
From: Alain.Moulle at bull.net (Alain Moulle)
Date: Thu, 24 Apr 2008 13:12:39 +0200
Subject: [Linux-cluster] CS5 Problem
Message-ID: <48106B27.3010505@bull.net>

Hi

I 'm facing a problem :

when testing a two-nodes cluster with quorum disk, when
I poweroff the node1 , node 2 fences well the node 1 and
failovers the service, but in log of node 2 I have before and after
the fence success messages  many messages like this:
Apr 24 11:30:04 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.
Apr 24 11:30:04 s_sys at xn3 qdiskd[13740]: <alert> Writing eviction notice for node 2
Apr 24 11:30:05 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.
Apr 24 11:30:05 s_sys at xn3 qdiskd[13740]: <alert> Writing eviction notice for node 2
Apr 24 11:30:06 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.
Apr 24 11:30:06 s_sys at xn3 qdiskd[13740]: <alert> Writing eviction notice for node 2
Apr 24 11:30:07 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.
Apr 24 11:30:07 s_sys at xn3 qdiskd[13740]: <alert> Writing eviction notice for node 2
Apr 24 11:30:08 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.

The problem is that when on node1 , after the reboot I try to start
again the CS5 , cman fails with these messages in syslog :
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Copyright (C) Red Hat, Inc.  2004  All
rights reserved.
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]: cluster.conf (cluster name = A0ha2,
version = 1) found.
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]: Remote copy of cluster.conf is from
quorate node.
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Local version # : 1
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Remote version #: 1
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]: Remote copy of cluster.conf is from
quorate node.
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Local version # : 1
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Remote version #: 1
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]: Remote copy of cluster.conf is from
quorate node.
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Local version # : 1
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Remote version #: 1
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]: Remote copy of cluster.conf is from
quorate node.
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Local version # : 1
Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Remote version #: 1
Apr 24 11:47:31 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 30 seconds.
Apr 24 11:48:01 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 60 seconds.
Apr 24 11:48:31 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 90 seconds.
Apr 24 11:48:37 s_sys at xn4 ntpd[6179]: synchronized to 192.168.64.99, stratum 11
Apr 24 11:48:37 s_sys at xn4 ntpd[6179]: kernel time sync enabled 0001
Apr 24 11:49:01 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 120 seconds.
Apr 24 11:49:31 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 150 seconds.
Apr 24 11:50:01 s_sys at xn4 crond[11455]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Apr 24 11:50:01 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 180 seconds.
Apr 24 11:50:31 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 210 seconds.
Apr 24 11:51:01 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 240 seconds.
Apr 24 11:51:31 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 270 seconds.
Apr 24 11:52:01 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 300 seconds.
Apr 24 11:52:31 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 330 seconds.
Apr 24 11:53:01 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 360 seconds.
Apr 24 11:53:31 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 390 seconds.
Apr 24 11:54:01 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 420 seconds.
Apr 24 11:54:31 s_sys at xn4 ccsd[11099]: Unable to connect to cluster
infrastructure after 450 seconds ...
etc.

or also :
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Cluster is not quorate.  Refusing connection.
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Error while processing connect:
Connection refused
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Invalid descriptor specified (-111).
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Someone may be attempting something evil.
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Error while processing get: Invalid
request descriptor
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Invalid descriptor specified (-111).
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Someone may be attempting something evil.
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Error while processing get: Invalid
request descriptor
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Invalid descriptor specified (-21).
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Someone may be attempting something evil.
Apr 24 10:17:37 s_sys at xn4 ccsd[11023]: Error while processing disconnect:
Invalid request descriptor
Apr 24 10:17:37 s_sys at xn4 rgmanager: [11331]: <notice> Cluster Service Manager
is stopped.


And I can't start it again, except after stopping the CS on both nodes.

My cluster.conf qdisk record is likewise :
<quorumd label="QDISK_2_0" interval="1" tko="10" votes="1" min_score="1">
     <heuristic interval="10" tko="3" program="ping -t1 -c1 192.168.64.99"
score="1"/>
     <heuristic interval="10" program="ping -t3 -c1 192.168.64.99" score="1"/>
</quorumd>

I need urgent help if you have any ideas on the problem ?

Thanks a lot
Regards.
Alain Moull?




From lhh at redhat.com  Thu Apr 24 18:05:58 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 24 Apr 2008 14:05:58 -0400
Subject: [Linux-cluster] Is heuristic really required?
In-Reply-To: <Pine.LNX.4.44.0804231607530.18321-100000@allspice.nssg.mitel.com>
References: <Pine.LNX.4.44.0804231607530.18321-100000@allspice.nssg.mitel.com>
Message-ID: <1209060358.809.10.camel@ayanami.boston.devel.redhat.com>

On Wed, 2008-04-23 at 16:11 -0400, Charlie Brady wrote:
> On Fri, 18 Apr 2008, HarriP?iv?niemi wrote:
> 
> > Qdisk man says at least 1 heuristic is reguired.
> > 
> > Is it?
> > 
> > I have (accidentally) tested and to my mind it worked fine without
> > heuristics. I gave 1 vote to quorumd and no <heuristic>- tag at all and
> > it seemed to work normally like 3-vote cluster should...

You don't need a heuristic (it will run without one).  However, it
doesn't provide much benefit without one.  It's supposed to use
"master-wins" mode if no heuristic is provided to avoid a fence-race -
but it currently doesn't provide that guarantee.


> Another thing I've noticed (with RHEL4 RHC) is the the quorum disk doesn't 
> show up in the node list of the first node set up in the cluster, and 
> doesn't contribute a vote. Anyone have any ideas about that?

It's not really a node, and is not reported by CMAN in the nodes list as
it is in RHEL5/STABLE2 branches.

You can see the votes (assuming each node gets one vote) by looking @
cman_tool status:

  [root at red cluster]# cman_tool status
  Protocol version: 5.0.1
  Config version: 187
  Cluster name: rhel4test1
  Cluster ID: 44365
  Cluster Member: Yes
  Membership state: Cluster-Member
  Nodes: 2
  Expected_votes: 3
  Total_votes: 3
  Quorum: 2   
  Active subsystems: 4
  ...

Nodes: 2 but Total_votes: 3 => last vote comes from qdiskd

-- Lon




From lhh at redhat.com  Thu Apr 24 18:07:20 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 24 Apr 2008 14:07:20 -0400
Subject: [Linux-cluster] CS5 Problem
In-Reply-To: <48106B27.3010505@bull.net>
References: <48106B27.3010505@bull.net>
Message-ID: <1209060440.809.13.camel@ayanami.boston.devel.redhat.com>

On Thu, 2008-04-24 at 13:12 +0200, Alain Moulle wrote:

> when testing a two-nodes cluster with quorum disk, when
> I poweroff the node1 , node 2 fences well the node 1 and
> failovers the service, but in log of node 2 I have before and after
> the fence success messages  many messages like this:
> Apr 24 11:30:04 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.
> Apr 24 11:30:04 s_sys at xn3 qdiskd[13740]: <alert> Writing eviction notice for node 2
> Apr 24 11:30:05 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.
> Apr 24 11:30:05 s_sys at xn3 qdiskd[13740]: <alert> Writing eviction notice for node 2
> Apr 24 11:30:06 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.
> Apr 24 11:30:06 s_sys at xn3 qdiskd[13740]: <alert> Writing eviction notice for node 2
> Apr 24 11:30:07 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.
> Apr 24 11:30:07 s_sys at xn3 qdiskd[13740]: <alert> Writing eviction notice for node 2
> Apr 24 11:30:08 s_sys at xn3 qdiskd[13740]: <crit> Node 2 is undead.

http://sources.redhat.com/git/?p=cluster.git;a=commit;h=b2686ffe984c517110b949d604c54a71800b67c9

-- Lon




From Bennie_R_Thomas at raytheon.com  Thu Apr 24 19:10:51 2008
From: Bennie_R_Thomas at raytheon.com (Bennie Thomas)
Date: Thu, 24 Apr 2008 14:10:51 -0500
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
Message-ID: <4810DB3B.20402@raytheon.com>

I have a 3-node Cluster set up as 2-nodes active and one passive. I have 
assigned 2 IP Aliases
to fail over. The problem I am having is;  When I ssh to the IP aliases 
the first time it works fine,
I then failover the IP alias service to the backup node, then try 
ssh'ing to the alias,  it fails with man in the middle attack

I know I can go modify .ssh/known_hosts and remove the host key and it 
will work, but if the alias fails back to the
original node the problem starts all over.

How can I set up ssh to allow this connection. ?



From lhh at redhat.com  Thu Apr 24 19:23:48 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 24 Apr 2008 15:23:48 -0400
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
In-Reply-To: <4810DB3B.20402@raytheon.com>
References: <4810DB3B.20402@raytheon.com>
Message-ID: <1209065028.12993.7.camel@ayanami.boston.devel.redhat.com>

On Thu, 2008-04-24 at 14:10 -0500, Bennie Thomas wrote:
> I have a 3-node Cluster set up as 2-nodes active and one passive. I have 
> assigned 2 IP Aliases
> to fail over. The problem I am having is;  When I ssh to the IP aliases 
> the first time it works fine,
> I then failover the IP alias service to the backup node, then try 
> ssh'ing to the alias,  it fails with man in the middle attack
> 
> I know I can go modify .ssh/known_hosts and remove the host key and it 
> will work, but if the alias fails back to the
> original node the problem starts all over.
> 
> How can I set up ssh to allow this connection. ?

What I usually do is:

- make a copy of the sshd init script and place it somewhere
besides /etc/init.d (/cluster/scripts?)

- change the global sshd config file to bind to a *specific* VIP on the
host.

- create a separate config file for the cluster VIP using different host
keys for the cluster IP address

- copy service-specific sshd script / config / host keys to other
cluster node(s)

- add the copied script as part of the cluster service with the VIP you
want

You'll end up with 2 sshd instances running on the host when the service
is enabled - one for the host's IP with specific keys/etc. for that IP,
and one running for the cluster IP address with its own set of keys.

Because the host keys are distributed between the cluster nodes for this
address, no matter where the cluster IP is, it should work - IP matches
and the keys match :)

-- Lon



From lpleiman at redhat.com  Thu Apr 24 19:25:30 2008
From: lpleiman at redhat.com (Leo Pleiman)
Date: Thu, 24 Apr 2008 15:25:30 -0400
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
In-Reply-To: <4810DB3B.20402@raytheon.com>
References: <4810DB3B.20402@raytheon.com>
Message-ID: <4810DEAA.7010306@redhat.com>

You have to install the same ssh key on each machine.

Bennie Thomas wrote:
> I have a 3-node Cluster set up as 2-nodes active and one passive. I 
> have assigned 2 IP Aliases
> to fail over. The problem I am having is;  When I ssh to the IP 
> aliases the first time it works fine,
> I then failover the IP alias service to the backup node, then try 
> ssh'ing to the alias,  it fails with man in the middle attack
>
> I know I can go modify .ssh/known_hosts and remove the host key and it 
> will work, but if the alias fails back to the
> original node the problem starts all over.
>
> How can I set up ssh to allow this connection. ?
>
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-- 
Leo J Pleiman
Senior Consultant, GPS Federal
410-688-3873

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lpleiman.vcf
Type: text/x-vcard
Size: 194 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080424/52b3e9a5/attachment.vcf>

From lhh at redhat.com  Thu Apr 24 19:29:24 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 24 Apr 2008 15:29:24 -0400
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
In-Reply-To: <1209065028.12993.7.camel@ayanami.boston.devel.redhat.com>
References: <4810DB3B.20402@raytheon.com>
	<1209065028.12993.7.camel@ayanami.boston.devel.redhat.com>
Message-ID: <1209065364.12993.9.camel@ayanami.boston.devel.redhat.com>

On Thu, 2008-04-24 at 15:23 -0400, Lon Hohberger wrote:

I'm writing step by step wiki article now. ;)

[Note: it will need to be expanded for non-Red Hat/Fedora distributions]

-- Lon




From Bennie_R_Thomas at raytheon.com  Thu Apr 24 19:40:30 2008
From: Bennie_R_Thomas at raytheon.com (Bennie Thomas)
Date: Thu, 24 Apr 2008 14:40:30 -0500
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
In-Reply-To: <4810DEAA.7010306@redhat.com>
References: <4810DB3B.20402@raytheon.com> <4810DEAA.7010306@redhat.com>
Message-ID: <4810E22E.6000101@raytheon.com>

the key is on the both nodes. I use nis for my accounts and automounted 
home directories.

Leo Pleiman wrote:
> You have to install the same ssh key on each machine.
>
> Bennie Thomas wrote:
>> I have a 3-node Cluster set up as 2-nodes active and one passive. I 
>> have assigned 2 IP Aliases
>> to fail over. The problem I am having is;  When I ssh to the IP 
>> aliases the first time it works fine,
>> I then failover the IP alias service to the backup node, then try 
>> ssh'ing to the alias,  it fails with man in the middle attack
>>
>> I know I can go modify .ssh/known_hosts and remove the host key and 
>> it will work, but if the alias fails back to the
>> original node the problem starts all over.
>>
>> How can I set up ssh to allow this connection. ?
>>
>> -- 
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From jbrassow at redhat.com  Thu Apr 24 20:54:59 2008
From: jbrassow at redhat.com (Jonathan Brassow)
Date: Thu, 24 Apr 2008 15:54:59 -0500
Subject: [Linux-cluster] CS5 Problem
In-Reply-To: <48106B27.3010505@bull.net>
References: <48106B27.3010505@bull.net>
Message-ID: <21DB5D64-FC7C-4DBB-9097-EC8EEEEA1722@redhat.com>


On Apr 24, 2008, at 6:12 AM, Alain Moulle wrote:

> Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Local version # : 1
> Apr 24 11:47:02 s_sys at xn4 ccsd[11099]:  Remote version #: 1
> Apr 24 11:47:02 s_sys at xn4 ccsd[11099]: Remote copy of cluster.conf  
> is from
> quorate node.

Is there a different cluster.conf file on xn4 than xn3 after rebooting/ 
starting xn4?

  brassow



From lhh at redhat.com  Thu Apr 24 21:02:54 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 24 Apr 2008 17:02:54 -0400
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
In-Reply-To: <1209065364.12993.9.camel@ayanami.boston.devel.redhat.com>
References: <4810DB3B.20402@raytheon.com>
	<1209065028.12993.7.camel@ayanami.boston.devel.redhat.com>
	<1209065364.12993.9.camel@ayanami.boston.devel.redhat.com>
Message-ID: <1209070974.12993.15.camel@ayanami.boston.devel.redhat.com>

On Thu, 2008-04-24 at 15:29 -0400, Lon Hohberger wrote:
> On Thu, 2008-04-24 at 15:23 -0400, Lon Hohberger wrote:
> 
> I'm writing step by step wiki article now. ;)
> 
> [Note: it will need to be expanded for non-Red Hat/Fedora distributions]

Here's how to do it so that each service has a different ssh key (which
can be different from the host's ssh keys):

http://sources.redhat.com/cluster/wiki/ClusterSSH

This is designed to ensure that each {key, IP} pairing is static so that
connecting to a host (or a cluster IP) will not cause ssh to complain -
because each IP has a different SSH key.

Comments welcome.

-- Lon




From Bennie_R_Thomas at raytheon.com  Thu Apr 24 21:22:41 2008
From: Bennie_R_Thomas at raytheon.com (Bennie Thomas)
Date: Thu, 24 Apr 2008 16:22:41 -0500
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
In-Reply-To: <1209070974.12993.15.camel@ayanami.boston.devel.redhat.com>
References: <4810DB3B.20402@raytheon.com>	<1209065028.12993.7.camel@ayanami.boston.devel.redhat.com>	<1209065364.12993.9.camel@ayanami.boston.devel.redhat.com>
	<1209070974.12993.15.camel@ayanami.boston.devel.redhat.com>
Message-ID: <4810FA21.8080701@raytheon.com>

You work fast, Actually  Leo  was on to something. My personal user ssh 
files where the same
because of our automounted directories. However, I copied over the files 
in /etc/ssh to all three
nodes in my cluster and ssh works great.(I can ssh from any node on our 
network to the VIP
addresses then fail the services over to the backup node and all is good)

 I guess the problem with that is that the level of security is reduced 
a little.

I am going to try your way now and see how things work out.


Lon Hohberger wrote:
> On Thu, 2008-04-24 at 15:29 -0400, Lon Hohberger wrote:
>   
>> On Thu, 2008-04-24 at 15:23 -0400, Lon Hohberger wrote:
>>
>> I'm writing step by step wiki article now. ;)
>>
>> [Note: it will need to be expanded for non-Red Hat/Fedora distributions]
>>     
>
> Here's how to do it so that each service has a different ssh key (which
> can be different from the host's ssh keys):
>
> http://sources.redhat.com/cluster/wiki/ClusterSSH
>
> This is designed to ensure that each {key, IP} pairing is static so that
> connecting to a host (or a cluster IP) will not cause ssh to complain -
> because each IP has a different SSH key.
>
> Comments welcome.
>
> -- Lon
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>   




From linux-cluster at merctech.com  Thu Apr 24 21:34:30 2008
From: linux-cluster at merctech.com (linux-cluster at merctech.com)
Date: Thu, 24 Apr 2008 17:34:30 -0400
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
In-Reply-To: Your message of "Thu, 24 Apr 2008 17:02:54 EDT."
	<1209070974.12993.15.camel@ayanami.boston.devel.redhat.com>
References: <1209070974.12993.15.camel@ayanami.boston.devel.redhat.com>
	<4810DB3B.20402@raytheon.com>
	<1209065028.12993.7.camel@ayanami.boston.devel.redhat.com>
	<1209065364.12993.9.camel@ayanami.boston.devel.redhat.com>
Message-ID: <32338.1209072870@localhost>



In the message dated: Thu, 24 Apr 2008 17:02:54 EDT,
The pithy ruminations from Lon Hohberger on 
<Re: [Linux-cluster] ssh'ing to Cluster IP aliases.> were:
=> On Thu, 2008-04-24 at 15:29 -0400, Lon Hohberger wrote:
=> > On Thu, 2008-04-24 at 15:23 -0400, Lon Hohberger wrote:
=> > 
=> > I'm writing step by step wiki article now. ;)
=> > 
=> > [Note: it will need to be expanded for non-Red Hat/Fedora distributions]
=> 
=> Here's how to do it so that each service has a different ssh key (which
=> can be different from the host's ssh keys):
=> 
=> http://sources.redhat.com/cluster/wiki/ClusterSSH

Impressive. Well documented. On the other hand, I really try had to avoid 
changing things like vendor-distributed /etc/init.d scripts. I feel that causes 
many problems in future maintenance and complicates troubleshooting.

=> 
=> This is designed to ensure that each {key, IP} pairing is static so that
=> connecting to a host (or a cluster IP) will not cause ssh to complain -
=> because each IP has a different SSH key.

I believe there's a much, much, much easier way of dealing with this.

Don't make any changes on the servers (ie., run a standard ssh 
config, which won't break everytime you run "yum update").

On the client, put in multiple host entries in the ~/.ssh/authorized_keys file,
as in:

	clusterVIP, node1 1024 35 12487479769271803949013278496...

this tells ssh that the given key is valid for all the named servers 
("clusterVIP" and "node1", in this case).

If you have different host keys for each server, you'll need multiple 
authorized_keys entries, each one naming the cluster VIP and the specific node, 
and specifying that node's key. Thus, the clusterVIP will have multiple valid 
keys, each one corresponding to a different node.

If you use the same host key for each node, you can list all the nodes as a 
single entry in the authorized_keys file, as in:
	
	clusterVIP, node1, node2, node3, nodeN 1025 35 8813746318910438843...



[Disclaimer]
I haven't actually used this method with VIPs on RHCS clusters, but I have used 
it to connect to other clusters (BSD, Sun).
[/Disclaimer]

Mark



=> 
=> Comments welcome.
=> 
=> -- Lon
=> 



From tiagocruz at forumgdh.net  Thu Apr 24 22:38:23 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Thu, 24 Apr 2008 19:38:23 -0300
Subject: [Linux-cluster] MySQL Cluster
Message-ID: <1209076703.7121.21.camel@tuxkiller.ig.com.br>

Hello Guys,

I'm researching about mysql's cluster.
What did you suggests? 

I need to have write in all nodes of cluster, so, I can't use DRBD7 +
HeartBeat.

Maybe, I should use DRBR8 and GFS to to this?

One proxy application to separe Insert/Update from Select?

Well.. thanks for your time! :)

-- 
Tiago Cruz
http://everlinux.com
Linux User #282636




From gordan at bobich.net  Fri Apr 25 00:02:53 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Fri, 25 Apr 2008 01:02:53 +0100
Subject: [Linux-cluster] MySQL Cluster
In-Reply-To: <1209076703.7121.21.camel@tuxkiller.ig.com.br>
References: <1209076703.7121.21.camel@tuxkiller.ig.com.br>
Message-ID: <48111FAD.8090805@bobich.net>

Tiago Cruz wrote:

> I'm researching about mysql's cluster.
> What did you suggests? 

Last time I tried using MySQL Cluster it was unacceptably slow. It 
wasn't really a workable solution. MySQL replication was a much better 
option.

> I need to have write in all nodes of cluster, so, I can't use DRBD7 +
> HeartBeat.
 >
> Maybe, I should use DRBR8 and GFS to to this?

You could use DRBD in primary-primary mode with GFS on top and external 
locking in MySQL. It would work, but performance would suffer a great 
deal. It depends on what kind of throughput you need.

> One proxy application to separe Insert/Update from Select?

Depending on your application, you could use something like MySQL 
round-robin replication. In that configuration, MySQL servers replicate 
in a ring. The downside is that there is a race condition inherent in 
the design, if you think about it carefully. (DRBD+GFS+ MySQL with 
external locking doesn't suffer the race condition.) If this potential 
race condition is not a concern for your application, then performance 
wise, it is likely to be the optimal solution for you. Since you said 
that fail-over is not an option for you, I am assuming that the load you 
plan to put through it is too great for one server to handle (I have 
seem low-spec servers with MySQL churning out 100,000+ queries per 
second when tuned up correctly). If you are dealing with this sort of 
load, DRBD+GFS+MySQL isn't going to cut it and round-robin replication 
with the mentioned race condition is pretty much your only option.

Gordan



From Harri.Paivaniemi at tietoenator.com  Fri Apr 25 04:38:36 2008
From: Harri.Paivaniemi at tietoenator.com (Harri.Paivaniemi at tietoenator.com)
Date: Fri, 25 Apr 2008 07:38:36 +0300
Subject: [Linux-cluster] MySQL Cluster
References: <1209076703.7121.21.camel@tuxkiller.ig.com.br>
Message-ID: <36F4E74FA8263744A6B016E6A461EFF603317E44@dino.eu.tieto.com>

If you need high select- performance but only moderate update/insert/delete,

then it's easy and robust way to:

- cluster master-db as active-hot standby (with RHCS)

- use mysql proxy to split selects and run many slaves for selects in loadbalance pool behind
one VIP

- use bin- replication master --> slaves



-hpj
  


-----Original Message-----
From: linux-cluster-bounces at redhat.com on behalf of Tiago Cruz
Sent: Fri 4/25/2008 1:38
To: linux clustering
Subject: [Linux-cluster] MySQL Cluster
 
Hello Guys,

I'm researching about mysql's cluster.
What did you suggests? 

I need to have write in all nodes of cluster, so, I can't use DRBD7 +
HeartBeat.

Maybe, I should use DRBR8 and GFS to to this?

One proxy application to separe Insert/Update from Select?

Well.. thanks for your time! :)

-- 
Tiago Cruz
http://everlinux.com
Linux User #282636


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 2982 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/b84697f7/attachment.bin>

From vimal at monster.co.in  Fri Apr 25 08:54:41 2008
From: vimal at monster.co.in (Vimal Gupta)
Date: Fri, 25 Apr 2008 14:24:41 +0530
Subject: [Linux-cluster] GFS Storage cluster !!!!
Message-ID: <48119C51.8020904@monster.co.in>

Hi Guys,

I want a storage cluster of 3 Nodes.But  I want to test first. So for 
Testing I have one quad CPU server with CentOS 5.1. I don't have a SAN 
now . Can I configure this server as a one node cluster and use its one 
scsi HDD as GFS. Just for testing . (may be this is a stupid question).


-- 

Vimal Gupta
Sr. System Administrator
Monster.com India Pvt.Ltd.
FC - 23, Block - B, 1st Floor, Film City, Sector - 16 A,
NOIDA, UP 201 301, INDIA
Ph# : +91-120-4024230 Fax: +91-40-66506449 Mobile: +91-9811150360



From harry.sutton at hp.com  Fri Apr 25 10:47:53 2008
From: harry.sutton at hp.com (Sutton, Harry (MSE))
Date: Fri, 25 Apr 2008 06:47:53 -0400
Subject: [Linux-cluster] GFS Storage cluster !!!!
In-Reply-To: <48119C51.8020904@monster.co.in>
References: <48119C51.8020904@monster.co.in>
Message-ID: <4811B6D9.8070907@hp.com>

Hi Vimal,

You can configure a single node to use a GFS filesystem, but you must 
use manual locking, something you definitely wouldn't do on a multi-node 
production cluster.

    /Harry

Vimal Gupta wrote:
> Hi Guys,
>
> I want a storage cluster of 3 Nodes.But  I want to test first. So for
> Testing I have one quad CPU server with CentOS 5.1. I don't have a SAN
> now . Can I configure this server as a one node cluster and use its one
> scsi HDD as GFS. Just for testing . (may be this is a stupid question).
>
>
> --
>
> Vimal Gupta
> Sr. System Administrator
> Monster.com India Pvt.Ltd.
> FC - 23, Block - B, 1st Floor, Film City, Sector - 16 A,
> NOIDA, UP 201 301, INDIA
> Ph# : +91-120-4024230 Fax: +91-40-66506449 Mobile: +91-9811150360
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>   

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 6255 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/a3913100/attachment.bin>

From gniagnia at gmail.com  Fri Apr 25 12:41:08 2008
From: gniagnia at gmail.com (gnia gnia)
Date: Fri, 25 Apr 2008 14:41:08 +0200
Subject: [Linux-cluster] redhat cluster in multi SAN
Message-ID: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>

Hi all,

Situation :  2 nodes cluster (active/active) under RedHat 4.
Each clusterized service has a multipathed/mirrored LV (one PV on each
storage bay -> HP EVA 8000) declared as an FS resource.
(We decided to implement a multi-SAN environment to prevent from one SAN
failure)
But all clusterized mirrored LVs get frozen and become unusable when we
simulate one SAN outage.

According to you experts, what is the best way to implement a storage
redundant redhat cluster that wouldn't fail in case of one SAN outage ?
Is lvm mirroring a godd choice?

Cheers,
Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/15e47978/attachment.htm>

From maciej.bogucki at artegence.com  Fri Apr 25 14:03:35 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Fri, 25 Apr 2008 16:03:35 +0200
Subject: [Linux-cluster] redhat cluster in multi SAN
In-Reply-To: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>
References: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>
Message-ID: <4811E4B7.4030904@artegence.com>

gnia gnia napisa?(a):
> Hi all,
> 
> Situation :  2 nodes cluster (active/active) under RedHat 4.
> Each clusterized service has a multipathed/mirrored LV (one PV on each
> storage bay -> HP EVA 8000) declared as an FS resource.
> (We decided to implement a multi-SAN environment to prevent from one SAN
> failure)
> But all clusterized mirrored LVs get frozen and become unusable when we
> simulate one SAN outage.

Because You have only one copy of LV log [1]. You cean check it by "lvs
-a -o +devices"

> 
> According to you experts, what is the best way to implement a storage
> redundant redhat cluster that wouldn't fail in case of one SAN outage ?
> Is lvm mirroring a godd choice?
Do You think about automatic failover of SAN? If Yes, I think that there
is no open source solution an the moment what is capable to do this.

[1] -
http://www.redhat.com/docs/manuals/enterprise/RHEL-5-manual/en-US/RHEL510/Cluster_Logical_Volume_Manager/mirrored_volumes.html

Best Regards
Maciej Bogucki



From jbrassow at redhat.com  Fri Apr 25 14:11:32 2008
From: jbrassow at redhat.com (Jonathan Brassow)
Date: Fri, 25 Apr 2008 09:11:32 -0500
Subject: [Linux-cluster] GFS Storage cluster !!!!
In-Reply-To: <4811B6D9.8070907@hp.com>
References: <48119C51.8020904@monster.co.in> <4811B6D9.8070907@hp.com>
Message-ID: <387D61E1-E984-400C-B11E-884C590574CC@redhat.com>

If you have more than one machine, you could export a portion of you  
local hard drive to the other machines.  You would then be able to  
test as if you had a SAN.  GNBD/iSCSI can provide this capability.

  brassow

On Apr 25, 2008, at 5:47 AM, Sutton, Harry (MSE) wrote:

> Hi Vimal,
>
> You can configure a single node to use a GFS filesystem, but you  
> must use manual locking, something you definitely wouldn't do on a  
> multi-node production cluster.
>
>   /Harry
>
> Vimal Gupta wrote:
>> Hi Guys,
>>
>> I want a storage cluster of 3 Nodes.But  I want to test first. So for
>> Testing I have one quad CPU server with CentOS 5.1. I don't have a  
>> SAN
>> now . Can I configure this server as a one node cluster and use its  
>> one
>> scsi HDD as GFS. Just for testing . (may be this is a stupid  
>> question).
>>
>>
>> --
>>
>> Vimal Gupta
>> Sr. System Administrator
>> Monster.com India Pvt.Ltd.
>> FC - 23, Block - B, 1st Floor, Film City, Sector - 16 A,
>> NOIDA, UP 201 301, INDIA
>> Ph# : +91-120-4024230 Fax: +91-40-66506449 Mobile: +91-9811150360
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From jbrassow at redhat.com  Fri Apr 25 14:26:21 2008
From: jbrassow at redhat.com (Jonathan Brassow)
Date: Fri, 25 Apr 2008 09:26:21 -0500
Subject: [Linux-cluster] redhat cluster in multi SAN
In-Reply-To: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>
References: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>
Message-ID: <D46D6BD8-930F-40C2-8153-46CE3E825B87@redhat.com>

I'm not sure what you mean by multi-SAN.  Do you mean that you have  
two independent storage networks connecting the same machines to the  
same disks, or same machines to different disks, or something else?   
It sounds like you have 1 SAN with two storage devices?  It also  
sounds like your top level services are active/passive, but that your  
storage is setup active/active.  You may consider HA-LVM instead of  
CLVM in this case (http://sourceware.org/cluster/wiki/LVMFailover).

I've seen this scenario where things get frozen...  Turns out that the  
user did not have multipathd running, so multipath was queueing all I/ 
O until a path would come back - effectively freezing everything.

The problem could be a number of things... we may need more  
information if the above doesn't help.

  brassow

On Apr 25, 2008, at 7:41 AM, gnia gnia wrote:

> Hi all,
>
> Situation :  2 nodes cluster (active/active) under RedHat 4.
> Each clusterized service has a multipathed/mirrored LV (one PV on  
> each storage bay -> HP EVA 8000) declared as an FS resource.
> (We decided to implement a multi-SAN environment to prevent from one  
> SAN failure)
> But all clusterized mirrored LVs get frozen and become unusable when  
> we simulate one SAN outage.
>
> According to you experts, what is the best way to implement a  
> storage redundant redhat cluster that wouldn't fail in case of one  
> SAN outage ?
> Is lvm mirroring a godd choice?
>
> Cheers,
> Chris --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From Markus at hochholdinger.net  Fri Apr 25 14:39:56 2008
From: Markus at hochholdinger.net (Markus Hochholdinger)
Date: Fri, 25 Apr 2008 16:39:56 +0200
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
In-Reply-To: <4810DB3B.20402@raytheon.com>
References: <4810DB3B.20402@raytheon.com>
Message-ID: <200804251640.00494.Markus@hochholdinger.net>

Hi,

Am Donnerstag, 24. April 2008 21:10 schrieb Bennie Thomas:
> I have a 3-node Cluster set up as 2-nodes active and one passive. I have
> assigned 2 IP Aliases
> to fail over. The problem I am having is;  When I ssh to the IP aliases
> the first time it works fine,
> I then failover the IP alias service to the backup node, then try
> ssh'ing to the alias,  it fails with man in the middle attack
> I know I can go modify .ssh/known_hosts and remove the host key and it
> will work, but if the alias fails back to the
> original node the problem starts all over.
> How can I set up ssh to allow this connection. ?

this is what i read in the net about this issue:
http://blog.ganneff.de/blog/2008/03/24/ssh-known-hosts-for-cluster-en.html

Very easy, you only have to edit ~/.ssh/known_hosts on the client side.


-- 
greetings

eMHa
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/cf829327/attachment.sig>

From mcse47 at hotmail.com  Fri Apr 25 16:34:47 2008
From: mcse47 at hotmail.com (Tracey Flanders)
Date: Fri, 25 Apr 2008 12:34:47 -0400
Subject: [Linux-cluster] CLVMD hangs on 2nd node startup and hangs all gfs
	nodes.
Message-ID: <BAY123-W90D21314D7D1E7C07442BD4DD0@phx.gbl>


I've been trying to setup a 3 server cluster with GFS mounted over iSCSI on Qemu Virtual Machines. A 4th server acts as a iSCSI Target.
I found and article that explains my issue, but I can't seem to figure out what the solution is.

QUOTED from : http://kbase.redhat.com/faq/FAQ_51_10923.shtm
After successfully setting up a cluster, cman_tool shows the cluster is healthy. Mounting the gfs mount on the first node works successfully. However, when mounting gfs on the second node, the mount command hangs. Writing to a file on the first node also hangs.

On the second node, the following error is seen in /var/log/messages:

Jul 18 14:49:27 blade3 kernel: Lock_Harness 2.6.9-72.2 (built Apr 24
2007 12:45:55) installed
Jul 18 14:49:27 blade3 kernel: GFS 2.6.9-72.2 (built Apr 24 2007
12:46:12) installed
Jul 18 14:52:53 blade3 kernel: GFS: Trying to join cluster "lock_dlm",
"vcomcluster:testgfs"
Jul 18 14:52:53 blade3 kernel: Lock_DLM (built Apr 24 2007 12:45:57)
installed
Jul 18 14:52:53 blade3 kernel: dlm: connect from non cluster node
Jul 18 14:52:53 blade3 kernel: dlm: connect from non cluster node
END QUOTE

My Virtual Machines only have one interface so I still can't figure out why this is happening. I can successfully mount the GFS partition on any one node but as soon as I try to start the clvmd on a 2nd node it hangs the whole cluster. I'm wondering if its a Qemu VM network issue? Each host can ping each other by name and ip. The cluster works fine but I cant get GFS to work on th VMs. Is it possible to debug the clvmd to see what IP Address it is sending? Thanks,


Tracey Flanders

_________________________________________________________________
In a rush? Get real-time answers with Windows Live Messenger.
http://www.windowslive.com/messenger/overview.html?ocid=TXT_TAGLM_WL_Refresh_realtime_042008
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/d7fba642/attachment.htm>

From gniagnia at gmail.com  Fri Apr 25 17:06:42 2008
From: gniagnia at gmail.com (gnia gnia)
Date: Fri, 25 Apr 2008 19:06:42 +0200
Subject: [Linux-cluster] redhat cluster in multi SAN
In-Reply-To: <D46D6BD8-930F-40C2-8153-46CE3E825B87@redhat.com>
References: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>
	<D46D6BD8-930F-40C2-8153-46CE3E825B87@redhat.com>
Message-ID: <f8eac96f0804251006h115971aaj88a628a5a6391784@mail.gmail.com>

sorry my first message was a bit vague.
We have 2 HP EVA arrays : one being the backup device of the other in case a
crash occurs.
So we make each node of the cluster (active/passive , multipath + clvmd +
cmirror) connected to both EVAs.
We create one disk on each EVAs with same characteristics, plus one disk on
one of the 2 EVAs as a log mirror disk.
Then we 'lvcreate -m1 disk1 disk2 disk3' and put our file system resource on
this lv.
That is where our test (shutdown of an EVA) totally fails.
Unfortunately ha-lvm won't be usable because we intend to use GFS soon.
What we need is a solution that ensures clusterized services to carry on
working even if on EVA burns...
Thanks a lot for helping.

C

On Fri, Apr 25, 2008 at 4:26 PM, Jonathan Brassow <jbrassow at redhat.com>
wrote:

> I'm not sure what you mean by multi-SAN.  Do you mean that you have two
> independent storage networks connecting the same machines to the same disks,
> or same machines to different disks, or something else?  It sounds like you
> have 1 SAN with two storage devices?  It also sounds like your top level
> services are active/passive, but that your storage is setup active/active.
>  You may consider HA-LVM instead of CLVM in this case (
> http://sourceware.org/cluster/wiki/LVMFailover).
>
> I've seen this scenario where things get frozen...  Turns out that the user
> did not have multipathd running, so multipath was queueing all I/O until a
> path would come back - effectively freezing everything.
>
> The problem could be a number of things... we may need more information if
> the above doesn't help.
>
>  brassow
>
>
> On Apr 25, 2008, at 7:41 AM, gnia gnia wrote:
>
>  Hi all,
>>
>> Situation :  2 nodes cluster (active/active) under RedHat 4.
>> Each clusterized service has a multipathed/mirrored LV (one PV on each
>> storage bay -> HP EVA 8000) declared as an FS resource.
>> (We decided to implement a multi-SAN environment to prevent from one SAN
>> failure)
>> But all clusterized mirrored LVs get frozen and become unusable when we
>> simulate one SAN outage.
>>
>> According to you experts, what is the best way to implement a storage
>> redundant redhat cluster that wouldn't fail in case of one SAN outage ?
>> Is lvm mirroring a godd choice?
>>
>> Cheers,
>> Chris --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/7836be99/attachment.htm>

From mpartio at gmail.com  Fri Apr 25 17:15:05 2008
From: mpartio at gmail.com (Mikko Partio)
Date: Fri, 25 Apr 2008 20:15:05 +0300
Subject: [Linux-cluster] redhat cluster in multi SAN
In-Reply-To: <f8eac96f0804251006h115971aaj88a628a5a6391784@mail.gmail.com>
References: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>
	<D46D6BD8-930F-40C2-8153-46CE3E825B87@redhat.com>
	<f8eac96f0804251006h115971aaj88a628a5a6391784@mail.gmail.com>
Message-ID: <2ca799770804251015j27259601wc49d4f2547a92aa2@mail.gmail.com>

2008/4/25 gnia gnia <gniagnia at gmail.com>:

> sorry my first message was a bit vague.
> We have 2 HP EVA arrays : one being the backup device of the other in case
> a crash occurs.
> So we make each node of the cluster (active/passive , multipath + clvmd +
> cmirror) connected to both EVAs.
> We create one disk on each EVAs with same characteristics, plus one disk
> on one of the 2 EVAs as a log mirror disk.
> Then we 'lvcreate -m1 disk1 disk2 disk3' and put our file system resource
> on this lv.
> That is where our test (shutdown of an EVA) totally fails.
> Unfortunately ha-lvm won't be usable because we intend to use GFS soon.
> What we need is a solution that ensures clusterized services to carry on
> working even if on EVA burns...



AFAIK there is no open source software to make this happen. I've read about
cmirror which could replicate data between two sans in a clustered
environment but it was not production-proof last time I heard of it...

We have a similar situation, two different sans one being a backup, and we
use the san providers (EMC) proprietary replication software.

Regards

Mikko
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/2572be4c/attachment.htm>

From charlieb-linux-cluster at e-smith.com  Fri Apr 25 18:18:34 2008
From: charlieb-linux-cluster at e-smith.com (Charlie Brady)
Date: Fri, 25 Apr 2008 14:18:34 -0400 (EDT)
Subject: qdisk and voting in RHEL4 CS (Re: [Linux-cluster] Is heuristic really
	required?)
In-Reply-To: <1209060358.809.10.camel@ayanami.boston.devel.redhat.com>
Message-ID: <Pine.LNX.4.44.0804251412490.9517-100000@allspice.nssg.mitel.com>


On Thu, 24 Apr 2008, Lon Hohberger wrote:

> On Wed, 2008-04-23 at 16:11 -0400, Charlie Brady wrote:
> > On Fri, 18 Apr 2008, HarriP??iv??niemi wrote:
> > 
...
> 
> > Another thing I've noticed (with RHEL4 RHC) is the the quorum disk doesn't 
> > show up in the node list of the first node set up in the cluster, and 
> > doesn't contribute a vote. Anyone have any ideas about that?
> 
> It's not really a node, and is not reported by CMAN in the nodes list as
> it is in RHEL5/STABLE2 branches.

Not so, it is sometimes reported by cman in the nodes list. Herewith the 
/proc/cluster/nodes file from four nodes of a cluster, each with qdiskd 
running:


[root at g5node11 ~]# for i in 12 14 16 18 ; do ssh 192.168.10.$i cat /proc/cluster/nodes ; done
Node  Votes Exp Sts  Name
   1    1    1   M   g5node11
   2    1    3   M   g5node12
   3    1    5   M   g5node13
   4    1    7   M   huys3
Node  Votes Exp Sts  Name
   0    1    0   M   /dev/sda1
   1    1    1   M   g5node11
   2    1    3   M   g5node12
   3    1    5   M   g5node13
   4    1    7   M   huys3
Node  Votes Exp Sts  Name
   0    2    0   M   /dev/sda1
   1    1    1   M   g5node11
   2    1    3   M   g5node12
   3    1    5   M   g5node13
   4    1    7   M   huys3
Node  Votes Exp Sts  Name
   0    3    0   M   /dev/sda1
   1    1    1   M   g5node11
   2    1    3   M   g5node12
   3    1    5   M   g5node13
   4    1    7   M   huys3
[root at g5node11 ~]#

I'm guessing that the order of startup or of adding nodes to the cluster 
is responsible for the non-appearance of qdisk on one node, and the 
varying votes on other nodes, but I'd appreciate some better 
understanding, and a workaround or fix if someone can provide it.

--
Charlie




From jbrassow at redhat.com  Fri Apr 25 19:56:13 2008
From: jbrassow at redhat.com (Jonathan Brassow)
Date: Fri, 25 Apr 2008 14:56:13 -0500
Subject: [Linux-cluster] redhat cluster in multi SAN
In-Reply-To: <f8eac96f0804251006h115971aaj88a628a5a6391784@mail.gmail.com>
References: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>
	<D46D6BD8-930F-40C2-8153-46CE3E825B87@redhat.com>
	<f8eac96f0804251006h115971aaj88a628a5a6391784@mail.gmail.com>
Message-ID: <02B220AA-C259-4982-814D-7BE1696C8230@redhat.com>

Ok, we'll assume you have to use CLVM then.

Have you checked if multipathd is running when you kill the storage  
device?

What are the symptoms of the cluster freeze.  (I will need some kind  
of logging output.)

  brassow

On Apr 25, 2008, at 12:06 PM, gnia gnia wrote:

> sorry my first message was a bit vague.
> We have 2 HP EVA arrays : one being the backup device of the other  
> in case a crash occurs.
> So we make each node of the cluster (active/passive , multipath +  
> clvmd + cmirror) connected to both EVAs.
> We create one disk on each EVAs with same characteristics, plus one  
> disk on one of the 2 EVAs as a log mirror disk.
> Then we 'lvcreate -m1 disk1 disk2 disk3' and put our file system  
> resource on this lv.
> That is where our test (shutdown of an EVA) totally fails.
> Unfortunately ha-lvm won't be usable because we intend to use GFS  
> soon.
> What we need is a solution that ensures clusterized services to  
> carry on working even if on EVA burns...
> Thanks a lot for helping.
>
> C
>
> On Fri, Apr 25, 2008 at 4:26 PM, Jonathan Brassow  
> <jbrassow at redhat.com> wrote:
> I'm not sure what you mean by multi-SAN.  Do you mean that you have  
> two independent storage networks connecting the same machines to the  
> same disks, or same machines to different disks, or something else?   
> It sounds like you have 1 SAN with two storage devices?  It also  
> sounds like your top level services are active/passive, but that  
> your storage is setup active/active.  You may consider HA-LVM  
> instead of CLVM in this case (http://sourceware.org/cluster/wiki/LVMFailover 
> ).
>
> I've seen this scenario where things get frozen...  Turns out that  
> the user did not have multipathd running, so multipath was queueing  
> all I/O until a path would come back - effectively freezing  
> everything.
>
> The problem could be a number of things... we may need more  
> information if the above doesn't help.
>
>  brassow
>
>
> On Apr 25, 2008, at 7:41 AM, gnia gnia wrote:
>
> Hi all,
>
> Situation :  2 nodes cluster (active/active) under RedHat 4.
> Each clusterized service has a multipathed/mirrored LV (one PV on  
> each storage bay -> HP EVA 8000) declared as an FS resource.
> (We decided to implement a multi-SAN environment to prevent from one  
> SAN failure)
> But all clusterized mirrored LVs get frozen and become unusable when  
> we simulate one SAN outage.
>
> According to you experts, what is the best way to implement a  
> storage redundant redhat cluster that wouldn't fail in case of one  
> SAN outage ?
> Is lvm mirroring a godd choice?
>
> Cheers,
> Chris --
>
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/76030e33/attachment.htm>

From jbrassow at redhat.com  Fri Apr 25 20:02:26 2008
From: jbrassow at redhat.com (Jonathan Brassow)
Date: Fri, 25 Apr 2008 15:02:26 -0500
Subject: [Linux-cluster] CLVMD hangs on 2nd node startup and hangs all gfs
	nodes.
In-Reply-To: <BAY123-W90D21314D7D1E7C07442BD4DD0@phx.gbl>
References: <BAY123-W90D21314D7D1E7C07442BD4DD0@phx.gbl>
Message-ID: <4087E106-AE76-4233-A0E4-42E8A90BCBFE@redhat.com>

If you suspect a problem with clvmd, you could simply remove it from  
the equation and retest, right?

You could just use the underlying iSCSI device and mkfs.gfs on  
that.... at least for testing if clvmd is the problem.

I suppose you could also test if clvmd is the problem by testing the  
logical volumes without GFS in the mix.  IOW, create some LVs and read/ 
write to them at the same time from different machines.  If this is  
working, the file system should work.  If the file system doesn't,  
then the problem is probably higher up than clvmd.

  brassow

On Apr 25, 2008, at 11:34 AM, Tracey Flanders wrote:

> I've been trying to setup a 3 server cluster with GFS mounted over  
> iSCSI on Qemu Virtual Machines. A 4th server acts as a iSCSI Target.  
> I found and article that explains my issue, but I can't seem to  
> figure out what the solution is. QUOTED from :http://kbase.redhat.com/faq/FAQ_51_10923.shtm 
>  After successfully setting up a cluster, cman_tool shows the  
> cluster is healthy. Mounting the gfs mount on the first node works  
> successfully. However, when mounting gfs on the second node, the  
> mount command hangs. Writing to a file on the first node also hangs.  
> On the second node, the following error is seen in /var/log/ 
> messages: Jul 18 14:49:27 blade3 kernel: Lock_Harness 2.6.9-72.2  
> (built Apr 24 2007 12:45:55) installed Jul 18 14:49:27 blade3  
> kernel: GFS 2.6.9-72.2 (built Apr 24 2007 12:46:12) installed Jul 18  
> 14:52:53 blade3 kernel: GFS: Trying to join cluster "lock_dlm",  
> "vcomcluster:testgfs" Jul 18 14:52:53 blade3 kernel: Lock_DLM (built  
> Apr 24 2007 12:45:57) installed Jul 18 14:52:53 blade3 kernel: dlm:  
> connect from non cluster node Jul 18 14:52:53 blade3 kernel: dlm:  
> connect from non cluster node END QUOTE My Virtual Machines only  
> have one interface so I still can't figure out why this is  
> happening. I can successfully mount the GFS partition on any one  
> node but as soon as I try to start the clvmd on a 2nd node it hangs  
> the whole cluster. I'm wondering if its a Qemu VM network issue?  
> Each host can ping each other by name and ip. The cluster works fine  
> but I cant get GFS to work on th VMs. Is it possible to debug the  
> clvmd to see what IP Address it is sending? Thanks, Tracey Flanders
> In a rush? Get real-time answers with Windows Live Messenger. --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/a77993d8/attachment.htm>

From pmshehzad at yahoo.com  Sat Apr 26 06:30:04 2008
From: pmshehzad at yahoo.com (Mshehzad Pankhawala)
Date: Fri, 25 Apr 2008 23:30:04 -0700 (PDT)
Subject: [Linux-cluster] ssh'ing to Cluster IP aliases.
Message-ID: <404374.77043.qm@web45814.mail.sp1.yahoo.com>

Hi, 
I think this is the temperory solution to delete on the client side and then continue to shh the faild over node.
What is the permenant solution to this issue. Is There? or Not?
regards
Shehzad Pankhawala

       
---------------------------------
Be a better friend, newshound, and know-it-all with Yahoo! Mobile.  Try it now.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080425/e3db6e7b/attachment.htm>

From davi at jvsinfo.com.br  Sun Apr 27 22:42:07 2008
From: davi at jvsinfo.com.br (Davi Baldin)
Date: Sun, 27 Apr 2008 19:42:07 -0300
Subject: [Linux-cluster] What services should be enabled at boot time (RHCS
	5)
Message-ID: <OFBE7003DE.F59D8635-ON83257438.007C6C74-83257438.007CB56F@jvsinfo.com.br>

Hello all,

What service i should be enable at boot time on Red Hat 5.1 ?

a) Without GFS and/or CLVM?

b) With GFS?

Best regards,

Davi Baldin
JVS do Brasil - IBM BP Premier
davi at jvsinfo.com.br
(19) 3254-1266
(19) 9266-6793 ** NOVO **
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080427/8e73ce83/attachment.htm>

From davi at jvsinfo.com.br  Sun Apr 27 23:07:56 2008
From: davi at jvsinfo.com.br (Davi Baldin)
Date: Sun, 27 Apr 2008 20:07:56 -0300
Subject: [Linux-cluster] IBM RSA-II Fencing and boot
Message-ID: <OF68FE0F44.472A13B4-ON83257438.007CB772-83257438.007F128B@jvsinfo.com.br>

Hello,

I'm confused about fencing configuration and work, please can someone help 
me?

My setup:

- Host A 192.168.4.251
        - RSA-II 192.168.4.36 (plugged on server)

- Host B 192.168.4.252
        - RSA-II 192.168.4.37 (plugged on server)

cluster.conf

- Host A have fencing level 1, fencing device 192.168.4.36

- Host B have fencing level 1, fencing device 192.168.4.37

Services on cluster
        - LDAP, DNS, Radius for service "infra"
        - postfix and courier-imap for service "email"

This is correct setup? I need more other fencing configuration?


ISSUE

When the two servers are up, if I reboot one server (anyone), at the boot 
time when the service cluster starts, in " starting fecing..." the other 
host thats are alived, recieve a head shot, and reboot (not halt) it self. 
After the head shot, the cluster service continue starting and "kidnaps" 
all cluster services for himself. This issue occurs if the host have that 
one, all or none service configured running.

ISSUE 2

If I shutdown all servers, always at the boot time or a shutdown time the 
fecing always hang. At the boot tiime, I need to wait 10min, and the 
shutdown wait around 30 min.

All RSAs and nodes IPs are configured in /etc/hosts, always the network 
are available (except when host is poweroff), the RSA is always 
on-line....


Many thanks for any help,

Regards,

Davi

Davi Baldin
JVS do Brasil - IBM BP Premier
davi at jvsinfo.com.br
(19) 3254-1266
(19) 9266-6793 ** NOVO **
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080427/f3ab94c5/attachment.htm>

From mcse47 at hotmail.com  Mon Apr 28 12:15:56 2008
From: mcse47 at hotmail.com (Tracey Flanders)
Date: Mon, 28 Apr 2008 08:15:56 -0400
Subject: [Linux-cluster] CLVMD hangs on 2nd node startup and hangs all gfs
	nodes.
In-Reply-To: <20080426160008.4626561A488@hormel.redhat.com>
References: <20080426160008.4626561A488@hormel.redhat.com>
Message-ID: <BAY123-W37455DEFD596264A266CDDD4DE0@phx.gbl>



I tested creating  a GFS disk with 2 nodes started in the cluster without using LVMs and CLVMD stop.
I mounted the disk on the first node but when I mounted the 2nd node it did the same thing. So it seems its something 
other than CLVMD. I've attached my cluster.conf. It's kind of dumbed down because I was troubleshooting. So I removed the GFS mount for a services, etc.
This the config I used.



        
        
                
                        
                                
                                        
                                
                        
                
                
                        
                                
                                        
                                
                        
                
                
                        
                                
                                        
                                
                        
                
        
        
        
                
                
                
        
        
                
                        
                                
                                
                                
                        
                        
                                
                                
                                
                        
                        
                                
                                
                                
                        
                
        



> If you suspect a problem with clvmd, you could simply remove it from  
> the equation and retest, right?
> 
> You could just use the underlying iSCSI device and mkfs.gfs on  
> that.... at least for testing if clvmd is the problem.
> 
> I suppose you could also test if clvmd is the problem by testing the  
> logical volumes without GFS in the mix.  IOW, create some LVs and read/ 
> write to them at the same time from different machines.  If this is  
> working, the file system should work.  If the file system doesn't,  
> then the problem is probably higher up than clvmd.
> 
>   brassow
> 
> On Apr 25, 2008, at 11:34 AM, Tracey Flanders wrote:
> 
>> I've been trying to setup a 3 server cluster with GFS mounted over  
>> iSCSI on Qemu Virtual Machines. A 4th server acts as a iSCSI Target.  
>> I found and article that explains my issue, but I can't seem to  
>> figure out what the solution is. QUOTED from :http://kbase.redhat.com/faq/FAQ_51_10923.shtm 
>>  After successfully setting up a cluster, cman_tool shows the  
>> cluster is healthy. Mounting the gfs mount on the first node works  
>> successfully. However, when mounting gfs on the second node, the  
>> mount command hangs. Writing to a file on the first node also hangs.  
>> On the second node, the following error is seen in /var/log/ 
>> messages: Jul 18 14:49:27 blade3 kernel: Lock_Harness 2.6.9-72.2  
>> (built Apr 24 2007 12:45:55) installed Jul 18 14:49:27 blade3  
>> kernel: GFS 2.6.9-72.2 (built Apr 24 2007 12:46:12) installed Jul 18  
>> 14:52:53 blade3 kernel: GFS: Trying to join cluster "lock_dlm",  
>> "vcomcluster:testgfs" Jul 18 14:52:53 blade3 kernel: Lock_DLM (built  
>> Apr 24 2007 12:45:57) installed Jul 18 14:52:53 blade3 kernel: dlm:  
>> connect from non cluster node Jul 18 14:52:53 blade3 kernel: dlm:  
>> connect from non cluster node END QUOTE My Virtual Machines only  
>> have one interface so I still can't figure out why this is  
>> happening. I can successfully mount the GFS partition on any one  
>> node but as soon as I try to start the clvmd on a 2nd node it hangs  
>> the whole cluster. I'm wondering if its a Qemu VM network issue?  
>> Each host can ping each other by name and ip. The cluster works fine  
>> but I cant get GFS to work on th VMs. Is it possible to debug the  
>> clvmd to see what IP Address it is sending? Thanks, Tracey Flanders
>> In a rush? Get real-time answers with Windows Live Messenger. --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
_________________________________________________________________
Spell a grand slam in this game where word skill meets World Series. Get in the game.
http://club.live.com/word_slugger.aspx?icid=word_slugger_wlhm_admod_april08



From jbrassow at redhat.com  Mon Apr 28 13:55:32 2008
From: jbrassow at redhat.com (Jonathan Brassow)
Date: Mon, 28 Apr 2008 08:55:32 -0500
Subject: [Linux-cluster] CLVMD hangs on 2nd node startup and hangs all gfs
	nodes.
In-Reply-To: <BAY123-W37455DEFD596264A266CDDD4DE0@phx.gbl>
References: <20080426160008.4626561A488@hormel.redhat.com>
	<BAY123-W37455DEFD596264A266CDDD4DE0@phx.gbl>
Message-ID: <88476944-2A2D-43EE-89B1-6CD71EAC3DB6@redhat.com>

I just got a blank page for you cluster.conf.  It must be /really/  
slimmed down. ;)

  brassow

On Apr 28, 2008, at 7:15 AM, Tracey Flanders wrote:

>
>
> I tested creating  a GFS disk with 2 nodes started in the cluster  
> without using LVMs and CLVMD stop.
> I mounted the disk on the first node but when I mounted the 2nd node  
> it did the same thing. So it seems its something
> other than CLVMD. I've attached my cluster.conf. It's kind of dumbed  
> down because I was troubleshooting. So I removed the GFS mount for a  
> services, etc.
> This the config I used.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>> If you suspect a problem with clvmd, you could simply remove it from
>> the equation and retest, right?
>>
>> You could just use the underlying iSCSI device and mkfs.gfs on
>> that.... at least for testing if clvmd is the problem.
>>
>> I suppose you could also test if clvmd is the problem by testing the
>> logical volumes without GFS in the mix.  IOW, create some LVs and  
>> read/
>> write to them at the same time from different machines.  If this is
>> working, the file system should work.  If the file system doesn't,
>> then the problem is probably higher up than clvmd.
>>
>>  brassow
>>
>> On Apr 25, 2008, at 11:34 AM, Tracey Flanders wrote:
>>
>>> I've been trying to setup a 3 server cluster with GFS mounted over
>>> iSCSI on Qemu Virtual Machines. A 4th server acts as a iSCSI Target.
>>> I found and article that explains my issue, but I can't seem to
>>> figure out what the solution is. QUOTED from :http://kbase.redhat.com/faq/FAQ_51_10923.shtm
>>> After successfully setting up a cluster, cman_tool shows the
>>> cluster is healthy. Mounting the gfs mount on the first node works
>>> successfully. However, when mounting gfs on the second node, the
>>> mount command hangs. Writing to a file on the first node also hangs.
>>> On the second node, the following error is seen in /var/log/
>>> messages: Jul 18 14:49:27 blade3 kernel: Lock_Harness 2.6.9-72.2
>>> (built Apr 24 2007 12:45:55) installed Jul 18 14:49:27 blade3
>>> kernel: GFS 2.6.9-72.2 (built Apr 24 2007 12:46:12) installed Jul 18
>>> 14:52:53 blade3 kernel: GFS: Trying to join cluster "lock_dlm",
>>> "vcomcluster:testgfs" Jul 18 14:52:53 blade3 kernel: Lock_DLM (built
>>> Apr 24 2007 12:45:57) installed Jul 18 14:52:53 blade3 kernel: dlm:
>>> connect from non cluster node Jul 18 14:52:53 blade3 kernel: dlm:
>>> connect from non cluster node END QUOTE My Virtual Machines only
>>> have one interface so I still can't figure out why this is
>>> happening. I can successfully mount the GFS partition on any one
>>> node but as soon as I try to start the clvmd on a 2nd node it hangs
>>> the whole cluster. I'm wondering if its a Qemu VM network issue?
>>> Each host can ping each other by name and ip. The cluster works fine
>>> but I cant get GFS to work on th VMs. Is it possible to debug the
>>> clvmd to see what IP Address it is sending? Thanks, Tracey Flanders
>>> In a rush? Get real-time answers with Windows Live Messenger. --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
> _________________________________________________________________
> Spell a grand slam in this game where word skill meets World Series.  
> Get in the game.
> http://club.live.com/word_slugger.aspx?icid=word_slugger_wlhm_admod_april08
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From fdinitto at redhat.com  Mon Apr 28 17:37:53 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Mon, 28 Apr 2008 19:37:53 +0200 (CEST)
Subject: [Linux-cluster] Cluster 2.03.02 released
Message-ID: <Pine.LNX.4.64.0804281923590.12539@trider-g7>


The cluster team and its vibrant community are proud to announce the 4rd
release from the STABLE2 branch: 2.03.03.

The STABLE2 branch collects, on a daily base, all bug fixes and the bare 
minimal changes required to run the cluster on top of the most recent 
Linux kernel (2.6.25) and rock solid openais (0.80.3 or higher).

The 2.03.02 release features a major rework in the fence subsystem by a 
complete rewrite of 4 fence agents (apc, bladecenter, ilo and wti) and a 
brand new one (drac5), all featuring ssh support. As bonus track, the 
fence_apc_snmp agent has been readded to the release.

The new source tarball can be downloaded here:

   ftp://sources.redhat.com/pub/cluster/releases/cluster-2.03.02.tar.gz

In order to use GFS1, the Linux kernel requires a minimal patch:

   ftp://sources.redhat.com/pub/cluster/releases/lockproto-exports.patch

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Happy clustering,
Fabio

footnote to packagers: the new fence agent library requires the python 
expect extension (called pexpect in most, if not all, Linux 
distributions).

Under the hood (from 2.03.01):

Benjamin Marzinski (1):
       The gnbd kernel module on 64 bit architectures didn't handle ioctls from 32 bit

Fabio M. Di Nitto (23):
       [rgmanager] Remove obsolete clushutdown utility
       [CMAN] Drop dependency on libdevmapper
       [CMAN] qdisk: add credits to Joel
       [MISC] Update copyright headers
       [MISC] Update Red Hat main copyright file
       [BUILD] Fix kernel check for good
       [FENCE] Enable fence_apc_snmp
       [FENCE] apc_snmp: allow paths to snmp binaries to be configurable
       [BUILD] Add fencelibdir support
       [FENCE] Move apc_snmp README where it belongs
       [FENCE] Move apc_snmp README where it belongs
       [FENCE] Remove obsoleted fence_apc perl implementation
       [BUILD] Royal cleanup of the fence agents build system
       [BUILD] Enable build and install of experimental fence agents
       [FENCE] Fix fencelib to pring version and copyright
       [FENCE] Make sure to version and copyright all built files
       [BUILD] Fix clean target for experimental fence/agents/lib
       [MISC] Update copyright headers
       [BUILD] Deal with the new libfence properly
       [FENCE] Enable new fence agents by default
       [RGMANAGER] Fix uninstall target
       [BUILD] Fix install/uninstall targets for fence/agents/lib
       [BUILD] Fix fence lib install target

Jonathan Brassow (1):
       rgmanager/lvm.metadata: Fix parameter description fields

Lon Hohberger (7):
       Remove clushutdown man page references from clusvcadm.8; resolves #324151
       [fence] Close file descriptors that are in invalid/error states
       [cman] Merge scandisk & fixes from RHEL5 branch
       [cman] Fix qdisk Makefile / disk_util merge bugs
       [cman] Make mkqdisk print all device paths
       [cman/qdisk] Fix type pun errors in proc.c
       [fence] Preliminary TPS/NBB/NPS support in new WTI agent.

Marek 'marx' Grac (4):
       fence/agents:	New fencings agents
       fence/agents:	WTI agents merged
       fence/agents:	Add obsolete options
       [RGMANAGER] Fixed typo in mysql.metadata

  Makefile                                      |    2 +-
  cman/qdisk/Makefile                           |    3 +-
  cman/qdisk/disk.c                             |  207 +-
  cman/qdisk/disk.h                             |   62 +-
  cman/qdisk/disk_util.c                        |   21 +-
  cman/qdisk/main.c                             |  157 +-
  cman/qdisk/mkqdisk.c                          |   15 +-
  cman/qdisk/proc.c                             |  267 +-
  cman/qdisk/scandisk.c                         |  751 +
  cman/qdisk/scandisk.h                         |  105 +
  config/copyright.cf                           |    4 +-
  configure                                     |   77 +-
  fence/agents/Makefile                         |    2 +-
  fence/agents/apc/Makefile                     |    5 +-
  fence/agents/apc/README                       |   47 -
  fence/agents/apc/fence_apc.pl                 |  469 -
  fence/agents/apc/fence_apc.py                 |  899 +-
  fence/agents/apc/fence_apc_snmp.py            |  367 -
  fence/agents/apc/powernet369.mib              |31109 -------------------------
  fence/agents/apc_snmp/Makefile                |   18 +
  fence/agents/apc_snmp/README                  |   47 +
  fence/agents/apc_snmp/fence_apc_snmp.py       |  367 +
  fence/agents/apc_snmp/powernet369.mib         |31109 +++++++++++++++++++++++++
  fence/agents/baytech/Makefile                 |    2 +-
  fence/agents/bladecenter/Makefile             |    5 +-
  fence/agents/bladecenter/fence_bladecenter.pl |  402 +-
  fence/agents/brocade/Makefile                 |    2 +-
  fence/agents/bullpap/Makefile                 |    2 +-
  fence/agents/cpint/Makefile                   |    2 +-
  fence/agents/drac/Makefile                    |    7 +-
  fence/agents/drac/fence_drac5.py              |   79 +
  fence/agents/egenera/Makefile                 |    2 +-
  fence/agents/ibmblade/Makefile                |    2 +-
  fence/agents/ilo/Makefile                     |    5 +-
  fence/agents/ilo/fence_ilo.pl                 |  655 -
  fence/agents/ilo/fence_ilo.py                 |   77 +
  fence/agents/lib/Makefile                     |   21 +
  fence/agents/lib/fencing.py.py                |  347 +
  fence/agents/mcdata/Makefile                  |    2 +-
  fence/agents/rsa/Makefile                     |    2 +-
  fence/agents/rsb/Makefile                     |    2 +-
  fence/agents/sanbox2/Makefile                 |    2 +-
  fence/agents/scsi/Makefile                    |   32 +-
  fence/agents/vixel/Makefile                   |    2 +-
  fence/agents/vmware/Makefile                  |    2 +-
  fence/agents/wti/Makefile                     |    5 +-
  fence/agents/wti/fence_wti.pl                 |  384 -
  fence/agents/wti/fence_wti.py                 |  109 +
  fence/agents/xcat/Makefile                    |    2 +-
  fence/agents/zvm/Makefile                     |    2 +-
  fence/fenced/main.c                           |    2 +-
  gnbd-kernel/src/gnbd.c                        |   33 +
  gnbd/server/device.c                          |    7 +-
  group/daemon/main.c                           |    2 +-
  make/clean.mk                                 |    2 +-
  make/defines.mk.input                         |    6 +-
  make/fencebuild.mk                            |   26 +
  make/fenceperl.mk                             |   18 -
  make/fencepy.mk                               |   18 -
  make/install.mk                               |    8 +
  make/uninstall.mk                             |    6 +
  rgmanager/man/clushutdown.8                   |   13 -
  rgmanager/man/clusvcadm.8                     |   13 +-
  rgmanager/src/resources/Makefile              |    5 +-
  rgmanager/src/resources/lvm.metadata          |   12 +-
  rgmanager/src/resources/mysql.metadata        |    4 +-
  rgmanager/src/utils/clushutdown               |   53 -
  scripts/define2var                            |   71 -
  scripts/fenceparse                            |   47 +
  69 files changed, 33884 insertions(+), 34728 deletions(-)

--
I'm going to make him an offer he can't refuse.



From rhurst at bidmc.harvard.edu  Mon Apr 28 18:40:47 2008
From: rhurst at bidmc.harvard.edu (rhurst at bidmc.harvard.edu)
Date: Mon, 28 Apr 2008 14:40:47 -0400
Subject: [Linux-cluster] Cluster 2.03.02 released
In-Reply-To: <Pine.LNX.4.64.0804281923590.12539@trider-g7>
References: <Pine.LNX.4.64.0804281923590.12539@trider-g7>
Message-ID: <1209408047.7343.86.camel@WSBID06223.bidmc.harvard.edu>

The "Ford" release?  Is that in compliance with the GPL?  ;^)


On Mon, 2008-04-28 at 19:37 +0200, Fabio M. Di Nitto wrote:

> The cluster team and its vibrant community are proud to announce the 4rd
> release from the STABLE2 branch: 2.03.03.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080428/2c10d7a0/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 3227 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080428/2c10d7a0/attachment.p7s>

From fdinitto at redhat.com  Mon Apr 28 18:43:45 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Mon, 28 Apr 2008 20:43:45 +0200 (CEST)
Subject: [Linux-cluster] Cluster 2.99.00 (development snapshot) released
Message-ID: <Pine.LNX.4.64.0804282005530.12539@trider-g7>


The cluster team and its community are proud to announce the 1st release 
from the master branch: 2.99.00.

The 2.99.XX releases are _NOT_ meant to be used for production 
environments.. yet.

You have been warned: *this code will have no mercy* for your servers and 
your data.

The master branch is the main development tree that receives all new
features, code, clean up and a whole brand new set of bugs,

At some point in time this code will become the 3.0 stable release.

Everybody with test equipment and time to spare, is highly encouraged to 
download, install and test the 2.99 releases and more important report 
problems.

In order to build the 2.99.00 release you will need:

- openais (http://www.openais.org/doku.php?id=developers) latest svn 
checkout (rev 1528 or higher, soon to be 0.83)

- linux kernel (git snapshot or 2.6.26) from
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git
(but can run on 2.6.25 in compatibility mode)

NOTE to packagers: the library API/ABI's are _NOT_ stable (hence 2.9). We 
are still shipping shared libraries but remember that they can change 
anytime without warning. A bunch of new shared libraries have been added 
and more will come.

The new source tarball can be downloaded here:

   ftp://sources.redhat.com/pub/cluster/releases/cluster-2.99.00.tar.gz

In order to use GFS1, the Linux kernel requires a minimal patch:

   ftp://sources.redhat.com/pub/cluster/releases/lockproto-exports.patch

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Happy clustering,
Fabio

Under the hood (from 2.03.02):

Abhijith Das (6):
       gfs2_tool: remove 'gfs2_tool counters' as they aren't implemented anymore
       gfs-kernel: fix for bz 429343 gfs_glock_is_locked_by_me assertion
       gfs2_tool manpage: gfs2_tool counters doesn't exist anymore.
       gfs2_tool: Fix build warnings in misc.c bz 441636
       gfs2_tool manpage: Updates to the manpage for bz441636
       gfs2_tool: Fix build warnings in misc.c bz 441636

Andrew Price (4):
       [[BUILD] Warn and continue if CONFIG_KERNELVERSION is not found
       [GFS2] gfs2_fsck: Fix operation on 'ptr' may be undefined warnings
       [GFS2] Remove unrequired header file
       [GFS2] gfs2_edit: Remove duplicate linux_endian.h

Benjamin Marzinski (1):
       The gnbd kernel module on 64 bit architectures didn't handle ioctls from 32 bit

Bob Peterson (14):
       Resolves: bz 435917: GFS2: mkfs.gfs2 default lock protocol
       Resolves: bz 421761: 'gfs_tool lockdump' wrongly says 'unknown
       Resolves: bz 431945: GFS: gfs-kernel should use device major:minor
       Merge branch 'master' of ssh://sources.redhat.com/git/cluster into master.bz431945
       Update to prior commit for bz431945: I forgot that STABLE2
       Resolves: bz 436383: GFS filesystem size inconsistent
       Fix savemeta so it saves gfs-1 rg information properly
       Fix gfs2_edit print options (-p) to work properly for gfs-1
       gfs2_edit was not recalculating the max block size after it figured
       Fix some compiler warnings in gfs2_edit
       bz440896/440897 GFS: gfs_fsck should repair gfs_grow corruption
       bz425421: gfs mount attempt hangs if no more journals available
       bz438762: gfs_tool: Cannot allocate memory
       bz295301: Need man page for gfs_edit

Chris Feist (1):
       Added back in change to description line to make chkconfig work properly.

Christine Caulfield (19):
       cman3 commit
       [CMAN] Don't ignore cman_tool version
       [CMAN] Remove deleted nodes from our list
       Merge branch 'cman3'
       Fix multicast display in 'cman_tool status'
       Initialise votes to 0
       [DLM] Don't segfault if lvbptr is NULL
       [CCS] Fix the config loader for good
       [CMAN] Make cman cope with the new objdb structure
       [CMAN] Free up any queued messages when someone disconnects
       [CMAN] Limit outstanding replies
       [CMAN] Don't declare a variable in the middle of a block
       [CMAN] valid port number & don't use it before validation
       [DLM] Mention lidlm_lt in the man page
       Remove references to broadcast.
       [CMAN] Save the new expected_votes when a node is removed
       [MISC] Make it build with gcc 4.3
       [FENCE] Make it build with gcc 4.3
       [CMAN] Disallow a new dirty node from joining the cman cluster

David Teigland (16):
       groupd: purge messages from dead nodes
       dlm_tool: print correct rq mode in lockdump
       libdlm: fix lvb copying
       dlm_controld: new version
       dlm_controld: quorum checking
       libdlm: max name length sanity
       dlm_controld: max name length sanity
       gfs: don't cancel glocks when writing to hidden file
       gfs_controld: retry recovery for withdrawn journal
       fenced: new version
       fenced: more new devel
       dlm_controld: build plock code
       fenced: new libfenced interface
       fence: using new libs
       fenced: process queries in a thread
       fenced: allow queries during fencing; group queries

Fabio M. Di Nitto (65):
       [CMAN] Move ccs config ais module into ccs/ccsais
       [BUILD] Fix configure script to handle releases
       [CCS] Upload all subsystem configs into objdb
       Add toplevel .gitignore
       [BUILD] Allow release version to contain padding 0's
       [CCS] Fix xml -> objdb config import
       [CCS] Cleanup duplicate vars from previous commit
       [CCS] Fix possible memory corruption on double free
       [CMAN] Drop dependency on libdevmapper
       [CMAN] Fix building when -DDEBUG is not specified
       [BUILD] Fix handling of version and libraries soname
       [BUILD] Update .gitignore for .o and .d files
       [BUILD] Set -MMD as default CFLAGS
       [BUILD] Fix man page install permission
       [CMAN] Fix config handling
       [CMAN] Do not duplicate entries in the objdb
       [CMAN] qdisk: add credits to Joel
       [FENCE] Enable fence_apc_snmp
       [FENCE] apc_snmp: allow paths to snmp binaries to be configurable
       [BUILD] Remove extra debugging entry
       [BUILD] Fix fenceperl and fencepy make snippets to allow multiple targets
       [BUILD] Add fencelibdir support
       [BUILD] add enable_experimental_fence_agents configure option
       [FENCE] Move apc_snmp README where it belongs
       [FENCE] Move apc_snmp README where it belongs
       [FENCE] Remove obsoleted fence_apc perl implementation
       [BUILD] Royal cleanup of the fence agents build system
       [BUILD] Enable build and install of experimental fence agents
       [FENCE] Fix fencelib to pring version and copyright
       [FENCE] Make sure to version and copyright all built files
       [BUILD] Fix clean target for experimental fence/agents/lib
       Revert "Fix help message to refer to script as 'fence_scsi_test'."
       Revert "fix bz277781 by accepting "nodename" as a synonym for "node""
       [KERNEL] Update modules to build with 2.6.25
       [GFS2] Fix build warning
       [BUILD] Add --enable_crack_of_the_day configure option
       [BUILD] Fix typo
       [BUILD] Set automatically cflags when building experimental bits
       [GROUP] Fix building with standard kernels
       Revert "gfs2_tool: Fix build warnings in misc.c bz 441636"
       [BUILD] Fix clean target
       [RGMANAGER] Fix build with gcc4.3
       [GFS2] Fix build warning
       [rgmanager] Remove obsolete clushutdown utility
       [CCS] libraries should never log
       [CCS] Convert to logsys
       [MISC] Update copyright headers
       [MISC] Update Red Hat main copyright file
       [BUILD] Deal with the new libfence properly
       [BUILD] Fix install/uninstall targets for fence/agents/lib
       [BUILD] Allow users to set default log dir and syslog facility
       [CCS] Switch to use user selected logdir and syslogfacility
       [CMAN] Convert qdiskd to use logsys
       [CMAN] Use build/user defined default logging facility
       [CCS] Document -d (debugging) switch
       [CCS] Allow ccsd logging level and facility to be set by cluster.conf
       [BUILD] Deal with new libfenced
       [BUILD] Fix building with separate object dir
       [BUILD] Fix kernel check for good
       [BUILD] Fix fence lib install target
       [GROUP] Apply patch to make gfs_controld work with 2.6.26
       [FENCE] Enable new fence agents by default
       [RGMANAGER] Fix uninstall target
       [BUILD] Fix build order. Gotta love circular build depends...
       [CMAN] Setup logging file

Joel Becker (1):
       libdlm: Don't pass LKF_WAIT to the kernel

Jonathan Brassow (5):
       rgmanager/lvm.sh: Fix bug 438816
       rgmanager/lvm.sh:  Fix bug bz242798
       rgmanager/lvm.sh: change argument order of shell command
       rgmanager/lvm.sh:  Minor comment updates
       rgmanager/lvm.metadata: Fix parameter description fields

Lon Hohberger (26):
       Correct incorrect netmask handling in ip.sh
       * Make fence_ack_manual.sh accept -n
       Fix #435189 - fenced override doesn't allow rgmanager to recover because
       Merge branch 'master' of ssh://lhh at sources.redhat.com/git/cluster
       Add Sybase failover agent
       Update changelog
       Add / fix Oracle 10g failover agent
       Merge branch 'master' of ssh://lhh at sources.redhat.com/git/cluster
       [fence] Make fence_xvmd support reloading of key files on the fly.
       [CMAN] Fix "Node X is undead" loop bug
       [rgmanager] Don't call quotaoff if quotas are not used
       [rgmanager] Make ip.sh check link states of non-ethernet devices
       [rgmanager] Set cloexec bit in msg_socket.c
       [rgmanager] Fix #432998
       [rgmanager] Remove unused lockspace.c file
       [cman] Merge scandisk & fixes from RHEL5 branch
       [cman] Fix qdisk Makefile / disk_util merge bugs
       [cman] Make mkqdisk print all device paths
       [cman] Apply missing fix for #315711
       [cman/qdisk] Fix type pun errors in proc.c
       [CMAN] Make cman init script start qdiskd intelligently
       Revert "[CMAN] Make cman init script start qdiskd intelligently"
       [CMAN] Make cman init script start qdiskd intelligently
       [fence] Preliminary TPS/NBB/NPS support in new WTI agent.
       [fence] Close file descriptors that are in invalid/error states
       Remove clushutdown man page references from clusvcadm.8; resolves #324151

Marek 'marx' Grac (4):
       fence/agents:	New fencings agents
       fence/agents:	WTI agents merged
       fence/agents:	Add obsolete options
       [RGMANAGER] Fixed typo in mysql.metadata

Ryan McCabe (6):
       Fix bz434790
       Fix a few misspellings
       Merge branch 'master' of ssh://sources.redhat.com/git/cluster
       Feeling pedantic. More spelling fixes.
       Merge branch 'master' of ssh://sources.redhat.com/git/cluster
       fix bz277781 by accepting "nodename" as a synonym for "node"

Ryan O'Hara (15):
       Variable should be quoted in conditional statement.
       Fix unregister code to report failure correctly.
       Remove "self" parameter. This was used to specify the name of the node
       Fix code to use get_key subroutine.
       Fix split calls to be consistent. Remove the optional LIMIT parameter.
       Replace /var/lock/subsys/${0##*/} with /var/lock/subsys/scsi_reserve.
       Fix success/failure reporting when registering devices at startup.
       Rewrite of get_scsi_devices function.
       Record devices that are successfully registered to /var/run/scsi_reserve.
       Allow 'stop' to release the reservation if and only if there are no other
       Attempt to register the node in the case where it must perform fence_scsi
       Fix help message to refer to script as 'fence_scsi_test'.
       BZ 248715
       BZ: 373491, 373511, 373531, 373541, 373571, 429033
       BZ 441323 : Redirect stderr to /dev/null when getting list of devices.

jparsons (1):
       Bump MAX_DEVICES in fenced from 4 to 8

  .gitignore                       |    2 +
  Makefile                         |    6 +-
  ccs/Makefile                     |    2 +-
  ccs/ccs_test/Makefile            |    2 +
  ccs/ccs_tool/Makefile            |    2 +
  ccs/ccsais/Makefile              |   45 +
  ccs/ccsais/config.c              |  257 +++++
  ccs/common/log.c                 |   23 -
  ccs/common/log.h                 |   98 --
  ccs/daemon/Makefile              |    5 +-
  ccs/daemon/ccsd.c                |  181 ++--
  ccs/daemon/cluster_mgr.c         |  114 +-
  ccs/daemon/cnx_mgr.c             |  340 +++---
  ccs/daemon/misc.c                |  115 ++-
  ccs/daemon/misc.h                |    2 +-
  ccs/include/debug.h              |   13 +-
  ccs/lib/Makefile                 |   13 +-
  ccs/lib/libccs.c                 |   54 +-
  ccs/man/ccsd.8                   |    3 +
  cman/cman_tool/Makefile          |    2 +
  cman/cman_tool/cman_tool.h       |    5 +-
  cman/cman_tool/join.c            |   10 +-
  cman/cman_tool/main.c            |   49 +-
  cman/daemon/Makefile             |   31 +-
  cman/daemon/ais.c                |  340 +-----
  cman/daemon/ais.h                |    7 +-
  cman/daemon/barrier.c            |    2 +-
  cman/daemon/barrier.h            |    2 +-
  cman/daemon/cman-preconfig.c     |  900 +++++++++++++++
  cman/daemon/cman.h               |   15 +
  cman/daemon/cmanccs.c            |  924 ---------------
  cman/daemon/cmanccs.h            |   14 -
  cman/daemon/cmanconfig.c         |  310 +++++
  cman/daemon/cmanconfig.h         |   45 +
  cman/daemon/cnxman-private.h     |   17 +-
  cman/daemon/cnxman-socket.h      |    3 +-
  cman/daemon/commands.c           |  133 ++-
  cman/daemon/commands.h           |   11 +-
  cman/daemon/config.c             |  151 ---
  cman/daemon/daemon.c             |    2 +
  cman/daemon/daemon.h             |    1 -
  cman/daemon/logging.c            |   36 +-
  cman/daemon/logging.h            |   18 +-
  cman/daemon/nodelist.h           |   73 ++
  cman/lib/Makefile                |    8 +-
  cman/lib/libcman.c               |   10 +
  cman/lib/libcman.h               |    3 +
  cman/man/cman_tool.8             |   57 +-
  cman/qdisk/Makefile              |   10 +-
  cman/qdisk/clulog.c              |  291 -----
  cman/qdisk/clulog.h              |  161 ---
  cman/qdisk/daemon_init.c         |    8 +-
  cman/qdisk/disk.c                |   80 +-
  cman/qdisk/disk_util.c           |    6 +-
  cman/qdisk/main.c                |  156 ++--
  cman/qdisk/mkqdisk.c             |   22 +-
  cman/qdisk/proc.c                |   42 +-
  cman/qdisk/score.c               |   23 +-
  configure                        |   67 +-
  dlm/lib/Makefile                 |    4 +
  dlm/lib/libdlm.c                 |   36 +-
  dlm/lib/libdlm.h                 |    7 +
  dlm/man/libdlm.3                 |    3 +
  dlm/tool/Makefile                |    2 +
  doc/usage.txt                    |   87 +-
  fence/Makefile                   |    4 +-
  fence/agents/ipmilan/Makefile    |    2 +
  fence/agents/rackswitch/Makefile |    2 +
  fence/agents/rps10/Makefile      |    2 +
  fence/agents/xvm/Makefile        |    5 +
  fence/agents/xvm/fence_xvm.c     |    2 +-
  fence/agents/xvm/fence_xvmd.c    |   31 +-
  fence/agents/xvm/simple_auth.c   |    2 +
  fence/agents/xvm/xvm.h           |    1 +
  fence/fence_node/Makefile        |   14 +-
  fence/fence_node/fence_node.c    |   22 +-
  fence/fence_tool/Makefile        |   18 +-
  fence/fence_tool/fence_tool.c    |  127 +--
  fence/fenced/Makefile            |   22 +-
  fence/fenced/agent.c             |  382 -------
  fence/fenced/config.c            |  140 +++
  fence/fenced/cpg.c               | 1305 +++++++++++++++++++++
  fence/fenced/fd.h                |  311 ++++--
  fence/fenced/fenced.h            |   46 +
  fence/fenced/group.c             |  324 +++++-
  fence/fenced/main.c              |  967 ++++++++++-------
  fence/fenced/member_cman.c       |  133 ++--
  fence/fenced/recover.c           |  349 ++-----
  fence/include/linux_endian.h     |   81 ++
  fence/include/list.h             |   11 +
  fence/libfence/Makefile          |   54 +
  fence/libfence/agent.c           |  350 ++++++
  fence/libfence/libfence.h        |   36 +
  fence/libfenced/Makefile         |   52 +
  fence/libfenced/libfenced.h      |   53 +
  fence/libfenced/main.c           |  327 ++++++
  gfs/gfs_debug/Makefile           |    2 +
  gfs/gfs_edit/Makefile            |    2 +
  gfs/gfs_fsck/Makefile            |    3 +-
  gfs/gfs_grow/Makefile            |    2 +
  gfs/gfs_jadd/Makefile            |    2 +
  gfs/gfs_mkfs/Makefile            |    2 +
  gfs/gfs_quota/Makefile           |    2 +
  gfs/gfs_tool/Makefile            |    2 +
  gfs/libgfs/Makefile              |    2 +
  gfs/tests/filecon2/Makefile      |    4 +-
  gfs/tests/mmdd/Makefile          |    8 +-
  gfs2/convert/Makefile            |    2 +
  gfs2/edit/Makefile               |    2 +
  gfs2/fsck/Makefile               |    1 -
  gfs2/include/list.h              |  325 ------
  gfs2/libgfs2/Makefile            |    3 +
  gfs2/mkfs/Makefile               |    2 +
  gfs2/mount/Makefile              |    4 +
  gfs2/quota/Makefile              |    2 +
  gfs2/tool/Makefile               |    2 +
  gnbd/client/Makefile             |    4 +
  gnbd/server/Makefile             |    4 +
  gnbd/tools/fence_gnbd/Makefile   |    2 +
  gnbd/tools/gnbd_export/Makefile  |    2 +
  gnbd/tools/gnbd_import/Makefile  |    2 +
  gnbd/utils/Makefile              |    6 +-
  group/daemon/Makefile            |    2 +
  group/dlm_controld/Makefile      |   19 +-
  group/dlm_controld/action.c      |  443 +++-----
  group/dlm_controld/config.c      |  288 +++++
  group/dlm_controld/config.h      |   56 +
  group/dlm_controld/cpg.c         | 1441 ++++++++++++++++++++++++
  group/dlm_controld/crc.c         |   85 ++
  group/dlm_controld/deadlock.c    |  326 +-----
  group/dlm_controld/dlm_daemon.h  |  220 +++-
  group/dlm_controld/group.c       |   49 +-
  group/dlm_controld/main.c        |  604 +++++-----
  group/dlm_controld/member_cman.c |   11 +-
  group/dlm_controld/netlink.c     |  237 ++++
  group/dlm_controld/plock.c       | 2315 ++++++++++++++++++++++++++++++++++++++
  group/gfs_controld/Makefile      |    2 +
  group/gfs_controld/lock_dlm.h    |    1 +
  group/gfs_controld/plock.c       |  256 ++++-
  group/include/list.h             |   11 +
  group/lib/Makefile               |    5 +-
  make/clean.mk                    |    2 +-
  make/defines.mk.input            |    8 +-
  rgmanager/src/clulib/Makefile    |    4 +
  rgmanager/src/clulib/lockspace.c |   93 --
  rgmanager/src/daemons/Makefile   |    6 +
  rgmanager/src/utils/Makefile     |   21 +-
  147 files changed, 11984 insertions(+), 5551 deletions(-)

--
I'm going to make him an offer he can't refuse.



From fdinitto at redhat.com  Mon Apr 28 18:49:14 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Mon, 28 Apr 2008 20:49:14 +0200 (CEST)
Subject: [Linux-cluster] Cluster 2.03.02 released
In-Reply-To: <1209408047.7343.86.camel@WSBID06223.bidmc.harvard.edu>
References: <Pine.LNX.4.64.0804281923590.12539@trider-g7>
	<1209408047.7343.86.camel@WSBID06223.bidmc.harvard.edu>
Message-ID: <Pine.LNX.4.64.0804282047540.12539@trider-g7>

On Mon, 28 Apr 2008, rhurst at bidmc.harvard.edu wrote:

> The "Ford" release?  Is that in compliance with the GPL?  ;^)

I knew somebody was going to spot the typo sooner or later :)

so much for copy&paste from templates ;)

> On Mon, 2008-04-28 at 19:37 +0200, Fabio M. Di Nitto wrote:
>
>> The cluster team and its vibrant community are proud to announce the 4rd
>> release from the STABLE2 branch: 2.03.03.

at least the version was unnoticed :)  ^^^^

Fabio

--
I'm going to make him an offer he can't refuse.



From lhh at redhat.com  Mon Apr 28 19:23:00 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 28 Apr 2008 15:23:00 -0400
Subject: qdisk and voting in RHEL4 CS (Re: [Linux-cluster] Is heuristic
	really required?)
In-Reply-To: <Pine.LNX.4.44.0804251412490.9517-100000@allspice.nssg.mitel.com>
References: <Pine.LNX.4.44.0804251412490.9517-100000@allspice.nssg.mitel.com>
Message-ID: <1209410580.12537.21.camel@ayanami.boston.devel.redhat.com>

On Fri, 2008-04-25 at 14:18 -0400, Charlie Brady wrote:
> On Thu, 24 Apr 2008, Lon Hohberger wrote:
> 
> > It's not really a node, and is not reported by CMAN in the nodes list as
> > it is in RHEL5/STABLE2 branches.

> Not so, it is sometimes reported by cman in the nodes list. Herewith the 
> /proc/cluster/nodes file from four nodes of a cluster, each with qdiskd 
> running:

Hmm, so the the quorum disk information is not reported via the cman
socket API.  So, for example, "clustat" never shows it.

The /proc/cluster/nodes file doesn't follow the same code path so it
seems.

The fact that it does not appear on one of the nodes is strange
(in /proc/cluster/nodes).  I would expect it to show up on all of them
after ~10-15 seconds of run-time of qdiskd.

-- Lon



From jboot at sina.com  Tue Apr 29 03:33:39 2008
From: jboot at sina.com (jboot)
Date: Tue, 29 Apr 2008 11:33:39 +0800
Subject: [Linux-cluster] about luci start
Message-ID: <200804291133346568292@sina.com>

Gonga is a nice GUI, but for some reson I cannot open Conga?s URL from remote machine, only from Conga?s box itself.
When I?m trying to access conga from remote, I get ?Please wait, redirecting?? for forever.

my browser is IE6, maybe I need firefox???

2008-04-29 



jboot 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/2d714c00/attachment.htm>

From kri_thi at yahoo.com  Tue Apr 29 04:33:12 2008
From: kri_thi at yahoo.com (krishnamurthi G)
Date: Mon, 28 Apr 2008 21:33:12 -0700 (PDT)
Subject: [Linux-cluster] RHEL5 clustersuite need few clarification
Message-ID: <587027.3208.qm@web90403.mail.mud.yahoo.com>

Hi Friends,

I am working on a RHEL 5 custer. I used RHEL4
earlier and going to support RHEL5 ( Linux kernel 2.6) and I need few
clarifications for below questions.
1) How to get the cluster name for RHEL 5 ( What CLI would display the name). For RHEL 4 I used "/sbin/cman_tool status" command
2)
Where is the configuration file available for RHEL5.0 or is it
configurable? ( For RHEL 4 ituse to be at "/etc/cluster/cluster.conf")

3) I am getting "CMAN is not running" while running /usr/sbin/clustat. How o run cman I could see /usr/sbin/cman_tool.

4)
How to get a virtual IP for a given HA resource group. For RHEL4
clustersuite it was extracted from "/etc/cluster/cluster.conf" parsing
"ip address=".

I appreciate your help on this.
Thanks in advance
- Krishna


      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080428/c3b0b949/attachment.htm>

From Santosh.Panigrahi at in.unisys.com  Tue Apr 29 07:03:15 2008
From: Santosh.Panigrahi at in.unisys.com (Panigrahi, Santosh Kumar)
Date: Tue, 29 Apr 2008 12:33:15 +0530
Subject: [Linux-cluster] RHEL5 clustersuite need few clarification
In-Reply-To: <587027.3208.qm@web90403.mail.mud.yahoo.com>
References: <587027.3208.qm@web90403.mail.mud.yahoo.com>
Message-ID: <D566E8CF3538B54D95B925CB69CB4D2A0FA83777@inblr-exch1.eu.uis.unisys.com>

Hello KG,

 

All the below points asked about Rhel5 are same as Rhel4. In rhel5 cman,
ccsd and fenced services are all clubbed into a single service as cman.
Rgmanager is as it is. Rhel5 has a web based conga GUI tool (ricci/luci)
to configure a cluster beside system-config-cluster standalone tool.

 

Thanks,

Santosh

________________________________

From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of krishnamurthi G
Sent: Tuesday, April 29, 2008 10:03 AM
To: Linux-cluster at redhat.com
Subject: [Linux-cluster] RHEL5 clustersuite need few clarification

 

Hi Friends,

I am working on a RHEL 5 custer. I used RHEL4 earlier and going to
support RHEL5 ( Linux kernel 2.6) and I need few clarifications for
below questions.
1) How to get the cluster name for RHEL 5 ( What CLI would display the
name). For RHEL 4 I used "/sbin/cman_tool status" command
2) Where is the configuration file available for RHEL5.0 or is it
configurable? ( For RHEL 4 ituse to be at "/etc/cluster/cluster.conf")

3) I am getting "CMAN is not running" while running /usr/sbin/clustat.
How o run cman I could see /usr/sbin/cman_tool.

4) How to get a virtual IP for a given HA resource group. For RHEL4
clustersuite it was extracted from "/etc/cluster/cluster.conf" parsing
"ip address=".

I appreciate your help on this.
Thanks in advance
- Krishna

 

________________________________

Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try
it now.
<http://us.rd.yahoo.com/evt=51733/*http:/mobile.yahoo.com/;_ylt=Ahu06i62
sR8HDtDypao8Wcj9tAcJ%20> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/ed3d3267/attachment.htm>

From maciej.bogucki at artegence.com  Tue Apr 29 07:54:05 2008
From: maciej.bogucki at artegence.com (Maciej Bogucki)
Date: Tue, 29 Apr 2008 09:54:05 +0200
Subject: [Linux-cluster] redhat cluster in multi SAN
In-Reply-To: <2ca799770804251015j27259601wc49d4f2547a92aa2@mail.gmail.com>
References: <f8eac96f0804250541t13f1ab81kf71bae3c206949d5@mail.gmail.com>	<D46D6BD8-930F-40C2-8153-46CE3E825B87@redhat.com>	<f8eac96f0804251006h115971aaj88a628a5a6391784@mail.gmail.com>
	<2ca799770804251015j27259601wc49d4f2547a92aa2@mail.gmail.com>
Message-ID: <4816D41D.2020400@artegence.com>

> We have a similar situation, two different sans one being a backup, and
> we use the san providers (EMC) proprietary replication software.

Does it work good? Did you have any problem with it?

Best Regards
Maciej Bogucki



From matthias.schlarb at sap.com  Tue Apr 29 10:05:42 2008
From: matthias.schlarb at sap.com (Matthias Schlarb)
Date: Tue, 29 Apr 2008 12:05:42 +0200
Subject: [Linux-cluster] shellfunction ocf_log
Message-ID: <4816F2F6.6060408@sap.com>

Hello,

in none of the resource agent scripts the ocf_log function reports 
something to syslogd (/var/log/messages or elsewhere). But executing the 
scripts manually does. Do you have an idea?

Kind regards,
-- 
Matthias Schlarb



From ccaulfie at redhat.com  Tue Apr 29 12:39:50 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 29 Apr 2008 13:39:50 +0100
Subject: [Linux-cluster] Cluster failing after rebooting a standby node
In-Reply-To: <481546CD.8010002@westnet.com.au>
References: <480EADFF.6030306@westnet.com.au> <480F3F70.3010004@redhat.com>
	<481546CD.8010002@westnet.com.au>
Message-ID: <48171716.2090900@redhat.com>

Ben J wrote:
> Hi Christine,
> 
> Thanks for the reply.
> 
> I've been able to today replicate the cluster failing again by rebooting
> one of the standby nodes.  I captured tcpdump data from 2 of the active
> nodes (store01 and store02) and from the 2 standby nodes (ha01 and
> ha02).  Ha01 is the node that we rebooted, so it will only show cluster
> communication that occurred up until it rebooted.  See attached zip file.
> 
> Note, I've sent this off-list as I didn't want to send this to the list
> for obvious reasons. :)
> 
> Let me know if you need any further information.  I've had the cluster
> running with debug level 7 logging, so I've got that information as well
> if you'd like me to shoot through that as well.
>

Thanks for the tcpdumps, they were very helpful in eliminating several
possible causes I had considered. Unfortunately I still don't quite know
what IS happening!

It seems that when one node leaves the cluster the others go into
transition MASTER state (because they all saw the node go down at the
same time) and they never resolve this state. What normally happens is
that one node will nominate itself master and take over the transition.
But it seems like this is not happening for some reason.

I did manage to reproduce it (or something very similar) on a three node
cluster yesterday, unfortunately I didn't have debugging enabled in the
modules so it didn't tell me much more (though it did tell me a little
more). I have restarted some tests and I hope they will yield some
results soon (ish).

-- 

Chrissie



From lhh at redhat.com  Tue Apr 29 14:26:40 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 29 Apr 2008 10:26:40 -0400
Subject: [Linux-cluster] shellfunction ocf_log
In-Reply-To: <4816F2F6.6060408@sap.com>
References: <4816F2F6.6060408@sap.com>
Message-ID: <1209479201.29793.1.camel@ayanami.boston.devel.redhat.com>

On Tue, 2008-04-29 at 12:05 +0200, Matthias Schlarb wrote:
> Hello,
> 
> in none of the resource agent scripts the ocf_log function reports 
> something to syslogd (/var/log/messages or elsewhere). But executing the 
> scripts manually does. Do you have an idea?

It's a bug in the scripts; it should be fixed now (in 4.7 / STABLE /
5.2 / STABLE2 / master branches).  They were not configuring themselves
correctly before.

-- Lon



From jbrassow at redhat.com  Tue Apr 29 15:31:35 2008
From: jbrassow at redhat.com (Jonathan Brassow)
Date: Tue, 29 Apr 2008 10:31:35 -0500
Subject: [Linux-cluster] RHEL5 clustersuite need few clarification
In-Reply-To: <D566E8CF3538B54D95B925CB69CB4D2A0FA83777@inblr-exch1.eu.uis.unisys.com>
References: <587027.3208.qm@web90403.mail.mud.yahoo.com>
	<D566E8CF3538B54D95B925CB69CB4D2A0FA83777@inblr-exch1.eu.uis.unisys.com>
Message-ID: <F27DA9D2-8F7F-4AB6-8AA1-9B0C1EA721ED@redhat.com>

You may also find this useful:
http://www.redhat.com/docs/manuals/enterprise/RHEL-5-manual/Cluster_Administration/ap-upgrade-rhel4-to-rhel5-CA.html

Some of the steps may not be relevant to you, but it should explain  
what has changed.

  brassow

On Apr 29, 2008, at 2:03 AM, Panigrahi, Santosh Kumar wrote:

> Hello KG,
>
> All the below points asked about Rhel5 are same as Rhel4. In rhel5  
> cman, ccsd and fenced services are all clubbed into a single service  
> as cman. Rgmanager is as it is. Rhel5 has a web based conga GUI tool  
> (ricci/luci) to configure a cluster beside system-config-cluster  
> standalone tool.
>
> Thanks,
> Santosh
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com 
> ] On Behalf Of krishnamurthi G
> Sent: Tuesday, April 29, 2008 10:03 AM
> To: Linux-cluster at redhat.com
> Subject: [Linux-cluster] RHEL5 clustersuite need few clarification
>
> Hi Friends,
>
> I am working on a RHEL 5 custer. I used RHEL4 earlier and going to  
> support RHEL5 ( Linux kernel 2.6) and I need few clarifications for  
> below questions.
> 1) How to get the cluster name for RHEL 5 ( What CLI would display  
> the name). For RHEL 4 I used "/sbin/cman_tool status" command
> 2) Where is the configuration file available for RHEL5.0 or is it  
> configurable? ( For RHEL 4 ituse to be at "/etc/cluster/cluster.conf")
>
> 3) I am getting "CMAN is not running" while running /usr/sbin/ 
> clustat. How o run cman I could see /usr/sbin/cman_tool.
>
> 4) How to get a virtual IP for a given HA resource group. For RHEL4  
> clustersuite it was extracted from "/etc/cluster/cluster.conf"  
> parsing "ip address=".
>
> I appreciate your help on this.
> Thanks in advance
> - Krishna
>
> Be a better friend, newshound, and know-it-all with Yahoo! Mobile.  
> Try it now.
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/7a7aed79/attachment.htm>

From swilson at uchicago.edu  Tue Apr 29 15:48:38 2008
From: swilson at uchicago.edu (Scott Wilson)
Date: Tue, 29 Apr 2008 10:48:38 -0500 (CDT)
Subject: [Linux-cluster] about luci start
In-Reply-To: <200804291133346568292@sina.com>
References: <200804291133346568292@sina.com>
Message-ID: <Pine.GSO.4.64.0804291045400.21737@tiki>


Did you check your firewall?  Conga uses port 8084, which you'll need to 
open specifically if you're blocking all traffic by default.

With your Windows box, open up a command prompt and try:

telnet <Luci Server> 8084

My guess is you'll get something like:
Trying....

And nothing else until you get a timeout.  If that's the case, its the 
firewall.  If you get connected, where you see "Connected to FOO, Escape 
character is BAR", then never mind me, you have some other problem.


Scott Wilson                    Lead System Administrator
swilson at uchicago.edu            NSIT - DCS - SeaUnix

On Tue, 29 Apr 2008, jboot wrote:

> Gonga is a nice GUI, but for some reson I cannot open Conga??s URL from remote machine, only from Conga??s box itself.
> When I??m trying to access conga from remote, I get ??Please wait, redirecting???? for forever.
>
> my browser is IE6, maybe I need firefox???
>
> 2008-04-29
>
>
>
> jboot
>

From garromo at us.ibm.com  Tue Apr 29 15:50:26 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Tue, 29 Apr 2008 09:50:26 -0600
Subject: [Linux-cluster] CLVMD hangs on 2nd node startup and hangs all
	gfs	nodes.
In-Reply-To: <BAY123-W90D21314D7D1E7C07442BD4DD0@phx.gbl>
Message-ID: <OF158FEFF2.11A53368-ON8725743A.0056FB60-8725743A.0056D093@us.ibm.com>

Had you seen this?

https://bugzilla.redhat.com/show_bug.cgi?id=435491

Gary




Tracey Flanders <mcse47 at hotmail.com> 
Sent by: linux-cluster-bounces at redhat.com
04/25/2008 10:34 AM
Please respond to
linux clustering <linux-cluster at redhat.com>


To
<linux-cluster at redhat.com>
cc

Subject
[Linux-cluster] CLVMD hangs on 2nd node startup and hangs all gfs nodes.






I've been trying to setup a 3 server cluster with GFS mounted over iSCSI 
on Qemu Virtual Machines. A 4th server acts as a iSCSI Target. I found and 
article that explains my issue, but I can't seem to figure out what the 
solution is. QUOTED from : http://kbase.redhat.com/faq/FAQ_51_10923.shtm 
After successfully setting up a cluster, cman_tool shows the cluster is 
healthy. Mounting the gfs mount on the first node works successfully. 
However, when mounting gfs on the second node, the mount command hangs. 
Writing to a file on the first node also hangs. On the second node, the 
following error is seen in /var/log/messages: Jul 18 14:49:27 blade3 
kernel: Lock_Harness 2.6.9-72.2 (built Apr 24 2007 12:45:55) installed Jul 
18 14:49:27 blade3 kernel: GFS 2.6.9-72.2 (built Apr 24 2007 12:46:12) 
installed Jul 18 14:52:53 blade3 kernel: GFS: Trying to join cluster 
"lock_dlm", "vcomcluster:testgfs" Jul 18 14:52:53 blade3 kernel: Lock_DLM 
(built Apr 24 2007 12:45:57) installed Jul 18 14:52:53 blade3 kernel: dlm: 
connect from non cluster node Jul 18 14:52:53 blade3 kernel: dlm: connect 
from non cluster node END QUOTE My Virtual Machines only have one 
interface so I still can't figure out why this is happening. I can 
successfully mount the GFS partition on any one node but as soon as I try 
to start the clvmd on a 2nd node it hangs the whole cluster. I'm wondering 
if its a Qemu VM network issue? Each host can ping each other by name and 
ip. The cluster works fine but I cant get GFS to work on th VMs. Is it 
possible to debug the clvmd to see what IP Address it is sending? Thanks, 
Tracey Flanders 
In a rush? Get real-time answers with Windows Live Messenger.--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/433e44bf/attachment.htm>

From garromo at us.ibm.com  Tue Apr 29 15:54:22 2008
From: garromo at us.ibm.com (Gary Romo)
Date: Tue, 29 Apr 2008 09:54:22 -0600
Subject: [Linux-cluster] GFS Storage cluster !!!!
In-Reply-To: <4811B6D9.8070907@hp.com>
Message-ID: <OF021134FB.66CE5CCC-ON8725743A.0057466D-8725743A.00572D21@us.ibm.com>

Is there a set of instructions or some basic guidlines for setting up a 
single node cluster (if you will)?

Gary




"Sutton, Harry (MSE)" <harry.sutton at hp.com> 
Sent by: linux-cluster-bounces at redhat.com
04/25/2008 04:47 AM
Please respond to
linux clustering <linux-cluster at redhat.com>


To
linux clustering <linux-cluster at redhat.com>
cc

Subject
Re: [Linux-cluster] GFS Storage cluster !!!!






Hi Vimal,

You can configure a single node to use a GFS filesystem, but you must 
use manual locking, something you definitely wouldn't do on a multi-node 
production cluster.

    /Harry

Vimal Gupta wrote:
> Hi Guys,
>
> I want a storage cluster of 3 Nodes.But  I want to test first. So for
> Testing I have one quad CPU server with CentOS 5.1. I don't have a SAN
> now . Can I configure this server as a one node cluster and use its one
> scsi HDD as GFS. Just for testing . (may be this is a stupid question).
>
>
> --
>
> Vimal Gupta
> Sr. System Administrator
> Monster.com India Pvt.Ltd.
> FC - 23, Block - B, 1st Floor, Film City, Sector - 16 A,
> NOIDA, UP 201 301, INDIA
> Ph# : +91-120-4024230 Fax: +91-40-66506449 Mobile: +91-9811150360
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> 

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/c0971d0f/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/octet-stream
Size: 6255 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/c0971d0f/attachment.obj>

From hdghoghari at yahoo.com  Tue Apr 29 17:16:04 2008
From: hdghoghari at yahoo.com (haresh ghoghari)
Date: Tue, 29 Apr 2008 10:16:04 -0700 (PDT)
Subject: [Linux-cluster] Need help for NFS Redundancy
Message-ID: <152423.87362.qm@web36102.mail.mud.yahoo.com>

Dear Techies

I am new here and I am trying configure RedHat Cluster for NFS redundancy

Please advise How I can build NFS redundancy ?

Thank you
-Haresh


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/25de46a8/attachment.htm>

From fog at t.is  Tue Apr 29 17:22:48 2008
From: fog at t.is (=?iso-8859-1?Q?Finnur_=D6rn_Gu=F0mundsson_-_TM_Software?=)
Date: Tue, 29 Apr 2008 17:22:48 -0000
Subject: [Linux-cluster] Need help for NFS Redundancy
In-Reply-To: <152423.87362.qm@web36102.mail.mud.yahoo.com>
References: <152423.87362.qm@web36102.mail.mud.yahoo.com>
Message-ID: <3DDA6E3E456E144DA3BB0A62A7F7F77902005D8A@SKYHQAMX08.klasi.is>

Hi,

 

A extremly quick search on google would have given you this url:

 

http://sources.redhat.com/cluster/doc/nfscookbook.pdf

 

 

K?r kve?ja / Best Regards,

Finnur ?rn Gu?mundsson
Network Engineer - Network Operations
RedHat Certificed Architect - RHCA

fog at t.is <mailto:fog at t.is> 

TM Software
Ur?arhvarf 6, IS-203 K?pavogur, Iceland
Tel: +354 545 3000 - fax +354 545 3610
www.tm-software.is <http://www.tm-software.is/> 

This e-mail message and any attachments are confidential and may be privileged. TM Software e-mail disclaimer: www.tm-software.is/disclaimer <http://www.tm-software.is/disclaimer>  

 

 

From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of haresh ghoghari
Sent: 29. apr?l 2008 17:16
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Need help for NFS Redundancy

 

Dear Techies

I am new here and I am trying configure RedHat Cluster for NFS redundancy

Please advise How I can build NFS redundancy ?

Thank you
-Haresh



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/299a5634/attachment.htm>

From tiagocruz at forumgdh.net  Tue Apr 29 20:40:22 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Tue, 29 Apr 2008 17:40:22 -0300
Subject: [Linux-cluster] MySQL Cluster
In-Reply-To: <48111FAD.8090805@bobich.net>
References: <1209076703.7121.21.camel@tuxkiller.ig.com.br>
	<48111FAD.8090805@bobich.net>
Message-ID: <1209501622.2889.35.camel@tuxkiller.ig.com.br>

Hello Gordan

Thanks for your attention and sorry by my late, I spend some days
reading/ testing about this subject before I come to you again :)


On Fri, 2008-04-25 at 01:02 +0100, Gordan Bobic wrote:

> Last time I tried using MySQL Cluster it was unacceptably slow. It 
> wasn't really a workable solution. MySQL replication was a much better 
> option.

Ok, good to know!


> You could use DRBD in primary-primary mode with GFS on top and external 
> locking in MySQL. It would work, but performance would suffer a great 
> deal. It depends on what kind of throughput you need.

Yes... I'm not tried this yet, maybe should I try... I'll get some good
redundancy of servers!

But, I was researching about mysql "master -> master" that should be a
good throughput/ redundancy. Did you already use this? 


> Depending on your application, you could use something like MySQL 
> round-robin replication. In that configuration, MySQL servers replicate 
> in a ring.

The Ring, it's some like this: (?)

A -> B -> C -> A

Right? So... If I lost the "B" server... "A" will can't sync "C"?

Sound like dangerous for me :-)


>  The downside is that there is a race condition inherent in 
> the design, if you think about it carefully. (DRBD+GFS+ MySQL with 
> external locking doesn't suffer the race condition.) If this potential 
> race condition is not a concern for your application, then performance 
> wise, it is likely to be the optimal solution for you. Since you said 
> that fail-over is not an option for you, I am assuming that the load you 
> plan to put through it is too great for one server to handle (I have 
> seem low-spec servers with MySQL churning out 100,000+ queries per 
> second when tuned up correctly). If you are dealing with this sort of 
> load, DRBD+GFS+MySQL isn't going to cut it and round-robin replication 
> with the mentioned race condition is pretty much your only option.

Well, this LAMP-HA will be hands a lot of end-user application, like
Wordpress, Joomla, phpBB and etc, so I don't know if I'll have race
condition :-(

Thanks!
-- 
Tiago Cruz
http://everlinux.com
Linux User #282636




From tiagocruz at forumgdh.net  Tue Apr 29 20:51:08 2008
From: tiagocruz at forumgdh.net (Tiago Cruz)
Date: Tue, 29 Apr 2008 17:51:08 -0300
Subject: [Linux-cluster] MySQL Cluster
In-Reply-To: <36F4E74FA8263744A6B016E6A461EFF603317E44@dino.eu.tieto.com>
References: <1209076703.7121.21.camel@tuxkiller.ig.com.br>
	<36F4E74FA8263744A6B016E6A461EFF603317E44@dino.eu.tieto.com>
Message-ID: <1209502268.2889.46.camel@tuxkiller.ig.com.br>

Harri,

Many thanks for your answer!

On Fri, 2008-04-25 at 07:38 +0300, Harri.Paivaniemi at tietoenator.com
wrote:
> If you need high select- performance but only moderate update/insert/delete,
>  then it's easy and robust way to:

Yes, I need one solution LAMP-HA for Wordpress, Joomla, Moodle,
ForumPHPBB and others "default" applications...


> - cluster master-db as active-hot standby (with RHCS)

Standby... like heartbeat? 


> - use mysql proxy to split selects and run many slaves for selects in loadbalance pool behind
> one VIP

Dude... this solution sound like very good/crazy/wonderful! :)
I've spend a lot of days reading/testing the "mysql-proxy"

http://forge.mysql.com/wiki/MySQL_Proxy 
"...load balancing; failover; query analysis; query filtering and
modification..."

And this feature, give me totally crazy:
http://jan.kneschke.de/2007/8/26/mysql-proxy-more-r-w-splitting 

This should solve my problem, but it's not working very well :-(
http://bugs.mysql.com/bug.php?id=30304


Do you have been used this proxy before? Can you give-me some trick/tip
around? :-p


> - use bin- replication master --> slaves

Yes... I think on this, but I can't separe R from W from all
applications from my self :-(

Thanks!

-- 
Tiago Cruz
http://everlinux.com
Linux User #282636




From gordan at bobich.net  Tue Apr 29 23:38:24 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Wed, 30 Apr 2008 00:38:24 +0100
Subject: [Linux-cluster] MySQL Cluster
In-Reply-To: <1209501622.2889.35.camel@tuxkiller.ig.com.br>
References: <1209076703.7121.21.camel@tuxkiller.ig.com.br>	<48111FAD.8090805@bobich.net>
	<1209501622.2889.35.camel@tuxkiller.ig.com.br>
Message-ID: <4817B170.4000209@bobich.net>

Tiago Cruz wrote:

>> You could use DRBD in primary-primary mode with GFS on top and external 
>> locking in MySQL. It would work, but performance would suffer a great 
>> deal. It depends on what kind of throughput you need.
> 
> Yes... I'm not tried this yet, maybe should I try... I'll get some good
> redundancy of servers!
> 
> But, I was researching about mysql "master -> master" that should be a
> good throughput/ redundancy. Did you already use this? 

See below - it's a.k.a. round-robin replication.

>> Depending on your application, you could use something like MySQL 
>> round-robin replication. In that configuration, MySQL servers replicate 
>> in a ring.
> 
> The Ring, it's some like this: (?)
> 
> A -> B -> C -> A
> 
> Right? So... If I lost the "B" server... "A" will can't sync "C"?
> 
> Sound like dangerous for me :-)

Failover can be handled, and if you only use MyISAM tables, you can just 
"LOAD DATA FROM MASTER" to resync. If all the servers are 100% in sync, 
though, with replication log pointers being the same, you could just 
thump the master.info file on the next server in the chain and short the 
replication circuit as a quick fix until you can resync with the fixed 
server.

PostgreSQL can be set up to do star-topology replication, so losing one 
server wouldn't lead to replication breaks, but you'd have to implement 
it yourself with triggers and some pl/perl magic.

>>  The downside is that there is a race condition inherent in 
>> the design, if you think about it carefully. (DRBD+GFS+ MySQL with 
>> external locking doesn't suffer the race condition.) If this potential 
>> race condition is not a concern for your application, then performance 
>> wise, it is likely to be the optimal solution for you. Since you said 
>> that fail-over is not an option for you, I am assuming that the load you 
>> plan to put through it is too great for one server to handle (I have 
>> seem low-spec servers with MySQL churning out 100,000+ queries per 
>> second when tuned up correctly). If you are dealing with this sort of 
>> load, DRBD+GFS+MySQL isn't going to cut it and round-robin replication 
>> with the mentioned race condition is pretty much your only option.
> 
> Well, this LAMP-HA will be hands a lot of end-user application, like
> Wordpress, Joomla, phpBB and etc, so I don't know if I'll have race
> condition :-(

Probably not. The race condition itself is very difficult to provoke 
under normal circumstances. You would have to have two separate threads 
operating on the same records, and in specific order for it to arise.

Gordan



From jboot at sina.com  Wed Apr 30 02:46:39 2008
From: jboot at sina.com (jboot)
Date: Wed, 30 Apr 2008 10:46:39 +0800
Subject: [Linux-cluster] Re: about luci start (Scott Wilson)
References: <20080429160008.E6F0061941E@hormel.redhat.com>
Message-ID: <200804301045540468635@sina.com>

thanks for your suggestion.

Now, I use the firefox to avoid the problem, but I can't find the reason!!


2008-04-30 



jboot 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080430/903b9406/attachment.htm>

From kri_thi at yahoo.com  Wed Apr 30 03:13:59 2008
From: kri_thi at yahoo.com (krishnamurthi G)
Date: Tue, 29 Apr 2008 20:13:59 -0700 (PDT)
Subject: [Linux-cluster] RHEL5 clustersuite need few clarification
Message-ID: <981745.84690.qm@web90404.mail.mud.yahoo.com>

Hi All,

Thanks for your valuable information.

Krishna


----- Original Message ----
From: Jonathan Brassow <jbrassow at redhat.com>
To: linux clustering <linux-cluster at redhat.com>
Sent: Tuesday, April 29, 2008 9:01:35 PM
Subject: Re: [Linux-cluster] RHEL5 clustersuite need few clarification


You may also find this useful:http://www.redhat.com/docs/manuals/enterprise/RHEL-5-manual/Cluster_Administration/ap-upgrade-rhel4-to-rhel5-CA.html

Some of the steps may not be relevant to you, but it should explain what has changed.

 brassow


On Apr 29, 2008, at 2:03 AM, Panigrahi, Santosh Kumar wrote:

Hello KG,
 
All the below points asked about Rhel5 are same as Rhel4. In rhel5 cman, ccsd and fenced services are all clubbed into a single service as cman. Rgmanager is as it is. Rhel5 has a web based conga GUI tool (ricci/luci) to configure a cluster beside system-config-cluster standalone tool.
 
Thanks,
Santosh

________________________________

From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of krishnamurthi G
Sent: Tuesday, April 29, 2008 10:03 AM
To: Linux-cluster at redhat.com
Subject: [Linux-cluster] RHEL5 clustersuite need few clarification
 
Hi Friends,

I am working on a RHEL 5 custer. I used RHEL4 earlier and going to support RHEL5 ( Linux kernel 2.6) and I need few clarifications for below questions.
1) How to get the cluster name for RHEL 5 ( What CLI would display the name). For RHEL 4 I used "/sbin/cman_tool status" command
2) Where is the configuration file available for RHEL5.0 or is it configurable? ( For RHEL 4 ituse to be at "/etc/cluster/cluster.conf")

3) I am getting "CMAN is not running" while running /usr/sbin/clustat. How o run cman I could see /usr/sbin/cman_tool.

4) How to get a virtual IP for a given HA resource group. For RHEL4 clustersuite it was extracted from "/etc/cluster/cluster.conf" parsing "ip address=".

I appreciate your help on this.
Thanks in advance
-  Krishna
 

________________________________

Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now.
--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080429/7cf2ebdc/attachment.htm>

From bjlist at westnet.com.au  Wed Apr 30 04:31:32 2008
From: bjlist at westnet.com.au (Ben J)
Date: Wed, 30 Apr 2008 12:31:32 +0800
Subject: [Linux-cluster] Cluster failing after rebooting a standby node
In-Reply-To: <48171716.2090900@redhat.com>
References: <480EADFF.6030306@westnet.com.au> <480F3F70.3010004@redhat.com>
	<481546CD.8010002@westnet.com.au> <48171716.2090900@redhat.com>
Message-ID: <4817F624.8070408@westnet.com.au>

Thanks for the reply.

At the moment we've done some testing within our network and managed to 
switch to multicast mode for cluster communication successfully.  Red 
Hat support suggested that the bug we were encountering might be only 
present in broadcast cluster communication mode, would you think that 
the bug is related to the method of cluster communication (i.e. 
broadcast), or would you say that we should expect to also experience 
this issue using multicast as well?

As I mentioned in my first post, we can't replicate the issue until the 
cluster has been running for around 3-4 days at least.  Is this around 
how long your test cluster had been running when you replicated it, or 
did you do something else to replicate this sooner?

Thanks,

Ben


Christine Caulfield wrote:
> Ben J wrote:
>   
>> Hi Christine,
>>
>> Thanks for the reply.
>>
>> I've been able to today replicate the cluster failing again by rebooting
>> one of the standby nodes.  I captured tcpdump data from 2 of the active
>> nodes (store01 and store02) and from the 2 standby nodes (ha01 and
>> ha02).  Ha01 is the node that we rebooted, so it will only show cluster
>> communication that occurred up until it rebooted.  See attached zip file.
>>
>> Note, I've sent this off-list as I didn't want to send this to the list
>> for obvious reasons. :)
>>
>> Let me know if you need any further information.  I've had the cluster
>> running with debug level 7 logging, so I've got that information as well
>> if you'd like me to shoot through that as well.
>>
>>     
>
> Thanks for the tcpdumps, they were very helpful in eliminating several
> possible causes I had considered. Unfortunately I still don't quite know
> what IS happening!
>
> It seems that when one node leaves the cluster the others go into
> transition MASTER state (because they all saw the node go down at the
> same time) and they never resolve this state. What normally happens is
> that one node will nominate itself master and take over the transition.
> But it seems like this is not happening for some reason.
>
> I did manage to reproduce it (or something very similar) on a three node
> cluster yesterday, unfortunately I didn't have debugging enabled in the
> modules so it didn't tell me much more (though it did tell me a little
> more). I have restarted some tests and I hope they will yield some
> results soon (ish).
>
>   



From rmccabe at redhat.com  Wed Apr 30 15:15:57 2008
From: rmccabe at redhat.com (Ryan McCabe)
Date: Wed, 30 Apr 2008 11:15:57 -0400
Subject: [Linux-cluster] Re: about luci start (Scott Wilson)
In-Reply-To: <200804301045540468635@sina.com>
References: <20080429160008.E6F0061941E@hormel.redhat.com>
	<200804301045540468635@sina.com>
Message-ID: <20080430151557.GA443565@redhat.com>

On Wed, Apr 30, 2008 at 10:46:39AM +0800, jboot wrote:
> thanks for your suggestion.
> 
> Now, I use the firefox to avoid the problem, but I can't find the reason!!
> 

Which version of luci are you using? This sounds like a bug that ought
to be fixed in the 52/47 beta versions (>= 0.12.0/>= 0.11.1).


Ryan



From jpalmae at gmail.com  Wed Apr 30 15:30:26 2008
From: jpalmae at gmail.com (Jorge Palma)
Date: Wed, 30 Apr 2008 11:30:26 -0400
Subject: [Linux-cluster] GFS Storage cluster !!!!
In-Reply-To: <48119C51.8020904@monster.co.in>
References: <48119C51.8020904@monster.co.in>
Message-ID: <5b65f1b10804300830s46f1038bj3e29e79c9699a133@mail.gmail.com>

you can use ISCSI to simulate a SAN

http://iscsitarget.sourceforge.net/

Regards

-- 
Jorge Palma Escobar
Ingeniero de Sistemas
Red Hat Linux Certified Engineer
Certificate N? 804005089418233


On Fri, Apr 25, 2008 at 4:54 AM, Vimal Gupta <vimal at monster.co.in> wrote:
> Hi Guys,
>
>  I want a storage cluster of 3 Nodes.But  I want to test first. So for
> Testing I have one quad CPU server with CentOS 5.1. I don't have a SAN now .
> Can I configure this server as a one node cluster and use its one scsi HDD
> as GFS. Just for testing . (may be this is a stupid question).
>
>
>  --
>
>  Vimal Gupta
>  Sr. System Administrator
>  Monster.com India Pvt.Ltd.
>  FC - 23, Block - B, 1st Floor, Film City, Sector - 16 A,
>  NOIDA, UP 201 301, INDIA
>  Ph# : +91-120-4024230 Fax: +91-40-66506449 Mobile: +91-9811150360
>
>  --
>  Linux-cluster mailing list
>  Linux-cluster at redhat.com
>  https://www.redhat.com/mailman/listinfo/linux-cluster
>



From jmacfarland at nexatech.com  Wed Apr 30 15:43:01 2008
From: jmacfarland at nexatech.com (Jeff Macfarland)
Date: Wed, 30 Apr 2008 10:43:01 -0500
Subject: [Linux-cluster] GFS Storage cluster !!!!
In-Reply-To: <5b65f1b10804300830s46f1038bj3e29e79c9699a133@mail.gmail.com>
References: <48119C51.8020904@monster.co.in>
	<5b65f1b10804300830s46f1038bj3e29e79c9699a133@mail.gmail.com>
Message-ID: <48189385.8080007@nexatech.com>

Just an FYI- does not support SCSI PR

Jorge Palma wrote:
> you can use ISCSI to simulate a SAN
> 
> http://iscsitarget.sourceforge.net/
> 
> Regards
> 



From mcse47 at hotmail.com  Wed Apr 30 19:28:58 2008
From: mcse47 at hotmail.com (Tracey Flanders)
Date: Wed, 30 Apr 2008 15:28:58 -0400
Subject: [Linux-cluster] Re: CLVMD hangs on 2nd node startup and hangs
	all	gfs	nodes.
In-Reply-To: <20080429160008.E6F0061941E@hormel.redhat.com>
References: <20080429160008.E6F0061941E@hormel.redhat.com>
Message-ID: <BAY123-W1881268FC807EA3B7BD4F8D4D80@phx.gbl>


I ended up abandoning Qemu KVM Virtual Machines all together and going back to VMware Server Beta 2. For what ever reason the servers could communictate just fine. Maybe it was sending the wrong IP Address, like the NAT'd IP. Doesn't make sense for it to NAT an IP thats on the same network. Anyways, I setup the same rig on 4 vmware virtual machines running CentOS 5.1 and it works great. I must be doing something wrong with Qemu KVM networking. It's disappointing because Qemu KVMs run a lot better on my box than vmware, even though I have tweaked the configs for performance. Oh well, I need to do some more reading I guess. Thanks for all the info. 

> Message: 7
> Date: Tue, 29 Apr 2008 09:50:26 -0600
> From: Gary Romo 
> Subject: Re: [Linux-cluster] CLVMD hangs on 2nd node startup and hangs
> 	all	gfs	nodes.
> To: linux clustering 
> Message-ID:
> 	
> Content-Type: text/plain; charset="us-ascii"
> 
> Had you seen this?
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=435491
> 
> Gary
> 
> 
> 
> 
> Tracey Flanders  
> Sent by: linux-cluster-bounces at redhat.com
> 04/25/2008 10:34 AM
> Please respond to
> linux clustering 
> 
> 
> To
> 
> cc
> 
> Subject
> [Linux-cluster] CLVMD hangs on 2nd node startup and hangs all gfs nodes.
> 
> 
> 
> 
> 
> 
> I've been trying to setup a 3 server cluster with GFS mounted over iSCSI 
> on Qemu Virtual Machines. A 4th server acts as a iSCSI Target. I found and 
> article that explains my issue, but I can't seem to figure out what the 
> solution is. QUOTED from : http://kbase.redhat.com/faq/FAQ_51_10923.shtm 
> After successfully setting up a cluster, cman_tool shows the cluster is 
> healthy. Mounting the gfs mount on the first node works successfully. 
> However, when mounting gfs on the second node, the mount command hangs. 
> Writing to a file on the first node also hangs. On the second node, the 
> following error is seen in /var/log/messages: Jul 18 14:49:27 blade3 
> kernel: Lock_Harness 2.6.9-72.2 (built Apr 24 2007 12:45:55) installed Jul 
> 18 14:49:27 blade3 kernel: GFS 2.6.9-72.2 (built Apr 24 2007 12:46:12) 
> installed Jul 18 14:52:53 blade3 kernel: GFS: Trying to join cluster 
> "lock_dlm", "vcomcluster:testgfs" Jul 18 14:52:53 blade3 kernel: Lock_DLM 
> (built Apr 24 2007 12:45:57) installed Jul 18 14:52:53 blade3 kernel: dlm: 
> connect from non cluster node Jul 18 14:52:53 blade3 kernel: dlm: connect 
> from non cluster node END QUOTE My Virtual Machines only have one 
> interface so I still can't figure out why this is happening. I can 
> successfully mount the GFS partition on any one node but as soon as I try 
> to start the clvmd on a 2nd node it hangs the whole cluster. I'm wondering 
> if its a Qemu VM network issue? Each host can ping each other by name and 
> ip. The cluster works fine but I cant get GFS to work on th VMs. Is it 
> possible to debug the clvmd to see what IP Address it is sending? Thanks, 
> Tracey Flanders 
> In a rush? Get real-time answers with Windows Live Messenger.--
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: https://www.redhat.com/archives/linux-cluster/attachments/20080429/433e44bf/attachment.html
> 

_________________________________________________________________
Express yourself wherever you are. Mobilize!
http://www.gowindowslive.com/Mobile/Landing/Messenger/Default.aspx?Locale=en-US?ocid=TAG_APRIL