From jvantuyl at engineyard.com Thu Feb 1 11:57:46 2007 From: jvantuyl at engineyard.com (Jayson Vantuyl) Date: Thu, 1 Feb 2007 05:57:46 -0600 Subject: [Linux-cluster] Diskless Shared-Root GFS/Cluster In-Reply-To: <200713113342.116652@leena> References: <200713113342.116652@leena> Message-ID: We are talking about application servers. One of the toughest things about clustering in general and GFS in particular is the failure scenarios. When you have any sort of cluster issue, if your root is on a shared GFS, that GFS freezes in various ways until fencing happens. The problem with this is that certain binaries that are on the same GFS may need to be used to recover. How do you execute fence_apc to fence a failed node when it is on a GFS that is hung waiting on that same fencing operation? There are ways around this involving RAM disks and the like, but eventually we just settled on having a minimal flash disk that would get us onto our SAN (but not clustered). Only after we were on a non- clustered-FS on our SAN would we then start up our clustered filesystem. This gave us the ability to move our nodes around easily. This is an often overlooked benefit of a shared root that putting your root FS on SAN gives you as well. There's nothing like booting up a dead node on spare hardware. This also gives you a solid way to debug a damaged root system. With shared-root it's all or nothing. It's not so with this configuration. You also have separate syslog files and other things that are one more special case on a shared root. It's also easy to set up nodes with slightly different configurations (shared-root makes this another special case). As for the danger of drive failure, a read-only IDE flash disk (Google for Transcend) is simple, easy, and dead solid. After consolidating your shared configuration files into /etc/shared and placing appropriate symlinks into that directory, it is a simple matter of rsync / csync / tsync / cron+scp to keep them synchronized. It is tempting to want to have a shared root to minimize management requirements. It is tempting to want to play games with ramfs and the like to provide a support system that will function when that shared root is hung due to clustering issues. It is tempting to think that having a shared GFS root is really useful. However, if you value reliability and practicality, it's much easier to script up an occasional Rsync than it is to do so many acrobatics for such little gain. For a cluster (and its apps) to be reliable at all, it needs to be able to function, recover, and generally have a stable operating environment. Putting GFS under the userspace that drives it is asking for trouble. On Jan 31, 2007, at 1:34 PM, isplist at logicore.net wrote: > I'm thinking for application servers/cluster only, not workstation > users. > > > On Wed, 31 Jan 2007 11:10:55 -0800, Tom Mornini wrote: >> We boot from flash drives, then pivot root to SAN storage. >> >> I agree with no drives in servers, but shared root is a >> whole different ball game if you mean everyone using a >> single filesystem for root. >> >> -- >> -- Tom Mornini, CTO >> -- Engine Yard, Ruby on Rails Hosting >> -- Reliability, Ease of Use, Scalability >> -- (866) 518-YARD (9273) -- Jayson Vantuyl Systems Architect Engine Yard jvantuyl at engineyard.com -------------- next part -------------- An HTML attachment was scrubbed... 
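A minimal sketch of the cron-driven rsync approach Jayson describes above; the hostnames, the master-node convention, and the /etc/shared layout are illustrative assumptions, not details taken from the thread:

    #!/bin/sh
    # Push the consolidated /etc/shared tree from one designated node to the
    # others; run from cron on every node, but only the master actually syncs.
    MASTER=node0
    NODES="node1 node2"
    [ "$(hostname -s)" = "$MASTER" ] || exit 0
    for n in $NODES; do
        rsync -a --delete /etc/shared/ "root@${n}:/etc/shared/"
    done

Run over ssh keys from root's crontab, this keeps the per-node copies converging on the master's configuration without putting the root filesystem itself on GFS.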
URL: From jvantuyl at engineyard.com Thu Feb 1 12:26:36 2007 From: jvantuyl at engineyard.com (Jayson Vantuyl) Date: Thu, 1 Feb 2007 06:26:36 -0600 Subject: [Linux-cluster] Alternative? Diskless Shared-Root GFS/Cluster In-Reply-To: <2007131144339.631166@leena> References: <2007131144339.631166@leena> Message-ID: <7FDF11F8-ECF8-49E8-B88F-6E163AFAA533@engineyard.com> > Ok, might as well ask this... since I can't seem to find anything > on it. How > about just a central storage that can be split up into many small > segments so > that blades can boot over the network, then joint the GFS cluster? We use an IDE flash disk in each server. It's just too easy to put a readonly bootstrap image on the flash and boot up off of that. With affordable 256MB flash disks you can even have a powerful repair environment there in case things get broken. > I mean, all I want to do is to remove the drives since they really > aren't > being used. All of the work is being done on the GFS cluster once a > machine is > up and running. It barely does anything with it's drive other than > the OS of > course, even logging is all remote. Don't remove the drives, use IDE flash drives instead. I think you can also use USB thumb drives if your BIOS supported it. A 256MB flash for $26.30 is hard to beat. Order directly from the manufacturer at: http://www.transcendusa.com/Products/ModDetail.asp?ModNo=26&LangNo=0 We put the boot loader, kernel, and a simple maintenance environment on the flash. We still boot our root off of the SAN. Interestingly, our SAN supports partitioning. What we do here is have partitions for each node (automatically mounted using a LABEL= mount). After that boots up, we run CLVM with our GFSes on top of it. Quite handy (and CLVM isn't really necessary for your case). > Isn't there a simpler way of getting this done without having to > get into > whole new technologies? All of the blades have PXE boot > capabilities, there > must be some simple way of doing this? I'd avoid this. I've tried the PXE boot thing before and the PXE only becomes one more single point of failure / maintenance. There's nothing like rebooting your cluster only to find that the PXE server has a failed disk. :( Basically with a SAN set up as follows: /dev/san0p1 (FS for node 0, labeled node0) /dev/san0p2 (FS for node 1, labeled node1) ... /dev/san1 (CLVM / GFS / other stuff) Your boot flash doesn't need much more than a very tiny Linux system (busybox is your friend), a file containing the node id (in this case /node_id) and a /linuxrc containing: #!/bin/sh NODEID=`cat /node_id` # SET UP SAN HERE IF NECESSARY mount /proc # Necessary because LABEL-mounting requires /proc/partitions mount -o ro -L root-${NODEID} /newroot cd /newroot pivot_root . oldroot/ exec sbin/init Considering that your flash hardly ever changes, and you can script creating of the flash image and node partitions, this quickly becomes very low maintenance. If you want them to be identical, grab the MAC address off of the first NIC and generate the label with that... A shared root is a nice idea. However you just end up creating a fragile custom environment that is hostile to lots of software and creates new single points of failure and contention (making it neither high-performance nor high-availability). -- Jayson Vantuyl Systems Architect Engine Yard jvantuyl at engineyard.com -------------- next part -------------- An HTML attachment was scrubbed... 
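A rough sketch of the identical-flash-image variant mentioned above, deriving the root label from the first NIC's MAC address instead of a per-flash /node_id file. It assumes sysfs is available in the busybox environment and that the per-node filesystems were labeled root-<mac> when they were created:

    #!/bin/sh
    # /linuxrc that picks its root partition by eth0's MAC address.
    mount -t proc none /proc       # LABEL mounts need /proc/partitions
    mount -t sysfs none /sys
    MAC=$(tr -d ':' < /sys/class/net/eth0/address)
    # SET UP SAN HERE IF NECESSARY
    mount -o ro -L "root-${MAC}" /newroot
    cd /newroot
    pivot_root . oldroot/
    exec sbin/init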
URL: From grimme at atix.de Thu Feb 1 16:12:26 2007 From: grimme at atix.de (Marc Grimme) Date: Thu, 1 Feb 2007 17:12:26 +0100 Subject: [Linux-cluster] Diskless Shared-Root GFS/Cluster In-Reply-To: References: <200713113342.116652@leena> Message-ID: <200702011712.26655.grimme@atix.de> On Thursday 01 February 2007 12:57, Jayson Vantuyl wrote: > We are talking about application servers. > > One of the toughest things about clustering in general and GFS in > particular is the failure scenarios. > > When you have any sort of cluster issue, if your root is on a shared > GFS, that GFS freezes in various ways until fencing happens. The > problem with this is that certain binaries that are on the same GFS > may need to be used to recover. How do you execute fence_apc to > fence a failed node when it is on a GFS that is hung waiting on that > same fencing operation? We move that fencing, ccsd functionality into a special chroot that is rebuilt at any time you boot the server. This might be on a tmpfs - which is the case if the path you specified for the chroot is identified as GFS - and stays untouched if it is a local FS. Many customers are using lokal disks but not for booting or any valueable data just for temporary files and swap. So that a server is only an independent exchangeable box of metal. > > There are ways around this involving RAM disks and the like, but > eventually we just settled on having a minimal flash disk that would > get us onto our SAN (but not clustered). Only after we were on a non- > clustered-FS on our SAN would we then start up our clustered > filesystem. This gave us the ability to move our nodes around > easily. This is an often overlooked benefit of a shared root that > putting your root FS on SAN gives you as well. There's nothing like > booting up a dead node on spare hardware. This also gives you a > solid way to debug a damaged root system. With shared-root it's all > or nothing. It's not so with this configuration. You also have > separate syslog files and other things that are one more special case > on a shared root. It's also easy to set up nodes with slightly > different configurations (shared-root makes this another special > case). As for the danger of drive failure, a read-only IDE flash > disk (Google for Transcend) is simple, easy, and dead solid. You can also boot nodes with different hw configurations. The initrd in the open sharedroot does the hw detection. > > After consolidating your shared configuration files into /etc/shared > and placing appropriate symlinks into that directory, it is a simple > matter of rsync / csync / tsync / cron+scp to keep them synchronized. That's a question of architecture not technology. Where do you want to have your complexity? In the FS or userspace? > > It is tempting to want to have a shared root to minimize management > requirements. It is tempting to want to play games with ramfs and > the like to provide a support system that will function when that > shared root is hung due to clustering issues. It is tempting to > think that having a shared GFS root is really useful. > > However, if you value reliability and practicality, it's much easier > to script up an occasional Rsync than it is to do so many acrobatics > for such little gain. For a cluster (and its apps) to be reliable at > all, it needs to be able to function, recover, and generally have a > stable operating environment. Putting GFS under the userspace that > drives it is asking for trouble. 
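A heavily simplified sketch of the separate fencing environment Marc describes at the top of his reply: a small chroot, placed on tmpfs when the root is GFS, holding just enough userspace for ccsd and the fence agents to keep working while the GFS root is frozen. The paths, the binary list, and the library-copying shortcut are illustrative assumptions, not the open-sharedroot implementation:

    #!/bin/sh
    # Build a minimal chroot on tmpfs for the cluster/fencing userspace.
    CHROOT=/var/run/fence-chroot
    mkdir -p "$CHROOT"
    mount -t tmpfs none "$CHROOT"
    mkdir -p "$CHROOT/bin" "$CHROOT/sbin" "$CHROOT/lib" \
             "$CHROOT/etc/cluster" "$CHROOT/dev" "$CHROOT/proc"
    for f in /bin/sh /sbin/ccsd /sbin/fenced /sbin/fence_apc; do
        cp "$f" "$CHROOT$f"
        # Copy the shared libraries each binary links against.
        ldd "$f" 2>/dev/null | awk '/\//{print $(NF-1)}' | while read lib; do
            mkdir -p "$CHROOT$(dirname "$lib")"
            cp "$lib" "$CHROOT$lib"
        done
    done
    cp /etc/cluster/cluster.conf "$CHROOT/etc/cluster/"
    mount --bind /dev "$CHROOT/dev"
    mount -t proc none "$CHROOT/proc"
    # The daemons are then started inside it, e.g. chroot "$CHROOT" /sbin/ccsd

The point is only that fencing must not depend on the filesystem it is supposed to unblock; the real open-sharedroot initrd handles this far more completely.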
You should really have a deeper look into sharedroot concepts . You'll like it! Regards Marc. > > On Jan 31, 2007, at 1:34 PM, isplist at logicore.net wrote: > > I'm thinking for application servers/cluster only, not workstation > > users. > > > > On Wed, 31 Jan 2007 11:10:55 -0800, Tom Mornini wrote: > >> We boot from flash drives, then pivot root to SAN storage. > >> > >> I agree with no drives in servers, but shared root is a > >> whole different ball game if you mean everyone using a > >> single filesystem for root. > >> > >> -- > >> -- Tom Mornini, CTO > >> -- Engine Yard, Ruby on Rails Hosting > >> -- Reliability, Ease of Use, Scalability > >> -- (866) 518-YARD (9273) > > -- > Jayson Vantuyl > Systems Architect > Engine Yard > jvantuyl at engineyard.com -- Gruss / Regards, ** Visit us at CeBIT 2007 in Hannover/Germany ** ** in Hall 5, Booth G48/2 (15.-21. of March) ** Marc Grimme Phone: +49-89 452 3538-14 http://www.atix.de/ http://www.open-sharedroot.org/ ** ATIX - Ges. fuer Informationstechnologie und Consulting mbH Einsteinstr. 10 - 85716 Unterschleissheim - Germany From mvz+rhcluster at nimium.hr Thu Feb 1 17:50:36 2007 From: mvz+rhcluster at nimium.hr (Miroslav Zubcic) Date: Thu, 01 Feb 2007 18:50:36 +0100 Subject: [Linux-cluster] fenced(8) not fencing through all fenceing devices Message-ID: <45C2286C.7050906@nimium.hr> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 As subject says, I have problem with yet another critical cluster installation. If we cannot reslove this problem, our customer is considering another cluster solution. :-( I have two nodes - IBM servers with redundant power. Every server is connected on the other APC power switch. If we pool out all ethernet cables from active node, other node will fence faulty unreachable node, but fenced(8) will only fork ONE fence_apc command to ONE APC power switch. Every server has first power cable in first, and second in second power switch for maximal redundancy and secutiry. By recycling only one power device, server normaly doesn't shut down because it has redundant power. I have attached strace(1) on fenced and I see only one fence_apc beeing execve(2)d and only one connect(2) - on the first (pwr01) switch. This has worked on RHEL 3 2-3 years ago AFAIR ... 
This is relevant part of cluster.conf(5): - -- Miroslav Zubcic, Nimium d.o.o., email: Tel: +385 01 4852 639, Fax: +385 01 4852 640, Mobile: +385 098 942 8672 Gredicka 3, 10000 Zagreb, Hrvatska -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iQEVAwUBRcIobMqzT+8/3SzgAQLErgf9FDTGtj6mnb0xQwEUxfiejzpObhcPelQr GZhmf51ANJlEXwvH6Z17LxA5Y0QPU1xByUNR1bVBY45EYH3aZx90dWqBy7JNwq54 C1DOYCaoi+djeI7SUu/k5yhX20cDDqCKJfczQxKaW9LrIOYYIPmryBagnvPC2qaC wLeYubrXGXJgn0HDeGFzDw/uWsLTJJItQnqHrCrFF+HI8nXNSYfaN5VmztkfUIhT 9D5/06z+k3nClMok5g1PLc+L5+8/7SYDrlfCs/mf5JBygW7H2vF9U2XRdKnIlPC0 iLYr/yGAcYqnj2DBDeyJqYZ+XK67RPsySFELGDF7kkv7fsMzLPMRrg== =hZCZ -----END PGP SIGNATURE----- From teigland at redhat.com Thu Feb 1 18:37:29 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 1 Feb 2007 12:37:29 -0600 Subject: [Linux-cluster] fenced(8) not fencing through all fenceing devices In-Reply-To: <45C2286C.7050906@nimium.hr> References: <45C2286C.7050906@nimium.hr> Message-ID: <20070201183729.GB23329@redhat.com> On Thu, Feb 01, 2007 at 06:50:36PM +0100, Miroslav Zubcic wrote: > > > > > switch="10.52.2.240"/> > > > switch="10.52.2.241"/> > > I think you want something like this instead: There are two problems with your config: 1. You have both devices in separate methods. A second method is only tried if the first fails. 2. You're using the default "reboot" option which isn't reliable with dual power supplies. The first port may come back on before the second is turned off. So, you need to turn both ports off (ensuring the power is really off) before turning either back on. You may still have a minor problem, though, because in the two-node cluster mode, a cluster partition will result in both nodes trying to fence each other in parallel. With a single power supply this works fine because one node will always be turned off before it can turn off the other. But, with dual power supplies you can get both nodes turning off one power port on the other, although only one of the nodes should succeed in turning off the second power port. i.e. the winner of the fencing race may end up with one of its power ports turned off. Whether this is a big problem, I don't know. Dave From jparsons at redhat.com Thu Feb 1 19:59:24 2007 From: jparsons at redhat.com (Jim Parsons) Date: Thu, 01 Feb 2007 14:59:24 -0500 Subject: [Linux-cluster] fenced(8) not fencing through all fenceing devices References: <45C2286C.7050906@nimium.hr> <20070201183729.GB23329@redhat.com> Message-ID: <45C2469C.7090204@redhat.com> David Teigland wrote: >On Thu, Feb 01, 2007 at 06:50:36PM +0100, Miroslav Zubcic wrote: > >> >> >> >> >> >switch="10.52.2.240"/> >> >> >> >switch="10.52.2.241"/> >> >> >> > >I think you want something like this instead: > > > > > > > > > > >There are two problems with your config: > >1. You have both devices in separate methods. A second method is only >tried if the first fails. > >2. You're using the default "reboot" option which isn't reliable with dual >power supplies. The first port may come back on before the second is >turned off. So, you need to turn both ports off (ensuring the power is >really off) before turning either back on. > >You may still have a minor problem, though, because in the two-node >cluster mode, a cluster partition will result in both nodes trying to >fence each other in parallel. With a single power supply this works fine >because one node will always be turned off before it can turn off the >other. 
But, with dual power supplies you can get both nodes turning off >one power port on the other, although only one of the nodes should succeed >in turning off the second power port. i.e. the winner of the fencing race >may end up with one of its power ports turned off. Whether this is a big >problem, I don't know. > >Dave > >-- >Linux-cluster mailing list >Linux-cluster at redhat.com >https://www.redhat.com/mailman/listinfo/linux-cluster > Did you use system-config-cluster to generate this config file? It sets up redundant power supply fencing for you. What are the fencedevices being used here? -J From rpeterso at redhat.com Thu Feb 1 21:45:54 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Thu, 01 Feb 2007 15:45:54 -0600 Subject: [Linux-cluster] GFS crashing on 2.6.18.6 In-Reply-To: <339554D0FE9DD94A8E5ACE4403676CEB01DA28E0@douwes.ka.sara.nl> References: <339554D0FE9DD94A8E5ACE4403676CEB01DA28DD@douwes.ka.sara.nl> <339554D0FE9DD94A8E5ACE4403676CEB01DA28E0@douwes.ka.sara.nl> Message-ID: <45C25F92.1020303@redhat.com> Jaap Dijkshoorn wrote: > People, > > This problem does not occure with kernel 2.6.17.14 > >> People, >> >> We have build the stable GFS tree againt 2.6.18.6. After we executed >> ccs,cman, fenced and clvmd we tried to mount the GFS filesystems, with >> the following result: >> >> BUG: unable to handle kernel NULL pointer dereference at >> (snip) >> EIP is at do_add_mount+0x66/0xeb >> Hi Jaap and others, I've gotten STABLE to compile against 2.6.20-rc6. Now I, too, am hitting this same issue. So I'm working on this one now. Hopefully I'll soon be updating the STABLE tree in CVS with the fix for this at the same time. Regards, Bob Peterson Red Hat Cluster Suite From basv at sara.nl Fri Feb 2 11:45:18 2007 From: basv at sara.nl (Bas van der Vlies) Date: Fri, 02 Feb 2007 12:45:18 +0100 Subject: [Linux-cluster] GFS crashing on 2.6.18.6 In-Reply-To: <45C25F92.1020303@redhat.com> References: <339554D0FE9DD94A8E5ACE4403676CEB01DA28DD@douwes.ka.sara.nl> <339554D0FE9DD94A8E5ACE4403676CEB01DA28E0@douwes.ka.sara.nl> <45C25F92.1020303@redhat.com> Message-ID: <45C3244E.7000802@sara.nl> Robert Peterson wrote: > Jaap Dijkshoorn wrote: >> People, >> >> This problem does not occure with kernel 2.6.17.14 >> >>> People, >>> >>> We have build the stable GFS tree againt 2.6.18.6. After we executed >>> ccs,cman, fenced and clvmd we tried to mount the GFS filesystems, with >>> the following result: >>> >>> BUG: unable to handle kernel NULL pointer dereference at > (snip) >>> EIP is at do_add_mount+0x66/0xeb >>> > Hi Jaap and others, > > I've gotten STABLE to compile against 2.6.20-rc6. Now I, too, am > hitting this > same issue. So I'm working on this one now. Hopefully I'll soon be > updating the > STABLE tree in CVS with the fix for this at the same time. 
> Thanks and good luck ;-) -- ******************************************************************** * * * Bas van der Vlies e-mail: basv at sara.nl * * SARA - Academic Computing Services phone: +31 20 592 8012 * * Kruislaan 415 fax: +31 20 6683167 * * 1098 SJ Amsterdam * * * ******************************************************************** From teigland at redhat.com Fri Feb 2 15:15:13 2007 From: teigland at redhat.com (David Teigland) Date: Fri, 2 Feb 2007 09:15:13 -0600 Subject: [Linux-cluster] fenced(8) not fencing through all fenceing devices In-Reply-To: <45C312F9.1030403@nimium.hr> References: <45C2286C.7050906@nimium.hr> <20070201183729.GB23329@redhat.com> <45C312F9.1030403@nimium.hr> Message-ID: <20070202151513.GA16215@redhat.com> On Fri, Feb 02, 2007 at 11:31:21AM +0100, Miroslav Zubcic wrote: > David Teigland wrote: > > > I think you want something like this instead: > > > > > > > > > > > > > > > > > > > > Yes! It works. I just tried two devices in one method, and then > triggered fencing with "ip link set dev bond0 down" on one node ... > three times just in case. It works. > > > There are two problems with your config: > > > 1. You have both devices in separate methods. A second method is only > > tried if the first fails. > > I didn't know there is a option to define both devices in one method. > This system-config-cluster tool which I'm usually using to create > initial configuration/skeleton for a new cluster setup doesn't have such > option, man page cluster.conf(5) and PDF documentation fails to describe > all possible config parameters, I concluded ad hoc, that declaring two > devices in one method will be error. Well, ok, now I see that it isn't. > > > 2. You're using the default "reboot" option which isn't reliable with dual > > power supplies. The first port may come back on before the second is > > turned off. So, you need to turn both ports off (ensuring the power is > > really off) before turning either back on. > > I have configured outlet on APC switches like this: > > 1- Outlet Name : Outlet 1 > 2- Power On Delay(sec) : 4 > 3- Power Off Delay(sec): 1 > 4- Reboot Duration(sec): 6 > 5- Accept Changes : > > So I didn't used undocumented options off/on in cluster.conf(5), 6 > second is enough for two telnet actions from fence_apc(8) I think. > > It would be really nice if man(1) pages are up2date eh? > > > You may still have a minor problem, though, because in the two-node > > cluster mode, a cluster partition will result in both nodes trying to > > fence each other in parallel. > > I have discussed this issue on this list earlier. In RHEL 3 there was > "tiebreaker IP address" option which dissapeared in RHEL 4 cluster, so > we have well known "split brain" cluster condition. > > It wolud be really nice if fenced(8) checks ethernet link condition > before deciding to fence partner in two-node cluster. Somehow, two-node > clusters are very popular in my country. Have you looked into qdisk yet? It's new and might help in this area. > > With a single power supply this works fine > > because one node will always be turned off before it can turn off the > > other. But, with dual power supplies you can get both nodes turning off > > one power port on the other, although only one of the nodes should succeed > > in turning off the second power port. i.e. the winner of the fencing race > > may end up with one of its power ports turned off. Whether this is a big > > problem, I don't know. 
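To make the fix concrete, a sketch of the single-method, dual-device layout being discussed: both APC switches appear in one method, with explicit off actions before the on actions so the node is never left powered by either feed during fencing. The node, device, and port names below are placeholders, not the poster's actual cluster.conf:

    <clusternode name="node1" votes="1">
      <fence>
        <method name="1">
          <device name="pwr01" port="1" option="off"/>
          <device name="pwr02" port="1" option="off"/>
          <device name="pwr01" port="1" option="on"/>
          <device name="pwr02" port="1" option="on"/>
        </method>
      </fence>
    </clusternode>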
> > Yes, this is fourth time - fourth cluster installation, and I always > have this problem. Weather machines have single or dual power supply. > > I have workaround for this: > > I creat bonding interface with all physical ethernet ports in it. > Then, I configure vlanX interface with bond0 as base. On Cisco ethernet > switch, I configure main ethernet segment untagged, and vlanX ethernet > segment as tagged. On main ethernet there is data network, VIP addresses > etc, but connection with fence devices (APC, WTI, iLO, RSA II ...) are > in encapsulated vlan interface. In that way, while the last physical > ethernet is functional and working, node is not fenced. If the last > ethernet in bonding aggregation fails, node is fenced, but it doesn't > have a chance to fence other node, because L2 layer + network is on the > same physical devices where main link is. From lshen at cisco.com Fri Feb 2 19:48:21 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Fri, 2 Feb 2007 11:48:21 -0800 Subject: [Linux-cluster] Debug messages in GFS Message-ID: <08A9A3213527A6428774900A80DBD8D803676A20@xmb-sjc-222.amer.cisco.com> Is there a way to turn on/off debug messages when running GFS? And are there different levels of debug tracing in GFS? Lin From rpeterso at redhat.com Fri Feb 2 20:44:48 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Fri, 02 Feb 2007 14:44:48 -0600 Subject: [Linux-cluster] Debug messages in GFS In-Reply-To: <08A9A3213527A6428774900A80DBD8D803676A20@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D803676A20@xmb-sjc-222.amer.cisco.com> Message-ID: <45C3A2C0.6050102@redhat.com> Lin Shen (lshen) wrote: > Is there a way to turn on/off debug messages when running GFS? And are > there different levels of debug tracing in GFS? > > Lin Hi Lin, In theory, you can use "mount...-o debug" with GFS, but I don't think it will tell you much. If you describe what you're trying to debug, maybe people on the list can offer suggestions on better ways to do it. Regards, Bob Peterson Red Hat Cluster Suite From lshen at cisco.com Fri Feb 2 21:53:06 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Fri, 2 Feb 2007 13:53:06 -0800 Subject: [Linux-cluster] Debug messages in GFS In-Reply-To: <45C3A2C0.6050102@redhat.com> Message-ID: <08A9A3213527A6428774900A80DBD8D803676AD3@xmb-sjc-222.amer.cisco.com> Hi Bob, I'm just trying to find out in general how to debug in GFS besides looking into /var/log/messages. Does "mount -o debug" enable extra debug tracing in GFS code? Lin > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Robert Peterson > Sent: Friday, February 02, 2007 12:45 PM > To: linux clustering > Subject: Re: [Linux-cluster] Debug messages in GFS > > Lin Shen (lshen) wrote: > > Is there a way to turn on/off debug messages when running > GFS? And are > > there different levels of debug tracing in GFS? > > > > Lin > Hi Lin, > > In theory, you can use "mount...-o debug" with GFS, but I > don't think it will tell you much. If you describe what > you're trying to debug, maybe people on the list can offer > suggestions on better ways to do it. 
> > Regards, > > Bob Peterson > Red Hat Cluster Suite > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From rpeterso at redhat.com Fri Feb 2 23:05:52 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Fri, 02 Feb 2007 17:05:52 -0600 Subject: [Linux-cluster] [PATCH] Port STABLE branch to upstream kernel Message-ID: <45C3C3D0.3060209@redhat.com> Hi Folks, Attached is a patch to bring the entire STABLE branch of cluster suite up to date so that it compiles against newer upstream kernels. I used Linus' upstream kernel, 2.6.20-rc7. I'm posting it here rather than just committing it to CVS to give everyone a chance to eyeball it first. Here's what changed: 1. It compiles against the new kernel. 2. It should no longer cause a kernel panic in do_add_mount. 3. Included are the recent AIO changes to GFS. The code hasn't seen much testing, so beware. I've done some general I/O testing but it probably should NOT be considered production ready. As stated before, there are basically two separate cluster worlds now: The cman-kernel way of doing things (e.g. RHEL4, STABLE) and the openais way of doing things (e.g. RHEL5, HEAD). They can't be mixed and matched. The place where those world collide is in clvmd, the clustered lvm2. In RHEL5, the clvmd works with the openais-based cluster code (obviously). To get clvmd to work properly with the STABLE branch on a RHEL5 system, I had to do some minimal patching and compile it from source. I didn't try it with the RHEL4 version of clvmd because I wanted to compare GFS performance of the two infrastructures, so I did my testing on the same RHEL5 cluster. The performance test was also why I integrated the aio changes. (BTW, performance was the same between the two infrastructures, but this wasn't a very good performance test). This is NOT meant to catch STABLE up with RHEL4, so there might be other little changes between RHEL4 and STABLE. I'll go through and try to clean that up sometime after this is committed. If some of you want to try it out and let me know I'd appreciate it. Regards, Bob Peterson Red Hat Cluster Suite -------------- next part -------------- A non-text attachment was scrubbed... Name: stable.patch Type: text/x-patch Size: 40048 bytes Desc: not available URL: From brandonlamb at gmail.com Fri Feb 2 23:18:00 2007 From: brandonlamb at gmail.com (Brandon Lamb) Date: Fri, 2 Feb 2007 15:18:00 -0800 Subject: [Linux-cluster] [PATCH] Port STABLE branch to upstream kernel In-Reply-To: <45C3C3D0.3060209@redhat.com> References: <45C3C3D0.3060209@redhat.com> Message-ID: Very cool, thanks! I plan to try this out after im done with my ocfs2 and iscsi testing, cant wait to try out GFS too! On 2/2/07, Robert Peterson wrote: > Hi Folks, > > Attached is a patch to bring the entire STABLE branch of cluster > suite up to date so that it compiles against newer upstream kernels. > I used Linus' upstream kernel, 2.6.20-rc7. I'm posting it here rather > than just committing it to CVS to give everyone a chance to eyeball > it first. Here's what changed: > > 1. It compiles against the new kernel. > 2. It should no longer cause a kernel panic in do_add_mount. > 3. Included are the recent AIO changes to GFS. > > The code hasn't seen much testing, so beware. I've done some general > I/O testing but it probably should NOT be considered production ready. > > As stated before, there are basically two separate cluster worlds now: > > The cman-kernel way of doing things (e.g. 
RHEL4, STABLE) and > the openais way of doing things (e.g. RHEL5, HEAD). They can't be > mixed and matched. The place where those world collide is in clvmd, > the clustered lvm2. In RHEL5, the clvmd works with the openais-based > cluster code (obviously). To get clvmd to work properly with the > STABLE branch on a RHEL5 system, I had to do some minimal > patching and compile it from source. I didn't try it with the RHEL4 > version of clvmd because I wanted to compare GFS performance of the > two infrastructures, so I did my testing on the same RHEL5 cluster. > The performance test was also why I integrated the aio changes. > (BTW, performance was the same between the two infrastructures, > but this wasn't a very good performance test). > > This is NOT meant to catch STABLE up with RHEL4, so there might be > other little changes between RHEL4 and STABLE. I'll go > through and try to clean that up sometime after this is committed. > > If some of you want to try it out and let me know I'd appreciate it. > > Regards, > > Bob Peterson > Red Hat Cluster Suite > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > From natecars at natecarlson.com Sat Feb 3 02:34:04 2007 From: natecars at natecarlson.com (Nate Carlson) Date: Fri, 2 Feb 2007 20:34:04 -0600 (CST) Subject: [Linux-cluster] GFS crashing on 2.6.18.6 In-Reply-To: <45C25F92.1020303@redhat.com> References: <339554D0FE9DD94A8E5ACE4403676CEB01DA28DD@douwes.ka.sara.nl> <339554D0FE9DD94A8E5ACE4403676CEB01DA28E0@douwes.ka.sara.nl> <45C25F92.1020303@redhat.com> Message-ID: On Thu, 1 Feb 2007, Robert Peterson wrote: > I've gotten STABLE to compile against 2.6.20-rc6. Now I, too, am > hitting this same issue. So I'm working on this one now. Hopefully > I'll soon be updating the STABLE tree in CVS with the fix for this at > the same time. Great! Thanks for taking a look at this. :) ------------------------------------------------------------------------ | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com | | depriving some poor village of its idiot since 1981 | ------------------------------------------------------------------------ From natecars at natecarlson.com Sat Feb 3 07:18:18 2007 From: natecars at natecarlson.com (Nate Carlson) Date: Sat, 3 Feb 2007 01:18:18 -0600 (CST) Subject: [Linux-cluster] [PATCH] Port STABLE branch to upstream kernel In-Reply-To: <45C3C3D0.3060209@redhat.com> References: <45C3C3D0.3060209@redhat.com> Message-ID: On Fri, 2 Feb 2007, Robert Peterson wrote: > If some of you want to try it out and let me know I'd appreciate it. I grabbed CVS -STABLE, applied your patch, and tested. :) As a side note, if it would be possible to get the code to compile both on 2.6.20 and 2.6.18, it'd be great - 2.6.18's the most recent kernel that I have Xen running with. I'm on Debian with a 2.6.20-rc6-amd64 (64-bit) kernel. I got everything to build, joined the cluster, and tried to mount a GFS filesystem.. it crapped out. 
Looking at the logs, looks like a lock_dlm issue: lock_dlm: disagrees about version of symbol dlm_new_lockspace lock_dlm: Unknown symbol dlm_new_lockspace lock_dlm: disagrees about version of symbol dlm_lock lock_dlm: Unknown symbol dlm_lock Note that it did actually throw a kernel dump: GFS: Trying to join cluster "lock_dlm", "nate_test:gfs01" lock_dlm: disagrees about version of symbol dlm_new_lockspace lock_dlm: Unknown symbol dlm_new_lockspace lock_dlm: disagrees about version of symbol dlm_lock lock_dlm: Unknown symbol dlm_lock lock_harness: can't find protocol lock_dlm GFS: can't mount proto = lock_dlm, table = nate_test:gfs01, hostdata = Unable to handle kernel NULL pointer dereference at 0000000000000066 RIP: [] simple_set_mnt+0x4/0x20 PGD f596d067 PUD f6d5f067 PMD 0 Oops: 0000 [1] SMP CPU 0 Modules linked in: gfs(F) lock_harness dlm(F) cman button ac battery ipv6 loop serio_raw evdev psmouse pcspkr sr_mod cdrom sg ext3 jbd mbcache dm_mirror dm_snapshot dm_mod raid1 md_mod ide_generic sd_mod amd74xx generic ide_core ata_generic ohci_hcd mptfc scsi_transport_fc tg3 mptspi scsi_transport_spi pata_amd libata i2c_amd756 i2c_core amd_rng shpchp k8temp pci_hotplug thermal processor fan mptscsih mptbase scsi_mod Pid: 3183, comm: mount Tainted: GF 2.6.20-rc6-amd64 #1 RIP: 0010:[] [] simple_set_mnt+0x4/0x20 RSP: 0018:ffff8100f5f9fa90 EFLAGS: 00010246 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000020 RDX: 0000000000000000 RSI: fffffffffffffffe RDI: ffff8100f7fce280 RBP: ffffc2000014a000 R08: 0000000000000000 R09: ffff810037ce53c0 R10: 0000000000000000 R11: ffff8100f779d3d8 R12: 00000000fffffffe R13: 0000000000000000 R14: fffffffffffffffe R15: ffff8100f779d3c0 FS: 00002aab38ab11d0(0000) GS:ffffffff804d6000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000066 CR3: 00000000f5eda000 CR4: 00000000000006e0 Process mount (pid: 3183, threadinfo ffff8100f5f9e000, task ffff810037fe0890) Stack: ffffffff882a597a ffff8100f7fce280 0000000000000000 ffffffff804455e0 ffff8100f5f9faf8 ffff8100f58dc000 ffffffffffffffed ffff8100f58dd000 0000000000000000 0000000000000000 ffffffff8028b1c2 ffff8100f7fce580 Call Trace: [] :gfs:gfs_get_sb+0x1085/0x10a7 [] request_module+0x139/0x14d [] __link_path_walk+0xc2d/0xd77 [] cache_alloc_refill+0xda/0x1d9 [] vfs_kern_mount+0x93/0x11a [] do_kern_mount+0x36/0x4d [] do_mount+0x676/0x6e9 [] mntput_no_expire+0x19/0x8b [] link_path_walk+0xc5/0xd7 [] :sd_mod:scsi_disk_put+0x2e/0x3f [] zone_statistics+0x3f/0x60 [] __alloc_pages+0x5a/0x2bc [] sys_mount+0x8a/0xd7 [] system_call+0x7e/0x83 Code: 48 8b 46 68 48 85 c0 74 0c 83 38 00 75 04 0f 0b eb fe f0 ff RIP [] simple_set_mnt+0x4/0x20 RSP CR2: 0000000000000066 ------------------------------------------------------------------------ | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com | | depriving some poor village of its idiot since 1981 | ------------------------------------------------------------------------ From natecars at natecarlson.com Sat Feb 3 07:35:52 2007 From: natecars at natecarlson.com (Nate Carlson) Date: Sat, 3 Feb 2007 01:35:52 -0600 (CST) Subject: [Linux-cluster] [PATCH] Port STABLE branch to upstream kernel In-Reply-To: References: <45C3C3D0.3060209@redhat.com> Message-ID: On Sat, 3 Feb 2007, Nate Carlson wrote: > lock_dlm: disagrees about version of symbol dlm_new_lockspace > lock_dlm: Unknown symbol dlm_new_lockspace > lock_dlm: disagrees about version of symbol dlm_lock > lock_dlm: Unknown symbol dlm_lock *sigh* Of course, the 
kernel had the integrated cman modules.. silly me! Disregard this - rebuilding and stuff. ------------------------------------------------------------------------ | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com | | depriving some poor village of its idiot since 1981 | ------------------------------------------------------------------------ From natecars at natecarlson.com Sat Feb 3 08:02:48 2007 From: natecars at natecarlson.com (Nate Carlson) Date: Sat, 3 Feb 2007 02:02:48 -0600 (CST) Subject: [Linux-cluster] [PATCH] Port STABLE branch to upstream kernel In-Reply-To: References: <45C3C3D0.3060209@redhat.com> Message-ID: On Sat, 3 Feb 2007, Nate Carlson wrote: > Of course, the kernel had the integrated cman modules.. silly me! > > Disregard this - rebuilding and stuff. Odd - even after building a kernel without dlm/gfs, I still get that error. In any case, if i 'modprobe -f' it, I can mount the GFS fs now - yay! Now, if we could get the modules to build on 2.6.18 also, well, that'd just be unbelievably cool. :) ------------------------------------------------------------------------ | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com | | depriving some poor village of its idiot since 1981 | ------------------------------------------------------------------------ From kudjak at gmail.com Sun Feb 4 17:50:35 2007 From: kudjak at gmail.com (Jan Kudjak) Date: Sun, 4 Feb 2007 18:50:35 +0100 Subject: [Linux-cluster] (?) Problem with relocating extents within a VG In-Reply-To: <45BE5E07.2080700@utmem.edu> References: <45BE5E07.2080700@utmem.edu> Message-ID: <353fcd0b0702040950la82ce36l7feabd59638d5626@mail.gmail.com> I don't know if this helps. I've done some tests on RHEL with pvmove and it didn't work for me without VG deactivation, so you might go for vgchange -a n and run pvmove afterwards. jan On 1/29/07, Jay Leafey wrote: > > Environment: > Two HP DL360 servers, each connected to the SAN via two FC adapters, SAN > is an HP EVA3000 (I think). > > CentOS 4.4 > ccs-1.0.7-0.x86_64 > cman-1.0.11-0.x86_64 > GFS-6.1.6-1.x86_64 > GFS-kernel-smp-2.6.9-60.3.x86_64 > GFS-kernheaders-2.6.9-60.3.x86_64 > kernel-smp-2.6.9-42.0.3.EL.x86_64 > lvm2-cluster-2.02.06-7.0.RHEL4.x86_64 > lvm2-2.02.06-6.0.RHEL4.x86_64 > rgmanager-1.9.54-1.x86_64 > > I have a volume group made up of two LUNs from our SAN. I was > attempting to move the extents of one logical volume in the VG from one > PV to another when I received the following error: > > [root at cobalt ~]# pvmove --name old_coeusdev_data /dev/dm-10 /dev/dm-11 > Error locking on node cobalt.utmem.edu: Resource temporarily > unavailable > Failed to activate old_coeusdev_data > > For now, an LV stays where it is located (i.e. on a specific PV) once it > is created. I really need to move all of the LVs off of that PV so I > can release it from the VG (alphabet soup!). That PV is part of a > storage pool we need to upgrade to larger drives. > > Anyway, any thoughts on this, or other information needed? > -- > Jay Leafey - University of Tennessee > E-Mail: jleafey at utmem.edu Phone: 901-448-5848 FAX: 901-448-8199 > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -------------- next part -------------- An HTML attachment was scrubbed... 
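A short sketch of the workaround Jan describes above, using the LV and device names from Jay's original command; the volume group name is a placeholder, and the LVs in the group must be unmounted on every node first:

    vgchange -a n vg_data          # deactivate the clustered VG on all nodes
    pvmove --name old_coeusdev_data /dev/dm-10 /dev/dm-11
    vgchange -a y vg_data          # reactivate once the move completes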
URL: From jaap at sara.nl Mon Feb 5 07:17:32 2007 From: jaap at sara.nl (Jaap Dijkshoorn) Date: Mon, 5 Feb 2007 08:17:32 +0100 Subject: [Linux-cluster] [PATCH] Port STABLE branch to upstream kernel In-Reply-To: <45C3C3D0.3060209@redhat.com> References: <45C3C3D0.3060209@redhat.com> Message-ID: <339554D0FE9DD94A8E5ACE4403676CEB01DA29DE@douwes.ka.sara.nl> Hi Bob, Thanks for the work. It sounds good that the development is still in good progress. We are now running with 2.6.17.14 with STABLE. As soon as we hit the files not showing up problem again i guess we will upgrade again. Best Regards, Jaap > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Robert Peterson > Sent: zaterdag 3 februari 2007 0:06 > To: linux clustering > Subject: [Linux-cluster] [PATCH] Port STABLE branch to upstream kernel > > Hi Folks, > > Attached is a patch to bring the entire STABLE branch of cluster > suite up to date so that it compiles against newer upstream kernels. > I used Linus' upstream kernel, 2.6.20-rc7. I'm posting it here rather > than just committing it to CVS to give everyone a chance to eyeball > it first. Here's what changed: > > 1. It compiles against the new kernel. > 2. It should no longer cause a kernel panic in do_add_mount. > 3. Included are the recent AIO changes to GFS. > > The code hasn't seen much testing, so beware. I've done some general > I/O testing but it probably should NOT be considered production ready. > > As stated before, there are basically two separate cluster worlds now: > > The cman-kernel way of doing things (e.g. RHEL4, STABLE) and > the openais way of doing things (e.g. RHEL5, HEAD). They can't be > mixed and matched. The place where those world collide is in clvmd, > the clustered lvm2. In RHEL5, the clvmd works with the openais-based > cluster code (obviously). To get clvmd to work properly with the > STABLE branch on a RHEL5 system, I had to do some minimal > patching and compile it from source. I didn't try it with the RHEL4 > version of clvmd because I wanted to compare GFS performance of the > two infrastructures, so I did my testing on the same RHEL5 cluster. > The performance test was also why I integrated the aio changes. > (BTW, performance was the same between the two infrastructures, > but this wasn't a very good performance test). > > This is NOT meant to catch STABLE up with RHEL4, so there might be > other little changes between RHEL4 and STABLE. I'll go > through and try to clean that up sometime after this is committed. > > If some of you want to try it out and let me know I'd appreciate it. > > Regards, > > Bob Peterson > Red Hat Cluster Suite > > -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 3199 bytes Desc: not available URL: From wkenji at labs.fujitsu.com Mon Feb 5 11:18:05 2007 From: wkenji at labs.fujitsu.com (Kenji Wakamiya) Date: Mon, 05 Feb 2007 20:18:05 +0900 Subject: [Linux-cluster] GFS2 on FC6 Message-ID: <45C7126D.5000300@labs.fujitsu.com> Hello, I'm trying to use GFS2 on FC6 using hardware that had been able to successfully use GFS1 (tarball-1.03.00) on FC5 and Open-iSCSI (svn). But it is very unstable... ;-< The cluster has three nodes (Xeon x2) and a NetApp Filer as an iSCSI target. Data traffic and cman's traffic are using the same network. 
For example, while cp-ing a file of 4GB from local disk to GFS2 area, if an 'ls -l' for the destination GFS2 directory is executed on the other node, the cp command freezes. If the ls is not executed, the cp can finish. But following that, when rm of the copied file on GFS2 is executed from the other node, disk usage of the GFS2 area (df) is not changed. It seems like disk blocks are not freed. Currently I'm using FC6's standard packages only: - kernel-2.6.19-1.2895.fc6 (SMP) - iscsi-initiator-utils-6.2.0.747-0.0.fc6 - openais-0.80.1-3 - cman-2.0.60-1.fc6 - gfs2-utils-0.1.25-1.fc6 And I usually start cman and GFS2 on each node as follows: # service cman start # mount -t gfs2 /dev/isda /mnt/gfs ^^^^^^^^^symlink to /dev/sdb by udev What's the problem? Thanks, Kenji From nickolay at protei.ru Mon Feb 5 11:58:37 2007 From: nickolay at protei.ru (Nickolay) Date: Mon, 05 Feb 2007 14:58:37 +0300 Subject: [Linux-cluster] GFS1 + 2.6.18 patch In-Reply-To: <20070203170007.BB09573753@hormel.redhat.com> References: <20070203170007.BB09573753@hormel.redhat.com> Message-ID: <45C71BED.3080609@protei.ru> Patch fixes GFS crashing on 2.6.18.6 If anyone will have problems with this patch, let me know. Applied to the latest CVS STABLE sources. -- Nickolay Vinogradov Protei Research and Development Center St.Petersburg, 194044, Russia Tel.: +7 812 449 47 27 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: gfs1-patch-2.6.18.diff URL: From teigland at redhat.com Mon Feb 5 15:02:56 2007 From: teigland at redhat.com (David Teigland) Date: Mon, 5 Feb 2007 09:02:56 -0600 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: <45C7126D.5000300@labs.fujitsu.com> References: <45C7126D.5000300@labs.fujitsu.com> Message-ID: <20070205150256.GA24917@redhat.com> On Mon, Feb 05, 2007 at 08:18:05PM +0900, Kenji Wakamiya wrote: > I'm trying to use GFS2 on FC6 using hardware that had been able to > successfully use GFS1 (tarball-1.03.00) on FC5 and Open-iSCSI (svn). > But it is very unstable... ;-< If you want something stable and usable, you need to stick with GFS1. Dave From rpeterso at redhat.com Mon Feb 5 16:16:11 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Mon, 05 Feb 2007 10:16:11 -0600 Subject: [Linux-cluster] NFS bug fixed in GFS stable tree? In-Reply-To: <45BA12FB.2050807@sara.nl> References: <339554D0FE9DD94A8E5ACE4403676CEB01DA282C@douwes.ka.sara.nl> <45B78A54.1010204@redhat.com> <339554D0FE9DD94A8E5ACE4403676CEB01DA2847@douwes.ka.sara.nl> <45B8F965.5020906@redhat.com> <45BA12FB.2050807@sara.nl> Message-ID: <45C7584B.4080609@redhat.com> Bas van der Vlies wrote: > Bob, > > I just build the new STABLE source agains our 2.6.17.11 kernel and > did not encounter any problems, except for rgmanager. Some files are > missing > that it tries to install: > > install: cannot stat `utils/config-utils.sh': No such file or directory > install: cannot stat `utils/ra-skelet.sh': No such file or directory > install: cannot stat `utils/messages.sh': No such file or directory > install: cannot stat `utils/httpd-parse-config.pl': No such file or > directory > install: cannot stat `utils/tomcat-parse-config.pl': No such file or > directory > make[3]: *** [install] Error 1 > > Regards Hi Bas, I believe Lon Hohberger fixed this in STABLE last week. 
Regards, Bob Peterson Red Hat Cluster Suite From maciej.bogucki at artegence.com Mon Feb 5 17:16:11 2007 From: maciej.bogucki at artegence.com (Maciej Bogucki) Date: Mon, 05 Feb 2007 18:16:11 +0100 Subject: [Linux-cluster] GFS problem with du and df Message-ID: <45C7665B.7020703@artegence.com> Hello, I hava 5 node cluster with GFS filesystem. One of the partition is used to share indexes for Apache Lucene search engine - http://lucene.apache.org/java/docs/ across 5 application nodes. The problem is that "du" command show different output than "df". "du" output is correct, I'm sure that there is only 15M data. The strangest thins is that "df" output of used space grows from minute to minute. Here is the output of du, df, and mount. [root at repo05 ~]# df /datafs/search-indexes/ Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/cluster_116550929916-lucene 4980192 3169396 1810796 64% /datafs/search-indexes [root at repo05 ~]# du -sh /datafs/search-indexes/ 15M /datafs/search-indexes/ [root at repo05 ~]# mount | grep search /dev/mapper/cluster_116550929916-lucene on /datafs/search-indexes type gfs (rw,noatime,nodiratime) [root at repo05 ~]# Any ideas? Best Regards Maciej Bogucki From natecars at natecarlson.com Mon Feb 5 17:16:46 2007 From: natecars at natecarlson.com (Nate Carlson) Date: Mon, 5 Feb 2007 11:16:46 -0600 (CST) Subject: [Linux-cluster] GFS1 + 2.6.18 patch In-Reply-To: <45C71BED.3080609@protei.ru> References: <20070203170007.BB09573753@hormel.redhat.com> <45C71BED.3080609@protei.ru> Message-ID: On Mon, 5 Feb 2007, Nickolay wrote: > Patch fixes GFS crashing on 2.6.18.6 If anyone will have problems with > this patch, let me know. Applied to the latest CVS STABLE sources. You're the man. :) I will give this a shot later today - thanks! ------------------------------------------------------------------------ | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com | | depriving some poor village of its idiot since 1981 | ------------------------------------------------------------------------ From rpeterso at redhat.com Mon Feb 5 17:40:19 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Mon, 05 Feb 2007 11:40:19 -0600 Subject: [Linux-cluster] Debug messages in GFS In-Reply-To: <08A9A3213527A6428774900A80DBD8D803676AD3@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D803676AD3@xmb-sjc-222.amer.cisco.com> Message-ID: <45C76C03.9050800@redhat.com> Lin Shen (lshen) wrote: > Hi Bob, > > I'm just trying to find out in general how to debug in GFS besides > looking into /var/log/messages. Does "mount -o debug" enable extra debug > tracing in GFS code? > > Lin > Hi Lin, Mounting gfs with -o debug won't give you any extra debug messages. There are several methods of debugging GFS issues, but they depend on what issue you're trying to debug. Most GFS debugging is done either through special gfs_tool commands or using the /proc file system. Regards, Bob Peterson Red Hat Cluster Suite From lshen at cisco.com Mon Feb 5 18:15:34 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Mon, 5 Feb 2007 10:15:34 -0800 Subject: [Linux-cluster] Debug messages in GFS In-Reply-To: <45C76C03.9050800@redhat.com> Message-ID: <08A9A3213527A6428774900A80DBD8D803676E7F@xmb-sjc-222.amer.cisco.com> Hi Bob, Is there a document that describes the gfs_tool and any GFS specific /proc file system variables? 
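For reference, a few gfs_tool invocations commonly used for this kind of poking around; /mnt/gfs is a placeholder mount point, and the full list is in the gfs_tool man page:

    gfs_tool counters /mnt/gfs     # running lock and I/O statistics
    gfs_tool lockdump /mnt/gfs     # dump the current glock state
    gfs_tool gettune /mnt/gfs      # list the tunable parameters
    gfs_tool df /mnt/gfs           # per-resource-group space usage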
Lin > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Robert Peterson > Sent: Monday, February 05, 2007 9:40 AM > To: linux clustering > Subject: Re: [Linux-cluster] Debug messages in GFS > > Lin Shen (lshen) wrote: > > Hi Bob, > > > > I'm just trying to find out in general how to debug in GFS besides > > looking into /var/log/messages. Does "mount -o debug" enable extra > > debug tracing in GFS code? > > > > Lin > > > Hi Lin, > > Mounting gfs with -o debug won't give you any extra debug messages. > There are several methods of debugging GFS issues, but they > depend on what issue you're trying to debug. Most GFS > debugging is done either through special gfs_tool commands or > using the /proc file system. > > Regards, > > Bob Peterson > Red Hat Cluster Suite > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From lshen at cisco.com Mon Feb 5 18:57:38 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Mon, 5 Feb 2007 10:57:38 -0800 Subject: [Linux-cluster] gnbd fencing Message-ID: <08A9A3213527A6428774900A80DBD8D803676EE4@xmb-sjc-222.amer.cisco.com> I read in one of the howto guide that there is a fencing method called "gnbd fencing". Where can I find more information how this works? Lin From rpeterso at redhat.com Mon Feb 5 19:01:39 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Mon, 05 Feb 2007 13:01:39 -0600 Subject: [Linux-cluster] Debug messages in GFS In-Reply-To: <08A9A3213527A6428774900A80DBD8D803676E7F@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D803676E7F@xmb-sjc-222.amer.cisco.com> Message-ID: <45C77F13.1060405@redhat.com> Lin Shen (lshen) wrote: > Hi Bob, > > Is there a document that describes the gfs_tool and any GFS specific > /proc file system variables? > > Lin > Hi Lin, I don't know of any documents like that, except for the gfs_tool man page and the faq. Regards, Bob Peterson Red Hat Cluster Suite From breeves at redhat.com Mon Feb 5 19:03:36 2007 From: breeves at redhat.com (Bryn M. Reeves) Date: Mon, 05 Feb 2007 19:03:36 +0000 Subject: [Linux-cluster] gnbd fencing In-Reply-To: <08A9A3213527A6428774900A80DBD8D803676EE4@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D803676EE4@xmb-sjc-222.amer.cisco.com> Message-ID: <45C77F88.6080701@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Lin Shen (lshen) wrote: > I read in one of the howto guide that there is a fencing method called > "gnbd fencing". Where can I find more information how this works? > > Lin > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster The fence_gnbd fencing agent is included in the gnbd package - take a look at man 8 fence_gnbd for the details. Kind regards, Bryn. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org iD8DBQFFx3+I6YSQoMYUY94RApZUAJ4nopHml1kvBZMXfGqa/uA1kb+FGwCfXYQm QcdSdtPpB0stqNuGS+JvO14= =0jT6 -----END PGP SIGNATURE----- From lshen at cisco.com Mon Feb 5 23:05:21 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Mon, 5 Feb 2007 15:05:21 -0800 Subject: [Linux-cluster] Data integrity testing tool Message-ID: <08A9A3213527A6428774900A80DBD8D803677081@xmb-sjc-222.amer.cisco.com> Is there a data integrity testing tool/suite for cluster file system, mainly for detecting data corruptions? 
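Absent a dedicated suite, one basic do-it-yourself check is checksum round-tripping between nodes; a sketch, with the mount point, file count, and sizes chosen arbitrarily:

    # On node A: write test files on the shared filesystem and record checksums.
    mkdir -p /mnt/gfs/integrity && cd /mnt/gfs/integrity
    for i in $(seq 1 100); do
        dd if=/dev/urandom of=file$i bs=1M count=4 2>/dev/null
    done
    md5sum file* > MD5SUMS
    # On node B, against the same mount: verify what node A wrote.
    cd /mnt/gfs/integrity && md5sum -c MD5SUMS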
Lin From natecars at natecarlson.com Tue Feb 6 05:22:50 2007 From: natecars at natecarlson.com (Nate Carlson) Date: Mon, 5 Feb 2007 23:22:50 -0600 (CST) Subject: [Linux-cluster] GFS1 + 2.6.18 patch In-Reply-To: References: <20070203170007.BB09573753@hormel.redhat.com> <45C71BED.3080609@protei.ru> Message-ID: On Mon, 5 Feb 2007, Nickolay wrote: >> Patch fixes GFS crashing on 2.6.18.6 If anyone will have problems with this >> patch, let me know. Applied to the latest CVS STABLE sources. On Mon, 5 Feb 2007, Nate Carlson wrote: > You're the man. :) > > I will give this a shot later today - thanks! Looks great - thanks! I will do more stress testing a bit later. ------------------------------------------------------------------------ | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com | | depriving some poor village of its idiot since 1981 | ------------------------------------------------------------------------ From wkenji at labs.fujitsu.com Tue Feb 6 06:29:38 2007 From: wkenji at labs.fujitsu.com (Kenji Wakamiya) Date: Tue, 06 Feb 2007 15:29:38 +0900 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: <20070205150256.GA24917@redhat.com> References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> Message-ID: <45C82052.1080606@labs.fujitsu.com> David Teigland wrote: > On Mon, Feb 05, 2007 at 08:18:05PM +0900, Kenji Wakamiya wrote: >> I'm trying to use GFS2 on FC6 using hardware that had been able to >> successfully use GFS1 (tarball-1.03.00) on FC5 and Open-iSCSI (svn). >> But it is very unstable... ;-< > > If you want something stable and usable, you need to stick with GFS1. Okay, I'll wait more time for GFS2 and will try patches for GFS1 with newer kernels. Thank you! Kenji From brandonlamb at gmail.com Tue Feb 6 06:35:26 2007 From: brandonlamb at gmail.com (Brandon Lamb) Date: Mon, 5 Feb 2007 22:35:26 -0800 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: <45C82052.1080606@labs.fujitsu.com> References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> Message-ID: On 2/5/07, Kenji Wakamiya wrote: > David Teigland wrote: > > On Mon, Feb 05, 2007 at 08:18:05PM +0900, Kenji Wakamiya wrote: > >> I'm trying to use GFS2 on FC6 using hardware that had been able to > >> successfully use GFS1 (tarball-1.03.00) on FC5 and Open-iSCSI (svn). > >> But it is very unstable... ;-< > > > > If you want something stable and usable, you need to stick with GFS1. > > Okay, I'll wait more time for GFS2 and will try patches for GFS1 with > newer kernels. Thank you! > > Kenji Correct me if I am wrong redhat, but it was my understanding that development has moved to gfs2 and that in order to use a stable GFS v1 setup one would have to run older software with old kernels in order to get it working? Are you stuck with kernel 2.6.9 if you need stable gfs (v1)? From srramasw at cisco.com Tue Feb 6 07:55:18 2007 From: srramasw at cisco.com (Sridharan Ramaswamy (srramasw)) Date: Mon, 5 Feb 2007 23:55:18 -0800 Subject: [Linux-cluster] Suggestion for journal and RG size for small filesystems Message-ID: We have a need to create a very small GFS filesystem, as low as on a 512MB disk partition. For a three node cluster ( -j 3) what is the suggested value for Journal size and ResourceGroup size? Obviously I want maximize the usable diskspace while still keeping it safe to operate and get a decent performance. The default gfs_mkfs only leaves about 110 / 512MB volume. 
Reducing the Journal size from 128M(default) to 32M(min) gives me back about 288MB. What is the impact of reducing journal size? Can I safely live with this journal size if my app using this filesystem is less metadata intensive? Similarly w.r.t ResourceGroup size, what is the impact of reducing its size from 128M (default) to 32M (min) ? Appreciate any thoughts on this. thanks, Sridharan PS: Few of those gfs_mkfs snippets for reference, $ gfs_mkfs -p lock_dlm -t alpha:gfs2 -j 3 /dev/hda12 ... Blocksize: 4096 Filesystem Size: 28164 Journals: 3 Resource Groups: 8 $ gfs_mkfs -p lock_dlm -t alpha:gfs2 -j 3 -J 32 /dev/hda12 ... Blocksize: 4096 Filesystem Size: 101892 Journals: 3 Resource Groups: 8 $ gfs_mkfs -p lock_dlm -t cisco:gfs2 -j 3 -J 32 -r 32 /dev/hda12 ... Blocksize: 4096 Filesystem Size: 101868 Journals: 3 Resource Groups: 14 -------------- next part -------------- An HTML attachment was scrubbed... URL: From wkenji at labs.fujitsu.com Tue Feb 6 08:43:12 2007 From: wkenji at labs.fujitsu.com (Kenji Wakamiya) Date: Tue, 06 Feb 2007 17:43:12 +0900 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> Message-ID: <45C83FA0.10507@labs.fujitsu.com> Hello, Brandon Lamb wrote: > Correct me if I am wrong redhat, but it was my understanding that > development has moved to gfs2 and that in order to use a stable GFS v1 > setup one would have to run older software with old kernels in order > to get it working? I think so, too. If I need to use truly stable GFS, I probably should select RHEL4/CentOS4 and released version of GFS. In that sense, the stability level which I need now may be a little bit lower. :) I have no reluctance to select CentOS, but currently I use FC5 and GFS1 (release 1.03.00). Updates for FC5 will soon stop, and I heard FC6 includes GFS(2) in it's kernel and packages. So, I just tried to use that GFS2 on FC6, but unfortunately it doesn't has enough stability in my environment. Thanks, Kenji From brandonlamb at gmail.com Tue Feb 6 08:50:12 2007 From: brandonlamb at gmail.com (Brandon Lamb) Date: Tue, 6 Feb 2007 00:50:12 -0800 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: <45C83FA0.10507@labs.fujitsu.com> References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> <45C83FA0.10507@labs.fujitsu.com> Message-ID: On 2/6/07, Kenji Wakamiya wrote: > Hello, > > Brandon Lamb wrote: > > Correct me if I am wrong redhat, but it was my understanding that > > development has moved to gfs2 and that in order to use a stable GFS v1 > > setup one would have to run older software with old kernels in order > > to get it working? > > I think so, too. If I need to use truly stable GFS, I probably should > select RHEL4/CentOS4 and released version of GFS. In that sense, the > stability level which I need now may be a little bit lower. :) > > I have no reluctance to select CentOS, but currently I use FC5 and > GFS1 (release 1.03.00). Updates for FC5 will soon stop, and I heard > FC6 includes GFS(2) in it's kernel and packages. So, I just tried to > use that GFS2 on FC6, but unfortunately it doesn't has enough > stability in my environment. > > Thanks, > Kenji This might be bad netiquette to post on the GFS mailing list, but you might look at oracle's OCFS2. I was able to get it up and running in a single day, its in the latest 2.6.20 kernel and then you just need to download the 1.2.2 tools from their website. 
It was much easier to get up and running. For that matter I was never able to get GFS working or compile after 6 days. I have yet to try since stable was updated. Just another option to look at. From basv at sara.nl Tue Feb 6 09:07:30 2007 From: basv at sara.nl (Bas van der Vlies) Date: Tue, 06 Feb 2007 10:07:30 +0100 Subject: [Linux-cluster] NFS bug fixed in GFS stable tree? In-Reply-To: <45C7584B.4080609@redhat.com> References: <339554D0FE9DD94A8E5ACE4403676CEB01DA282C@douwes.ka.sara.nl> <45B78A54.1010204@redhat.com> <339554D0FE9DD94A8E5ACE4403676CEB01DA2847@douwes.ka.sara.nl> <45B8F965.5020906@redhat.com> <45BA12FB.2050807@sara.nl> <45C7584B.4080609@redhat.com> Message-ID: <45C84552.9010403@sara.nl> Robert Peterson wrote: > Bas van der Vlies wrote: >> Bob, >> >> I just build the new STABLE source agains our 2.6.17.11 kernel and >> did not encounter any problems, except for rgmanager. Some files are >> missing >> that it tries to install: >> >> install: cannot stat `utils/config-utils.sh': No such file or directory >> install: cannot stat `utils/ra-skelet.sh': No such file or directory >> install: cannot stat `utils/messages.sh': No such file or directory >> install: cannot stat `utils/httpd-parse-config.pl': No such file or >> directory >> install: cannot stat `utils/tomcat-parse-config.pl': No such file or >> directory >> make[3]: *** [install] Error 1 >> >> Regards > Hi Bas, > > I believe Lon Hohberger fixed this in STABLE last week. > Thanks it is fixed. I use the wrong options for cvs. Forgot the -d option for the update, maybe it can be in the FAQ or something. I am used to svn which does this automatically ;-) Regards -- ******************************************************************** * * * Bas van der Vlies e-mail: basv at sara.nl * * SARA - Academic Computing Services phone: +31 20 592 8012 * * Kruislaan 415 fax: +31 20 6683167 * * 1098 SJ Amsterdam * * * ******************************************************************** From wkenji at labs.fujitsu.com Tue Feb 6 12:00:21 2007 From: wkenji at labs.fujitsu.com (Kenji Wakamiya) Date: Tue, 06 Feb 2007 21:00:21 +0900 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> <45C83FA0.10507@labs.fujitsu.com> Message-ID: <45C86DD5.2060708@labs.fujitsu.com> Brandon Lamb wrote: > This might be bad netiquette to post on the GFS mailing list, but you > might look at oracle's OCFS2. I was able to get it up and running in a > single day, its in the latest 2.6.20 kernel and then you just need to > download the 1.2.2 tools from their website. A few years ago, I was interested also in OCFS2. But my uses need Posix locks, Posix ACLs, and quota. For this reason, I chose GFS necessarily. Now I like GFS on the whole. > It was much easier to get up and running. For that matter I was never > able to get GFS working or compile after 6 days. I have yet to try > since stable was updated. Apart from that, I didn't know installation of OCFS2 is so easy. 
Thanks for everything, Kenji From teigland at redhat.com Tue Feb 6 16:14:37 2007 From: teigland at redhat.com (David Teigland) Date: Tue, 6 Feb 2007 10:14:37 -0600 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> Message-ID: <20070206161437.GB20306@redhat.com> On Mon, Feb 05, 2007 at 10:35:26PM -0800, Brandon Lamb wrote: > On 2/5/07, Kenji Wakamiya wrote: > >David Teigland wrote: > >> On Mon, Feb 05, 2007 at 08:18:05PM +0900, Kenji Wakamiya wrote: > >>> I'm trying to use GFS2 on FC6 using hardware that had been able to > >>> successfully use GFS1 (tarball-1.03.00) on FC5 and Open-iSCSI (svn). > >>> But it is very unstable... ;-< > >> > >> If you want something stable and usable, you need to stick with GFS1. > > > >Okay, I'll wait more time for GFS2 and will try patches for GFS1 with > >newer kernels. Thank you! > > > >Kenji > > Correct me if I am wrong redhat, but it was my understanding that > development has moved to gfs2 and that in order to use a stable GFS v1 > setup one would have to run older software with old kernels in order > to get it working? > > Are you stuck with kernel 2.6.9 if you need stable gfs (v1)? No, GFS1 is very much alive and current and probably will be for a long time. GFS1 will be available on RHEL5 (based on 2.6.18+) just as it was on RHEL4. That GFS1 code comes from the RHEL5 cvs branch in the cluster tree. If you're not using RHEL, GFS1 is also being maintained for recent upstream kernels in the STABLE and HEAD cvs branches. So, there are 4 versions of GFS1 that are currently being maintained: cluster-infrastructure-v1 (cman-kernel): 1. RHEL4 kernels (cvs RHEL4 branch, cluster/gfs-kernel) 2. recent upstream kernels (cvs STABLE branch, cluster/gfs-kernel) cluster-infrastructure-v2 (openais): 3. RHEL5 kernels (cvs RHEL5 branch, cluster/gfs-kernel) 4. recent upstream kernels (cvs HEAD, cluster/gfs-kernel) We may decide to stop porting #2 forward after 2.6.20, but if others want to send patches to continue it, we'd be happy to take them. Dave From rpeterso at redhat.com Tue Feb 6 16:14:35 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Tue, 06 Feb 2007 10:14:35 -0600 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: <45C83FA0.10507@labs.fujitsu.com> References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> <45C83FA0.10507@labs.fujitsu.com> Message-ID: <45C8A96B.5050701@redhat.com> Kenji Wakamiya wrote: > Hello, > > Brandon Lamb wrote: >> Correct me if I am wrong redhat, but it was my understanding that >> development has moved to gfs2 and that in order to use a stable GFS v1 >> setup one would have to run older software with old kernels in order >> to get it working? > I think so, too. If I need to use truly stable GFS, I probably should > select RHEL4/CentOS4 and released version of GFS. In that sense, the > stability level which I need now may be a little bit lower. :) Hi Brandon and Kenji, Just a few days ago I posted a patch to the STABLE branch of CVS to bring GFS v1 up to the latest upstream 2.6.20-rc7 kernel. That's bleeding edge, not "older software with old kernels". We've got GFS1 for RHEL5 too. GFS v1 will be around for a long time, and on the latest kernels. Since GFS2 was accepted into the upstream kernel, we are also working to get that stabilized. 
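For anyone who wants to try that branch, a checkout roughly like this should do it (the anonymous pserver address and password below are from memory, so treat them as assumptions and check the project page if they fail):

  $ CVSROOT=:pserver:cvs@sources.redhat.com:/cvs/cluster
  $ cvs -d $CVSROOT login                       # anonymous password is reportedly "cvs"
  $ cvs -d $CVSROOT checkout -r STABLE cluster
  $ cd cluster && cvs -q update -dP             # -d makes update pick up newly added directories
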
I expect that people will naturally migrate from GFS v1 to GFS2 over time as it eventually makes GFS v1 obsolete, as ext3 did to ext2. Disclaimer: these are only my beliefs and opinions, not Red Hat Gospel because I'm not a manager and I'm not part of the decision-making process. Regards, Bob Peterson Red Hat Cluster Suite From teigland at redhat.com Tue Feb 6 16:34:01 2007 From: teigland at redhat.com (David Teigland) Date: Tue, 6 Feb 2007 10:34:01 -0600 Subject: [Linux-cluster] GFS2 on FC6 In-Reply-To: <20070206161437.GB20306@redhat.com> References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> <20070206161437.GB20306@redhat.com> Message-ID: <20070206163401.GD20306@redhat.com> On Tue, Feb 06, 2007 at 10:14:37AM -0600, David Teigland wrote: > On Mon, Feb 05, 2007 at 10:35:26PM -0800, Brandon Lamb wrote: > > On 2/5/07, Kenji Wakamiya wrote: > > >David Teigland wrote: > > >> On Mon, Feb 05, 2007 at 08:18:05PM +0900, Kenji Wakamiya wrote: > > >>> I'm trying to use GFS2 on FC6 using hardware that had been able to > > >>> successfully use GFS1 (tarball-1.03.00) on FC5 and Open-iSCSI (svn). > > >>> But it is very unstable... ;-< > > >> > > >> If you want something stable and usable, you need to stick with GFS1. > > > > > >Okay, I'll wait more time for GFS2 and will try patches for GFS1 with > > >newer kernels. Thank you! > > > > > >Kenji > > > > Correct me if I am wrong redhat, but it was my understanding that > > development has moved to gfs2 and that in order to use a stable GFS v1 > > setup one would have to run older software with old kernels in order > > to get it working? > > > > Are you stuck with kernel 2.6.9 if you need stable gfs (v1)? > > No, GFS1 is very much alive and current and probably will be for a long > time. GFS1 will be available on RHEL5 (based on 2.6.18+) just as it was > on RHEL4. That GFS1 code comes from the RHEL5 cvs branch in the cluster > tree. If you're not using RHEL, GFS1 is also being maintained for recent > upstream kernels in the STABLE and HEAD cvs branches. > > So, there are 4 versions of GFS1 that are currently being maintained: > > cluster-infrastructure-v1 (cman-kernel): > > 1. RHEL4 kernels (cvs RHEL4 branch, cluster/gfs-kernel) > 2. recent upstream kernels (cvs STABLE branch, cluster/gfs-kernel) > > cluster-infrastructure-v2 (openais): > > 3. RHEL5 kernels (cvs RHEL5 branch, cluster/gfs-kernel) > 4. recent upstream kernels (cvs HEAD, cluster/gfs-kernel) Also note that all 4 versions of GFS1 here are as similar as we can make them, so they should all weigh in at about the same level of stability. We haven't been making changes to GFS1 apart from bug fixes for a long time now. Dave From lshen at cisco.com Tue Feb 6 23:27:18 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Tue, 6 Feb 2007 15:27:18 -0800 Subject: [Linux-cluster] Minimum journal size Message-ID: <08A9A3213527A6428774900A80DBD8D80367752D@xmb-sjc-222.amer.cisco.com> According to gfs_mkfs man page, the minimum journal size is 32MB and each node needs at least one journal. Are those hard requirements? Is it possible to lower the minimum number with some performance reduction? We have a use case that need to run gfs on a 512MB Compact Flash to share among a few nodes. Based on the current minimum requirements on journal and resource group, the disk space overhead is too much. 
Lin From rpeterso at redhat.com Tue Feb 6 23:42:43 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Tue, 06 Feb 2007 17:42:43 -0600 Subject: [Linux-cluster] Minimum journal size In-Reply-To: <08A9A3213527A6428774900A80DBD8D80367752D@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D80367752D@xmb-sjc-222.amer.cisco.com> Message-ID: <45C91273.6040807@redhat.com> Lin Shen (lshen) wrote: > According to gfs_mkfs man page, the minimum journal size is 32MB and > each node needs at least one journal. > > Are those hard requirements? Is it possible to lower the minimum number > with some performance reduction? We have a use case that need to run gfs > on a 512MB Compact Flash to share among a few nodes. Based on the > current minimum requirements on journal and resource group, the disk > space overhead is too much. > > Lin > Hi Lin, You can create a gfs file system with journals smaller than 32MB by using the undocumented, unrecommended, unsupported -X option (expert mode). Something like this: gfs_mkfs -X -J 16 ... This gfs_mkfs option is mostly used for testing weird file system conditions. I haven't studied the journal code well enough to know if this will work, and if it does, how well it will work. Use it at your own risk. Regards, Bob Peterson Red Hat Cluster Suite From lshen at cisco.com Wed Feb 7 01:27:43 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Tue, 6 Feb 2007 17:27:43 -0800 Subject: [Linux-cluster] Minimum journal size In-Reply-To: <45C91273.6040807@redhat.com> Message-ID: <08A9A3213527A6428774900A80DBD8D8036775C5@xmb-sjc-222.amer.cisco.com> Hi Bob, How a smaller than minimum journal size (32MB) would potentially affect the file system? In other words, would it most likely to affect performance or data integrity? Understanding those will help us to test It out. Lin > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Robert Peterson > Sent: Tuesday, February 06, 2007 3:43 PM > To: linux clustering > Subject: Re: [Linux-cluster] Minimum journal size > > Lin Shen (lshen) wrote: > > According to gfs_mkfs man page, the minimum journal size is > 32MB and > > each node needs at least one journal. > > > > Are those hard requirements? Is it possible to lower the minimum > > number with some performance reduction? We have a use case > that need > > to run gfs on a 512MB Compact Flash to share among a few > nodes. Based > > on the current minimum requirements on journal and resource > group, the > > disk space overhead is too much. > > > > Lin > > > Hi Lin, > > You can create a gfs file system with journals smaller than > 32MB by using the undocumented, unrecommended, unsupported -X > option (expert mode). > Something like this: gfs_mkfs -X -J 16 ... > > This gfs_mkfs option is mostly used for testing weird file > system conditions. > I haven't studied the journal code well enough to know if > this will work, and if it does, how well it will work. Use > it at your own risk. 
> > Regards, > > Bob Peterson > Red Hat Cluster Suite > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From benoit.duffau at devoteam.com Wed Feb 7 08:59:33 2007 From: benoit.duffau at devoteam.com (Benoit DUFFAU) Date: Wed, 07 Feb 2007 09:59:33 +0100 Subject: [Linux-cluster] GFS1 from cvs HEAD In-Reply-To: <20070206161437.GB20306@redhat.com> References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> <20070206161437.GB20306@redhat.com> Message-ID: <1170838773.16660.15.camel@localhost.localdomain> Le mardi 06 f?vrier 2007 ? 10:14 -0600, David Teigland a ?crit : > On Mon, Feb 05, 2007 at 10:35:26PM -0800, Brandon Lamb wrote: > > On 2/5/07, Kenji Wakamiya wrote: > > > > Are you stuck with kernel 2.6.9 if you need stable gfs (v1)? > > No, GFS1 is very much alive and current and probably will be for a long > time. GFS1 will be available on RHEL5 (based on 2.6.18+) just as it was > on RHEL4. That GFS1 code comes from the RHEL5 cvs branch in the cluster > tree. If you're not using RHEL, GFS1 is also being maintained for recent > upstream kernels in the STABLE and HEAD cvs branches. > > So, there are 4 versions of GFS1 that are currently being maintained: > > cluster-infrastructure-v1 (cman-kernel): > > 1. RHEL4 kernels (cvs RHEL4 branch, cluster/gfs-kernel) > 2. recent upstream kernels (cvs STABLE branch, cluster/gfs-kernel) > > cluster-infrastructure-v2 (openais): > > 3. RHEL5 kernels (cvs RHEL5 branch, cluster/gfs-kernel) > 4. recent upstream kernels (cvs HEAD, cluster/gfs-kernel) > I'm very interested in #4 but the doc/usage.txt focuses on GFS2, i don't see how i could get cvs HEAD GFS1 running with the new architecture ... I found commented lines in the configure and Makefile, i uncommented them, after few compile issues against my 2.6.19.2 i get a gfs.ko module i try to modprobe it and i have those errors : gfs: Unknown symbol generic_file_read gfs: Unknown symbol __generic_file_aio_read gfs: Unknown symbol gfs2_unmount_lockproto gfs: Unknown symbol gfs2_withdraw_lockproto gfs: Unknown symbol generic_file_write_nolock gfs: Unknown symbol gfs2_mount_lockproto the problem is that i already modproded dlm and lock_dlm before probing gfs... (i even probed gfs2 :) ) other problem is : even if i had succeeded in modprobing gfs module, what do i have to do next ? ccsd -X ; cman_tool join ; groupd ; fenced ; fence_tool join ; dlm_controld ; gfs_controld ? and using mkfs_gfs, mount.gfs ... If someone has already installed GFS1 from CVS HEAD, i'd be glad to see a quick and dirty howto :) Regards, Benoit DUFFAU +---------------------------------------------------------------------+ Combining consulting and technology solutions offers enables Devoteam to provide its customers with independent advice and effective solutions that meet their strategic objectives (IT performance and optimisation) in complementary areas: networks, systems infrastructure, security and e-business applications. Created in 1995, Devoteam achieved in 2005 a turnover of 199 million euros and an operating margin of 7%. The group counts 2,400 employees through sixteen countries in Europe, the Middle East and North Africa. Listed on Euronext (Eurolist B compartment) since October 28, 1999. 
Part of the Nexteconomy, CAC SMALL 90, IT CAC 50, SBF 250 index of Euronext Paris ISIN: FR 000007379 3, Reuters: DVTM.LM, Bloomberg: DEVO FP +---------------------------------------------------------------------+ From Alain.Moulle at bull.net Wed Feb 7 10:49:59 2007 From: Alain.Moulle at bull.net (Alain Moulle) Date: Wed, 07 Feb 2007 11:49:59 +0100 Subject: [Linux-cluster] CS4 U4 / problem about Heart-Beat Message-ID: <45C9AED7.6030901@bull.net> Hi With a HA pair node1 /node2 in mutual takeover : node1 fails, so node2 is retreiving services normally from node1. node1 reboots CS4 is re-started on node1 after reboot. But as soon as it is re-started, it happens that node2 fence again immediately the node1 . Logs identify the "Missing too many heart-beats" cause of fencing. It happens 1 time on 10 tries, not systematically. Is there an already known problem around this ? Thanks Alain Moull? From cesar at ati.es Wed Feb 7 12:23:50 2007 From: cesar at ati.es (Cesar O. Pablo) Date: Wed, 07 Feb 2007 13:23:50 +0100 Subject: [Linux-cluster] Easy and simple question if you already knows the answer Message-ID: <200702071223.l17CNuIB018682@mx2.redhat.com> Hi, We have plans to build a 30TB aprox. data storage using a set of SAN boxes configured as RAID 5, our understanding is that using LVM2 with Red Hat 4 and our 64 bit AMD opteron processors shall allow us to do that. Our concern is to know in advance if our approach is wrong or even better if somebody else has been successfully taking the same road. Documentation tells us that the limit is in the order of the 8 Exabytes (8TB for IA-32) but this not enough as this is a go / no go decision. Thanks for any help or pointer to a similar rig. Cesar O. Pablo PS: By the way, we already discovered that the max. physical partition cannot exceed 2TB with our Hardware. Cesar From lhh at redhat.com Wed Feb 7 14:32:49 2007 From: lhh at redhat.com (Lon Hohberger) Date: Wed, 07 Feb 2007 09:32:49 -0500 Subject: [Linux-cluster] CS4 U4 / problem about Heart-Beat In-Reply-To: <45C9AED7.6030901@bull.net> References: <45C9AED7.6030901@bull.net> Message-ID: <1170858770.7044.3.camel@localhost.localdomain> On Wed, 2007-02-07 at 11:49 +0100, Alain Moulle wrote: > Hi > > With a HA pair node1 /node2 in mutual takeover : > > node1 fails, so node2 is retreiving services normally > from node1. > node1 reboots > CS4 is re-started on node1 after reboot. > But as soon as it is re-started, it happens that > node2 fence again immediately the node1 . > Logs identify the "Missing too many heart-beats" cause > of fencing. > It happens 1 time on 10 tries, not systematically. > > Is there an already known problem around this ? Does it only happen on boot, or does it occasionally happen at other times? -- Lon From teigland at redhat.com Wed Feb 7 15:29:53 2007 From: teigland at redhat.com (David Teigland) Date: Wed, 7 Feb 2007 09:29:53 -0600 Subject: [Linux-cluster] GFS1 from cvs HEAD In-Reply-To: <1170838773.16660.15.camel@localhost.localdomain> References: <45C7126D.5000300@labs.fujitsu.com> <20070205150256.GA24917@redhat.com> <45C82052.1080606@labs.fujitsu.com> <20070206161437.GB20306@redhat.com> <1170838773.16660.15.camel@localhost.localdomain> Message-ID: <20070207152953.GA15952@redhat.com> On Wed, Feb 07, 2007 at 09:59:33AM +0100, Benoit DUFFAU wrote: > Le mardi 06 f?vrier 2007 ? 
10:14 -0600, David Teigland a ?crit : > > On Mon, Feb 05, 2007 at 10:35:26PM -0800, Brandon Lamb wrote: > > > On 2/5/07, Kenji Wakamiya wrote: > > > > > > Are you stuck with kernel 2.6.9 if you need stable gfs (v1)? > > > > No, GFS1 is very much alive and current and probably will be for a long > > time. GFS1 will be available on RHEL5 (based on 2.6.18+) just as it was > > on RHEL4. That GFS1 code comes from the RHEL5 cvs branch in the cluster > > tree. If you're not using RHEL, GFS1 is also being maintained for recent > > upstream kernels in the STABLE and HEAD cvs branches. > > > > So, there are 4 versions of GFS1 that are currently being maintained: > > > > cluster-infrastructure-v1 (cman-kernel): > > > > 1. RHEL4 kernels (cvs RHEL4 branch, cluster/gfs-kernel) > > 2. recent upstream kernels (cvs STABLE branch, cluster/gfs-kernel) > > > > cluster-infrastructure-v2 (openais): > > > > 3. RHEL5 kernels (cvs RHEL5 branch, cluster/gfs-kernel) > > 4. recent upstream kernels (cvs HEAD, cluster/gfs-kernel) > > > > I'm very interested in #4 but the doc/usage.txt focuses on GFS2, i don't > see how i could get cvs HEAD GFS1 running with the new architecture ... > > I found commented lines in the configure and Makefile, i uncommented > them, after few compile issues against my 2.6.19.2 i get a gfs.ko module > > i try to modprobe it and i have those errors : > > gfs: Unknown symbol generic_file_read > gfs: Unknown symbol __generic_file_aio_read > gfs: Unknown symbol generic_file_write_nolock Bob recently checked some gfs1 changes into cvs head to make it work on 2.6.20 -- these should go away with that update. > gfs: Unknown symbol gfs2_unmount_lockproto > gfs: Unknown symbol gfs2_withdraw_lockproto > gfs: Unknown symbol gfs2_mount_lockproto GFS1 uses the lockproto stuff in GFS2 to connect with the lock modules. You need to add EXPORT_SYMBOL's for these three functions to fs/gfs2/locking.c and recompile gfs2. (My attempts to add these exports upstream have been rejected.) > the problem is that i already modproded dlm and lock_dlm before probing > gfs... (i even probed gfs2 :) ) GFS1 uses dlm, lock_dlm and gfs2 from the upstream kernels. > other problem is : even if i had succeeded in modprobing gfs module, > what do i have to do next ? ccsd -X ; cman_tool join ; groupd ; fenced ; > fence_tool join ; dlm_controld ; gfs_controld ? and using mkfs_gfs, > mount.gfs ... Yes, the same steps as using GFS2. Dave From rpeterso at redhat.com Wed Feb 7 15:28:06 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Wed, 07 Feb 2007 09:28:06 -0600 Subject: [Linux-cluster] [PATCH] Port STABLE branch to upstream kernel In-Reply-To: References: <45C3C3D0.3060209@redhat.com> Message-ID: <45C9F006.8050207@redhat.com> Hi Cluster People, The patch to bring STABLE up to the 2.6.20-rc7 kernel is now applied to the STABLE tree in CVS. Let me know if you have problems. Regards, Bob Peterson Red Hat Cluster Suite From rpeterso at redhat.com Wed Feb 7 15:31:43 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Wed, 07 Feb 2007 09:31:43 -0600 Subject: [Linux-cluster] Minimum journal size In-Reply-To: <08A9A3213527A6428774900A80DBD8D8036775C5@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D8036775C5@xmb-sjc-222.amer.cisco.com> Message-ID: <45C9F0DF.8020405@redhat.com> Lin Shen (lshen) wrote: > Hi Bob, > > How a smaller than minimum journal size (32MB) would potentially affect > the file system? In other words, would it most likely to affect > performance or data integrity? 
Understanding those will help us to test > It out. > > Lin > >> I haven't studied the journal code well enough to know if >> this will work, and if it does, how well it will work. Use >> it at your own risk. >> Hi Lin, As I said (quoted above), I haven't studied the journal code of GFS well enough to know how well it will work. Sorry. Regards, Bob Peterson Red Hat Cluster Suite From rpeterso at redhat.com Wed Feb 7 15:43:57 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Wed, 07 Feb 2007 09:43:57 -0600 Subject: [Linux-cluster] GFS1 + 2.6.18 patch In-Reply-To: <45C71BED.3080609@protei.ru> References: <20070203170007.BB09573753@hormel.redhat.com> <45C71BED.3080609@protei.ru> Message-ID: <45C9F3BD.4070602@redhat.com> Nickolay wrote: > Patch fixes GFS crashing on 2.6.18.6 > If anyone will have problems with this patch, let me know. > Applied to the latest CVS STABLE sources. Hi Nickolay, Thank you for the patch. Unfortunately, I don't like the idea of cluttering up the code with version-based conditional compile directives, especially now that STABLE is ported to the 2.6.20 kernel. (Your patch would probably require even more #ifdefs for <=2.6.17, and < 2.6.20 or some such.) So I'm not planning to apply this to the STABLE branch in CVS at this time. Regards, Bob Peterson Red Hat Cluster Suite From teigland at redhat.com Wed Feb 7 18:44:58 2007 From: teigland at redhat.com (David Teigland) Date: Wed, 7 Feb 2007 12:44:58 -0600 Subject: [Linux-cluster] Minimum journal size In-Reply-To: <08A9A3213527A6428774900A80DBD8D8036775C5@xmb-sjc-222.amer.cisco.com> References: <45C91273.6040807@redhat.com> <08A9A3213527A6428774900A80DBD8D8036775C5@xmb-sjc-222.amer.cisco.com> Message-ID: <20070207184457.GB15952@redhat.com> On Tue, Feb 06, 2007 at 05:27:43PM -0800, Lin Shen (lshen) wrote: > How a smaller than minimum journal size (32MB) would potentially affect > the file system? In other words, would it most likely to affect > performance or data integrity? Understanding those will help us to test > It out. I was curious, so I did: lvcreate -L 512MB -n small bench gfs_mkfs -X -J 16 -r 16 -j 4 -p lock_dlm -t bench:s /dev/bench/small mounted the fs on four nodes, and ran some misc load on all nodes; it worked fine. Dave From Alain.Moulle at bull.net Thu Feb 8 09:19:30 2007 From: Alain.Moulle at bull.net (Alain Moulle) Date: Thu, 08 Feb 2007 10:19:30 +0100 Subject: [Linux-cluster] Re: CS4 U4 / problem about Heart-Beat (Lon Hohberger) Message-ID: <45CAEB22.4060108@bull.net> Hi Lon It only happens after a reboot (the node1 beeing stopped by a reset command), never in normal situation. Alain >> Hi >> >> With a HA pair node1 /node2 in mutual takeover : >> >> node1 fails, so node2 is retreiving services normally >> from node1. >> node1 reboots >> CS4 is re-started on node1 after reboot. >> But as soon as it is re-started, it happens that >> node2 fence again immediately the node1 . >> Logs identify the "Missing too many heart-beats" cause >> of fencing. >> It happens 1 time on 10 tries, not systematically. >> >> Is there an already known problem around this ? >Does it only happen on boot, or does it occasionally happen at other >times? >-- Lon -- mailto:Alain.Moulle at bull.net +------------------------------+--------------------------------+ | Alain Moull? | from France : 04 76 29 75 99 | | | FAX number : 04 76 29 72 49 | | Bull SA | | | 1, Rue de Provence | Adr : FREC B1-041 | | B.P. 
208 | | | 38432 Echirolles - CEDEX | Email: Alain.Moulle at bull.net | | France | BCOM : 229 7599 | +-------------------------------+-------------------------------+ From shirai at sc-i.co.jp Thu Feb 8 09:34:54 2007 From: shirai at sc-i.co.jp (Shirai@SystemCreateINC) Date: Thu, 8 Feb 2007 18:34:54 +0900 Subject: [Linux-cluster] Easy and simple question if you already knows theanswer References: <200702071223.l17CNuIB018682@mx2.redhat.com> Message-ID: <002c01c74b64$63d3b4a0$6200a8c0@tostar> Dear Cesar, I connected 16 LUN of 1.5TB and created one LVM. And, the GFS filesystem was able to be created correctly. Perhaps, you will be able also to create the GFS filesystem of 30TB. However, the df command is very slow :<. # uname -r 2.6.9-34.ELsmp --- Logical volume --- LV Name /dev/vg00/gfslv00 VG Name vg00 LV UUID bNmNaQ-NcRp-Kmp2-AwVx-LEFC-8HlJ-BIeSRZ LV Write Access read/write LV Status available # open 1 LV Size 23.00 TB Current LE 6029312 Segments 16 Allocation inherit Read ahead sectors 0 Block device 253:2 # cat /proc/partitions major minor #blocks name 8 48 1562488832 sdd 8 64 1562488832 sde 8 80 1562488832 sdf 8 96 1562488832 sdg 8 112 1562488832 sdh 8 128 1562488832 sdi 8 144 1562488832 sdj 8 160 1562488832 sdk 8 176 1562488832 sdl 8 192 1562488832 sdm 8 208 1562488832 sdn 8 224 1562488832 sdo 8 240 1562488832 sdp 65 0 1562488832 sdq 65 16 1562488832 sdr 65 32 1562488832 sds 65 48 16777216 sdt 253 2 24696061952 dm-2 Regards ------------------------------------------------------ Shirai Noriyuki Chief Engineer Technical Div. System Create Inc Kanda Toyo Bldg, 3-4-2 Kandakajicho Chiyodaku Tokyo 101-0045 Japan Tel81-3-5296-3775 Fax81-3-5296-3777 e-mail:shirai at sc-i.co.jp web:http://www.sc-i.co.jp ------------------------------------------------------ > Hi, > > > We have plans to build a 30TB aprox. data storage using a set of SAN boxes > configured as > RAID 5, our understanding is that using LVM2 with Red Hat 4 and our 64 bit > AMD opteron > processors shall allow us to do that. Our concern is to know in advance if > our approach is > wrong or even better if somebody else has been successfully taking the > same road. > > Documentation tells us that the limit is in the order of the 8 Exabytes > (8TB for IA-32) > but this not enough as this is a go / no go decision. > > Thanks for any help or pointer to a similar rig. > > Cesar O. Pablo > > > > PS: By the way, we already discovered that the max. physical partition > cannot exceed 2TB > with our Hardware. > > > > > > Cesar > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > No virus found in this incoming message. > Checked by AVG Free Edition. > Version: 7.5.432 / Virus Database: 268.17.30/674 - Release Date: > 2007/02/07 15:33 > > From rh-cluster at menole.net Thu Feb 8 11:34:52 2007 From: rh-cluster at menole.net (rh-cluster at menole.net) Date: Thu, 8 Feb 2007 12:34:52 +0100 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent Message-ID: <20070208113452.GA8178@shlemil.dyndns.org> Hi, since some days I do get a withdraw on 1 node of my 6 nodes gfs1 cluster. Yesterday I did reboot all nodes. Now the problem has moved to another node. 
kernel messages are the same anytime: GFS: fsid=epsilon:amal.1: fatal: assertion "x <= length" failed GFS: fsid=epsilon:amal.1: function = blkalloc_internal GFS: fsid=epsilon:amal.1: file = /build/buildd/linux-modules-extra-2.6-2.6.17/debian/build/build_amd64_none_amd64_redhat-cluster/gfs/gfs/rgrp.c, line = 1458 GFS: fsid=epsilon:amal.1: time = 1170922910 GFS: fsid=epsilon:amal.1: about to withdraw from the cluster GFS: fsid=epsilon:amal.1: waiting for outstanding I/O GFS: fsid=epsilon:amal.1: telling LM to withdraw lock_dlm: withdraw abandoned memory GFS: fsid=epsilon:amal.1: withdrawn `gfs_tool df` says: /home: SB lock proto = "lock_dlm"rently mounted GFS filesystems. Each line repre- SB lock table = "epsilon:affaire"The columns represent (in order): 1) A num- SB ondisk format = 1309s a cookie that represents the mounted filesystem. 2) SB multihost format = 1401e device that holds the filesystem (well, the name Block size = 4096he Linux kernel knows it). 3) The lock table field that the Journals = 12ilesystem was mounted with. Resource Groups = 1166 Mounted lock proto = "lock_dlm"rsize] Mounted lock table = "epsilon:amal"t the locks this machine holds for a Mounted host data = ""esystem. Buffersize is the size of the buffer (in Journal number = 0 that gfs_tool allocates to store the lock data during Lock module flags = ng. It defaults to 4194304 bytes. Local flocks = FALSE Local caching = FALSE Oopses OK = FALSE loads arguments into the module what will override the mount options passed with the -o field on the next mount. See Type Total Used Free use% ------------------------------------------------------------------------ inodes 731726 731726 0 100% metadata 329491 4392 325099 1%cks. data 75336111 4646188 70689923 6% System: 6 Dual AMD Opteron Kernel 2.6.17-2-amd64 Userland 32 Bit Storage device via qlogic fibre channel qla2xxx, without serious problems No LVM Kind Regards, menole From wcheng at redhat.com Thu Feb 8 15:00:32 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Thu, 08 Feb 2007 10:00:32 -0500 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <20070208113452.GA8178@shlemil.dyndns.org> References: <20070208113452.GA8178@shlemil.dyndns.org> Message-ID: <1170946832.3452.5.camel@localhost> On Thu, 2007-02-08 at 12:34 +0100, rh-cluster at menole.net wrote: > > kernel messages are the same anytime: > > GFS: fsid=epsilon:amal.1: fatal: assertion "x <= length" failed > GFS: fsid=epsilon:amal.1: function = blkalloc_internal > GFS: fsid=epsilon:amal.1: file = > /build/buildd/linux-modules-extra-2.6-2.6.17/debian/build/build_amd64_none_amd64_redhat-cluster/gfs/gfs/rgrp.c, > line = 1458 > GFS: fsid=epsilon:amal.1: time = 1170922910 > GFS: fsid=epsilon:amal.1: about to withdraw from the cluster > GFS: fsid=epsilon:amal.1: waiting for outstanding I/O > GFS: fsid=epsilon:amal.1: telling LM to withdraw > lock_dlm: withdraw abandoned memory > GFS: fsid=epsilon:amal.1: withdrawn Could you go to any of the good nodes and do a plain "df" command ? Please mail out the 'df' output. -- Wendy From axehind007 at yahoo.com Thu Feb 8 15:26:47 2007 From: axehind007 at yahoo.com (Brian Pontz) Date: Thu, 8 Feb 2007 07:26:47 -0800 (PST) Subject: [Linux-cluster] kernel panic Message-ID: <77900.45969.qm@web33210.mail.mud.yahoo.com> I got the following kernel panic last night on one of my cluster nodes. Does this look like a known bug? I'll try and give some other useful info and let me know if you need any other info. 
OS: CentOS release 4.4 (Final) uname -a Linux scylla1 2.6.9-42.0.3.ELsmp #1 SMP Fri Oct 6 06:21:39 CDT 2006 i686 i686 i386 GNU/Linux --------------------------------------------------- Feb 7 18:28:59 scylla1 clurgmgrd: [4346]: Executing /etc/init.d/httpd status Feb 7 18:28:59 scylla1 clurgmgrd: [4346]: Executing /etc/init.d/mysqld status Feb 7 18:29:00 scylla1 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000 Feb 7 18:29:00 scylla1 kernel: printing eip: Feb 7 18:29:00 scylla1 kernel: f8c743a6 Feb 7 18:29:00 scylla1 kernel: *pde = 00004001 Feb 7 18:29:00 scylla1 kernel: Oops: 0000 [#1] Feb 7 18:29:00 scylla1 kernel: SMP Feb 7 18:29:00 scylla1 kernel: Modules linked in: iptable_filter ip_tables nfs nfsd exportfs lockd nfs_acl parport_pc lp parport autofs4 i2c_dev i2c_core lock_dlm(U) gfs (U) lock_harness(U) dlm(U) cman(U) sunrpc dm_mirror dm_multipath dm_mod button battery ac md5 ipv6 uhci_hcd ehci_hcd hw_random tg3 floppy ext3 jbd cciss sd_mod scsi_mod Feb 7 18:29:00 scylla1 kernel: CPU: 1 Feb 7 18:29:00 scylla1 kernel: EIP: 0060:[] Not tainted VLI Feb 7 18:29:00 scylla1 kernel: EFLAGS: 00010206 (2.6.9-42.0.3.ELsmp) Feb 7 18:29:00 scylla1 kernel: EIP is at gfs_glock_dq+0xaf/0x16e [gfs] Feb 7 18:29:00 scylla1 kernel: eax: d3758b30 ebx: d3758b24 ecx: f7f46400 edx: 00000000 Feb 7 18:29:00 scylla1 kernel: esi: 00000000 edi: d3758b08 ebp: f6a8b41c esp: e9adae98 Feb 7 18:29:00 scylla1 kernel: ds: 007b es: 007b ss: 0068 Feb 7 18:29:00 scylla1 kernel: Process ypserv (pid: 31510, threadinfo=e9ada000 task=f387c130) Feb 7 18:29:00 scylla1 kernel: Stack: 0219c74e ee98019c f8ca96a0 f8945000 f6a8b41c f6a8b41c f6a8b404 f6a8b400 Feb 7 18:29:00 scylla1 kernel: f8c747aa f1cb7280 f8c8945c e9adaeec f1cb7280 00000000 00000007 f1cb7280 Feb 7 18:29:00 scylla1 kernel: f8c894d0 f1cb7280 f8ca98e0 efeb2e20 c016e8ac 00000000 00000000 00000000 Feb 7 18:29:00 scylla1 kernel: Call Trace: Feb 7 18:29:00 scylla1 kernel: [] gfs_glock_dq_uninit+0x8/0x10 [gfs] Feb 7 18:29:00 scylla1 kernel: [] do_unflock+0x4f/0x61 [gfs] Feb 7 18:29:00 scylla1 kernel: [] gfs_flock+0x62/0x76 [gfs] Feb 7 18:29:00 scylla1 kernel: [] locks_remove_flock+0x49/0xe1 Feb 7 18:29:00 scylla1 kernel: [] __fput+0x41/0x100 Feb 7 18:29:00 scylla1 kernel: [] filp_close+0x59/0x5f Feb 7 18:29:00 scylla1 kernel: [] put_files_struct+0x57/0xc0 Feb 7 18:29:00 scylla1 kernel: [] do_exit+0x245/0x404 Feb 7 18:29:00 scylla1 kernel: [] sys_exit_group+0x0/0xd Feb 7 18:29:00 scylla1 kernel: [] syscall_call+0x7/0xb Feb 7 18:29:00 scylla1 kernel: Code: f8 ba 57 85 c9 f8 68 2d 82 c9 f8 8b 44 24 14 e8 e0 1e 02 00 59 5b f6 45 15 08 74 06 f0 0f ba 6f 08 04 f6 45 15 04 74 38 8b 57 28 <8b > 02 0f 18 00 90 8d 47 28 39 c2 74 0b ff 04 24 89 54 24 04 8b Feb 7 18:29:00 scylla1 kernel: <0>Fatal exception: panic in 5 seconds From rh-cluster at menole.net Thu Feb 8 15:49:53 2007 From: rh-cluster at menole.net (rh-cluster at menole.net) Date: Thu, 8 Feb 2007 16:49:53 +0100 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <1170946832.3452.5.camel@localhost> References: <20070208113452.GA8178@shlemil.dyndns.org> <1170946832.3452.5.camel@localhost> Message-ID: <20070208154953.GA12788@shlemil.dyndns.org> Hi, df Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda1 30755832 1326260 27867256 5% / tmpfs 4054504 80 4054424 1% /dev/shm /dev/sda5 96124904 6868672 84373280 8% /home_local /dev/sda6 57715868 11211396 43572612 21% /var tmpfs 10240 72 10168 1% /dev /dev/sdc1 305589312 21658372 283930940 8% /home the /dev/sdc1 is the 
GFS one On Thu, Feb 08, 2007 at 10:00:32AM -0500, Wendy Cheng wrote: > On Thu, 2007-02-08 at 12:34 +0100, rh-cluster at menole.net wrote: > > > > > kernel messages are the same anytime: > > > > GFS: fsid=epsilon:amal.1: fatal: assertion "x <= length" failed > > GFS: fsid=epsilon:amal.1: function = blkalloc_internal > > GFS: fsid=epsilon:amal.1: file = > > /build/buildd/linux-modules-extra-2.6-2.6.17/debian/build/build_amd64_none_amd64_redhat-cluster/gfs/gfs/rgrp.c, > > line = 1458 > > GFS: fsid=epsilon:amal.1: time = 1170922910 > > GFS: fsid=epsilon:amal.1: about to withdraw from the cluster > > GFS: fsid=epsilon:amal.1: waiting for outstanding I/O > > GFS: fsid=epsilon:amal.1: telling LM to withdraw > > lock_dlm: withdraw abandoned memory > > GFS: fsid=epsilon:amal.1: withdrawn > > Could you go to any of the good nodes and do a plain "df" command ? > Please mail out the 'df' output. > > -- Wendy > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- Regards, menole From orkcu at yahoo.com Thu Feb 8 16:15:12 2007 From: orkcu at yahoo.com (Roger Peña Escobio) Date: Thu, 8 Feb 2007 08:15:12 -0800 (PST) Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <20070208154953.GA12788@shlemil.dyndns.org> Message-ID: <418582.74667.qm@web50602.mail.yahoo.com> --- rh-cluster at menole.net wrote: > Hi, > > df > Filesystem 1K-blocks Used Available > Use% Mounted on > /dev/sda1 30755832 1326260 27867256 > 5% / > tmpfs 4054504 80 4054424 > 1% /dev/shm > /dev/sda5 96124904 6868672 84373280 > 8% /home_local > /dev/sda6 57715868 11211396 43572612 > 21% /var > tmpfs 10240 72 10168 > 1% /dev > /dev/sdc1 305589312 21658372 283930940 > 8% /home > what about "df -i" ? I think I remember from your messages that there wasn't free inodes left... cu roger __________________________________________ RedHat Certified Engineer ( RHCE ) Cisco Certified Network Associate ( CCNA ) ____________________________________________________________________________________ Any questions? Get answers on any topic at www.Answers.yahoo.com. Try it now. From rh-cluster at menole.net Thu Feb 8 17:12:31 2007 From: rh-cluster at menole.net (rh-cluster at menole.net) Date: Thu, 8 Feb 2007 18:12:31 +0100 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <418582.74667.qm@web50602.mail.yahoo.com> References: <20070208154953.GA12788@shlemil.dyndns.org> <418582.74667.qm@web50602.mail.yahoo.com> Message-ID: <20070208171231.GA16945@shlemil.dyndns.org> Hi, df -i Filesystem Inodes IUsed IFree IUse% Mounted on /dev/sda1 3908128 55280 3852848 2% / tmpfs 1013626 23 1013603 1% /dev/shm /dev/sda5 12222464 107967 12114497 1% /home_local /dev/sda6 7340032 5729 7334303 1% /var tmpfs 1013626 741 1012885 1% /dev /dev/sdc1 71712409 740007 70972402 2% /home /dev/sdc1 is is the GFS1 partition On Thu, Feb 08, 2007 at 08:15:12AM -0800, Roger Pe?a Escobio wrote: > > --- rh-cluster at menole.net wrote: > > > Hi, > > > > df > > Filesystem 1K-blocks Used Available > > Use% Mounted on > > /dev/sda1 30755832 1326260 27867256 > > 5% / > > tmpfs 4054504 80 4054424 > > 1% /dev/shm > > /dev/sda5 96124904 6868672 84373280 > > 8% /home_local > > /dev/sda6 57715868 11211396 43572612 > > 21% /var > > tmpfs 10240 72 10168 > > 1% /dev > > /dev/sdc1 305589312 21658372 283930940 > > 8% /home > > > > what about "df -i" ? > I think I remember from your messages that there > wasn't free inodes left... 
> > cu > roger > > __________________________________________ > RedHat Certified Engineer ( RHCE ) > Cisco Certified Network Associate ( CCNA ) > > > > ____________________________________________________________________________________ > Any questions? Get answers on any topic at www.Answers.yahoo.com. Try it now. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- Regards menole From srramasw at cisco.com Thu Feb 8 18:02:50 2007 From: srramasw at cisco.com (Sridharan Ramaswamy (srramasw)) Date: Thu, 8 Feb 2007 10:02:50 -0800 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <20070208113452.GA8178@shlemil.dyndns.org> Message-ID: Interesting. While testing GFS with low jounrnal size and ResourceGroup size, I hit the same issue, Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: fatal: assertion "x <= length" failed Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: function = blkalloc_internal Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: file = /download/gfs/cluster.cvs-rhel4/gfs-kernel/src/gfs/rgrp.c, line = 1458 Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: time = 1170896502 Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: about to withdraw from the cluster Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: waiting for outstanding I/O Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: telling LM to withdraw This happened on a 3 node GFS over 512M device. $ gfs_mkfs -t cisco:gfs2 -p lock_dlm -j 3 -J 8 -r 16 -X /dev/hda12 I was using bonnie++ to create about 10K files of 1K each from each of 3 nodes simulataneous. Look at the code in rgrp.c it seems related to failure to find a particular resource group block. Could this be due to a very low RG size I'm using (16M) ?? Thanks, Sridharan > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of > rh-cluster at menole.net > Sent: Thursday, February 08, 2007 3:35 AM > To: linux-cluster at redhat.com > Subject: [Linux-cluster] GFS1: node get withdrawn intermittent > > Hi, > > since some days I do get a withdraw on 1 node of my 6 nodes > gfs1 cluster. > Yesterday I did reboot all nodes. Now the problem has moved to another > node. > > kernel messages are the same anytime: > > GFS: fsid=epsilon:amal.1: fatal: assertion "x <= length" failed > GFS: fsid=epsilon:amal.1: function = blkalloc_internal > GFS: fsid=epsilon:amal.1: file = > /build/buildd/linux-modules-extra-2.6-2.6.17/debian/build/buil d_amd64_none_amd64_redhat-cluster/gfs/gfs/rgrp.c, > line = 1458 > GFS: fsid=epsilon:amal.1: time = 1170922910 > GFS: fsid=epsilon:amal.1: about to withdraw from the cluster > GFS: fsid=epsilon:amal.1: waiting for outstanding I/O > GFS: fsid=epsilon:amal.1: telling LM to withdraw > lock_dlm: withdraw abandoned memory > GFS: fsid=epsilon:amal.1: withdrawn > > `gfs_tool df` says: > /home: > SB lock proto = "lock_dlm"rently mounted GFS filesystems. > Each line > repre- > SB lock table = "epsilon:affaire"The columns represent (in > order): 1) > A num- > SB ondisk format = 1309s a cookie that represents the mounted > filesystem. 2) > SB multihost format = 1401e device that holds the > filesystem (well, the > name > Block size = 4096he Linux kernel knows it). 3) The lock table field > that the > Journals = 12ilesystem was mounted with. 
> Resource Groups = 1166 > Mounted lock proto = "lock_dlm"rsize] > Mounted lock table = "epsilon:amal"t the locks this machine holds > for a > Mounted host data = ""esystem. Buffersize is the size of the > buffer (in > Journal number = 0 that gfs_tool allocates to store the > lock data > during > Lock module flags = ng. It defaults to 4194304 bytes. > Local flocks = FALSE > Local caching = FALSE > Oopses OK = FALSE loads arguments into the module what will > override the > mount options passed with the -o field on the > next mount. > See > Type Total Used Free use% > > -------------------------------------------------------------- > ---------- > inodes 731726 731726 0 100% > metadata 329491 4392 325099 1%cks. > data 75336111 4646188 70689923 6% > > > System: > 6 Dual AMD Opteron > Kernel 2.6.17-2-amd64 > Userland 32 Bit > Storage device via qlogic fibre channel qla2xxx, without > serious problems > No LVM > > > Kind Regards, > > menole > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From wcheng at redhat.com Thu Feb 8 18:13:11 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Thu, 08 Feb 2007 13:13:11 -0500 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: References: Message-ID: <1170958391.4618.13.camel@localhost> On Thu, 2007-02-08 at 10:02 -0800, Sridharan Ramaswamy (srramasw) wrote: > Interesting. While testing GFS with low jounrnal size and ResourceGroup > size, I hit the same issue, Thanks for all the good ifo. Will look into it when I'm back to office sometime tomorrow. -- Wendy > > > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: fatal: assertion "x > <= length" failed > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: function = > blkalloc_internal > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: file = > /download/gfs/cluster.cvs-rhel4/gfs-kernel/src/gfs/rgrp.c, line = 1458 > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: time = 1170896502 > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: about to withdraw > from the cluster > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: waiting for > outstanding I/O > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: telling LM to > withdraw > > > This happened on a 3 node GFS over 512M device. > > $ gfs_mkfs -t cisco:gfs2 -p lock_dlm -j 3 -J 8 -r 16 -X /dev/hda12 > > I was using bonnie++ to create about 10K files of 1K each from each of 3 > nodes simulataneous. > > Look at the code in rgrp.c it seems related to failure to find a > particular resource group block. Could this be due to a very low RG size > I'm using (16M) ?? > > Thanks, > Sridharan > > > -----Original Message----- > > From: linux-cluster-bounces at redhat.com > > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of > > rh-cluster at menole.net > > Sent: Thursday, February 08, 2007 3:35 AM > > To: linux-cluster at redhat.com > > Subject: [Linux-cluster] GFS1: node get withdrawn intermittent > > > > Hi, > > > > since some days I do get a withdraw on 1 node of my 6 nodes > > gfs1 cluster. > > Yesterday I did reboot all nodes. Now the problem has moved to another > > node. 
> > > > kernel messages are the same anytime: > > > > GFS: fsid=epsilon:amal.1: fatal: assertion "x <= length" failed > > GFS: fsid=epsilon:amal.1: function = blkalloc_internal > > GFS: fsid=epsilon:amal.1: file = > > /build/buildd/linux-modules-extra-2.6-2.6.17/debian/build/buil > d_amd64_none_amd64_redhat-cluster/gfs/gfs/rgrp.c, > > line = 1458 > > GFS: fsid=epsilon:amal.1: time = 1170922910 > > GFS: fsid=epsilon:amal.1: about to withdraw from the cluster > > GFS: fsid=epsilon:amal.1: waiting for outstanding I/O > > GFS: fsid=epsilon:amal.1: telling LM to withdraw > > lock_dlm: withdraw abandoned memory > > GFS: fsid=epsilon:amal.1: withdrawn > > > > `gfs_tool df` says: > > /home: > > SB lock proto = "lock_dlm"rently mounted GFS filesystems. > > Each line > > repre- > > SB lock table = "epsilon:affaire"The columns represent (in > > order): 1) > > A num- > > SB ondisk format = 1309s a cookie that represents the mounted > > filesystem. 2) > > SB multihost format = 1401e device that holds the > > filesystem (well, the > > name > > Block size = 4096he Linux kernel knows it). 3) The lock table field > > that the > > Journals = 12ilesystem was mounted with. > > Resource Groups = 1166 > > Mounted lock proto = "lock_dlm"rsize] > > Mounted lock table = "epsilon:amal"t the locks this machine holds > > for a > > Mounted host data = ""esystem. Buffersize is the size of the > > buffer (in > > Journal number = 0 that gfs_tool allocates to store the > > lock data > > during > > Lock module flags = ng. It defaults to 4194304 bytes. > > Local flocks = FALSE > > Local caching = FALSE > > Oopses OK = FALSE loads arguments into the module what will > > override the > > mount options passed with the -o field on the > > next mount. > > See > > Type Total Used Free use% > > > > -------------------------------------------------------------- > > ---------- > > inodes 731726 731726 0 100% > > metadata 329491 4392 325099 1%cks. > > data 75336111 4646188 70689923 6% > > > > > > System: > > 6 Dual AMD Opteron > > Kernel 2.6.17-2-amd64 > > Userland 32 Bit > > Storage device via qlogic fibre channel qla2xxx, without > > serious problems > > No LVM > > > > > > Kind Regards, > > > > menole > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From teigland at redhat.com Thu Feb 8 18:16:12 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 8 Feb 2007 12:16:12 -0600 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: References: <20070208113452.GA8178@shlemil.dyndns.org> Message-ID: <20070208181612.GA4927@redhat.com> On Thu, Feb 08, 2007 at 10:02:50AM -0800, Sridharan Ramaswamy (srramasw) wrote: > Interesting. 
While testing GFS with low jounrnal size and ResourceGroup > size, I hit the same issue, > > > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: fatal: assertion "x > <= length" failed > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: function = > blkalloc_internal > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: file = > /download/gfs/cluster.cvs-rhel4/gfs-kernel/src/gfs/rgrp.c, line = 1458 > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: time = 1170896502 > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: about to withdraw > from the cluster > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: waiting for > outstanding I/O > Feb 7 17:01:42 cfs1 kernel: GFS: fsid=cisco:gfs2.2: telling LM to > withdraw > > > This happened on a 3 node GFS over 512M device. > > $ gfs_mkfs -t cisco:gfs2 -p lock_dlm -j 3 -J 8 -r 16 -X /dev/hda12 > > I was using bonnie++ to create about 10K files of 1K each from each of 3 > nodes simulataneous. > > Look at the code in rgrp.c it seems related to failure to find a > particular resource group block. Could this be due to a very low RG size > I'm using (16M) ?? This is bz 215793 which has been around for quite a while and has been very difficult for us to reproduce. Perhaps using a smaller rg size is a way to reproduce the bug more easily. Dave From orkcu at yahoo.com Thu Feb 8 18:43:12 2007 From: orkcu at yahoo.com (Roger Peña Escobio) Date: Thu, 8 Feb 2007 10:43:12 -0800 (PST) Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <20070208113452.GA8178@shlemil.dyndns.org> Message-ID: <733345.92962.qm@web50613.mail.yahoo.com> --- rh-cluster at menole.net wrote: > Hi, [....] from your original messages: > Oopses OK = FALSE loads arguments into the > module what will > override the > mount options passed with the -o field > on the next mount. > See > Type Total Used Free use% > inodes 731726 731726 0 100% so ... "df -i" say you are ok about inodes but this last line say you are out of free inodes... cu roger __________________________________________ RedHat Certified Engineer ( RHCE ) Cisco Certified Network Associate ( CCNA ) ____________________________________________________________________________________ Don't pick lemons. See all the new 2007 cars at Yahoo! Autos. http://autos.yahoo.com/new_cars.html From orkcu at yahoo.com Thu Feb 8 18:54:59 2007 From: orkcu at yahoo.com (Roger Peña Escobio) Date: Thu, 8 Feb 2007 10:54:59 -0800 (PST) Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <20070208181612.GA4927@redhat.com> Message-ID: <907283.47701.qm@web50607.mail.yahoo.com> --- David Teigland wrote: > On Thu, Feb 08, 2007 at 10:02:50AM -0800, Sridharan > Ramaswamy (srramasw) wrote: > > Interesting. 
While testing GFS with low jounrnal > size and ResourceGroup > > size, I hit the same issue, > > > > > > Feb 7 17:01:42 cfs1 kernel: GFS: > fsid=cisco:gfs2.2: fatal: assertion "x > > <= length" failed > > Feb 7 17:01:42 cfs1 kernel: GFS: > fsid=cisco:gfs2.2: function = > > blkalloc_internal > > Feb 7 17:01:42 cfs1 kernel: GFS: > fsid=cisco:gfs2.2: file = > > > /download/gfs/cluster.cvs-rhel4/gfs-kernel/src/gfs/rgrp.c, > line = 1458 > > Feb 7 17:01:42 cfs1 kernel: GFS: > fsid=cisco:gfs2.2: time = 1170896502 > > Feb 7 17:01:42 cfs1 kernel: GFS: > fsid=cisco:gfs2.2: about to withdraw > > from the cluster > > Feb 7 17:01:42 cfs1 kernel: GFS: > fsid=cisco:gfs2.2: waiting for > > outstanding I/O > > Feb 7 17:01:42 cfs1 kernel: GFS: > fsid=cisco:gfs2.2: telling LM to > > withdraw > > > > > > This happened on a 3 node GFS over 512M device. > > > > $ gfs_mkfs -t cisco:gfs2 -p lock_dlm -j 3 -J 8 -r > 16 -X /dev/hda12 > > > > I was using bonnie++ to create about 10K files of > 1K each from each of 3 > > nodes simulataneous. > > > > Look at the code in rgrp.c it seems related to > failure to find a > > particular resource group block. Could this be due > to a very low RG size > > I'm using (16M) ?? > > This is bz 215793 which has been around for quite a > while and has been > very difficult for us to reproduce. Perhaps using a > smaller rg size is a > way to reproduce the bug more easily. but that bug is private? I am getting: " You are not authorized to access bug #215793" https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=215793 this is my second bug hit that is marked privated, sadly related to cluster stuff this is just a new trend in the working flow in redhat or it is just because the reporter marked as privated and redhat engineer respect the client's privacy ? :-( I really do not like security by obscurity :-( cu roger __________________________________________ RedHat Certified Engineer ( RHCE ) Cisco Certified Network Associate ( CCNA ) ____________________________________________________________________________________ Expecting? Get great news right away with email Auto-Check. Try the Yahoo! Mail Beta. http://advision.webevents.yahoo.com/mailbeta/newmail_tools.html From rh-cluster at menole.net Thu Feb 8 19:19:51 2007 From: rh-cluster at menole.net (rh-cluster at menole.net) Date: Thu, 8 Feb 2007 20:19:51 +0100 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <733345.92962.qm@web50613.mail.yahoo.com> References: <20070208113452.GA8178@shlemil.dyndns.org> <733345.92962.qm@web50613.mail.yahoo.com> Message-ID: <20070208191950.GA20054@shlemil.dyndns.org> On Thu, Feb 08, 2007 at 10:43:12AM -0800, Roger Pe?a Escobio wrote: > > --- rh-cluster at menole.net wrote: > > > Hi, > [....] > from your original messages: > > > Oopses OK = FALSE loads arguments into the > > module what will > > override the > > mount options passed with the -o field > > on the next mount. > > See > > Type Total Used Free use% > > inodes 731726 731726 0 100% > > > so ... > "df -i" say you are ok about inodes but this last > line say you are out of free inodes... you are absolutely right. strange. df vs. gfs_tool. Who is right? Where did 98% of my inodes go? Will fsck help? > > > cu > roger > > > > __________________________________________ > RedHat Certified Engineer ( RHCE ) > Cisco Certified Network Associate ( CCNA ) > > > > ____________________________________________________________________________________ > Don't pick lemons. > See all the new 2007 cars at Yahoo! Autos. 
> http://autos.yahoo.com/new_cars.html > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- Regards, menole From srramasw at cisco.com Thu Feb 8 19:29:46 2007 From: srramasw at cisco.com (Sridharan Ramaswamy (srramasw)) Date: Thu, 8 Feb 2007 11:29:46 -0800 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <20070208191950.GA20054@shlemil.dyndns.org> Message-ID: I ran a gfs_fsck after the problem occurred. It seems to have "fixed" inodes, bitmaps and rg blocks. I umounted / re-mounted to bring the filesystem back online. I haven't been able to check the integrity of the existings files in the file or the ones being created. $ gfs_fsck /dev/hda12 Initializing fsck Clearing journals (this may take a while).. Journals cleared. Starting pass1 Found unused inode marked in-use Clear unused inode at block 32972? (y/n) y Pass1 complete Starting pass1b ... Starting pass5 ... Converting 106 unused metadata blocks to free data blocks... ondisk and fsck bitmaps differ at block 32972 Fix bitmap for block 32972? (y/n) y Succeeded. RG #9 free count inconsistent: is 177 should be 283 RG #9 used inode count inconsistent: is 3563 should be 3562 RG #9 free meta count inconsistent: is 201 should be 96 Update resource group counts? (y/n) y Resource group counts updated Converting 68 unused metadata blocks to free data blocks... ... Converting 72 unused metadata blocks to free data blocks... Pass5 complete Writing changes to disk $ Thanks, Sridharan > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of > rh-cluster at menole.net > Sent: Thursday, February 08, 2007 11:20 AM > To: linux clustering > Subject: Re: [Linux-cluster] GFS1: node get withdrawn intermittent > > On Thu, Feb 08, 2007 at 10:43:12AM -0800, Roger Pe a Escobio wrote: > > > > --- rh-cluster at menole.net wrote: > > > > > Hi, > > [....] > > from your original messages: > > > > > Oopses OK = FALSE loads arguments into the > > > module what will > > > override the > > > mount options passed with the -o field > > > on the next mount. > > > See > > > Type Total Used Free use% > > > inodes 731726 731726 0 100% > > > > > > so ... > > "df -i" say you are ok about inodes but this last > > line say you are out of free inodes... > > you are absolutely right. strange. df vs. gfs_tool. Who is right? > Where did 98% of my inodes go? > > Will fsck help? > > > > > > > cu > > roger > > > > > > > > __________________________________________ > > RedHat Certified Engineer ( RHCE ) > > Cisco Certified Network Associate ( CCNA ) > > > > > > > > > ______________________________________________________________ > ______________________ > > Don't pick lemons. > > See all the new 2007 cars at Yahoo! Autos. 
> > http://autos.yahoo.com/new_cars.html > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > -- > > Regards, > > menole > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From kanderso at redhat.com Thu Feb 8 19:37:32 2007 From: kanderso at redhat.com (Kevin Anderson) Date: Thu, 08 Feb 2007 13:37:32 -0600 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <733345.92962.qm@web50613.mail.yahoo.com> References: <733345.92962.qm@web50613.mail.yahoo.com> Message-ID: <1170963452.2905.5.camel@localhost.localdomain> On Thu, 2007-02-08 at 10:43 -0800, Roger PeXa Escobio wrote: > --- rh-cluster at menole.net wrote: > > > Hi, > [....] > from your original messages: > > > Oopses OK = FALSE loads arguments into the > > module what will > > override the > > mount options passed with the -o field > > on the next mount. > > See > > Type Total Used Free use% > > inodes 731726 731726 0 100% > > > so ... > "df -i" say you are ok about inodes but this last > line say you are out of free inodes... > GFS does dynamic inode allocation, so as long as there are free blocks to allocate in the filesystem, you are never out of inodes. df -i doesn't provide anything useful with respect to gfs. Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From teigland at redhat.com Thu Feb 8 19:40:53 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 8 Feb 2007 13:40:53 -0600 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <20070208191950.GA20054@shlemil.dyndns.org> References: <20070208113452.GA8178@shlemil.dyndns.org> <733345.92962.qm@web50613.mail.yahoo.com> <20070208191950.GA20054@shlemil.dyndns.org> Message-ID: <20070208194053.GB4927@redhat.com> On Thu, Feb 08, 2007 at 08:19:51PM +0100, rh-cluster at menole.net wrote: > On Thu, Feb 08, 2007 at 10:43:12AM -0800, Roger Pe???a Escobio wrote: > > > > --- rh-cluster at menole.net wrote: > > > > > Hi, > > [....] > > from your original messages: > > > > > Oopses OK = FALSE loads arguments into the > > > module what will > > > override the > > > mount options passed with the -o field > > > on the next mount. > > > See > > > Type Total Used Free use% > > > inodes 731726 731726 0 100% > > > > > > so ... > > "df -i" say you are ok about inodes but this last > > line say you are out of free inodes... > > you are absolutely right. strange. df vs. gfs_tool. Who is right? > Where did 98% of my inodes go? gfs has dynamic inodes, it creates them as it needs them, you'll never run out. Dave From teigland at redhat.com Thu Feb 8 20:01:17 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 8 Feb 2007 14:01:17 -0600 Subject: [Linux-cluster] GFS1: node get withdrawn intermittent In-Reply-To: <907283.47701.qm@web50607.mail.yahoo.com> References: <20070208181612.GA4927@redhat.com> <907283.47701.qm@web50607.mail.yahoo.com> Message-ID: <20070208200117.GC4927@redhat.com> On Thu, Feb 08, 2007 at 10:54:59AM -0800, Roger PeXa Escobio wrote: > but that bug is private? 
> I am getting: > " You are not authorized to access bug #215793" > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=215793 > > this is my second bug hit that is marked privated, > sadly related to cluster stuff > this is just a new trend in the working flow in redhat > or it is just because the reporter marked as privated > and redhat engineer respect the client's privacy ? Sorry about that, private bugs are stupid, I don't understand it myself. There's no reason the bug couldn't be visible with any private customer details hidden. There's probably some mysterious buraeucratic way of dealing with it but I can't be bothered. I've opened a new bz instead: 227892: assertion 'x <= length' failed https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227892 AFAICT, there's nothing interesting or helpful in the private bug, it just records the difficulty various people have had trying to reproduce it. Thanks very much for the reports of this, it may be a break in finally solving this. Dave From treddy at rallydev.com Thu Feb 8 20:00:59 2007 From: treddy at rallydev.com (Tarun Reddy) Date: Thu, 8 Feb 2007 13:00:59 -0700 Subject: [Linux-cluster] Optimal number of nodes? Message-ID: I'm currently testing a RHCS 4 cluster with 4 nodes. Is there any benefit from actually reducing this number down to 3? Other clustering systems I've used seem to prefer odd number of nodes and was wondering if RHCS has that same preference.... Thanks, Tarun From teigland at redhat.com Thu Feb 8 20:13:20 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 8 Feb 2007 14:13:20 -0600 Subject: [Linux-cluster] Optimal number of nodes? In-Reply-To: References: Message-ID: <20070208201320.GD4927@redhat.com> On Thu, Feb 08, 2007 at 01:00:59PM -0700, Tarun Reddy wrote: > I'm currently testing a RHCS 4 cluster with 4 nodes. Is there any > benefit from actually reducing this number down to 3? Other > clustering systems I've used seem to prefer odd number of nodes and > was wondering if RHCS has that same preference.... The odd number of nodes (votes) relates to quorum which requires over half of the nodes (votes) to be present for any to operate. So, if you define three nodes (each with 1 vote) in cluster.conf, you need two in the cluster to run. If you define four nodes in cluster.conf, you need three in the cluster to run. Both configurations stop operating if two or more nodes are down. Dave From treddy at rallydev.com Thu Feb 8 20:22:31 2007 From: treddy at rallydev.com (Tarun Reddy) Date: Thu, 8 Feb 2007 13:22:31 -0700 Subject: [Linux-cluster] Optimal number of nodes? In-Reply-To: <20070208201320.GD4927@redhat.com> References: <20070208201320.GD4927@redhat.com> Message-ID: On Feb 8, 2007, at 1:13 PM, David Teigland wrote: > On Thu, Feb 08, 2007 at 01:00:59PM -0700, Tarun Reddy wrote: >> I'm currently testing a RHCS 4 cluster with 4 nodes. Is there any >> benefit from actually reducing this number down to 3? Other >> clustering systems I've used seem to prefer odd number of nodes and >> was wondering if RHCS has that same preference.... > > The odd number of nodes (votes) relates to quorum which requires > over half > of the nodes (votes) to be present for any to operate. So, if you > define > three nodes (each with 1 vote) in cluster.conf, you need two in the > cluster to run. If you define four nodes in cluster.conf, you need > three > in the cluster to run. Both configurations stop operating if two > or more > nodes are down. > > Dave > Thank you Dave, that explains a lot and yet so simple. 
:-) Just another question. In my cage I actually have a fifth machine, also running RedHat 4 but purposed for usage outside of the cluster. Actually adding it to the cluster, but limiting services only to the four "real" nodes would actually significant improve the tolerance of my environment by allowing any two nodes to fail or be powered off, correct? Tarun From teigland at redhat.com Thu Feb 8 20:31:32 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 8 Feb 2007 14:31:32 -0600 Subject: [Linux-cluster] Optimal number of nodes? In-Reply-To: References: <20070208201320.GD4927@redhat.com> Message-ID: <20070208203132.GE4927@redhat.com> On Thu, Feb 08, 2007 at 01:22:31PM -0700, Tarun Reddy wrote: > Just another question. In my cage I actually have a fifth machine, > also running RedHat 4 but purposed for usage outside of the cluster. > Actually adding it to the cluster, but limiting services only to the > four "real" nodes would actually significant improve the tolerance of > my environment by allowing any two nodes to fail or be powered off, > correct? Correct, good plan. From Danny.Wall at health-first.org Thu Feb 8 20:56:59 2007 From: Danny.Wall at health-first.org (Danny Wall) Date: Thu, 08 Feb 2007 15:56:59 -0500 Subject: [Linux-cluster] Question about memory usage Message-ID: <45CB484A.449E.00C8.0@health-first.org> Are there any resources that can help determine how much memory I need to run a SAMBA cluster with several terabytes GFS storage? I have three RHCS clusters on Red Hat 4 U4, with two nodes in each cluster. Both servers in a cluster have the same SAN storage mounted, but only one node accesses the storage at a time (mostly). The storage is shared out over several SAMBA shares, with several users accessing the data at a time, via an automated proxy user, so technically, only a few user connections are made directly to the clusters, but a few hundred GB of data is written and accessed daily. Almost all data is write once, read many, and the files range from a few KB to several hundred MB. I recently noticed that when running 'free -m', the servers all run very low on memory. If I remove one node from the cluster by stopping rgmanager, gfs, clvmd, fenced, cman, and ccsd, the memory get released until I join it to the cluster again. I could stop them one at a time to make sure it is GFS, but I assume much of the RAM is getting used for caching and for GFS needs. I do not mind upgrading the RAM, but I would like to know if there is a good way to size the servers properly for this type of usage. The servers are Dual Proc 3.6Ghz, with 2GB RAM each. They have U320 15K SCSI drives, and Emulex Fibre Channel to the SAN. Everything else appears to run fine, but one server ran out of memory, and I see others that range between 16MB and 250MB free RAM. Thanks, Danny From lhh at redhat.com Fri Feb 9 14:41:38 2007 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 09 Feb 2007 09:41:38 -0500 Subject: [Linux-cluster] Optimal number of nodes? In-Reply-To: References: <20070208201320.GD4927@redhat.com> Message-ID: <1171032098.4798.13.camel@asuka.boston.devel.redhat.com> On Thu, 2007-02-08 at 13:22 -0700, Tarun Reddy wrote: > Just another question. In my cage I actually have a fifth machine, > also running RedHat 4 but purposed for usage outside of the cluster. > Actually adding it to the cluster, but limiting services only to the > four "real" nodes would actually significant improve the tolerance of > my environment by allowing any two nodes to fail or be powered off, > correct? Yes. 
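To make the arithmetic concrete, a small sketch (quorum here is "more than half of the expected votes", which with one vote per node works out to votes/2 + 1 in integer math):

$ echo $((3 / 2 + 1))    # 3-node cluster: needs 2 votes, survives 1 node down
2
$ echo $((4 / 2 + 1))    # 4-node cluster: needs 3 votes, still survives only 1 node down
3
$ echo $((5 / 2 + 1))    # 5 voting members: needs 3 votes, survives any 2 down
3

The running values can be checked with cman_tool status (look for the expected votes and quorum lines). To keep services off the fifth, vote-only machine, one option is a restricted failover domain in cluster.conf; the snippet below is only a sketch with placeholder node and domain names, and the services would then reference it through their domain attribute:

<failoverdomains>
  <failoverdomain name="real_nodes" restricted="1">
    <failoverdomainnode name="node1" priority="1"/>
    <failoverdomainnode name="node2" priority="1"/>
    <failoverdomainnode name="node3" priority="1"/>
    <failoverdomainnode name="node4" priority="1"/>
  </failoverdomain>
</failoverdomains>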
-- Lon From lhh at redhat.com Fri Feb 9 21:39:03 2007 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 09 Feb 2007 16:39:03 -0500 Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts In-Reply-To: <45CCE528.8070904@redhat.com> References: <45CCE528.8070904@redhat.com> Message-ID: <1171057143.4798.43.camel@asuka.boston.devel.redhat.com> On Fri, 2007-02-09 at 16:18 -0500, Hugh Brock wrote: > Todos: > Investigate gparted, one of the partition management tools we already > have (apis? remote accessibility?) (I believe Jim Meyering volunteered * Investigate Conga's cluster and non-cluster remotely-accessible LVM management, which sounds like it would fit the bill? APIs are all XMLRPC, IIRC, so they're extensible and flexible. http://sourceware.org/cluster/conga/ -- Lon From 14117614 at sun.ac.za Sat Feb 10 07:37:38 2007 From: 14117614 at sun.ac.za (Pool, LC, Mr <14117614@sun.ac.za>) Date: Sat, 10 Feb 2007 09:37:38 +0200 Subject: [Linux-cluster] Diagnostics? References: <20070208201320.GD4927@redhat.com> <1171032098.4798.13.camel@asuka.boston.devel.redhat.com> Message-ID: <2C04D2F14FD8254386851063BC2B67065E09FC@STBEVS01.stb.sun.ac.za> Hi.... I have 10 nodes that are not connected to any monitor or cd/dvd rom drive or a floppy drive, and I would like to run some diagnostics, like memtest, etc. Is there anyway that I can do this? I'm looking at using tftp/bootp, but this would mean writing my own image so as to get the output from the tests to determine if a machine is faulty or not. This would require alot of my time. Is there any free software that is already doing this.? I'm looking on the web, but most of them are for single pc's.... Any ideas? "We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances." Sir Isaac Newton -----Original Message----- From: linux-cluster-bounces at redhat.com on behalf of Lon Hohberger Sent: Fri 2/9/2007 04:41 PM To: linux clustering Subject: Re: [Linux-cluster] Optimal number of nodes? On Thu, 2007-02-08 at 13:22 -0700, Tarun Reddy wrote: > Just another question. In my cage I actually have a fifth machine, > also running RedHat 4 but purposed for usage outside of the cluster. > Actually adding it to the cluster, but limiting services only to the > four "real" nodes would actually significant improve the tolerance of > my environment by allowing any two nodes to fail or be powered off, > correct? Yes. -- Lon -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3497 bytes Desc: not available URL: From isplist at logicore.net Mon Feb 12 14:44:19 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Mon, 12 Feb 2007 08:44:19 -0600 Subject: [Linux-cluster] Can't see all volumes Message-ID: <200721284419.218411@leena> This might or might not be the right list and if it is not, does anyone know one that would cover things like fibre channel RAID devices? I have an 800GB RAID drive which I have split into 32 smaller volumes. This is a fibre channel system, Xyratex. The problem is that I cannot see any more than 2 volumes per controller when I check from any of the nodes. Basically, I would like to set up a small volume for each node along with their various GFS volumes. 
Mike From mwill at penguincomputing.com Mon Feb 12 15:58:29 2007 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 12 Feb 2007 07:58:29 -0800 Subject: [Linux-cluster] Can't see all volumes Message-ID: <433093DF7AD7444DA65EFAFE3987879C33D8A3@orca.penguincomputing.com> Xyratex has a pretty good second level support team - work with the vendor you got it from and they should be able to help you out. -----Original Message----- From: isplist at logicore.net [mailto:isplist at logicore.net] Sent: Mon Feb 12 06:44:54 2007 To: linux-cluster Subject: [Linux-cluster] Can't see all volumes This might or might not be the right list and if it is not, does anyone know one that would cover things like fibre channel RAID devices? I have an 800GB RAID drive which I have split into 32 smaller volumes. This is a fibre channel system, Xyratex. The problem is that I cannot see any more than 2 volumes per controller when I check from any of the nodes. Basically, I would like to set up a small volume for each node along with their various GFS volumes. Mike -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From jparsons at redhat.com Mon Feb 12 17:29:34 2007 From: jparsons at redhat.com (James Parsons) Date: Mon, 12 Feb 2007 12:29:34 -0500 Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts In-Reply-To: <1171057143.4798.43.camel@asuka.boston.devel.redhat.com> References: <45CCE528.8070904@redhat.com> <1171057143.4798.43.camel@asuka.boston.devel.redhat.com> Message-ID: <45D0A3FE.3000500@redhat.com> Lon Hohberger wrote: >On Fri, 2007-02-09 at 16:18 -0500, Hugh Brock wrote: > > > >>Todos: >> Investigate gparted, one of the partition management tools we already >>have (apis? remote accessibility?) (I believe Jim Meyering volunteered >> >> > >* Investigate Conga's cluster and non-cluster remotely-accessible LVM >management, which sounds like it would fit the bill? > >APIs are all XMLRPC, IIRC, so they're extensible and flexible. > >http://sourceware.org/cluster/conga/ > >-- Lon > > > It does general partition and file system creation tasks as well. The next release will even help you set up iscsi initiator and target. -j From cesar at ati.es Mon Feb 12 18:28:30 2007 From: cesar at ati.es (Cesar O. Pablo) Date: Mon, 12 Feb 2007 19:28:30 +0100 Subject: [Linux-cluster] Re: Easy and simple question if you already knows the answer Message-ID: <200702121828.l1CISwVs030684@mx2.redhat.com> Dear Shirai, Thank you very much for the kindly information provided. Best regards Cesar Cesar From lhh at redhat.com Mon Feb 12 21:19:58 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 12 Feb 2007 16:19:58 -0500 Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts In-Reply-To: <20070212175357.GF21671@redhat.com> References: <45CCE528.8070904@redhat.com> <1171057143.4798.43.camel@asuka.boston.devel.redhat.com> <20070212175357.GF21671@redhat.com> Message-ID: <1171315198.4693.27.camel@asuka.boston.devel.redhat.com> On Mon, 2007-02-12 at 17:53 +0000, Daniel P. Berrange wrote: > Hence our initial goal is to find a suitable C library we can call into > to perform our simple set of storage management tasks. Now in keeping > with the libvirt model of pluggable hypervisor drivers, I'd expect the > underlying libvirt impl of any storage APIs to also be pluggable. 
So while > the initial impl might be based on GParteD, we would have the option of > also providing a Conga based backend at a later date. Thanks for the clarification! I'm comfortable with that explanation. My initial response was without a full understanding of the context & requirements. -- Lon From lhh at redhat.com Mon Feb 12 21:21:23 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 12 Feb 2007 16:21:23 -0500 Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts In-Reply-To: <1171308215.12535.23.camel@localhost.localdomain> References: <45CCE528.8070904@redhat.com> <1171057143.4798.43.camel@asuka.boston.devel.redhat.com> <45D0A3FE.3000500@redhat.com> <1171308215.12535.23.camel@localhost.localdomain> Message-ID: <1171315283.4693.30.camel@asuka.boston.devel.redhat.com> On Mon, 2007-02-12 at 11:23 -0800, David Lutterkort wrote: > On Mon, 2007-02-12 at 12:29 -0500, James Parsons wrote: > > Lon Hohberger wrote: > > >http://sourceware.org/cluster/conga/ > > It seems that that should be http://www.sourceware.org/cluster/conga/ - > I get a 503 without the www. BTW, where is the conga CVS ? It doesn't > seem to be linked from that page. Hrm, I thought it was on sources.redhat.com... Jim? -- Lon From lhh at redhat.com Mon Feb 12 21:28:58 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 12 Feb 2007 16:28:58 -0500 Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts In-Reply-To: <1171315283.4693.30.camel@asuka.boston.devel.redhat.com> References: <45CCE528.8070904@redhat.com> <1171057143.4798.43.camel@asuka.boston.devel.redhat.com> <45D0A3FE.3000500@redhat.com> <1171308215.12535.23.camel@localhost.localdomain> <1171315283.4693.30.camel@asuka.boston.devel.redhat.com> Message-ID: <1171315739.4693.35.camel@asuka.boston.devel.redhat.com> On Mon, 2007-02-12 at 16:21 -0500, Lon Hohberger wrote: > On Mon, 2007-02-12 at 11:23 -0800, David Lutterkort wrote: > > On Mon, 2007-02-12 at 12:29 -0500, James Parsons wrote: > > > Lon Hohberger wrote: > > > >http://sourceware.org/cluster/conga/ > > > > It seems that that should be http://www.sourceware.org/cluster/conga/ - > > I get a 503 without the www. BTW, where is the conga CVS ? It doesn't > > seem to be linked from that page. > > Hrm, I thought it was on sources.redhat.com... Jim? > http://sources.redhat.com/cgi-bin/cvsweb.cgi/conga/?cvsroot=cluster -- Lon From srramasw at cisco.com Tue Feb 13 01:08:52 2007 From: srramasw at cisco.com (Sridharan Ramaswamy (srramasw)) Date: Mon, 12 Feb 2007 17:08:52 -0800 Subject: [Linux-cluster] GFS Journaling: How to interpret gfs_tool jindex output? Message-ID: I'm trying to measure the usage of journal blocks created within a GFS disk. I'm running bonnie++ small file test hoping to overwhelm the journal with lot of entries to measure its peak usage. I saw gfs_tool prints jindex data structure. Can anyone help to interpret its output? Also I like to hear from anyone who went through similar exercise of understanding GFS journal usage in an active system. 
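For watching the log at run time rather than dumping on-disk structures, one low-effort option, sketched here on the assumption that the filesystem is the /mnt/gfs2 mount used below, is to poll gfs_tool's per-mount counters while bonnie++ runs:

$ watch -n 5 "gfs_tool counters /mnt/gfs2 | grep -i log"

Field names vary a little between versions, but the log-related lines (for example "incore log buffers", "log space used" and "log wraps", where present) give a rough picture of how hard the journal is being pushed. This is not a substitute for instrumenting the code, just a cheap first look.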
thanks, Sridharan [root at cfs1 cluster]$ gfs_tool jindex /mnt/gfs2 mh_magic = 0x01161970 mh_type = 4 mh_generation = 0 mh_format = 400 mh_incarn = 0 no_formal_ino = 21 no_addr = 21 di_mode = 0600 di_uid = 0 di_gid = 0 di_nlink = 1 di_size = 240 di_blocks = 1 di_atime = 1171308008 di_mtime = 1171308008 di_ctime = 1171308008 di_major = 0 di_minor = 0 di_rgrp = 0 di_goal_rgrp = 0 di_goal_dblk = 0 di_goal_mblk = 0 di_flags = 0x00000001 di_payload_format = 1000 di_type = 1 di_height = 0 di_incarn = 0 di_pad = 0 di_depth = 0 di_entries = 0 no_formal_ino = 0 no_addr = 0 di_eattr = 0 di_reserved = ... Journal 0: ji_addr = 61728 ji_nsegment = 63 ji_pad = 0 ji_reserved = 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -------------- next part -------------- An HTML attachment was scrubbed... URL: From hlawatschek at atix.de Tue Feb 13 12:37:18 2007 From: hlawatschek at atix.de (Mark Hlawatschek) Date: Tue, 13 Feb 2007 13:37:18 +0100 Subject: [Linux-cluster] gfs deadlock situation Message-ID: <200702131337.18680.hlawatschek@atix.de> Hi, we have the following deadlock situation: 2 node cluster consisting of node1 and node2. /usr/local is placed on a GFS filesystem mounted on both nodes. Lockmanager is dlm. We are using RHEL4u4 a strace to ls -l /usr/local/swadmin/mnx/xml ends up in lstat("/usr/local/swadmin/mnx/xml", This happens on both cluster nodes. All processes trying to access the directory /usr/local/swadmin/mnx/xml are in "Waiting for IO (D)" state. I.e. system load is at about 400 ;-) Any ideas ? a lockdump analysis with the decipher_lockstate_dump and parse_lockdump shows the following output (The whole file is too large for the mailing-list): Entries: 101939 Glocks: 60112 PIDs: 751 4 chain: lockdump.node1.dec Glock (inode[2], 1114343) gl_flags = lock[1] gl_count = 5 gl_state = shared[3] req_gh = yes req_bh = yes lvb_count = 0 object = yes new_le = no incore_le = no reclaim = no aspace = 1 ail_bufs = no Request owner = 5856 gh_state = exclusive[1] gh_flags = try[0] local_excl[5] async[6] error = 0 gh_iflags = promote[1] Waiter3 owner = 5856 gh_state = exclusive[1] gh_flags = try[0] local_excl[5] async[6] error = 0 gh_iflags = promote[1] Inode: busy lockdump.node2.dec Glock (inode[2], 1114343) gl_flags = gl_count = 2 gl_state = unlocked[0] req_gh = no req_bh = no lvb_count = 0 object = yes new_le = no incore_le = no reclaim = no aspace = 0 ail_bufs = no Inode: num = 1114343/1114343 type = regular[1] i_count = 1 i_flags = vnode = yes lockdump.node1.dec Glock (inode[2], 627732) gl_flags = dirty[5] gl_count = 379 gl_state = exclusive[1] req_gh = no req_bh = no lvb_count = 0 object = yes new_le = no incore_le = no reclaim = no aspace = 58 ail_bufs = no Holder owner = 5856 gh_state = exclusive[1] gh_flags = try[0] local_excl[5] async[6] error = 0 gh_iflags = promote[1] holder[6] first[7] Waiter2 owner = none[-1] gh_state = shared[3] gh_flags = try[0] error = 0 gh_iflags = demote[2] alloced[4] dealloc[5] Waiter3 owner = 32753 gh_state = shared[3] gh_flags = any[3] error = 0 gh_iflags = promote[1] [...loads of Waiter3 entries...] 
Waiter3 owner = 4566 gh_state = shared[3] gh_flags = any[3] error = 0 gh_iflags = promote[1] Inode: busy lockdump.node2.dec Glock (inode[2], 627732) gl_flags = lock[1] gl_count = 375 gl_state = unlocked[0] req_gh = yes req_bh = yes lvb_count = 0 object = yes new_le = no incore_le = no reclaim = no aspace = 0 ail_bufs = no Request owner = 20187 gh_state = shared[3] gh_flags = any[3] error = 0 gh_iflags = promote[1] Waiter3 owner = 20187 gh_state = shared[3] gh_flags = any[3] error = 0 gh_iflags = promote[1] [...loads of Waiter3 entries...] Waiter3 owner = 10460 gh_state = shared[3] gh_flags = any[3] error = 0 gh_iflags = promote[1] Inode: busy 2 requests -- Gruss / Regards, Mark Hlawatschek http://www.atix.de/ http://www.open-sharedroot.org/ ** Visit us at CeBIT 2007 in Hannover/Germany ** ** in Hall 5, Booth G48/2 (15.-21. of March) ** ** ATIX - Ges. fuer Informationstechnologie und Consulting mbH Einsteinstr. 10 - 85716 Unterschleissheim - Germany From wcheng at redhat.com Tue Feb 13 14:48:03 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Tue, 13 Feb 2007 09:48:03 -0500 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <200702131337.18680.hlawatschek@atix.de> References: <200702131337.18680.hlawatschek@atix.de> Message-ID: <45D1CFA3.7050608@redhat.com> Mark Hlawatschek wrote: > Hi, > > we have the following deadlock situation: > > 2 node cluster consisting of node1 and node2. > /usr/local is placed on a GFS filesystem mounted on both nodes. > Lockmanager is dlm. > We are using RHEL4u4 > > a strace to ls -l /usr/local/swadmin/mnx/xml ends up in > lstat("/usr/local/swadmin/mnx/xml", > > This happens on both cluster nodes. > > All processes trying to access the directory /usr/local/swadmin/mnx/xml are > in "Waiting for IO (D)" state. I.e. system load is at about 400 ;-) > > Any ideas ? > Quickly browsing this, look to me that process with pid=5856 got stuck. That process had the file or directory (ino number 627732 - probably /usr/local/swadmin/mnx/xml) exclusive lock so everyone was waiting for it. The faulty process was apparently in the middle of obtaining another exclusive lock (and almost got it). We need to know where pid=5856 was stuck at that time. If this occurs again, could you use "crash" to back trace that process and show us the output ? 
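For reference, a minimal way to collect that backtrace on a live RHEL4 node; this assumes the kernel-debuginfo package matching the running kernel is installed, which is what provides the vmlinux path below:

$ crash /usr/lib/debug/lib/modules/`uname -r`/vmlinux
crash> bt 5856        # the stuck pid identified above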
-- Wendy > a lockdump analysis with the decipher_lockstate_dump and parse_lockdump shows > the following output (The whole file is too large for the mailing-list): > > Entries: 101939 > Glocks: 60112 > PIDs: 751 > > 4 chain: > lockdump.node1.dec Glock (inode[2], 1114343) > gl_flags = lock[1] > gl_count = 5 > gl_state = shared[3] > req_gh = yes > req_bh = yes > lvb_count = 0 > object = yes > new_le = no > incore_le = no > reclaim = no > aspace = 1 > ail_bufs = no > Request > owner = 5856 > gh_state = exclusive[1] > gh_flags = try[0] local_excl[5] async[6] > error = 0 > gh_iflags = promote[1] > Waiter3 > owner = 5856 > gh_state = exclusive[1] > gh_flags = try[0] local_excl[5] async[6] > error = 0 > gh_iflags = promote[1] > Inode: busy > lockdump.node2.dec Glock (inode[2], 1114343) > gl_flags = > gl_count = 2 > gl_state = unlocked[0] > req_gh = no > req_bh = no > lvb_count = 0 > object = yes > new_le = no > incore_le = no > reclaim = no > aspace = 0 > ail_bufs = no > Inode: > num = 1114343/1114343 > type = regular[1] > i_count = 1 > i_flags = > vnode = yes > lockdump.node1.dec Glock (inode[2], 627732) > gl_flags = dirty[5] > gl_count = 379 > gl_state = exclusive[1] > req_gh = no > req_bh = no > lvb_count = 0 > object = yes > new_le = no > incore_le = no > reclaim = no > aspace = 58 > ail_bufs = no > Holder > owner = 5856 > gh_state = exclusive[1] > gh_flags = try[0] local_excl[5] async[6] > error = 0 > gh_iflags = promote[1] holder[6] first[7] > Waiter2 > owner = none[-1] > gh_state = shared[3] > gh_flags = try[0] > error = 0 > gh_iflags = demote[2] alloced[4] dealloc[5] > Waiter3 > owner = 32753 > gh_state = shared[3] > gh_flags = any[3] > error = 0 > gh_iflags = promote[1] > [...loads of Waiter3 entries...] > Waiter3 > owner = 4566 > gh_state = shared[3] > gh_flags = any[3] > error = 0 > gh_iflags = promote[1] > Inode: busy > lockdump.node2.dec Glock (inode[2], 627732) > gl_flags = lock[1] > gl_count = 375 > gl_state = unlocked[0] > req_gh = yes > req_bh = yes > lvb_count = 0 > object = yes > new_le = no > incore_le = no > reclaim = no > aspace = 0 > ail_bufs = no > Request > owner = 20187 > gh_state = shared[3] > gh_flags = any[3] > error = 0 > gh_iflags = promote[1] > Waiter3 > owner = 20187 > gh_state = shared[3] > gh_flags = any[3] > error = 0 > gh_iflags = promote[1] > [...loads of Waiter3 entries...] > Waiter3 > owner = 10460 > gh_state = shared[3] > gh_flags = any[3] > error = 0 > gh_iflags = promote[1] > Inode: busy > 2 requests > > From wcheng at redhat.com Tue Feb 13 15:00:23 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Tue, 13 Feb 2007 10:00:23 -0500 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <45D1CFA3.7050608@redhat.com> References: <200702131337.18680.hlawatschek@atix.de> <45D1CFA3.7050608@redhat.com> Message-ID: <45D1D287.6060203@redhat.com> Wendy Cheng wrote: > Mark Hlawatschek wrote: >> Hi, >> >> we have the following deadlock situation: >> >> 2 node cluster consisting of node1 and node2. /usr/local is placed on >> a GFS filesystem mounted on both nodes. Lockmanager is dlm. >> We are using RHEL4u4 >> >> a strace to ls -l /usr/local/swadmin/mnx/xml ends up in >> lstat("/usr/local/swadmin/mnx/xml", >> >> This happens on both cluster nodes. >> >> All processes trying to access the directory >> /usr/local/swadmin/mnx/xml are in "Waiting for IO (D)" state. I.e. >> system load is at about 400 ;-) >> >> Any ideas ? >> > Quickly browsing this, look to me that process with pid=5856 got > stuck. 
That process had the file or directory (ino number 627732 - > probably /usr/local/swadmin/mnx/xml) exclusive lock so everyone was > waiting for it. The faulty process was apparently in the middle of > obtaining another exclusive lock (and almost got it). We need to know > where pid=5856 was stuck at that time. If this occurs again, could you > use "crash" to back trace that process and show us the output ? Or an "echo t > /proc/sysrq-trigger" to obtain *all* threads backtrace would be better - but it has the risk of missing heartbeat that could result cluster fence action since sysrq-t could stall the system for a while. -- Wendy > >> a lockdump analysis with the decipher_lockstate_dump and >> parse_lockdump shows the following output (The whole file is too >> large for the mailing-list): >> >> Entries: 101939 >> Glocks: 60112 >> PIDs: 751 >> >> 4 chain: >> lockdump.node1.dec Glock (inode[2], 1114343) >> gl_flags = lock[1] >> gl_count = 5 >> gl_state = shared[3] >> req_gh = yes >> req_bh = yes >> lvb_count = 0 >> object = yes >> new_le = no >> incore_le = no >> reclaim = no >> aspace = 1 >> ail_bufs = no >> Request >> owner = 5856 >> gh_state = exclusive[1] >> gh_flags = try[0] local_excl[5] async[6] >> error = 0 >> gh_iflags = promote[1] >> Waiter3 >> owner = 5856 >> gh_state = exclusive[1] >> gh_flags = try[0] local_excl[5] async[6] >> error = 0 >> gh_iflags = promote[1] >> Inode: busy >> lockdump.node2.dec Glock (inode[2], 1114343) >> gl_flags = >> gl_count = 2 >> gl_state = unlocked[0] >> req_gh = no >> req_bh = no >> lvb_count = 0 >> object = yes >> new_le = no >> incore_le = no >> reclaim = no >> aspace = 0 >> ail_bufs = no >> Inode: >> num = 1114343/1114343 >> type = regular[1] >> i_count = 1 >> i_flags = >> vnode = yes >> lockdump.node1.dec Glock (inode[2], 627732) >> gl_flags = dirty[5] >> gl_count = 379 >> gl_state = exclusive[1] >> req_gh = no >> req_bh = no >> lvb_count = 0 >> object = yes >> new_le = no >> incore_le = no >> reclaim = no >> aspace = 58 >> ail_bufs = no >> Holder >> owner = 5856 >> gh_state = exclusive[1] >> gh_flags = try[0] local_excl[5] async[6] >> error = 0 >> gh_iflags = promote[1] holder[6] first[7] >> Waiter2 >> owner = none[-1] >> gh_state = shared[3] >> gh_flags = try[0] >> error = 0 >> gh_iflags = demote[2] alloced[4] dealloc[5] >> Waiter3 >> owner = 32753 >> gh_state = shared[3] >> gh_flags = any[3] >> error = 0 >> gh_iflags = promote[1] >> [...loads of Waiter3 entries...] >> Waiter3 >> owner = 4566 >> gh_state = shared[3] >> gh_flags = any[3] >> error = 0 >> gh_iflags = promote[1] >> Inode: busy >> lockdump.node2.dec Glock (inode[2], 627732) >> gl_flags = lock[1] >> gl_count = 375 >> gl_state = unlocked[0] >> req_gh = yes >> req_bh = yes >> lvb_count = 0 >> object = yes >> new_le = no >> incore_le = no >> reclaim = no >> aspace = 0 >> ail_bufs = no >> Request >> owner = 20187 >> gh_state = shared[3] >> gh_flags = any[3] >> error = 0 >> gh_iflags = promote[1] >> Waiter3 >> owner = 20187 >> gh_state = shared[3] >> gh_flags = any[3] >> error = 0 >> gh_iflags = promote[1] >> [...loads of Waiter3 entries...] >> Waiter3 >> owner = 10460 >> gh_state = shared[3] >> gh_flags = any[3] >> error = 0 >> gh_iflags = promote[1] >> Inode: busy >> 2 requests >> >> > > From rpeterso at redhat.com Tue Feb 13 15:17:00 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Tue, 13 Feb 2007 09:17:00 -0600 Subject: [Linux-cluster] GFS Journaling: How to interpret gfs_tool jindex output? 
In-Reply-To: References: Message-ID: <45D1D66C.2090207@redhat.com> Sridharan Ramaswamy (srramasw) wrote: > I'm trying to measure the usage of journal blocks created within a GFS > disk. I'm running bonnie++ small file test hoping to overwhelm the > journal with lot of entries to measure its peak usage. I saw gfs_tool > prints jindex data structure. Can anyone help to interpret its output? > > Also I like to hear from anyone who went through similar exercise of > understanding GFS journal usage in an active system. > > thanks, > Sridharan Hi Sridharan, gfs_tool jindex just dumps the journal index file of a GFS file system. The jindex file is an index of all the journals and where they're located and the number of segments. The first part of the output is just a dump of the jindex file's inode, which doesn't tell you much. This doesn't tell you much about what's happening inside the actual journals. Regards, Bob Peterson Red Hat Cluster Suite From jparsons at redhat.com Tue Feb 13 15:48:34 2007 From: jparsons at redhat.com (James Parsons) Date: Tue, 13 Feb 2007 10:48:34 -0500 Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts In-Reply-To: <1171315283.4693.30.camel@asuka.boston.devel.redhat.com> References: <45CCE528.8070904@redhat.com> <1171057143.4798.43.camel@asuka.boston.devel.redhat.com> <45D0A3FE.3000500@redhat.com> <1171308215.12535.23.camel@localhost.localdomain> <1171315283.4693.30.camel@asuka.boston.devel.redhat.com> Message-ID: <45D1DDD2.2050904@redhat.com> Lon Hohberger wrote: >On Mon, 2007-02-12 at 11:23 -0800, David Lutterkort wrote: > > >>On Mon, 2007-02-12 at 12:29 -0500, James Parsons wrote: >> >> >>>Lon Hohberger wrote: >>> >>> >>>>http://sourceware.org/cluster/conga/ >>>> >>>> >>It seems that that should be http://www.sourceware.org/cluster/conga/ - >>I get a 503 without the www. BTW, where is the conga CVS ? It doesn't >>seem to be linked from that page. >> >> > >Hrm, I thought it was on sources.redhat.com... Jim? > >-- Lon > > > It is on sources.redhat.com/cluster/conga From srramasw at cisco.com Tue Feb 13 16:43:31 2007 From: srramasw at cisco.com (Sridharan Ramaswamy (srramasw)) Date: Tue, 13 Feb 2007 08:43:31 -0800 Subject: [Linux-cluster] GFS Journaling: How to interpret gfs_tool jindexoutput? In-Reply-To: <45D1D66C.2090207@redhat.com> Message-ID: Thanks Bob. That explains why jindex output didn't change at all while metadata intensive operations are thrown at gfs. So I presume there is nothing else that will display whats happening inside those journal segments. May be I can try to look around in the code and add more debug log mesg to capture that. - Sridharan > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Robert Peterson > Sent: Tuesday, February 13, 2007 7:17 AM > To: linux clustering > Subject: Re: [Linux-cluster] GFS Journaling: How to interpret > gfs_tool jindexoutput? > > Sridharan Ramaswamy (srramasw) wrote: > > I'm trying to measure the usage of journal blocks created > within a GFS > > disk. I'm running bonnie++ small file test hoping to overwhelm the > > journal with lot of entries to measure its peak usage. I > saw gfs_tool > > prints jindex data structure. Can anyone help to interpret > its output? > > > > Also I like to hear from anyone who went through similar exercise of > > understanding GFS journal usage in an active system. 
> > > > thanks, > > Sridharan > Hi Sridharan, > > gfs_tool jindex just dumps the journal index file of a GFS > file system. > The jindex file is an index of all the journals and where > they're located > and the number of segments. The first part of the output is > just a dump > of the jindex file's inode, which doesn't tell you much. > This doesn't tell you much about what's happening inside the actual > journals. > > Regards, > > Bob Peterson > Red Hat Cluster Suite > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From hlawatschek at atix.de Tue Feb 13 16:52:46 2007 From: hlawatschek at atix.de (Mark Hlawatschek) Date: Tue, 13 Feb 2007 17:52:46 +0100 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <45D1CFA3.7050608@redhat.com> References: <200702131337.18680.hlawatschek@atix.de> <45D1CFA3.7050608@redhat.com> Message-ID: <200702131752.47150.hlawatschek@atix.de> Hi Wendy, thanks for your answer! The system is still in the deadlock state, so I hopefully can collect all information you need :-) (you'll find the crash output below) Thanks, Mark > > we have the following deadlock situation: > > > > 2 node cluster consisting of node1 and node2. > > /usr/local is placed on a GFS filesystem mounted on both nodes. > > Lockmanager is dlm. > > We are using RHEL4u4 > > > > a strace to ls -l /usr/local/swadmin/mnx/xml ends up in > > lstat("/usr/local/swadmin/mnx/xml", > > > > This happens on both cluster nodes. > > > > All processes trying to access the directory /usr/local/swadmin/mnx/xml > > are in "Waiting for IO (D)" state. I.e. system load is at about 400 ;-) > > > > Any ideas ? > > Quickly browsing this, look to me that process with pid=5856 got stuck. > That process had the file or directory (ino number 627732 - probably > /usr/local/swadmin/mnx/xml) exclusive lock so everyone was waiting for > it. The faulty process was apparently in the middle of obtaining another > exclusive lock (and almost got it). We need to know where pid=5856 was > stuck at that time. If this occurs again, could you use "crash" to back > trace that process and show us the output ? 
Here's the crash output: crash> bt 5856 PID: 5856 TASK: 10bd26427f0 CPU: 0 COMMAND: "java" #0 [10bd20cfbc8] schedule at ffffffff8030a1d1 #1 [10bd20cfca0] wait_for_completion at ffffffff8030a415 #2 [10bd20cfd20] glock_wait_internal at ffffffffa018574e #3 [10bd20cfd60] gfs_glock_nq_m at ffffffffa01860ce #4 [10bd20cfda0] gfs_unlink at ffffffffa019ce41 #5 [10bd20cfea0] vfs_unlink at ffffffff801889fa #6 [10bd20cfed0] sys_unlink at ffffffff80188b19 #7 [10bd20cff30] filp_close at ffffffff80178e48 #8 [10bd20cff50] error_exit at ffffffff80110d91 RIP: 0000002a9593f649 RSP: 0000007fbfffbca0 RFLAGS: 00010206 RAX: 0000000000000057 RBX: ffffffff8011026a RCX: 0000002a9cc9c870 RDX: 0000002ae5989000 RSI: 0000002a962fa3a8 RDI: 0000002ae5989000 RBP: 0000000000000000 R8: 0000002a9630abb0 R9: 0000000000000ffc R10: 0000002a9630abc0 R11: 0000000000000206 R12: 0000000040115700 R13: 0000002ae23294b0 R14: 0000007fbfffc300 R15: 0000002ae5989000 ORIG_RAX: 0000000000000057 CS: 0033 SS: 002b > > a lockdump analysis with the decipher_lockstate_dump and parse_lockdump > > shows the following output (The whole file is too large for the > > mailing-list): > > > > Entries: 101939 > > Glocks: 60112 > > PIDs: 751 > > > > 4 chain: > > lockdump.node1.dec Glock (inode[2], 1114343) > > gl_flags = lock[1] > > gl_count = 5 > > gl_state = shared[3] > > req_gh = yes > > req_bh = yes > > lvb_count = 0 > > object = yes > > new_le = no > > incore_le = no > > reclaim = no > > aspace = 1 > > ail_bufs = no > > Request > > owner = 5856 > > gh_state = exclusive[1] > > gh_flags = try[0] local_excl[5] async[6] > > error = 0 > > gh_iflags = promote[1] > > Waiter3 > > owner = 5856 > > gh_state = exclusive[1] > > gh_flags = try[0] local_excl[5] async[6] > > error = 0 > > gh_iflags = promote[1] > > Inode: busy > > lockdump.node2.dec Glock (inode[2], 1114343) > > gl_flags = > > gl_count = 2 > > gl_state = unlocked[0] > > req_gh = no > > req_bh = no > > lvb_count = 0 > > object = yes > > new_le = no > > incore_le = no > > reclaim = no > > aspace = 0 > > ail_bufs = no > > Inode: > > num = 1114343/1114343 > > type = regular[1] > > i_count = 1 > > i_flags = > > vnode = yes > > lockdump.node1.dec Glock (inode[2], 627732) > > gl_flags = dirty[5] > > gl_count = 379 > > gl_state = exclusive[1] > > req_gh = no > > req_bh = no > > lvb_count = 0 > > object = yes > > new_le = no > > incore_le = no > > reclaim = no > > aspace = 58 > > ail_bufs = no > > Holder > > owner = 5856 > > gh_state = exclusive[1] > > gh_flags = try[0] local_excl[5] async[6] > > error = 0 > > gh_iflags = promote[1] holder[6] first[7] > > Waiter2 > > owner = none[-1] > > gh_state = shared[3] > > gh_flags = try[0] > > error = 0 > > gh_iflags = demote[2] alloced[4] dealloc[5] > > Waiter3 > > owner = 32753 > > gh_state = shared[3] > > gh_flags = any[3] > > error = 0 > > gh_iflags = promote[1] > > [...loads of Waiter3 entries...] > > Waiter3 > > owner = 4566 > > gh_state = shared[3] > > gh_flags = any[3] > > error = 0 > > gh_iflags = promote[1] > > Inode: busy > > lockdump.node2.dec Glock (inode[2], 627732) > > gl_flags = lock[1] > > gl_count = 375 > > gl_state = unlocked[0] > > req_gh = yes > > req_bh = yes > > lvb_count = 0 > > object = yes > > new_le = no > > incore_le = no > > reclaim = no > > aspace = 0 > > ail_bufs = no > > Request > > owner = 20187 > > gh_state = shared[3] > > gh_flags = any[3] > > error = 0 > > gh_iflags = promote[1] > > Waiter3 > > owner = 20187 > > gh_state = shared[3] > > gh_flags = any[3] > > error = 0 > > gh_iflags = promote[1] > > [...loads of Waiter3 entries...] 
> > Waiter3 > > owner = 10460 > > gh_state = shared[3] > > gh_flags = any[3] > > error = 0 > > gh_iflags = promote[1] > > Inode: busy > > 2 requests > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- Gruss / Regards, Mark Hlawatschek http://www.atix.de/ http://www.open-sharedroot.org/ ** Visit us at CeBIT 2007 in Hannover/Germany ** ** in Hall 5, Booth G48/2 (15.-21. of March) ** ** ATIX - Ges. fuer Informationstechnologie und Consulting mbH Einsteinstr. 10 - 85716 Unterschleissheim - Germany From srramasw at cisco.com Tue Feb 13 17:08:11 2007 From: srramasw at cisco.com (Sridharan Ramaswamy (srramasw)) Date: Tue, 13 Feb 2007 09:08:11 -0800 Subject: [Linux-cluster] GFS panic in check_seg_usage Message-ID: I'm hitting this crash while testing small file create/write/delete test with low journal size. It happened twice for me, once when I enable data journaling and the other without that. This seems similar to bugzilla, https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=146672 Has anyone seen this lately? >From /var/log/message, Feb 12 16:37:42 cfs1 kernel: GFS: fsid=cisco:gfs2.0: head_off = 62144, head_wrap = 16 Feb 12 16:37:42 cfs1 kernel: GFS: fsid=cisco:gfs2.0: head_off = 62144, head_wrap = 16 Feb 12 16:37:42 cfs1 kernel: GFS: fsid=cisco:gfs2.0: dump_off = 62144, dump_wrap = 15 Feb 12 16:37:42 cfs1 kernel: GFS: fsid=cisco:gfs2.0: dump_off = 62144, dump_wrap = 15 Feb 12 16:37:42 cfs1 kernel: [] gfs_assert_i+0x48/0x69 [gfs] Feb 12 16:37:42 cfs1 kernel: [] gfs_assert_i+0x48/0x69 [gfs] Feb 12 16:37:42 cfs1 kernel: [] check_seg_usage+0x197/0x19f [gfs] Feb 12 16:37:42 cfs1 kernel: [] check_seg_usage+0x197/0x19f [gfs] Feb 12 16:37:42 cfs1 kernel: [] sync_trans+0x143/0x1b1 [gfs] >From console, gfs_log_reserve+0x19e/0x20e [gfs] glock_wait_internal gfs_glock_nq gfs_trans_begin_i iinode_init_and_link gfs_lock_nq_init gfs_lock_nq_num gfs_createi gfs_create vfs_create open_namei filp_open __cond_resched direct_strncpy_from_user sys_open sys_create syscall_call packet_rcv Kernel panic - not syncing: GFS: fsid=cisco:gfs2.0: assertion "FALSE" failed GFS: fsid=cisco:gfs2.0: function = check_seg_usage GFS: fsid=cisco:gfs2.0: file = /download/gfs/cluster/gfs-kernel/src/gfs/log.c, line 590 GFS: fsid=cisco:gfs2.0: time=1171047359 thanks, Sridharan -------------- next part -------------- An HTML attachment was scrubbed... URL: From erickson.jon at gmail.com Tue Feb 13 18:21:50 2007 From: erickson.jon at gmail.com (Jon Erickson) Date: Tue, 13 Feb 2007 13:21:50 -0500 Subject: [Linux-cluster] clusterfs.sh error In-Reply-To: <1169494901.9453.45.camel@rei.boston.devel.redhat.com> References: <6a90e4da0701191302r359ca698t3f932e9311e41e83@mail.gmail.com> <1169494901.9453.45.camel@rei.boston.devel.redhat.com> Message-ID: <6a90e4da0702131021y28ca4c90w835f1bc476b54703@mail.gmail.com> Just to let you know I updated these packages and have not received this error again. Thanks for your help. On 1/22/07, Lon Hohberger wrote: > On Fri, 2007-01-19 at 16:02 -0500, Jon Erickson wrote: > > All, > > > > Every once and a while I get a error message on no matter what system > > is the owner of the fs cluster service. 
> > > > Error: > > kernel: clusterfs.sh[21141]: segfault at 0000000000000008 rip > > 0000000000432098 rsp 0000007fbfffdd00 error 4 > > > > clurgmgrd[8879]: Stopping service fs_cluster > > clurgmgrd[8879]: Service fs_cluster is recovering > > clurgmgrd[8879]: Recovering failed fs_cluster > > clurgmgrd[8879]: Service fs_cluster startred > > > > Why is this happening? Should I be concered? > > There are two possible causes for that, both should be fixed in the > RHEL4 Update 4 packages - make sure you have the current bash & > rgmanager packages. > > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Jon From wcheng at redhat.com Tue Feb 13 18:56:29 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Tue, 13 Feb 2007 13:56:29 -0500 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <200702131752.47150.hlawatschek@atix.de> References: <200702131337.18680.hlawatschek@atix.de> <45D1CFA3.7050608@redhat.com> <200702131752.47150.hlawatschek@atix.de> Message-ID: <45D209DD.5050003@redhat.com> Mark Hlawatschek wrote: > Hi Wendy, > > thanks for your answer! > The system is still in the deadlock state, so I hopefully can collect all > information you need :-) (you'll find the crash output below) > > Thanks, > > So it is removing a file. It has obtained the directory lock and is waiting for the file lock. Look to me DLM (LM_CB_ASYNC) callback never occurs. Do you have abnormal messages in your /var/log/messages file ? Dave, how to dump the locks from DLM side to see how DLM is thinking ? -- Wendy > >>> we have the following deadlock situation: >>> >>> 2 node cluster consisting of node1 and node2. >>> /usr/local is placed on a GFS filesystem mounted on both nodes. >>> Lockmanager is dlm. >>> We are using RHEL4u4 >>> >>> a strace to ls -l /usr/local/swadmin/mnx/xml ends up in >>> lstat("/usr/local/swadmin/mnx/xml", >>> >>> This happens on both cluster nodes. >>> >>> All processes trying to access the directory /usr/local/swadmin/mnx/xml >>> are in "Waiting for IO (D)" state. I.e. system load is at about 400 ;-) >>> >>> Any ideas ? >>> >> Quickly browsing this, look to me that process with pid=5856 got stuck. >> That process had the file or directory (ino number 627732 - probably >> /usr/local/swadmin/mnx/xml) exclusive lock so everyone was waiting for >> it. The faulty process was apparently in the middle of obtaining another >> exclusive lock (and almost got it). We need to know where pid=5856 was >> stuck at that time. If this occurs again, could you use "crash" to back >> trace that process and show us the output ? 
>> > > Here's the crash output: > > crash> bt 5856 > PID: 5856 TASK: 10bd26427f0 CPU: 0 COMMAND: "java" > #0 [10bd20cfbc8] schedule at ffffffff8030a1d1 > #1 [10bd20cfca0] wait_for_completion at ffffffff8030a415 > #2 [10bd20cfd20] glock_wait_internal at ffffffffa018574e > #3 [10bd20cfd60] gfs_glock_nq_m at ffffffffa01860ce > #4 [10bd20cfda0] gfs_unlink at ffffffffa019ce41 > #5 [10bd20cfea0] vfs_unlink at ffffffff801889fa > #6 [10bd20cfed0] sys_unlink at ffffffff80188b19 > #7 [10bd20cff30] filp_close at ffffffff80178e48 > #8 [10bd20cff50] error_exit at ffffffff80110d91 > RIP: 0000002a9593f649 RSP: 0000007fbfffbca0 RFLAGS: 00010206 > RAX: 0000000000000057 RBX: ffffffff8011026a RCX: 0000002a9cc9c870 > RDX: 0000002ae5989000 RSI: 0000002a962fa3a8 RDI: 0000002ae5989000 > RBP: 0000000000000000 R8: 0000002a9630abb0 R9: 0000000000000ffc > R10: 0000002a9630abc0 R11: 0000000000000206 R12: 0000000040115700 > R13: 0000002ae23294b0 R14: 0000007fbfffc300 R15: 0000002ae5989000 > ORIG_RAX: 0000000000000057 CS: 0033 SS: 002b > > >>> a lockdump analysis with the decipher_lockstate_dump and parse_lockdump >>> shows the following output (The whole file is too large for the >>> mailing-list): >>> >>> Entries: 101939 >>> Glocks: 60112 >>> PIDs: 751 >>> >>> 4 chain: >>> lockdump.node1.dec Glock (inode[2], 1114343) >>> gl_flags = lock[1] >>> gl_count = 5 >>> gl_state = shared[3] >>> req_gh = yes >>> req_bh = yes >>> lvb_count = 0 >>> object = yes >>> new_le = no >>> incore_le = no >>> reclaim = no >>> aspace = 1 >>> ail_bufs = no >>> Request >>> owner = 5856 >>> gh_state = exclusive[1] >>> gh_flags = try[0] local_excl[5] async[6] >>> error = 0 >>> gh_iflags = promote[1] >>> Waiter3 >>> owner = 5856 >>> gh_state = exclusive[1] >>> gh_flags = try[0] local_excl[5] async[6] >>> error = 0 >>> gh_iflags = promote[1] >>> Inode: busy >>> lockdump.node2.dec Glock (inode[2], 1114343) >>> gl_flags = >>> gl_count = 2 >>> gl_state = unlocked[0] >>> req_gh = no >>> req_bh = no >>> lvb_count = 0 >>> object = yes >>> new_le = no >>> incore_le = no >>> reclaim = no >>> aspace = 0 >>> ail_bufs = no >>> Inode: >>> num = 1114343/1114343 >>> type = regular[1] >>> i_count = 1 >>> i_flags = >>> vnode = yes >>> lockdump.node1.dec Glock (inode[2], 627732) >>> gl_flags = dirty[5] >>> gl_count = 379 >>> gl_state = exclusive[1] >>> req_gh = no >>> req_bh = no >>> lvb_count = 0 >>> object = yes >>> new_le = no >>> incore_le = no >>> reclaim = no >>> aspace = 58 >>> ail_bufs = no >>> Holder >>> owner = 5856 >>> gh_state = exclusive[1] >>> gh_flags = try[0] local_excl[5] async[6] >>> error = 0 >>> gh_iflags = promote[1] holder[6] first[7] >>> Waiter2 >>> owner = none[-1] >>> gh_state = shared[3] >>> gh_flags = try[0] >>> error = 0 >>> gh_iflags = demote[2] alloced[4] dealloc[5] >>> Waiter3 >>> owner = 32753 >>> gh_state = shared[3] >>> gh_flags = any[3] >>> error = 0 >>> gh_iflags = promote[1] >>> [...loads of Waiter3 entries...] 
>>> Waiter3 >>> owner = 4566 >>> gh_state = shared[3] >>> gh_flags = any[3] >>> error = 0 >>> gh_iflags = promote[1] >>> Inode: busy >>> lockdump.node2.dec Glock (inode[2], 627732) >>> gl_flags = lock[1] >>> gl_count = 375 >>> gl_state = unlocked[0] >>> req_gh = yes >>> req_bh = yes >>> lvb_count = 0 >>> object = yes >>> new_le = no >>> incore_le = no >>> reclaim = no >>> aspace = 0 >>> ail_bufs = no >>> Request >>> owner = 20187 >>> gh_state = shared[3] >>> gh_flags = any[3] >>> error = 0 >>> gh_iflags = promote[1] >>> Waiter3 >>> owner = 20187 >>> gh_state = shared[3] >>> gh_flags = any[3] >>> error = 0 >>> gh_iflags = promote[1] >>> [...loads of Waiter3 entries...] >>> Waiter3 >>> owner = 10460 >>> gh_state = shared[3] >>> gh_flags = any[3] >>> error = 0 >>> gh_iflags = promote[1] >>> Inode: busy >>> 2 requests >>> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > From erickson.jon at gmail.com Tue Feb 13 19:19:25 2007 From: erickson.jon at gmail.com (Jon Erickson) Date: Tue, 13 Feb 2007 14:19:25 -0500 Subject: [Linux-cluster] DLM Locks and Memory issue Message-ID: <6a90e4da0702131119n18b73e8av244262bd7705287c@mail.gmail.com> All, I've been testing GFS with millions and millions of files and my system performance degrades as the number of locks increases. Along with the number of locks being in the millions, all of my system memory is used. After performing a umount and mount of my GFS file system the locks go back down to zero and all the memory is reclaimed. This of course improves my system performance. Is there a way to release all locks and memory associated without unmounting my file system? Should I try the patch in the bug report? Comment #31: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=214239 -- Jon From teigland at redhat.com Tue Feb 13 19:42:46 2007 From: teigland at redhat.com (David Teigland) Date: Tue, 13 Feb 2007 13:42:46 -0600 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <45D209DD.5050003@redhat.com> References: <200702131337.18680.hlawatschek@atix.de> <45D1CFA3.7050608@redhat.com> <200702131752.47150.hlawatschek@atix.de> <45D209DD.5050003@redhat.com> Message-ID: <20070213194246.GA28116@redhat.com> On Tue, Feb 13, 2007 at 01:56:29PM -0500, Wendy Cheng wrote: > Mark Hlawatschek wrote: > >Hi Wendy, > > > >thanks for your answer! > >The system is still in the deadlock state, so I hopefully can collect all > >information you need :-) (you'll find the crash output below) > > > >Thanks, > > > > > So it is removing a file. It has obtained the directory lock and is > waiting for the file lock. Look to me DLM (LM_CB_ASYNC) callback never > occurs. Do you have abnormal messages in your /var/log/messages file ? > Dave, how to dump the locks from DLM side to see how DLM is thinking ? echo "lockspace name" >> /proc/cluster/dlm_locks cat /proc/cluster/dlm_locks > locks.txt Dave From wcheng at redhat.com Tue Feb 13 19:56:28 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Tue, 13 Feb 2007 14:56:28 -0500 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <45D209DD.5050003@redhat.com> References: <200702131337.18680.hlawatschek@atix.de> <45D1CFA3.7050608@redhat.com> <200702131752.47150.hlawatschek@atix.de> <45D209DD.5050003@redhat.com> Message-ID: <45D217EC.7080907@redhat.com> Wendy Cheng wrote: > Mark Hlawatschek wrote: >> Hi Wendy, >> >> thanks for your answer! 
>> The system is still in the deadlock state, so I hopefully can collect >> all information you need :-) (you'll find the crash output below) >> >> Thanks, >> >> > So it is removing a file. It has obtained the directory lock and is > waiting for the file lock. Look to me DLM (LM_CB_ASYNC) callback never > occurs. Do you have abnormal messages in your /var/log/messages file ? > Dave, how to dump the locks from DLM side to see how DLM is thinking ? > Sorry, stepped out for lunch - was hoping Dave would take over this :) ... anyway, please dump DLM locks as the following: shell> cman_tool services /* find your lock space name */ shell> echo "lock-space-name-found-above" > /proc/cluster/dlm_locks shell> cat /proc/cluster/dlm_locks Then try to find the lock (2, hex of (1114343)) - cut and paste the contents of that file here. -- Wendy | | >> >>>> we have the following deadlock situation: >>>> >>>> 2 node cluster consisting of node1 and node2. >>>> /usr/local is placed on a GFS filesystem mounted on both nodes. >>>> Lockmanager is dlm. >>>> We are using RHEL4u4 >>>> >>>> a strace to ls -l /usr/local/swadmin/mnx/xml ends up in >>>> lstat("/usr/local/swadmin/mnx/xml", >>>> >>>> This happens on both cluster nodes. >>>> >>>> All processes trying to access the directory >>>> /usr/local/swadmin/mnx/xml >>>> are in "Waiting for IO (D)" state. I.e. system load is at about 400 >>>> ;-) >>>> >>>> Any ideas ? >>>> >>> Quickly browsing this, look to me that process with pid=5856 got stuck. >>> That process had the file or directory (ino number 627732 - probably >>> /usr/local/swadmin/mnx/xml) exclusive lock so everyone was waiting for >>> it. The faulty process was apparently in the middle of obtaining >>> another >>> exclusive lock (and almost got it). We need to know where pid=5856 was >>> stuck at that time. If this occurs again, could you use "crash" to back >>> trace that process and show us the output ? 
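Put together as one sequence, the steps above look like this (a sketch; the lock space name is a placeholder for whatever cman_tool services reports for this filesystem):

$ cman_tool services
$ echo "lockspacename" > /proc/cluster/dlm_locks
$ cat /proc/cluster/dlm_locks > /tmp/dlm_locks.txt
$ printf '%x\n' 1114343       # the GFS inode number, as it appears in hex in the DLM resource name
1100e7
$ grep -A 6 "1100e7" /tmp/dlm_locks.txt

The leading "2" in the resource name is the glock type (inode[2]), so the entry to compare on both nodes is the resource whose name ends in 1100e7.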
>>> >> >> Here's the crash output: >> >> crash> bt 5856 >> PID: 5856 TASK: 10bd26427f0 CPU: 0 COMMAND: "java" >> #0 [10bd20cfbc8] schedule at ffffffff8030a1d1 >> #1 [10bd20cfca0] wait_for_completion at ffffffff8030a415 >> #2 [10bd20cfd20] glock_wait_internal at ffffffffa018574e >> #3 [10bd20cfd60] gfs_glock_nq_m at ffffffffa01860ce >> #4 [10bd20cfda0] gfs_unlink at ffffffffa019ce41 >> #5 [10bd20cfea0] vfs_unlink at ffffffff801889fa >> #6 [10bd20cfed0] sys_unlink at ffffffff80188b19 >> #7 [10bd20cff30] filp_close at ffffffff80178e48 >> #8 [10bd20cff50] error_exit at ffffffff80110d91 >> RIP: 0000002a9593f649 RSP: 0000007fbfffbca0 RFLAGS: 00010206 >> RAX: 0000000000000057 RBX: ffffffff8011026a RCX: 0000002a9cc9c870 >> RDX: 0000002ae5989000 RSI: 0000002a962fa3a8 RDI: 0000002ae5989000 >> RBP: 0000000000000000 R8: 0000002a9630abb0 R9: 0000000000000ffc >> R10: 0000002a9630abc0 R11: 0000000000000206 R12: 0000000040115700 >> R13: 0000002ae23294b0 R14: 0000007fbfffc300 R15: 0000002ae5989000 >> ORIG_RAX: 0000000000000057 CS: 0033 SS: 002b >> >> >>>> a lockdump analysis with the decipher_lockstate_dump and >>>> parse_lockdump >>>> shows the following output (The whole file is too large for the >>>> mailing-list): >>>> >>>> Entries: 101939 >>>> Glocks: 60112 >>>> PIDs: 751 >>>> >>>> 4 chain: >>>> lockdump.node1.dec Glock (inode[2], 1114343) >>>> gl_flags = lock[1] >>>> gl_count = 5 >>>> gl_state = shared[3] >>>> req_gh = yes >>>> req_bh = yes >>>> lvb_count = 0 >>>> object = yes >>>> new_le = no >>>> incore_le = no >>>> reclaim = no >>>> aspace = 1 >>>> ail_bufs = no >>>> Request >>>> owner = 5856 >>>> gh_state = exclusive[1] >>>> gh_flags = try[0] local_excl[5] async[6] >>>> error = 0 >>>> gh_iflags = promote[1] >>>> Waiter3 >>>> owner = 5856 >>>> gh_state = exclusive[1] >>>> gh_flags = try[0] local_excl[5] async[6] >>>> error = 0 >>>> gh_iflags = promote[1] >>>> Inode: busy >>>> lockdump.node2.dec Glock (inode[2], 1114343) >>>> gl_flags = >>>> gl_count = 2 >>>> gl_state = unlocked[0] >>>> req_gh = no >>>> req_bh = no >>>> lvb_count = 0 >>>> object = yes >>>> new_le = no >>>> incore_le = no >>>> reclaim = no >>>> aspace = 0 >>>> ail_bufs = no >>>> Inode: >>>> num = 1114343/1114343 >>>> type = regular[1] >>>> i_count = 1 >>>> i_flags = >>>> vnode = yes >>>> lockdump.node1.dec Glock (inode[2], 627732) >>>> gl_flags = dirty[5] >>>> gl_count = 379 >>>> gl_state = exclusive[1] >>>> req_gh = no >>>> req_bh = no >>>> lvb_count = 0 >>>> object = yes >>>> new_le = no >>>> incore_le = no >>>> reclaim = no >>>> aspace = 58 >>>> ail_bufs = no >>>> Holder >>>> owner = 5856 >>>> gh_state = exclusive[1] >>>> gh_flags = try[0] local_excl[5] async[6] >>>> error = 0 >>>> gh_iflags = promote[1] holder[6] first[7] >>>> Waiter2 >>>> owner = none[-1] >>>> gh_state = shared[3] >>>> gh_flags = try[0] >>>> error = 0 >>>> gh_iflags = demote[2] alloced[4] dealloc[5] >>>> Waiter3 >>>> owner = 32753 >>>> gh_state = shared[3] >>>> gh_flags = any[3] >>>> error = 0 >>>> gh_iflags = promote[1] >>>> [...loads of Waiter3 entries...] 
>>>> Waiter3 >>>> owner = 4566 >>>> gh_state = shared[3] >>>> gh_flags = any[3] >>>> error = 0 >>>> gh_iflags = promote[1] >>>> Inode: busy >>>> lockdump.node2.dec Glock (inode[2], 627732) >>>> gl_flags = lock[1] >>>> gl_count = 375 >>>> gl_state = unlocked[0] >>>> req_gh = yes >>>> req_bh = yes >>>> lvb_count = 0 >>>> object = yes >>>> new_le = no >>>> incore_le = no >>>> reclaim = no >>>> aspace = 0 >>>> ail_bufs = no >>>> Request >>>> owner = 20187 >>>> gh_state = shared[3] >>>> gh_flags = any[3] >>>> error = 0 >>>> gh_iflags = promote[1] >>>> Waiter3 >>>> owner = 20187 >>>> gh_state = shared[3] >>>> gh_flags = any[3] >>>> error = 0 >>>> gh_iflags = promote[1] >>>> [...loads of Waiter3 entries...] >>>> Waiter3 >>>> owner = 10460 >>>> gh_state = shared[3] >>>> gh_flags = any[3] >>>> error = 0 >>>> gh_iflags = promote[1] >>>> Inode: busy >>>> 2 requests >>>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >>> >> >> > > From wcheng at redhat.com Tue Feb 13 23:44:38 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Tue, 13 Feb 2007 18:44:38 -0500 Subject: [Linux-cluster] DLM Locks and Memory issue In-Reply-To: <6a90e4da0702131119n18b73e8av244262bd7705287c@mail.gmail.com> References: <6a90e4da0702131119n18b73e8av244262bd7705287c@mail.gmail.com> Message-ID: <45D24D66.2080104@redhat.com> Jon Erickson wrote: > All, > > I've been testing GFS with millions and millions of files and my > system performance degrades as the number of locks increases. Along > with the number of locks being in the millions, all of my system > memory is used. After performing a umount and mount of my GFS file > system the locks go back down to zero and all the memory is reclaimed. > This of course improves my system performance. Is there a way to > release all locks and memory associated without unmounting my file > system? > > Should I try the patch in the bug report? > Comment #31: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=214239 > Yes, please do. Check out: http://people.redhat.com/wcheng/Patches/GFS/R4/readme -- Wendy From hlawatschek at atix.de Wed Feb 14 08:17:30 2007 From: hlawatschek at atix.de (Mark Hlawatschek) Date: Wed, 14 Feb 2007 09:17:30 +0100 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <45D217EC.7080907@redhat.com> References: <200702131337.18680.hlawatschek@atix.de> <45D209DD.5050003@redhat.com> <45D217EC.7080907@redhat.com> Message-ID: <200702140917.31024.hlawatschek@atix.de> On Tuesday 13 February 2007 20:56, Wendy Cheng wrote: > Wendy Cheng wrote: > > > > So it is removing a file. It has obtained the directory lock and is > > waiting for the file lock. Look to me DLM (LM_CB_ASYNC) callback never > > occurs. Do you have abnormal messages in your /var/log/messages file ? > > Dave, how to dump the locks from DLM side to see how DLM is thinking ? > > shell> cman_tool services /* find your lock space name */ > shell> echo "lock-space-name-found-above" > /proc/cluster/dlm_locks > shell> cat /proc/cluster/dlm_locks > > Then try to find the lock (2, hex of (1114343)) - cut and paste the > contents of that file here. syslog seems to be ok. note, that the process 5856 is running on node1 Here's the dlm output: node1: Resource 0000010001218088 (parent 0000000000000000). Name (len=24) " 2 1100e7" Local Copy, Master is node 2 Granted Queue Conversion Queue Waiting Queue 5eb00178 PR (EX) Master: 3eeb0117 LQ: 0,0x5 [...] Resource 00000100f56f0618 (parent 0000000000000000). 
Name (len=24) " 5 1100e7" Local Copy, Master is node 2 Granted Queue 5bc20257 PR Master: 3d9703e0 Conversion Queue Waiting Queue node2: Resource 00000107e462c8c8 (parent 0000000000000000). Name (len=24) " 2 1100e7" Master Copy Granted Queue 3eeb0117 PR Remote: 1 5eb00178 Conversion Queue Waiting Queue [...] Resource 000001079f7e81d8 (parent 0000000000000000). Name (len=24) " 5 1100e7" Master Copy Granted Queue 3d9703e0 PR Remote: 1 5bc20257 3e500091 PR Conversion Queue Waiting Queue Thanks for your help, Mark > >>>> we have the following deadlock situation: > >>>> > >>>> 2 node cluster consisting of node1 and node2. > >>>> /usr/local is placed on a GFS filesystem mounted on both nodes. > >>>> Lockmanager is dlm. > >>>> We are using RHEL4u4 > >>>> > >>>> a strace to ls -l /usr/local/swadmin/mnx/xml ends up in > >>>> lstat("/usr/local/swadmin/mnx/xml", > >>>> > >>>> This happens on both cluster nodes. > >>>> > >>>> All processes trying to access the directory > >>>> /usr/local/swadmin/mnx/xml > >>>> are in "Waiting for IO (D)" state. I.e. system load is at about 400 > >>>> ;-) > >>>> > >>>> Any ideas ? > >>> > >>> Quickly browsing this, look to me that process with pid=5856 got stuck. > >>> That process had the file or directory (ino number 627732 - probably > >>> /usr/local/swadmin/mnx/xml) exclusive lock so everyone was waiting for > >>> it. The faulty process was apparently in the middle of obtaining > >>> another > >>> exclusive lock (and almost got it). We need to know where pid=5856 was > >>> stuck at that time. If this occurs again, could you use "crash" to back > >>> trace that process and show us the output ? > >> > >> Here's the crash output: > >> > >> crash> bt 5856 > >> PID: 5856 TASK: 10bd26427f0 CPU: 0 COMMAND: "java" > >> #0 [10bd20cfbc8] schedule at ffffffff8030a1d1 > >> #1 [10bd20cfca0] wait_for_completion at ffffffff8030a415 > >> #2 [10bd20cfd20] glock_wait_internal at ffffffffa018574e > >> #3 [10bd20cfd60] gfs_glock_nq_m at ffffffffa01860ce > >> #4 [10bd20cfda0] gfs_unlink at ffffffffa019ce41 > >> #5 [10bd20cfea0] vfs_unlink at ffffffff801889fa > >> #6 [10bd20cfed0] sys_unlink at ffffffff80188b19 > >> #7 [10bd20cff30] filp_close at ffffffff80178e48 > >> #8 [10bd20cff50] error_exit at ffffffff80110d91 > >> RIP: 0000002a9593f649 RSP: 0000007fbfffbca0 RFLAGS: 00010206 > >> RAX: 0000000000000057 RBX: ffffffff8011026a RCX: 0000002a9cc9c870 > >> RDX: 0000002ae5989000 RSI: 0000002a962fa3a8 RDI: 0000002ae5989000 > >> RBP: 0000000000000000 R8: 0000002a9630abb0 R9: 0000000000000ffc > >> R10: 0000002a9630abc0 R11: 0000000000000206 R12: 0000000040115700 > >> R13: 0000002ae23294b0 R14: 0000007fbfffc300 R15: 0000002ae5989000 > >> ORIG_RAX: 0000000000000057 CS: 0033 SS: 002b > >> > >>>> a lockdump analysis with the decipher_lockstate_dump and > >>>> parse_lockdump > >>>> shows the following output (The whole file is too large for the > >>>> mailing-list): > >>>> > >>>> Entries: 101939 > >>>> Glocks: 60112 > >>>> PIDs: 751 > >>>> > >>>> 4 chain: > >>>> lockdump.node1.dec Glock (inode[2], 1114343) > >>>> gl_flags = lock[1] > >>>> gl_count = 5 > >>>> gl_state = shared[3] > >>>> req_gh = yes > >>>> req_bh = yes > >>>> lvb_count = 0 > >>>> object = yes > >>>> new_le = no > >>>> incore_le = no > >>>> reclaim = no > >>>> aspace = 1 > >>>> ail_bufs = no > >>>> Request > >>>> owner = 5856 > >>>> gh_state = exclusive[1] > >>>> gh_flags = try[0] local_excl[5] async[6] > >>>> error = 0 > >>>> gh_iflags = promote[1] > >>>> Waiter3 > >>>> owner = 5856 > >>>> gh_state = exclusive[1] > >>>> gh_flags = 
try[0] local_excl[5] async[6] > >>>> error = 0 > >>>> gh_iflags = promote[1] > >>>> Inode: busy > >>>> lockdump.node2.dec Glock (inode[2], 1114343) > >>>> gl_flags = > >>>> gl_count = 2 > >>>> gl_state = unlocked[0] > >>>> req_gh = no > >>>> req_bh = no > >>>> lvb_count = 0 > >>>> object = yes > >>>> new_le = no > >>>> incore_le = no > >>>> reclaim = no > >>>> aspace = 0 > >>>> ail_bufs = no > >>>> Inode: > >>>> num = 1114343/1114343 > >>>> type = regular[1] > >>>> i_count = 1 > >>>> i_flags = > >>>> vnode = yes > >>>> lockdump.node1.dec Glock (inode[2], 627732) > >>>> gl_flags = dirty[5] > >>>> gl_count = 379 > >>>> gl_state = exclusive[1] > >>>> req_gh = no > >>>> req_bh = no > >>>> lvb_count = 0 > >>>> object = yes > >>>> new_le = no > >>>> incore_le = no > >>>> reclaim = no > >>>> aspace = 58 > >>>> ail_bufs = no > >>>> Holder > >>>> owner = 5856 > >>>> gh_state = exclusive[1] > >>>> gh_flags = try[0] local_excl[5] async[6] > >>>> error = 0 > >>>> gh_iflags = promote[1] holder[6] first[7] > >>>> Waiter2 > >>>> owner = none[-1] > >>>> gh_state = shared[3] > >>>> gh_flags = try[0] > >>>> error = 0 > >>>> gh_iflags = demote[2] alloced[4] dealloc[5] > >>>> Waiter3 > >>>> owner = 32753 > >>>> gh_state = shared[3] > >>>> gh_flags = any[3] > >>>> error = 0 > >>>> gh_iflags = promote[1] > >>>> [...loads of Waiter3 entries...] > >>>> Waiter3 > >>>> owner = 4566 > >>>> gh_state = shared[3] > >>>> gh_flags = any[3] > >>>> error = 0 > >>>> gh_iflags = promote[1] > >>>> Inode: busy > >>>> lockdump.node2.dec Glock (inode[2], 627732) > >>>> gl_flags = lock[1] > >>>> gl_count = 375 > >>>> gl_state = unlocked[0] > >>>> req_gh = yes > >>>> req_bh = yes > >>>> lvb_count = 0 > >>>> object = yes > >>>> new_le = no > >>>> incore_le = no > >>>> reclaim = no > >>>> aspace = 0 > >>>> ail_bufs = no > >>>> Request > >>>> owner = 20187 > >>>> gh_state = shared[3] > >>>> gh_flags = any[3] > >>>> error = 0 > >>>> gh_iflags = promote[1] > >>>> Waiter3 > >>>> owner = 20187 > >>>> gh_state = shared[3] > >>>> gh_flags = any[3] > >>>> error = 0 > >>>> gh_iflags = promote[1] > >>>> [...loads of Waiter3 entries...] > >>>> Waiter3 > >>>> owner = 10460 > >>>> gh_state = shared[3] > >>>> gh_flags = any[3] > >>>> error = 0 > >>>> gh_iflags = promote[1] > >>>> Inode: busy > >>>> 2 requests > >>> > >>> -- -- Gruss / Regards, Mark Hlawatschek http://www.atix.de/ http://www.open-sharedroot.org/ ** Visit us at CeBIT 2007 in Hannover/Germany ** ** in Hall 5, Booth G48/2 (15.-21. of March) ** ** ATIX - Ges. fuer Informationstechnologie und Consulting mbH Einsteinstr. 10 - 85716 Unterschleissheim - Germany From frederik.ferner at diamond.ac.uk Wed Feb 14 13:05:04 2007 From: frederik.ferner at diamond.ac.uk (Frederik Ferner) Date: Wed, 14 Feb 2007 13:05:04 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced Message-ID: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> Hi All, I've recently run into the problem that in one of my clusters the second node doesn't join the cluster anymore. First some background on my setup here. I have a couple of two node clusters connected to a common storage each. They're basically identical setups running basically RHEL4U4 and corresponding cluster suite. Everything was running fine until yesterday in one clusters one node (i04-storage2) was fenced and can't seem to join the cluster anymore, all I could find was messages in the log files of i04-storage2 telling me "kernel: CMAN: sending membership request" over and over again. 
On the node still in the cluster (i04-storage1) I could see nothing in any log files. To get i04-storage2 back into my cluster, I tried to fence it again using fence_tool on i04-storage1 without success. The node gets fenced, as I can see on i04-storage1 in the log. When I increased the version of the cluster config on the working node, the join request was rejected directly but the same timeout occured when I copied the new configuration and tried to start the cluster suite again. There's no firewall on any computer involved, both are connected to the same switch. Using wireshark I can see UDP packets with source and destination port 6809 going from i04-storage2 to i04-storage1 and from i04-storage1 to the network broadcast address. No other network traffic seems to be going between these two hosts. The same setup used to work fine. All other clusters are supposed to be identical to that one and I don't see that kind of behaviour. If there's a difference, I can't spot it. Does anyone have any suggestions what else I could look for? What could be wrong here? If you need any other bits of information that I haven't supplied, please ask. Many thanks in advance, Frederik -- Frederik Ferner Systems Administrator Phone: +44 (0)1235-778624 Diamond Light Source Fax: +44 (0)1235-778468 From admin.cluster at gmail.com Wed Feb 14 13:48:28 2007 From: admin.cluster at gmail.com (Anthony) Date: Wed, 14 Feb 2007 14:48:28 +0100 Subject: [Linux-cluster] how to get the Gflops power? Message-ID: <45D3132C.9000902@gmail.com> Hello, I need to get the gflops value of my cluster , so that i compare it with teh top500 :-D so my question is: how can i get the Gflops values of each server node? heres is one listing of a server node cpuinfo: # cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 5 model name : AMD Opteron(tm) Processor 850 stepping : 10 cpu MHz : 2390.185 cache size : 1024 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow bogomips : 4718.59 TLB size : 1088 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp processor : 1 vendor_id : AuthenticAMD cpu family : 15 model : 5 model name : AMD Opteron(tm) Processor 850 stepping : 10 cpu MHz : 2390.185 cache size : 1024 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow bogomips : 4767.74 TLB size : 1088 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp processor : 2 vendor_id : AuthenticAMD cpu family : 15 model : 5 model name : AMD Opteron(tm) Processor 850 stepping : 10 cpu MHz : 2390.185 cache size : 1024 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow bogomips : 4767.74 TLB size : 1088 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp processor : 3 vendor_id : AuthenticAMD cpu family : 15 model : 5 model name : AMD Opteron(tm) Processor 850 stepping : 10 cpu MHz : 2390.185 cache size : 1024 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : 
yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow bogomips : 4767.74 TLB size : 1088 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp From pcaulfie at redhat.com Wed Feb 14 14:06:30 2007 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Wed, 14 Feb 2007 14:06:30 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> Message-ID: <45D31766.3080908@redhat.com> Frederik Ferner wrote: > Hi All, > > I've recently run into the problem that in one of my clusters the second > node doesn't join the cluster anymore. > > First some background on my setup here. I have a couple of two node > clusters connected to a common storage each. They're basically identical > setups running basically RHEL4U4 and corresponding cluster suite. > Everything was running fine until yesterday in one clusters one node > (i04-storage2) was fenced and can't seem to join the cluster anymore, > all I could find was messages in the log files of i04-storage2 telling > me "kernel: CMAN: sending membership request" over and over again. On > the node still in the cluster (i04-storage1) I could see nothing in any > log files. > > To get i04-storage2 back into my cluster, I tried to fence it again > using fence_tool on i04-storage1 without success. The node gets fenced, > as I can see on i04-storage1 in the log. When I increased the version of > the cluster config on the working node, the join request was rejected > directly but the same timeout occured when I copied the new > configuration and tried to start the cluster suite again. > > There's no firewall on any computer involved, both are connected to the > same switch. Using wireshark I can see UDP packets with source and > destination port 6809 going from i04-storage2 to i04-storage1 and from > i04-storage1 to the network broadcast address. No other network traffic > seems to be going between these two hosts. > > The same setup used to work fine. All other clusters are supposed to be > identical to that one and I don't see that kind of behaviour. If there's > a difference, I can't spot it. > > Does anyone have any suggestions what else I could look for? What could > be wrong here? > > If you need any other bits of information that I haven't supplied, > please ask. The main reason a node would repeatedly try to rejoin a cluster is that it gets told to "wait" by the remaining nodes. This happens when the remaining cluster nodes are still in transition state (ie they haven't sorted out the cluster after the node has left). Normally this state only lasts a fraction of a second or maybe a handful of seconds for a very large cluster. As you only have one node in the cluster It sounds like the remaining node may be in some strange state that it can't get out of. I'm not sure what that would be off-hand... - it must be able to see the fenced nodes 'joinreq' messages because if you increment the config version in reject it. - it can't even be in transition here for the same reason ... the transition state is checked before the validity of the joinreq message so the former case would also fail! Can you check the output of 'cman_tool status' and see what state the remaining node is in. 
It might also be worth sending me the 'tcpdump -s0 -x port 6809' output in case that shows anything useful. -- patrick From isplist at logicore.net Wed Feb 14 14:02:16 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Wed, 14 Feb 2007 08:02:16 -0600 Subject: [Linux-cluster] Can't see all volumes Message-ID: <20072148216.915974@leena> With all of the tech's on this list... no one has seen this problem? Thought I'd ask again since I'm still stumped. By the way, anyone know of any user groups, forums for the Xyratex/MTI style storage chassis? Mike >This might or might not be the right list and if it is not, does anyone know >one that would cover things like fibre channel RAID devices? >I have an 800GB RAID drive which I have split into 32 smaller volumes. This >is a fibre channel system, Xyratex. The problem is that I cannot see any more >than 2 volumes per controller when I check from any of the nodes. >Basically, I would like to set up a small volume for each node along with >their various GFS volumes. From dnlombar at ichips.intel.com Wed Feb 14 15:32:09 2007 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed, 14 Feb 2007 07:32:09 -0800 Subject: [Linux-cluster] how to get the Gflops power? In-Reply-To: <45D3132C.9000902@gmail.com> References: <45D3132C.9000902@gmail.com> Message-ID: <20070214153209.GA24846@nlxdcldnl2.cl.intel.com> On Wed, Feb 14, 2007 at 02:48:28PM +0100, Anthony wrote: > Hello, > > I need to get the gflops value of my cluster , so that i compare it with > teh top500 :-D > > so my question is: how can i get the Gflops values of each server node? > > heres is one listing of a server node cpuinfo: Top500 ranking is based on the performance of "HPL" or the "High-Performance Linpack Benchmark for Distributed-Memory Computers". It's a non-trivial activity to get the number, as you need to understand what the benchmark is doing and how to tune it for your system. You also need an increasingly faster (lower latency and higher bandwidth) network as cluster size increases. The minimum entry has exceeded 1 TF in the last four lists (i.e., since ISC 2005)--the current #500 is almost three times faster at 2736 GF; the current #1 is two orders of magnitude larger at 280.6 TF. At any rate, take a look at Top500 itself and especially the "About" page if you want to understand more or give it a go. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From shirai at sc-i.co.jp Wed Feb 14 15:00:25 2007 From: shirai at sc-i.co.jp (shirai@SystemCreateInc) Date: Thu, 15 Feb 2007 00:00:25 +0900 Subject: [Linux-cluster] Can't see all volumes References: <20072148216.915974@leena> Message-ID: <24FC2E99D9E64349B866EAC79D68A9A0@ZSTAR> Hi Mike Could you show the output of cat /proc/scsi/scsi? I think that it can recognize only the drive of LUN number 0. 1. Add the following line to/etc/modprobe.conf. options scsi_mod.o max_scsi_luns=255 2. Next, this setting must be read when you boot OS. #mkinitrd /boot/< your initrd image > `uname -r` Regard ------------------------------------------------------ Shirai Noriyuki Chief Engineer Technical Div. System Create Inc Kanda Toyo Bldg, 3-4-2 Kandakajicho Chiyodaku Tokyo 101-0045 Japan Tel81-3-5296-3775 Fax81-3-5296-3777 e-mail:shirai at sc-i.co.jp web:http://www.sc-i.co.jp ------------------------------------------------------ > With all of the tech's on this list... no one has seen this problem? > Thought I'd ask again since I'm still stumped. 
> > By the way, anyone know of any user groups, forums for the Xyratex/MTI style > storage chassis? > > Mike > > >This might or might not be the right list and if it is not, does anyone know > >one that would cover things like fibre channel RAID devices? > > >I have an 800GB RAID drive which I have split into 32 smaller volumes. This > >is a fibre channel system, Xyratex. The problem is that I cannot see any more > >than 2 volumes per controller when I check from any of the nodes. > > >Basically, I would like to set up a small volume for each node along with > >their various GFS volumes. > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > From teigland at redhat.com Wed Feb 14 15:59:57 2007 From: teigland at redhat.com (David Teigland) Date: Wed, 14 Feb 2007 09:59:57 -0600 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <200702140917.31024.hlawatschek@atix.de> References: <200702131337.18680.hlawatschek@atix.de> <45D209DD.5050003@redhat.com> <45D217EC.7080907@redhat.com> <200702140917.31024.hlawatschek@atix.de> Message-ID: <20070214155957.GA22795@redhat.com> > node1: > Resource 0000010001218088 (parent 0000000000000000). Name (len=24) " 2 > 1100e7" > Local Copy, Master is node 2 > Granted Queue > Conversion Queue > Waiting Queue > 5eb00178 PR (EX) Master: 3eeb0117 LQ: 0,0x5 > node2: > Resource 00000107e462c8c8 (parent 0000000000000000). Name (len=24) " 2 > 1100e7" > Master Copy > Granted Queue > 3eeb0117 PR Remote: 1 5eb00178 > Conversion Queue > Waiting Queue The state of the lock on node1 looks bad. I'm studying the code and struggling to understand how it could possibly arrive in that state. Some things to notice: - the lock is converting, it should be on the Conversion Queue, not the Waiting Queue - lockqueue_state is 0, so either node1 has not sent a remote request to node2 at all, or node1 did send something and already received some kind of reply so it's not waiting for a reply any longer - the state of the lock on node2 looks normal Did you check for suspicious syslog messages on both nodes? Did any nodes on this fs mount, unmount or fail around the time this happened? Has this happened before? If you'd like to try to reproduce this with some dlm debugging I could send you a patch (although this is such an odd state I'm not sure yet where I'd begin to add debugging.) Dave From erickson.jon at gmail.com Wed Feb 14 15:57:02 2007 From: erickson.jon at gmail.com (Jon Erickson) Date: Wed, 14 Feb 2007 10:57:02 -0500 Subject: [Linux-cluster] DLM Locks and Memory issue In-Reply-To: <45D24D66.2080104@redhat.com> References: <6a90e4da0702131119n18b73e8av244262bd7705287c@mail.gmail.com> <45D24D66.2080104@redhat.com> Message-ID: <6a90e4da0702140757k482fc7o9543037dfda3a56b@mail.gmail.com> Wendy, I tried using the ko files from your directory, but I received a Invalid Symbol message. i running two separate clusters with the following packages. Can you create a ko files that will work in these environments to test? Envirorment One x86_64 uname -r = 2.6.9-42.0.3.ELsmp GFS-6.1.6-1 GFS-kernel-smp-2.6.9-60.3 GFS-kernheaders-2.6.9-60.3 Enviroment Two i686 uname -r = 2.6.9-34.0.1.ELhudemem GFS-kernel-hugemem-2.6.9-49-1.1 GFS-kernheaders.2.6.9-49.1.1 GFS-6.1.5-0 Let me know what you think. Thanks, Jon On 2/13/07, Wendy Cheng wrote: > Jon Erickson wrote: > > All, > > > > I've been testing GFS with millions and millions of files and my > > system performance degrades as the number of locks increases. 
Along > > with the number of locks being in the millions, all of my system > > memory is used. After performing a umount and mount of my GFS file > > system the locks go back down to zero and all the memory is reclaimed. > > This of course improves my system performance. Is there a way to > > release all locks and memory associated without unmounting my file > > system? > > > > Should I try the patch in the bug report? > > Comment #31: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=214239 > > > Yes, please do. Check out: > > http://people.redhat.com/wcheng/Patches/GFS/R4/readme > > -- Wendy > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Jon From frederik.ferner at diamond.ac.uk Wed Feb 14 16:03:48 2007 From: frederik.ferner at diamond.ac.uk (Frederik Ferner) Date: Wed, 14 Feb 2007 16:03:48 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <45D31766.3080908@redhat.com> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> Message-ID: <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> Hi Patrick, thanks for you reply. I've just discovered that I seem to have the same problem on one more cluster, so maybe I've change something that causes this but did not affect a running cluster. I'll append the cluster.conf for the original cluster as well. On Wed, 2007-02-14 at 14:06 +0000, Patrick Caulfield wrote: > Frederik Ferner wrote: > > I've recently run into the problem that in one of my clusters the second > > node doesn't join the cluster anymore. > > > > First some background on my setup here. I have a couple of two node > > clusters connected to a common storage each. They're basically identical > > setups running basically RHEL4U4 and corresponding cluster suite. > > Everything was running fine until yesterday in one clusters one node > > (i04-storage2) was fenced and can't seem to join the cluster anymore, > > all I could find was messages in the log files of i04-storage2 telling > > me "kernel: CMAN: sending membership request" over and over again. On > > the node still in the cluster (i04-storage1) I could see nothing in any > > log files. > The main reason a node would repeatedly try to rejoin a cluster is that it gets > told to "wait" by the remaining nodes. This happens when the remaining cluster > nodes are still in transition state (ie they haven't sorted out the cluster > after the node has left). Normally this state only lasts a fraction of a second > or maybe a handful of seconds for a very large cluster. > > As you only have one node in the cluster It sounds like the remaining node may > be in some strange state that it can't get out of. I'm not sure what that would > be off-hand... > > - it must be able to see the fenced nodes 'joinreq' messages because if you > increment the config version in reject it. That's what I assumed. > - it can't even be in transition here for the same reason ... the transition > state is checked before the validity of the joinreq message so the former case > would also fail! > > Can you check the output of 'cman_tool status' and see what state the remaining > node is in. It might also be worth sending me the 'tcpdump -s0 -x port 6809' > output in case that shows anything useful. See attached file for tcpdump output. 
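(For anyone following along, a capture like the attached one can be produced with something along these lines; the interface name and output file names here are assumptions, not necessarily what was actually used:)

shell> tcpdump -s0 -x -i eth0 port 6809 | tee /tmp/i04_tcpdump_s0_port_6809     # readable hex dump
shell> tcpdump -s0 -i eth0 -w /tmp/i04_port_6809.pcap port 6809                 # or a raw capture for wireshark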
[bnh65367 at i04-storage1 log]$ cman_tool status Protocol version: 5.0.1 Config version: 20 Cluster name: i04-cluster Cluster ID: 33460 Cluster Member: Yes Membership state: Cluster-Member Nodes: 1 Expected_votes: 2 Total_votes: 4 Quorum: 3 Active subsystems: 8 Node name: i04-storage1.diamond.ac.uk Node ID: 1 Node addresses: 172.23.104.33 [bnh65367 at i04-storage1 log]$ Thanks, Frederik -- Frederik Ferner Systems Administrator Phone: +44 (0)1235-778624 Diamond Light Source Fax: +44 (0)1235-778468 -------------- next part -------------- A non-text attachment was scrubbed... Name: i04_tcpdump_s0_port_6809 Type: application/octet-stream Size: 1299 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: i04-cluster.conf Type: text/xml Size: 2738 bytes Desc: not available URL: From pcaulfie at redhat.com Wed Feb 14 16:33:19 2007 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Wed, 14 Feb 2007 16:33:19 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> Message-ID: <45D339CF.7070408@redhat.com> Frederik Ferner wrote: > Hi Patrick, > > thanks for you reply. > > I've just discovered that I seem to have the same problem on one more > cluster, so maybe I've change something that causes this but did not > affect a running cluster. I'll append the cluster.conf for the original > cluster as well. > Looking at the tcpdump it seems that the existing node isn't seeing the joinreq message from the fenced one - there are no responses to it at all. You haven't enabled any iptables filtering have you ? -- patrick From frederik.ferner at diamond.ac.uk Wed Feb 14 17:36:18 2007 From: frederik.ferner at diamond.ac.uk (Frederik Ferner) Date: Wed, 14 Feb 2007 17:36:18 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <45D339CF.7070408@redhat.com> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> <45D339CF.7070408@redhat.com> Message-ID: <1171474578.24507.148.camel@pc029.sc.diamond.ac.uk> On Wed, 2007-02-14 at 16:33 +0000, Patrick Caulfield wrote: > Frederik Ferner wrote: > > I've just discovered that I seem to have the same problem on one more > > cluster, so maybe I've change something that causes this but did not > > affect a running cluster. I'll append the cluster.conf for the original > > cluster as well. > > > > Looking at the tcpdump it seems that the existing node isn't seeing the joinreq > message from the fenced one - there are no responses to it at all. You haven't > enabled any iptables filtering have you ? But they seem to reach the network card at least, correct? So I don't have to start looking at the switch, should I? 
No, iptables filtering is definitely disabled, just verified with iptables: [bnh65367 at i04-storage2 ~]$ sudo iptables -L -n -v Chain INPUT (policy ACCEPT 2851K packets, 511M bytes) pkts bytes target prot opt in out source destination Chain FORWARD (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain OUTPUT (policy ACCEPT 2420K packets, 157M bytes) pkts bytes target prot opt in out source destination [bnh65367 at i04-storage2 ~]$ Cheers, Frederik -- Frederik Ferner Systems Administrator Phone: +44 (0)1235-778624 Diamond Light Source Fax: +44 (0)1235-778468 From lhh at redhat.com Wed Feb 14 18:15:19 2007 From: lhh at redhat.com (Lon Hohberger) Date: Wed, 14 Feb 2007 13:15:19 -0500 Subject: [Linux-cluster] how to get the Gflops power? In-Reply-To: <45D3132C.9000902@gmail.com> References: <45D3132C.9000902@gmail.com> Message-ID: <1171476919.3036.0.camel@localhost.localdomain> On Wed, 2007-02-14 at 14:48 +0100, Anthony wrote: > I need to get the gflops value of my cluster , so that i compare it > with > teh top500 :-D I thought it was a benchmark, not just a /proc/* thing. *shrug* -- Lon From wcheng at redhat.com Wed Feb 14 19:41:25 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Wed, 14 Feb 2007 14:41:25 -0500 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <200702140917.31024.hlawatschek@atix.de> References: <200702131337.18680.hlawatschek@atix.de> <45D209DD.5050003@redhat.com> <45D217EC.7080907@redhat.com> <200702140917.31024.hlawatschek@atix.de> Message-ID: <45D365E5.40308@redhat.com> Mark Hlawatschek wrote: > On Tuesday 13 February 2007 20:56, Wendy Cheng wrote: > >> Wendy Cheng wrote: >> >>> So it is removing a file. It has obtained the directory lock and is >>> waiting for the file lock. Look to me DLM (LM_CB_ASYNC) callback never >>> occurs. Do you have abnormal messages in your /var/log/messages file ? >>> Dave, how to dump the locks from DLM side to see how DLM is thinking ? >>> >> shell> cman_tool services /* find your lock space name */ >> shell> echo "lock-space-name-found-above" > /proc/cluster/dlm_locks >> shell> cat /proc/cluster/dlm_locks >> >> Then try to find the lock (2, hex of (1114343)) - cut and paste the >> contents of that file here. >> > > syslog seems to be ok. > note, that the process 5856 is running on node1 > What I was looking for is a lock (type=2 and lock number=0x98d3 (=1114343)) - that's the lock hangs process id=5856. Since pid=5856 also holds another directory exclusive lock, so no body can access to that directory. Apparently from GFS end, node 2 thinks 0x98d3 is "unlocked" and node 1 is waiting for it. The only thing that can get node 1 out of this wait state is DLM's callback. If DLM doesn't have any record of this lock, pid=5856 will wait forever. Are you sure this is the whole file of DLM output ? This lock somehow disappears from DLM and I have no idea why we get into this state. If the files are too large, could you tar the files and email over ? I would like to see (both) complete glock and dlm lock dumps on both nodes (4 files here). 
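(A rough sketch of how those four dumps are usually captured on RHEL4: the mount point /usr/local comes from earlier in the thread, the lock space name is a placeholder, and both should be adjusted to what the system actually reports.)

shell> gfs_tool lockdump /usr/local > /tmp/glock_dump_$(hostname -s)     # run on each node
shell> cman_tool services                                                # note the DLM lock space name
shell> echo "mylockspace" > /proc/cluster/dlm_locks
shell> cat /proc/cluster/dlm_locks > /tmp/dlm_dump_$(hostname -s)        # run on each node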
If possible, add the following two outputs (so 6 files total): shell> cd /tmp /* on both nodes */ shell> script /* this should generate a file called typescript in /tmp directory */ shell> crash crash> foreach bt /* keep hitting space bar until this command run thru */ crash> quit shell> /* this should close out typescript file */ shell> mv typescript nodex_crash /* x=1, 2 based on node1 or node2 */ Tar these 6 files (glock_dump_1, glock_dump_2, dlm_dump_1, dlm_dump_2, node1_crash, node2_crash) and email them to wcheng at redhat.com Thank you for the helps if you can. -- Wendy > Here's the dlm output: > > node1: > Resource 0000010001218088 (parent 0000000000000000). Name (len=24) " 2 > 1100e7" > Local Copy, Master is node 2 > Granted Queue > Conversion Queue > Waiting Queue > 5eb00178 PR (EX) Master: 3eeb0117 LQ: 0,0x5 > [...] > Resource 00000100f56f0618 (parent 0000000000000000). Name (len=24) " 5 > 1100e7" > Local Copy, Master is node 2 > Granted Queue > 5bc20257 PR Master: 3d9703e0 > Conversion Queue > Waiting Queue > > node2: > Resource 00000107e462c8c8 (parent 0000000000000000). Name (len=24) " 2 > 1100e7" > Master Copy > Granted Queue > 3eeb0117 PR Remote: 1 5eb00178 > Conversion Queue > Waiting Queue > [...] > Resource 000001079f7e81d8 (parent 0000000000000000). Name (len=24) " 5 > 1100e7" > Master Copy > Granted Queue > 3d9703e0 PR Remote: 1 5bc20257 > 3e500091 PR > Conversion Queue > Waiting Queue > > Thanks for your help, > > Mark > > >>>>>> we have the following deadlock situation: >>>>>> >>>>>> 2 node cluster consisting of node1 and node2. >>>>>> /usr/local is placed on a GFS filesystem mounted on both nodes. >>>>>> Lockmanager is dlm. >>>>>> We are using RHEL4u4 >>>>>> >>>>>> a strace to ls -l /usr/local/swadmin/mnx/xml ends up in >>>>>> lstat("/usr/local/swadmin/mnx/xml", >>>>>> >>>>>> This happens on both cluster nodes. >>>>>> >>>>>> All processes trying to access the directory >>>>>> /usr/local/swadmin/mnx/xml >>>>>> are in "Waiting for IO (D)" state. I.e. system load is at about 400 >>>>>> ;-) >>>>>> >>>>>> Any ideas ? >>>>>> >>>>> Quickly browsing this, look to me that process with pid=5856 got stuck. >>>>> That process had the file or directory (ino number 627732 - probably >>>>> /usr/local/swadmin/mnx/xml) exclusive lock so everyone was waiting for >>>>> it. The faulty process was apparently in the middle of obtaining >>>>> another >>>>> exclusive lock (and almost got it). We need to know where pid=5856 was >>>>> stuck at that time. If this occurs again, could you use "crash" to back >>>>> trace that process and show us the output ? 
>>>>> >>>> Here's the crash output: >>>> >>>> crash> bt 5856 >>>> PID: 5856 TASK: 10bd26427f0 CPU: 0 COMMAND: "java" >>>> #0 [10bd20cfbc8] schedule at ffffffff8030a1d1 >>>> #1 [10bd20cfca0] wait_for_completion at ffffffff8030a415 >>>> #2 [10bd20cfd20] glock_wait_internal at ffffffffa018574e >>>> #3 [10bd20cfd60] gfs_glock_nq_m at ffffffffa01860ce >>>> #4 [10bd20cfda0] gfs_unlink at ffffffffa019ce41 >>>> #5 [10bd20cfea0] vfs_unlink at ffffffff801889fa >>>> #6 [10bd20cfed0] sys_unlink at ffffffff80188b19 >>>> #7 [10bd20cff30] filp_close at ffffffff80178e48 >>>> #8 [10bd20cff50] error_exit at ffffffff80110d91 >>>> RIP: 0000002a9593f649 RSP: 0000007fbfffbca0 RFLAGS: 00010206 >>>> RAX: 0000000000000057 RBX: ffffffff8011026a RCX: 0000002a9cc9c870 >>>> RDX: 0000002ae5989000 RSI: 0000002a962fa3a8 RDI: 0000002ae5989000 >>>> RBP: 0000000000000000 R8: 0000002a9630abb0 R9: 0000000000000ffc >>>> R10: 0000002a9630abc0 R11: 0000000000000206 R12: 0000000040115700 >>>> R13: 0000002ae23294b0 R14: 0000007fbfffc300 R15: 0000002ae5989000 >>>> ORIG_RAX: 0000000000000057 CS: 0033 SS: 002b >>>> >>>> >>>>>> a lockdump analysis with the decipher_lockstate_dump and >>>>>> parse_lockdump >>>>>> shows the following output (The whole file is too large for the >>>>>> mailing-list): >>>>>> >>>>>> Entries: 101939 >>>>>> Glocks: 60112 >>>>>> PIDs: 751 >>>>>> >>>>>> 4 chain: >>>>>> lockdump.node1.dec Glock (inode[2], 1114343) >>>>>> gl_flags = lock[1] >>>>>> gl_count = 5 >>>>>> gl_state = shared[3] >>>>>> req_gh = yes >>>>>> req_bh = yes >>>>>> lvb_count = 0 >>>>>> object = yes >>>>>> new_le = no >>>>>> incore_le = no >>>>>> reclaim = no >>>>>> aspace = 1 >>>>>> ail_bufs = no >>>>>> Request >>>>>> owner = 5856 >>>>>> gh_state = exclusive[1] >>>>>> gh_flags = try[0] local_excl[5] async[6] >>>>>> error = 0 >>>>>> gh_iflags = promote[1] >>>>>> Waiter3 >>>>>> owner = 5856 >>>>>> gh_state = exclusive[1] >>>>>> gh_flags = try[0] local_excl[5] async[6] >>>>>> error = 0 >>>>>> gh_iflags = promote[1] >>>>>> Inode: busy >>>>>> lockdump.node2.dec Glock (inode[2], 1114343) >>>>>> gl_flags = >>>>>> gl_count = 2 >>>>>> gl_state = unlocked[0] >>>>>> req_gh = no >>>>>> req_bh = no >>>>>> lvb_count = 0 >>>>>> object = yes >>>>>> new_le = no >>>>>> incore_le = no >>>>>> reclaim = no >>>>>> aspace = 0 >>>>>> ail_bufs = no >>>>>> Inode: >>>>>> num = 1114343/1114343 >>>>>> type = regular[1] >>>>>> i_count = 1 >>>>>> i_flags = >>>>>> vnode = yes >>>>>> lockdump.node1.dec Glock (inode[2], 627732) >>>>>> gl_flags = dirty[5] >>>>>> gl_count = 379 >>>>>> gl_state = exclusive[1] >>>>>> req_gh = no >>>>>> req_bh = no >>>>>> lvb_count = 0 >>>>>> object = yes >>>>>> new_le = no >>>>>> incore_le = no >>>>>> reclaim = no >>>>>> aspace = 58 >>>>>> ail_bufs = no >>>>>> Holder >>>>>> owner = 5856 >>>>>> gh_state = exclusive[1] >>>>>> gh_flags = try[0] local_excl[5] async[6] >>>>>> error = 0 >>>>>> gh_iflags = promote[1] holder[6] first[7] >>>>>> Waiter2 >>>>>> owner = none[-1] >>>>>> gh_state = shared[3] >>>>>> gh_flags = try[0] >>>>>> error = 0 >>>>>> gh_iflags = demote[2] alloced[4] dealloc[5] >>>>>> Waiter3 >>>>>> owner = 32753 >>>>>> gh_state = shared[3] >>>>>> gh_flags = any[3] >>>>>> error = 0 >>>>>> gh_iflags = promote[1] >>>>>> [...loads of Waiter3 entries...] 
>>>>>> Waiter3 >>>>>> owner = 4566 >>>>>> gh_state = shared[3] >>>>>> gh_flags = any[3] >>>>>> error = 0 >>>>>> gh_iflags = promote[1] >>>>>> Inode: busy >>>>>> lockdump.node2.dec Glock (inode[2], 627732) >>>>>> gl_flags = lock[1] >>>>>> gl_count = 375 >>>>>> gl_state = unlocked[0] >>>>>> req_gh = yes >>>>>> req_bh = yes >>>>>> lvb_count = 0 >>>>>> object = yes >>>>>> new_le = no >>>>>> incore_le = no >>>>>> reclaim = no >>>>>> aspace = 0 >>>>>> ail_bufs = no >>>>>> Request >>>>>> owner = 20187 >>>>>> gh_state = shared[3] >>>>>> gh_flags = any[3] >>>>>> error = 0 >>>>>> gh_iflags = promote[1] >>>>>> Waiter3 >>>>>> owner = 20187 >>>>>> gh_state = shared[3] >>>>>> gh_flags = any[3] >>>>>> error = 0 >>>>>> gh_iflags = promote[1] >>>>>> [...loads of Waiter3 entries...] >>>>>> Waiter3 >>>>>> owner = 10460 >>>>>> gh_state = shared[3] >>>>>> gh_flags = any[3] >>>>>> error = 0 >>>>>> gh_iflags = promote[1] >>>>>> Inode: busy >>>>>> 2 requests >>>>>> >>>>> -- >>>>> > > From wcheng at redhat.com Wed Feb 14 20:04:06 2007 From: wcheng at redhat.com (Wendy Cheng) Date: Wed, 14 Feb 2007 15:04:06 -0500 Subject: [Linux-cluster] DLM Locks and Memory issue In-Reply-To: <6a90e4da0702140757k482fc7o9543037dfda3a56b@mail.gmail.com> References: <6a90e4da0702131119n18b73e8av244262bd7705287c@mail.gmail.com> <45D24D66.2080104@redhat.com> <6a90e4da0702140757k482fc7o9543037dfda3a56b@mail.gmail.com> Message-ID: <45D36B36.4040701@redhat.com> Jon Erickson wrote: > Wendy, > > I tried using the ko files from your directory, but I received a > Invalid Symbol message. > > i running two separate clusters with the following packages. Can you > create a ko files that will work in these environments to test? I'm buried in two other urgent issues at this moment so this could take a while before I can get to it. The best way is for you to contact Red Hat support if you can. -- Wendy > > Envirorment One x86_64 > uname -r = 2.6.9-42.0.3.ELsmp > GFS-6.1.6-1 > GFS-kernel-smp-2.6.9-60.3 > GFS-kernheaders-2.6.9-60.3 > > Enviroment Two i686 > uname -r = 2.6.9-34.0.1.ELhudemem > GFS-kernel-hugemem-2.6.9-49-1.1 > GFS-kernheaders.2.6.9-49.1.1 > GFS-6.1.5-0 > > Let me know what you think. > > Thanks, > Jon > > > On 2/13/07, Wendy Cheng wrote: >> Jon Erickson wrote: >> > All, >> > >> > I've been testing GFS with millions and millions of files and my >> > system performance degrades as the number of locks increases. Along >> > with the number of locks being in the millions, all of my system >> > memory is used. After performing a umount and mount of my GFS file >> > system the locks go back down to zero and all the memory is reclaimed. >> > This of course improves my system performance. Is there a way to >> > release all locks and memory associated without unmounting my file >> > system? >> > >> > Should I try the patch in the bug report? >> > Comment #31: >> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=214239 >> > >> Yes, please do. Check out: >> >> http://people.redhat.com/wcheng/Patches/GFS/R4/readme >> >> -- Wendy >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > From srramasw at cisco.com Wed Feb 14 20:49:31 2007 From: srramasw at cisco.com (Sridharan Ramaswamy (srramasw)) Date: Wed, 14 Feb 2007 12:49:31 -0800 Subject: [Linux-cluster] GFS journaling related crash Message-ID: GFS experts, I'm looking for help to debug this crash. I'm running GFS with low journal size. 
I understand this is bit unchartered territory for GFS, but we have a need to be really stingy on journal overhead. GFS is consistently crashing with same stack trace, seems related to journal log being committed to the disk. gfs_tool counters show log-space-used close to 100%. iostat shows a huge write just before the crash. Probably the journal blocks being flushed to disk? I tried the workaround to reduce "max_atomic_write" value listed in https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=146672. But no luck. Could this be a GFS bug that gets exercised only on small journal size? Full stack trace, gfs counters and iostat data listed below. Appreciate any help! - Sridharan Cisco Systems, Inc --------------------- Logs: Feb 14 11:39:33 cfs1 kernel: GFS: fsid=cisco:gfs2.0: jid=1: Looking at journal... Feb 14 11:39:33 cfs1 kernel: GFS: fsid=cisco:gfs2.0: jid=1: Done Feb 14 11:39:33 cfs1 kernel: GFS: fsid=cisco:gfs2.0: jid=2: Trying to acquire journal lock... Feb 14 11:39:33 cfs1 kernel: GFS: fsid=cisco:gfs2.0: jid=2: Looking at journal... Feb 14 11:39:33 cfs1 kernel: GFS: fsid=cisco:gfs2.0: jid=2: Done Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: head_off = 61856, head_wrap = 8 Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: dump_off = 61856, dump_wrap = 7 Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: assertion "FALSE" failed Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: function = check_seg_usage Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: file = /download/gfs/cluster.cvs-rhel4/gfs-kernel/src/gfs/log.c, line = 590 Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: time = 1171482110 Feb 14 11:41:50 cfs1 kernel: ------------[ cut here ]------------ Feb 14 11:41:50 cfs1 kernel: kernel BUG at /download/gfs/cluster.cvs-rhel4/gfs-kernel/src/gfs/util.c:211! 
Feb 14 11:41:50 cfs1 kernel: invalid operand: 0000 [#1] Feb 14 11:41:50 cfs1 kernel: SMP Feb 14 11:41:50 cfs1 kernel: Modules linked in: lock_dlm(U) dlm(U) gfs(U) lock_harness(U) cman(U) nfsd exp ortfs nfs lockd nfs_acl md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc dm_mirror dm_mod bu tton battery ac uhci_hcd ehci_hcd hw_random e1000 e100 mii floppy ext3 jbd Feb 14 11:41:50 cfs1 kernel: CPU: 0 Feb 14 11:41:50 cfs1 kernel: EIP: 0060:[] Tainted: GF VLI Feb 14 11:41:50 cfs1 kernel: EFLAGS: 00010246 (2.6.9-42.7.ELsmp) Feb 14 11:41:50 cfs1 kernel: EIP is at gfs_assert_i+0x38/0x69 [gfs] Feb 14 11:41:50 cfs1 kernel: eax: 000000f8 ebx: e0c9e000 ecx: c7cb3b7c edx: e108d0a2 Feb 14 11:41:50 cfs1 kernel: esi: e108702a edi: e108a021 ebp: 0000024e esp: c7cb3b78 Feb 14 11:41:50 cfs1 kernel: ds: 007b es: 007b ss: 0068 Feb 14 11:41:50 cfs1 kernel: Process bonnie++ (pid: 8091, threadinfo=c7cb3000 task=df3c43b0) Feb 14 11:41:50 cfs1 kernel: Stack: e108d0a2 e0cc2734 e108a021 e0cc2734 e108702a e0cc2734 e1089e34 0000024e Feb 14 11:41:50 cfs1 kernel: e0cc2734 45d365fe 00000000 0000f190 c48b9080 e0c9e000 e106e880 e1089e34 Feb 14 11:41:50 cfs1 kernel: e0cc2734 45d365fe 00000000 0000f190 c48b9080 e0c9e000 e106e880 e1089e34 Feb 14 11:41:50 cfs1 kernel: 0000024e 00000007 00000000 0000f1a0 00000000 0000f1c0 00000000 00000008 Feb 14 11:41:50 cfs1 kernel: 0000024e 00000007 00000000 0000f1a0 00000000 0000f1c0 00000000 00000008 Feb 14 11:41:50 cfs1 kernel: Call Trace: Feb 14 11:41:50 cfs1 kernel: [] check_seg_usage+0x197/0x19f [gfs] Feb 14 11:41:50 cfs1 kernel: [] sync_trans+0x143/0x1b1 [gfs] Feb 14 11:41:50 cfs1 kernel: [] quota_trans_size+0x20/0x36 [gfs] Feb 14 11:41:50 cfs1 kernel: [] disk_commit+0xec/0x264 [gfs] Feb 14 11:41:50 cfs1 kernel: [] log_refund+0x61/0x187 [gfs] Feb 14 11:41:50 cfs1 kernel: [] log_flush_internal+0xec/0x19e [gfs] Feb 14 11:41:50 cfs1 kernel: [] gfs_log_reserve+0x19e/0x20e [gfs] Feb 14 11:41:50 cfs1 kernel: [] glock_wait_internal+0x1e3/0x1ef [gfs] Feb 14 11:41:50 cfs1 kernel: [] gfs_glock_nq+0xe3/0x116 [gfs] Feb 14 11:41:50 cfs1 kernel: [] gfs_trans_begin_i+0xfd/0x15a [gfs] Feb 14 11:41:50 cfs1 kernel: [] inode_init_and_link+0x1fe/0x388 [gfs] Feb 14 11:41:50 cfs1 kernel: [] gfs_glock_nq_init+0x13/0x26 [gfs] Feb 14 11:41:50 cfs1 kernel: [] gfs_glock_nq_num+0x2e/0x71 [gfs] Feb 14 11:41:50 cfs1 kernel: [] gfs_createi+0x1af/0x1f1 [gfs] Feb 14 11:41:51 cfs1 kernel: [] gfs_create+0x68/0x16f [gfs] Feb 14 11:41:51 cfs1 kernel: [] vfs_create+0xbc/0x103 Feb 14 11:41:51 cfs1 kernel: [] open_namei+0x177/0x579 Feb 14 11:41:51 cfs1 kernel: [] filp_open+0x45/0x70 Feb 14 11:41:51 cfs1 kernel: [] __cond_resched+0x14/0x39 Feb 14 11:41:51 cfs1 kernel: [] direct_strncpy_from_user+0x3e/0x5d Feb 14 11:41:51 cfs1 kernel: [] sys_open+0x31/0x7d Feb 14 11:41:51 cfs1 kernel: [] sys_creat+0x12/0x16 Feb 14 11:41:51 cfs1 kernel: [] syscall_call+0x7/0xb Feb 14 11:41:51 cfs1 kernel: [] packet_rcv+0x8e/0x307 Feb 14 11:41:51 cfs1 kernel: Code: c3 83 b8 30 02 00 00 00 74 2c ff 35 90 7d 43 c0 8d 80 34 47 02 00 50 55 ff 74 24 20 50 51 50 52 50 68 a2 d0 08 e1 e8 ac c3 09 df <0f> 0b d3 00 25 d1 08 e1 83 c4 28 e8 9d f8 07 d f ff 35 90 7d 43 Feb 14 11:41:51 cfs1 kernel: <0>Fatal exception: panic in 5 seconds [root at cfs1 ~]$ gfs_tool counters /mnt/gfs2 locks 11451 locks held 11450 freeze count 0 incore inodes 5713 metadata buffers 5838 unlinked inodes 1 quota IDs 2 incore log buffers 893 log space used 98.41% meta header cache entries 53 glock dependencies 0 glocks on reclaim list 0 log wraps 7 outstanding LM calls 0 
outstanding BIO calls 0 fh2dentry misses 0 glocks reclaimed 9 glock nq calls 131554 glock dq calls 125830 glock prefetch calls 0 lm_lock calls 11484 lm_unlock calls 12 lm callbacks 11496 address operations 0 dentry operations 4 export operations 0 file operations 17177 inode operations 45696 super operations 45690 vm operations 0 block I/O reads 52 block I/O writes 17269 [root at cfs1 ~]$ iostat 2 1000 | grep hda12 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn hda12 0.42 1.52 1.84 2242 2720 hda12 0.00 0.00 0.00 0 0 hda12 0.00 0.00 0.00 0 0 hda12 0.00 0.00 0.00 0 0 hda12 0.00 0.00 0.00 0 0 hda12 2.51 0.00 20.10 0 40 hda12 0.00 0.00 0.00 0 0 hda12 0.00 0.00 0.00 0 0 hda12 0.00 0.00 0.00 0 0 hda12 0.00 0.00 0.00 0 0 hda12 4467.68 0.00 35741.41 0 70768 hda12 0.00 0.00 0.00 0 0 hda12 0.00 0.00 0.00 0 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From teigland at redhat.com Wed Feb 14 21:12:20 2007 From: teigland at redhat.com (David Teigland) Date: Wed, 14 Feb 2007 15:12:20 -0600 Subject: [Linux-cluster] GFS journaling related crash In-Reply-To: References: Message-ID: <20070214211220.GC25238@redhat.com> On Wed, Feb 14, 2007 at 12:49:31PM -0800, Sridharan Ramaswamy (srramasw) wrote: > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: head_off = 61856, > head_wrap = 8 > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: dump_off = 61856, > dump_wrap = 7 > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: assertion "FALSE" > failed > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: function = > check_seg_usage > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: file = > /download/gfs/cluster.cvs-rhel4/gfs-kernel/src/gfs/log.c, line = 590 Have you tried playing with the journal segment size with gfs_mkfs -s ? Scaling that down along with the journal size might have some effect. You might also disable quotas if you haven't yet which should reduce the transaction sizes. WRT the assertion, that might be something that's fixable, but it would require delving into the logging code a bit. Dave From srramasw at cisco.com Wed Feb 14 22:01:25 2007 From: srramasw at cisco.com (Sridharan Ramaswamy (srramasw)) Date: Wed, 14 Feb 2007 14:01:25 -0800 Subject: [Linux-cluster] GFS journaling related crash In-Reply-To: <20070214211220.GC25238@redhat.com> Message-ID: Thanks Dave. I'll try to play around with journal segment size. Didn't notice this knob till now. FWIW, this crash happens only when Journal size is reduced from 6M to 4M on 3-node cluster w/ 512M filesystem. - Sridharan > -----Original Message----- > From: David Teigland [mailto:teigland at redhat.com] > Sent: Wednesday, February 14, 2007 1:12 PM > To: Sridharan Ramaswamy (srramasw) > Cc: linux-cluster at redhat.com > Subject: Re: [Linux-cluster] GFS journaling related crash > > On Wed, Feb 14, 2007 at 12:49:31PM -0800, Sridharan Ramaswamy > (srramasw) wrote: > > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: > head_off = 61856, > > head_wrap = 8 > > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: > dump_off = 61856, > > dump_wrap = 7 > > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: > assertion "FALSE" > > failed > > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: function = > > check_seg_usage > > Feb 14 11:41:50 cfs1 kernel: GFS: fsid=cisco:gfs2.0: file = > > /download/gfs/cluster.cvs-rhel4/gfs-kernel/src/gfs/log.c, line = 590 > > Have you tried playing with the journal segment size with > gfs_mkfs -s ? 
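(For concreteness, that knob lives on the mkfs command line. A rough sketch using the device and lock table visible in the logs above follows; the -j/-J/-s values are placeholders only, re-making the filesystem destroys its contents, and a stock gfs_mkfs may well refuse journals as small as the ones already in use here.)

shell> gfs_mkfs -p lock_dlm -t cisco:gfs2 -j 3 -J 4 -s 8 /dev/hda12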
> Scaling that down along with the journal size might have some effect. > You might also disable quotas if you haven't yet which should > reduce the > transaction sizes. WRT the assertion, that might be something that's > fixable, but it would require delving into the logging code a bit. > > Dave > From list.sudhakar at gmail.com Thu Feb 15 05:50:05 2007 From: list.sudhakar at gmail.com (Sudhakar G) Date: Thu, 15 Feb 2007 11:20:05 +0530 Subject: [Linux-cluster] DLM internals Message-ID: Hi, Can any one let me know how DLM (Distributed Lock Manager) works. The internals of it. ie., whether the logic of granting of locks is centralised or distributed. If distributed how? Thanks Sudhakar -------------- next part -------------- An HTML attachment was scrubbed... URL: From pcaulfie at redhat.com Thu Feb 15 09:07:03 2007 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Thu, 15 Feb 2007 09:07:03 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <1171474578.24507.148.camel@pc029.sc.diamond.ac.uk> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> <45D339CF.7070408@redhat.com> <1171474578.24507.148.camel@pc029.sc.diamond.ac.uk> Message-ID: <45D422B7.30506@redhat.com> Frederik Ferner wrote: > On Wed, 2007-02-14 at 16:33 +0000, Patrick Caulfield wrote: >> Frederik Ferner wrote: >>> I've just discovered that I seem to have the same problem on one more >>> cluster, so maybe I've change something that causes this but did not >>> affect a running cluster. I'll append the cluster.conf for the original >>> cluster as well. >>> >> Looking at the tcpdump it seems that the existing node isn't seeing the joinreq >> message from the fenced one - there are no responses to it at all. You haven't >> enabled any iptables filtering have you ? > > But they seem to reach the network card at least, correct? So I don't > have to start looking at the switch, should I? Well, they are reaching tcpdump - more than that is hard to say ;-) It's hard to make much sense of the symptoms to be quite honest. If it's a switch problem then I would expect it to affect running nodes as well as joining ones - that's the whole point of the heartbeat! You could try running tcpdump on the two machines to see if the packets are the same on both...if so then it could be some strange bug in cman that we've not seen before that's preventing it seeing incoming packets (I have no idea what that might be though, off hand) It would be interesting to know - though you may not want to do it - if the problem persists when the still-running node is rebooted. -- patrick From francisco_javier.pena at roche.com Thu Feb 15 10:46:05 2007 From: francisco_javier.pena at roche.com (Pena, Francisco Javier) Date: Thu, 15 Feb 2007 11:46:05 +0100 Subject: [Linux-cluster] CMAN and qdiskd (limited?) online reconfig capabilities Message-ID: Hello everyone, While checking some strange cman startup behavior (it always assigns 1 vote to each cluster node, no matter what I set in the cluster.conf file), I have spent some time digging through the cluster code to understand how it manages online reconfigurations. 
In the Red Hat docs (and also at the RH436 course) we are told that the following command sequence is required to update the cluster config: - ccs_tool update /etc/cluster/cluster.conf - cman_tool version -r However, following the cman_tool code down to the kernel module part, all it does is to update the internal config_version variable to be the new version on all nodes. From the cman-kernel source (RHEL4 U4 SRPMS), file membership.c: static int do_process_reconfig(struct msghdr *msg, char *buf, int len) { ... case RECONFIG_PARAM_CONFIG_VERSION: config_version = val; break; ... } No configuration is reread, so it does not matter what you change in the configuration file, CMAN will never know about it until it is restarted. Similarly, I found out that qdiskd never updates the configuration, in fact it is listed as TODO ( 2) Poll ccsd for configuration changes. ). It looks like the RHEL5 branch of the CMAN code does reread the configuration (calling read_ccs_nodes), so any change is updated. The qdisk code still shows the ccsd poll in the TODO list. So, straight to the questions: are there any plans to change the RHEL4 code to make CMAN and qdisk get the configuration changes without needing to restart? Should I file a bugzilla against this? Thanks in advance. Regards, Javier Pe?a From pcaulfie at redhat.com Thu Feb 15 11:24:31 2007 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Thu, 15 Feb 2007 11:24:31 +0000 Subject: [Linux-cluster] CMAN and qdiskd (limited?) online reconfig capabilities In-Reply-To: References: Message-ID: <45D442EF.3070300@redhat.com> Pena, Francisco Javier wrote: > Hello everyone, > > While checking some strange cman startup behavior (it always assigns 1 vote to each cluster node, no matter what I set in the cluster.conf file), I have spent some time digging through the cluster code to understand how it manages online reconfigurations. I think that's a bug introduced in U4 - it's fixed in U5. > In the Red Hat docs (and also at the RH436 course) we are told that the following command sequence is required to update the cluster config: > > - ccs_tool update /etc/cluster/cluster.conf > - cman_tool version -r > > However, following the cman_tool code down to the kernel module part, all it does is to update the internal config_version variable to be the new version on all nodes. From the cman-kernel source (RHEL4 U4 SRPMS), file membership.c: > > static int do_process_reconfig(struct msghdr *msg, char *buf, int len) > { > ... > > case RECONFIG_PARAM_CONFIG_VERSION: > config_version = val; > break; > ... > } > > No configuration is reread, so it does not matter what you change in the configuration file, CMAN will never know about it until it is restarted. Similarly, I found out that qdiskd never updates the configuration, in fact it is listed as TODO ( 2) Poll ccsd for configuration changes. ). > > It looks like the RHEL5 branch of the CMAN code does reread the configuration (calling read_ccs_nodes), so any change is updated. The qdisk code still shows the ccsd poll in the TODO list. > > So, straight to the questions: are there any plans to change the RHEL4 code to make CMAN and qdisk get the configuration changes without needing to restart? Should I file a bugzilla against this? > No, there are no plans to change this behaviour in RHEL4. 
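(So on RHEL4 the sequence effectively amounts to the following, with the version number and vote count as placeholders, and with any change to node votes applied by hand because cman will not reread them from cluster.conf:)

shell> ccs_tool update /etc/cluster/cluster.conf     # push the new file to all members
shell> cman_tool version -r 21                       # only bumps cman's internal config_version
shell> cman_tool votes -v 2                          # run on the node whose votes should change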
if you want to change the votes of a node you will need to use cman_tool votes -v -- patrick From frederik.ferner at diamond.ac.uk Thu Feb 15 11:36:03 2007 From: frederik.ferner at diamond.ac.uk (Frederik Ferner) Date: Thu, 15 Feb 2007 11:36:03 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <45D422B7.30506@redhat.com> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> <45D339CF.7070408@redhat.com> <1171474578.24507.148.camel@pc029.sc.diamond.ac.uk> <45D422B7.30506@redhat.com> Message-ID: <1171539363.24507.210.camel@pc029.sc.diamond.ac.uk> On Thu, 2007-02-15 at 09:07 +0000, Patrick Caulfield wrote: > Frederik Ferner wrote: > > On Wed, 2007-02-14 at 16:33 +0000, Patrick Caulfield wrote: > >> Frederik Ferner wrote: > >>> I've just discovered that I seem to have the same problem on one more > >>> cluster, so maybe I've change something that causes this but did not > >>> affect a running cluster. I'll append the cluster.conf for the original > >>> cluster as well. > >>> > >> Looking at the tcpdump it seems that the existing node isn't seeing the joinreq > >> message from the fenced one - there are no responses to it at all. You haven't > >> enabled any iptables filtering have you ? > > > > But they seem to reach the network card at least, correct? So I don't > > have to start looking at the switch, should I? > > Well, they are reaching tcpdump - more than that is hard to say ;-) > You could try running tcpdump on the two machines to see if the packets are the > same on both...if so then it could be some strange bug in cman that we've not > seen before that's preventing it seeing incoming packets (I have no idea what > that might be though, off hand) I've had a look at the tcpdump on both machines at the same time. The packets look identical to me. I've attached the two tcpdump files, maybe someone can see a difference that I'm missing. > It would be interesting to know - though you may not want to do it - if the > problem persists when the still-running node is rebooted. Obviously not at the moment, but I have a maintenance window upcoming soon where I might be able to do that. I'll keep you informed about the result. Thanks for looking into that, Frederik -- Frederik Ferner Systems Administrator Phone: +44 (0)1235-778624 Diamond Light Source Fax: +44 (0)1235-778468 From orkcu at yahoo.com Thu Feb 15 14:55:18 2007 From: orkcu at yahoo.com (=?iso-8859-1?Q?Roger_Pe=F1a?=) Date: Thu, 15 Feb 2007 06:55:18 -0800 (PST) Subject: [Linux-cluster] Can't see all volumes In-Reply-To: <20072148216.915974@leena> Message-ID: <131598.98604.qm@web50606.mail.yahoo.com> --- "isplist at logicore.net" wrote: > With all of the tech's on this list... no one has > seen this problem? > Thought I'd ask again since I'm still stumped. > > By the way, anyone know of any user groups, forums > for the Xyratex/MTI style > storage chassis? > > Mike > > >This might or might not be the right list and if it > is not, does anyone know > >one that would cover things like fibre channel RAID > devices? > > >I have an 800GB RAID drive which I have split into > 32 smaller volumes. This > >is a fibre channel system, Xyratex. The problem is > that I cannot see any more > >than 2 volumes per controller when I check from any > of the nodes. well, first of all: do the system see all the _devices_ ? 
I mean, can you see all the devices (LUNs) exported from the SAN but can't see all the LVM in that devices or you can see just 2 LUNs exported from the SAN ? so you can see just the LVM in that just 2 LUNs. if it is the last , then you need to check the SAN infrasestructure. if it is the first case ... then you have a problem in the cluster-system architecture; and then, maybe this list can help you ;-) cu roger __________________________________________ RedHat Certified Engineer ( RHCE ) Cisco Certified Network Associate ( CCNA ) ____________________________________________________________________________________ Need Mail bonding? Go to the Yahoo! Mail Q&A for great tips from Yahoo! Answers users. http://answers.yahoo.com/dir/?link=list&sid=396546091 From teigland at redhat.com Thu Feb 15 15:14:26 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 15 Feb 2007 09:14:26 -0600 Subject: [Linux-cluster] DLM internals In-Reply-To: References: Message-ID: <20070215151426.GA18284@redhat.com> On Thu, Feb 15, 2007 at 11:20:05AM +0530, Sudhakar G wrote: > Hi, > > Can any one let me know how DLM (Distributed Lock Manager) works. The > internals of it. ie., whether the logic of granting of locks is centralised > or distributed. If distributed how? This is an excellent description of a dlm and the general ideas/logic reflect very well our own dlm: http://opendlm.sourceforge.net/cvsmirror/opendlm/docs/dlmbook_final.pdf Dave From isplist at logicore.net Thu Feb 15 16:20:14 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Thu, 15 Feb 2007 10:20:14 -0600 Subject: [Linux-cluster] Can't see all volumes In-Reply-To: <131598.98604.qm@web50606.mail.yahoo.com> Message-ID: <2007215102014.387428@leena> The bottom line on what I'd like to achieve is that all servers would have central storage for their own boot/OS drives. On another FC channel, nodes would then also have separate access to their GFS/Cluster storage. This would save me on drives, hardware failures, etc and allows me to have truly centralized storage. So, what I've done in trying this is as follows; Create a small RAID array using 12 drives or around 800GB. Create 32 individual volumes each with it's own LUN, 0-31. Take 32 servers and use the volumes as their OS drives rather than having a drive on each server. Note: Maybe that's the problem? Servers cannot see beyond LUN's 0/1 or so for installation? > do the system see all the _devices_ ? > I mean, can you see all the devices (LUNs) exported > from the SAN but can't see all the LVM in that devices Nodes and servers with FC HBA installed can only see two volumes per controller. The RAID chassis has 2 controllers to the most I can see is 4 volumes. > or you can see just 2 LUNs exported from the SAN ? > so you can see just the LVM in that just 2 LUNs. > if it is the last , then you need to check the SAN > infrasestructure. > if it is the first case ... then you have a problem in > the cluster-system architecture; and then, maybe this > list can help you ;-) I'm sure it's at the SAN level since I'm only trying to install the OS's on the new blades right now. Mind you, I can't see past the two volumes on any server anyhow. I figure many on this list deal with large amounts of complex storage, a good place to ask. Since I don't know the answer, the hope is that someone who has some ideas will ask me questions that lead to finding a solution. 
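A quick way to settle Roger's first question -- whether the kernel on a node is even detecting the extra LUNs, as opposed to an LVM-level problem -- is to look at the SCSI layer directly from a running node or a rescue shell. A minimal sketch, assuming a 2.6-era kernel with sysfs (host0 is just an example HBA number):

# list every SCSI device (and LUN) the kernel has attached
cat /proc/scsi/scsi

# sysfs view of the same information, per HBA and per device
ls /sys/class/scsi_host/
ls /sys/bus/scsi/devices/

# ask an HBA to rescan its bus (repeat for each hostN as needed)
echo "- - -" > /sys/class/scsi_host/host0/scan

# block devices the kernel has actually registered
cat /proc/partitions

If only two LUNs per controller show up here, the limit is below LVM/GFS and the cluster software never enters into it.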
Mike From jleafey at utmem.edu Thu Feb 15 17:34:37 2007 From: jleafey at utmem.edu (Jay Leafey) Date: Thu, 15 Feb 2007 11:34:37 -0600 Subject: [Linux-cluster] Can't see all volumes In-Reply-To: <2007215102014.387428@leena> References: <2007215102014.387428@leena> Message-ID: <45D499AD.4030902@utmem.edu> You might want to change the number of LUNs visible to the SCSI adapter. We've got a Fibre-based SAN and we use a lot more than two LUNs. You can alter the number of LUNs the SCSI subsystem sees by adding the following to your /etc/modprobe.conf file: options scsi_mod max_luns=255 This will tell the SCSI subsystem to allow for up to 255 LUNs for each SCSI device. Since we boot off of local storage I'm not sure how you would handle this on boot, but I imagine it can be passed as a boot-time kernel option for the initial install. Just a thought! -- Jay Leafey - University of Tennessee E-Mail: jleafey at utmem.edu Phone: 901-448-5848 FAX: 901-448-8199 -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 5153 bytes Desc: S/MIME Cryptographic Signature URL: From rstevens at vitalstream.com Thu Feb 15 17:38:19 2007 From: rstevens at vitalstream.com (Rick Stevens) Date: Thu, 15 Feb 2007 09:38:19 -0800 Subject: [Linux-cluster] Can't see all volumes In-Reply-To: <45D499AD.4030902@utmem.edu> References: <2007215102014.387428@leena> <45D499AD.4030902@utmem.edu> Message-ID: <1171561099.3074.135.camel@prophead.corp.publichost.com> On Thu, 2007-02-15 at 11:34 -0600, Jay Leafey wrote: > You might want to change the number of LUNs visible to the SCSI adapter. > We've got a Fibre-based SAN and we use a lot more than two LUNs. You > can alter the number of LUNs the SCSI subsystem sees by adding the > following to your /etc/modprobe.conf file: > > options scsi_mod max_luns=255 > > This will tell the SCSI subsystem to allow for up to 255 LUNs for each > SCSI device. > > Since we boot off of local storage I'm not sure how you would handle > this on boot, but I imagine it can be passed as a boot-time kernel > option for the initial install. Indeed it can. > > Just a thought! > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster ---------------------------------------------------------------------- - Rick Stevens, Senior Systems Engineer rstevens at vitalstream.com - - VitalStream, Inc. http://www.vitalstream.com - - - - Consciousness: that annoying time between naps. - ---------------------------------------------------------------------- From isplist at logicore.net Thu Feb 15 17:39:35 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Thu, 15 Feb 2007 11:39:35 -0600 Subject: [Linux-cluster] Can't see all volumes In-Reply-To: <45D499AD.4030902@utmem.edu> Message-ID: <2007215113935.660504@leena> Hi Jay, > You might want to change the number of LUNs visible to the SCSI adapter. > We've got a Fibre-based SAN and we use a lot more than two LUNs. You > can alter the number of LUNs the SCSI subsystem sees by adding the > following to your /etc/modprobe.conf file: Problem is, there is no OS to modify anything on yet, I'm at the install stage. Perhaps it is an HBA issue where I need to tell the HBA about the additional LUN's rather than the usual 0/1? 
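To connect Jay's suggestion with the install-stage problem: the same LUN limit can be raised both on an installed system and at the installer boot prompt, which is one way to rule the kernel limit in or out before suspecting the HBA. A rough sketch, assuming a RHEL4-era kernel and Anaconda (the exact parameter spelling has varied between kernel versions, so check what yours accepts):

# installed system: raise the per-target LUN limit in /etc/modprobe.conf
options scsi_mod max_luns=255

# rebuild the initrd so the option is honoured at the next boot
mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)

# install time: pass the limit to the installer kernel instead
boot: linux scsi_mod.max_scsi_luns=255

If the extra LUNs do not even appear in the QLogic HBA BIOS utility during POST, the restriction is upstream of Linux (HBA settings, switch zoning, or the array's presentation) and no kernel option will help.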
Mike From orkcu at yahoo.com Thu Feb 15 17:51:37 2007 From: orkcu at yahoo.com (=?iso-8859-1?Q?Roger_Pe=F1a?=) Date: Thu, 15 Feb 2007 09:51:37 -0800 (PST) Subject: [Linux-cluster] Can't see all volumes In-Reply-To: <2007215113935.660504@leena> Message-ID: <382977.27325.qm@web50605.mail.yahoo.com> --- "isplist at logicore.net" wrote: > Hi Jay, > > > You might want to change the number of LUNs > visible to the SCSI adapter. > > We've got a Fibre-based SAN and we use a lot more > than two LUNs. You > > can alter the number of LUNs the SCSI subsystem > sees by adding the > > following to your /etc/modprobe.conf file: > > Problem is, there is no OS to modify anything on > yet, I'm at the install > stage. Perhaps it is an HBA issue where I need to > tell the HBA about the > additional LUN's rather than the usual 0/1? > Then I sujects you to use a LiveCD distro to test if the system can see the luns somehow or maybe the rescue CD can allow you see if the HBA can see all the LUNs (in the debug console alt+F3 or alt+F4 maybe) because maybe there is something outside the linux kernel (hardware or SAN configuration) which disable you to see all the LUNs cu roger __________________________________________ RedHat Certified Engineer ( RHCE ) Cisco Certified Network Associate ( CCNA ) ____________________________________________________________________________________ Do you Yahoo!? Everyone is raving about the all-new Yahoo! Mail beta. http://new.mail.yahoo.com From rhurst at bidmc.harvard.edu Thu Feb 15 18:02:18 2007 From: rhurst at bidmc.harvard.edu (rhurst at bidmc.harvard.edu) Date: Thu, 15 Feb 2007 13:02:18 -0500 Subject: [Linux-cluster] ccs_tool update Message-ID: <1171562538.7516.23.camel@WSBID06223> From time-to-time, I have needed to update the cluster.conf by hand using vi, and not use the system-config-cluster utility. I make certain I bump up the config_version. I invoke `ccs_tool update /etc/cluster/cluster.conf` afterwards, and all listening nodes appear to update their local cluster.conf file to the new one. However, I seem to have problems when rebooting a node, in that it gets rejected because the config version of the "remote" is still regarded as the earlier version than the "local" ... so I suspect that while files were replicated, the quorum did not apply the "update" to their active configuration. So, I invoke system-config-cluster and click Send to Cluster, and all works fine. I must be missing a step, but essentially, I'd like to be able to do this update from a shell/script. How might this be done please? Robert Hurst, Sr. Cach? Administrator Beth Israel Deaconess Medical Center 1135 Tremont Street, REN-7 Boston, Massachusetts 02120-2140 617-754-8754 ? Fax: 617-754-8730 ? Cell: 401-787-3154 Any technology distinguishable from magic is insufficiently advanced. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2178 bytes Desc: not available URL: From rpeterso at redhat.com Thu Feb 15 18:24:59 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Thu, 15 Feb 2007 12:24:59 -0600 Subject: [Linux-cluster] ccs_tool update In-Reply-To: <1171562538.7516.23.camel@WSBID06223> References: <1171562538.7516.23.camel@WSBID06223> Message-ID: <45D4A57B.6030801@redhat.com> rhurst at bidmc.harvard.edu wrote: > From time-to-time, I have needed to update the cluster.conf by hand > using vi, and not use the system-config-cluster utility. I make certain > I bump up the config_version. I invoke `ccs_tool > update /etc/cluster/cluster.conf` afterwards, and all listening nodes > appear to update their local cluster.conf file to the new one. > > However, I seem to have problems when rebooting a node, in that it gets > rejected because the config version of the "remote" is still regarded as > the earlier version than the "local" ... so I suspect that while files > were replicated, the quorum did not apply the "update" to their active > configuration. So, I invoke system-config-cluster and click Send to > Cluster, and all works fine. > > I must be missing a step, but essentially, I'd like to be able to do > this update from a shell/script. How might this be done please? > > > Robert Hurst, Sr. Cach? Administrator > Beth Israel Deaconess Medical Center > 1135 Tremont Street, REN-7 > Boston, Massachusetts 02120-2140 > 617-754-8754 ? Fax: 617-754-8730 ? Cell: 401-787-3154 > Any technology distinguishable from magic is insufficiently advanced. > Hi Robert, This might answer your question. If it doesn't, let me know and I'll fix it: http://sources.redhat.com/cluster/faq.html#clusterconfpropagate Regards, Bob Peterson Red Hat Cluster Suite From aruvic at bits.ba Thu Feb 15 19:15:42 2007 From: aruvic at bits.ba (aruvic at bits.ba) Date: Thu, 15 Feb 2007 20:15:42 +0100 (CET) Subject: [Linux-cluster] http://people.redhat.com/teigland/dlm/patches/ In-Reply-To: <45D4A57B.6030801@redhat.com> References: <1171562538.7516.23.camel@WSBID06223> <45D4A57B.6030801@redhat.com> Message-ID: <3785.85.158.33.78.1171566942.squirrel@www.bits.ba> Hi on this page: http://sources.redhat.com/cluster/dlm/ there is a not working link: "Source code The kernel patches" which points to: http://people.redhat.com/teigland/dlm/patches/ is there an other url for this source code? And where can we find some documentation about how DLM is working? Thanks Alen Ruvic From teigland at redhat.com Thu Feb 15 19:25:30 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 15 Feb 2007 13:25:30 -0600 Subject: [Linux-cluster] http://people.redhat.com/teigland/dlm/patches/ In-Reply-To: <3785.85.158.33.78.1171566942.squirrel@www.bits.ba> References: <1171562538.7516.23.camel@WSBID06223> <45D4A57B.6030801@redhat.com> <3785.85.158.33.78.1171566942.squirrel@www.bits.ba> Message-ID: <20070215192530.GD18284@redhat.com> On Thu, Feb 15, 2007 at 08:15:42PM +0100, aruvic at bits.ba wrote: > Hi > > on this page: > http://sources.redhat.com/cluster/dlm/ The subdirs under cluster/ are from the old web page, not relevant any more, just use the page at http://sources.redhat.com/cluster/ > there is a not working link: > > "Source code > > The kernel patches" > > which points to: http://people.redhat.com/teigland/dlm/patches/ > > is there an other url for this source code? 
http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.20.tar.bz2 cd linux-2.6.20/fs/dlm/ > And where can we find some documentation about how DLM is working? Funny you ask, I answered the same question just this morning: https://www.redhat.com/archives/linux-cluster/2007-February/msg00127.html Dave From rhurst at bidmc.harvard.edu Thu Feb 15 20:43:55 2007 From: rhurst at bidmc.harvard.edu (rhurst at bidmc.harvard.edu) Date: Thu, 15 Feb 2007 15:43:55 -0500 Subject: [Linux-cluster] ccs_tool update In-Reply-To: <45D4A57B.6030801@redhat.com> References: <1171562538.7516.23.camel@WSBID06223> <45D4A57B.6030801@redhat.com> Message-ID: <1171572235.13117.2.camel@WSBID06223> Worked great, thank you!! The FAQ is missing the dash for option 'r', that is, it should be: cman_tool version -r 38 On Thu, 2007-02-15 at 12:24 -0600, Robert Peterson wrote: > Hi Robert, > > This might answer your question. If it doesn't, let me know and I'll > fix it: > http://sources.redhat.com/cluster/faq.html#clusterconfpropagate > > Regards, > > Bob Peterson > Red Hat Cluster Suite > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > Robert Hurst, Sr. Cach? Administrator Beth Israel Deaconess Medical Center 1135 Tremont Street, REN-7 Boston, Massachusetts 02120-2140 617-754-8754 ? Fax: 617-754-8730 ? Cell: 401-787-3154 Any technology distinguishable from magic is insufficiently advanced. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2178 bytes Desc: not available URL: From hlawatschek at atix.de Thu Feb 15 20:59:49 2007 From: hlawatschek at atix.de (Mark Hlawatschek) Date: Thu, 15 Feb 2007 21:59:49 +0100 Subject: [Linux-cluster] gfs deadlock situation In-Reply-To: <45D4737D.5070404@redhat.com> References: <200702131337.18680.hlawatschek@atix.de> <200702151113.10402.hlawatschek@atix.de> <45D4737D.5070404@redhat.com> Message-ID: <200702152159.49650.hlawatschek@atix.de> Hi Wendy, I created a bugzilla for this: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228916 Thanks, Mark > > Is there a way to resolve the deadlock without rebooting a server ? > > > I don't think there is a way to work around this hang other than reboot. > Will look into this with Dave (who owns DLM code in the group). > > > On Wednesday 14 February 2007 20:41, Wendy Cheng wrote: > >> Mark Hlawatschek wrote: > >>> On Tuesday 13 February 2007 20:56, Wendy Cheng wrote: > >>>> Wendy Cheng wrote: > >>>>> So it is removing a file. It has obtained the directory lock and is > >>>>> waiting for the file lock. Look to me DLM (LM_CB_ASYNC) callback > >>>>> never occurs. Do you have abnormal messages in your /var/log/messages > >>>>> file ? Dave, how to dump the locks from DLM side to see how DLM is > >>>>> thinking ? > >>>> > >>>> shell> cman_tool services /* find your lock space name */ > >>>> shell> echo "lock-space-name-found-above" > /proc/cluster/dlm_locks > >>>> shell> cat /proc/cluster/dlm_locks > >>>> > >>>> Then try to find the lock (2, hex of (1114343)) - cut and paste the > >>>> contents of that file here. > >>> > >>> syslog seems to be ok. > >>> note, that the process 5856 is running on node1 > >> > >> What I was looking for is a lock (type=2 and lock number=0x98d3 > >> (=1114343)) - that's the lock hangs process id=5856. 
Since pid=5856 also > >> holds another directory exclusive lock, so no body can access to that > >> directory. > >> > >> Apparently from GFS end, node 2 thinks 0x98d3 is "unlocked" and node 1 > >> is waiting for it. The only thing that can get node 1 out of this wait > >> state is DLM's callback. If DLM doesn't have any record of this lock, > >> pid=5856 will wait forever. Are you sure this is the whole file of DLM > >> output ? This lock somehow disappears from DLM and I have no idea why we > >> get into this state. If the files are too large, could you tar the files > >> and email over ? I would like to see (both) complete glock and dlm lock > >> dumps on both nodes (4 files here). If possible, add the following two > >> outputs (so 6 files total): > >> > >> shell> cd /tmp /* on both nodes */ > >> shell> script /* this should generate a file called typescript in /tmp > >> directory */ > >> shell> crash > >> crash> foreach bt /* keep hitting space bar until this command run thru > >> */ crash> quit > >> shell> /* this should close out typescript file */ > >> shell> mv typescript nodex_crash /* x=1, 2 based on node1 or node2 */ > >> > >> Tar these 6 files (glock_dump_1, glock_dump_2, dlm_dump_1, dlm_dump_2, > >> node1_crash, node2_crash) and email them to wcheng at redhat.com > >> > >> Thank you for the helps if you can. > >> > >> -- Wendy > >> > >>> Here's the dlm output: > >>> > >>> node1: > >>> Resource 0000010001218088 (parent 0000000000000000). Name (len=24) " > >>> 2 1100e7" > >>> Local Copy, Master is node 2 > >>> Granted Queue > >>> Conversion Queue > >>> Waiting Queue > >>> 5eb00178 PR (EX) Master: 3eeb0117 LQ: 0,0x5 > >>> [...] > >>> Resource 00000100f56f0618 (parent 0000000000000000). Name (len=24) " > >>> 5 1100e7" > >>> Local Copy, Master is node 2 > >>> Granted Queue > >>> 5bc20257 PR Master: 3d9703e0 > >>> Conversion Queue > >>> Waiting Queue > >>> > >>> node2: > >>> Resource 00000107e462c8c8 (parent 0000000000000000). Name (len=24) " > >>> 2 1100e7" > >>> Master Copy > >>> Granted Queue > >>> 3eeb0117 PR Remote: 1 5eb00178 > >>> Conversion Queue > >>> Waiting Queue > >>> [...] > >>> Resource 000001079f7e81d8 (parent 0000000000000000). Name (len=24) " > >>> 5 1100e7" > >>> Master Copy > >>> Granted Queue > >>> 3d9703e0 PR Remote: 1 5bc20257 > >>> 3e500091 PR > >>> Conversion Queue > >>> Waiting Queue > >>> > >>> Thanks for your help, > >>> > >>> Mark > >>> > >>>>>>>> we have the following deadlock situation: > >>>>>>>> > >>>>>>>> 2 node cluster consisting of node1 and node2. > >>>>>>>> /usr/local is placed on a GFS filesystem mounted on both nodes. > >>>>>>>> Lockmanager is dlm. > >>>>>>>> We are using RHEL4u4 > >>>>>>>> > >>>>>>>> a strace to ls -l /usr/local/swadmin/mnx/xml ends up in > >>>>>>>> lstat("/usr/local/swadmin/mnx/xml", > >>>>>>>> > >>>>>>>> This happens on both cluster nodes. > >>>>>>>> > >>>>>>>> All processes trying to access the directory > >>>>>>>> /usr/local/swadmin/mnx/xml > >>>>>>>> are in "Waiting for IO (D)" state. I.e. system load is at about > >>>>>>>> 400 ;-) > >>>>>>>> > >>>>>>>> Any ideas ? > >>>>>>> > >>>>>>> Quickly browsing this, look to me that process with pid=5856 got > >>>>>>> stuck. That process had the file or directory (ino number 627732 - > >>>>>>> probably /usr/local/swadmin/mnx/xml) exclusive lock so everyone was > >>>>>>> waiting for it. The faulty process was apparently in the middle of > >>>>>>> obtaining another > >>>>>>> exclusive lock (and almost got it). We need to know where pid=5856 > >>>>>>> was stuck at that time. 
If this occurs again, could you use "crash" > >>>>>>> to back trace that process and show us the output ? > >>>>>> > >>>>>> Here's the crash output: > >>>>>> > >>>>>> crash> bt 5856 > >>>>>> PID: 5856 TASK: 10bd26427f0 CPU: 0 COMMAND: "java" > >>>>>> #0 [10bd20cfbc8] schedule at ffffffff8030a1d1 > >>>>>> #1 [10bd20cfca0] wait_for_completion at ffffffff8030a415 > >>>>>> #2 [10bd20cfd20] glock_wait_internal at ffffffffa018574e > >>>>>> #3 [10bd20cfd60] gfs_glock_nq_m at ffffffffa01860ce > >>>>>> #4 [10bd20cfda0] gfs_unlink at ffffffffa019ce41 > >>>>>> #5 [10bd20cfea0] vfs_unlink at ffffffff801889fa > >>>>>> #6 [10bd20cfed0] sys_unlink at ffffffff80188b19 > >>>>>> #7 [10bd20cff30] filp_close at ffffffff80178e48 > >>>>>> #8 [10bd20cff50] error_exit at ffffffff80110d91 > >>>>>> RIP: 0000002a9593f649 RSP: 0000007fbfffbca0 RFLAGS: 00010206 > >>>>>> RAX: 0000000000000057 RBX: ffffffff8011026a RCX: > >>>>>> 0000002a9cc9c870 RDX: 0000002ae5989000 RSI: 0000002a962fa3a8 RDI: > >>>>>> 0000002ae5989000 RBP: 0000000000000000 R8: 0000002a9630abb0 R9: > >>>>>> 0000000000000ffc R10: 0000002a9630abc0 R11: 0000000000000206 R12: > >>>>>> 0000000040115700 R13: 0000002ae23294b0 R14: 0000007fbfffc300 R15: > >>>>>> 0000002ae5989000 ORIG_RAX: 0000000000000057 CS: 0033 SS: 002b > >>>>>> > >>>>>>>> a lockdump analysis with the decipher_lockstate_dump and > >>>>>>>> parse_lockdump > >>>>>>>> shows the following output (The whole file is too large for the > >>>>>>>> mailing-list): > >>>>>>>> > >>>>>>>> Entries: 101939 > >>>>>>>> Glocks: 60112 > >>>>>>>> PIDs: 751 > >>>>>>>> > >>>>>>>> 4 chain: > >>>>>>>> lockdump.node1.dec Glock (inode[2], 1114343) > >>>>>>>> gl_flags = lock[1] > >>>>>>>> gl_count = 5 > >>>>>>>> gl_state = shared[3] > >>>>>>>> req_gh = yes > >>>>>>>> req_bh = yes > >>>>>>>> lvb_count = 0 > >>>>>>>> object = yes > >>>>>>>> new_le = no > >>>>>>>> incore_le = no > >>>>>>>> reclaim = no > >>>>>>>> aspace = 1 > >>>>>>>> ail_bufs = no > >>>>>>>> Request > >>>>>>>> owner = 5856 > >>>>>>>> gh_state = exclusive[1] > >>>>>>>> gh_flags = try[0] local_excl[5] async[6] > >>>>>>>> error = 0 > >>>>>>>> gh_iflags = promote[1] > >>>>>>>> Waiter3 > >>>>>>>> owner = 5856 > >>>>>>>> gh_state = exclusive[1] > >>>>>>>> gh_flags = try[0] local_excl[5] async[6] > >>>>>>>> error = 0 > >>>>>>>> gh_iflags = promote[1] > >>>>>>>> Inode: busy > >>>>>>>> lockdump.node2.dec Glock (inode[2], 1114343) > >>>>>>>> gl_flags = > >>>>>>>> gl_count = 2 > >>>>>>>> gl_state = unlocked[0] > >>>>>>>> req_gh = no > >>>>>>>> req_bh = no > >>>>>>>> lvb_count = 0 > >>>>>>>> object = yes > >>>>>>>> new_le = no > >>>>>>>> incore_le = no > >>>>>>>> reclaim = no > >>>>>>>> aspace = 0 > >>>>>>>> ail_bufs = no > >>>>>>>> Inode: > >>>>>>>> num = 1114343/1114343 > >>>>>>>> type = regular[1] > >>>>>>>> i_count = 1 > >>>>>>>> i_flags = > >>>>>>>> vnode = yes > >>>>>>>> lockdump.node1.dec Glock (inode[2], 627732) > >>>>>>>> gl_flags = dirty[5] > >>>>>>>> gl_count = 379 > >>>>>>>> gl_state = exclusive[1] > >>>>>>>> req_gh = no > >>>>>>>> req_bh = no > >>>>>>>> lvb_count = 0 > >>>>>>>> object = yes > >>>>>>>> new_le = no > >>>>>>>> incore_le = no > >>>>>>>> reclaim = no > >>>>>>>> aspace = 58 > >>>>>>>> ail_bufs = no > >>>>>>>> Holder > >>>>>>>> owner = 5856 > >>>>>>>> gh_state = exclusive[1] > >>>>>>>> gh_flags = try[0] local_excl[5] async[6] > >>>>>>>> error = 0 > >>>>>>>> gh_iflags = promote[1] holder[6] first[7] > >>>>>>>> Waiter2 > >>>>>>>> owner = none[-1] > >>>>>>>> gh_state = shared[3] > >>>>>>>> gh_flags = try[0] > >>>>>>>> error = 0 > >>>>>>>> 
gh_iflags = demote[2] alloced[4] dealloc[5] > >>>>>>>> Waiter3 > >>>>>>>> owner = 32753 > >>>>>>>> gh_state = shared[3] > >>>>>>>> gh_flags = any[3] > >>>>>>>> error = 0 > >>>>>>>> gh_iflags = promote[1] > >>>>>>>> [...loads of Waiter3 entries...] > >>>>>>>> Waiter3 > >>>>>>>> owner = 4566 > >>>>>>>> gh_state = shared[3] > >>>>>>>> gh_flags = any[3] > >>>>>>>> error = 0 > >>>>>>>> gh_iflags = promote[1] > >>>>>>>> Inode: busy > >>>>>>>> lockdump.node2.dec Glock (inode[2], 627732) > >>>>>>>> gl_flags = lock[1] > >>>>>>>> gl_count = 375 > >>>>>>>> gl_state = unlocked[0] > >>>>>>>> req_gh = yes > >>>>>>>> req_bh = yes > >>>>>>>> lvb_count = 0 > >>>>>>>> object = yes > >>>>>>>> new_le = no > >>>>>>>> incore_le = no > >>>>>>>> reclaim = no > >>>>>>>> aspace = 0 > >>>>>>>> ail_bufs = no > >>>>>>>> Request > >>>>>>>> owner = 20187 > >>>>>>>> gh_state = shared[3] > >>>>>>>> gh_flags = any[3] > >>>>>>>> error = 0 > >>>>>>>> gh_iflags = promote[1] > >>>>>>>> Waiter3 > >>>>>>>> owner = 20187 > >>>>>>>> gh_state = shared[3] > >>>>>>>> gh_flags = any[3] > >>>>>>>> error = 0 > >>>>>>>> gh_iflags = promote[1] > >>>>>>>> [...loads of Waiter3 entries...] > >>>>>>>> Waiter3 > >>>>>>>> owner = 10460 > >>>>>>>> gh_state = shared[3] > >>>>>>>> gh_flags = any[3] > >>>>>>>> error = 0 > >>>>>>>> gh_iflags = promote[1] > >>>>>>>> Inode: busy > >>>>>>>> 2 requests > >>>>>>> > >>>>>>> -- > >> > >> -- > >> Linux-cluster mailing list > >> Linux-cluster at redhat.com > >> https://www.redhat.com/mailman/listinfo/linux-cluster -- Gruss / Regards, ** Visit us at CeBIT 2007 in Hannover/Germany ** ** in Hall 5, Booth G48/2 (15.-21. of March) ** Dipl.-Ing. Mark Hlawatschek http://www.atix.de/ http://www.open-sharedroot.org/ ** ATIX - Ges. fuer Informationstechnologie und Consulting mbH Einsteinstr. 10 - 85716 Unterschleissheim - Germany From rpeterso at redhat.com Thu Feb 15 21:11:40 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Thu, 15 Feb 2007 15:11:40 -0600 Subject: [Linux-cluster] ccs_tool update In-Reply-To: <1171572235.13117.2.camel@WSBID06223> References: <1171562538.7516.23.camel@WSBID06223> <45D4A57B.6030801@redhat.com> <1171572235.13117.2.camel@WSBID06223> Message-ID: <45D4CC8C.8000202@redhat.com> rhurst at bidmc.harvard.edu wrote: > Worked great, thank you!! The FAQ is missing the dash for option 'r', > that is, it should be: > > cman_tool version -r 38 > Hi Robert. Fixed. Thanks for pointing it out. Actually, in the source, it looked just like a -r. But somehow that dash was interpreted wrong in that font. I deleted it and retyped a new "-" and now it appears. Very strange indeed. I checked the rest of the faq for other bad dashes, but found none. Regards, Bob Peterson Red Hat Cluster Suite From jprats at cesca.es Fri Feb 16 08:56:27 2007 From: jprats at cesca.es (Jordi Prats) Date: Fri, 16 Feb 2007 09:56:27 +0100 Subject: [Linux-cluster] clustat segmentation fault Message-ID: <45D571BB.6060203@cesca.es> Hi, After rebooting and adding a new disc (from a EVA device) clustat is giving me a segmentation fault: # clustat Segmentation fault This is the current version: # clustat -v clustat version 1.9.53 Connected via: CMAN/SM Plugin v1.1.7.1 On the messages log do no appears nothing, but DLM seems that is not working because there's no proces called dlm_* Any ideas where to start? Thank you very much! -- ...................................................................... __ / / Jordi Prats C E / S / C A Dept. de Sistemes /_/ Centre de Supercomputaci? de Catalunya Gran Capit?, 2-4 (Edifici Nexus) ? 
08034 Barcelona T. 93 205 6464 ? F. 93 205 6979 ? jprats at cesca.es ...................................................................... From grimme at atix.de Fri Feb 16 15:59:20 2007 From: grimme at atix.de (Marc Grimme) Date: Fri, 16 Feb 2007 16:59:20 +0100 Subject: [Linux-cluster] conga ricci ssl problems Message-ID: <200702161659.21154.grimme@atix.de> Hi, I'm trying to get conga ricci/luci up. I've compiled everything successfully from cvs, started luci and ricci on two different hosts. When I try to add the system I get the error: -------------------------------------------------------------------- [root at gfs-node1 conga]# ricci -f -d client added client added exception: SSL_read() error: SSL_ERROR_SYSCALL request completed in 3 milliseconds exception: SSL_read() error: SSL_ERROR_SYSCALL request completed in 2 milliseconds client removed client removed -------------------------------------------------------------- Looks like something is gone wrong with the ssl libraries or with the guy sitting behind keyboard and chair. Any ideas? Thanks Marc. -- Gruss / Regards, Marc Grimme Phone: +49-89 452 3538-14 http://www.atix.de/ http://www.open-sharedroot.org/ ** Visit us at CeBIT 2007 in Hannover/Germany ** ** in Hall 5, Booth G48/2 (15.-21. of March) ** ** ATIX - Ges. fuer Informationstechnologie und Consulting mbH Einsteinstr. 10 - 85716 Unterschleissheim - Germany Registergericht: Amtsgericht M?nchen Registernummer: HRB 131682 USt.-Id.: DE209485962 Gesch?ftsf?hrung: Marc Grimme, Mark Hlawatschek, Thomas Merz From kpodesta at redbrick.dcu.ie Fri Feb 16 16:38:38 2007 From: kpodesta at redbrick.dcu.ie (Karl Podesta) Date: Fri, 16 Feb 2007 16:38:38 +0000 Subject: [Linux-cluster] DRAC support in RH3 Cluster Suite? Message-ID: <20070216163838.GB24404@murphy.redbrick.dcu.ie> Hi folks, Just a short query - are Dell DRAC cards certified/supported as power switches for use in a 2-node Redhat 3 cluster? (2 x Dell Poweredge 2850s) Alternatively are there other recommendations on what to use? Many thanks! Karl -- Karl Podesta Systems Engineer, Securelinx Ltd. (Ireland) http://www.securelinx.com/ From lhh at redhat.com Fri Feb 16 17:04:22 2007 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 16 Feb 2007 12:04:22 -0500 Subject: [Linux-cluster] clustat segmentation fault In-Reply-To: <45D571BB.6060203@cesca.es> References: <45D571BB.6060203@cesca.es> Message-ID: <1171645462.3058.53.camel@localhost.localdomain> On Fri, 2007-02-16 at 09:56 +0100, Jordi Prats wrote: > Hi, > After rebooting and adding a new disc (from a EVA device) clustat is > giving me a segmentation fault: > > # clustat > Segmentation fault > > This is the current version: > # clustat -v > clustat version 1.9.53 > Connected via: CMAN/SM Plugin v1.1.7.1 > > On the messages log do no appears nothing, but DLM seems that is not > working because there's no proces called dlm_* > > Any ideas where to start? Cman isn't running. This should be fixed in 4.5. -- Lon From lhh at redhat.com Fri Feb 16 17:05:00 2007 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 16 Feb 2007 12:05:00 -0500 Subject: [Linux-cluster] DRAC support in RH3 Cluster Suite? In-Reply-To: <20070216163838.GB24404@murphy.redbrick.dcu.ie> References: <20070216163838.GB24404@murphy.redbrick.dcu.ie> Message-ID: <1171645500.3058.55.camel@localhost.localdomain> On Fri, 2007-02-16 at 16:38 +0000, Karl Podesta wrote: > Hi folks, > > Just a short query - are Dell DRAC cards certified/supported as power switches > for use in a 2-node Redhat 3 cluster? 
(2 x Dell Poweredge 2850s) Red Hat Cluster Suite for RHEL3 or Red Hat GFS for RHEL3 ? -- Lon From lhh at redhat.com Fri Feb 16 17:05:44 2007 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 16 Feb 2007 12:05:44 -0500 Subject: [Linux-cluster] conga ricci ssl problems In-Reply-To: <200702161659.21154.grimme@atix.de> References: <200702161659.21154.grimme@atix.de> Message-ID: <1171645544.3058.57.camel@localhost.localdomain> On Fri, 2007-02-16 at 16:59 +0100, Marc Grimme wrote: > Hi, > I'm trying to get conga ricci/luci up. I've compiled everything successfully > from cvs, started luci and ricci on two different hosts. When I try to add > the system I get the error: > -------------------------------------------------------------------- > [root at gfs-node1 conga]# ricci -f -d > client added > client added > exception: SSL_read() error: SSL_ERROR_SYSCALL > request completed in 3 milliseconds > exception: SSL_read() error: SSL_ERROR_SYSCALL > request completed in 2 milliseconds > client removed > client removed > -------------------------------------------------------------- > Looks like something is gone wrong with the ssl libraries or with the guy > sitting behind keyboard and chair. Any ideas? Hmm, are there any SELinux AVC messages? -- Lon From lhh at redhat.com Fri Feb 16 17:06:09 2007 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 16 Feb 2007 12:06:09 -0500 Subject: [Linux-cluster] clustat segmentation fault In-Reply-To: <1171645462.3058.53.camel@localhost.localdomain> References: <45D571BB.6060203@cesca.es> <1171645462.3058.53.camel@localhost.localdomain> Message-ID: <1171645569.3058.59.camel@localhost.localdomain> On Fri, 2007-02-16 at 12:04 -0500, Lon Hohberger wrote: > On Fri, 2007-02-16 at 09:56 +0100, Jordi Prats wrote: > > Hi, > > After rebooting and adding a new disc (from a EVA device) clustat is > > giving me a segmentation fault: > > > > # clustat > > Segmentation fault > > > > This is the current version: > > # clustat -v > > clustat version 1.9.53 > > Connected via: CMAN/SM Plugin v1.1.7.1 > > > > On the messages log do no appears nothing, but DLM seems that is not > > working because there's no proces called dlm_* > > > > Any ideas where to start? > > Cman isn't running. This should be fixed in 4.5. (or it might be running but inquorate) -- Lon From isplist at logicore.net Fri Feb 16 16:04:20 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Fri, 16 Feb 2007 10:04:20 -0600 Subject: [Linux-cluster] Can't see all volumes In-Reply-To: <45D49E53.2080903@utmem.edu> Message-ID: <200721610420.989274@leena> Sorry if I sent this twice, I just found it in my drafts which usually means it didn't make it out. ------------- > 1) We use Qlogic FC HBAs and I can view the visible LUNs from the BIOS > setup during the boot process. That would be the first step, seeing if > the HBA sees the LUNs. These are the HBA's I'm using also. > 2) We use HP FC switches (relabeled Brocade, I think) and use the > "zoning" feature to limit the LUNs seen by a particular FC HBA. I can > imagine that you would only want a specific host to see its own boot > LUN, in addition to any "shared" LUNs, using this sort of functionality. Had not thought of that, good idea. I'll first check the HBA, then set up some zoning on the switch. > if you can see the LUNs using the method in 1), you should be able to > pass the max_luns parameter to Anaconda during the install process. 
> When the "boot>" prompt shows up, you should be able to enter something > like: > > linux scsi_mod.max_scsi_luns=255 > > This should pass the correct parameter to the kernel when it loads, > which should make the additional LUNs visible. > > Since this is so speculative I did not post this to the list, but if it > works out you might want to send a summary back to the list. Otherwise, > the advice is worth just what you paid for it. (grin!) > > Good luck! From kpodesta at redbrick.dcu.ie Fri Feb 16 17:13:36 2007 From: kpodesta at redbrick.dcu.ie (Karl Podesta) Date: Fri, 16 Feb 2007 17:13:36 +0000 Subject: [Linux-cluster] DRAC support in RH3 Cluster Suite? In-Reply-To: <1171645500.3058.55.camel@localhost.localdomain> References: <20070216163838.GB24404@murphy.redbrick.dcu.ie> <1171645500.3058.55.camel@localhost.localdomain> Message-ID: <20070216171336.GC24404@murphy.redbrick.dcu.ie> On Fri, Feb 16, 2007 at 12:05:00PM -0500, Lon Hohberger wrote: > On Fri, 2007-02-16 at 16:38 +0000, Karl Podesta wrote: > > Hi folks, > > > > Just a short query - are Dell DRAC cards certified/supported as power switches > > for use in a 2-node Redhat 3 cluster? (2 x Dell Poweredge 2850s) > > Red Hat Cluster Suite for RHEL3 or Red Hat GFS for RHEL3 ? > > -- Lon Sorry, Red Hat Cluster Suite for RHEL3... Thanks, Karl -- Karl Podesta Systems Engineer, Securelinx Ltd. (Ireland) http://www.securelinx.com/ From lhh at redhat.com Fri Feb 16 17:54:04 2007 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 16 Feb 2007 12:54:04 -0500 Subject: [Linux-cluster] DRAC support in RH3 Cluster Suite? In-Reply-To: <20070216171336.GC24404@murphy.redbrick.dcu.ie> References: <20070216163838.GB24404@murphy.redbrick.dcu.ie> <1171645500.3058.55.camel@localhost.localdomain> <20070216171336.GC24404@murphy.redbrick.dcu.ie> Message-ID: <1171648445.3058.71.camel@localhost.localdomain> On Fri, 2007-02-16 at 17:13 +0000, Karl Podesta wrote: > On Fri, Feb 16, 2007 at 12:05:00PM -0500, Lon Hohberger wrote: > > On Fri, 2007-02-16 at 16:38 +0000, Karl Podesta wrote: > > > Hi folks, > > > > > > Just a short query - are Dell DRAC cards certified/supported as power switches > > > for use in a 2-node Redhat 3 cluster? (2 x Dell Poweredge 2850s) > > > > Red Hat Cluster Suite for RHEL3 or Red Hat GFS for RHEL3 ? > > > > -- Lon > > Sorry, Red Hat Cluster Suite for RHEL3... No agent for DRAC on RHCS3. -- Lon From lshen at cisco.com Fri Feb 16 18:55:35 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Fri, 16 Feb 2007 10:55:35 -0800 Subject: [Linux-cluster] Running GFS on top of AoE or HyperSCSI Message-ID: <08A9A3213527A6428774900A80DBD8D8037755D4@xmb-sjc-222.amer.cisco.com> Has anyone tried or looked into running GFS on top of AoE or HyperSCSI in replacement of GNBD? Our current thinking is that this will yield better performance. Does this make sense? And will it require any work to be done in GFS and even the cluster suite? Lin From teigland at redhat.com Fri Feb 16 19:20:47 2007 From: teigland at redhat.com (David Teigland) Date: Fri, 16 Feb 2007 13:20:47 -0600 Subject: [Linux-cluster] Running GFS on top of AoE or HyperSCSI In-Reply-To: <08A9A3213527A6428774900A80DBD8D8037755D4@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D8037755D4@xmb-sjc-222.amer.cisco.com> Message-ID: <20070216192047.GA18201@redhat.com> On Fri, Feb 16, 2007 at 10:55:35AM -0800, Lin Shen (lshen) wrote: > Has anyone tried or looked into running GFS on top of AoE or HyperSCSI > in replacement of GNBD? 
Our current thinking is that this will yield > better performance. > > Does this make sense? And will it require any work to be done in GFS and > even the cluster suite? Yes it makes sense, no gfs/cluster work should be required. Dave From jprats at cesca.es Fri Feb 16 19:28:32 2007 From: jprats at cesca.es (Jordi Prats) Date: Fri, 16 Feb 2007 20:28:32 +0100 Subject: [Linux-cluster] clustat segmentation fault In-Reply-To: <1171645569.3058.59.camel@localhost.localdomain> References: <45D571BB.6060203@cesca.es> <1171645462.3058.53.camel@localhost.localdomain> <1171645569.3058.59.camel@localhost.localdomain> Message-ID: <45D605E0.3010008@cesca.es> So I should update my system? There's something else that may be causing this. I've found that all vgscan is giving me an error. This is the output: # vgscan -v Wiping cache of LVM-capable devices Wiping internal VG cache Reading all physical volumes. This may take a while... Finding all volume groups /dev/dm-4: read failed after 0 of 4096 at 12989693952: Input/output error /dev/dm-4: read failed after 0 of 4096 at 0: Input/output error Finding volume group "padicat" Found volume group "padicat" using metadata type lvm2 Finding volume group "vg_bbdd" Found volume group "vg_bbdd" using metadata type lvm2 Finding volume group "vg_ordal" Found volume group "vg_ordal" using metadata type lvm2 Any idea what this "Input/output error" on /dev/dm-4 does it means? Jordi Lon Hohberger wrote: > On Fri, 2007-02-16 at 12:04 -0500, Lon Hohberger wrote: >> On Fri, 2007-02-16 at 09:56 +0100, Jordi Prats wrote: >>> Hi, >>> After rebooting and adding a new disc (from a EVA device) clustat is >>> giving me a segmentation fault: >>> >>> # clustat >>> Segmentation fault >>> >>> This is the current version: >>> # clustat -v >>> clustat version 1.9.53 >>> Connected via: CMAN/SM Plugin v1.1.7.1 >>> >>> On the messages log do no appears nothing, but DLM seems that is not >>> working because there's no proces called dlm_* >>> >>> Any ideas where to start? >> Cman isn't running. This should be fixed in 4.5. > > (or it might be running but inquorate) > > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- ...................................................................... __ / / Jordi Prats Catal? C E / S / C A Departament de Sistemes /_/ Centre de Supercomputaci? de Catalunya Gran Capit?, 2-4 (Edifici Nexus) ? 08034 Barcelona T. 93 205 6464 ? F. 93 205 6979 ? jprats at cesca.es ...................................................................... pgp:0x5D0D1321 ...................................................................... From lshen at cisco.com Fri Feb 16 19:25:35 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Fri, 16 Feb 2007 11:25:35 -0800 Subject: [Linux-cluster] Re: RedHat SSI cluster In-Reply-To: <45992F65.206@gmail.com> Message-ID: <08A9A3213527A6428774900A80DBD8D8037E8356@xmb-sjc-222.amer.cisco.com> Hi Aneesh, We're planning to make GFS/GNBD to work on top of TIPC in hope that will give better performance over TCP. Since TIPC provides socket-like APIs, our initial thinking was to just convert socket APIs in GFS/GNBD code to TIPC socket-like APIs. Based on what you described, making GFS/GNBD to work on top of ICS may be a better alternative. Could you give me some pointers on how to make GFS/GNBD to work on ICS? 
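Returning briefly to the vgscan errors Jordi posted further up: /dev/dm-4 is only a device-mapper node, so the usual first step is to identify which LV or mapped device it corresponds to and which physical path backs it, before suspecting LVM itself. A minimal, purely diagnostic sketch:

# map device-mapper minor numbers to names; look for minor 4
dmsetup ls
dmsetup info -c

# show the table behind each map, i.e. which underlying devices back it
dmsetup table

# cross-check from the LVM side which PVs each LV sits on
lvs -a -o +devices

Read errors at a fixed offset on one dm device usually point at the underlying path or LUN presentation rather than at the volume group metadata.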
Lin > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Aneesh > Kumar K.V > Sent: Monday, January 01, 2007 7:57 AM > To: linux-cluster at redhat.com > Subject: [Linux-cluster] Re: RedHat SSI cluster > > Bob Marcan wrote: > > Hi. > > Are there any plans to enhance RHCS to become full SSI > (Single System > > Image) cluster? > > Will http://www.open-sharedroot.org/ become officialy included and > > supported? > > Isn't time to unite force with the http://www.openssi.org ? > > > > > If you look at openssi.org code you can consider it contain > multiple components > > a) ICS > b) VPROC > c) CFS > d) Clusterwide SYSVIPC > e) Clusterwide PID > f) Clusterwide remote file operations > > > I am right now done with getting ICS cleaned up for > 2.6.20-rc1 kernel. It provides a transport independent > cluster framework for writing kernel cluster services. > You can find the code at > http://git.openssi.org/~kvaneesh/gitweb.cgi?p=ci-to-linus.git; > a=summary > > > So what could be done which will help GFS and OCFS2 is to > make sure they can work on top of ICS. That also bring in an > advantage that GFS and OCFS2 can work using > TCP/Infiniband/SCTP/TIPC what ever the transport layer > protocol is. Once that is done next step would be to get > Clusterwide SYSVIPC from OpenSSI and merge it with latest > kernel. ClusterWide PID and clusterwide remote file operation > is easy to get working. What is most difficult is VPROC which > bring in the clusterwide proc model. Bruce Walker have a > paper written on a generic framework at > http://www.openssi.org/cgi-bin/view?page=proc-hooks.html > > > -aneesh > > > -aneesh > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From sdake at redhat.com Sat Feb 17 00:33:25 2007 From: sdake at redhat.com (Steven Dake) Date: Fri, 16 Feb 2007 17:33:25 -0700 Subject: [Linux-cluster] Re: RedHat SSI cluster In-Reply-To: <08A9A3213527A6428774900A80DBD8D8037E8356@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D8037E8356@xmb-sjc-222.amer.cisco.com> Message-ID: <1171672405.21782.7.camel@shih.broked.org> Aneesh, The latest GFS infrastructure requires total ordering of messages to work properly and is highly integrated into openais at the moment. It is possible to modify the totem protocol in openais to use TIPC or some other transport besides UDP but multicast (or broadcast) is a requirement of the underlying protocol. I do not know of other protocols that are available with a suitable license that provide total ordering of messages (often called agreed ordering). An example of what is required from the transport is as follows: Agreed ordering always requires the following: ex: 3 nodes transmitting messages a b c N1: C A B delivered N2: C A B delivered N3: C A B delivered whereas without total order something like this could happen: n1: C A B delivered N2: A B C delivered N3: C B A delivered This second scenario is disallowed by agreed ordering and won't work with the GFS infrastructure. The protocol in openais (Totem Single Ring Protocol) provides agreed and virtual synchrony ordering. Regards -steve On Fri, 2007-02-16 at 11:25 -0800, Lin Shen (lshen) wrote: > Hi Aneesh, > > We're planning to make GFS/GNBD to work on top of TIPC in hope that will > give better performance over TCP. 
Since TIPC provides socket-like APIs, > our initial thinking was to just convert socket APIs in GFS/GNBD code to > TIPC socket-like APIs. Based on what you described, making GFS/GNBD to > work on top of ICS may be a better alternative. > > Could you give me some pointers on how to make GFS/GNBD to work on ICS? > > Lin > > > -----Original Message----- > > From: linux-cluster-bounces at redhat.com > > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Aneesh > > Kumar K.V > > Sent: Monday, January 01, 2007 7:57 AM > > To: linux-cluster at redhat.com > > Subject: [Linux-cluster] Re: RedHat SSI cluster > > > > Bob Marcan wrote: > > > Hi. > > > Are there any plans to enhance RHCS to become full SSI > > (Single System > > > Image) cluster? > > > Will http://www.open-sharedroot.org/ become officialy included and > > > supported? > > > Isn't time to unite force with the http://www.openssi.org ? > > > > > > > > > If you look at openssi.org code you can consider it contain > > multiple components > > > > a) ICS > > b) VPROC > > c) CFS > > d) Clusterwide SYSVIPC > > e) Clusterwide PID > > f) Clusterwide remote file operations > > > > > > I am right now done with getting ICS cleaned up for > > 2.6.20-rc1 kernel. It provides a transport independent > > cluster framework for writing kernel cluster services. > > You can find the code at > > http://git.openssi.org/~kvaneesh/gitweb.cgi?p=ci-to-linus.git; > > a=summary > > > > > > So what could be done which will help GFS and OCFS2 is to > > make sure they can work on top of ICS. That also bring in an > > advantage that GFS and OCFS2 can work using > > TCP/Infiniband/SCTP/TIPC what ever the transport layer > > protocol is. Once that is done next step would be to get > > Clusterwide SYSVIPC from OpenSSI and merge it with latest > > kernel. ClusterWide PID and clusterwide remote file operation > > is easy to get working. What is most difficult is VPROC which > > bring in the clusterwide proc model. Bruce Walker have a > > paper written on a generic framework at > > http://www.openssi.org/cgi-bin/view?page=proc-hooks.html > > > > > > -aneesh > > > > > > -aneesh > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From rawipfel at novell.com Sat Feb 17 00:49:53 2007 From: rawipfel at novell.com (Robert Wipfel) Date: Fri, 16 Feb 2007 17:49:53 -0700 Subject: [Linux-cluster] Re: RedHat SSI cluster In-Reply-To: <1171672405.21782.7.camel@shih.broked.org> References: <08A9A3213527A6428774900A80DBD8D8037E8356@xmb-sjc-222.amer.cisco.com> <1171672405.21782.7.camel@shih.broked.org> Message-ID: <45D5EEC1.C5C7.00CF.0@novell.com> >>> On Fri, Feb 16, 2007 at 5:33 PM, in message <1171672405.21782.7.camel at shih.broked.org>, Steven Dake wrote: > I do not know of other protocols that are available with a suitable > license that provide total ordering of messages (often called agreed > ordering). 
Spread has a BSD style license - www.spread.org Robert From ace at sannes.org Sat Feb 17 11:31:48 2007 From: ace at sannes.org (=?ISO-8859-1?Q?Asbj=F8rn_Sannes?=) Date: Sat, 17 Feb 2007 12:31:48 +0100 Subject: [Linux-cluster] gfs1 and 2.6.20 Message-ID: <45D6E7A4.9020106@sannes.org> I have been trying to use the STABLE branch of the cluster suite with vanilla 2.6.20 kernel, and everything seemed at first to work, my problem can be reproduced by this: mount a gfs filesystem anywhere.. do a sync, this sync will now just hang there .. If I unmount the filesystem in another terminal, the sync command will end.. .. dumping the kernel stack of sync shows that it is in __sync_inodes on __down_read, looking in the code it seems that is waiting for the s_umount semaphore (in the superblock).. Just tell me if you need any more information or if this is not the correct place for this.. Greetings, Asbj?rn Sannes From ace at sannes.org Sat Feb 17 11:46:09 2007 From: ace at sannes.org (=?ISO-8859-1?Q?Asbj=F8rn_Sannes?=) Date: Sat, 17 Feb 2007 12:46:09 +0100 Subject: [Linux-cluster] gfs1 and 2.6.20 In-Reply-To: <45D6E7A4.9020106@sannes.org> References: <45D6E7A4.9020106@sannes.org> Message-ID: <45D6EB01.7080906@sannes.org> Asbj?rn Sannes wrote: > I have been trying to use the STABLE branch of the cluster suite with > vanilla 2.6.20 kernel, and everything seemed at first to work, my > problem can be reproduced by this: > > mount a gfs filesystem anywhere.. > do a sync, this sync will now just hang there .. > > If I unmount the filesystem in another terminal, the sync command will > end.. > > .. dumping the kernel stack of sync shows that it is in __sync_inodes on > __down_read, looking in the code it seems that is waiting for the > s_umount semaphore (in the superblock).. > > Just tell me if you need any more information or if this is not the > correct place for this.. > Here is the trace for sync (while hanging) .. 
sync D ffffffff8062eb80 0 17843 15013 (NOTLB) ffff810071689e98 0000000000000082 ffff810071689eb8 ffffffff8024d210 0000000071689e18 0000000000000000 0000000100000000 ffff81007b670fe0 ffff81007b6711b8 00000000000004c8 ffff810037c84770 0000000000000001 Call Trace: [] wait_on_page_writeback_range+0xed/0x140 [] __down_read+0x90/0xaa [] down_read+0x16/0x1a [] __sync_inodes+0x5f/0xbb [] sync_inodes+0x16/0x2f [] do_sync+0x17/0x60 [] sys_sync+0xe/0x12 [] system_call+0x7e/0x83 Greetings, Asbj?rn Sannes From jprats at cesca.es Mon Feb 19 07:40:14 2007 From: jprats at cesca.es (Jordi Prats) Date: Mon, 19 Feb 2007 08:40:14 +0100 Subject: [Linux-cluster] clustat segmentation fault In-Reply-To: <1171645569.3058.59.camel@localhost.localdomain> References: <45D571BB.6060203@cesca.es> <1171645462.3058.53.camel@localhost.localdomain> <1171645569.3058.59.camel@localhost.localdomain> Message-ID: <45D9545E.5090704@cesca.es> Hi, It seems that it's running: # /etc/init.d/cman status Protocol version: 5.0.1 Config version: 85 Cluster name: dades Cluster ID: 3093 Cluster Member: No Membership state: Joining Jordi Lon Hohberger wrote: > On Fri, 2007-02-16 at 12:04 -0500, Lon Hohberger wrote: > >> On Fri, 2007-02-16 at 09:56 +0100, Jordi Prats wrote: >> >>> Hi, >>> After rebooting and adding a new disc (from a EVA device) clustat is >>> giving me a segmentation fault: >>> >>> # clustat >>> Segmentation fault >>> >>> This is the current version: >>> # clustat -v >>> clustat version 1.9.53 >>> Connected via: CMAN/SM Plugin v1.1.7.1 >>> >>> On the messages log do no appears nothing, but DLM seems that is not >>> working because there's no proces called dlm_* >>> >>> Any ideas where to start? >>> >> Cman isn't running. This should be fixed in 4.5. >> > > (or it might be running but inquorate) > > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > > -- ...................................................................... __ / / Jordi Prats C E / S / C A Dept. de Sistemes /_/ Centre de Supercomputaci? de Catalunya Gran Capit?, 2-4 (Edifici Nexus) ? 08034 Barcelona T. 93 205 6464 ? F. 93 205 6979 ? jprats at cesca.es ...................................................................... From pcaulfie at redhat.com Mon Feb 19 08:55:00 2007 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Mon, 19 Feb 2007 08:55:00 +0000 Subject: [Linux-cluster] clustat segmentation fault In-Reply-To: <45D9545E.5090704@cesca.es> References: <45D571BB.6060203@cesca.es> <1171645462.3058.53.camel@localhost.localdomain> <1171645569.3058.59.camel@localhost.localdomain> <45D9545E.5090704@cesca.es> Message-ID: <45D965E4.6010701@redhat.com> Jordi Prats wrote: > Hi, > It seems that it's running: > > # /etc/init.d/cman status > Protocol version: 5.0.1 > Config version: 85 > Cluster name: dades > Cluster ID: 3093 > Cluster Member: No > Membership state: Joining > That's not "running" - that's "Joining" :-) If it sticks like that for any length of time say over a minute) then check your networking (and the archives of this mailing list). 
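For a node stuck in the "Joining" state like this, the usual suspects are a host firewall or broken multicast/broadcast between the nodes. A rough checklist, assuming a RHEL4-era cluster where cman defaults to UDP port 6809 (adjust the port and interface if your configuration differs):

# local view of membership
cat /proc/cluster/status
cman_tool status
cman_tool nodes

# make sure nothing is filtering cluster traffic
iptables -L -n

# watch the join traffic on the cluster interface, from both nodes at once
tcpdump -i eth0 -n udp port 6809

# confirm the cluster node name resolves to the interface you expect
getent hosts $(uname -n)

Comparing the tcpdump output from the joining node and an existing member (as Patrick and Frederik did earlier in this thread) quickly shows whether packets are being sent, received, or silently dropped in between.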
-- patrick From jprats at cesca.es Mon Feb 19 10:56:50 2007 From: jprats at cesca.es (Jordi Prats) Date: Mon, 19 Feb 2007 11:56:50 +0100 Subject: [Linux-cluster] clustat segmentation fault In-Reply-To: <45D965E4.6010701@redhat.com> References: <45D571BB.6060203@cesca.es> <1171645462.3058.53.camel@localhost.localdomain> <1171645569.3058.59.camel@localhost.localdomain> <45D9545E.5090704@cesca.es> <45D965E4.6010701@redhat.com> Message-ID: <45D98272.4040905@cesca.es> I've already checked my networking, it have no problem... Thank you anyway :) Jordi Patrick Caulfield wrote: > Jordi Prats wrote: > >> Hi, >> It seems that it's running: >> >> # /etc/init.d/cman status >> Protocol version: 5.0.1 >> Config version: 85 >> Cluster name: dades >> Cluster ID: 3093 >> Cluster Member: No >> Membership state: Joining >> >> > > > That's not "running" - that's "Joining" :-) > > If it sticks like that for any length of time say over a minute) then check your > networking (and the archives of this mailing list). > > -- ...................................................................... __ / / Jordi Prats C E / S / C A Dept. de Sistemes /_/ Centre de Supercomputaci? de Catalunya Gran Capit?, 2-4 (Edifici Nexus) ? 08034 Barcelona T. 93 205 6464 ? F. 93 205 6979 ? jprats at cesca.es ...................................................................... From Alain.Moulle at bull.net Mon Feb 19 11:44:56 2007 From: Alain.Moulle at bull.net (Alain Moulle) Date: Mon, 19 Feb 2007 12:44:56 +0100 Subject: [Linux-cluster] CS4 U4 / problem to relocate with clusvcadm Message-ID: <45D98DB8.5080103@bull.net> Hi Configuration with only two nodes in cluster : it seems there is sometimes a problem with clusvcadm which returns error : "Member Node1 not in membership list" whereas the Node1 is in fact really in membership, which is verified by the sequence : 1/magma_tool members | grep Member | grep UP | wc -l (which returns 2) 2/clusvcadm -r appli -m Node1 which returns "Member Node1 not in membership list 3/magma_tool members | grep Member | grep UP | wc -l (which returns 2) Is there an already known issue about this erroneous error message from clusvcadm ? Thanks Alain Moull? From simanhew at yahoo.com Mon Feb 19 14:49:05 2007 From: simanhew at yahoo.com (Siman Hew) Date: Mon, 19 Feb 2007 06:49:05 -0800 (PST) Subject: [Linux-cluster] field displayed in "cman_tool services" command Message-ID: <349312.66000.qm@web50107.mail.yahoo.com> Hello, When I try to check the services in a cluster, I am using "cman_tool services", what I get is something like this: hostname:/sbin#cman_tool services Service Name GID LID State Code Fence Domain: "default" 0 2 join S-1,1,3 [] DLM Lock Space: "Magma" 3 4 run - [1 2] User: "usrm::manager" 2 3 run S-10,200,0 [1 2] Some fields are quite obvious, like Service, Name, but some are not, like GID, LID, Code and square bracket under service(I guess it is node list). Is there anywhere explain what these fields mean? Thank you very much, Siman ____________________________________________________________________________________ No need to miss a message. Get email on-the-go with Yahoo! Mail for Mobile. Get started. 
http://mobile.yahoo.com/mail From rpeterso at redhat.com Mon Feb 19 14:59:33 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Mon, 19 Feb 2007 08:59:33 -0600 Subject: [Linux-cluster] field displayed in "cman_tool services" command In-Reply-To: <349312.66000.qm@web50107.mail.yahoo.com> References: <349312.66000.qm@web50107.mail.yahoo.com> Message-ID: <45D9BB55.4070108@redhat.com> Siman Hew wrote: > Hello, > > When I try to check the services in a cluster, I am > using "cman_tool services", what I get is something > like this: > hostname:/sbin#cman_tool services > Service Name GID > LID State Code > Fence Domain: "default" 0 > 2 join S-1,1,3 > [] > > DLM Lock Space: "Magma" 3 > 4 run - > [1 2] > > User: "usrm::manager" 2 > 3 run S-10,200,0 > [1 2] > > Some fields are quite obvious, like Service, Name, but > some are not, like GID, LID, Code and square bracket > under service(I guess it is node list). > Is there anywhere explain what these fields mean? > > Thank you very much, > > Siman > Hi Siman, I added a (somewhat long and over-imaginative) section to the FAQ on this a while back: http://sources.redhat.com/cluster/faq.html#cman_tool_services It's not meant to be complete; it's just an explanation of the ideas. Incidentally, GID and LID stand for "Global ID" and "Local ID" and they are numbers used to identify cman messages, and that's not really described in the faq. Regards, Bob Peterson Red Hat Cluster Suite From kpodesta at redbrick.dcu.ie Mon Feb 19 15:11:42 2007 From: kpodesta at redbrick.dcu.ie (Karl Podesta) Date: Mon, 19 Feb 2007 15:11:42 +0000 Subject: [Linux-cluster] DRAC support in RH3 Cluster Suite? In-Reply-To: <1171648445.3058.71.camel@localhost.localdomain> References: <20070216163838.GB24404@murphy.redbrick.dcu.ie> <1171645500.3058.55.camel@localhost.localdomain> <20070216171336.GC24404@murphy.redbrick.dcu.ie> <1171648445.3058.71.camel@localhost.localdomain> Message-ID: <20070219151142.GD24496@murphy.redbrick.dcu.ie> On Fri, Feb 16, 2007 at 12:54:04PM -0500, Lon Hohberger wrote: > > Sorry, Red Hat Cluster Suite for RHEL3... > > No agent for DRAC on RHCS3. > > -- Lon Thanks a lot! Am I right in saying that hardware/software watchdog timers are no longer supported (RH CS 3), and that customers must use a power switch to provide a supported configuration? (it says so on the Open Cluster FAQ, but I couldn't find official Red Hat word on it in docs/websites). Many thanks, Karl -- Karl Podesta Systems Engineer, Securelinx Ltd. (Ireland) http://www.securelinx.com/ From kpodesta at redbrick.dcu.ie Mon Feb 19 15:17:52 2007 From: kpodesta at redbrick.dcu.ie (Karl Podesta) Date: Mon, 19 Feb 2007 15:17:52 +0000 Subject: [Linux-cluster] DRAC support in RH3 Cluster Suite? In-Reply-To: <20070219151142.GD24496@murphy.redbrick.dcu.ie> References: <20070216163838.GB24404@murphy.redbrick.dcu.ie> <1171645500.3058.55.camel@localhost.localdomain> <20070216171336.GC24404@murphy.redbrick.dcu.ie> <1171648445.3058.71.camel@localhost.localdomain> <20070219151142.GD24496@murphy.redbrick.dcu.ie> Message-ID: <20070219151752.GE24496@murphy.redbrick.dcu.ie> On Mon, Feb 19, 2007 at 03:11:42PM +0000, Karl Podesta wrote: > Am I right in saying that hardware/software watchdog timers are no longer > supported (RH CS 3), and that customers must use a power switch to provide > a supported configuration? > > (it says so on the Open Cluster FAQ, but I couldn't find official Red Hat > word on it in docs/websites). ... 
and no sooner do I say that do I find this: http://www.redhat.com/docs/manuals/csgfs/browse/rh-cs-en-3/ch-hardware.html#S2-HARDWARE-PWRCTRL "Important - Use of a power controller is strongly recommended as part of a production cluster environment. Configuration of a cluster without a power controller is not supported." Sorry! Any recommendations on power controllers to use specifically with a 2-node 2850 Dell setup would be great, otherwise no worries. Thanks for the help, Karl -- Karl Podesta Systems Engineer, Securelinx Ltd. (Ireland) http://www.securelinx.com/ From teigland at redhat.com Mon Feb 19 15:24:56 2007 From: teigland at redhat.com (David Teigland) Date: Mon, 19 Feb 2007 09:24:56 -0600 Subject: [Linux-cluster] field displayed in "cman_tool services" command In-Reply-To: <349312.66000.qm@web50107.mail.yahoo.com> References: <349312.66000.qm@web50107.mail.yahoo.com> Message-ID: <20070219152456.GB4178@redhat.com> On Mon, Feb 19, 2007 at 06:49:05AM -0800, Siman Hew wrote: > Hello, > > When I try to check the services in a cluster, I am > using "cman_tool services", what I get is something > like this: > hostname:/sbin#cman_tool services > Service Name GID LID State Code > Fence Domain: "default" 0 2 join S-1,1,3 > [] > > DLM Lock Space: "Magma" 3 4 run > [1 2] > > User: "usrm::manager" 2 3 run S-10,200,0 > [1 2] > > Some fields are quite obvious, like Service, Name, but > some are not, like GID, LID, Code and square bracket > under service(I guess it is node list). > Is there anywhere explain what these fields mean? I thought I'd explained these in an email before, but I can't find it. The non-obvious information is not explained because it doesn't have much meaning if you're not looking at the code, and you don't need to use it unless you're debugging the code. But, in case someone does want to start digging through the code: - The numbers in [] are the nodeids of the nodes in that group - GID is the global id of the group, sg->global_id - LID is the local id of the group, sg->local_id - State is the SGST_ value in sg->state none=SGST_NONE join=SGST_JOIN run=SGST_RUN recover=SGST_RECOVER, the number after "recover" is sg->recover_state update=SGST_UEVENT - Code is a combination of information First letter S=SGFL_SEVENT, U=SGFL_UEVENT, N=SGFL_NEED_RECOVERY First number sg->sevent->se_state or sg->uevent.ue_state Second number sg->sevent->se_flags or sg->uevent.ue_flags Third number sg->sevent->se_reply_count or sg->uevent.ue_nodeid Now, on to your specific problem. By the looks of it I'd say that your machine is trying to fence someone. /var/log/messages will usually have some clear information about what's wrong. Source code debugging using the info above is probably the wrong place to start. Dave From isplist at logicore.net Mon Feb 19 17:03:53 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Mon, 19 Feb 2007 11:03:53 -0600 Subject: [Linux-cluster] LUN Masking is what I need... Message-ID: <200721911353.210641@leena> Anyone have experience setting up the qla2200's for LUN masking using central storage? I'm looking for pre-installation setup since I want to install the OS's on the storage itself. Then each will also have access to their GFS mounts as well. I posted a msg about having a RAID array which has a number of logical volumes on it. I need to allow individual servers to see their own volume on the storage device. Thanks in advance. 
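(What ends up working further down this thread is passing max_luns to scsi_mod once the OS is on disk; a post-install sketch of that is below. It does not solve the installer-time problem, and the masking itself is normally configured on the array or the FC switch. Paths assume RHEL4, and host0 is only a placeholder for the HBA's SCSI host number.)

# Post-install sketch only -- it will not make anaconda see the LUNs.
# Let the qla2200 scan past LUN 0/1 by raising max_luns on scsi_mod,
# then rebuild the initrd so the option applies at boot.
echo "options scsi_mod max_luns=256" >> /etc/modprobe.conf

# rebuild the initrd for the running kernel
mkinitrd -f /boot/initrd-`uname -r`.img `uname -r`

# or rescan the bus immediately on a 2.6 kernel (host0 is an example)
echo "- - -" > /sys/class/scsi_host/host0/scan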
Mike From isplist at logicore.net Mon Feb 19 19:19:33 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Mon, 19 Feb 2007 13:19:33 -0600 Subject: [Linux-cluster] LUN Masking is what I need... In-Reply-To: <200721911353.210641@leena> Message-ID: <2007219131933.239563@leena> Anyone know of a way of specifying the LUN when installing RHEL? For example, at the install prompt; linux ide=nodma lun=30 (if only such an option existed) Mike On Mon, 19 Feb 2007 11:03:53 -0600, isplist at logicore.net wrote: > Anyone have experience setting up the qla2200's for LUN masking using > central > > storage? I'm looking for pre-installation setup since I want to install the > OS's on the storage itself. Then each will also have access to their GFS > mounts as well. > > I posted a msg about having a RAID array which has a number of logical > volumes > on it. I need to allow individual servers to see their own volume on the > storage device. > > Thanks in advance. > > Mike > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From simanhew at yahoo.com Mon Feb 19 19:26:30 2007 From: simanhew at yahoo.com (Siman Hew) Date: Mon, 19 Feb 2007 11:26:30 -0800 (PST) Subject: [Linux-cluster] field displayed in "cman_tool services" command In-Reply-To: <20070219152456.GB4178@redhat.com> Message-ID: <603442.81598.qm@web50113.mail.yahoo.com> Thanks David & Robert. Actually the real problem I have is that I can not stop "active subsystems", so I use this command to check what services are running. For example, first of all, I run cman_tool status: node04:/etc/init.d# cman_tool status Protocol version: 5.0.1 Config version: 8 Cluster name: cluster01 Cluster ID: 730 Cluster Member: Yes Membership state: Cluster-Member Nodes: 3 Expected_votes: 4 Total_votes: 3 Quorum: 3 Active subsystems: 2 Node name: node04 Node ID: 1 Node addresses: 10.10.10.4 In which it tells us 2 active subsystems: hostname:/etc/init.d# cman_tool services Service Name GID LID State Code Fence Domain: "default" 5 2 join S-6,20,1 [1] User: "usrm::manager" 3 3 join S-8,40,2 [2 1] Using clustat -x to confirm: node04:/etc/init.d# clustat -x Timed out waiting for a response from Resource Group Manager Check rgmanager status: node04:/etc/init.d# rgmanager status clurgmgrd (pid 4755 4754) is running... Then try to stop it: node04:/etc/init.d# rgmanager stop Shutting down cluster Servie Manager... Waiting for service to stop: but looks it is waiting for the service to stop forever. I tried to stop fenced. node04:/etc/init.d#./fenced status fenced (pid 3913) is running... node04:/etc/init.d#./fenced stop Stopping fence domain: [ OK ] node04:/etc/init.d#./fenced status fenced (pid 3913) is running... Seems fenced did not stop fenced properly, although it told us [ OK ]. Any sequence I did is wrong? Just got confused ? Any hint is very appreciated. Siman --- David Teigland wrote: > On Mon, Feb 19, 2007 at 06:49:05AM -0800, Siman Hew > wrote: > > Hello, > > > > When I try to check the services in a cluster, I > am > > using "cman_tool services", what I get is > something > > like this: > ...... > Now, on to your specific problem. By the looks of > it I'd say that your > machine is trying to fence someone. > /var/log/messages will usually have > some clear information about what's wrong. Source > code debugging using > the info above is probably the wrong place to start. > > Dave > > ____________________________________________________________________________________ Do you Yahoo!? 
Everyone is raving about the all-new Yahoo! Mail beta. http://new.mail.yahoo.com From fajar at telkom.co.id Tue Feb 20 01:56:14 2007 From: fajar at telkom.co.id (Fajar A. Nugraha) Date: Tue, 20 Feb 2007 08:56:14 +0700 Subject: [Linux-cluster] LUN Masking is what I need... In-Reply-To: <2007219131933.239563@leena> References: <2007219131933.239563@leena> Message-ID: <45DA553E.8040908@telkom.co.id> isplist at logicore.net wrote: > Anyone know of a way of specifying the LUN when installing RHEL? For example, > at the install prompt; > > linux ide=nodma lun=30 (if only such an option existed) > > I'm using IBM HS20 with qlogic 2312. The HBA card has a BIOS configuration which allows to select which SAN to boot from, and which LUN on that SUN to boot. Then, RHEL (or Windows) will install on that location. -- Fajar From isplist at logicore.net Tue Feb 20 02:30:09 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Mon, 19 Feb 2007 20:30:09 -0600 Subject: [Linux-cluster] LUN Masking is what I need... In-Reply-To: <45DA553E.8040908@telkom.co.id> Message-ID: <200721920309.223662@leena> Hi and thanks for the reply. > I'm using IBM HS20 with qlogic 2312. The HBA card has a BIOS > configuration which allows to select which SAN to boot from, and which > LUN on that SUN to boot. Then, RHEL (or Windows) will install on that > location. I'm using the 2200's in these blades but the thing is, they do see the storage, it's Linux that doesn't seem to see anything but the two volumes on the controllers, LUNS 0 and 1 only. I've told the card to use the specific logical volume that I can see when using the FASTutil but Linux never see's that. Mike From fajar at telkom.co.id Tue Feb 20 02:46:41 2007 From: fajar at telkom.co.id (Fajar A. Nugraha) Date: Tue, 20 Feb 2007 09:46:41 +0700 Subject: [Linux-cluster] LUN Masking is what I need... In-Reply-To: <200721920309.223662@leena> References: <200721920309.223662@leena> Message-ID: <45DA6111.5070903@telkom.co.id> isplist at logicore.net wrote: > I'm using the 2200's in these blades but the thing is, they do see the > storage, it's Linux that doesn't seem to see anything but the two volumes on > the controllers, LUNS 0 and 1 only. > > I've told the card to use the specific logical volume that I can see when > using the FASTutil but Linux never see's that. > > Perhaps max_luns=256 will do the trick. See http://publib.boulder.ibm.com/infocenter/dsichelp/ds8000ic/index.jsp?topic=/com.ibm.storage.ssic.help.doc/f2c_linuxlunconfig_2hsaga.html -- Fajar From isplist at logicore.net Tue Feb 20 02:51:10 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Mon, 19 Feb 2007 20:51:10 -0600 Subject: [Linux-cluster] LUN Masking is what I need... In-Reply-To: <45DA6111.5070903@telkom.co.id> Message-ID: <2007219205110.927442@leena> > Perhaps max_luns=256 will do the trick. Good thinking but I tried this also. What I need is to find some method by which to tell Linux, on bootup install, what LUN to use. For example, if I could tell it to use LUN 0,30 for example, that would work. I've not found any information on how to pass this information to the installer. Mike From matthew at arts.usyd.edu.au Tue Feb 20 03:07:55 2007 From: matthew at arts.usyd.edu.au (Matthew Geier) Date: Tue, 20 Feb 2007 14:07:55 +1100 Subject: [Linux-cluster] rhn kernal update Message-ID: <45DA660B.2000807@arts.usyd.edu.au> I notice there is an 'important' kernel update for my RHEL 4 systems. However there isn't corresponding cluster updates. 
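One low-tech way to see whether the cluster modules will still load under a new kernel is to check which kernel versions the installed module packages were built for before rebooting; a rough sketch follows. The package names assume the -smp variants, and the kernel version string is only an example.

#!/bin/sh
# Compare a pending kernel against the kernels the cluster kmod
# packages were built for.  Package names and version are examples.

NEWKERNEL=2.6.9-42.0.8.ELsmp

for pkg in cman-kernel-smp dlm-kernel-smp GFS-kernel-smp; do
    echo "== $pkg =="
    rpm -q $pkg
    # each kmod package installs its modules under /lib/modules/<kver>/
    rpm -ql $pkg | grep '^/lib/modules/' | cut -d/ -f4 | sort -u
done
echo "kernel about to be installed: $NEWKERNEL"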
After a bad experience with the last kernel update that left the cluster broken until the kernel update was backed out, due to the kernel modules for cluster suite not having corresponding updates i'm wary of installing this update in case i'm left with a non-operational cluster again. Whats the deal ? I have to wait for cluster updates before applying the kernel update ? From ridabarbiee at yahoo.com Tue Feb 20 06:44:53 2007 From: ridabarbiee at yahoo.com (hiba salma) Date: Mon, 19 Feb 2007 22:44:53 -0800 (PST) Subject: [Linux-cluster] Booting from Local Hard drive Message-ID: <178682.33427.qm@web39806.mail.mud.yahoo.com> Hello... i have a Linux Cluster of 64-nodes with OSCAR version: 4.1 Linux Distro: Red hat 9 There is a problem which i'm trying to solve for almost a week but still no results. When i boot a client from the network, after finishing successful installation from the network, it reboots. After the reboot it tries to boot from the Local Hard drive cuz i changed the /etc/systemimager/systemimager.conf and set NET_BOOT_DEFAULT = LOCAL. The netbootmond also restart after the item changed. But it hangs there! n do nothing after that i.e after the message Booting from Local Hard drive I dun understand why this happens. i followed some instructions also from this site www.wiki.sisuite.org/networkboot But all invain. Can any one suggest me some solutions to this problem? PS:I posted this problem on SIsuite users mailing lists but i thought to post it here too,if anyone from here know the solution. -Hiba ____________________________________________________________________________________ Need a quick answer? Get one in minutes from people who know. Ask your question on www.Answers.yahoo.com From ace at sannes.org Tue Feb 20 11:07:34 2007 From: ace at sannes.org (=?iso-8859-1?Q?Asbj=F8rn_Sannes?=) Date: Tue, 20 Feb 2007 12:07:34 +0100 (CET) Subject: [Linux-cluster] gfs2 mount compile issue Message-ID: <3431.193.157.189.207.1171969654.squirrel@kunder.interhost.no> Trying to compile the cvs cluster suite (from CVS) I get problems: make[2]: Leaving directory `/root/cluster/cluster/gfs2/mkfs' make -C mount all make[2]: Entering directory `/root/cluster/cluster/gfs2/mount' gcc -Wall -I../include -I../config -I//usr/include/cluster -I//usr/include -I../../gfs-kernel/src/gfs -I//usr/include -O2 -DHELPER_PROGRAM -D_FILE_OFFSET_BITS=64 -DGFS2_RELEASE_NAME=\"DEVEL.1171969058\" -D_GNU_SOURCE -I../include -I../config -I//usr/include/cluster -I//usr/include -I../../gfs-kernel/src/gfs -I//usr/include -c -o mount.gfs2.o mount.gfs2.c gcc -Wall -I../include -I../config -I//usr/include/cluster -I//usr/include -I../../gfs-kernel/src/gfs -I//usr/include -O2 -DHELPER_PROGRAM -D_FILE_OFFSET_BITS=64 -DGFS2_RELEASE_NAME=\"DEVEL.1171969058\" -D_GNU_SOURCE -I../include -I../config -I//usr/include/cluster -I//usr/include -I../../gfs-kernel/src/gfs -I//usr/include -c -o ondisk1.o ondisk1.c gcc -Wall -I../include -I../config -I//usr/include/cluster -I//usr/include -I../../gfs-kernel/src/gfs -I//usr/include -O2 -DHELPER_PROGRAM -D_FILE_OFFSET_BITS=64 -DGFS2_RELEASE_NAME=\"DEVEL.1171969058\" -D_GNU_SOURCE -I../include -I../config -I//usr/include/cluster -I//usr/include -I../../gfs-kernel/src/gfs -I//usr/include -c -o ondisk2.o ondisk2.c In file included from /usr/include/asm/types.h:5, from /usr/include/linux/types.h:7, from /usr/include/linux/gfs2_ondisk.h:13, from ondisk2.c:33: /usr/include/asm-x86_64/types.h:23: error: conflicting types for 'uint64_t' /usr/include/gentoo-multilib/amd64/stdint.h:56: error: 
previous declaration of 'uint64_t' was here make[2]: *** [ondisk2.o] Error 1 make[2]: Leaving directory `/root/cluster/cluster/gfs2/mount' make[1]: *** [tag_mount] Error 2 make[1]: Leaving directory `/root/cluster/cluster/gfs2' make: *** [all] Error 2 Adding #define _LINUX_TYPES_H to gfs2/mount/ondisk2.c makes it compile, but I don't know if that is correct. Greetings, Asbj?rn Sannes From markryde at gmail.com Tue Feb 20 15:53:23 2007 From: markryde at gmail.com (Mark Ryden) Date: Tue, 20 Feb 2007 17:53:23 +0200 Subject: [Linux-cluster] gfs_mkfs on diskOnKey - what is the minimum required size ? Message-ID: Hello, I want to install GFS1 on diskOnKey; What is the minimum size required for the gfs_mkfs to succeed ? When I try gfs_mkfs -p lock_nolock -j 1 /dev/sda1 I get : gfs_mkfs: Partition too small for number/size of journals And "man gfs_mkfs" or google could not help much. Regards, Mark From natecars at natecarlson.com Tue Feb 20 16:14:01 2007 From: natecars at natecarlson.com (Nate Carlson) Date: Tue, 20 Feb 2007 10:14:01 -0600 (CST) Subject: [Linux-cluster] LUN Masking is what I need... In-Reply-To: <200721911353.210641@leena> References: <200721911353.210641@leena> Message-ID: On Mon, 19 Feb 2007, isplist at logicore.net wrote: > Anyone have experience setting up the qla2200's for LUN masking using > central storage? I'm looking for pre-installation setup since I want to > install the OS's on the storage itself. Then each will also have access > to their GFS mounts as well. > > I posted a msg about having a RAID array which has a number of logical > volumes on it. I need to allow individual servers to see their own > volume on the storage device. Masking is generally done on the storage device itself or on the FC switches.. ------------------------------------------------------------------------ | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com | | depriving some poor village of its idiot since 1981 | ------------------------------------------------------------------------ From srramasw at cisco.com Tue Feb 20 16:31:28 2007 From: srramasw at cisco.com (Sridhar Ramaswamy (srramasw)) Date: Tue, 20 Feb 2007 08:31:28 -0800 Subject: [Linux-cluster] gfs_mkfs on diskOnKey - what is the minimumrequired size ? In-Reply-To: Message-ID: I was in similar situation recently. How big is your diskOnkey? The default journal size is 128MB. This is the amount of diskspace kept aside for journaling per journal count/node. Try bringing it down to the minimum 32MB. gfs_mkfs -p lock_nolock -j 1 -J 32 /dev/sda1 thanks, Sridhar Cisco System, Inc > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Mark Ryden > Sent: Tuesday, February 20, 2007 7:53 AM > To: linux-cluster at redhat.com > Subject: [Linux-cluster] gfs_mkfs on diskOnKey - what is the > minimumrequired size ? > > Hello, > > I want to install GFS1 on diskOnKey; What is the minimum size required > for the gfs_mkfs to succeed ? > When I try > gfs_mkfs -p lock_nolock -j 1 /dev/sda1 > I get : > gfs_mkfs: Partition too small for number/size of journals > > And "man gfs_mkfs" or google could not help much. 
> > Regards, > Mark > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From lhh at redhat.com Tue Feb 20 18:07:10 2007 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 20 Feb 2007 13:07:10 -0500 Subject: [Linux-cluster] Service stuck in 'stopping' state Message-ID: <1171994830.5216.17.camel@asuka.boston.devel.redhat.com> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228823 If anyone has an easy way to make this occur, please comment on the above bugzilla. I have a fix for the symptom, but I would like to understand when/why it happens so I can fix the *cause*. -- Lon From frederik.ferner at diamond.ac.uk Wed Feb 21 11:10:51 2007 From: frederik.ferner at diamond.ac.uk (Frederik Ferner) Date: Wed, 21 Feb 2007 11:10:51 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <1171539363.24507.210.camel@pc029.sc.diamond.ac.uk> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> <45D339CF.7070408@redhat.com> <1171474578.24507.148.camel@pc029.sc.diamond.ac.uk> <45D422B7.30506@redhat.com> <1171539363.24507.210.camel@pc029.sc.diamond.ac.uk> Message-ID: <1172056251.18210.135.camel@pc029.sc.diamond.ac.uk> Hi Patrick, All, let me give you an update on that problem. On Thu, 2007-02-15 at 11:36 +0000, Frederik Ferner wrote: > On Thu, 2007-02-15 at 09:07 +0000, Patrick Caulfield wrote: [node not joining cluster] > > It would be interesting to know - though you may not want to do it - if the > > problem persists when the still-running node is rebooted. > > Obviously not at the moment, but I have a maintenance window upcoming > soon where I might be able to do that. I'll keep you informed about the > result. Today I had the possibility to reboot the node that was still quorate (i04-storage1) while the other node (i04-storage2) was still trying to join. When i04-storage1 came to the stage where the cluster services are started, both nodes joined the cluster at the same time. With this running cluster, I tried to reproduce the problem by fencing one node but after rebooting this immediately joined the cluster. Regards, Frederik -- Frederik Ferner Systems Administrator Phone: +44 (0)1235-778624 Diamond Light Source Fax: +44 (0)1235-778468 From pcaulfie at redhat.com Wed Feb 21 11:26:22 2007 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Wed, 21 Feb 2007 11:26:22 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <1172056251.18210.135.camel@pc029.sc.diamond.ac.uk> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> <45D339CF.7070408@redhat.com> <1171474578.24507.148.camel@pc029.sc.diamond.ac.uk> <45D422B7.30506@redhat.com> <1171539363.24507.210.camel@pc029.sc.diamond.ac.uk> <1172056251.18210.135.camel@pc029.sc.diamond.ac.uk> Message-ID: <45DC2C5E.1040808@redhat.com> Frederik Ferner wrote: > Hi Patrick, All, > > let me give you an update on that problem. > > On Thu, 2007-02-15 at 11:36 +0000, Frederik Ferner wrote: >> On Thu, 2007-02-15 at 09:07 +0000, Patrick Caulfield wrote: > [node not joining cluster] >>> It would be interesting to know - though you may not want to do it - if the >>> problem persists when the still-running node is rebooted. >> Obviously not at the moment, but I have a maintenance window upcoming >> soon where I might be able to do that. 
I'll keep you informed about the >> result. > > Today I had the possibility to reboot the node that was still quorate > (i04-storage1) while the other node (i04-storage2) was still trying to > join. > When i04-storage1 came to the stage where the cluster services are > started, both nodes joined the cluster at the same time. > > With this running cluster, I tried to reproduce the problem by fencing > one node but after rebooting this immediately joined the cluster. Interesting. it sounds similar to a cman bug that was introduced in U3, but it was fixed in U4 - which you said you were running. -- patrick From frederik.ferner at diamond.ac.uk Wed Feb 21 12:07:33 2007 From: frederik.ferner at diamond.ac.uk (Frederik Ferner) Date: Wed, 21 Feb 2007 12:07:33 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <45DC2C5E.1040808@redhat.com> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> <45D339CF.7070408@redhat.com> <1171474578.24507.148.camel@pc029.sc.diamond.ac.uk> <45D422B7.30506@redhat.com> <1171539363.24507.210.camel@pc029.sc.diamond.ac.uk> <1172056251.18210.135.camel@pc029.sc.diamond.ac.uk> <45DC2C5E.1040808@redhat.com> Message-ID: <1172059653.18210.166.camel@pc029.sc.diamond.ac.uk> On Wed, 2007-02-21 at 11:26 +0000, Patrick Caulfield wrote: > Frederik Ferner wrote: > > Hi Patrick, All, > > > > let me give you an update on that problem. > > > > On Thu, 2007-02-15 at 11:36 +0000, Frederik Ferner wrote: > >> On Thu, 2007-02-15 at 09:07 +0000, Patrick Caulfield wrote: > > [node not joining cluster] > >>> It would be interesting to know - though you may not want to do it - if the > >>> problem persists when the still-running node is rebooted. > >> Obviously not at the moment, but I have a maintenance window upcoming > >> soon where I might be able to do that. I'll keep you informed about the > >> result. > > > > Today I had the possibility to reboot the node that was still quorate > > (i04-storage1) while the other node (i04-storage2) was still trying to > > join. > > When i04-storage1 came to the stage where the cluster services are > > started, both nodes joined the cluster at the same time. > > > > With this running cluster, I tried to reproduce the problem by fencing > > one node but after rebooting this immediately joined the cluster. > > Interesting. it sounds similar to a cman bug that was introduced in U3, but it > was fixed in U4 - which you said you were running. Let's verify that then. I have the following RHCS related packages installed: ccs-1.0.7-0 rgmanager-1.9.54-1 cman-1.0.11-0 fence-1.32.25-1 cman-kernel-smp-2.6.9-45.8 dlm-kernel-smp-2.6.9-44.3 dlm-1.0.1-1 /etc/redhat-release contains: Red Hat Enterprise Linux AS release 4 (Nahant Update 4) Thanks, Frederik -- Frederik Ferner Systems Administrator Phone: +44 (0)1235-778624 Diamond Light Source Fax: +44 (0)1235-778468 From Alain.Moulle at bull.net Wed Feb 21 13:21:39 2007 From: Alain.Moulle at bull.net (Alain Moulle) Date: Wed, 21 Feb 2007 14:21:39 +0100 Subject: [Linux-cluster] CS4 Update 4 / Oops in dlm module Message-ID: <45DC4763.7020109@bull.net> Hi CS4 Update 4 : I got a Oops in dlm module following an acces to an invalid address in function send_cluster_request. Is there an already know bug about this ? and eventual fix ? Thanks Alain Moull? 
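The follow-ups to this ask for the oops text and anything relevant from /var/log/messages; a rough collection script is sketched below. Package names and paths assume a RHEL4 node, and the output file location is arbitrary.

#!/bin/sh
# Gather what a dlm oops report usually needs: the oops itself, nearby
# log lines, and the exact kernel/module versions.

OUT=/tmp/dlm-oops-report.txt
{
    date
    uname -r
    rpm -q kernel-smp cman-kernel-smp dlm-kernel-smp dlm cman
    echo "---- oops and dlm lines from /var/log/messages ----"
    grep -A 40 -i 'oops' /var/log/messages | tail -n 200
    grep -i 'dlm' /var/log/messages | tail -n 100
    echo "---- tail of dmesg ----"
    dmesg | tail -n 100
} > $OUT
echo "wrote $OUT"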
From pcaulfie at redhat.com Wed Feb 21 13:29:24 2007 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Wed, 21 Feb 2007 13:29:24 +0000 Subject: [Linux-cluster] node fails to join cluster after it was fenced In-Reply-To: <1172059653.18210.166.camel@pc029.sc.diamond.ac.uk> References: <1171458304.24507.91.camel@pc029.sc.diamond.ac.uk> <45D31766.3080908@redhat.com> <1171469028.24507.109.camel@pc029.sc.diamond.ac.uk> <45D339CF.7070408@redhat.com> <1171474578.24507.148.camel@pc029.sc.diamond.ac.uk> <45D422B7.30506@redhat.com> <1171539363.24507.210.camel@pc029.sc.diamond.ac.uk> <1172056251.18210.135.camel@pc029.sc.diamond.ac.uk> <45DC2C5E.1040808@redhat.com> <1172059653.18210.166.camel@pc029.sc.diamond.ac.uk> Message-ID: <45DC4934.4040504@redhat.com> Frederik Ferner wrote: > On Wed, 2007-02-21 at 11:26 +0000, Patrick Caulfield wrote: >> Frederik Ferner wrote: >>> Hi Patrick, All, >>> >>> let me give you an update on that problem. >>> >>> On Thu, 2007-02-15 at 11:36 +0000, Frederik Ferner wrote: >>>> On Thu, 2007-02-15 at 09:07 +0000, Patrick Caulfield wrote: >>> [node not joining cluster] >>>>> It would be interesting to know - though you may not want to do it - if the >>>>> problem persists when the still-running node is rebooted. >>>> Obviously not at the moment, but I have a maintenance window upcoming >>>> soon where I might be able to do that. I'll keep you informed about the >>>> result. >>> Today I had the possibility to reboot the node that was still quorate >>> (i04-storage1) while the other node (i04-storage2) was still trying to >>> join. >>> When i04-storage1 came to the stage where the cluster services are >>> started, both nodes joined the cluster at the same time. >>> >>> With this running cluster, I tried to reproduce the problem by fencing >>> one node but after rebooting this immediately joined the cluster. >> Interesting. it sounds similar to a cman bug that was introduced in U3, but it >> was fixed in U4 - which you said you were running. > > Let's verify that then. I have the following RHCS related packages > installed: > ccs-1.0.7-0 > rgmanager-1.9.54-1 > cman-1.0.11-0 > fence-1.32.25-1 > cman-kernel-smp-2.6.9-45.8 > dlm-kernel-smp-2.6.9-44.3 > dlm-1.0.1-1 Yes, those look fine. -- patrick From teigland at redhat.com Wed Feb 21 14:54:10 2007 From: teigland at redhat.com (David Teigland) Date: Wed, 21 Feb 2007 08:54:10 -0600 Subject: [Linux-cluster] CS4 Update 4 / Oops in dlm module In-Reply-To: <45DC4763.7020109@bull.net> References: <45DC4763.7020109@bull.net> Message-ID: <20070221145410.GA17281@redhat.com> On Wed, Feb 21, 2007 at 02:21:39PM +0100, Alain Moulle wrote: > Hi > > CS4 Update 4 : > I got a Oops in dlm module following an acces to an invalid address in > function send_cluster_request. Is there an already know bug about this ? > and eventual fix ? That doesn't sound familiar, send the oops and any errors from /var/log/messages if you can. Dave From lhh at redhat.com Wed Feb 21 19:06:07 2007 From: lhh at redhat.com (Lon Hohberger) Date: Wed, 21 Feb 2007 14:06:07 -0500 Subject: [Linux-cluster] CS4 Update 4 / Oops in dlm module In-Reply-To: <45DC4763.7020109@bull.net> References: <45DC4763.7020109@bull.net> Message-ID: <1172084767.5216.78.camel@asuka.boston.devel.redhat.com> On Wed, 2007-02-21 at 14:21 +0100, Alain Moulle wrote: > Hi > > CS4 Update 4 : > I got a Oops in dlm module following an acces to an invalid address in function > send_cluster_request. Is there an already know bug about this ? > and eventual fix ? When does this happen occur - i.e. 
every time, or after a long run-time, etc.? -- Lon From rpeterso at redhat.com Wed Feb 21 21:55:16 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Wed, 21 Feb 2007 15:55:16 -0600 Subject: [Linux-cluster] gfs1 and 2.6.20 In-Reply-To: <45D6EB01.7080906@sannes.org> References: <45D6E7A4.9020106@sannes.org> <45D6EB01.7080906@sannes.org> Message-ID: <45DCBFC4.2050606@redhat.com> Asbj?rn Sannes wrote: > Asbj?rn Sannes wrote: > >> I have been trying to use the STABLE branch of the cluster suite with >> vanilla 2.6.20 kernel, and everything seemed at first to work, my >> problem can be reproduced by this: >> >> mount a gfs filesystem anywhere.. >> do a sync, this sync will now just hang there .. >> >> If I unmount the filesystem in another terminal, the sync command will >> end.. >> >> .. dumping the kernel stack of sync shows that it is in __sync_inodes on >> __down_read, looking in the code it seems that is waiting for the >> s_umount semaphore (in the superblock).. >> >> Just tell me if you need any more information or if this is not the >> correct place for this.. >> >> > Here is the trace for sync (while hanging) .. > > sync D ffffffff8062eb80 0 17843 15013 > (NOTLB) > ffff810071689e98 0000000000000082 ffff810071689eb8 ffffffff8024d210 > 0000000071689e18 0000000000000000 0000000100000000 ffff81007b670fe0 > ffff81007b6711b8 00000000000004c8 ffff810037c84770 0000000000000001 > Call Trace: > [] wait_on_page_writeback_range+0xed/0x140 > [] __down_read+0x90/0xaa > [] down_read+0x16/0x1a > [] __sync_inodes+0x5f/0xbb > [] sync_inodes+0x16/0x2f > [] do_sync+0x17/0x60 > [] sys_sync+0xe/0x12 > [] system_call+0x7e/0x83 > > Greetings, > Asbj?rn Sannes > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > Hi Asbj?rn, I'll look into this as soon as I can find the time... Regards, Bob Peterson Red Hat Cluster Suite From rpeterso at redhat.com Wed Feb 21 22:17:46 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Wed, 21 Feb 2007 16:17:46 -0600 Subject: [Linux-cluster] Question about memory usage In-Reply-To: <45CB484A.449E.00C8.0@health-first.org> References: <45CB484A.449E.00C8.0@health-first.org> Message-ID: <45DCC50A.30700@redhat.com> Danny Wall wrote: > Are there any resources that can help determine how much memory I need > to run a SAMBA cluster with several terabytes GFS storage? > > I have three RHCS clusters on Red Hat 4 U4, with two nodes in each > cluster. Both servers in a cluster have the same SAN storage mounted, > but only one node accesses the storage at a time (mostly). The storage > is shared out over several SAMBA shares, with several users accessing > the data at a time, via an automated proxy user, so technically, only a > few user connections are made directly to the clusters, but a few > hundred GB of data is written and accessed daily. Almost all data is > write once, read many, and the files range from a few KB to several > hundred MB. > > I recently noticed that when running 'free -m', the servers all run > very low on memory. If I remove one node from the cluster by stopping > rgmanager, gfs, clvmd, fenced, cman, and ccsd, the memory get released > until I join it to the cluster again. I could stop them one at a time to > make sure it is GFS, but I assume much of the RAM is getting used for > caching and for GFS needs. I do not mind upgrading the RAM, but I would > like to know if there is a good way to size the servers properly for > this type of usage. 
> > The servers are Dual Proc 3.6Ghz, with 2GB RAM each. They have U320 15K > SCSI drives, and Emulex Fibre Channel to the SAN. Everything else > appears to run fine, but one server ran out of memory, and I see others > that range between 16MB and 250MB free RAM. > > Thanks, > Danny > Hi Danny, I'm not sure what you mean by a Samba Cluster, but I can tell you this: There's a group working on clustered samba, but it's not quite ready yet. There's no hard fast rule regarding memory usage. It seems to me that the 2G of RAM on your nodes should be enough to run a respectable cluster. If you're having a memory problem with GFS, you may have hit this bugzilla bug: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=214239 The fix won't be generally available until RHEL4U5. I think it's in the CVS repository now if you built your environment from source. It's been tested at several sites now, and seems to help a great deal. Regards, Bob Peterson Red Hat Cluster Suite From isplist at logicore.net Thu Feb 22 05:23:03 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Wed, 21 Feb 2007 23:23:03 -0600 Subject: [Linux-cluster] Passing SCSI options at installer Message-ID: <200722123233.408210@leena> I need to install onto a volume on a RAID device. The catch is, I need to tell the installer about the volume so that I can get at it. I know this can be done once the OS is installed but since I need to install on the volume, well, you see the problem :). Anyone know if there is some way of passing these options (or the equivalent) to the Linux installer. I can use either RHEL4 or CentOS 4.4. scsi_mod max_luns=256 dev_flags="INLINE:TF200 5_23078:0x200" From beres.laszlo at sys-admin.hu Thu Feb 22 13:54:05 2007 From: beres.laszlo at sys-admin.hu (BERES Laszlo) Date: Thu, 22 Feb 2007 14:54:05 +0100 Subject: [Linux-cluster] lm_dlm_cancel Message-ID: <45DDA07D.2020208@sys-admin.hu> Hi all, we had an interesting situation last night. One of our clients called us that their whole GFS (with 5 nodes) is stopped somehow. More exactly the GFS-based services couldn't write onto the partitions, but the root could touch files, make dirs there. Four of the five nodes were rebooted, but nothing happened until the last one was shooted down, at this time everything worked normally again. That's all what we found on the fifth node : Feb 21 14:00:11 logserver kernel: lock_dlm: lm_dlm_cancel 2,18 flags 80 Feb 21 14:00:11 logserver kernel: lock_dlm: lm_dlm_cancel rv 0 2,18 flags 40080 Feb 21 14:00:11 logserver kernel: dlm: gfs1: cancel reply ret -22 Feb 21 14:00:11 logserver kernel: lock_dlm: ast sb_status -22 2,18 flags 40000 If you have any idea about this, please give us a hint. -- B?RES L?szl? RHCE, RHCX senior IT engineer, trainer From ace at sannes.org Thu Feb 22 14:33:22 2007 From: ace at sannes.org (=?ISO-8859-1?Q?Asbj=F8rn_Sannes?=) Date: Thu, 22 Feb 2007 15:33:22 +0100 Subject: [Linux-cluster] gfs1 and 2.6.20 In-Reply-To: <45DCBFC4.2050606@redhat.com> References: <45D6E7A4.9020106@sannes.org> <45D6EB01.7080906@sannes.org> <45DCBFC4.2050606@redhat.com> Message-ID: <45DDA9B2.8070808@sannes.org> Robert Peterson wrote: > Asbj?rn Sannes wrote: >> Asbj?rn Sannes wrote: >> >>> I have been trying to use the STABLE branch of the cluster suite with >>> vanilla 2.6.20 kernel, and everything seemed at first to work, my >>> problem can be reproduced by this: >>> >>> mount a gfs filesystem anywhere.. >>> do a sync, this sync will now just hang there .. 
>>> >>> If I unmount the filesystem in another terminal, the sync command will >>> end.. >>> >>> .. dumping the kernel stack of sync shows that it is in >>> __sync_inodes on >>> __down_read, looking in the code it seems that is waiting for the >>> s_umount semaphore (in the superblock).. >>> >>> Just tell me if you need any more information or if this is not the >>> correct place for this.. >>> >> Here is the trace for sync (while hanging) .. >> >> sync D ffffffff8062eb80 0 17843 >> 15013 (NOTLB) >> ffff810071689e98 0000000000000082 ffff810071689eb8 ffffffff8024d210 >> 0000000071689e18 0000000000000000 0000000100000000 ffff81007b670fe0 >> ffff81007b6711b8 00000000000004c8 ffff810037c84770 0000000000000001 >> Call Trace: >> [] wait_on_page_writeback_range+0xed/0x140 >> [] __down_read+0x90/0xaa >> [] down_read+0x16/0x1a >> [] __sync_inodes+0x5f/0xbb >> [] sync_inodes+0x16/0x2f >> [] do_sync+0x17/0x60 >> [] sys_sync+0xe/0x12 >> [] system_call+0x7e/0x83 >> >> Greetings, >> Asbj?rn Sannes >> > Hi Asbj?rn, > > I'll look into this as soon as I can find the time... > Great! I tried to figure out why the s_umount semaphore was not upped by comparing to other filesystems, but the functions seems almost identical .. so I cheated and looked what had changed lately (from your patch): diff -w -u -p -p -u -r1.1.2.1.4.1.2.1 diaper.c --- gfs-kernel/src/gfs/diaper.c 26 Jun 2006 21:53:51 -0000 1.1.2.1.4.1.2.1 +++ gfs-kernel/src/gfs/diaper.c 2 Feb 2007 22:28:41 -0000 @@ -50,7 +50,7 @@ static int diaper_major = 0; static LIST_HEAD(diaper_list); static spinlock_t diaper_lock; static DEFINE_IDR(diaper_idr); -kmem_cache_t *diaper_slab; +struct kmem_cache *diaper_slab; /** * diaper_open - @@ -232,9 +232,9 @@ get_dummy_sb(struct diaper_holder *dh) struct inode *inode; int error; - mutex_lock(&real->bd_mount_mutex); + down(&real->bd_mount_sem); sb = sget(&gfs_fs_type, gfs_test_bdev_super, gfs_set_bdev_super, real); - mutex_unlock(&real->bd_mount_mutex); + up(&real->bd_mount_sem); if (IS_ERR(sb)) return PTR_ERR(sb); @@ -252,7 +252,6 @@ get_dummy_sb(struct diaper_holder *dh) sb->s_op = &gfs_dummy_sops; sb->s_fs_info = dh; - up_write(&sb->s_umount); module_put(gfs_fs_type.owner); dh->dh_dummy_sb = sb; @@ -263,7 +262,6 @@ get_dummy_sb(struct diaper_holder *dh) iput(inode); fail: - up_write(&sb->s_umount); deactivate_super(sb); return error; } And undid those up_write ones (added them back in), which helped, I don't know if it safe though, and maybe you could shed some lights on why they were removed? (I didn't find any changes that would do up_write on s_umount later.. > Regards, > > Bob Peterson > Red Hat Cluster Suite > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > Mvh, Asbj?rn Sannes From teigland at redhat.com Thu Feb 22 14:55:45 2007 From: teigland at redhat.com (David Teigland) Date: Thu, 22 Feb 2007 08:55:45 -0600 Subject: [Linux-cluster] lm_dlm_cancel In-Reply-To: <45DDA07D.2020208@sys-admin.hu> References: <45DDA07D.2020208@sys-admin.hu> Message-ID: <20070222145545.GA19768@redhat.com> On Thu, Feb 22, 2007 at 02:54:05PM +0100, BERES Laszlo wrote: > Hi all, > > we had an interesting situation last night. One of our clients called us > that their whole GFS (with 5 nodes) is stopped somehow. More exactly the > GFS-based services couldn't write onto the partitions, but the root > could touch files, make dirs there. 
Four of the five nodes were > rebooted, but nothing happened until the last one was shooted down, at > this time everything worked normally again. > > That's all what we found on the fifth node : > > Feb 21 14:00:11 logserver kernel: lock_dlm: lm_dlm_cancel 2,18 flags 80 > Feb 21 14:00:11 logserver kernel: lock_dlm: lm_dlm_cancel rv 0 2,18 > flags 40080 > Feb 21 14:00:11 logserver kernel: dlm: gfs1: cancel reply ret -22 > > Feb 21 14:00:11 logserver kernel: lock_dlm: ast sb_status -22 2,18 flags > 40000 > > If you have any idea about this, please give us a hint. That says something has gone wrong, but it's not enough info to say what. Dave From lhh at redhat.com Thu Feb 22 15:18:54 2007 From: lhh at redhat.com (Lon Hohberger) Date: Thu, 22 Feb 2007 10:18:54 -0500 Subject: [Linux-cluster] Question about memory usage In-Reply-To: <45DCC50A.30700@redhat.com> References: <45CB484A.449E.00C8.0@health-first.org> <45DCC50A.30700@redhat.com> Message-ID: <1172157534.30230.14.camel@asuka.boston.devel.redhat.com> There's also a lock leak that rgmanager causes; fixes are here: http://people.redhat.com/lhh/rgmanager-1.9.54-3.228823test.src.rpm http://people.redhat.com/lhh/rgmanager-1.9.54-3.228823test.i386.rpm http://people.redhat.com/lhh/rgmanager-1.9.54-3.228823test.x86_64.rpm The above package fixes: #228823 - Add ability to disable services which get stuck in the 'stopping' state. #213312 - Failed assertion in rg_thread.c causing crash. #212634/218112 - DLM memory leak in some conditions caused by rgmanager The above RPMs are just ones I built. They're not to be considered official (or supported) at this point, but feedback is certainly appreciated. If you look at /proc/slabinfo and see an ever-growing count for dlm_lkb and are using rgmanager, chances are you've hit #212634/218112. All of the fixes (and a few more) will be in the next linux-cluster release / Red Hat update (and should trickle down to other distributions). -- Lon From ace at sannes.org Thu Feb 22 15:54:14 2007 From: ace at sannes.org (=?ISO-8859-1?Q?Asbj=F8rn_Sannes?=) Date: Thu, 22 Feb 2007 16:54:14 +0100 Subject: [Linux-cluster] gfs1 and 2.6.20 In-Reply-To: <45DDA9B2.8070808@sannes.org> References: <45D6E7A4.9020106@sannes.org> <45D6EB01.7080906@sannes.org> <45DCBFC4.2050606@redhat.com> <45DDA9B2.8070808@sannes.org> Message-ID: <45DDBCA6.9000608@sannes.org> Asbj?rn Sannes wrote: > Robert Peterson wrote: > >> Asbj?rn Sannes wrote: >> >>> Asbj?rn Sannes wrote: >>> >>> >>>> I have been trying to use the STABLE branch of the cluster suite with >>>> vanilla 2.6.20 kernel, and everything seemed at first to work, my >>>> problem can be reproduced by this: >>>> >>>> mount a gfs filesystem anywhere.. >>>> do a sync, this sync will now just hang there .. >>>> >>>> If I unmount the filesystem in another terminal, the sync command will >>>> end.. >>>> >>>> .. dumping the kernel stack of sync shows that it is in >>>> __sync_inodes on >>>> __down_read, looking in the code it seems that is waiting for the >>>> s_umount semaphore (in the superblock).. >>>> >>>> Just tell me if you need any more information or if this is not the >>>> correct place for this.. >>>> >>>> >>> Here is the trace for sync (while hanging) .. 
>>> >>> sync D ffffffff8062eb80 0 17843 >>> 15013 (NOTLB) >>> ffff810071689e98 0000000000000082 ffff810071689eb8 ffffffff8024d210 >>> 0000000071689e18 0000000000000000 0000000100000000 ffff81007b670fe0 >>> ffff81007b6711b8 00000000000004c8 ffff810037c84770 0000000000000001 >>> Call Trace: >>> [] wait_on_page_writeback_range+0xed/0x140 >>> [] __down_read+0x90/0xaa >>> [] down_read+0x16/0x1a >>> [] __sync_inodes+0x5f/0xbb >>> [] sync_inodes+0x16/0x2f >>> [] do_sync+0x17/0x60 >>> [] sys_sync+0xe/0x12 >>> [] system_call+0x7e/0x83 >>> >>> Greetings, >>> Asbj?rn Sannes >>> >>> >> Hi Asbj?rn, >> >> I'll look into this as soon as I can find the time... >> >> > Great! I tried to figure out why the s_umount semaphore was not upped by > comparing to other filesystems, but the functions seems almost identical > .. so I cheated and looked what had changed lately (from your patch): > > diff -w -u -p -p -u -r1.1.2.1.4.1.2.1 diaper.c > --- gfs-kernel/src/gfs/diaper.c 26 Jun 2006 21:53:51 -0000 1.1.2.1.4.1.2.1 > +++ gfs-kernel/src/gfs/diaper.c 2 Feb 2007 22:28:41 -0000 > @@ -50,7 +50,7 @@ static int diaper_major = 0; > static LIST_HEAD(diaper_list); > static spinlock_t diaper_lock; > static DEFINE_IDR(diaper_idr); > -kmem_cache_t *diaper_slab; > +struct kmem_cache *diaper_slab; > > /** > * diaper_open - > @@ -232,9 +232,9 @@ get_dummy_sb(struct diaper_holder *dh) > struct inode *inode; > int error; > > - mutex_lock(&real->bd_mount_mutex); > + down(&real->bd_mount_sem); > sb = sget(&gfs_fs_type, gfs_test_bdev_super, gfs_set_bdev_super, real); > - mutex_unlock(&real->bd_mount_mutex); > + up(&real->bd_mount_sem); > if (IS_ERR(sb)) > return PTR_ERR(sb); > > @@ -252,7 +252,6 @@ get_dummy_sb(struct diaper_holder *dh) > sb->s_op = &gfs_dummy_sops; > sb->s_fs_info = dh; > > - up_write(&sb->s_umount); > module_put(gfs_fs_type.owner); > > dh->dh_dummy_sb = sb; > @@ -263,7 +262,6 @@ get_dummy_sb(struct diaper_holder *dh) > iput(inode); > > fail: > - up_write(&sb->s_umount); > deactivate_super(sb); > return error; > } > > > > And undid those up_write ones (added them back in), which helped, I > don't know if it safe though, and maybe you could shed some lights on > why they were removed? (I didn't find any changes that would do up_write > on s_umount later.. > Actually, it didn't enjoy unmount as much .. Mvh, Asbj?rn Sannes From rpeterso at redhat.com Thu Feb 22 16:06:25 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Thu, 22 Feb 2007 10:06:25 -0600 Subject: [Linux-cluster] gfs1 and 2.6.20 In-Reply-To: <45DDBCA6.9000608@sannes.org> References: <45D6E7A4.9020106@sannes.org> <45D6EB01.7080906@sannes.org> <45DCBFC4.2050606@redhat.com> <45DDA9B2.8070808@sannes.org> <45DDBCA6.9000608@sannes.org> Message-ID: <45DDBF81.7040300@redhat.com> Asbj?rn Sannes wrote: >> Great! I tried to figure out why the s_umount semaphore was not upped by >> comparing to other filesystems, but the functions seems almost identical >> .. 
so I cheated and looked what had changed lately (from your patch): >> >> diff -w -u -p -p -u -r1.1.2.1.4.1.2.1 diaper.c >> --- gfs-kernel/src/gfs/diaper.c 26 Jun 2006 21:53:51 -0000 1.1.2.1.4.1.2.1 >> +++ gfs-kernel/src/gfs/diaper.c 2 Feb 2007 22:28:41 -0000 >> @@ -50,7 +50,7 @@ static int diaper_major = 0; >> static LIST_HEAD(diaper_list); >> static spinlock_t diaper_lock; >> static DEFINE_IDR(diaper_idr); >> -kmem_cache_t *diaper_slab; >> +struct kmem_cache *diaper_slab; >> >> /** >> * diaper_open - >> @@ -232,9 +232,9 @@ get_dummy_sb(struct diaper_holder *dh) >> struct inode *inode; >> int error; >> >> - mutex_lock(&real->bd_mount_mutex); >> + down(&real->bd_mount_sem); >> sb = sget(&gfs_fs_type, gfs_test_bdev_super, gfs_set_bdev_super, real); >> - mutex_unlock(&real->bd_mount_mutex); >> + up(&real->bd_mount_sem); >> if (IS_ERR(sb)) >> return PTR_ERR(sb); >> >> @@ -252,7 +252,6 @@ get_dummy_sb(struct diaper_holder *dh) >> sb->s_op = &gfs_dummy_sops; >> sb->s_fs_info = dh; >> >> - up_write(&sb->s_umount); >> module_put(gfs_fs_type.owner); >> >> dh->dh_dummy_sb = sb; >> @@ -263,7 +262,6 @@ get_dummy_sb(struct diaper_holder *dh) >> iput(inode); >> >> fail: >> - up_write(&sb->s_umount); >> deactivate_super(sb); >> return error; >> } >> >> >> >> And undid those up_write ones (added them back in), which helped, I >> don't know if it safe though, and maybe you could shed some lights on >> why they were removed? (I didn't find any changes that would do up_write >> on s_umount later.. >> >> > Actually, it didn't enjoy unmount as much .. > > Mvh, > Asbj?rn Sannes > Hi Asbj?rn, I took them out because I noticed the problem with umount and I knew that the HEAD version didn't do it. Of course, that's because it doesn't have a diaper device whereas STABLE still does. I've just got to spend a little time with it, that's all. It's probably something simple. Regards, Bob Peterson Red Hat Cluster Suite From lshen at cisco.com Thu Feb 22 18:33:00 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Thu, 22 Feb 2007 10:33:00 -0800 Subject: [Linux-cluster] Running GFS as local file system Message-ID: <08A9A3213527A6428774900A80DBD8D80385B45B@xmb-sjc-222.amer.cisco.com> Is the cluster suite still required if I only use GFS as a local file system? Based on my experiment, it seems not. I just did a gfs_mkfs on a partition and that's it. How easy is it to migrate files created under other file systems (such as ext3, reiser etc) into GFS? Lin From eftychios.eftychiou at gmail.com Thu Feb 22 21:09:54 2007 From: eftychios.eftychiou at gmail.com (Eftychios Eftychiou) Date: Thu, 22 Feb 2007 23:09:54 +0200 Subject: [Linux-cluster] Service stuck in 'stopping' state In-Reply-To: <1171994830.5216.17.camel@asuka.boston.devel.redhat.com> References: <1171994830.5216.17.camel@asuka.boston.devel.redhat.com> Message-ID: I experienced similar or same problem while testing a 2 node cluster. No GFS or anything fancy. Manual Fencing and a single failover domain . Did not notice any particular pattern on when and why this occurred. Nodes were installed in Virtualized (VMWARE Workstation 5) On 2/20/07, Lon Hohberger wrote: > > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228823 > > If anyone has an easy way to make this occur, please comment on the > above bugzilla. I have a fix for the symptom, but I would like to > understand when/why it happens so I can fix the *cause*. 
> > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anderson.stephen at gmail.com Fri Feb 23 00:10:17 2007 From: anderson.stephen at gmail.com (Stephen Anderson) Date: Thu, 22 Feb 2007 19:10:17 -0500 Subject: [Linux-cluster] cluster with multiple ax100 - suggestions? Message-ID: <3fef98d60702221610v116cc4eej7217c03f603bddcd@mail.gmail.com> Currently I am setting up a cluster from equipment on hand. The cluster will have a v40z master, and eight v20z nodes running RHEL. For storage, we have eight ax100. Currently the eight v20zs have HBA cards for direct connection of the ax100s, but the cluster requires a large common shared storage. The systems will also be connected via Infiniband for operations and Ethernet for management. My first idea is to connect the eight ax100s and the v40z to a fc switch and use gfs to provide a shared file system for the cluster. So I presume that I could use Powerpath on the v40z to setup each ax100, and create a large logical volume through the OS from the eight ax100s. Does this setup scenario sound feasible? Another idea was to leave the ax100s connected directly to the nodes and use Lustre to create a shared file system, but that approach seemed rather complex and may consume too much CPU power. The cluster will perform heavy reads and writes. Suggestions? TIA, Steve From isplist at logicore.net Fri Feb 23 00:26:42 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Thu, 22 Feb 2007 18:26:42 -0600 Subject: [Linux-cluster] Passing SCSI options at installer Message-ID: <2007222182642.484315@leena> Is there some way of contacting redhat directly? There must be some way of passing this information to the installer??? If not, I'd love to stop looking and move on to other things. scsi_mod max_luns=256 dev_flags="INLINE:TF200 5_23078:0x200" Mike From Keith.Lewis at its.monash.edu.au Fri Feb 23 01:52:51 2007 From: Keith.Lewis at its.monash.edu.au (Keith Lewis) Date: Fri, 23 Feb 2007 12:52:51 +1100 Subject: [Linux-cluster] data and machine dependent NFS GFS file xfer problem Message-ID: <200702230152.l1N1qpMu010604@mukluk.its.monash.edu.au> Thursday 22 Feb 2007 An attempt was made to make sure all the computers in a certain group had a common set of rpms installed. To make this easier, a non-RedHat rpm was copied to a disk that was mounted on most of the machines, and installed from there. This broke the RPM database on those machines. After the install, most rpm commands got this: error: rpmdbNextIterator: skipping h# 2325 Header V3 DSA signature: BAD, key ID 99b62126 This was fixed by rpm -e ; rpm --rebuilddb It was noticed that all the machines which had installed the shared rpm had failed in this way, but none of the machines that had installed from a copy on local disk. Using `sum' it was noticed that all the machines except one saw the file as corrupt. The one machine, (called `T' from here on), was the one which had done the original copy. It still saw the file as pristine - i.e. not corrupt. The shared filesystem is based on GFS, but due to a history of network and SAN problems causing fence events which seriously degrade our servicelevel, GFS is restricted to as few machines as possible. Currently only three machines, (called C, S and W from here on), mount the GFS disk directly. Machines C and W export it to the rest of the group via NFS. 
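(For narrowing down which machine's view of a shared file is wrong in a case like the one described here, a checksum sweep such as the sketch below can help; the hostnames, the mount point and the file path are placeholders standing in for T, C, S, W and the real GFS/NFS paths.)

#!/bin/sh
# Compare a strong checksum of the same path from every machine, then
# take the odd one out's NFS client cache out of the picture by
# remounting before re-checking.  All values below are placeholders.

FILE=/gfs01/rpms/R-2.3.1-1.rh3AS.i386.rpm

for host in T C S W; do
    echo -n "$host: "
    ssh $host md5sum $FILE
done

# on the suspect node, remount to drop cached pages/attributes
ssh T "umount /gfs01 && mount /gfs01 && md5sum $FILE"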
`T' mounted the GFS disk via NFS through W. `T' was the only machine to see the GFS copy as pristine. All other machines, including C, S and W, irrespective of whether they mounted the disk by GFS directly or by NFS saw the file as corrupt. `T' then dismounted the disk via W and remounted it via C. It then saw the file as corrupt, but it then made another copy of the file from its local disk to the GFS disk, and this copy too was seen as corrupt by all other machines, while `T' itself saw it as pristine. Other machines had no problems copying the same file from their local disk to the GFS disk. An attempt was made to mount the GFS disk directly on T: /etc/init.d/pool start /etc/init.d/ccsd start /etc/init.d/lock_gulmd start /etc/init.d/gfs start mount /dev/pool/pool_gfs01 -t gfs /mnt (I've never mounted a GFS disk in this way before, so this may be a problem - usually its in fstab and `/etc/init.d/gfs start' mounts it) The mount never completed. The log on the master lockserver showed lock_gulm starting on `T' (New Client: idx 10 fd 15 from ...) and about a minute later T missed a heartbeat... seven heartbeats later `T' was fenced, and most embarrassingly, rebooted. After the reboot `T' saw all the GFS copies (except those made by other machines) as corrupt, but a further copy of the file by `T' to the GFS disk showed as corrupt by all nodes except `T' which continued to see it as pristine... i.e. the reboot had not cured the problem... Summary - I have one file, R-2.3.1-1.rh3AS.i386.rpm, which one node, `T', cannot successfully copy to the GFS disk, although it thinks it can, and can even copy it back, producing a duplicate of the original... # uname -r 2.4.21-47.0.1.ELsmp # rpm -qa | grep -i gfs GFS-devel-6.0.2.36-1 GFS-6.0.2.36-1 GFS-modules-smp-6.0.2.36-1 # cat /etc/redhat-release Red Hat Enterprise Linux AS release 3 (Taroon Update 8) sum pristine file: 01904 22905 sum corrupt file: 57604 22905 The above account is an accurate description of the events, only the confusion, disbelief and utter panic has been omitted. Looking for suggestions, like what to do next, which list to take it to and so on... Thanks Keith From Alain.Moulle at bull.net Fri Feb 23 08:01:47 2007 From: Alain.Moulle at bull.net (Alain Moulle) Date: Fri, 23 Feb 2007 09:01:47 +0100 Subject: [Linux-cluster] Re: CS4 Update 4 / Oops in dlm module Message-ID: <45DE9F6B.3000006@bull.net> Hi Nodes were quite very loaded by appli tests, and it happens after a run-time of 3 days, but there was no failover, no manipulation around CS4 at all. Thanks Alain On Wed, 2007-02-21 at 14:21 +0100, Alain Moulle wrote: >> Hi >> >> CS4 Update 4 : >> I got a Oops in dlm module following an acces to an invalid address in function >> send_cluster_request. Is there an already know bug about this ? >> and eventual fix ? >When does this happen occur - i.e. every time, or after a long run-time, >etc.? >-- Lon -- From orkcu at yahoo.com Fri Feb 23 14:27:45 2007 From: orkcu at yahoo.com (=?iso-8859-1?Q?Roger_Pe=F1a?=) Date: Fri, 23 Feb 2007 06:27:45 -0800 (PST) Subject: [Linux-cluster] Passing SCSI options at installer In-Reply-To: <2007222182642.484315@leena> Message-ID: <870267.90203.qm@web50608.mail.yahoo.com> --- "isplist at logicore.net" wrote: > Is there some way of contacting redhat directly? > There must be some way of > passing this information to the installer??? > > If not, I'd love to stop looking and move on to > other things. 
> > scsi_mod max_luns=256 dev_flags="INLINE:TF200 > 5_23078:0x200" > why not just add it when the installer ask what to do: "text instalation", ""upgradeany, "just gui install", etc for example: type: text scsi_mod max_luns=256 dev_flags="INLINE:TF200 5_23078:0x200" if you whatn a text installation with that options cu roger > Mike > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > __________________________________________ RedHat Certified Engineer ( RHCE ) Cisco Certified Network Associate ( CCNA ) ____________________________________________________________________________________ 8:00? 8:25? 8:40? Find a flick in no time with the Yahoo! Search movie showtime shortcut. http://tools.search.yahoo.com/shortcuts/#news From isplist at logicore.net Fri Feb 23 15:06:13 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Fri, 23 Feb 2007 09:06:13 -0600 Subject: [Linux-cluster] Passing SCSI options at installer In-Reply-To: <870267.90203.qm@web50608.mail.yahoo.com> Message-ID: <20072239613.013900@leena> >> There must be some way of >> passing this information to the installer??? >> scsi_mod max_luns=256 dev_flags="INLINE:TF200 >> 5_23078:0x200" >> > why not just add it when the installer ask what to do: > for example: > type: text scsi_mod max_luns=256 > dev_flags="INLINE:TF200 5_23078:0x200" > if you whatn a text installation with that options I'm not sure how doing a text install vs a GUI one makes any difference? Maybe I'm missing something but I gave it a try just for the hell of it and still, only LUNS 0/1 show up :). Mike From orkcu at yahoo.com Fri Feb 23 15:16:42 2007 From: orkcu at yahoo.com (=?iso-8859-1?Q?Roger_Pe=F1a?=) Date: Fri, 23 Feb 2007 07:16:42 -0800 (PST) Subject: [Linux-cluster] Passing SCSI options at installer In-Reply-To: <20072239613.013900@leena> Message-ID: <223473.12647.qm@web50609.mail.yahoo.com> --- "isplist at logicore.net" wrote: > >> There must be some way of > >> passing this information to the installer??? > > >> scsi_mod max_luns=256 dev_flags="INLINE:TF200 > >> 5_23078:0x200" > >> > > why not just add it when the installer ask what to > do: > > for example: > > type: text scsi_mod max_luns=256 > > dev_flags="INLINE:TF200 5_23078:0x200" > > if you whatn a text installation with that options > > I'm not sure how doing a text install vs a GUI one > makes any difference? Maybe > I'm missing something but I gave it a try just for > the hell of it and still, > only LUNS 0/1 show up :). it doesn't matter if it is a text or gui installation, I just tried to point the place where you should add the line ;-) so, it doesn't work ? ;-( how do you know that that options works at all? you ask where to put the options to be recognized by the kernel, I just point you to the place :-) you are the one to chose the right options ;-) cu roger __________________________________________ RedHat Certified Engineer ( RHCE ) Cisco Certified Network Associate ( CCNA ) ____________________________________________________________________________________ Need Mail bonding? Go to the Yahoo! Mail Q&A for great tips from Yahoo! Answers users. 
http://answers.yahoo.com/dir/?link=list&sid=396546091

From isplist at logicore.net Fri Feb 23 15:25:08 2007
From: isplist at logicore.net (isplist at logicore.net)
Date: Fri, 23 Feb 2007 09:25:08 -0600
Subject: [Linux-cluster] Passing SCSI options at installer
In-Reply-To: <223473.12647.qm@web50609.mail.yahoo.com>
Message-ID: <20072239258.586862@leena>

> it doesn't matter if it is a text or gui installation,
> I just tried to point the place where you should add
> the line ;-)

Ok, that's what I guessed :). I didn't mean that I didn't know where to put it
but that I was not able to find a method by which to tell anaconda about the
options.

> so, it doesn't work ? ;-(
> how do you know that that options works at all?

Because adding that exact line in /etc/modprobe.conf does the trick, all
volumes show up. Problem is, I need to install on those volumes so need to see
them before the OS is actually installed.

Maybe there is some way of creating my own install which already has that
information included so that anaconda automatically sees it?

Mike

From orkcu at yahoo.com Fri Feb 23 15:58:50 2007
From: orkcu at yahoo.com (Roger Peña)
Date: Fri, 23 Feb 2007 07:58:50 -0800 (PST)
Subject: [Linux-cluster] Passing SCSI options at installer
In-Reply-To: <20072239258.586862@leena>
Message-ID: <996015.57436.qm@web50614.mail.yahoo.com>

--- "isplist at logicore.net" wrote:
> Ok, that's what I guessed :). I didn't mean that I
> didn't know where to put it
> but that I was not able to find a method by which to
> tell anaconda about the
> options.

that is the place to give options to the kernel at boot time, just as it is
with lilo or grub. I do not know why it is not working if it works with
modules.conf.

> > so, it doesn't work ? ;-(
> > how do you know that that options works at all?
>
> Because adding that exact line in /etc/modprobe.conf
> does the trick, all
> volumes show up. Problem is, I need to install on
> those volumes so need to see
> them before the OS is actually installed.

did you try to unload and then load the module with those options at
installation time? In the F2 console, I mean.

> Maybe there is some way of creating my own install
> which already has that
> information included so that anaconda automatically
> sees it?

you can do that of course, but because it is just an option and not a module,
you only need to modify the options in the "lilo.conf" of the installer.

cu
roger

> Mike
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

__________________________________________
RedHat Certified Engineer ( RHCE )
Cisco Certified Network Associate ( CCNA )

____________________________________________________________________________________
Do you Yahoo!?
Everyone is raving about the all-new Yahoo! Mail beta.
http://new.mail.yahoo.com

From lhh at redhat.com Fri Feb 23 16:10:30 2007
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 23 Feb 2007 11:10:30 -0500
Subject: [Linux-cluster] WTI fencing devices guide / howto
Message-ID: <1172247030.30230.60.camel@asuka.boston.devel.redhat.com>

It covers WTI fencing device configuration and configuration examples for
RHCS3, GFS 6.0, GFS 6.1, RHCS4, and RHCS5:

http://people.redhat.com/lhh/wti_devices.html

I hope to add a section about how to use a WTI IPS800 + WTI RSM-8 as a way to
provide redundant access to a fencing device in the near future.
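(As a rough illustration of the sort of configuration the guide covers: a WTI switch typically appears in a RHCS4-style cluster.conf along the lines below. The agent name fence_wti is real, but the device name, address, password and plug number are placeholders; treat the guide above as the authoritative reference.)

  <fencedevices>
    <fencedevice agent="fence_wti" name="wti1" ipaddr="10.0.0.50" passwd="secret"/>
  </fencedevices>

  <clusternode name="node1" votes="1">
    <fence>
      <method name="1">
        <device name="wti1" port="3"/>
      </method>
    </fence>
  </clusternode>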
-- Lon From lhh at redhat.com Fri Feb 23 16:15:47 2007 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 23 Feb 2007 11:15:47 -0500 Subject: [Linux-cluster] Re: CS4 Update 4 / Oops in dlm module In-Reply-To: <45DE9F6B.3000006@bull.net> References: <45DE9F6B.3000006@bull.net> Message-ID: <1172247347.30230.67.camel@asuka.boston.devel.redhat.com> On Fri, 2007-02-23 at 09:01 +0100, Alain Moulle wrote: > Hi > > Nodes were quite very loaded by appli tests, and it happens after > a run-time of 3 days, but there was no failover, no manipulation > around CS4 at all. Could you install the current rgmanager test RPM: http://people.redhat.com/lhh/rgmanager-1.9.54-3.228823test.i386.rpm ...and see if it goes away? The above RPM is the same as 1.9.54, but includes fix for an assertion failure, a way to fix services stuck in the stopping state, and (the important one for you) a fix for an intermittent DLM lock leak. ia64/x86_64/srpms here: http://people.redhat.com/lhh/packages.html -- Lon From rpeterso at redhat.com Fri Feb 23 16:15:28 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Fri, 23 Feb 2007 10:15:28 -0600 Subject: [Linux-cluster] data and machine dependent NFS GFS file xfer problem In-Reply-To: <200702230152.l1N1qpMu010604@mukluk.its.monash.edu.au> References: <200702230152.l1N1qpMu010604@mukluk.its.monash.edu.au> Message-ID: <45DF1320.3040909@redhat.com> Keith Lewis wrote: > Thursday 22 Feb 2007 > > An attempt was made to make sure all the computers in a certain group > had a common set of rpms installed. > > To make this easier, a non-RedHat rpm was copied to a disk that was > mounted on most of the machines, and installed from there. This broke the RPM > database on those machines. After the install, most rpm commands got this: > > error: rpmdbNextIterator: skipping h# 2325 Header V3 DSA signature: BAD, key > ID 99b62126 > > This was fixed by rpm -e ; rpm --rebuilddb > > It was noticed that all the machines which had installed the shared > rpm had failed in this way, but none of the machines that had installed from a > copy on local disk. > > Using `sum' it was noticed that all the machines except one saw the > file as corrupt. The one machine, (called `T' from here on), was the one > which had done the original copy. It still saw the file as pristine - i.e. > not corrupt. > > The shared filesystem is based on GFS, but due to a history of network > and SAN problems causing fence events which seriously degrade our > servicelevel, GFS is restricted to as few machines as possible. Currently > only three machines, (called C, S and W from here on), mount the GFS disk > directly. Machines C and W export it to the rest of the group via NFS. > > `T' mounted the GFS disk via NFS through W. `T' was the only machine > to see the GFS copy as pristine. All other machines, including C, S and W, > irrespective of whether they mounted the disk by GFS directly or by NFS saw > the file as corrupt. > > `T' then dismounted the disk via W and remounted it via C. It then saw > the file as corrupt, but it then made another copy of the file from its local > disk to the GFS disk, and this copy too was seen as corrupt by all other > machines, while `T' itself saw it as pristine. > > Other machines had no problems copying the same file from their local > disk to the GFS disk. 
> > An attempt was made to mount the GFS disk directly on T: > > /etc/init.d/pool start > /etc/init.d/ccsd start > /etc/init.d/lock_gulmd start > /etc/init.d/gfs start > mount /dev/pool/pool_gfs01 -t gfs /mnt > > (I've never mounted a GFS disk in this way before, so this may be a > problem - usually its in fstab and `/etc/init.d/gfs start' mounts it) > > The mount never completed. The log on the master lockserver showed > lock_gulm starting on `T' (New Client: idx 10 fd 15 from ...) and about a > minute later T missed a heartbeat... seven heartbeats later `T' was fenced, > and most embarrassingly, rebooted. > > After the reboot `T' saw all the GFS copies (except those made by other > machines) as corrupt, but a further copy of the file by `T' to the GFS disk > showed as corrupt by all nodes except `T' which continued to see it as > pristine... i.e. the reboot had not cured the problem... > > Summary - I have one file, R-2.3.1-1.rh3AS.i386.rpm, which one node, > `T', cannot successfully copy to the GFS disk, although it thinks it can, and > can even copy it back, producing a duplicate of the original... > > # uname -r > 2.4.21-47.0.1.ELsmp > > # rpm -qa | grep -i gfs > GFS-devel-6.0.2.36-1 > GFS-6.0.2.36-1 > GFS-modules-smp-6.0.2.36-1 > > # cat /etc/redhat-release > Red Hat Enterprise Linux AS release 3 (Taroon Update 8) > > sum pristine file: > 01904 22905 > > sum corrupt file: > 57604 22905 > > The above account is an accurate description of the events, only the > confusion, disbelief and utter panic has been omitted. > > Looking for suggestions, like what to do next, which list to take > it to and so on... > > Thanks > > Keith > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > Hi Keith, Good question. In fact, I've answered similar ones before on this list. I thought I had added it to the cluster faq, but apparently I was remiss; sorry. I just added it now: http://sources.redhat.com/cluster/faq.html#gfs_corruption The examples I gave assume that you're using lvm2, which you're not because you're RHEL3, but it should still give you the gist. Please let me know if the new faq entry needs some work. BTW, it was noticed that almost all of your sentences were written in the passive voice. The question why presents itself. ;) Regards, Bob Peterson Red Hat Cluster Suite From isplist at logicore.net Fri Feb 23 16:30:57 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Fri, 23 Feb 2007 10:30:57 -0600 Subject: [Linux-cluster] Passing SCSI options at installer In-Reply-To: <996015.57436.qm@web50614.mail.yahoo.com> Message-ID: <2007223103057.459544@leena> > you can do that of course, but because it is just an > option and not a module, you only need to modify the > options in the "lilo.conf" of the instaler. Ah, now we're on to something :). Since I already have the CD and have never modified my own... I'm guessing I copy the files to a drive, make the mods, then write it all back to CD? From isplist at logicore.net Fri Feb 23 16:39:40 2007 From: isplist at logicore.net (isplist at logicore.net) Date: Fri, 23 Feb 2007 10:39:40 -0600 Subject: [Linux-cluster] Passing SCSI options at installer In-Reply-To: <996015.57436.qm@web50614.mail.yahoo.com> Message-ID: <2007223103940.148356@leena> > you can do that of course, but because it is just an > option and not a module, you only need to modify the > options in the "lilo.conf" of the instaler. Wait now... I use the grub loader :). I'll have to do some research. 
From srramasw at cisco.com Fri Feb 23 18:47:23 2007 From: srramasw at cisco.com (Sridhar Ramaswamy (srramasw)) Date: Fri, 23 Feb 2007 10:47:23 -0800 Subject: [Linux-cluster] GFS2 Message-ID: I looked up the FAQ describing GFS2's improvements over GFS1. Quite an interesting list! Some of the ones listed there really make sense for us. Particularly the ones related to journals like smaller journal size and adding journals without extending filesystem. It seems GFS2 is available now as an beta/development release. Do you know when GFS2 is planned for a production release? thanks, - Sridhar Cisco Systems, Inc -------------- next part -------------- An HTML attachment was scrubbed... URL: From rpeterso at redhat.com Fri Feb 23 22:15:41 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Fri, 23 Feb 2007 16:15:41 -0600 Subject: [Linux-cluster] gfs1 and 2.6.20 In-Reply-To: <45DDBCA6.9000608@sannes.org> References: <45D6E7A4.9020106@sannes.org> <45D6EB01.7080906@sannes.org> <45DCBFC4.2050606@redhat.com> <45DDA9B2.8070808@sannes.org> <45DDBCA6.9000608@sannes.org> Message-ID: <45DF678D.2080503@redhat.com> Asbj?rn Sannes wrote: > Actually, it didn't enjoy unmount as much .. > > Mvh, > Asbj?rn Sannes > Hi Asbj?rn, FYI--I just updated the STABLE branch to fix the sync problem you were having. Please give it a try and let me know if there are still problems. Regards, Bob Peterson Red Hat Cluster Suite From rpeterso at redhat.com Fri Feb 23 22:58:50 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Fri, 23 Feb 2007 16:58:50 -0600 Subject: [Linux-cluster] STABLE instructions Message-ID: <45DF71AA.5000604@redhat.com> Hi All, FYI--I created a revised set of instructions for getting the CVS STABLE branch of the cluster code running on an upstream kernel. This is a little more detailed than the cluster/doc/usage.txt. It's available (I hope) on my 108 page: https://rpeterso.108.redhat.com/files/documents/98/247/STABLE.txt Regards, Bob Peterson Red Hat Cluster Suite From ace at sannes.org Fri Feb 23 23:17:18 2007 From: ace at sannes.org (=?iso-8859-1?Q?Asbj=F8rn_Sannes?=) Date: Sat, 24 Feb 2007 00:17:18 +0100 (CET) Subject: [Linux-cluster] gfs1 and 2.6.20 In-Reply-To: <45DF678D.2080503@redhat.com> References: <45D6E7A4.9020106@sannes.org> <45D6EB01.7080906@sannes.org> <45DCBFC4.2050606@redhat.com> <45DDA9B2.8070808@sannes.org> <45DDBCA6.9000608@sannes.org> <45DF678D.2080503@redhat.com> Message-ID: <51962.193.71.127.102.1172272638.squirrel@kunder.interhost.no> > Asbj?rn Sannes wrote: >> Actually, it didn't enjoy unmount as much .. >> >> Mvh, >> Asbj?rn Sannes >> > Hi Asbj?rn, > > FYI--I just updated the STABLE branch to fix the sync problem you were > having. > Please give it a try and let me know if there are still problems. Thanks a bunch! :) Works for me (mounting, syncing, reading/writing from/to files and unmount). Mvh. Asbj?rn Sannes From lshen at cisco.com Fri Feb 23 23:34:47 2007 From: lshen at cisco.com (Lin Shen (lshen)) Date: Fri, 23 Feb 2007 15:34:47 -0800 Subject: [Linux-cluster] Problem running GFS on top of AoE Message-ID: <08A9A3213527A6428774900A80DBD8D80385BA93@xmb-sjc-222.amer.cisco.com> I'm trying to run GFS on top of AoE on my two-node cluster (node A and B). I was using GNBD previously. I first exported a partition from A to B using vblade via eth1 (the cluster is using eth0 I believe). That seems to work as expected. Then, I did a gfs_mkfs on the etherd partition on B, and mounted the file system on both A and B. All went as expected. 
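(In command terms, the setup just described amounts to roughly the following. The partition name, shelf/slot numbers and journal count are guesses; only the gfs:aoe lock table name is taken from the log messages further down.)

  # on A: export the partition over eth1 with vblade as shelf 0, slot 0
  vblade 0 0 eth1 /dev/sdb1 &

  # on B: load the aoe driver; the export shows up as /dev/etherd/e0.0
  modprobe aoe
  gfs_mkfs -p lock_dlm -t gfs:aoe -j 2 /dev/etherd/e0.0

  # on each node, mount the filesystem (on A this may be the raw
  # partition rather than the etherd device)
  mount -t gfs /dev/etherd/e0.0 /gfs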
But when I write stuff onto the file system on either node, it takes a long time (a few minutes) or never for the contents to be propagated to the other node. I saw some abnormal messages in the log, but not quite sure what it means. This is from node A, and similar messages are also in log from node B. Any ideas? How should I trouble shoot this? Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: Joined cluster. Now mounting FS... Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Trying to acquire journal lock... Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Trying to acquire journal lock... Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Looking at journal... Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Looking at journal... Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Done Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Done Feb 23 15:02:51 cfs2 kernel: device eth0 entered promiscuous mode Feb 23 15:02:51 cfs2 kernel: device eth0 entered promiscuous mode Feb 23 15:02:51 cfs2 kernel: device eth1 entered promiscuous mode Feb 23 15:02:51 cfs2 kernel: device eth1 entered promiscuous mode Feb 23 15:02:51 cfs2 kernel: device lo entered promiscuous mode Feb 23 15:02:51 cfs2 kernel: device lo entered promiscuous mode From krikler_samuel at diligent.com Sat Feb 24 08:23:05 2007 From: krikler_samuel at diligent.com (Krikler, Samuel) Date: Sat, 24 Feb 2007 10:23:05 +0200 Subject: [Linux-cluster] Power Switch recommendation Message-ID: <453D02254A9EBC45866DBF28FECEA46FBADBED@ILEX01.corp.diligent.com> Hi, I want to purchase a power switch to be used as fence device. Since I don't any experience with this kind of devices: Can someone recommend me a specific power switch model of those supported by GFS2? Thanks, Samuel. -------------- next part -------------- An HTML attachment was scrubbed... URL: From filipe.miranda at gmail.com Sat Feb 24 18:46:42 2007 From: filipe.miranda at gmail.com (Filipe Miranda) Date: Sat, 24 Feb 2007 16:46:42 -0200 Subject: [Linux-cluster] Power Switch recommendation In-Reply-To: <453D02254A9EBC45866DBF28FECEA46FBADBED@ILEX01.corp.diligent.com> References: <453D02254A9EBC45866DBF28FECEA46FBADBED@ILEX01.corp.diligent.com> Message-ID: Hey Samuel, First of all check if already have the hardware to provide the fence capability in your machines (iLO or any IPMI 2.0 compliant). Regards, On 2/24/07, Krikler, Samuel wrote: > > Hi, > > > > I want to purchase a power switch to be used as fence device. > > Since I don't any experience with this kind of devices: > > Can someone recommend me a specific power switch model of those supported > by GFS2? > > > > > > Thanks, > > > > Samuel. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- --- Filipe T Miranda Red Hat Certified Engineer -------------- next part -------------- An HTML attachment was scrubbed... URL: From tgl at redhat.com Sun Feb 25 03:04:31 2007 From: tgl at redhat.com (Tom Lane) Date: Sat, 24 Feb 2007 22:04:31 -0500 Subject: [Linux-cluster] FWD: Question on RH Cluster from a MySQL Customer Message-ID: <1638.1172372671@sss.pgh.pa.us> Can someone help out this questioner? I know zip about Cluster. I looked at the FAQ for a bit and thought that what he wants is probably doable, but I couldn't tell if it would be easy or painful to do load-balancing in this particular way. (And I'm not qualified to say if what he wants is a sensible approach, either.) 
regards, tom lane ------- Forwarded Message Date: Sat, 24 Feb 2007 15:37:17 +0000 From: Ivan Zoratti To: tgl at redhat.com Subject: Question on RH Cluster from a MySQL Customer Dear Tom, first of all, let me introduce myself. I am the Sales Engineering Manager for EMEA at MySQL. Kath O'Neil, our Director of Strategic Alliances, kindly gave me your name for a technical question related to the use of Red Hat and MySQL - hopefully leading to the adoption of RH Cluster. Our customer is looking for a solution that could provide high availability and scalability in a cluster environment based on linux servers that are connected to a large SAN. Their favourite choice would be to go with Red Hat. Each server connected to the SAN would provide resources to host, let's say, 5 different instances of MySQL (mysqld). Each mysqld will have its own configuration, datadir, connection port and IP address. The clustering software should be able to load-balance new mysqld instances on the available servers. For example, considering servers with same specs and workload, when the first mysqld starts, it will be placed on Server A, the second one will go on Server B and so on for C,D and E. The sixth mysqld will then go on A again, then B and so forth. If one of the server fails, the mysqld(s) is (or are) "moved" on the other servers, still in a way to guarantee a load- balance of the whole system. After my long (and hopefully clear enough) explanation, my quick question is: does RH Cluster provide this kind of features? I am mostly interested in the way we can instatiate mysqld and re-launch them on any other server in the cluster in case of fault. I would be very grateful if you could help me or address me to somebody or something for an answer. Thank you in advance for your help. Kind Regards, Ivan -- Ivan Zoratti - Sales Engineering Manager EMEA MySQL AB - Windsor - UK Mobile: +44 7866 363 180 ivan at mysql.com http://www.mysql.com -- ------- End of Forwarded Message From sara_sodagar at yahoo.com Sun Feb 25 13:10:02 2007 From: sara_sodagar at yahoo.com (sara sodagar) Date: Sun, 25 Feb 2007 05:10:02 -0800 (PST) Subject: [Linux-cluster] Question about Cluster Service Message-ID: <767095.27017.qm@web31809.mail.mud.yahoo.com> Hi I would be grateful if anyone could tell me if this solution works or not? I am planning to use Web server cluster.I have 2 active servers and 1 passive server .I suppose I should create 2 cluster service as in each cluster service there should be 1 active server. As I have only 1 passive server , I should create 2 fail over domain . Node A ,C (cluster service 1) Node B , C (cluster service 2) Node c : (Failover domain 1 : service 1, failover domain2: service 2) Each Cluster service comprises : ip address resource , web serviver init script,file system resource (gfs) Also I would like to know what are the advantages of using gfs in this solution over other types of files systems (like ext3) , as there are no 2 active servers writing on the same area at the same time. --Regards Sara ____________________________________________________________________________________ Do you Yahoo!? Everyone is raving about the all-new Yahoo! Mail beta. 
http://new.mail.yahoo.com From filipe.miranda at gmail.com Sun Feb 25 23:19:52 2007 From: filipe.miranda at gmail.com (Filipe Miranda) Date: Sun, 25 Feb 2007 20:19:52 -0300 Subject: [Linux-cluster] Question about Cluster Service In-Reply-To: <767095.27017.qm@web31809.mail.mud.yahoo.com> References: <767095.27017.qm@web31809.mail.mud.yahoo.com> Message-ID: Hi there Sodagar, Will the Web servers present the same data? will they serve the same content? Why not use all three servers active (without the fail-over mode) and just add a layer of load balancing to the top of this solution (two machines with IPVS? About the GFS, it is a file system that handles multiple servers accessing the same data on the same partition. If the Web servers are presenting the same data to users, GFS will be very helpful to avoid data redundancy on the storage. If you do not have a dedicated storage GNBD and ISCSI are good choices. Regards, On 2/25/07, sara sodagar wrote: > > Hi > I would be grateful if anyone could tell me if this > solution works or not? > I am planning to use Web server cluster.I have 2 > active > servers and 1 passive server .I suppose I should > create 2 cluster service as in each cluster service > there > should be 1 active server. > As I have only 1 passive server , I should create 2 > fail over domain . > > Node A ,C (cluster service 1) > Node B , C (cluster service 2) > Node c : (Failover domain 1 : service 1, failover > domain2: service 2) > Each Cluster service comprises : ip address resource , > web serviver init script,file > system resource (gfs) > Also I would like to know what are the advantages of > using gfs in this solution over > other types of files systems (like ext3) , as there > are no 2 active servers writing on the same area at > the > same time. > > > --Regards > Sara > > > > > ____________________________________________________________________________________ > Do you Yahoo!? > Everyone is raving about the all-new Yahoo! Mail beta. > http://new.mail.yahoo.com > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- --- Filipe T Miranda Red Hat Certified Engineer -------------- next part -------------- An HTML attachment was scrubbed... URL: From shailesh at verismonetworks.com Mon Feb 26 05:48:34 2007 From: shailesh at verismonetworks.com (Shailesh) Date: Mon, 26 Feb 2007 11:18:34 +0530 Subject: [Linux-cluster] Problem running GFS on top of AoE In-Reply-To: <08A9A3213527A6428774900A80DBD8D80385BA93@xmb-sjc-222.amer.cisco.com> References: <08A9A3213527A6428774900A80DBD8D80385BA93@xmb-sjc-222.amer.cisco.com> Message-ID: <1172468914.6551.135.camel@shailesh> > Looks like the eth devices have entered into promiscous mode, If you would have a lot of traffic on the ethernet, this could be a cause of the delay. Try cutting down non-AoE traffic. Rgds Shailesh On Fri, 2007-02-23 at 15:34 -0800, Lin Shen (lshen) wrote: > I'm trying to run GFS on top of AoE on my two-node cluster (node A and > B). I was using GNBD previously. > > I first exported a partition from A to B using vblade via eth1 (the > cluster is using eth0 I believe). That seems to work as expected. Then, > I did a gfs_mkfs on the etherd partition on B, and mounted the file > system on both A and B. All went as expected. But when I write stuff > onto the file system on either node, it takes a long time (a few > minutes) or never for the contents to be propagated to the other node. 
> > I saw some abnormal messages in the log, but not quite sure what it > means. This is from node A, and similar messages are also in log from > node B. Any ideas? How should I trouble shoot this? > > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: Joined cluster. Now > mounting FS... > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Trying to > acquire journal lock... > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Trying to > acquire journal lock... > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Looking at > journal... > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Looking at > journal... > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Done > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: jid=1: Done > Feb 23 15:02:51 cfs2 kernel: device eth0 entered promiscuous mode > Feb 23 15:02:51 cfs2 kernel: device eth0 entered promiscuous mode > Feb 23 15:02:51 cfs2 kernel: device eth1 entered promiscuous mode > Feb 23 15:02:51 cfs2 kernel: device eth1 entered promiscuous mode > Feb 23 15:02:51 cfs2 kernel: device lo entered promiscuous mode > Feb 23 15:02:51 cfs2 kernel: device lo entered promiscuous mode > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > From Keith.Lewis at its.monash.edu.au Mon Feb 26 09:19:00 2007 From: Keith.Lewis at its.monash.edu.au (Keith Lewis) Date: Mon, 26 Feb 2007 20:19:00 +1100 Subject: [Linux-cluster] data and machine dependent NFS GFS file xfer problem Message-ID: <200702260919.l1Q9J03O024173@mukluk.its.monash.edu.au> I Wrote > Thursday 22 Feb 2007 > An attempt was made to make sure all the computers in a certain group > had a common set of rpms installed. > ... > Summary - I have one file, R-2.3.1-1.rh3AS.i386.rpm, which one node, > `T', cannot successfully copy to the GFS disk, although it thinks it can, and > can even copy it back, producing a duplicate of the original... > ... > Looking for suggestions, like what to do next, which list to take > it to and so on... Thanks to all who replied. Yes it was hardware. No it had nothing to do with GFS. Yes I'm an idiot. No I don't always use passive voice... We have since discovered that the 4 machines which were in a group which we call subnet 12, could not communicate properly with another group in what we call subnet 13, both subnets are behind a CSM. In particular if a UDP fragment happened to consist solely of ones in the data area, the 4th (16 bit) word would mysteriously get re-written to zero's. With `tcpdump' we could see good data flowing out of machine `T' and bad data entering machines `W' and `C'. (We are guessing that this started happening a few weeks ago when various routers and the CSM were upgraded for security reasons)... (It had to be UDP. TCP and ICMP packets did not trigger the problem). (btw this also explains the hang/fence/reboot that I mentioned in the original mail - the one to zero corruption caused the sender to retry continuously, making the machine too busy to do heartbeats). Thanks again. Keith From rpeterso at redhat.com Mon Feb 26 14:29:45 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Mon, 26 Feb 2007 08:29:45 -0600 Subject: [Linux-cluster] data and machine dependent NFS GFS file xfer problem In-Reply-To: <200702260919.l1Q9J03O024173@mukluk.its.monash.edu.au> References: <200702260919.l1Q9J03O024173@mukluk.its.monash.edu.au> Message-ID: <45E2EED9.5080606@redhat.com> Keith Lewis wrote: > Thanks to all who replied. > > Yes it was hardware. 
>
> No it had nothing to do with GFS.
>
> Yes I'm an idiot.
>
> No I don't always use passive voice...
>
Hi Keith,

Thanks for letting us know how this turned out. No, you're not an idiot;
your questions were perfectly valid. Every time someone talks about GFS
corruption, I start worrying that there might be some nasty bug hiding in
the shadows; I'm glad to hear that's not the case. If there are problems,
we definitely want to hear about them: I'd rather hear customer complaints
than have frustrated customers working around problems we never hear about.

As for passive voice: I guess I'm just a little bit anal when it comes to
English, so I was just poking fun. :) Ask our documentation guys who write
the manuals (like Paul Kennedy) about me; I'm sure they'll tell you I'm a
real pain in the you-know-what. :)

Regards,

Bob Peterson
Red Hat Cluster Suite

From jbrassow at redhat.com Mon Feb 26 16:15:24 2007
From: jbrassow at redhat.com (Jonathan E Brassow)
Date: Mon, 26 Feb 2007 10:15:24 -0600
Subject: [Linux-cluster] Running GFS as local file system
In-Reply-To: <08A9A3213527A6428774900A80DBD8D80385B45B@xmb-sjc-222.amer.cisco.com>
References: <08A9A3213527A6428774900A80DBD8D80385B45B@xmb-sjc-222.amer.cisco.com>
Message-ID: <5c887d6e57303afa8bad78461432992f@redhat.com>

On Feb 22, 2007, at 12:33 PM, Lin Shen (lshen) wrote:

> Is the cluster suite still required if I only use GFS as a local file
> system?

no.

> Based on my experiment, it seems not. I just did a gfs_mkfs on a
> partition and that's it.

If using as a local file system, use the option '-p lock_nolock'. In the
future, if you want to use that FS in a cluster, you must change the
locking protocol with gfs_tool.

> How easy is it to migrate files created under other file systems (such
> as ext3, reiser etc) into GFS?
>

cp/rsync?

brassow

From james.lapthorn at lapthornconsulting.com Tue Feb 27 10:42:56 2007
From: james.lapthorn at lapthornconsulting.com (James Lapthorn)
Date: Tue, 27 Feb 2007 10:42:56 -0000 (UTC)
Subject: [Linux-cluster] clurgmgrd[6147]: Starving for lock usrm::rg="SDA database"
Message-ID: <49900.193.133.138.40.1172572976.squirrel@lapthorn.biz>

Hi Guys,

I have a 4-node cluster running RH Cluster Suite 4. I have just added a
DB2 service to one of the nodes and have started getting errors relating to
locks in the system log. I plan to restart this node at lunch time today to
see if this fixes the problem.

Is there anyone who can explain what these errors relate to so that I can
understand the problem better? I have checked RHN, the Cluster Project
website and Google and I can't find anything.

It's worth mentioning that the service is running fine.
www.clamav.net From sara_sodagar at yahoo.com Tue Feb 27 12:14:41 2007 From: sara_sodagar at yahoo.com (sara sodagar) Date: Tue, 27 Feb 2007 04:14:41 -0800 (PST) Subject: [Linux-cluster] Re: Question about Cluster Service In-Reply-To: <20070226170008.C70A5732A9@hormel.redhat.com> Message-ID: <483373.69806.qm@web31810.mail.mud.yahoo.com> Thanks a lot for replying to my question. Actually I have to distribute data on my web servers as we are using a provisioning software and it just uses this architecture. My plan is to setup 2 active servers that do not share any data , and one passive server which is for both of them. Node A ,C (cluster service 1) Node B , C (cluster service 2) Node c : (Failover domain 1 : service 1,failover domain2: service 2) I want to setup GFS between A,C and another pair B,C My main question is whether I should use GFS or not ? I am confused about whether I should use GFS when I am using High availability service management from Cluster suite. --Regards. Sara --- linux-cluster-request at redhat.com wrote: > Send Linux-cluster mailing list submissions to > linux-cluster at redhat.com > > To subscribe or unsubscribe via the World Wide Web, > visit > > https://www.redhat.com/mailman/listinfo/linux-cluster > or, via email, send a message with subject or body > 'help' to > linux-cluster-request at redhat.com > > You can reach the person managing the list at > linux-cluster-owner at redhat.com > > When replying, please edit your Subject line so it > is more specific > than "Re: Contents of Linux-cluster digest..." > > > Today's Topics: > > 1. Re: Question about Cluster Service (Filipe > Miranda) > 2. Re: Problem running GFS on top of AoE > (Shailesh) > 3. Re: data and machine dependent NFS GFS file > xfer problem > (Keith Lewis) > 4. Re: data and machine dependent NFS GFS file > xfer problem > (Robert Peterson) > 5. Re: Running GFS as local file system (Jonathan > E Brassow) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sun, 25 Feb 2007 20:19:52 -0300 > From: "Filipe Miranda" > Subject: Re: [Linux-cluster] Question about Cluster > Service > To: "linux clustering" > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > Hi there Sodagar, > > Will the Web servers present the same data? will > they serve the same > content? > Why not use all three servers active (without the > fail-over mode) and just > add a layer of load balancing to the top of this > solution (two machines with > IPVS? > > About the GFS, it is a file system that handles > multiple servers accessing > the same data on the same partition. > If the Web servers are presenting the same data to > users, GFS will be very > helpful to avoid data redundancy on the storage. > > If you do not have a dedicated storage GNBD and > ISCSI are good choices. > > Regards, > > > > > On 2/25/07, sara sodagar > wrote: > > > > Hi > > I would be grateful if anyone could tell me if > this > > solution works or not? > > I am planning to use Web server cluster.I have 2 > > active > > servers and 1 passive server .I suppose I should > > create 2 cluster service as in each cluster > service > > there > > should be 1 active server. > > As I have only 1 passive server , I should create > 2 > > fail over domain . 
> > > > ster service comprises : ip address > resource , > > web serviver initNode A ,C (cluster service 1) > > Node B , C (cluster service 2) > > Node c : (Failover domain 1 : service 1, > failover > > domain2: service 2) > > Each Clu script,file > > system resource (gfs) > > Also I would like to know what are the advantages > of > > using gfs in this solution over > > other types of files systems (like ext3) , as > there > > are no 2 active servers writing on the same area > at > > the > > same time. > > > > > > --Regards > > Sara > > > > > > > > > > > ____________________________________________________________________________________ > > Do you Yahoo!? > > Everyone is raving about the all-new Yahoo! Mail > beta. > > http://new.mail.yahoo.com > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > -- > --- > Filipe T Miranda > Red Hat Certified Engineer > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > https://www.redhat.com/archives/linux-cluster/attachments/20070225/87af3727/attachment.html > > ------------------------------ > > Message: 2 > Date: Mon, 26 Feb 2007 11:18:34 +0530 > From: Shailesh > Subject: Re: [Linux-cluster] Problem running GFS on > top of AoE > To: linux clustering > Message-ID: <1172468914.6551.135.camel at shailesh> > Content-Type: text/plain > > > Looks like the eth devices have entered into > promiscous mode, If > you would have a lot of traffic on the ethernet, > this could > be a cause of the delay. Try cutting down non-AoE > traffic. > > > Rgds > Shailesh > > On Fri, 2007-02-23 at 15:34 -0800, Lin Shen (lshen) > wrote: > > I'm trying to run GFS on top of AoE on my two-node > cluster (node A and > > B). I was using GNBD previously. > > > > I first exported a partition from A to B using > vblade via eth1 (the > > cluster is using eth0 I believe). That seems to > work as expected. Then, > > I did a gfs_mkfs on the etherd partition on B, and > mounted the file > > system on both A and B. All went as expected. But > when I write stuff > > onto the file system on either node, it takes a > long time (a few > > minutes) or never for the contents to be > propagated to the other node. > > > > I saw some abnormal messages in the log, but not > quite sure what it > > means. This is from node A, and similar messages > are also in log from > > node B. Any ideas? How should I trouble shoot > this? > > > > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: > Joined cluster. Now > > mounting FS... > > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: > jid=1: Trying to > > acquire journal lock... > > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: > jid=1: Trying to > > acquire journal lock... > > Feb 23 11:46:52 cfs2 kernel: GFS: fsid=gfs:aoe.1: > jid=1: === message truncated === ____________________________________________________________________________________ Looking for earth-friendly autos? Browse Top Cars by "Green Rating" at Yahoo! Autos' Green Center. http://autos.yahoo.com/green_center/ From haller at atix.de Tue Feb 27 15:16:21 2007 From: haller at atix.de (Dirk Haller) Date: Tue, 27 Feb 2007 16:16:21 +0100 Subject: [Linux-cluster] GFS 6.1 crashed (glock.c) Message-ID: <200702271616.21576.haller@atix.de> Hello list, we have a running two node GFS 6.1 Cluster and today GFS crashed on one node suddenly. 
Please have a look at the following log messages: ---- Feb 27 12:23:39 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: fatal: assertion "FALSE" failed Feb 27 12:23:39 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: function = xmote_bh Feb 27 12:23:39 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: file = /builddir/build/BUILD/gfs-kernel-2.6.9-60/smp/src/gfs/glock.c, line = 1093 Feb 27 12:23:39 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: time = 1172575419 Feb 27 12:23:39 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: about to withdraw from the cluster Feb 27 12:23:39 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: waiting for outstanding I/O Feb 27 12:23:39 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: telling LM to withdraw ---- We are not able to reproduce the problem, because we are not sure what is responsible for this problem. I found an older post in this list, where the same problem exists, but there is no real solution or a reason why this is happening. The cluster's operating system is RHEL4 U4 (x86_64). Kernel version is 2.6.9-42.0.3.ELsmp and the following GFS rpms are installed and in use. GFS-6.1.6-1 GFS-kernel-2.6.9-60.3 GFS-kernel-smp-2.6.9-60.3 GFS-kernheaders-2.6.9-60.3 Any hints and tips to look deeper into this problem or even a solution would be great. For more details, please have a look at the attached crash log. Thanks in advance! -- Gruss / Regards Dirk Haller -------------- next part -------------- Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: fatal: assertion "FALSE" failed Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: function = xmote_bh Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: file = /builddir/build/BUILD/gfs-kernel-2.6.9-60/smp/src/gfs/glock.c, line = 1093 Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: time = 1172575419 Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: about to withdraw from the cluster Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: waiting for outstanding I/O Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: telling LM to withdraw Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: fatal: assertion "FALSE" failed Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: function = xmote_bh Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: file = /builddir/build/BUILD/gfs-kernel-2.6.9-60/smp/src/gfs/glock.c, line = 1093 Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: time = 1172575419 Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: about to withdraw from the cluster Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: waiting for outstanding I/O Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: telling LM to withdraw Feb 27 12:23:43 node2 clurgmgrd: [14717]: Checking 172.23.50.51, Level 0 Feb 27 12:23:43 node2 clurgmgrd: [14717]: 172.23.50.51 present on bond1 Feb 27 12:23:43 node2 clurgmgrd: [14717]: Link for bond1: Detected Feb 27 12:23:43 node2 
clurgmgrd: [14717]: Link detected on bond1 Feb 27 12:23:46 node1 GFS: fsid=ozeane:lt_atlantik.0: jid=1: Trying to acquire journal lock... Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173243 Feb 27 12:23:46 node1 GFS: fsid=ozeane:lt_atlantik.0: jid=1: Looking at journal... Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173243 Feb 27 12:23:46 node1 GFS: fsid=ozeane:lt_atlantik.0: jid=1: Acquiring the transaction lock... Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173243 Feb 27 12:23:46 node1 GFS: fsid=ozeane:lt_atlantik.0: jid=1: Replaying journal... Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173243 Feb 27 12:23:46 node1 GFS: fsid=ozeane:lt_atlantik.0: jid=1: Replayed 0 of 0 blocks Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173243 Feb 27 12:23:46 node1 GFS: fsid=ozeane:lt_atlantik.0: jid=1: replays = 0, skips = 0, sames = 0 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173243 Feb 27 12:23:46 node1 GFS: fsid=ozeane:lt_atlantik.0: jid=1: Journal replayed in 1s Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173243 Feb 27 12:23:46 node2 lock_dlm: withdraw abandoned memory Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 GFS: fsid=ozeane:lt_atlantik.1: withdrawn Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 GFS: fsid=ozeane:lt_atlantik.1: ret = 0x00000003 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node1 GFS: fsid=ozeane:lt_atlantik.0: jid=1: Done Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173243 Feb 27 12:23:46 node2 general protection fault: 0000 [1] SMP Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 CPU 0 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 Modules linked in: nfsd exportfs lockd nfs_acl sg cpqci(U) mptctl mptbase netconsole netdump i2c_dev i2c_core sunrpc ext3 jbd button battery ac ohci_hcd hw_random shpchp floppy md5 ipv6 lock_dlm(U) dlm(U) gfs(U) lock_harness(U) cman(U) bonding(U) dm_round_robin dm_multipath qla2300 qla2xxx scsi_transport_fc cciss sd_mod scsi_mod dm_snapshot dm_mirror dm_mod tg3 e1000 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 Pid: 17539, comm: lock_dlm1 Tainted: P 2.6.9-42.0.3.ELsmp Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 RIP: 0010:[] {:gfs:run_queue+477} Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 RSP: 0018:00000100e5891db8 EFLAGS: 00010202 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 RAX: 000000000006000f RBX: 000001006f426920 RCX: 0000000000000001 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 RDX: ffffffffa017e9c0 RSI: 0000000000000001 RDI: 000001006f4268c8 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 RBP: 000001006d604420 R08: ffffffff803e1fe8 R09: 0000000000000001 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 
12:23:46 node2 R10: 0000000100000000 R11: ffffffff8011e884 R12: 0000000000000001 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 R13: 560a11000001000a R14: ffffff0000481000 R15: 000001006f4268c8 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 FS: 0000002a96a970e0(0000) GS:ffffffff804e5180(0000) knlGS:00000000f61d1bb0 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 CR2: 0000002a96a86880 CR3: 0000000000101000 CR4: 00000000000006e0 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 Process lock_dlm1 (pid: 17539, threadinfo 00000100e5890000, task 00000100e8ed17f0) Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 Stack: 0000000000000000 000001006f4268f4 000001006d604420 000001006f4268f4 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 000001006f4268c8 ffffff0000481000 0000000000000003 ffffffffa013facf Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 0000000000000001 0000000000000001 Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 Call Trace:{:gfs:xmote_bh+953} {:gfs:gfs_glock_cb+194} Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 {:lock_dlm:dlm_async+1989} {__wake_up_common+67} Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 {default_wake_function+0} {keventd_create_kthread+0} Feb 27 12:23:46 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:46 node2 {:lock_dlm:dlm_async+0} ??Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: fatal: assertion "FALSE" failed Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: function = xmote_bh Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: file = /builddir/build/BUILD/gfs-kernel-2.6.9-60/smp/src/gfs/glock.c, line = 1093 Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: time = 1172575419 Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: about to withdraw from the cluster Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: waiting for outstanding I/O Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 GFS: fsid=ozeane:lt_atlantik.1: telling LM to withdraw Feb 27 12:23:43 syslog-server netdump[5743]: Got strange package from ip 0xac173242 Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: fatal: assertion "FALSE" failed Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: function = xmote_bh Feb 27 12:23:43 node2 kernel: GFS: fsid=ozeane:lt_atlantik.1: file = 
/builddir/build/BUILD/gfs-kernel-2.6.9-60/smp/src/gfs/glock.c, line = 1093 Feb 27 12:23:47 node1 clurgmgrd: [15224]: Link detected on bond1 Feb 27 12:23:48 node1 clurgmgrd: [15224]: Using atlantik as NetBIOS name (service atlantik) Feb 27 12:23:48 node1 clurgmgrd: [15224]: Checking Samba instance "atlantik" Feb 27 12:23:48 node1 clurgmgrd: [15224]: Checking 172.23.50.52, Level 0 Feb 27 12:23:48 node1 smbd[10164]: [2007/02/27 12:23:42, 0] printing/print_cups.c:cups_cache_reload(85) Feb 27 12:23:48 node1 smbd[31559]: [2007/02/27 12:23:42, 0] printing/print_cups.c:cups_cache_reload(85) Feb 27 12:23:48 node1 smbd[10164]: Unable to connect to CUPS server localhost - Connection refused Feb 27 12:23:48 node1 smbd[31559]: Unable to connect to CUPS server localhost - Connection refused Feb 27 12:23:48 node1 smbd[10164]: [2007/02/27 12:23:42, 0] printing/print_cups.c:cups_cache_reload(85) Feb 27 12:23:48 node1 smbd[31559]: [2007/02/27 12:23:42, 0] printing/print_cups.c:cups_cache_reload(85) Feb 27 12:23:48 node1 smbd[10164]: Unable to connect to CUPS server localhost - Connection refused Feb 27 12:23:48 node1 smbd[31559]: Unable to connect to CUPS server localhost - Connection refused Feb 27 12:23:48 node1 kernel: GFS: fsid=ozeane:lt_atlantik.0: jid=1: Trying to acquire journal lock... Feb 27 12:23:48 node1 kernel: GFS: fsid=ozeane:lt_atlantik.0: jid=1: Looking at journal... Feb 27 12:23:48 node1 kernel: GFS: fsid=ozeane:lt_atlantik.0: jid=1: Acquiring the transaction lock... Feb 27 12:23:48 node1 kernel: GFS: fsid=ozeane:lt_atlantik.0: jid=1: Replaying journal... Feb 27 12:23:48 node1 kernel: GFS: fsid=ozeane:lt_atlantik.0: jid=1: Replayed 0 of 0 blocks Feb 27 12:23:48 node1 kernel: GFS: fsid=ozeane:lt_atlantik.0: jid=1: replays = 0, skips = 0, sames = 0 Feb 27 12:23:48 node1 kernel: GFS: fsid=ozeane:lt_atlantik.0: jid=1: Journal replayed in 1s Feb 27 12:23:48 node1 kernel: GFS: fsid=ozeane:lt_atlantik.0: jid=1: Done From rpeterso at redhat.com Tue Feb 27 15:14:25 2007 From: rpeterso at redhat.com (Robert Peterson) Date: Tue, 27 Feb 2007 09:14:25 -0600 Subject: [Linux-cluster] Re: Question about Cluster Service In-Reply-To: <483373.69806.qm@web31810.mail.mud.yahoo.com> References: <483373.69806.qm@web31810.mail.mud.yahoo.com> Message-ID: <45E44AD1.7040306@redhat.com> sara sodagar wrote: > Thanks a lot for replying to my question. > Actually I have to distribute data on my web servers > as > we are using a provisioning software and it just uses > this architecture. > My plan is to setup 2 active servers that do not share > any data , and one passive server which is for both of > them. > Node A ,C (cluster service 1) > Node B , C (cluster service 2) > Node c : (Failover domain 1 : service 1,failover > domain2: service 2) > I want to setup GFS between A,C and another pair B,C > My main question is whether I should use GFS or not ? > I am confused about whether I should use GFS when I am > using High availability service management from > Cluster suite. > > --Regards. > Sara > Hi Sara, I'm not sure I understood that completely. However, the question comes down to whether the systems (for example "A" and "C") have physical access to the same shared storage, like a SAN. If they're trying to coexist using shared storage, then you want to use GFS. If "C" is only seeing a copy of the data on "A" then you don't need GFS. The same applies to "B" and "C" and their storage. You don't need GFS in order to do High Availability failover services. You do need GFS if your storage is shared. 
Regards, Bob Peterson Red Hat Cluster Suite From Alain.Moulle at bull.net Tue Feb 27 15:50:07 2007 From: Alain.Moulle at bull.net (Alain Moulle) Date: Tue, 27 Feb 2007 16:50:07 +0100 Subject: [Linux-cluster] Re: CS4 Update 4 / Oops in dlm module (Lon Hohberger) Message-ID: <45E4532F.1080206@bull.net> Hi We test it : 1/ it seems that the services stuck in stopping state is fixed 2/ about DLM Oops, we have not reproduced it but it happens only once with former rpm version, so ... wait & see ... 3/ we have a problem just after the boot in clurgmgrd, don't know if it is due to this new rpm or not, but we never had this problem with former rpm version; syslog gives : clurgmgrd[7069]: Services Initialized clurgmgrd[7069]: #10: Couldn't set up listen socket and CS4 is stalled on the machine. Any idea ? Thanks a lot Alain > Could you install the current rgmanager test RPM: > > http://people.redhat.com/lhh/rgmanager-1.9.54-3.228823test.i386.rpm > > ...and see if it goes away? The above RPM is the same as 1.9.54, but > includes fix for an assertion failure, a way to fix services stuck in > the stopping state, and (the important one for you) a fix for an > intermittent DLM lock leak. > > ia64/x86_64/srpms here: http://people.redhat.com/lhh/packages.html From dave at eons.com Tue Feb 27 15:58:08 2007 From: dave at eons.com (Dave Berry) Date: Tue, 27 Feb 2007 10:58:08 -0500 Subject: [Linux-cluster] failover domain issue Message-ID: <45E45510.2030102@eons.com> I currently have a failover domain configured in an ordered, restricted setup(RHEL4U4) and the failover is not working. Server fs101 is disconnected from the network and fenced, fs102 becomes the primary. When fs101 returns, I get an error on fs102( #52: Failed changing RG status ) and it does not fail back to fs101. Any thoughts? -dave From teigland at redhat.com Tue Feb 27 21:31:15 2007 From: teigland at redhat.com (David Teigland) Date: Tue, 27 Feb 2007 15:31:15 -0600 Subject: [Linux-cluster] cluster-1.04.00 Message-ID: <20070227213114.GB21810@redhat.com> A new source tarball from the STABLE branch has been released; it builds and runs on 2.6.20: ftp://sources.redhat.com/pub/cluster/releases/cluster-1.04.00.tar.gz Version 1.04.00 - 27 February 2007 ================================== cman * Add cman/cluster_id field to CCS to allow users to override the cluster ID generated from the name. bz#219588 cman-kernel * Fix global id creation in SM to avoid duplicates. bz#206193 bz#217626 * Check for NULL in process_startdone_barrier_new(). bz#206212 * Fix proc reads. bz#213723 * Fix race that could panic if cman_kill_node() is called when we are shutting down. bz#223098 * Tell SM when the quorum device comes or goes. * Always queue kill messages that need ACKs, so we don't block apps like qdiskd. bz#223462 * Merge qdisk fixes from RHEL4 branch. dlm-kernel * Don't create lkids of 0. bz#199673 * Add a spinlock around the ls_nodes_gone list. bz#206463 gfs-kernel * Was passing fl_pid instead of fl_owner causing F_GETLK problems. * bz#206339 * F_UNLCK was returning -ENOENT when it didn't find plocks. bz#206590 * Include sd_freeze_count in counters output. * Add SELinux xattr support. * Change the default drop_count from 50,000 to 200,000. bz#218795 * Update the drop_count for mounted fs's. bz#218795 * Fix gfs knows of directories which it chooses not to display. * bz#190756 * Don't panic if we try to unlock a plock that's already being * unlocked. bz#220219 fence * fence_tool - add timeout option for leave. * fenced - add manual override. 
From teigland at redhat.com Tue Feb 27 21:44:42 2007
From: teigland at redhat.com (David Teigland)
Date: Tue, 27 Feb 2007 15:44:42 -0600
Subject: [Linux-cluster] cluster-2.00.00
Message-ID: <20070227214442.GC21810@redhat.com>

The first source tarball of the new cluster code from cvs HEAD has been
released; it builds and runs on 2.6.20:

  ftp://sources.redhat.com/pub/cluster/releases/cluster-2.00.00.tar.gz

Most notably, this new cluster architecture is based on openais
(http://developer.osdl.org/dev/openais/) and on the gfs2 and dlm kernel
components that are now present in the upstream kernel.

The tarball still includes the gfs1 kernel code under cluster/gfs-kernel
and the user code under cluster/gfs.  This version of gfs1 uses the
lock_dlm and lock_nolock modules from the upstream kernel under
fs/gfs2/locking.


From sara_sodagar at yahoo.com Wed Feb 28 08:07:59 2007
From: sara_sodagar at yahoo.com (sara sodagar)
Date: Wed, 28 Feb 2007 00:07:59 -0800 (PST)
Subject: [Linux-cluster] Re: Question about Cluster Service
In-Reply-To: <20070227170007.1E75F73568@hormel.redhat.com>
Message-ID: <419766.4860.qm@web31811.mail.mud.yahoo.com>

Hi,

Thank you very much for replying to my question.
My first question is whether I can use one node (Node C) as a passive
node for two separate cluster services (service1, service2).
We are using a SAN in our solution, and I think there is no need for
Node C to have a copy of the data on A, because that would be a waste of
storage on the SAN.  It should simply have access to the data on A in
case of failure.  (I don't know whether GFS between A and C is necessary
for that or not.)

--Regards.
Sara

--- linux-cluster-request at redhat.com wrote:

> Today's Topics:
>
>    1. clurgmgrd[6147]: Starving for lock usrm::rg="SDA database"
>       (James Lapthorn)
>    2. Re: Question about Cluster Service (sara sodagar)
>    3. GFS 6.1 crashed (glock.c) (Dirk Haller)
>    4. Re: Re: Question about Cluster Service (Robert Peterson)
>    5. Re: CS4 Update 4 / Oops in dlm module (Lon Hohberger)
>       (Alain Moulle)
>    6. failover domain issue (Dave Berry)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 27 Feb 2007 10:42:56 -0000 (UTC)
> From: "James Lapthorn"
> Subject: [Linux-cluster] clurgmgrd[6147]: Starving for lock
> 	usrm::rg="SDA database"
> To: linux-cluster at redhat.com
> Message-ID: <49900.193.133.138.40.1172572976.squirrel at lapthorn.biz>
> Content-Type: text/plain;charset=iso-8859-1
>
> Hi Guys,
>
> I have a 4-node cluster running RH Cluster Suite 4.  I have just added
> a DB2 service to one of the nodes and have started getting errors
> relating to locks in the system log.  I plan to restart this node at
> lunch time today to see if this fixes the problem.
>
> Can anyone explain what these errors relate to so that I can
> understand the problem better?  I have checked RHN, the Cluster
> Project website and Google, and I can't find anything.
>
> It's worth mentioning that the service is running fine.
>
> Feb 27 10:18:40 leoukldb2 clurgmgrd[6147]: Starving for lock usrm::rg="SDA database"
> Feb 27 10:19:40 leoukldb2 last message repeated 2 times
> Feb 27 10:21:10 leoukldb2 last message repeated 3 times
> Feb 27 10:22:55 leoukldb2 last message repeated 2 times
> Feb 27 10:22:55 leoukldb2 clurgmgrd[6147]: #48: Unable to obtain cluster lock: Connection timed out
> Feb 27 10:23:40 leoukldb2 clurgmgrd[6147]: #50: Unable to obtain cluster lock: Connection timed out
> Feb 27 10:24:25 leoukldb2 clurgmgrd[6147]: #48: Unable to obtain cluster lock: Connection timed out
> Feb 27 10:24:55 leoukldb2 clurgmgrd[6147]: Node ID:211c013b0000000b stuck with lock usrm::rg="SDA database"
>
> ------------------------------
>
> Message: 2
> Date: Tue, 27 Feb 2007 04:14:41 -0800 (PST)
> From: sara sodagar
> Subject: [Linux-cluster] Re: Question about Cluster Service
> To: linux-cluster at redhat.com
> Message-ID: <483373.69806.qm at web31810.mail.mud.yahoo.com>
> Content-Type: text/plain; charset=iso-8859-1
>
> Thanks a lot for replying to my question.
> Actually I have to distribute data on my web servers as
> we are using a provisioning software and it just uses
> this architecture.
> My plan is to setup 2 active servers that do not share
> any data, and one passive server which is for both of them.
> Node A, C (cluster service 1)
> Node B, C (cluster service 2)
> Node C: (Failover domain 1: service 1, failover domain 2: service 2)
> I want to setup GFS between A,C and another pair B,C.
> My main question is whether I should use GFS or not?
> I am confused about whether I should use GFS when I am
> using High availability service management from
> Cluster suite.
>
> --Regards.
> Sara
>
> --- linux-cluster-request at redhat.com wrote:
>
> > Today's Topics:
> >
> >    1. Re: Question about Cluster Service (Filipe Miranda)
> >    2. Re: Problem running GFS on top of AoE (Shailesh)
> >    3. Re: data and machine dependent NFS GFS file xfer problem
> >       (Keith Lewis)
> >    4. Re: data and machine dependent NFS GFS file xfer problem
> >       (Robert Peterson)
> >    5. Re: Running GFS as local file system (Jonathan E Brassow)
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Sun, 25 Feb 2007 20:19:52 -0300
> > From: "Filipe Miranda"
> > Subject: Re: [Linux-cluster] Question about Cluster Service
> > To: "linux clustering"
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Hi there Sodagar,
> >
> > Will the Web servers present the same data?  Will they serve the
> > same content?
> > Why not use all three servers active (without the fail-over mode)
> > and just add a layer of load balancing on top of this solution
> > (two machines with
> === message truncated ===


From erwan at seanodes.com Wed Feb 28 13:45:36 2007
From: erwan at seanodes.com (Erwan Velu)
Date: Wed, 28 Feb 2007 14:45:36 +0100
Subject: [Linux-cluster] Typo in Makefile
Message-ID: <45E58780.8060901@seanodes.com>

I found many lines like this one in the Makefiles of rgmanager:

rgmanager/src/utils/Makefile: $(CC) -o $@ $^ $(INLUDE) $(CFLAGS) $(LDFLAGS)

Looks like INLUDE is a typo ;)


From rpeterso at redhat.com Wed Feb 28 15:11:39 2007
From: rpeterso at redhat.com (Robert Peterson)
Date: Wed, 28 Feb 2007 09:11:39 -0600
Subject: [Linux-cluster] Re: Question about Cluster Service
In-Reply-To: <419766.4860.qm@web31811.mail.mud.yahoo.com>
References: <419766.4860.qm@web31811.mail.mud.yahoo.com>
Message-ID: <45E59BAB.6000103@redhat.com>

sara sodagar wrote:
> Hi
> Thank you very much for replying to my question.
> My first question is whether I can use one node (Node C) as a passive
> node for two separate cluster services (service1, service2).
> We are using a SAN in our solution, and I think there is no need for
> Node C to have a copy of the data on A, because that would be a waste
> of storage on the SAN.  It should simply have access to the data on A
> in case of failure.  (I don't know whether GFS between A and C is
> necessary for that or not.)
>
> --Regards.
> Sara
>
Hi Sara,

I am still confused by your explanation, but I'll try to answer your
question anyway.  Yes, it's possible for C to be a passive (failover)
server for both node A and node B.  I'm not an rgmanager expert, but I
think you can do this by configuring two failover domains in your
/etc/cluster/cluster.conf.  I don't know what service you need to fail
over to "C", but here's an example that uses a virtual IP address
service:
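(The XML that followed here was stripped when the list archiver scrubbed
the HTML attachment.  The fragment below is a reconstructed sketch of the
kind of configuration being described; the node names, addresses and
priorities are invented.)

<rm>
        <failoverdomains>
                <!-- Service 1 prefers A; C takes over only if A fails -->
                <failoverdomain name="domainA" ordered="1">
                        <failoverdomainnode name="nodeA" priority="1"/>
                        <failoverdomainnode name="nodeC" priority="2"/>
                </failoverdomain>
                <!-- Service 2 prefers B; C takes over only if B fails -->
                <failoverdomain name="domainB" ordered="1">
                        <failoverdomainnode name="nodeB" priority="1"/>
                        <failoverdomainnode name="nodeC" priority="2"/>
                </failoverdomain>
        </failoverdomains>
        <service name="service1" domain="domainA" autostart="1">
                <ip address="10.0.0.101" monitor_link="1"/>
        </service>
        <service name="service2" domain="domainB" autostart="1">
                <ip address="10.0.0.102" monitor_link="1"/>
        </service>
</rm>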
This basically defines two virtual IP addresses: one for "A", which "C"
can take over if A fails, and a second one for "B", which "C" can take
over if B fails.  The failover domains are ordered, which means
prioritized, and "A" is given the highest priority for one, and "B" for
the other.  That means A and B will run the services until they fail;
if they fail, C will be used to host the service.  In this example the
services are virtual IP addresses, but you can use the same basic idea
to define any service, even your own custom service if you want.

Unless I misunderstood your description, it sounds like your data is
kept on a SAN, and you want "C" to be able to serve the same data on
that SAN (not a copy) for both "A" and "B".  If that's the case, then
the storage is shared between A and C, and yes, you want to use GFS to
manage the data.

I hope this answers your question.

Regards,

Bob Peterson
Red Hat Cluster Suite


From teigland at redhat.com Wed Feb 28 15:45:36 2007
From: teigland at redhat.com (David Teigland)
Date: Wed, 28 Feb 2007 09:45:36 -0600
Subject: [Linux-cluster] cluster-1.04.00
In-Reply-To: <20070227213114.GB21810@redhat.com>
References: <20070227213114.GB21810@redhat.com>
Message-ID: <20070228154536.GA14545@redhat.com>

On Tue, Feb 27, 2007 at 03:31:15PM -0600, David Teigland wrote:
> A new source tarball from the STABLE branch has been released; it
> builds and runs on 2.6.20:
>
>   ftp://sources.redhat.com/pub/cluster/releases/cluster-1.04.00.tar.gz

The release.mk.input files had a trailing space after the 1.04.00
version number, which broke compiles.  I've fixed that and replaced the
tarball.  There appear to be some other build issues, so I expect we may
put out a 1.04.01 soon.

Dave


From erwan at seanodes.com Wed Feb 28 16:10:45 2007
From: erwan at seanodes.com (Erwan Velu)
Date: Wed, 28 Feb 2007 17:10:45 +0100
Subject: [Linux-cluster] Typo in Makefile
In-Reply-To: <45E58780.8060901@seanodes.com>
References: <45E58780.8060901@seanodes.com>
Message-ID: <45E5A985.8030302@seanodes.com>

Erwan Velu wrote:
> I found many lines like this one in the Makefiles of rgmanager:
>
> rgmanager/src/utils/Makefile: $(CC) -o $@ $^ $(INLUDE) $(CFLAGS) $(LDFLAGS)
>
> Looks like INLUDE is a typo ;)
>
I forgot to say that this applies to cluster-1.04 and earlier versions.


From aruvic at bits.ba Wed Feb 28 21:29:44 2007
From: aruvic at bits.ba (aruvic at bits.ba)
Date: Wed, 28 Feb 2007 22:29:44 +0100 (CET)
Subject: [Linux-cluster] Typo in Makefile
In-Reply-To: <45E5A985.8030302@seanodes.com>
References: <45E58780.8060901@seanodes.com> <45E5A985.8030302@seanodes.com>
Message-ID: <1243.192.168.50.4.1172698184.squirrel@www.bits.ba>

Hi,

When I try to start cman with the command "service cman start", I get
this:

  Starting cman:    [FAILED]

and in /var/log/messages I have this line:

  Feb 28 22:24:30 clus1 cman: FATAL: Module cman not found.  failed

Does someone have an idea why the cman module cannot be found?

Thanks,
Alen Ruvic


From rpeterso at redhat.com Wed Feb 28 22:04:05 2007
From: rpeterso at redhat.com (Robert Peterson)
Date: Wed, 28 Feb 2007 16:04:05 -0600
Subject: [Linux-cluster] Typo in Makefile
In-Reply-To: <1243.192.168.50.4.1172698184.squirrel@www.bits.ba>
References: <45E58780.8060901@seanodes.com> <45E5A985.8030302@seanodes.com>
	<1243.192.168.50.4.1172698184.squirrel@www.bits.ba>
Message-ID: <45E5FC55.9070302@redhat.com>

aruvic at bits.ba wrote:
> Hi,
>
> When I try to start cman with the command "service cman start", I get
> this:
>
>   Starting cman:    [FAILED]
>
> and in /var/log/messages I have this line:
>
>   Feb 28 22:24:30 clus1 cman: FATAL: Module cman not found.  failed
>
> Does someone have an idea why the cman module cannot be found?
>
> Thanks,
> Alen Ruvic
>
Hi Alen,

Did you build the software by hand?  You may need to do "depmod -a" to
add cman.ko to your kernel module dependencies.  If that's not the
problem, then try doing "modprobe cman" and see if it complains.

Also, let us know what software you're trying to do this with.  There
isn't a cman.ko module in the new (HEAD / RHEL5) software, but there is
in the current (STABLE / RHEL4) software.

Regards,

Bob Peterson
Red Hat Cluster Suite
From berrange at redhat.com Mon Feb 12 17:53:59 2007
From: berrange at redhat.com (Daniel P. Berrange)
Date: Mon, 12 Feb 2007 17:53:59 -0000
Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts
In-Reply-To: <1171057143.4798.43.camel@asuka.boston.devel.redhat.com>
References: <45CCE528.8070904@redhat.com>
	<1171057143.4798.43.camel@asuka.boston.devel.redhat.com>
Message-ID: <20070212175357.GF21671@redhat.com>

On Fri, Feb 09, 2007 at 04:39:03PM -0500, Lon Hohberger wrote:
> On Fri, 2007-02-09 at 16:18 -0500, Hugh Brock wrote:
>
> > Todos:
> > Investigate gparted, one of the partition management tools we already
> > have (apis? remote accessibility?) (I believe Jim Meyering volunteered
>
> * Investigate Conga's cluster and non-cluster remotely-accessible LVM
>   management, which sounds like it would fit the bill?
>
> APIs are all XMLRPC, IIRC, so they're extensible and flexible.

Unfortunately there is a bit of an impedance mismatch between libvirt
and Conga.  libvirt is a low-level library written with the goal that
if you have a host running Xen / QEMU / KVM, you can just drop in the
libvirt library and get a set of APIs for managing the system.
Experience with developing virt-inst, virt-manager and cobbler/koan has
shown that we need a simple API for enumerating available storage
volumes and allocating new volumes.

In providing such an API, though, we don't want to mandate that everyone
using libvirt also install Conga.  While Conga is indeed a very capable
tool, requiring the install / setup of another web service would put up
a significant barrier to entry for people wanting to use libvirt,
particularly for developers who are just experimenting with
virtualization on a laptop / desktop / couple of machines.  Hence our
initial goal is to find a suitable C library we can call into to perform
our simple set of storage management tasks.

In keeping with the libvirt model of pluggable hypervisor drivers, I'd
expect the underlying libvirt implementation of any storage APIs to also
be pluggable.  So while the initial implementation might be based on
GParted, we would have the option of also providing a Conga-based
backend at a later date.

Regards,
Dan

-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|


From dlutter at redhat.com Mon Feb 12 19:23:35 2007
From: dlutter at redhat.com (David Lutterkort)
Date: Mon, 12 Feb 2007 19:23:35 -0000
Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts
In-Reply-To: <45D0A3FE.3000500@redhat.com>
References: <45CCE528.8070904@redhat.com>
	<1171057143.4798.43.camel@asuka.boston.devel.redhat.com>
	<45D0A3FE.3000500@redhat.com>
Message-ID: <1171308215.12535.23.camel@localhost.localdomain>

On Mon, 2007-02-12 at 12:29 -0500, James Parsons wrote:
> Lon Hohberger wrote:
> > http://sourceware.org/cluster/conga/

It seems that that should be http://www.sourceware.org/cluster/conga/ -
I get a 503 without the www.

BTW, where is the conga CVS?  It doesn't seem to be linked from that
page.
David


From dlutter at redhat.com Mon Feb 12 19:28:41 2007
From: dlutter at redhat.com (David Lutterkort)
Date: Mon, 12 Feb 2007 19:28:41 -0000
Subject: [Linux-cluster] Re: [Libvir] Storage manager initial requirements and thoughts
In-Reply-To: <1171057143.4798.43.camel@asuka.boston.devel.redhat.com>
References: <45CCE528.8070904@redhat.com>
	<1171057143.4798.43.camel@asuka.boston.devel.redhat.com>
Message-ID: <1171308521.12535.29.camel@localhost.localdomain>

On Fri, 2007-02-09 at 16:39 -0500, Lon Hohberger wrote:
> * Investigate Conga's cluster and non-cluster remotely-accessible LVM
>   management, which sounds like it would fit the bill?

Scott did a lot of work to add storage management capabilities to puppet
based on Conga.  Those features haven't been embraced by the puppet
community.  The feedback I have seen has been that (a) Conga is not in
Fedora, let alone any other Linux distro (e.g., Debian), and (b) Conga
is perceived as too RH-specific.

David


From riaan at linuxwarehouse.co.za Wed Feb 14 15:06:03 2007
From: riaan at linuxwarehouse.co.za (Riaan van Niekerk)
Date: Wed, 14 Feb 2007 15:06:03 -0000
Subject: [Linux-cluster] Can't see all volumes
In-Reply-To: <24FC2E99D9E64349B866EAC79D68A9A0@ZSTAR>
References: <20072148216.915974@leena>
	<24FC2E99D9E64349B866EAC79D68A9A0@ZSTAR>
Message-ID: <45D3252A.40109@linuxwarehouse.co.za>

shirai at SystemCreateInc wrote:
> Hi Mike
>
> Could you show the output of cat /proc/scsi/scsi?
> I think that it can recognize only the drive of LUN number 0.
>
> 1. Add the following line to /etc/modprobe.conf.
>    options scsi_mod.o max_scsi_luns=255

slight correction:

options scsi_mod max_luns=255

> 2. Next, this setting must be read when you boot the OS.
>    #mkinitrd /boot/<your initrd image> `uname -r`
>
> Regards
>
> ------------------------------------------------------
> Shirai Noriyuki
> Chief Engineer, Technical Div., System Create Inc
> Kanda Toyo Bldg, 3-4-2 Kandakajicho, Chiyodaku, Tokyo 101-0045 Japan
> Tel: 81-3-5296-3775  Fax: 81-3-5296-3777
> e-mail: shirai at sc-i.co.jp  web: http://www.sc-i.co.jp
> ------------------------------------------------------
>
>> With all of the techs on this list... no one has seen this problem?
>> Thought I'd ask again since I'm still stumped.
>>
>> By the way, does anyone know of any user groups or forums for the
>> Xyratex/MTI style storage chassis?
>>
>> Mike
>>
>>> This might or might not be the right list and if it is not, does
>>> anyone know one that would cover things like fibre channel RAID
>>> devices?  I have an 800GB RAID drive which I have split into 32
>>> smaller volumes.  This is a fibre channel system, Xyratex.  The
>>> problem is that I cannot see any more than 2 volumes per controller
>>> when I check from any of the nodes.  Basically, I would like to set
>>> up a small volume for each node along with their various GFS volumes.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: riaan.vcf
Type: text/x-vcard
Size: 452 bytes
Desc: not available
URL: