From pasik at iki.fi  Wed Oct  1 13:26:41 2008
From: pasik at iki.fi (Pasi =?iso-8859-1?Q?K=E4rkk=E4inen?=)
Date: Wed, 1 Oct 2008 16:26:41 +0300
Subject: [Linux-cluster] Online resizing CLVM PVs (with RHEL 5.3)
Message-ID: <20081001132641.GO9714@edu.joroinen.fi>

Hello list!

I've been trying out RHEL 5.3 beta/test packages for kernel and
device-mapper-multipath that allows online resizing of SCSI and dm-mpath
devices.

I've been able to successfully online resize:

- iSCSI LUNs
- dm-mpath device that is on top of those iSCSI LUNs
- LVM PV on top of that dm-mpath device
- LVM volume from that PV/VG
- And ext3 filesystem on top of that LVM volume

Now I'm wondering about online resizing shared/clustered CLVM PV.

Should it work just like that..? Make sure all servers in the cluster see 
SCSI and dm-mpath devices resized, and after that just run pvresize? 

Thanks!

-- Pasi


From edoardo.causarano at laitspa.it  Wed Oct  1 13:49:05 2008
From: edoardo.causarano at laitspa.it (Edoardo Causarano)
Date: Wed, 1 Oct 2008 15:49:05 +0200
Subject: [Linux-cluster] fence_scsi & shutdown race
Message-ID: <1222868945.6506.11.camel@ecausarano-laptop>

Hi

I config'd a a 2node cluster with fence_scsi on the gfs device. Works
great but as soon as I (cleanly) reboot a node (say node01) the other
(node02) will fence it and the shutdown init scripts will hang when GFS
on the reebooting node (node01) tries to withdraw from the cluster. I
have to (manually) reset the machine (node01) and it will happily
rejoin. What can I do to avoid this situationm, any tunables?

ciao,
e


From s.wendy.cheng at gmail.com  Wed Oct  1 15:20:49 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Wed, 01 Oct 2008 11:20:49 -0400
Subject: [Linux-cluster] Distributed Replicated GFS shared storage
In-Reply-To: <48E29731.4010604@gmail.com>
References: <C5CC877B-7125-4F1F-A362-AD4877EDA3D4@mac.com>
	<48E29731.4010604@gmail.com>
Message-ID: <48E39551.2000201@gmail.com>

Jos? Miguel Parrella Romero wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Juliano Rodrigues escribi?, en fecha 30/09/08 10:58:
>   
>> Hello,
>>
>> In order to design an HA project I need a solution to replicate one GFS
>> shared storage to another "hot" (standby) GFS "mirror", in case of my
>> primary shared storage permanently fail.
>>
>> Inside RHEL Advanced Platform there is any supported way to accomplish
>> that?
>>     
>
> I believe the whole point of GFS is avoiding you to spend twice your
> storage capacity just for the sake of storage distribution. It already
> enables you to have a standby server which can go live through a
> resource manager whenever you need it.
>   

Look like the original subject (requirement) is to have redundant (HA) 
storage devices. GFS alone can't accomplish this since it only deals 
with server nodes - as soon as the shared storage unit is gone, the 
filesystem will be completely unusable.

Depending on the hardware, redundant storages do not necessarily consume 
a great deal of storage capacity. Though GFS itself does spread the 
blocks allocation across the whole partition (to avoid write contention 
between multiple clustered nodes), the underneath hardware may do things 
differently. That is, GFS block numbers (and its block layout) do not 
necessarily resemble the real disk block numbers (and the physical 
layout).  So, say if you have a 1TB GFS partition configured but it only 
gets half full, you may have extra 500GB space to spare if your SAN 
product allows this type of over-commit. If your storage vendor supports 
data de-duplication,  the storage consumption can go down even further. 

-- Wendy

> However, if you need to have two separate storage facilities which sync
> in one way, DRBD is probably the easiest way to do so. Heartbeat can
> manage DRBD resources at block- and filesystem-level easily, and other
> resource managers can probably do so (though I haven't used them)
>
> HTH,
> Jose
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkjilzEACgkQUWAsjQBcO4KhwQCeM0lxhXfCwxiAigfi+39pHGog
> alwAn3UilZcaPU009vaoxVhXFV6J5KqY
> =IVLO
> -----END PGP SIGNATURE-----
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>   


From angelo.compagnucci at gmail.com  Wed Oct  1 15:39:34 2008
From: angelo.compagnucci at gmail.com (Angelo Compagnucci)
Date: Wed, 1 Oct 2008 17:39:34 +0200
Subject: [Linux-cluster] CLVM clarification
Message-ID: <777f2ade0810010839y77afc32bn5038bcfb4e3c1a6d@mail.gmail.com>

Hi to all,This is my first post on this list. Thanks in advance for every
answer.

I've already read every guide in this matter, this is the list:

Cluster_Administration.pdf
Cluster_Logical_Volume_Manager.pdf
Global_Network_Block_Device.pdf
Cluster_Suite_Overview.pdf
Global_File_System.pdf
CLVM.pdf
RedHatClusterAdminOverview.pdf

The truth is that I've not clear a point about CLVM.

Let's me make an example:

In this example CLVM and the Cluster suite are fully running without
problems. Let's pose the same configuration of cluster.conf and lvm.conf and
the nodes of the cluster are joined and operatives.

NODE1:

pvcreate /dev/hda3

NODE2:

pvcreate /dev/hda2

Let's pose that CLVM spans LVM metadata across the cluster, if I stroke the
command:

pvscan

I should see /dev/sda2 and /dev/sda3

and then I can create a vg with

vgcreate /dev/sda2 /dev/sda3 ...

The question is: How LVM metadata sharing works? I have to use GNBD on the
row partion to share a device between nodes? I can create a GFS over a
spanned volume group? Are shareable only logical volumes?

Thanks for your answers!!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081001/99b84a94/attachment.htm>

From sakect at gmail.com  Wed Oct  1 15:45:53 2008
From: sakect at gmail.com (POWERBALL ONLINE)
Date: Wed, 1 Oct 2008 22:45:53 +0700
Subject: [Linux-cluster] adding volume to cluster
In-Reply-To: <14139e3a0809300933s5014e4fp2465554c0bd1255a@mail.gmail.com>
References: <14139e3a0809291459y6d4beb36sf69feb14c18f4b72@mail.gmail.com>
	<c99654a0809300846j5e803b9clccd562f3ffd12284@mail.gmail.com>
	<14139e3a0809300933s5014e4fp2465554c0bd1255a@mail.gmail.com>
Message-ID: <c99654a0810010845s19589b5ctc98c793b8529fc1c@mail.gmail.com>

Please give me the error when you create the LVM in /var/log/messages.
Are you running clvmd service?

2008/9/30 Terry Davis <terrybdavis at gmail.com>

> lvm2-2.02.32-4.el5
> lvm2-cluster-2.02.32-4.el5
>
> I don't have any patches available for LVM.
>
> 2008/9/30 POWERBALL ONLINE <sakect at gmail.com>
>
>  Hello,
>>
>> Are you already update patch?
>> Because it is teh LVM bug you can find it in redhat bugzila.
>>
>> Best Regards,
>>
>> Somsak
>>
>> 2008/9/30 Terry Davis <terrybdavis at gmail.com>
>>
>>>   Hello,
>>>
>>> I am having a heck of a time getting a volume to show up in my cluster.
>>> I have a feeling I am doing something wrong but this isn't the first one
>>> I've added so I'm not sure where I got lucky before.  Here is what I've done
>>> thus far in my 2 node RHEL5 cluster:
>>>
>>> 1) Created my volume in my SAN and gave both nodes access to it
>>> 2) on node A: created 4TB partition with parted and made a gpt label
>>> 3) on node A: pvcreate /dev/sdc1
>>> 4) on node B: vgcreate vg_data01e /dev/sdc1
>>> 5) on both nodes: vgchange -a y
>>> 6) on node A: lvcreate -n lv_data01e vg_data01e
>>>
>>> I get the error:
>>>   Error locking on node omadvnfs01b: Volume group for uuid not found:
>>> p9SfIjriPtXY33G1Yi3YdojvAAAzmAuwlOLqhVzX8mqL6goiVmUAgQZLGcDnX324
>>>   Error locking on node omadvnfs01a: Volume group for uuid not found:
>>> p9SfIjriPtXY33G1Yi3YdojvAAAzmAuwlOLqhVzX8mqL6goiVmUAgQZLGcDnX324
>>>   Aborting. Failed to activate new LV to wipe the start of it.
>>>
>>> I tried restarting clvmd for good measure.  Still no luck.  What am I
>>> doing wrong?
>>>
>>>
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081001/3340d129/attachment.htm>

From terrybdavis at gmail.com  Wed Oct  1 16:34:20 2008
From: terrybdavis at gmail.com (Terry Davis)
Date: Wed, 1 Oct 2008 11:34:20 -0500
Subject: [Linux-cluster] adding volume to cluster
In-Reply-To: <c99654a0810010845s19589b5ctc98c793b8529fc1c@mail.gmail.com>
References: <14139e3a0809291459y6d4beb36sf69feb14c18f4b72@mail.gmail.com>
	<c99654a0809300846j5e803b9clccd562f3ffd12284@mail.gmail.com>
	<14139e3a0809300933s5014e4fp2465554c0bd1255a@mail.gmail.com>
	<c99654a0810010845s19589b5ctc98c793b8529fc1c@mail.gmail.com>
Message-ID: <14139e3a0810010934j65e31733kff5e45772f13a95b@mail.gmail.com>

I started over by deleting and recreating the volume in our SAN:

[root at omadvnfs01a ~]# pvcreate /dev/sdh1
  Physical volume "/dev/sdh1" successfully created

[root at omadvnfs01a ~]# vgcreate -c y vg_data01e /dev/sdh1
  Volume group "vg_data01e" successfully created

[root at omadvnfs01a ~]# lvcreate -n lv_data01e -l100%VG vg_data01e
  Error locking on node omadvnfs01a: Volume group for uuid not found:
FbdGWogYIYwX1IfeZcSLxoGNoGgfcZMOzeHZMf1beXTguz9JpiJifBi0dKzwG7pI
  Error locking on node omadvnfs01b: Volume group for uuid not found:
FbdGWogYIYwX1IfeZcSLxoGNoGgfcZMOzeHZMf1beXTguz9JpiJifBi0dKzwG7pI
  Aborting. Failed to activate new LV to wipe the start of it.

Yes, I am running the clvmd service.  Frustrating.


2008/10/1 POWERBALL ONLINE <sakect at gmail.com>

> Please give me the error when you create the LVM in /var/log/messages.
> Are you running clvmd service?
>
> 2008/9/30 Terry Davis <terrybdavis at gmail.com>
>
>> lvm2-2.02.32-4.el5
>> lvm2-cluster-2.02.32-4.el5
>>
>> I don't have any patches available for LVM.
>>
>> 2008/9/30 POWERBALL ONLINE <sakect at gmail.com>
>>
>>  Hello,
>>>
>>> Are you already update patch?
>>> Because it is teh LVM bug you can find it in redhat bugzila.
>>>
>>> Best Regards,
>>>
>>> Somsak
>>>
>>> 2008/9/30 Terry Davis <terrybdavis at gmail.com>
>>>
>>>>   Hello,
>>>>
>>>> I am having a heck of a time getting a volume to show up in my cluster.
>>>> I have a feeling I am doing something wrong but this isn't the first one
>>>> I've added so I'm not sure where I got lucky before.  Here is what I've done
>>>> thus far in my 2 node RHEL5 cluster:
>>>>
>>>> 1) Created my volume in my SAN and gave both nodes access to it
>>>> 2) on node A: created 4TB partition with parted and made a gpt label
>>>> 3) on node A: pvcreate /dev/sdc1
>>>> 4) on node B: vgcreate vg_data01e /dev/sdc1
>>>> 5) on both nodes: vgchange -a y
>>>> 6) on node A: lvcreate -n lv_data01e vg_data01e
>>>>
>>>> I get the error:
>>>>   Error locking on node omadvnfs01b: Volume group for uuid not found:
>>>> p9SfIjriPtXY33G1Yi3YdojvAAAzmAuwlOLqhVzX8mqL6goiVmUAgQZLGcDnX324
>>>>   Error locking on node omadvnfs01a: Volume group for uuid not found:
>>>> p9SfIjriPtXY33G1Yi3YdojvAAAzmAuwlOLqhVzX8mqL6goiVmUAgQZLGcDnX324
>>>>   Aborting. Failed to activate new LV to wipe the start of it.
>>>>
>>>> I tried restarting clvmd for good measure.  Still no luck.  What am I
>>>> doing wrong?
>>>>
>>>>
>>>>
>>>> --
>>>> Linux-cluster mailing list
>>>> Linux-cluster at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>>
>>>
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081001/af374fdf/attachment.htm>

From agk at redhat.com  Wed Oct  1 16:42:47 2008
From: agk at redhat.com (Alasdair G Kergon)
Date: Wed, 1 Oct 2008 17:42:47 +0100
Subject: [Linux-cluster] adding volume to cluster
In-Reply-To: <14139e3a0810010934j65e31733kff5e45772f13a95b@mail.gmail.com>
References: <14139e3a0809291459y6d4beb36sf69feb14c18f4b72@mail.gmail.com>
	<c99654a0809300846j5e803b9clccd562f3ffd12284@mail.gmail.com>
	<14139e3a0809300933s5014e4fp2465554c0bd1255a@mail.gmail.com>
	<c99654a0810010845s19589b5ctc98c793b8529fc1c@mail.gmail.com>
	<14139e3a0810010934j65e31733kff5e45772f13a95b@mail.gmail.com>
Message-ID: <20081001164247.GB6173@agk.fab.redhat.com>

I hope that problem was fixed in newer packages.

Meanwhile try running 'clvmd -R' between some of the commands.

If all else fails, you may have to kill the clvmd daemons in the cluster
and restart them, or even add a 'vgscan' on each node before the restart.

Alasdair
-- 
agk at redhat.com


From terrybdavis at gmail.com  Wed Oct  1 17:06:15 2008
From: terrybdavis at gmail.com (Terry Davis)
Date: Wed, 1 Oct 2008 12:06:15 -0500
Subject: [Linux-cluster] adding volume to cluster
In-Reply-To: <20081001164247.GB6173@agk.fab.redhat.com>
References: <14139e3a0809291459y6d4beb36sf69feb14c18f4b72@mail.gmail.com>
	<c99654a0809300846j5e803b9clccd562f3ffd12284@mail.gmail.com>
	<14139e3a0809300933s5014e4fp2465554c0bd1255a@mail.gmail.com>
	<c99654a0810010845s19589b5ctc98c793b8529fc1c@mail.gmail.com>
	<14139e3a0810010934j65e31733kff5e45772f13a95b@mail.gmail.com>
	<20081001164247.GB6173@agk.fab.redhat.com>
Message-ID: <14139e3a0810011006o6084a3ddqd7786d150b5b0658@mail.gmail.com>

On Wed, Oct 1, 2008 at 11:42 AM, Alasdair G Kergon <agk at redhat.com> wrote:

> I hope that problem was fixed in newer packages.
>
> Meanwhile try running 'clvmd -R' between some of the commands.
>
> If all else fails, you may have to kill the clvmd daemons in the cluster
> and restart them, or even add a 'vgscan' on each node before the restart.
>
> Alasdair
> --
> agk at redhat.com


Just a sanity check.  I killed all the clvmd daemons and started clvmd back
up.  I created the PV on node A:
[root at omadvnfs01a ~]# pvcreate /dev/sdh1
  Physical volume "/dev/sdh1" successfully created

Node B knows nothing of /dev/sdh1 but it does exist:
[root at omadvnfs01b ~]# ls /dev/sdh*
/dev/sdh
[root at omadvnfs01b ~]# parted /dev/sdh
GNU Parted 1.8.1
Using /dev/sdh
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p

Model: EQLOGIC 100E-00 (scsi)
Disk /dev/sdh: 4398GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      17.4kB  4398GB  4398GB               primary


Maybe this is why the pvcreate and vgcreate aren't tracking with Node B.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081001/32dd10e7/attachment.htm>

From Edoardo.Causarano at laitspa.it  Wed Oct  1 17:18:18 2008
From: Edoardo.Causarano at laitspa.it (Edoardo Causarano)
Date: Wed, 1 Oct 2008 19:18:18 +0200
Subject: [Linux-cluster] "gfs" init script configuration
Message-ID: <B9F5EE1EE80E12479854A81337FB102C030A2149@RLBEMAIL02.interno.regione.lazio.it>

Hi all,

 
further investigation shows that my gfs stalling on reboot is due to
incorrect specification of the filesystem in /ec/fstab. I mount is as
_netdev so /etc/init.d/gfs won't pick it up.

 
What is the correct syntax to make sure /etc/init.d/gfs will pick up the
fs at the right time during shutdown (before tearing down
scsi_reservation and clustering)?

 
IE... can I peek at your fstabs? ;)

 
E

(excuse me for the outlook mail)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081001/d66599ac/attachment.htm>

From jakub.suchy at enlogit.cz  Wed Oct  1 19:04:48 2008
From: jakub.suchy at enlogit.cz (Jakub Suchy)
Date: Wed, 1 Oct 2008 21:04:48 +0200
Subject: [Linux-cluster] Cisco working configuration
Message-ID: <20081001190448.GB10123@aaron>

Hello all,
because we are still struggling with RHEL5.2 cluster together with Cisco
network infrastructure, I'd like to ask if there is somebody which has
this configuration working:

RHEL5.2 cluster + Cisco 6500/4500/3500 infrastructure, heartbeat going
through these switches, nodes are not linked through crossover cable.

If so, can you please contact me for further details? I would very much
appreciate the help.

Thank you,
Jakub Suchy


From lpleiman at redhat.com  Wed Oct  1 19:22:01 2008
From: lpleiman at redhat.com (Leo Pleiman)
Date: Wed, 1 Oct 2008 15:22:01 -0400 (EDT)
Subject: [Linux-cluster] Cisco working configuration
In-Reply-To: <2000762384.122241222888753010.JavaMail.root@zmail02.collab.prod.int.phx2.redhat.com>
Message-ID: <620016859.122671222888921460.JavaMail.root@zmail02.collab.prod.int.phx2.redhat.com>


The kbase article can be found at http://kbase.redhat.com/faq/FAQ_51_11755.shtm
It has a link to Cisco's web site enumerating 5 possible solutions. http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a008059a9df.shtml

Leo J Pleiman 
Senior Consultant 
GPS, Red Hat Inc. 
410.688.3873 

----- Original Message -----
From: "Jakub Suchy" <jakub.suchy at enlogit.cz>
To: linux-cluster at redhat.com
Sent: Wednesday, October 1, 2008 3:04:48 PM GMT -05:00 US/Canada Eastern
Subject: [Linux-cluster] Cisco working configuration

Hello all,
because we are still struggling with RHEL5.2 cluster together with Cisco
network infrastructure, I'd like to ask if there is somebody which has
this configuration working:

RHEL5.2 cluster + Cisco 6500/4500/3500 infrastructure, heartbeat going
through these switches, nodes are not linked through crossover cable.

If so, can you please contact me for further details? I would very much
appreciate the help.

Thank you,
Jakub Suchy

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


From jakub.suchy at enlogit.cz  Wed Oct  1 21:11:09 2008
From: jakub.suchy at enlogit.cz (Jakub Suchy)
Date: Wed, 1 Oct 2008 23:11:09 +0200
Subject: [Linux-cluster] Cisco working configuration
In-Reply-To: <620016859.122671222888921460.JavaMail.root@zmail02.collab.prod.int.phx2.redhat.com>
References: <2000762384.122241222888753010.JavaMail.root@zmail02.collab.prod.int.phx2.redhat.com>
	<620016859.122671222888921460.JavaMail.root@zmail02.collab.prod.int.phx2.redhat.com>
Message-ID: <20081001211109.GA11341@aaron>

Leo Pleiman wrote:
> 
> The kbase article can be found at http://kbase.redhat.com/faq/FAQ_51_11755.shtm
> It has a link to Cisco's web site enumerating 5 possible solutions. http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a008059a9df.shtml

Hello,
I am aware of these documents and I have tried all these solutions.

Jakub


From accdias+cluster at gmail.com  Wed Oct  1 22:30:26 2008
From: accdias+cluster at gmail.com (Antonio Dias)
Date: Wed, 1 Oct 2008 19:30:26 -0300
Subject: [Linux-cluster] Cisco working configuration
In-Reply-To: <20081001190448.GB10123@aaron>
References: <20081001190448.GB10123@aaron>
Message-ID: <204313690810011530vc5548d9g39ec11661560366c@mail.gmail.com>

Hi, Jakub,

On Wed, Oct 1, 2008 at 16:04, Jakub Suchy <jakub.suchy at enlogit.cz> wrote:
> RHEL5.2 cluster + Cisco 6500/4500/3500 infrastructure, heartbeat going
> through these switches, nodes are not linked through crossover cable.

I believe you just need to force all nodes to "speak" IGMPv2. Do this:

for iface in /proc/sys/net/ipv4/conf/*; do
  echo '2' > $iface/force_igmp_version
done

On each node and probably it will work after. Cisco switches generally
speak IGMPv2 and Linux defaults to IGMPv3. RHCS depends on multicast
to work properly and multicast depends on IGMP.

You will need to set this in /etc/sysctl.conf to make it persistent
between reboots. Do this:

cat << EOF >> /etc/sysctl.conf
net.ipv4.conf.all.force_igmp_version = 2
net.ipv4.conf.default.force_igmp_version = 2
EOF

Let us know if this resolve your problem.

-- 
Antonio Dias


From terrybdavis at gmail.com  Wed Oct  1 23:05:24 2008
From: terrybdavis at gmail.com (Terry Davis)
Date: Wed, 1 Oct 2008 18:05:24 -0500
Subject: [Linux-cluster] adding volume to cluster
In-Reply-To: <14139e3a0810011006o6084a3ddqd7786d150b5b0658@mail.gmail.com>
References: <14139e3a0809291459y6d4beb36sf69feb14c18f4b72@mail.gmail.com>
	<c99654a0809300846j5e803b9clccd562f3ffd12284@mail.gmail.com>
	<14139e3a0809300933s5014e4fp2465554c0bd1255a@mail.gmail.com>
	<c99654a0810010845s19589b5ctc98c793b8529fc1c@mail.gmail.com>
	<14139e3a0810010934j65e31733kff5e45772f13a95b@mail.gmail.com>
	<20081001164247.GB6173@agk.fab.redhat.com>
	<14139e3a0810011006o6084a3ddqd7786d150b5b0658@mail.gmail.com>
Message-ID: <14139e3a0810011605o3d9ca45bq918ae7b94e6c1444@mail.gmail.com>

Awesome.  I rebooted and applied all available updates and now it works.
Only thing worth noting in the updates was a kernel update to
2.6.18-92.1.13.el5.  I think a reboot did it (for some reason).

On Wed, Oct 1, 2008 at 12:06 PM, Terry Davis <terrybdavis at gmail.com> wrote:

> On Wed, Oct 1, 2008 at 11:42 AM, Alasdair G Kergon <agk at redhat.com> wrote:
>
>> I hope that problem was fixed in newer packages.
>>
>> Meanwhile try running 'clvmd -R' between some of the commands.
>>
>> If all else fails, you may have to kill the clvmd daemons in the cluster
>> and restart them, or even add a 'vgscan' on each node before the restart.
>>
>> Alasdair
>> --
>> agk at redhat.com
>
>
>
> Just a sanity check.  I killed all the clvmd daemons and started clvmd back
> up.  I created the PV on node A:
> [root at omadvnfs01a ~]# pvcreate /dev/sdh1
>   Physical volume "/dev/sdh1" successfully created
>
> Node B knows nothing of /dev/sdh1 but it does exist:
> [root at omadvnfs01b ~]# ls /dev/sdh*
> /dev/sdh
> [root at omadvnfs01b ~]# parted /dev/sdh
> GNU Parted 1.8.1
> Using /dev/sdh
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p
>
> Model: EQLOGIC 100E-00 (scsi)
> Disk /dev/sdh: 4398GB
> Sector size (logical/physical): 512B/512B
> Partition Table: gpt
>
> Number  Start   End     Size    File system  Name     Flags
>  1      17.4kB  4398GB  4398GB               primary
>
>
> Maybe this is why the pvcreate and vgcreate aren't tracking with Node B.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081001/45de11f2/attachment.htm>

From denisb+gmane at gmail.com  Thu Oct  2 08:16:08 2008
From: denisb+gmane at gmail.com (denis)
Date: Thu, 02 Oct 2008 10:16:08 +0200
Subject: [Linux-cluster] qdisk questions
Message-ID: <gc2008$uk7$1@ger.gmane.org>

Hi,

I have recently had a couple of situations with my cluster where both
nodes were restarted simultaneously. The reasons for this are a bit
beyond me so I was wondering if anyone could clarify / point me to
relevant documentation.

Following excerpts from both nodes logs :

Oct  2 08:32:22 node1 qdiskd[3758]: <info> Heuristic: 'ping 10.X.X.X -c1
-t2' DOWN (3/3)
Oct  2 08:32:39 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1
-t2' DOWN (6/6)
Oct  2 08:32:55 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1
-t2' DOWN (6/6)
Oct  2 08:32:58 node1 qdiskd[3758]: <info> Heuristic: 'ping X.X.X.X -c1
-t1' DOWN (6/6)
Oct  2 08:33:01 node1 qdiskd[3758]: <notice> Score insufficient for
master operation (0/4; required=1); downgrading
Oct  2 08:33:01 node1 kernel: md: stopping all md devices.

Oct  2 08:32:23 node2 qdiskd[3599]: <info> Heuristic: 'ping 10.X.X.X -c1
-t2' DOWN (3/3)
Oct  2 08:32:49 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1
-t2' DOWN (6/6)
Oct  2 08:32:56 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1
-t1' DOWN (6/6)
Oct  2 08:32:56 node2 qdiskd[3599]: <info> Heuristic: 'ping X.X.X.X -c1
-t2' DOWN (6/6)
Oct  2 08:33:03 node2 qdiskd[3599]: <notice> Score insufficient for
master operation (0/4; required=1); downgrading
Oct  2 08:33:03 node2 kernel: md: stopping all md devices.


Does qdisk reboot the node due to these tests failing?

The upstream routers these nodes are connected to were unavailable for
at most 2 minutes, and all four pingtests require connectivity through
the router (probably need to change that!?).

What kind of tests can I use for qdiskd that will prevent router-outages
 from killing my cluster completely?


Regards
--
Denis


From xavier.montagutelli at unilim.fr  Thu Oct  2 09:43:58 2008
From: xavier.montagutelli at unilim.fr (Xavier Montagutelli)
Date: Thu, 2 Oct 2008 11:43:58 +0200
Subject: [Linux-cluster] CLVM clarification
In-Reply-To: <777f2ade0810010839y77afc32bn5038bcfb4e3c1a6d@mail.gmail.com>
References: <777f2ade0810010839y77afc32bn5038bcfb4e3c1a6d@mail.gmail.com>
Message-ID: <200810021143.58402.xavier.montagutelli@unilim.fr>

On Wednesday 01 October 2008 17:39, Angelo Compagnucci wrote:
> Hi to all,This is my first post on this list. Thanks in advance for every
> answer.
>
> I've already read every guide in this matter, this is the list:
>
> Cluster_Administration.pdf
> Cluster_Logical_Volume_Manager.pdf
> Global_Network_Block_Device.pdf
> Cluster_Suite_Overview.pdf
> Global_File_System.pdf
> CLVM.pdf
> RedHatClusterAdminOverview.pdf
>
> The truth is that I've not clear a point about CLVM.
>
> Let's me make an example:
>
> In this example CLVM and the Cluster suite are fully running without
> problems. Let's pose the same configuration of cluster.conf and lvm.conf
> and the nodes of the cluster are joined and operatives.

Does your example include a shared storage (GNBD, iSCSI, SAN, ...) ?

>
> NODE1:
>
> pvcreate /dev/hda3
>
> NODE2:
>
> pvcreate /dev/hda2
>
> Let's pose that CLVM spans LVM metadata across the cluster, if I stroke the
> command:
>
> pvscan
>
> I should see /dev/sda2 and /dev/sda3
>
> and then I can create a vg with
>
> vgcreate /dev/sda2 /dev/sda3 ...
>
> The question is: How LVM metadata sharing works? I have to use GNBD on the
> row partion to share a device between nodes? I can create a GFS over a
> spanned volume group? Are shareable only logical volumes?

I have the feeling that something is not clear here. I am not an expert, but :

GNBD is just a mean to export a block device on the IP network. A GNBD device 
is accessible to multiple nodes at the same time, and thus you can include 
that block device in a CLVM Volume Group. Instead of GNBD, you can also use 
any other shared storage (iSCSI, FC, ...). Be careful, from what I have 
understood, some SAN storage are not sharable between many hosts (NBD, AoE 
for example) !

After that, you have the choice : 

 - to make one LV with a shared filesystem (GFS). You can then mount the same 
filesystem on many nodes at the same time.

 - to make many LV with an ext3 / xfs / ... filesystem. But you then have to 
make sure that one LV is mounted on only one node at a given time.

But the type of filesystem is independant, this is a higher component.

In this picture, CLVM is only a low-level component, avoiding the concurrent 
access of many nodes on the LVM metadata written on the shared storage.

The data are not "spanned" across the local storage of many nodes (well, I 
suppose you *could* do that, but you would need other tools / layers ?)

Other point : if I remember correctly, the Red Hat doc says it's not 
recommended to use GFS on a node that exports a GNBD device. So if you use 
GNBD as a shared storage, I suppose it's better to specialize one or more 
nodes as GNBD "servers".


HTH

>
> Thanks for your answers!!

-- 
Xavier Montagutelli                      Tel : +33 (0)5 55 45 77 20
Service Commun Informatique              Fax : +33 (0)5 55 45 75 95
Universite de Limoges
123, avenue Albert Thomas
87060 Limoges cedex


From angelo.compagnucci at gmail.com  Thu Oct  2 10:28:29 2008
From: angelo.compagnucci at gmail.com (Angelo Compagnucci)
Date: Thu, 2 Oct 2008 12:28:29 +0200
Subject: [Linux-cluster] CLVM clarification
In-Reply-To: <200810021143.58402.xavier.montagutelli@unilim.fr>
References: <777f2ade0810010839y77afc32bn5038bcfb4e3c1a6d@mail.gmail.com>
	<200810021143.58402.xavier.montagutelli@unilim.fr>
Message-ID: <777f2ade0810020328h75710a54u36c42e4e48a01429@mail.gmail.com>

Ok, this could be clear, but in the Cluster_Logical_Volume_Manager.pdf I've
read (bottom of page 3):
"The clmvd daemon is the key clustering extension to LVM. The clvmd daemon
runs in each cluster computer and distributes LVM metadata updates in a
cluster, presenting each cluster computer with the same view of the logical
volumes"

This is a picture of wath I have in mind:

-----------------------------------------
|      GFS filesystem            |
-----------------------------------------
|            LV                         |
-----------------------------------------
|            VG                        |
-----------------------------------------
|  PV1     |  PV2     |   PV3   |
-----------------------------------------
| GNBD1 | GNBD2 | GNBD3 |
-----------------------------------------
| hda1     |  hda1    |   hda1   |
| Node1   | Node2  |   Node3 |
-----------------------------------------

In this case the clvm features are not useful because there is only one
machine (that could not be a node of a cluster) that have the lvm over GNBD
exported devices. So the nodes doesn't know nothing about the other nodes.

Let's pose this situation:

-----------------------------------------------
|            GFS                           |
-----------------------------------------------
|                LV                          |
-----------------------------------------------
|         VG1         |        VG2      |
-----------------------------------------------
|         PV1         |        PV2       |
|      Node1         |      Node2      |
-----------------------------------------------
|        CLVM coordinates           |
-----------------------------------------------

In this situatuation makes sense to have a clustered lvm because if I have
to make some maintenance over VGs, CLVM can lock and unlock the interested
device.

Is this the correct behaviour??

In the contrary, which is the CLVM role in a cluster?


2008/10/2 Xavier Montagutelli <xavier.montagutelli at unilim.fr>

> On Wednesday 01 October 2008 17:39, Angelo Compagnucci wrote:
> > Hi to all,This is my first post on this list. Thanks in advance for every
> > answer.
> >
> > I've already read every guide in this matter, this is the list:
> >
> > Cluster_Administration.pdf
> > Cluster_Logical_Volume_Manager.pdf
> > Global_Network_Block_Device.pdf
> > Cluster_Suite_Overview.pdf
> > Global_File_System.pdf
> > CLVM.pdf
> > RedHatClusterAdminOverview.pdf
> >
> > The truth is that I've not clear a point about CLVM.
> >
> > Let's me make an example:
> >
> > In this example CLVM and the Cluster suite are fully running without
> > problems. Let's pose the same configuration of cluster.conf and lvm.conf
> > and the nodes of the cluster are joined and operatives.
>
> Does your example include a shared storage (GNBD, iSCSI, SAN, ...) ?
>
> >
> > NODE1:
> >
> > pvcreate /dev/hda3
> >
> > NODE2:
> >
> > pvcreate /dev/hda2
> >
> > Let's pose that CLVM spans LVM metadata across the cluster, if I stroke
> the
> > command:
> >
> > pvscan
> >
> > I should see /dev/sda2 and /dev/sda3
> >
> > and then I can create a vg with
> >
> > vgcreate /dev/sda2 /dev/sda3 ...
> >
> > The question is: How LVM metadata sharing works? I have to use GNBD on
> the
> > row partion to share a device between nodes? I can create a GFS over a
> > spanned volume group? Are shareable only logical volumes?
>
> I have the feeling that something is not clear here. I am not an expert,
> but :
>
> GNBD is just a mean to export a block device on the IP network. A GNBD
> device
> is accessible to multiple nodes at the same time, and thus you can include
> that block device in a CLVM Volume Group. Instead of GNBD, you can also use
> any other shared storage (iSCSI, FC, ...). Be careful, from what I have
> understood, some SAN storage are not sharable between many hosts (NBD, AoE
> for example) !
>
> After that, you have the choice :
>
>  - to make one LV with a shared filesystem (GFS). You can then mount the
> same
> filesystem on many nodes at the same time.
>
>  - to make many LV with an ext3 / xfs / ... filesystem. But you then have
> to
> make sure that one LV is mounted on only one node at a given time.
>
> But the type of filesystem is independant, this is a higher component.
>
> In this picture, CLVM is only a low-level component, avoiding the
> concurrent
> access of many nodes on the LVM metadata written on the shared storage.
>
> The data are not "spanned" across the local storage of many nodes (well, I
> suppose you *could* do that, but you would need other tools / layers ?)
>
> Other point : if I remember correctly, the Red Hat doc says it's not
> recommended to use GFS on a node that exports a GNBD device. So if you use
> GNBD as a shared storage, I suppose it's better to specialize one or more
> nodes as GNBD "servers".
>
>
> HTH
>
> >
> > Thanks for your answers!!
>
> --
> Xavier Montagutelli                      Tel : +33 (0)5 55 45 77 20
> Service Commun Informatique              Fax : +33 (0)5 55 45 75 95
> Universite de Limoges
> 123, avenue Albert Thomas
> 87060 Limoges cedex
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081002/e7ffebf6/attachment.htm>

From xavier.montagutelli at unilim.fr  Thu Oct  2 12:56:29 2008
From: xavier.montagutelli at unilim.fr (Xavier Montagutelli)
Date: Thu, 2 Oct 2008 14:56:29 +0200
Subject: [Linux-cluster] CLVM clarification
In-Reply-To: <777f2ade0810020328h75710a54u36c42e4e48a01429@mail.gmail.com>
References: <777f2ade0810010839y77afc32bn5038bcfb4e3c1a6d@mail.gmail.com>
	<200810021143.58402.xavier.montagutelli@unilim.fr>
	<777f2ade0810020328h75710a54u36c42e4e48a01429@mail.gmail.com>
Message-ID: <200810021456.29214.xavier.montagutelli@unilim.fr>

On Thursday 02 October 2008 12:28, Angelo Compagnucci wrote:
> Ok, this could be clear, but in the Cluster_Logical_Volume_Manager.pdf I've
> read (bottom of page 3):
> "The clmvd daemon is the key clustering extension to LVM. The clvmd daemon
> runs in each cluster computer and distributes LVM metadata updates in a
> cluster, presenting each cluster computer with the same view of the logical
> volumes"
>
> This is a picture of wath I have in mind:

This picture doesn't show the difference between a GNBD server (which doesn't 
know anything about the use of the exported block device : it doesn't know 
the VG for example) and the GNBD clients (which actually use the block device 
as PV). May I add some layers ? Not exactly what I have in mind but I am not 
a ascii art expert :

 ---------------------------
|      GFS filesystem       |
 ---------------------------
|            LV             |
 ---------------------------
|            VG             |
 ---------------------------
|  PV1    |  PV2   |   PV3  |
.---------.--------.--------.
|             CLVM          |
.---------.--------.--------.
|  cluster basis (dlm,...)  |
.---------.--------.--------.
| Node4   | Node5  | Node6  | 
.---------.--------.--------.
(Node4,5,6 have access to the three GNBD devices)
        \    | |    /
         \___|_|___/
         /   | |   \
        /    | |    \
       /     | |     \
.---------.--------.---------.
| GNBD1   | GNBD2  | GNBD3   |
.---------.--------.---------.
| hda1    |  hda1  |   hda1  |
| Node1   | Node2  |   Node3 |
.---------.--------.---------.

>
> In this case the clvm features are not useful because there is only one
> machine (that could not be a node of a cluster) that have the lvm over GNBD
> exported devices. So the nodes doesn't know nothing about the other nodes.

If your GNBD* devices are accessed by only one other node. But if the GNBD are 
served to multiple nodes (nodes4,5,6), then CLVM is useful.

>
> Let's pose this situation:
>
> -----------------------------------------------
> |            GFS                           |
> -----------------------------------------------
> |                LV                          |
> -----------------------------------------------
> |         VG1         |        VG2      |
> -----------------------------------------------
> |         PV1         |        PV2       |
> |      Node1         |      Node2      |
> -----------------------------------------------
> |        CLVM coordinates           |
> -----------------------------------------------
>
> In this situatuation makes sense to have a clustered lvm because if I have
> to make some maintenance over VGs, CLVM can lock and unlock the interested
> device.
>
> Is this the correct behaviour??

Perhaps I miss your point, but it doesn't make sense if the block devices are 
local to each node. How could Node2 have access to the block device on Node1 
(showed as PV1) ? 

CLVM is useful only when you have a shared storage. 

> In the contrary, which is the CLVM role in a cluster?

>From what I know, CLVM protects the metadata parts of LVM on the shared 
storage. And when you make one operation the shared storage on one node (for 
example, create a new LV), all the nodes are aware of the change.


>
>
> 2008/10/2 Xavier Montagutelli <xavier.montagutelli at unilim.fr>
>
> > On Wednesday 01 October 2008 17:39, Angelo Compagnucci wrote:
> > > Hi to all,This is my first post on this list. Thanks in advance for
> > > every answer.
> > >
> > > I've already read every guide in this matter, this is the list:
> > >
> > > Cluster_Administration.pdf
> > > Cluster_Logical_Volume_Manager.pdf
> > > Global_Network_Block_Device.pdf
> > > Cluster_Suite_Overview.pdf
> > > Global_File_System.pdf
> > > CLVM.pdf
> > > RedHatClusterAdminOverview.pdf
> > >
> > > The truth is that I've not clear a point about CLVM.
> > >
> > > Let's me make an example:
> > >
> > > In this example CLVM and the Cluster suite are fully running without
> > > problems. Let's pose the same configuration of cluster.conf and
> > > lvm.conf and the nodes of the cluster are joined and operatives.
> >
> > Does your example include a shared storage (GNBD, iSCSI, SAN, ...) ?
> >
> > > NODE1:
> > >
> > > pvcreate /dev/hda3
> > >
> > > NODE2:
> > >
> > > pvcreate /dev/hda2
> > >
> > > Let's pose that CLVM spans LVM metadata across the cluster, if I stroke
> >
> > the
> >
> > > command:
> > >
> > > pvscan
> > >
> > > I should see /dev/sda2 and /dev/sda3
> > >
> > > and then I can create a vg with
> > >
> > > vgcreate /dev/sda2 /dev/sda3 ...
> > >
> > > The question is: How LVM metadata sharing works? I have to use GNBD on
> >
> > the
> >
> > > row partion to share a device between nodes? I can create a GFS over a
> > > spanned volume group? Are shareable only logical volumes?
> >
> > I have the feeling that something is not clear here. I am not an expert,
> > but :
> >
> > GNBD is just a mean to export a block device on the IP network. A GNBD
> > device
> > is accessible to multiple nodes at the same time, and thus you can
> > include that block device in a CLVM Volume Group. Instead of GNBD, you
> > can also use any other shared storage (iSCSI, FC, ...). Be careful, from
> > what I have understood, some SAN storage are not sharable between many
> > hosts (NBD, AoE for example) !
> >
> > After that, you have the choice :
> >
> >  - to make one LV with a shared filesystem (GFS). You can then mount the
> > same
> > filesystem on many nodes at the same time.
> >
> >  - to make many LV with an ext3 / xfs / ... filesystem. But you then have
> > to
> > make sure that one LV is mounted on only one node at a given time.
> >
> > But the type of filesystem is independant, this is a higher component.
> >
> > In this picture, CLVM is only a low-level component, avoiding the
> > concurrent
> > access of many nodes on the LVM metadata written on the shared storage.
> >
> > The data are not "spanned" across the local storage of many nodes (well,
> > I suppose you *could* do that, but you would need other tools / layers ?)
> >
> > Other point : if I remember correctly, the Red Hat doc says it's not
> > recommended to use GFS on a node that exports a GNBD device. So if you
> > use GNBD as a shared storage, I suppose it's better to specialize one or
> > more nodes as GNBD "servers".
> >
> >
> > HTH
> >
> > > Thanks for your answers!!
> >
> > --
> > Xavier Montagutelli                      Tel : +33 (0)5 55 45 77 20
> > Service Commun Informatique              Fax : +33 (0)5 55 45 75 95
> > Universite de Limoges
> > 123, avenue Albert Thomas
> > 87060 Limoges cedex
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster

-- 
Xavier Montagutelli                      Tel : +33 (0)5 55 45 77 20
Service Commun Informatique              Fax : +33 (0)5 55 45 75 95
Universite de Limoges
123, avenue Albert Thomas
87060 Limoges cedex


From angelo.compagnucci at gmail.com  Thu Oct  2 15:15:41 2008
From: angelo.compagnucci at gmail.com (Angelo Compagnucci)
Date: Thu, 2 Oct 2008 17:15:41 +0200
Subject: [Linux-cluster] CLVM clarification
In-Reply-To: <200810021456.29214.xavier.montagutelli@unilim.fr>
References: <777f2ade0810010839y77afc32bn5038bcfb4e3c1a6d@mail.gmail.com>
	<200810021143.58402.xavier.montagutelli@unilim.fr>
	<777f2ade0810020328h75710a54u36c42e4e48a01429@mail.gmail.com>
	<200810021456.29214.xavier.montagutelli@unilim.fr>
Message-ID: <777f2ade0810020815p26fb6a73u6dcb80a7e7ab81b5@mail.gmail.com>

Sorry, but I have not clear the clvm role.
CLVM shares VG metadata over a cluster and makes possible a wide cluster
administration (RedHat documentation says).

In this way a CLVM Cluster must have a CMAN Cluster up and running.

So, if I have already a shared storage, the only thing I can do is to make a
GFS filesystem and export this one to clients machine. In this way the
shared storage could be accessed by multiples machines.

In this scenario clvm is not useful because the shared lock on filesystem is
guaranted by GFS.

Let's pose I have different machines that I want to join in a cluster. Each
machine has a storage that I want share with the other machines to create a
large storage.

With CLVM, stands to RedHat guide, I can create a cluster that "presenting
each cluster computer with the same view of the logical volumes".[1]

So I have:

node 1:
  VG1  (local)
  VG2  (node2)
  VG3  (node3)

node 2:
  VG1  (node1)
  VG2  (local)
  VG3  (node3)

node 3:
  VG1  (node1)
  VG2  (node2)
  VG3  (local)

This should be what the RedHat CLVM guide stands for "the same view of the
logical volumes"

>From this point, Node1 is the shared storage. In this example it is visible
from alle the cluster's nodes.

So if I stroke an "lvcreate", I have to see the newly create LV on the other
nodes of the cluster.
It is true?

If this is true, gndb is not necessary and the layout becomes really simple.

Thanks for your time!

[1]
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.2/html/Cluster_Logical_Volume_Manager/LVM_Cluster_Overview.html

2008/10/2 Xavier Montagutelli <xavier.montagutelli at unilim.fr>

> On Thursday 02 October 2008 12:28, Angelo Compagnucci wrote:
> > Ok, this could be clear, but in the Cluster_Logical_Volume_Manager.pdf
> I've
> > read (bottom of page 3):
> > "The clmvd daemon is the key clustering extension to LVM. The clvmd
> daemon
> > runs in each cluster computer and distributes LVM metadata updates in a
> > cluster, presenting each cluster computer with the same view of the
> logical
> > volumes"
> >
> > This is a picture of wath I have in mind:
>
> This picture doesn't show the difference between a GNBD server (which
> doesn't
> know anything about the use of the exported block device : it doesn't know
> the VG for example) and the GNBD clients (which actually use the block
> device
> as PV). May I add some layers ? Not exactly what I have in mind but I am
> not
> a ascii art expert :
>
>  ---------------------------
> |      GFS filesystem       |
>  ---------------------------
> |            LV             |
>  ---------------------------
> |            VG             |
>  ---------------------------
> |  PV1    |  PV2   |   PV3  |
> .---------.--------.--------.
> |             CLVM          |
> .---------.--------.--------.
> |  cluster basis (dlm,...)  |
> .---------.--------.--------.
> | Node4   | Node5  | Node6  |
> .---------.--------.--------.
> (Node4,5,6 have access to the three GNBD devices)
>        \    | |    /
>         \___|_|___/
>         /   | |   \
>        /    | |    \
>       /     | |     \
> .---------.--------.---------.
> | GNBD1   | GNBD2  | GNBD3   |
> .---------.--------.---------.
> | hda1    |  hda1  |   hda1  |
> | Node1   | Node2  |   Node3 |
> .---------.--------.---------.
>
> >
> > In this case the clvm features are not useful because there is only one
> > machine (that could not be a node of a cluster) that have the lvm over
> GNBD
> > exported devices. So the nodes doesn't know nothing about the other
> nodes.
>
> If your GNBD* devices are accessed by only one other node. But if the GNBD
> are
> served to multiple nodes (nodes4,5,6), then CLVM is useful.
>
> >
> > Let's pose this situation:
> >
> > -----------------------------------------------
> > |            GFS                           |
> > -----------------------------------------------
> > |                LV                          |
> > -----------------------------------------------
> > |         VG1         |        VG2      |
> > -----------------------------------------------
> > |         PV1         |        PV2       |
> > |      Node1         |      Node2      |
> > -----------------------------------------------
> > |        CLVM coordinates           |
> > -----------------------------------------------
> >
> > In this situatuation makes sense to have a clustered lvm because if I
> have
> > to make some maintenance over VGs, CLVM can lock and unlock the
> interested
> > device.
> >
> > Is this the correct behaviour??
>
> Perhaps I miss your point, but it doesn't make sense if the block devices
> are
> local to each node. How could Node2 have access to the block device on
> Node1
> (showed as PV1) ?
>
> CLVM is useful only when you have a shared storage.
>
> > In the contrary, which is the CLVM role in a cluster?
>
> >From what I know, CLVM protects the metadata parts of LVM on the shared
> storage. And when you make one operation the shared storage on one node
> (for
> example, create a new LV), all the nodes are aware of the change.
>
>
> >
> >
> > 2008/10/2 Xavier Montagutelli <xavier.montagutelli at unilim.fr>
> >
> > > On Wednesday 01 October 2008 17:39, Angelo Compagnucci wrote:
> > > > Hi to all,This is my first post on this list. Thanks in advance for
> > > > every answer.
> > > >
> > > > I've already read every guide in this matter, this is the list:
> > > >
> > > > Cluster_Administration.pdf
> > > > Cluster_Logical_Volume_Manager.pdf
> > > > Global_Network_Block_Device.pdf
> > > > Cluster_Suite_Overview.pdf
> > > > Global_File_System.pdf
> > > > CLVM.pdf
> > > > RedHatClusterAdminOverview.pdf
> > > >
> > > > The truth is that I've not clear a point about CLVM.
> > > >
> > > > Let's me make an example:
> > > >
> > > > In this example CLVM and the Cluster suite are fully running without
> > > > problems. Let's pose the same configuration of cluster.conf and
> > > > lvm.conf and the nodes of the cluster are joined and operatives.
> > >
> > > Does your example include a shared storage (GNBD, iSCSI, SAN, ...) ?
> > >
> > > > NODE1:
> > > >
> > > > pvcreate /dev/hda3
> > > >
> > > > NODE2:
> > > >
> > > > pvcreate /dev/hda2
> > > >
> > > > Let's pose that CLVM spans LVM metadata across the cluster, if I
> stroke
> > >
> > > the
> > >
> > > > command:
> > > >
> > > > pvscan
> > > >
> > > > I should see /dev/sda2 and /dev/sda3
> > > >
> > > > and then I can create a vg with
> > > >
> > > > vgcreate /dev/sda2 /dev/sda3 ...
> > > >
> > > > The question is: How LVM metadata sharing works? I have to use GNBD
> on
> > >
> > > the
> > >
> > > > row partion to share a device between nodes? I can create a GFS over
> a
> > > > spanned volume group? Are shareable only logical volumes?
> > >
> > > I have the feeling that something is not clear here. I am not an
> expert,
> > > but :
> > >
> > > GNBD is just a mean to export a block device on the IP network. A GNBD
> > > device
> > > is accessible to multiple nodes at the same time, and thus you can
> > > include that block device in a CLVM Volume Group. Instead of GNBD, you
> > > can also use any other shared storage (iSCSI, FC, ...). Be careful,
> from
> > > what I have understood, some SAN storage are not sharable between many
> > > hosts (NBD, AoE for example) !
> > >
> > > After that, you have the choice :
> > >
> > >  - to make one LV with a shared filesystem (GFS). You can then mount
> the
> > > same
> > > filesystem on many nodes at the same time.
> > >
> > >  - to make many LV with an ext3 / xfs / ... filesystem. But you then
> have
> > > to
> > > make sure that one LV is mounted on only one node at a given time.
> > >
> > > But the type of filesystem is independant, this is a higher component.
> > >
> > > In this picture, CLVM is only a low-level component, avoiding the
> > > concurrent
> > > access of many nodes on the LVM metadata written on the shared storage.
> > >
> > > The data are not "spanned" across the local storage of many nodes
> (well,
> > > I suppose you *could* do that, but you would need other tools / layers
> ?)
> > >
> > > Other point : if I remember correctly, the Red Hat doc says it's not
> > > recommended to use GFS on a node that exports a GNBD device. So if you
> > > use GNBD as a shared storage, I suppose it's better to specialize one
> or
> > > more nodes as GNBD "servers".
> > >
> > >
> > > HTH
> > >
> > > > Thanks for your answers!!
> > >
> > > --
> > > Xavier Montagutelli                      Tel : +33 (0)5 55 45 77 20
> > > Service Commun Informatique              Fax : +33 (0)5 55 45 75 95
> > > Universite de Limoges
> > > 123, avenue Albert Thomas
> > > 87060 Limoges cedex
> > >
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Xavier Montagutelli                      Tel : +33 (0)5 55 45 77 20
> Service Commun Informatique              Fax : +33 (0)5 55 45 75 95
> Universite de Limoges
> 123, avenue Albert Thomas
> 87060 Limoges cedex
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081002/4d54631b/attachment.htm>

From jruemker at redhat.com  Thu Oct  2 17:24:25 2008
From: jruemker at redhat.com (John Ruemker)
Date: Thu, 02 Oct 2008 13:24:25 -0400
Subject: [Linux-cluster] adding volume to cluster
In-Reply-To: <14139e3a0810011605o3d9ca45bq918ae7b94e6c1444@mail.gmail.com>
References: <14139e3a0809291459y6d4beb36sf69feb14c18f4b72@mail.gmail.com>	<c99654a0809300846j5e803b9clccd562f3ffd12284@mail.gmail.com>	<14139e3a0809300933s5014e4fp2465554c0bd1255a@mail.gmail.com>	<c99654a0810010845s19589b5ctc98c793b8529fc1c@mail.gmail.com>	<14139e3a0810010934j65e31733kff5e45772f13a95b@mail.gmail.com>	<20081001164247.GB6173@agk.fab.redhat.com>	<14139e3a0810011006o6084a3ddqd7786d150b5b0658@mail.gmail.com>
	<14139e3a0810011605o3d9ca45bq918ae7b94e6c1444@mail.gmail.com>
Message-ID: <48E503C9.1030003@redhat.com>

Terry Davis wrote:
> Awesome.  I rebooted and applied all available updates and now it 
> works.  Only thing worth noting in the updates was a kernel update to 
> 2.6.18-92.1.13.el5.  I think a reboot did it (for some reason).
>
> On Wed, Oct 1, 2008 at 12:06 PM, Terry Davis <terrybdavis at gmail.com 
> <mailto:terrybdavis at gmail.com>> wrote:
>
>     On Wed, Oct 1, 2008 at 11:42 AM, Alasdair G Kergon <agk at redhat.com
>     <mailto:agk at redhat.com>> wrote:
>
>         I hope that problem was fixed in newer packages.
>
>         Meanwhile try running 'clvmd -R' between some of the commands.
>
>         If all else fails, you may have to kill the clvmd daemons in
>         the cluster
>         and restart them, or even add a 'vgscan' on each node before
>         the restart.
>
>         Alasdair
>         --
>         agk at redhat.com <mailto:agk at redhat.com>
>
>
>
>     Just a sanity check.  I killed all the clvmd daemons and started
>     clvmd back up.  I created the PV on node A:
>
>     [root at omadvnfs01a ~]# pvcreate /dev/sdh1
>       Physical volume "/dev/sdh1" successfully created
>
>     Node B knows nothing of /dev/sdh1 but it does exist:
>     [root at omadvnfs01b ~]# ls /dev/sdh*
>     /dev/sdh
>

This is the problem.  If you partition the device on one node, you must 
do a 'partprobe' on all nodes so that they update their partition 
tables.  Without doing this LVM has no idea what /dev/sdh1 is and 
therefore cannot lock on it.  After running partprobe do 'clvmd -R' so 
that clvmd reloads its device cache and knows which devices are 
available.  After that you can proceed with pvcreate, vgcreate, 
lvcreate, etc. 

John


From terrybdavis at gmail.com  Thu Oct  2 18:16:26 2008
From: terrybdavis at gmail.com (Terry Davis)
Date: Thu, 2 Oct 2008 13:16:26 -0500
Subject: [Linux-cluster] adding volume to cluster
In-Reply-To: <48E503C9.1030003@redhat.com>
References: <14139e3a0809291459y6d4beb36sf69feb14c18f4b72@mail.gmail.com>
	<c99654a0809300846j5e803b9clccd562f3ffd12284@mail.gmail.com>
	<14139e3a0809300933s5014e4fp2465554c0bd1255a@mail.gmail.com>
	<c99654a0810010845s19589b5ctc98c793b8529fc1c@mail.gmail.com>
	<14139e3a0810010934j65e31733kff5e45772f13a95b@mail.gmail.com>
	<20081001164247.GB6173@agk.fab.redhat.com>
	<14139e3a0810011006o6084a3ddqd7786d150b5b0658@mail.gmail.com>
	<14139e3a0810011605o3d9ca45bq918ae7b94e6c1444@mail.gmail.com>
	<48E503C9.1030003@redhat.com>
Message-ID: <14139e3a0810021116n3f63c7f1v3b45702c608484b4@mail.gmail.com>

On Thu, Oct 2, 2008 at 12:24 PM, John Ruemker <jruemker at redhat.com> wrote:

> Terry Davis wrote:
>
>> Awesome.  I rebooted and applied all available updates and now it works.
>>  Only thing worth noting in the updates was a kernel update to
>> 2.6.18-92.1.13.el5.  I think a reboot did it (for some reason).
>>
>> On Wed, Oct 1, 2008 at 12:06 PM, Terry Davis <terrybdavis at gmail.com<mailto:
>> terrybdavis at gmail.com>> wrote:
>>
>>    On Wed, Oct 1, 2008 at 11:42 AM, Alasdair G Kergon <agk at redhat.com
>>    <mailto:agk at redhat.com>> wrote:
>>
>>        I hope that problem was fixed in newer packages.
>>
>>        Meanwhile try running 'clvmd -R' between some of the commands.
>>
>>        If all else fails, you may have to kill the clvmd daemons in
>>        the cluster
>>        and restart them, or even add a 'vgscan' on each node before
>>        the restart.
>>
>>        Alasdair
>>        --
>>        agk at redhat.com <mailto:agk at redhat.com>
>>
>>
>>
>>    Just a sanity check.  I killed all the clvmd daemons and started
>>    clvmd back up.  I created the PV on node A:
>>
>>    [root at omadvnfs01a ~]# pvcreate /dev/sdh1
>>      Physical volume "/dev/sdh1" successfully created
>>
>>    Node B knows nothing of /dev/sdh1 but it does exist:
>>    [root at omadvnfs01b ~]# ls /dev/sdh*
>>    /dev/sdh
>>
>>
> This is the problem.  If you partition the device on one node, you must do
> a 'partprobe' on all nodes so that they update their partition tables.
>  Without doing this LVM has no idea what /dev/sdh1 is and therefore cannot
> lock on it.  After running partprobe do 'clvmd -R' so that clvmd reloads its
> device cache and knows which devices are available.  After that you can
> proceed with pvcreate, vgcreate, lvcreate, etc.
> John


Ahhhh, the step that I was missing all along.  I have gone ahead and carved
that into the back of my hand with a dull pencil so I don't forget next
time.

Thanks for the help!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081002/0b32be37/attachment.htm>

From macscr at macscr.com  Fri Oct  3 01:39:59 2008
From: macscr at macscr.com (Mark Chaney)
Date: Thu, 2 Oct 2008 20:39:59 -0500
Subject: [Linux-cluster] error messages explained
Message-ID: <02d501c924f8$f3eb4020$dbc1c060$@com>

Cam someone explain to me these errors and tell me how I should attempt to
resolve them? They both aren't happening at the same time exactly, its just
to errors that I don't truly understand.

####################

ccsd[3192]: Attempt to close an unopened CCS descriptor (13590).
ccsd[3192]: Error while processing disconnect:
Invalid request descriptor

##################

openais[5453]: [MAIN ] Killing node ratchet.local because it has rejoined
the cluster with existing state

##################

Thanks,
Mark


From macscr at macscr.com  Fri Oct  3 01:56:33 2008
From: macscr at macscr.com (Mark Chaney)
Date: Thu, 2 Oct 2008 20:56:33 -0500
Subject: [Linux-cluster] fence daemon settings
Message-ID: <02dd01c924fb$44a2a1f0$cde7e5d0$@com>

I have a 3 node cluster running gfs. They have a private gigabit network
that's used for cluster communication and backups. Can you recommend what
type of settings I should have here? My nodes don't seem to rejoin the
cluster to well after being fenced.

<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="60"/>

Thanks,
Mark


From ccaulfie at redhat.com  Fri Oct  3 07:38:32 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Fri, 03 Oct 2008 08:38:32 +0100
Subject: [Linux-cluster] error messages explained
In-Reply-To: <02d501c924f8$f3eb4020$dbc1c060$@com>
References: <02d501c924f8$f3eb4020$dbc1c060$@com>
Message-ID: <48E5CBF8.4000301@redhat.com>

Mark Chaney wrote:
> Cam someone explain to me these errors and tell me how I should attempt to
> resolve them? They both aren't happening at the same time exactly, its just
> to errors that I don't truly understand.
> 
> ####################
> 
> ccsd[3192]: Attempt to close an unopened CCS descriptor (13590).
> ccsd[3192]: Error while processing disconnect:
> Invalid request descriptor
> 
> ##################
> 
> openais[5453]: [MAIN ] Killing node ratchet.local because it has rejoined
> the cluster with existing state
> 

I need to add this to the FAQ!

What this message means is that a node was a valid member of the cluster
once; it then left the cluster (without being fenced) and rejoined
automatically. This can sometimes happen if the ethernet is disconnected
for a time, usually a few seconds.

If a node leave the cluster, it MUST rejoin using the cman_tool join
command with no services running. The usual way to make this happen is
to reboot the node, and if fencing is configured correctly that is what
normally happens. It could be that fencing is too slow to manage this or
that the cluster is made up of two nodes without a quorum disk so that
the 'other' node doesn't have quorum and cannot initiate fencing.

Another (more common) cause of this, is slow responding of some Cisco
switches as documented here:

http://www.openais.org/doku.php?id=faq:cisco_switches


-- 

Chrissie


From d.vasilets at peterhost.ru  Fri Oct  3 13:30:38 2008
From: d.vasilets at peterhost.ru (=?koi8-r?Q?=F7=C1=D3=C9=CC=C5=C3_?= =?koi8-r?Q?=E4=CD=C9=D4=D2=C9=CA?=)
Date: Fri, 03 Oct 2008 17:30:38 +0400
Subject: [Linux-cluster] can't compile cluster-2.03.07
Message-ID: <1223040639.10584.1.camel@dima-desktop>

I try compile cluster-2.03.07 with kernel 2.6.26-5
how i can fix that ?
every time report 
make[2]: Entering directory `/usr/src/kernels/linux-2.6.26.5'

  WARNING: Symbol version
dump /usr/src/kernels/linux-2.6.26.5/Module.symvers
           is missing; modules will have no dependencies and
modversions.

  Building modules, stage 2.
  MODPOST 1 modules
/bin/sh: scripts/mod/modpost: No such file or directory
make[3]: *** [__modpost] Error 127
make[2]: *** [modules] Error 2
make[2]: Leaving directory `/usr/src/kernels/linux-2.6.26.5'
make[1]: *** [gnbd.ko] Error 2
make[1]: Leaving directory `/root/gfs/cluster/gnbd-kernel/src'
make: *** [gnbd-kernel/src] Error 2


From tuckerd at engr.smu.edu  Fri Oct  3 15:54:39 2008
From: tuckerd at engr.smu.edu (Doug Tucker)
Date: Fri, 03 Oct 2008 10:54:39 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
Message-ID: <1223049279.22918.30.camel@thor.seas.smu.edu>

We recently migrated from a 7 year old file server running on a single
proc dec alpha running Tru64 and utilizing Truclustering for HA, to a
Redhat cluster suite and gfs for HA on a dual duo core dell 2950 with
32gb ram, and have been having major performance issues.  Both have
fiber attached storage.  The old file server grossly outperforms the new
one!  The way we are utilizing it is for nfs file serving only to
multiple clients.  It doesn't take many users doing much on the clients,
to easily drive the load on the boxes into the 10+ range, where on the
old file server it never got above 2 or 3 to perform the same tasks.
The load and performance was much worse, but improved to where we are
now after setting all of the volumes to statfs_fast 1.  I also set nfs
threads to 256, which helped some, but I don't know what more to do, and
we are at the point of abandoning this platform if we cannot get it to
perform reasonably.  Please help!

Sincerely,

Doug Tucker
Network and Systems
Southern Methodist University


From gordan at bobich.net  Fri Oct  3 16:25:30 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Fri, 3 Oct 2008 17:25:30 +0100
Subject: [Linux-cluster] rhcs + gfs performance issues
Message-ID: <48BF35C50600251B@> (added by postmaster@mail.o2.co.uk)

It sounds like you have a SAN (fibre attached storage) that you are trying to turn into a NAS. That's justifiable if you have multiple mirrored SANs, but makes a mockery of HA if you only have one storage device since it leaves you with a single point of failure regardless of the number of front end nodes.

Do you have a separate gigabit interface/vlan just for cluster communication? RHCS doesn't use a lot of sustained bandwidth but performance is sensitive to latencies for DLM comms. If you only have 2 nodes, a direct crossover connection would be ideal.

How big is your data store? Are files large or small? Are they in few directories with lots of files (e.g. Maildir)?

Load averages will go up - that's normal, since there is added latency (round trip time) from locking across nodes. Unless your CPUs is 0% idle, the servers aren't running out of steam. So don't worry about it.

Also note that a clustered FS will _ALWAYS_ be slower than a non-clustered one, all things being equal. No exceptions. Also, if you are load sharing across the nodes, and you have Maildir-like file structures, it'll go slower than a purely fail-over setup, even on a clustered FS (like GFS), since there is no lock bouncing between head nodes. For extra performance, you can use a non-clustered FS as a failover resource, but be very careful with that since dual mounting a non-clustered FS will destroy the volume firtually instantly.

Provided that your data isn't fundamentally unsuitable for being handled by a clustered load sharing setup, you could try increasing lock trimming and increasing the number of resource groups. Search through the archives for details on that.

More suggestions when you provide more details on what your data is like.

Gordan

-----Original Message-----
From: "Doug Tucker" <tuckerd at engr.smu.edu>
To: linux-cluster at redhat.com
Sent: 03/10/08 16:54
Subject: [Linux-cluster] rhcs + gfs performance issues

We recently migrated from a 7 year old file server running on a single
proc dec alpha running Tru64 and utilizing Truclustering for HA, to a
Redhat cluster suite and gfs for HA on a dual duo core dell 2950 with
32gb ram, and have been having major performance issues.  Both have
fiber attached storage.  The old file server grossly outperforms the new
one!  The way we are utilizing it is for nfs file serving only to
multiple clients.  It doesn't take many users doing much on the clients,
to easily drive the load on the boxes into the 10+ range, where on the
old file server it never got above 2 or 3 to perform the same tasks.
The load and performance was much worse, but improved to where we are
now after setting all of the volumes to statfs_fast 1.  I also set nfs
threads to 256, which helped some, but I don't know what more to do, and
we are at the point of abandoning this platform if we cannot get it to
perform reasonably.  Please help!

Sincerely,

Doug Tucker
Network and Systems
Southern Methodist University

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


From tuckerd at engr.smu.edu  Fri Oct  3 16:56:51 2008
From: tuckerd at engr.smu.edu (Doug Tucker)
Date: Fri, 03 Oct 2008 11:56:51 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <48BF35C50600251B@> (added by postmaster@mail.o2.co.uk)
References: <48BF35C50600251B@> (added by postmaster@mail.o2.co.uk)
Message-ID: <1223053011.22918.62.camel@thor.seas.smu.edu>

Thanks so much for the reply, hopefully this will lead to something.

On Fri, 2008-10-03 at 17:25 +0100, Gordan Bobic wrote:
> It sounds like you have a SAN (fibre attached storage) that you are trying to turn into a NAS. That's justifiable if you have multiple mirrored SANs, but makes a mockery of HA if you only have one storage device since it leaves you with a single point of failure regardless of the number of front end nodes.

Understood on the san single point of failure.  We're addressing HA on
the front end, don't have the money to address the backend yet.  Storage
is something you set up once and don't have to mess with again, and
doesn't do things like have application issues, etc, it's just storage.
So barring a hardware issue not covered by redundant power supplies,
spare disks, etc, it doesn't have issues.  Having a cluster on the front
end allows for failure of software on one, being able to reboot one, and
provide zero downtime to the clients.

> Do you have a separate gigabit interface/vlan just for cluster communication? RHCS doesn't use a lot of sustained bandwidth but performance is sensitive to latencies for DLM comms. If you only have 2 nodes, a direct crossover connection would be ideal.

Not sure how to accomplish that.  How do you get certain services of the
cluster environment to talk over 1 interface, and other services (such
as the shares) over another?  The only other interface I have configured
is for the fence device (dell drac cards).

> How big is your data store? Are files large or small? Are they in few directories with lots of files (e.g. Maildir)?

Very much mixed.  We have SAS and SATA  in the same SAN device, and
carved out based on application performance need.  Some large volumes
(7TB), some small (2GB).  Some large files (video) down to the mix of
millions of 1k user files.

> Load averages will go up - that's normal, since there is added latency (round trip time) from locking across nodes. Unless your CPUs is 0% idle, the servers aren't running out of steam. So don't worry about it.

Understood.  That was just the measure I used as comparison.  There is
definite performance lag during these higher load averages.  What I was
trying (and doing poorly) to communicate was that all we are doing here
is serving files over nfs..we're not running apps on the cluster
itself...difficult for me to understand why file serving would be so
slow or ever drive load up on a box that high.  And, the old file
server, did not have these performance issues doing the same tasks with
less hardware, bandwith, etc.

> Also note that a clustered FS will _ALWAYS_ be slower than a non-clustered one, all things being equal. No exceptions. Also, if you are load sharing across the nodes, and you have Maildir-like file structures, it'll go slower than a purely fail-over setup, even on a clustered FS (like GFS), since there is no lock bouncing between head nodes. For extra performance, you can use a non-clustered FS as a failover resource, but be very careful with that since dual mounting a non-clustered FS will destroy the volume firtually instantly.

Agreed.  That's not the comaprison though.  Our old file server was
running a clustered file system from Tru64 (AdvFS).  Our expectation was
that a newer technology, plus a major upgrade in hardware, would result
in better performance at least than what we had, it has not, it is far
worse.

> Provided that your data isn't fundamentally unsuitable for being handled by a clustered load sharing setup, you could try increasing lock trimming and increasing the number of resource groups. Search through the archives for details on that.

Can you point me in the direction of the archives?  I can't seem to find
them?

> More suggestions when you provide more details on what your data is like.

My apologies for the lack of detail, I'm a bit lost as to what to
provide.  It's basic files, large and small.  User volumes, webserver
volumes, postfix mail volumes, etc.  Thanks so much!

> Gordan
> 


From gordan at bobich.net  Fri Oct  3 17:29:33 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Fri, 03 Oct 2008 18:29:33 +0100
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223053011.22918.62.camel@thor.seas.smu.edu>
References: <48BF35C50600251B@> (added by postmaster@mail.o2.co.uk)
	<1223053011.22918.62.camel@thor.seas.smu.edu>
Message-ID: <48E6567D.5020508@bobich.net>

Doug Tucker wrote:

>> Do you have a separate gigabit interface/vlan just for cluster
 >> communication? RHCS doesn't use a lot of sustained bandwidth but
 >> performance is sensitive to latencies for DLM comms. If you only have
 >> 2 nodes, a direct crossover connection would be ideal.
> 
> Not sure how to accomplish that.  How do you get certain services of the
> cluster environment to talk over 1 interface, and other services (such
> as the shares) over another?  The only other interface I have configured
> is for the fence device (dell drac cards).

In your cluster.conf, make sure in the

<cluternode name="node1c"....

section is pointing at a private crossover IP of the node. Say you have 
2nd dedicated Gb interface for the clustering, assign it address, say 
10.0.0.1, and in the hosts file, have something like

10.0.0.1 node1c
10.0.0.2 node2c

That way each node in the cluster is referred to by it's cluster 
interface name, and thus the cluster communication will go over that 
dedicated interface.

The fail-over resources (typically client-side IPs) remain as they are 
on the client-side subnet.

>> How big is your data store? Are files large or small? Are they in
>> few directories with lots of files (e.g. Maildir)?
> 
> Very much mixed.  We have SAS and SATA  in the same SAN device, and
> carved out based on application performance need.  Some large volumes
> (7TB), some small (2GB).  Some large files (video) down to the mix of
> millions of 1k user files.

GFS copes OK with large files split across many separate directories. 
But if you are expecting to get fast random writes on files in the same 
directory, prepare to be disappointed. A write to a directory requires a 
directory lock, so concurrent writes to the same directory are going to 
have major performance issues. There isn't really any way to work around 
that, on any clustered FS.

As long as there is no directory write contention, it should be OK, though.

>> Load averages will go up - that's normal, since there is added latency
>> (round trip time) from locking across nodes. Unless your CPUs is 0% idle,
>> the servers aren't running out of steam. So don't worry about it.
> 
> Understood.  That was just the measure I used as comparison.  There is
> definite performance lag during these higher load averages.  What I was
> trying (and doing poorly) to communicate was that all we are doing here
> is serving files over nfs..we're not running apps on the cluster
> itself...difficult for me to understand why file serving would be so
> slow or ever drive load up on a box that high.

It sounds like you are seeing write contention. Make sure you mount 
everything with noatime,nodiratime,noquota, both from the GFS and from 
the NFS clients' side. Otherwise ever read will also require a write, 
and that'll kill any hope of getting decent performance out of the system.

> And, the old file
> server, did not have these performance issues doing the same tasks with
> less hardware, bandwith, etc.

I'm guessing the old server was standalone, rather than clustered?

>> Also note that a clustered FS will _ALWAYS_ be slower than a non-clustered
 >> one, all things being equal. No exceptions. Also, if you are load
 >> sharingacross the nodes, and you have Maildir-like file structures,
>> it'll go slower than a purely fail-over setup, even on a clustered
>> FS (like GFS), since there is no lock bouncing between head nodes.
>> For extra performance, you can use a non-clustered FS as a failover
 >> resource, but be very careful with that since dual mounting a
>> non-clustered FS will destroy the volume firtually instantly.
> 
> Agreed.  That's not the comaprison though.  Our old file server was
> running a clustered file system from Tru64 (AdvFS).  Our expectation was
> that a newer technology, plus a major upgrade in hardware, would result
> in better performance at least than what we had, it has not, it is far
> worse.

I see, so you had two servers in a load-sharing write-write 
configuration before, too?

>> Provided that your data isn't fundamentally unsuitable for being
>> handled by a clustered load sharing setup, you could try
>> increasing lock trimming and increasing the number of resource
>> groups. Search through the archives for details on that.
> 
> Can you point me in the direction of the archives?  I can't seem to find
> them?

Try here:
http://www.mail-archive.com/search?l=linux-cluster%40redhat.com

Look for gfs lock trimming and resource group related tuning.

>> More suggestions when you provide more details on what your data is like.
> 
> My apologies for the lack of detail, I'm a bit lost as to what to
> provide.  It's basic files, large and small.  User volumes, webserver
> volumes, postfix mail volumes, etc.

The important thing is to:
1) reduce the number of concurrent writes to the same directory to the 
maximum extent possible.
2) reduce the number of unnecessary writes (noatime,nodiratime)

All writes require locks to be bounced between the nodes, and this can 
add a significant overhead.

If you set the nodes up in a fail-over configuration, and server all the 
traffic from the primary node, you may see the performance improve due 
to locks not being bounced around all the time, they'll get set on the 
master node and stay there until the master node fails and it's floating 
IP gets migrated to the other node.

Gordan


From macscr at macscr.com  Fri Oct  3 17:48:39 2008
From: macscr at macscr.com (Mark Chaney)
Date: Fri, 3 Oct 2008 12:48:39 -0500
Subject: [Linux-cluster] error messages explained
In-Reply-To: <48E5CBF8.4000301@redhat.com>
References: <02d501c924f8$f3eb4020$dbc1c060$@com> <48E5CBF8.4000301@redhat.com>
Message-ID: <033b01c92580$464023e0$d2c06ba0$@com>

When you say, need to join with the services running. What services do I
need to start in order to do this manual join? Just cman? If a node crashes
and cant rejoin. I have to hurry up (before its fenced again) and disable
the auto start (chkconfig) of the following services: rgmanager, gfs, clvmd,
and cman. Then reboot that node again? Then start cman and try to rejoin
with just the cman_tool?

The question is, if a server isn't part of a cluster anymore (aka, it was
rebooted), the cluster obviously recognizes that disconnect and since the
node was rebooted, it shouldn't even think its part of a cluster. So why in
the world does anything think it is?

All these manually changes after a simple node reboot or fencing just
doesn't seem like a good design plan. I don't consider myself even
moderately knowledgeable in this arena, I am just looking at this from a
design perspective.

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Christine Caulfield
Sent: Friday, October 03, 2008 2:39 AM
To: linux clustering
Subject: Re: [Linux-cluster] error messages explained

Mark Chaney wrote:
> Cam someone explain to me these errors and tell me how I should attempt to
> resolve them? They both aren't happening at the same time exactly, its
just
> to errors that I don't truly understand.
> 
> ####################
> 
> ccsd[3192]: Attempt to close an unopened CCS descriptor (13590).
> ccsd[3192]: Error while processing disconnect:
> Invalid request descriptor
> 
> ##################
> 
> openais[5453]: [MAIN ] Killing node ratchet.local because it has rejoined
> the cluster with existing state
> 

I need to add this to the FAQ!

What this message means is that a node was a valid member of the cluster
once; it then left the cluster (without being fenced) and rejoined
automatically. This can sometimes happen if the ethernet is disconnected
for a time, usually a few seconds.

If a node leave the cluster, it MUST rejoin using the cman_tool join
command with no services running. The usual way to make this happen is
to reboot the node, and if fencing is configured correctly that is what
normally happens. It could be that fencing is too slow to manage this or
that the cluster is made up of two nodes without a quorum disk so that
the 'other' node doesn't have quorum and cannot initiate fencing.

Another (more common) cause of this, is slow responding of some Cisco
switches as documented here:

http://www.openais.org/doku.php?id=faq:cisco_switches


-- 

Chrissie

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


From tuckerd at engr.smu.edu  Fri Oct  3 19:47:04 2008
From: tuckerd at engr.smu.edu (Doug Tucker)
Date: Fri, 03 Oct 2008 14:47:04 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <48E6567D.5020508@bobich.net>
References: <48BF35C50600251B@> (added by postmaster@mail.o2.co.uk)
	<1223053011.22918.62.camel@thor.seas.smu.edu>
	<48E6567D.5020508@bobich.net>
Message-ID: <1223063224.22918.102.camel@thor.seas.smu.edu>

Let me say first, I appreciate your help tremendously.  Let me answer
some questions, and then I need to go do some homework you have
suggested.


> In your cluster.conf, make sure in the
> 
> <cluternode name="node1c"....
> 
> section is pointing at a private crossover IP of the node. Say you have 
> 2nd dedicated Gb interface for the clustering, assign it address, say 
> 10.0.0.1, and in the hosts file, have something like
> 
> 10.0.0.1 node1c
> 10.0.0.2 node2c
> 
> That way each node in the cluster is referred to by it's cluster 
> interface name, and thus the cluster communication will go over that 
> dedicated interface.
> 
I'm not sure I understand this correctly, please bear with me, are you
saying the communication runs over the fenced interface?  Or that the
node name should reference a seperate nic that is private, and the
exported virtual ip to the clients is done over the public interface?
I'm confused, I thought that definition had to be the same as the
hostname of the box?  Here is what is in my conf file for reference:

 <clusternode name="engrfs1.seas.smu.edu" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device modulename=""
name="engrfs1drac"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="engrfs2.seas.smu.edu" nodeid="2"
votes="1">
                        <fence>
                                <method name="1">
                                        <device modulename=""
name="engrfs2drac"/>
                                </method>
                        </fence>

Where as engrfs1 and 2 are the actual hostnames of the boxes.
 

> The fail-over resources (typically client-side IPs) remain as they are 
> on the client-side subnet.

> It sounds like you are seeing write contention. Make sure you mount 
> everything with noatime,nodiratime,noquota, both from the GFS and from 
> the NFS clients' side. Otherwise ever read will also require a write, 
> and that'll kill any hope of getting decent performance out of the system.
Already mounted noatime, will add nodiratime.  Can't do noquota, we
implement quotas for ever users here (5000 or so), and did so on the old
file server.
> 

> 
> I'm guessing the old server was standalone, rather than clustered?
No, clustered, as I assume you realized below, just making sure it's
clear.
> 

> 
> I see, so you had two servers in a load-sharing write-write 
> configuration before, too?
Certainly were capable of such.  However here, as we did there, we set
it up in more of a failover mode.  We export a virtual ip attached to
the nfs export, and all clients mount the vip, so whichever machine has
the vip at a given time is "master" and gets all the traffic.  The only
exception to this is the backups that run at night, we do on the
"secondary" machine directly, rather than using the vip.  And the
secondary is only there in the event of a failure to node1, when node1
comes back online, it is set up to fail back to node1.
> 


> If you set the nodes up in a fail-over configuration, and server all the 
> traffic from the primary node, you may see the performance improve due 
> to locks not being bounced around all the time, they'll get set on the 
> master node and stay there until the master node fails and it's floating 
> IP gets migrated to the other node.
As explained above, exactly how it is set up.  Old file server the same
way.  We're basically completely scratching our heads in disbelief here
to a large degree.  No if/ands/buts about it, hardware wise, we have
500% more box than we used to have.  Configuration architecture is
virtually identical.  Which leaves us with the software, which leaves us
with only 2 conclusions we can come up with:

1)  Tru64 and TruCluster with Advfs from 7 years ago is simply that much
more robust and mature than RHES4 and CS/GFS and therefore tremendously
outperforms it...or
2)  We have this badly configured.
> 
> Gordan
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From macscr at macscr.com  Fri Oct  3 22:32:00 2008
From: macscr at macscr.com (Mark Chaney)
Date: Fri, 3 Oct 2008 17:32:00 -0500
Subject: [Linux-cluster] fencing single server results in two servers fenced
Message-ID: <036401c925a7$dbc29b60$9347d220$@com>

I have a 3 node cluster. When I fence a single server (skydive.local) from
wheeljack.local, I get the following results (2 servers clustered). I can
duplicate this with any of my 3 servers. Why is the second server being
fenced?

##############################################

[root at wheeljack ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M   9580   2008-10-03 16:35:10  ratchet.local
   2   M   9580   2008-10-03 16:35:10  skydive.local
   3   M   9568   2008-10-03 16:35:10  wheeljack.local
[root at wheeljack ~]# tail -f /var/log/messages
Oct  3 16:35:15 wheeljack kernel: dlm: Using TCP for communications
Oct  3 16:35:15 wheeljack kernel: dlm: got connection from 1
Oct  3 16:35:15 wheeljack kernel: dlm: got connection from 2
Oct  3 16:35:16 wheeljack clvmd: Cluster LVM daemon started - connected to
CMAN
Oct  3 16:35:16 wheeljack multipathd: dm-3: add map (uevent)
Oct  3 16:35:16 wheeljack multipathd: dm-4: add map (uevent)
Oct  3 16:35:16 wheeljack multipathd: dm-5: add map (uevent)
Oct  3 16:35:16 wheeljack multipathd: dm-6: add map (uevent)
Oct  3 16:35:16 wheeljack multipathd: dm-7: add map (uevent)
Oct  3 16:35:17 wheeljack clurgmgrd[5407]: <notice> Resource Group Manager
Starting

Oct  3 16:37:15 wheeljack ntpd[3756]: synchronized to LOCAL(0), stratum 10
Oct  3 16:37:15 wheeljack ntpd[3756]: kernel time sync enabled 0001
Oct  3 16:38:19 wheeljack ntpd[3756]: synchronized to 66.79.149.35, stratum
2
Oct  3 16:42:42 wheeljack ntpd[3756]: synchronized to 64.202.112.65, stratum
2

Oct  3 16:44:05 wheeljack openais[5217]: [TOTEM] entering GATHER state from
12.
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] entering GATHER state from
11.
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] Saving state aru 52 high
seq received 52
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] Storing new sequence id for
ring 2570
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] entering COMMIT state.
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] entering RECOVERY state.
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] position [0] member
192.168.1.10:
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] previous ring seq 9580 rep
192.168.1.10
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] aru 52 high delivered 52
received flag 1
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] position [1] member
192.168.1.11:
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] previous ring seq 9580 rep
192.168.1.10
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] aru 52 high delivered 52
received flag 1
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] Did not need to originate
any messages in recovery.
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] CLM CONFIGURATION CHANGE
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] New Configuration:
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ]        r(0)
ip(192.168.1.10)
Oct  3 16:44:10 wheeljack kernel: dlm: closing connection to node 2
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ]        r(0)
ip(192.168.1.11)
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] Members Left:
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ]        r(0)
ip(192.168.1.12)
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] Members Joined:
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] CLM CONFIGURATION CHANGE
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] New Configuration:
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ]        r(0)
ip(192.168.1.10)
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ]        r(0)
ip(192.168.1.11)
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] Members Left:
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] Members Joined:
Oct  3 16:44:10 wheeljack openais[5217]: [SYNC ] This node is within the
primary component and will provide service.
Oct  3 16:44:10 wheeljack openais[5217]: [TOTEM] entering OPERATIONAL state.
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] got nodejoin message
192.168.1.10
Oct  3 16:44:10 wheeljack openais[5217]: [CLM  ] got nodejoin message
192.168.1.11
Oct  3 16:44:10 wheeljack openais[5217]: [CPG  ] got joinlist message from
node 3
Oct  3 16:44:10 wheeljack openais[5217]: [CPG  ] got joinlist message from
node 1
Oct  3 16:44:10 wheeljack fenced[5236]: fencing deferred to ratchet.local

[root at wheeljack ~]#
Message from syslogd@ at Fri Oct  3 16:48:12 2008 ...
wheeljack clurgmgrd[5407]: <emerg> #1: Quorum Dissolved
Message from syslogd@ at Fri Oct  3 16:52:55 2008 ...
wheeljack clurgmgrd[5407]: <emerg> #1: Quorum Dissolved


From orkcu at yahoo.com  Sat Oct  4 00:14:06 2008
From: orkcu at yahoo.com (Roger Pena Escobio)
Date: Fri, 3 Oct 2008 17:14:06 -0700 (PDT)
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223063224.22918.102.camel@thor.seas.smu.edu>
Message-ID: <24956.47991.qm@web88307.mail.re4.yahoo.com>


--- On Fri, 10/3/08, Doug Tucker <tuckerd at engr.smu.edu> wrote:

> From: Doug Tucker <tuckerd at engr.smu.edu>
> Subject: Re: [Linux-cluster] rhcs + gfs performance issues
> To: "linux clustering" <linux-cluster at redhat.com>
> Received: Friday, October 3, 2008, 3:47 PM
> Let me say first, I appreciate your help tremendously.  Let
> me answer
> some questions, and then I need to go do some homework you
> have
> suggested.
> 
> 
> > In your cluster.conf, make sure in the
> > 
> > <cluternode name="node1c"....
> > 
> > section is pointing at a private crossover IP of the
> node. Say you have 
> > 2nd dedicated Gb interface for the clustering, assign
> it address, say 
> > 10.0.0.1, and in the hosts file, have something like
> > 
> > 10.0.0.1 node1c
> > 10.0.0.2 node2c
> > 
> > That way each node in the cluster is referred to by
> it's cluster 
> > interface name, and thus the cluster communication
> will go over that 
> > dedicated interface.
> > 
> I'm not sure I understand this correctly, please bear
> with me, are you
> saying the communication runs over the fenced interface? 
> Or that the
> node name should reference a seperate nic that is private,
> and the
> exported virtual ip to the clients is done over the public
> interface?
> I'm confused, I thought that definition had to be the
> same as the
> hostname of the box?  Here is what is in my conf file for
> reference:
> 
>  <clusternode name="engrfs1.seas.smu.edu"
> nodeid="1" votes="1">
>                         <fence>
>                                 <method
> name="1">
>                                         <device
> modulename=""
> name="engrfs1drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode
> name="engrfs2.seas.smu.edu" nodeid="2"
> votes="1">
>                         <fence>
>                                 <method
> name="1">
>                                         <device
> modulename=""
> name="engrfs2drac"/>
>                                 </method>
>                         </fence>
> 
> Where as engrfs1 and 2 are the actual hostnames of the
> boxes.

the cluster will do the hearbeat and internal communication through the interface that the "nodes" (declared in cluster.conf) are reachable
that mean that you can use "internal" names (name declared in /etc/hosts in every node) just for the cluster communication

>  
> 


> > I see, so you had two servers in a load-sharing
> write-write 
> > configuration before, too?
> Certainly were capable of such.  However here, as we did
> there, we set
> it up in more of a failover mode.  We export a virtual ip
> attached to
> the nfs export, and all clients mount the vip, so whichever
> machine has
> the vip at a given time is "master" and gets all
> the traffic.  The only
> exception to this is the backups that run at night, we do
> on the
> "secondary" machine directly, rather than using
> the vip.  And the
> secondary is only there in the event of a failure to node1,
> when node1
> comes back online, it is set up to fail back to node1.
> > 
> 
> 
> > If you set the nodes up in a fail-over configuration,
> and server all the 
> > traffic from the primary node, you may see the
> performance improve due 
> > to locks not being bounced around all the time,
> they'll get set on the 
> > master node and stay there until the master node fails
> and it's floating 
> > IP gets migrated to the other node.
> As explained above, exactly how it is set up.  Old file
> server the same

then I suggest you to define the service as failover (active-pasive) but the fs as GFS so you can mount it on-demand when you need to made the backup then umount it
that way during normal workload the cluster nodes will not need to agree when reading/writing to the FS

and, also you should measure the performance of the NFS standalone linux server compared to the tru64 nfs server, maybe the big performance degradation you are noticing is not the cluster layer

thanks
roger


From ccaulfie at redhat.com  Sat Oct  4 14:28:59 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Sat, 04 Oct 2008 15:28:59 +0100
Subject: [Linux-cluster] error messages explained
In-Reply-To: <033b01c92580$464023e0$d2c06ba0$@com>
References: <02d501c924f8$f3eb4020$dbc1c060$@com> <48E5CBF8.4000301@redhat.com>
	<033b01c92580$464023e0$d2c06ba0$@com>
Message-ID: <48E77DAB.5000000@redhat.com>

Mark Chaney wrote:
> When you say, need to join with the services running. What services do I
> need to start in order to do this manual join? Just cman? If a node crashes
> and cant rejoin. I have to hurry up (before its fenced again) and disable
> the auto start (chkconfig) of the following services: rgmanager, gfs, clvmd,
> and cman. Then reboot that node again? Then start cman and try to rejoin
> with just the cman_tool?
> 
> The question is, if a server isn't part of a cluster anymore (aka, it was
> rebooted), the cluster obviously recognizes that disconnect and since the
> node was rebooted, it shouldn't even think its part of a cluster. So why in
> the world does anything think it is?
> 
> All these manually changes after a simple node reboot or fencing just
> doesn't seem like a good design plan. I don't consider myself even
> moderately knowledgeable in this arena, I am just looking at this from a
> design perspective.
> 

I think you have misunderstood my. The point is that if a node leaves
the cluster it really should be rebooted and join the cluster cleanly
that way. There is no manual involvement at all. That's what the init
scripts are for and why they are run at startup.

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Christine Caulfield
> Sent: Friday, October 03, 2008 2:39 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] error messages explained
> 
> Mark Chaney wrote:
>> Cam someone explain to me these errors and tell me how I should attempt to
>> resolve them? They both aren't happening at the same time exactly, its
> just
>> to errors that I don't truly understand.
>>
>> ####################
>>
>> ccsd[3192]: Attempt to close an unopened CCS descriptor (13590).
>> ccsd[3192]: Error while processing disconnect:
>> Invalid request descriptor
>>
>> ##################
>>
>> openais[5453]: [MAIN ] Killing node ratchet.local because it has rejoined
>> the cluster with existing state
>>
> 
> I need to add this to the FAQ!
> 
> What this message means is that a node was a valid member of the cluster
> once; it then left the cluster (without being fenced) and rejoined
> automatically. This can sometimes happen if the ethernet is disconnected
> for a time, usually a few seconds.
> 
> If a node leave the cluster, it MUST rejoin using the cman_tool join
> command with no services running. The usual way to make this happen is
> to reboot the node, and if fencing is configured correctly that is what
> normally happens. It could be that fencing is too slow to manage this or
> that the cluster is made up of two nodes without a quorum disk so that
> the 'other' node doesn't have quorum and cannot initiate fencing.
> 
> Another (more common) cause of this, is slow responding of some Cisco
> switches as documented here:
> 
> http://www.openais.org/doku.php?id=faq:cisco_switches
> 
> 


-- 

Chrissie


From gordan at bobich.net  Sat Oct  4 17:32:58 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Sat, 04 Oct 2008 18:32:58 +0100
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223063224.22918.102.camel@thor.seas.smu.edu>
References: <48BF35C50600251B@> (added by
	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>	<48E6567D.5020508@bobich.net>
	<1223063224.22918.102.camel@thor.seas.smu.edu>
Message-ID: <48E7A8CA.1000502@bobich.net>

Doug Tucker wrote:
>> In your cluster.conf, make sure in the
>>
>> <cluternode name="node1c"....
>>
>> section is pointing at a private crossover IP of the node. Say you have 
>> 2nd dedicated Gb interface for the clustering, assign it address, say 
>> 10.0.0.1, and in the hosts file, have something like
>>
>> 10.0.0.1 node1c
>> 10.0.0.2 node2c
>>
>> That way each node in the cluster is referred to by it's cluster 
>> interface name, and thus the cluster communication will go over that 
>> dedicated interface.
>>
> I'm not sure I understand this correctly, please bear with me, are you
> saying the communication runs over the fenced interface?

No, over a dedicated, separate interface.

> Or that the
> node name should reference a seperate nic that is private, and the
> exported virtual ip to the clients is done over the public interface?

That's the one.

> I'm confused, I thought that definition had to be the same as the
> hostname of the box?

No. The floating IPs will get assigned to whatever interface has the IP 
on that subnet. The cluster/DLM comms interface is inferred by the node 
name.

> Here is what is in my conf file for reference:
> 
>  <clusternode name="engrfs1.seas.smu.edu" nodeid="1" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device modulename=""
> name="engrfs1drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="engrfs2.seas.smu.edu" nodeid="2"
> votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device modulename=""
> name="engrfs2drac"/>
>                                 </method>
>                         </fence>
> 
> Where as engrfs1 and 2 are the actual hostnames of the boxes.

Add another NIC in, give it a private IP/subnet, and put it in the hosts 
file on both nodes as something like engrfs1-cluster.seas.smu.edu, and 
put that in the clusternode name entry.

>> The fail-over resources (typically client-side IPs) remain as they are 
>> on the client-side subnet.
> 
>> It sounds like you are seeing write contention. Make sure you mount 
>> everything with noatime,nodiratime,noquota, both from the GFS and from 
>> the NFS clients' side. Otherwise ever read will also require a write, 
>> and that'll kill any hope of getting decent performance out of the system.
 >
> Already mounted noatime, will add nodiratime.  Can't do noquota, we
> implement quotas for ever users here (5000 or so), and did so on the old
> file server.
> 
>> I'm guessing the old server was standalone, rather than clustered?
 >
> No, clustered, as I assume you realized below, just making sure it's
> clear.

OK, noted.

>> I see, so you had two servers in a load-sharing write-write 
>> configuration before, too?
 >
> Certainly were capable of such.  However here, as we did there, we set
> it up in more of a failover mode.  We export a virtual ip attached to
> the nfs export, and all clients mount the vip, so whichever machine has
> the vip at a given time is "master" and gets all the traffic.  The only
> exception to this is the backups that run at night, we do on the
> "secondary" machine directly, rather than using the vip.  And the
> secondary is only there in the event of a failure to node1, when node1
> comes back online, it is set up to fail back to node1.

OK, that should be fine, although you may find there's less of a 
performance hit if you do the backup from the master node, too, as 
that'll already have the locks on all the files.

>> If you set the nodes up in a fail-over configuration, and server all the 
>> traffic from the primary node, you may see the performance improve due 
>> to locks not being bounced around all the time, they'll get set on the 
>> master node and stay there until the master node fails and it's floating 
>> IP gets migrated to the other node.
 >
> As explained above, exactly how it is set up.  Old file server the same
> way.  We're basically completely scratching our heads in disbelief here
> to a large degree.  No if/ands/buts about it, hardware wise, we have
> 500% more box than we used to have.  Configuration architecture is
> virtually identical.  Which leaves us with the software, which leaves us
> with only 2 conclusions we can come up with:
> 
> 1)  Tru64 and TruCluster with Advfs from 7 years ago is simply that much
> more robust and mature than RHES4 and CS/GFS and therefore tremendously
> outperforms it...or

RHEL4 is quite old. It's been a while since I used it for clustering. 
RHEL5 has yielded considerably better performance in my experience.

> 2)  We have this badly configured.

There isn't all that much to tune on RHEL4 cluster-wise, most of the 
tweakability has been added more recently than I've last used it. I'd 
say RHEL5 is certainly worth trying. The problem you are having may just 
go away.

Gordan


From macscr at macscr.com  Sat Oct  4 17:38:13 2008
From: macscr at macscr.com (Mark Chaney)
Date: Sat, 4 Oct 2008 12:38:13 -0500
Subject: [Linux-cluster] error messages explained
In-Reply-To: <48E77DAB.5000000@redhat.com>
References: <02d501c924f8$f3eb4020$dbc1c060$@com>
	<48E5CBF8.4000301@redhat.com>	<033b01c92580$464023e0$d2c06ba0$@com>
	<48E77DAB.5000000@redhat.com>
Message-ID: <03a901c92647$fb81efa0$f285cee0$@com>

Unfortunately simply rebooting has never resolved those errors. =/. I am
getting these errors after a server is fenced and is rebooted. Then its
fenced again, still same errors. I basically have to shutdown the entire
cluster manually, reboot with all init scripts off, then have manually start
all cluster services and add the services back to chkconfig. This is
basically the process I have to do 95% of the time when a single server is
fenced. =/

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Christine Caulfield
Sent: Saturday, October 04, 2008 9:29 AM
To: linux clustering
Subject: Re: [Linux-cluster] error messages explained

Mark Chaney wrote:
> When you say, need to join with the services running. What services do I
> need to start in order to do this manual join? Just cman? If a node
crashes
> and cant rejoin. I have to hurry up (before its fenced again) and disable
> the auto start (chkconfig) of the following services: rgmanager, gfs,
clvmd,
> and cman. Then reboot that node again? Then start cman and try to rejoin
> with just the cman_tool?
> 
> The question is, if a server isn't part of a cluster anymore (aka, it was
> rebooted), the cluster obviously recognizes that disconnect and since the
> node was rebooted, it shouldn't even think its part of a cluster. So why
in
> the world does anything think it is?
> 
> All these manually changes after a simple node reboot or fencing just
> doesn't seem like a good design plan. I don't consider myself even
> moderately knowledgeable in this arena, I am just looking at this from a
> design perspective.
> 

I think you have misunderstood my. The point is that if a node leaves
the cluster it really should be rebooted and join the cluster cleanly
that way. There is no manual involvement at all. That's what the init
scripts are for and why they are run at startup.

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Christine Caulfield
> Sent: Friday, October 03, 2008 2:39 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] error messages explained
> 
> Mark Chaney wrote:
>> Cam someone explain to me these errors and tell me how I should attempt
to
>> resolve them? They both aren't happening at the same time exactly, its
> just
>> to errors that I don't truly understand.
>>
>> ####################
>>
>> ccsd[3192]: Attempt to close an unopened CCS descriptor (13590).
>> ccsd[3192]: Error while processing disconnect:
>> Invalid request descriptor
>>
>> ##################
>>
>> openais[5453]: [MAIN ] Killing node ratchet.local because it has rejoined
>> the cluster with existing state
>>
> 
> I need to add this to the FAQ!
> 
> What this message means is that a node was a valid member of the cluster
> once; it then left the cluster (without being fenced) and rejoined
> automatically. This can sometimes happen if the ethernet is disconnected
> for a time, usually a few seconds.
> 
> If a node leave the cluster, it MUST rejoin using the cman_tool join
> command with no services running. The usual way to make this happen is
> to reboot the node, and if fencing is configured correctly that is what
> normally happens. It could be that fencing is too slow to manage this or
> that the cluster is made up of two nodes without a quorum disk so that
> the 'other' node doesn't have quorum and cannot initiate fencing.
> 
> Another (more common) cause of this, is slow responding of some Cisco
> switches as documented here:
> 
> http://www.openais.org/doku.php?id=faq:cisco_switches
> 
> 


-- 

Chrissie

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


From d.vasilets at peterhost.ru  Mon Oct  6 07:22:09 2008
From: d.vasilets at peterhost.ru (=?koi8-r?Q?=F7=C1=D3=C9=CC=C5=C3_?= =?koi8-r?Q?=E4=CD=C9=D4=D2=C9=CA?=)
Date: Mon, 06 Oct 2008 11:22:09 +0400
Subject: [Linux-cluster] where i can find  define of  "volume_id_get_type"
Message-ID: <1223277729.10374.2.camel@dima-desktop>

i try compile last version of cluster package
have error " undefined reference to `volume_id_get_type'"
where i can find define of this function ?


From edoardo.causarano at laitspa.it  Mon Oct  6 09:00:27 2008
From: edoardo.causarano at laitspa.it (Edoardo Causarano)
Date: Mon, 6 Oct 2008 11:00:27 +0200
Subject: [Linux-cluster] "gfs" init script configuration
In-Reply-To: <B9F5EE1EE80E12479854A81337FB102C030A2149@RLBEMAIL02.interno.regione.lazio.it>
References: <B9F5EE1EE80E12479854A81337FB102C030A2149@RLBEMAIL02.interno.regione.lazio.it>
Message-ID: <1223283627.6492.1.camel@ecausarano-laptop>

Hi all,

can anyone help me on this issue I mentioned in my pevious email?

e


On mer, 2008-10-01 at 19:18 +0200, Edoardo Causarano wrote:
> Hi all,
> 
>  
> 
> further investigation shows that my gfs stalling on reboot is due to
> incorrect specification of the filesystem in /ec/fstab. I mount is as
> _netdev so /etc/init.d/gfs won?t pick it up.
> 
>  
> 
> What is the correct syntax to make sure /etc/init.d/gfs will pick up
> the fs at the right time during shutdown (before tearing down
> scsi_reservation and clustering)?
> 
>  
> 
> IE? can I peek at your fstabs? ;)
> 
>  
> 
> E
> 
> (excuse me for the outlook mail)
> 
> 
> Documento in testo semplice attachment (ATT265888.txt)
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From swhiteho at redhat.com  Mon Oct  6 09:20:45 2008
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Mon, 06 Oct 2008 10:20:45 +0100
Subject: [Linux-cluster] "gfs" init script configuration
In-Reply-To: <1223283627.6492.1.camel@ecausarano-laptop>
References: <B9F5EE1EE80E12479854A81337FB102C030A2149@RLBEMAIL02.interno.regione.lazio.it>
	<1223283627.6492.1.camel@ecausarano-laptop>
Message-ID: <1223284845.3540.7.camel@localhost.localdomain>

Hi,

Please see bugs #207697, #246933, #435906 and #435945. Also #456476 is
the same bug but for FUSE. So you might need to write your own script to
solve the problem. You can kind of use _netdev, but it does have
limitations and there is currently no way to solve the shutdown problem
for filesystems which were not mounted by the initscripts,

Steve.

On Mon, 2008-10-06 at 11:00 +0200, Edoardo Causarano wrote:
> Hi all,
> 
> can anyone help me on this issue I mentioned in my pevious email?
> 
> e
> 
> 
> On mer, 2008-10-01 at 19:18 +0200, Edoardo Causarano wrote:
> > Hi all,
> > 
> >  
> > 
> > further investigation shows that my gfs stalling on reboot is due to
> > incorrect specification of the filesystem in /ec/fstab. I mount is as
> > _netdev so /etc/init.d/gfs won?t pick it up.
> > 
> >  
> > 
> > What is the correct syntax to make sure /etc/init.d/gfs will pick up
> > the fs at the right time during shutdown (before tearing down
> > scsi_reservation and clustering)?
> > 
> >  
> > 
> > IE? can I peek at your fstabs? ;)
> > 
> >  
> > 
> > E
> > 
> > (excuse me for the outlook mail)
> > 
> > 
> > Documento in testo semplice attachment (ATT265888.txt)
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From tuckerd at engr.smu.edu  Mon Oct  6 15:21:42 2008
From: tuckerd at engr.smu.edu (Doug Tucker)
Date: Mon, 06 Oct 2008 10:21:42 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <48E7A8CA.1000502@bobich.net>
References: <48BF35C50600251B@> (added by
	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>
	<48E6567D.5020508@bobich.net>
	<1223063224.22918.102.camel@thor.seas.smu.edu>
	<48E7A8CA.1000502@bobich.net>
Message-ID: <1223306502.29679.12.camel@thor.seas.smu.edu>


> > 1)  Tru64 and TruCluster with Advfs from 7 years ago is simply that much
> > more robust and mature than RHES4 and CS/GFS and therefore tremendously
> > outperforms it...or
> 
> RHEL4 is quite old. It's been a while since I used it for clustering. 
> RHEL5 has yielded considerably better performance in my experience.

Interesting.  The only way I can see upgrading is:

1) using the upgrade options on the production node (scary, and
downtime) or
2) do you know if RHEL5 will participate in a RHEL4 cluster?  if so, I
could add an RHEL5 node, make it master, and then have the ability to
upgrade the other 2 one at a time.

> 
> > 2)  We have this badly configured.
> 
> There isn't all that much to tune on RHEL4 cluster-wise, most of the 
> tweakability has been added more recently than I've last used it. I'd 
> say RHEL5 is certainly worth trying. The problem you are having may just 
> go away.

Bummer!

> 
> Gordan
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From michael.osullivan at auckland.ac.nz  Mon Oct  6 15:28:22 2008
From: michael.osullivan at auckland.ac.nz (michael.osullivan at auckland.ac.nz)
Date: Tue, 7 Oct 2008 04:28:22 +1300 (NZDT)
Subject: [Linux-cluster] Can't create LV in 2-node cluster
Message-ID: <1132.128.187.153.180.1223306902.squirrel@mail.esc.auckland.ac.nz>

Hi everyone,

I have created a 2-node cluster and (after a little difficulty) created a
clustered volume group visible on both nodes. However, I can't create a
logical volume on either node. I get the following error:

Error locking on node <other node>: Error backing up metadata, can't find
VG for group #global
Aborting. Failed to activate new LV to wipe the start of it.

I have locking_type = 3 and locking_library = liblvm2clusterlock.so. The
metadata areas and sequence nos are the same using vgdisplay on both
nodes.

Can anyone help me create LVs? I am happy to provide any extra info needed.

Thanks, Mike


From jeff.sturm at eprize.com  Mon Oct  6 16:01:32 2008
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Mon, 6 Oct 2008 12:01:32 -0400
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223306502.29679.12.camel@thor.seas.smu.edu>
References: <48BF35C50600251B@> (added
	bypostmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu><48E6567D.5020508@bobich.net><1223063224.22918.102.camel@thor.seas.smu.edu><48E7A8CA.1000502@bobich.net>
	<1223306502.29679.12.camel@thor.seas.smu.edu>
Message-ID: <64D0546C5EBBD147B75DE133D798665F01806522@hugo.eprize.local>

> 2) do you know if RHEL5 will participate in a RHEL4 cluster?  

Not with incompatible lock modules (DLM vs. GULM).  Sorry.  I don't
beleve there's a way to upgrade while the cluster is online.


From macscr at macscr.com  Mon Oct  6 16:03:57 2008
From: macscr at macscr.com (Mark Chaney)
Date: Mon, 6 Oct 2008 11:03:57 -0500
Subject: [Linux-cluster] Can't create LV in 2-node cluster
In-Reply-To: <1132.128.187.153.180.1223306902.squirrel@mail.esc.auckland.ac.nz>
References: <1132.128.187.153.180.1223306902.squirrel@mail.esc.auckland.ac.nz>
Message-ID: <04cb01c927cd$24b23ed0$6e16bc70$@com>

What are you using for shared storage? Also, 2 node clusters are highly
discouraged from my experience and recommendations of others.

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of
michael.osullivan at auckland.ac.nz
Sent: Monday, October 06, 2008 10:28 AM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Can't create LV in 2-node cluster

Hi everyone,

I have created a 2-node cluster and (after a little difficulty) created a
clustered volume group visible on both nodes. However, I can't create a
logical volume on either node. I get the following error:

Error locking on node <other node>: Error backing up metadata, can't find
VG for group #global
Aborting. Failed to activate new LV to wipe the start of it.

I have locking_type = 3 and locking_library = liblvm2clusterlock.so. The
metadata areas and sequence nos are the same using vgdisplay on both
nodes.

Can anyone help me create LVs? I am happy to provide any extra info needed.

Thanks, Mike


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


From mulach at libero.it  Mon Oct  6 17:04:23 2008
From: mulach at libero.it (mulach)
Date: Mon, 6 Oct 2008 19:04:23 +0200
Subject: R: [Linux-cluster] Cluster Centos - Don't switch resource
In-Reply-To: <200809221730.45084.xavier.montagutelli@unilim.fr>
References: <K7L4MK$1975E3C7C3A907E3606A6C891AD3106D@libero.it>
	<200809221730.45084.xavier.montagutelli@unilim.fr>
Message-ID: <44C2C1BCF1DA4FBBAE543EFE9F4CB6AD@invitto>

It works fine, but i have a doubt: I must install heartbeat? If not in wich
case i do it?

Tnks

-----Messaggio originale-----
Da: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] Per conto di Xavier Montagutelli
Inviato: luned? 22 settembre 2008 17.31
A: linux clustering
Oggetto: Re: [Linux-cluster] Cluster Centos - Don't switch resource

On Monday 22 September 2008 08:56, mulach at libero.it wrote:
> Hi,
>
> in Centos 5.2 i had create a cluster with Conga(two node has been in
vmware
> server). The problem is that when a node fail, don't switch the service.
> Below the cluster.conf
>
> --------------------------------
> <?xml version="1.0"?>
> <cluster alias="SecondCluster" config_version="23" name="SecondCluster">
> 	<fence_daemon clean_start="0" post_fail_delay="0"
post_join_delay="3"/>
> 	<clusternodes>
> 		<clusternode name="clu1.localdomain" nodeid="1" votes="1">
> 			<fence>
> 				<method name="1"/>
> 			</fence>
> 		</clusternode>
> 		<clusternode name="clu2.localdomain" nodeid="2" votes="1">
> 			<fence>
> 				<method name="1"/>
> 			</fence>
> 		</clusternode>
> 	</clusternodes>

If I understand correctly your cluster.conf file, you don't have fencing 
devices for your nodes. In my tests, I **had** to define fencing methods for

the service to switch when one node fails. Otherwise, it doesn't work !

You can try configuring a "fence_manual" method (just for testing). After
clu1 
failure, clu2 should "fence" it. You then have to use 
the "fence_ack_manual -n clu1.localdomain" command to confirm manually that 
the fencing is done.

http://sources.redhat.com/cluster/wiki/FAQ/Fencing#fence_manual2


In my cluster.conf, it looks like :


 	<clusternode name="clu1.localdomain" nodeid="1" votes="1">
 		<fence>
 		  <method name="1"/>
                     <device name="manual_1" nodename="clu1.localdomain"/>
 		</fence>
	</clusternode>
	(same for clu2)

        <fencedevices>
                <fencedevice agent="fence_manual" name="manual_2"/>
                <fencedevice agent="fence_manual" name="manual_1"/>
        </fencedevices>

Does it solve your pb ?


> 	<cman expected_votes="1" two_node="1"/>
> 	<fencedevices/>
> 	<rm>
> 		<failoverdomains>
> 			<failoverdomain name="fail" ordered="1"
restricted="1">
> 				<failoverdomainnode name="clu1.localdomain"
priority="1"/>
> 				<failoverdomainnode name="clu2.localdomain"
priority="2"/>
> 			</failoverdomain>
> 		</failoverdomains>
> 		<resources>
> 			<ip address="192.168.80.201" monitor_link="1"/>
> 			<fs device="/dev/sdb1" force_fsck="0"
force_unmount="0" fsid="45662"
> fstype="ext3" mountpoint="/mnt/sdc" name="Share" options=""
> self_fence="0"/> </resources>
> 		<service autostart="1" domain="fail" exclusive="1"
name="IPS"
> recovery="restart"> <ip ref="192.168.80.201"/>
> 			<fs ref="Share"/>
> 		</service>
> 	</rm>
> 	<fence_xvmd/>
> </cluster>
>
> --------------------------------
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-- 
Xavier Montagutelli                      Tel : +33 (0)5 55 45 77 20
Service Commun Informatique              Fax : +33 (0)5 55 45 75 95
Universite de Limoges
123, avenue Albert Thomas
87060 Limoges cedex

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
No virus found in this incoming message.
Checked by AVG - http://www.avg.com 
Version: 8.0.169 / Virus Database: 270.7.0/1683 - Release Date: 21/09/2008
10.10


From gordan at bobich.net  Mon Oct  6 17:45:00 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 06 Oct 2008 18:45:00 +0100
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223306502.29679.12.camel@thor.seas.smu.edu>
References: <48BF35C50600251B@> (added
	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>	<48E6567D.5020508@bobich.net>	<1223063224.22918.102.camel@thor.seas.smu.edu>	<48E7A8CA.1000502@bobich.net>
	<1223306502.29679.12.camel@thor.seas.smu.edu>
Message-ID: <48EA4E9C.7070901@bobich.net>

Doug Tucker wrote:
>>> 1)  Tru64 and TruCluster with Advfs from 7 years ago is simply that much
>>> more robust and mature than RHES4 and CS/GFS and therefore tremendously
>>> outperforms it...or
>> RHEL4 is quite old. It's been a while since I used it for clustering. 
>> RHEL5 has yielded considerably better performance in my experience.
> 
> Interesting.  The only way I can see upgrading is:
> 
> 1) using the upgrade options on the production node (scary, and
> downtime) or

"Upgrade" options never really worked on any OS. I wouldn't bank on it 
"just working", especially on something as complex as clustering.

> 2) do you know if RHEL5 will participate in a RHEL4 cluster?  if so, I
> could add an RHEL5 node, make it master, and then have the ability to
> upgrade the other 2 one at a time.

No, you can't mix versions. Even different package versions between the 
same distro version can cause problems.

>>> 2)  We have this badly configured.
>> There isn't all that much to tune on RHEL4 cluster-wise, most of the 
>> tweakability has been added more recently than I've last used it. I'd 
>> say RHEL5 is certainly worth trying. The problem you are having may just 
>> go away.
> 
> Bummer!

Worse, you may need to change the GFS file system options for the new 
version, so you may end up having to backup/restore the data. I don't 
think you can avoid cluster downtime for the upgrade.

Gordan


From tuckerd at engr.smu.edu  Mon Oct  6 18:34:23 2008
From: tuckerd at engr.smu.edu (Doug Tucker)
Date: Mon, 06 Oct 2008 13:34:23 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <48EA4E9C.7070901@bobich.net>
References: <48BF35C50600251B@> (added
	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>
	<48E6567D.5020508@bobich.net>
	<1223063224.22918.102.camel@thor.seas.smu.edu>
	<48E7A8CA.1000502@bobich.net>
	<1223306502.29679.12.camel@thor.seas.smu.edu>
	<48EA4E9C.7070901@bobich.net>
Message-ID: <1223318063.29679.20.camel@thor.seas.smu.edu>

O
> Worse, you may need to change the GFS file system options for the new 
> version, so you may end up having to backup/restore the data. I don't 
> think you can avoid cluster downtime for the upgrade.

Then upgrading is not an option.

> 
> Gordan
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From fdinitto at redhat.com  Mon Oct  6 18:54:57 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Mon, 6 Oct 2008 20:54:57 +0200 (CEST)
Subject: [Linux-cluster] can't compile cluster-2.03.07
In-Reply-To: <1223040639.10584.1.camel@dima-desktop>
References: <1223040639.10584.1.camel@dima-desktop>
Message-ID: <Pine.LNX.4.64.0810062053220.7453@trider-g7>

On Fri, 3 Oct 2008, ??????? ??????? wrote:

> I try compile cluster-2.03.07 with kernel 2.6.26-5
> how i can fix that ?
> every time report
> make[2]: Entering directory `/usr/src/kernels/linux-2.6.26.5'
>
>  WARNING: Symbol version
> dump /usr/src/kernels/linux-2.6.26.5/Module.symvers
>           is missing; modules will have no dependencies and
> modversions.
>
>  Building modules, stage 2.
>  MODPOST 1 modules
> /bin/sh: scripts/mod/modpost: No such file or directory
> make[3]: *** [__modpost] Error 127
> make[2]: *** [modules] Error 2
> make[2]: Leaving directory `/usr/src/kernels/linux-2.6.26.5'
> make[1]: *** [gnbd.ko] Error 2
> make[1]: Leaving directory `/root/gfs/cluster/gnbd-kernel/src'
> make: *** [gnbd-kernel/src] Error 2

These messages are spawn for different reasons. You either need to:

- install the kernel headers for the running kernel (depending on the 
distribution it might change)
- if you are using a custom kernel, you need to configure it and prepare 
the tree.
- configure --kernel_src and --kernel_build to point to the right 
locations.

Fabio

--
I'm going to make him an offer he can't refuse.

From fdinitto at redhat.com  Mon Oct  6 18:55:45 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Mon, 6 Oct 2008 20:55:45 +0200 (CEST)
Subject: [Linux-cluster] where i can find define of "volume_id_get_type"
In-Reply-To: <1223277729.10374.2.camel@dima-desktop>
References: <1223277729.10374.2.camel@dima-desktop>
Message-ID: <Pine.LNX.4.64.0810062055221.7453@trider-g7>

On Mon, 6 Oct 2008, ??????? ??????? wrote:

> i try compile last version of cluster package
> have error " undefined reference to `volume_id_get_type'"
> where i can find define of this function ?

libvolume_id

Fabio

--
I'm going to make him an offer he can't refuse.

From gordan at bobich.net  Mon Oct  6 19:10:58 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 06 Oct 2008 20:10:58 +0100
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223318063.29679.20.camel@thor.seas.smu.edu>
References: <48BF35C50600251B@>
	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>	<48E6567D.5020508@bobich.net>	<1223063224.22918.102.camel@thor.seas.smu.edu>	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>	<48EA4E9C.7070901@bobich.net>
	<1223318063.29679.20.camel@thor.seas.smu.edu>
Message-ID: <48EA62C2.8070002@bobich.net>

Doug Tucker wrote:
> O
>> Worse, you may need to change the GFS file system options for the new 
>> version, so you may end up having to backup/restore the data. I don't 
>> think you can avoid cluster downtime for the upgrade.
> 
> Then upgrading is not an option.

And reverting back to the Tru64/Alpha system is?

Gordan


From tuckerd at engr.smu.edu  Mon Oct  6 20:05:13 2008
From: tuckerd at engr.smu.edu (Doug Tucker)
Date: Mon, 06 Oct 2008 15:05:13 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <48EA62C2.8070002@bobich.net>
References: <48BF35C50600251B@>
	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>
	<48E6567D.5020508@bobich.net>
	<1223063224.22918.102.camel@thor.seas.smu.edu>
	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>
	<48EA4E9C.7070901@bobich.net>
	<1223318063.29679.20.camel@thor.seas.smu.edu>
	<48EA62C2.8070002@bobich.net>
Message-ID: <1223323513.29679.31.camel@thor.seas.smu.edu>


> 
> And reverting back to the Tru64/Alpha system is?

Nope, completely out of drive space on that one.  I'm basically stuck
where I'm at with it underperforming.

> Gordan
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From gordan at bobich.net  Mon Oct  6 20:44:02 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 06 Oct 2008 21:44:02 +0100
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223323513.29679.31.camel@thor.seas.smu.edu>
References: <48BF35C50600251B@>	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>	<48E6567D.5020508@bobich.net>	<1223063224.22918.102.camel@thor.seas.smu.edu>	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>	<48EA4E9C.7070901@bobich.net>	<1223318063.29679.20.camel@thor.seas.smu.edu>	<48EA62C2.8070002@bobich.net>
	<1223323513.29679.31.camel@thor.seas.smu.edu>
Message-ID: <48EA7892.5030902@bobich.net>

Doug Tucker wrote:
>> And reverting back to the Tru64/Alpha system is?
> 
> Nope, completely out of drive space on that one.  I'm basically stuck
> where I'm at with it underperforming.

I don't mean to teach you to suck eggs, so please don't take this as 
patronizing, 'cause that's not my intention in any way shape or form, 
but since 1TB SATA disks go for around $160, could you not just plug a 
couple of those in as scratch space for the migration?

Gordan


From Dave.Jones at maritz.com  Mon Oct  6 21:10:21 2008
From: Dave.Jones at maritz.com (Jones, Dave)
Date: Mon, 6 Oct 2008 16:10:21 -0500
Subject: [Linux-cluster] Recommended 120v & 240v power switch
In-Reply-To: <48EA7892.5030902@bobich.net>
References: <48BF35C50600251B@>	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>	<48E6567D.5020508@bobich.net>	<1223063224.22918.102.camel@thor.seas.smu.edu>	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>	<48EA4E9C.7070901@bobich.net>	<1223318063.29679.20.camel@thor.seas.smu.edu>	<48EA62C2.8070002@bobich.net><1223323513.29679.31.camel@thor.seas.smu.edu>
	<48EA7892.5030902@bobich.net>
Message-ID: <DDF0CD44B9EED841B556D44B7929813A05A87924@FENEXCH1602C.us.maritz.net>


Hello all.

Does anyone have a recommended power switch that works well for power
fencing and supports 120v and 240v?

Thanks,
Dave

Confidentiality Warning:  This e-mail contains information intended only for the use of the individual or entity named above.  If the reader of this e-mail is not the intended recipient or the employee or agent responsible for delivering it to the intended recipient, any dissemination, publication or copying of this e-mail is strictly prohibited.  The sender does not accept any responsibility for any loss, disruption or damage to your data or computer system that may occur while using data contained in, or transmitted with, this e-mail.  
If you have received this e-mail in error, please immediately notify us by return e-mail.  Thank you.


From Dave.Jones at maritz.com  Mon Oct  6 21:25:35 2008
From: Dave.Jones at maritz.com (Jones, Dave)
Date: Mon, 6 Oct 2008 16:25:35 -0500
Subject: [Linux-cluster] Recommended 120v & 240v power switch
In-Reply-To: <DDF0CD44B9EED841B556D44B7929813A05A87924@FENEXCH1602C.us.maritz.net>
References: <48BF35C50600251B@>	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>	<48E6567D.5020508@bobich.net>	<1223063224.22918.102.camel@thor.seas.smu.edu>	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>	<48EA4E9C.7070901@bobich.net>	<1223318063.29679.20.camel@thor.seas.smu.edu>	<48EA62C2.8070002@bobich.net><1223323513.29679.31.camel@thor.seas.smu.edu><48EA7892.5030902@bobich.net>
	<DDF0CD44B9EED841B556D44B7929813A05A87924@FENEXCH1602C.us.maritz.net>
Message-ID: <DDF0CD44B9EED841B556D44B7929813A05A87928@FENEXCH1602C.us.maritz.net>


One more thing -

Barring a dual-voltage device, which 208v device does everyone prefer?

Thanks,
D
 

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jones, Dave
Sent: Monday, October 06, 2008 4:10 PM
To: linux clustering
Subject: [Linux-cluster] Recommended 120v & 240v power switch


Hello all.

Does anyone have a recommended power switch that works well for power
fencing and supports 120v and 240v?

Thanks,
Dave

Confidentiality Warning:  This e-mail contains information intended only for the use of the individual or entity named above.  If the reader of this e-mail is not the intended recipient or the employee or agent responsible for delivering it to the intended recipient, any dissemination, publication or copying of this e-mail is strictly prohibited.  The sender does not accept any responsibility for any loss, disruption or damage to your data or computer system that may occur while using data contained in, or transmitted with, this e-mail.  
If you have received this e-mail in error, please immediately notify us by return e-mail.  Thank you.


From jparsons at redhat.com  Mon Oct  6 21:37:50 2008
From: jparsons at redhat.com (jim parsons)
Date: Mon, 06 Oct 2008 17:37:50 -0400
Subject: [Linux-cluster] Recommended 120v & 240v power switch
In-Reply-To: <DDF0CD44B9EED841B556D44B7929813A05A87928@FENEXCH1602C.us.maritz.net>
References: <48BF35C50600251B@>
	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>
	<48E6567D.5020508@bobich.net>
	<1223063224.22918.102.camel@thor.seas.smu.edu>
	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>
	<48EA4E9C.7070901@bobich.net>	<1223318063.29679.20.camel@thor.seas.smu.edu>
	<48EA62C2.8070002@bobich.net><1223323513.29679.31.camel@thor.seas.smu.edu>
	<48EA7892.5030902@bobich.net>
	<DDF0CD44B9EED841B556D44B7929813A05A87924@FENEXCH1602C.us.maritz.net>
	<DDF0CD44B9EED841B556D44B7929813A05A87928@FENEXCH1602C.us.maritz.net>
Message-ID: <1223329070.3266.10.camel@localhost.localdomain>

On Mon, 2008-10-06 at 16:25 -0500, Jones, Dave wrote:
> One more thing -
> 
> Barring a dual-voltage device, which 208v device does everyone prefer?
> 
> Thanks,
> D
>  
Take a look at apc AP7911, AP7921, or AP7940, depending on current needs
and number of outlets.

WTI makes nice switches, too. I do not know their product numbers off
hand though.

-j


From tuckerd at engr.smu.edu  Mon Oct  6 21:41:20 2008
From: tuckerd at engr.smu.edu (Doug Tucker)
Date: Mon, 06 Oct 2008 16:41:20 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <48EA7892.5030902@bobich.net>
References: <48BF35C50600251B@>
	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>
	<48E6567D.5020508@bobich.net>
	<1223063224.22918.102.camel@thor.seas.smu.edu>
	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>
	<48EA4E9C.7070901@bobich.net>	<1223318063.29679.20.camel@thor.seas.smu.edu>
	<48EA62C2.8070002@bobich.net>
	<1223323513.29679.31.camel@thor.seas.smu.edu>
	<48EA7892.5030902@bobich.net>
Message-ID: <1223329280.29679.50.camel@thor.seas.smu.edu>


> 
> I don't mean to teach you to suck eggs, so please don't take this as 
> patronizing, 'cause that's not my intention in any way shape or form, 
> but since 1TB SATA disks go for around $160, could you not just plug a 
> couple of those in as scratch space for the migration?
I can't see a way around some significant downtime even with that, and
there is no way they will give me the option to be down from a planned
perspective.  

> 
> Gordan
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From orkcu at yahoo.com  Mon Oct  6 22:25:00 2008
From: orkcu at yahoo.com (Roger Pena Escobio)
Date: Mon, 6 Oct 2008 15:25:00 -0700 (PDT)
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223323513.29679.31.camel@thor.seas.smu.edu>
Message-ID: <690445.74676.qm@web88308.mail.re4.yahoo.com>


--- On Mon, 10/6/08, Doug Tucker <tuckerd at engr.smu.edu> wrote:

> From: Doug Tucker <tuckerd at engr.smu.edu>
> Subject: Re: [Linux-cluster] rhcs + gfs performance issues
> To: "linux clustering" <linux-cluster at redhat.com>
> Received: Monday, October 6, 2008, 4:05 PM
> > 
> > And reverting back to the Tru64/Alpha system is?
> 
> Nope, completely out of drive space on that one.  I'm
> basically stuck
> where I'm at with it underperforming.

did you check if the standalone NFS server also under perform?

just to rule out the cluster layer in the "under perform" equation

if it is not, then a pasive-active configuration for the cluster could give you what you want, but still using GFS filesystem so you can mount it simultaneous on the demand (for backup).

thanks
roger


From gordan at bobich.net  Mon Oct  6 22:33:53 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 06 Oct 2008 23:33:53 +0100
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <1223329280.29679.50.camel@thor.seas.smu.edu>
References: <48BF35C50600251B@>	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>	<48E6567D.5020508@bobich.net>	<1223063224.22918.102.camel@thor.seas.smu.edu>	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>	<48EA4E9C.7070901@bobich.net>	<1223318063.29679.20.camel@thor.seas.smu.edu>	<48EA62C2.8070002@bobich.net>	<1223323513.29679.31.camel@thor.seas.smu.edu>	<48EA7892.5030902@bobich.net>
	<1223329280.29679.50.camel@thor.seas.smu.edu>
Message-ID: <48EA9251.40606@bobich.net>

Doug Tucker wrote:
>> I don't mean to teach you to suck eggs, so please don't take this as 
>> patronizing, 'cause that's not my intention in any way shape or form, 
>> but since 1TB SATA disks go for around $160, could you not just plug a 
>> couple of those in as scratch space for the migration?
 >
> I can't see a way around some significant downtime even with that, and
> there is no way they will give me the option to be down from a planned
> perspective.  

So, out of nowhere straight into production, without performance user 
acceptance testing period? And they won't allow any planned downtime? My 
mind boggles.

Good luck.

Gordan


From linux-cluster at merctech.com  Mon Oct  6 23:59:01 2008
From: linux-cluster at merctech.com (linux-cluster at merctech.com)
Date: Mon, 06 Oct 2008 19:59:01 -0400
Subject: [Linux-cluster] help wanted configuring services
Message-ID: <602.1223337541@mirchi>


Is it possible to set up a hierarchy of services, in the same way that a 
service is made up of individual resources?

I'm trying to set up a clustered web server using RHCS 5 on CentOS 5.2, with 
the latest versions from the "upstream provider's" production release.

The cluster provides several web applications, each of which has it's own
resources that Apache doesn't need to know about directly. Each application
relies on Apache as a presentation layer. I don't want to make the individual
applications' dependencies separate resources within a single "Apache" service,
because that makes management difficult.

For example, is it possible to create a structure like:

	Service:	Apache
		Private Resource:	IP address
		Private Resource:	GFS Vol1

		Shared Service:	Wiki
		Shared Service:	SVN
		Shared Service:	Calendar


	Service:	Wiki
		Private Resource:	GFS Vol2
		Private Resource:	MySQL "wiki" instance
	
	Service:	SVN
		Private Resource:	MySQL "svn" instance
		Private Resource:	GFS Vol3
	
	Service:	Calendar
			(no resources beyond apache)
		

Note that resources like "GFS Vol2" and "MySQL wiki instance" are assigned
to the "Wiki" service, not directly to the "Apache" service. The Apache
service sees "Wiki" as a resource.

With this kind of structure, I could administratively disable the "Wiki" service
without causing the other web applications to restart or relocate. However, if a
resource that the Wiki requires fails on the active web server (for example, GFS
Vol 2), the the standard failover policy would apply...the Wiki service would
restart and if that is unsuccessful, then the Apache service (and it's
dependencies--Wiki, SVN, Calendar, IP address, GFS Vol1) would relocate.

Is there any way to set up this kind of structure with RHCS?

Thanks,

Mark


-----
Mark Bergman    

http://wwwkeys.pgp.net:11371/pks/lookup?op=get&search=bergman%40merctech.com


From stuarta at squashedfrog.net  Tue Oct  7 08:46:26 2008
From: stuarta at squashedfrog.net (Stuart Auchterlonie)
Date: Tue, 07 Oct 2008 09:46:26 +0100
Subject: [Linux-cluster] Cisco working configuration
In-Reply-To: <20081001211109.GA11341@aaron>
References: <2000762384.122241222888753010.JavaMail.root@zmail02.collab.prod.int.phx2.redhat.com>	<620016859.122671222888921460.JavaMail.root@zmail02.collab.prod.int.phx2.redhat.com>
	<20081001211109.GA11341@aaron>
Message-ID: <48EB21E2.70803@squashedfrog.net>

Jakub Suchy wrote:
> Leo Pleiman wrote:
>> The kbase article can be found at http://kbase.redhat.com/faq/FAQ_51_11755.shtm
>> It has a link to Cisco's web site enumerating 5 possible solutions. http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a008059a9df.shtml
> 
> Hello,
> I am aware of these documents and I have tried all these solutions.
> 


We had to turn on 'ip igmp snooping querier' as documented in the
link on the cisco website above, and it worked okay..


Stuart


From david at craigon.co.uk  Tue Oct  7 10:34:07 2008
From: david at craigon.co.uk (David)
Date: Tue, 07 Oct 2008 11:34:07 +0100
Subject: [Linux-cluster] My patch
Message-ID: <48EB3B1F.1010700@craigon.co.uk>

Did this patch ever get merged in?

https://www.redhat.com/archives/linux-cluster/2008-August/msg00026.html


From s.wendy.cheng at gmail.com  Tue Oct  7 13:57:20 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Tue, 07 Oct 2008 08:57:20 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <690445.74676.qm@web88308.mail.re4.yahoo.com>
References: <690445.74676.qm@web88308.mail.re4.yahoo.com>
Message-ID: <48EB6AC0.6070100@gmail.com>

Hopefully the following provide some relieves ...
 
1. Enable lock trimming tunable. It is particularly relevant if NFS-GFS 
is used by development type of workloads (editing, compiling, build, 
etc) and/or after filesystem backup. Unlike fast statfs, this tunable is 
per-node base (you don't need to have the same value on each of the 
nodes and a mix of on-off within the same cluster is ok). Make the 
trimming very aggressive on backup node (> 50% where you run backup) and 
moderate on your active node (< 50%). Try to experiment with different 
values to fit the workload. Googling "gfs lock trimming wcheng" to pick 
up the technical background if needed.

shell> gfs_tool settune <mount_point> glock_purge <percentage>
         (e.g. gfs_tool settune /mnt/gfs1 glock_purge 50)

2. Turn on readahead tunable. It is effective for large file (stream IO) 
read performance. As I recalled, there was a cluster (with IPTV 
application) used val=256 for 400-500M files. Another one with 2G file 
size used val=2048. Again, it is per-node base so different values are 
ok on different nodes.

shell> gfs_tool settune <mount> seq_readahead <val>
         (e.g. gfs_tool settune /mnt/gfs1 seq_readahead 2048)

3. Fast statfs tunable - you have this on already ? Make sure they need 
to be same across cluster nodes.

4. Understand the risks and performance implications of NFS server's 
"async" vs. "sync" options. Linux NFS server "sync" options are 
controlled by two different mechanisms - mount and export. By default, 
mount is "aysnc" and export is "sync". Even with specific "async" mount 
request, Linux server uses "sync" export as default that is particularly 
troublesome for gfs. I don't plan to give an example and/or suggest the 
exact export option here - hopefully this will force folks to do more 
researches to fully understand the ramifications between performance and 
data liability. Most of the proprietary NFS servers in the market today 
utilize hardware features to relieve this performance and data integrity 
conflicts. Mainline linux servers (and RHEL) are totally software-base 
so it generally has problem in this regard.

Gfs1 in general doesn't do well in "sync" performance (journal layer is 
too bulky). Gfs2 has potentials to do better (but I'm not sure).

There are also few other things that worth mentioned but my flight is 
call for boarding .. I'll stop here .......

-- Wendy


From gniagnia at gmail.com  Tue Oct  7 14:24:38 2008
From: gniagnia at gmail.com (gnia gnia)
Date: Tue, 7 Oct 2008 16:24:38 +0200
Subject: [Linux-cluster] how to configure qdisk in a two nodes cluster with
	mirrored LVM
Message-ID: <f8eac96f0810070724h1331a53ftefca5e4190c0c25@mail.gmail.com>

Hello all,

Situation:
We have a 2 nodes cluster (we don't use GFS). Only one node has an
active service.
The other node is only here in case the first node crashs (application
automatically restarts on the healthy node).

This service has a file system resource that is a mirrored LV across 2
storage bays (HSV210 - HP EVA8000 : let's call them SAN1 and SAN2).
We also have a quorum disk that is declared on SAN1.
Today, we tested the cluster behaviour in case of a SAN2 outage (We
did so by deactivating zoning between nodes and SAN2 controllers).
Immediatly, ths IO/s on the mirrored LV are stopped. After 2 or 3
minutes, the mirrored LV becomes linear and I/Os resume on the
available storage bay :

Sep 25 12:02:01 redhat lvm[15525]: Mirror device, 253:7, has failed.
Sep 25 12:02:01 redhat lvm[15525]: Device failure in vghpdriver-lvhpdriver
Sep 25 12:02:22 redhat lvm[15525]: WARNING: Bad device removed from
mirror volume, vghpdriver/lvhpdriver
Sep 25 12:02:22 redhat kernel: end_request: I/O error, dev sda, sector 5559920
Sep 25 12:02:22 redhat lvm[15525]: WARNING: Mirror volume,
vghpdriver/lvhpdriver converted to linear due to device failure.

This is pretty much what we hoped for.
But, when we do the same test on SAN1 (the one with qdisk), the
cluster instantly becomes inquorate and stops working.
Here is our qdisk configuration :

  <quorumd interval="3" tko="14" votes="2"
device="/dev/vgquorum/lvquorum" label="quorum_hpdr" log_level="7">
    <heuristic program="/etc/cluster/euristic.sh" score="1" interval="5"/>
  </quorumd>

Here is the ouput of 'cman_tool status'

# cman_tool status
Protocol version: 5.0.1
Config version: 2
Cluster name: clu_HPDRIVER
Cluster ID: 23324
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 2
Total_votes: 3
Quorum: 2
Active subsystems: 5
Node name: redhat.test.com
Node ID: 2
Node addresses: 192.168.1.6

I tried to replace votes="2" by votes="1" in
/etc/cluster/cluster.conf... which solved the problem.
But is it safe to do this?

Thanks


From tuckerd at engr.smu.edu  Tue Oct  7 15:00:17 2008
From: tuckerd at engr.smu.edu (Doug Tucker)
Date: Tue, 07 Oct 2008 10:00:17 -0500
Subject: [Linux-cluster] rhcs + gfs performance issues
In-Reply-To: <48EA9251.40606@bobich.net>
References: <48BF35C50600251B@>
	(added	by	postmaster@mail.o2.co.uk)	<1223053011.22918.62.camel@thor.seas.smu.edu>
	<48E6567D.5020508@bobich.net>
	<1223063224.22918.102.camel@thor.seas.smu.edu>
	<48E7A8CA.1000502@bobich.net>	<1223306502.29679.12.camel@thor.seas.smu.edu>
	<48EA4E9C.7070901@bobich.net>	<1223318063.29679.20.camel@thor.seas.smu.edu>
	<48EA62C2.8070002@bobich.net>	<1223323513.29679.31.camel@thor.seas.smu.edu>
	<48EA7892.5030902@bobich.net>
	<1223329280.29679.50.camel@thor.seas.smu.edu>
	<48EA9251.40606@bobich.net>
Message-ID: <1223391617.25152.40.camel@thor.seas.smu.edu>

>  >
> > I can't see a way around some significant downtime even with that, and
> > there is no way they will give me the option to be down from a planned
> > perspective.  
> 
> So, out of nowhere straight into production, without performance user 
> acceptance testing period? And they won't allow any planned downtime? My 
> mind boggles.

Yours too huh?  This is the strangest place I have ever worked quite
frankly.  I've never been anywhere I could not set aside a 2 hour window
at 3 am once a month for upgrades/maintenance.  They don't allow me that
here.  Migrating from old file server to new one, was done with zero
downtime and no interruption to the user community.  Due to $$$, very
little redundancy.  Sure, downtime does happen, but only when something
breaks.  I could go on, I think you get the picture and whining about it
doesn't help me here.

Straight into production...well, not exactly.  I set up a cluster, and
moved one application over and it ran for about 3 months before we began
the user moves.  Once the users and mail were moved, that's when the
load issues reared it's ugly head.  Like I said, was really bad at
first.  Had to bump the nfs processes to 256..that helped some..setting
the fs to fast = 1, had a much bigger impact.  The odd thing is, it
doesn't seem to take much to drive up the load.  Being an engineering
school, we have a lot of cadence users, and cadence writes 2-5k files on
a big job, and it doesn't take more than 2 or 3 users doing this, along
with the normal stuff always touching the fileserver (such as mail, web,
etc) to drive up load.  I can virus scan my mapped home directory and
watch load jump by 2 or 3.  Mounting my old home directory on the old
file server and doing the same thing, you wouldn't even know I was
touching files out there.  It's like directory/file access is just very
expensive for some reason and it goes against everything I know :P.

Let me run this by you.  I thought about another potential upgrade path.
What if I remove one node from the cluster and run on one node, take the
2nd down, install 5, get it prepped.  Is there anyway in the world to
somehow bring it up and have it mount the volumes AS master and take the
current primary down to then rebuild it?  I think my answer is no, but
thought it worth asking.  This inability to cross version participate
seems to really be my achilles heal here in getting it upgraded.

> 
> Good luck.
> 
> Gordan
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From shawnlhood at gmail.com  Tue Oct  7 17:33:45 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Tue, 7 Oct 2008 13:33:45 -0400
Subject: [Linux-cluster] GFS hanging on 3 node RHEL4 cluster
Message-ID: <cfe2fc960810071033x5abf179fw7f19fa5f56a4f8ed@mail.gmail.com>

Problem:
It seems that IO on one machine in the cluster (not always the same
machine) will hang and all processes accessing clustered LVs will
block.  Other machines will follow suit shortly thereafter until the
machine that first exhibited the problem is rebooted (via fence_drac
manually).  No messages in dmesg, syslog, etc.  Filesystems recently
fsckd.

Hardware:
Dell 1950s (similar except memory -- 3x 16GB RAM, 1x 8GB RAM).
Running RHEL4 ES U7.  Four machines
Onboard gigabit NICs (Machines use little bandwidth, and all network
traffic including DLM share NICs)
QLogic 2462 PCI-Express dual channel FC HBAs
QLogic SANBox 5200 FC switch
Apple XRAID which presents as two LUNs (~4.5TB raw aggregate)
Cisco Catalyst switch

Simple four machine RHEL4 U7 cluster running kernel 2.6.9-78.0.1.ELsmp
x86_64 with the following packages:
ccs-1.0.12-1
cman-1.0.24-1
cman-kernel-smp-2.6.9-55.13.el4_7.1
cman-kernheaders-2.6.9-55.13.el4_7.1
dlm-kernel-smp-2.6.9-54.11.el4_7.1
dlm-kernheaders-2.6.9-54.11.el4_7.1
fence-1.32.63-1.el4_7.1
GFS-6.1.18-1
GFS-kernel-smp-2.6.9-80.9.el4_7.1

One clustered VG.  Striped across two physical volumes, which
correspond to each side of an Apple XRAID.
Clustered volume group info:
  --- Volume group ---
  VG Name               hq-san
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  50
  VG Access             read/write
  VG Status             resizable
  Clustered             yes
  Shared                no
  MAX LV                0
  Cur LV                3
  Open LV               3
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               4.55 TB
  PE Size               4.00 MB
  Total PE              1192334
  Alloc PE / Size       905216 / 3.45 TB
  Free  PE / Size       287118 / 1.10 TB
  VG UUID               hfeIhf-fzEq-clCf-b26M-cMy3-pphm-B6wmLv

Logical volumes contained with hq-san VG:
  cam_development   hq-san                          -wi-ao 500.00G
  qa            hq-san                          -wi-ao   1.07T
  svn_users         hq-san                          -wi-ao   1.89T

All four machines mount svn_users, two machines mount qa, and one
mounts cam_development.

/etc/cluster/cluster.conf:

<?xml version="1.0"?>
<cluster alias="tungsten" config_version="31" name="qualia">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="odin" votes="1">
                        <fence>
                                <method name="1">
                    <device modulename="" name="odin-drac"/>
                </method>
                        </fence>
                </clusternode>
                <clusternode name="hugin" votes="1">
                        <fence>
                                <method name="1">
                    <device modulename="" name="hugin-drac"/>
                </method>
                        </fence>
                </clusternode>
                <clusternode name="munin" votes="1">
                        <fence>
                                <method name="1">
                    <device modulename="" name="munin-drac"/>
                </method>
                        </fence>
                </clusternode>
                <clusternode name="zeus" votes="1">
                        <fence>
                                <method name="1">
                    <device modulename="" name="zeus-drac"/>
                </method>
                        </fence>
                </clusternode>
    </clusternodes>
        <cman expected_votes="1" two_node="0"/>
        <fencedevices>
                <resources/>
                <fencedevice name="odin-drac" agent="fence_drac"
ipaddr="redacted" login="root" passwd="redacted"/>
                <fencedevice name="hugin-drac" agent="fence_drac"
ipaddr="redacted" login="root" passwd="redacted"/>
                <fencedevice name="munin-drac" agent="fence_drac"
ipaddr="redacted" login="root" passwd="redacted"/>
                <fencedevice name="zeus-drac" agent="fence_drac"
ipaddr="redacted" login="root" passwd="redacted"/>
        </fencedevices>
        <rm>
        <failoverdomains/>
        <resources/>
    </rm>
</cluster>


--
Shawn Hood
910.670.1819 m


From shawnlhood at gmail.com  Tue Oct  7 17:40:51 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Tue, 7 Oct 2008 13:40:51 -0400
Subject: [Linux-cluster] Re: GFS hanging on 3 node RHEL4 cluster
In-Reply-To: <cfe2fc960810071033x5abf179fw7f19fa5f56a4f8ed@mail.gmail.com>
References: <cfe2fc960810071033x5abf179fw7f19fa5f56a4f8ed@mail.gmail.com>
Message-ID: <cfe2fc960810071040s7f0dd5e8qb60873f82e458d91@mail.gmail.com>

More info:

All filesystems mounted using noatime,nodiratime,noquota.

All filesystems report the same data from gfs_tool gettune:

limit1 = 100
ilimit1_tries = 3
ilimit1_min = 1
ilimit2 = 500
ilimit2_tries = 10
ilimit2_min = 3
demote_secs = 300
incore_log_blocks = 1024
jindex_refresh_secs = 60
depend_secs = 60
scand_secs = 5
recoverd_secs = 60
logd_secs = 1
quotad_secs = 5
inoded_secs = 15
glock_purge = 0
quota_simul_sync = 64
quota_warn_period = 10
atime_quantum = 3600
quota_quantum = 60
quota_scale = 1.0000   (1, 1)
quota_enforce = 0
quota_account = 0
new_files_jdata = 0
new_files_directio = 0
max_atomic_write = 4194304
max_readahead = 262144
lockdump_size = 131072
stall_secs = 600
complain_secs = 10
reclaim_limit = 5000
entries_per_readdir = 32
prefetch_secs = 10
statfs_slots = 64
max_mhc = 10000
greedy_default = 100
greedy_quantum = 25
greedy_max = 250
rgrp_try_threshold = 100
statfs_fast = 0
seq_readahead = 0


And data on the FS from gfs_tool counters:
                                  locks 2948
                             locks held 1352
                           freeze count 0
                          incore inodes 1347
                       metadata buffers 0
                        unlinked inodes 0
                              quota IDs 0
                     incore log buffers 0
                         log space used 0.05%
              meta header cache entries 0
                     glock dependencies 0
                 glocks on reclaim list 0
                              log wraps 2
                   outstanding LM calls 0
                  outstanding BIO calls 0
                       fh2dentry misses 0
                       glocks reclaimed 223287
                         glock nq calls 1812286
                         glock dq calls 1810926
                   glock prefetch calls 101158
                          lm_lock calls 198294
                        lm_unlock calls 142643
                           lm callbacks 341621
                     address operations 502691
                      dentry operations 395330
                      export operations 0
                        file operations 199243
                       inode operations 984276
                       super operations 1727082
                          vm operations 0
                        block I/O reads 520531
                       block I/O writes 130315

                                  locks 171423
                             locks held 85717
                           freeze count 0
                          incore inodes 85376
                       metadata buffers 1474
                        unlinked inodes 0
                              quota IDs 0
                     incore log buffers 24
                         log space used 0.83%
              meta header cache entries 6621
                     glock dependencies 2037
                 glocks on reclaim list 0
                              log wraps 428
                   outstanding LM calls 0
                  outstanding BIO calls 0
                       fh2dentry misses 0
                       glocks reclaimed 45784677
                         glock nq calls 962822941
                         glock dq calls 962595532
                   glock prefetch calls 20215922
                          lm_lock calls 40708633
                        lm_unlock calls 23410498
                           lm callbacks 64156052
                     address operations 705464659
                      dentry operations 19701522
                      export operations 0
                        file operations 364990733
                       inode operations 98910127
                       super operations 440061034
                          vm operations 7
                        block I/O reads 90394984
                       block I/O writes 131199864

                                  locks 2916542
                             locks held 1476005
                           freeze count 0
                          incore inodes 1454165
                       metadata buffers 12539
                        unlinked inodes 100
                              quota IDs 0
                     incore log buffers 11
                         log space used 13.33%
              meta header cache entries 9928
                     glock dependencies 110
                 glocks on reclaim list 0
                              log wraps 2393
                   outstanding LM calls 25
                  outstanding BIO calls 0
                       fh2dentry misses 55546
                       glocks reclaimed 127341056
                         glock nq calls 867427
                         glock dq calls 867430
                   glock prefetch calls 36679316
                          lm_lock calls 110179878
                        lm_unlock calls 84588424
                           lm callbacks 194863553
                     address operations 250891447
                      dentry operations 359537343
                      export operations 390941288
                        file operations 399156716
                       inode operations 537830
                       super operations 1093798409
                          vm operations 774785
                        block I/O reads 258044208
                       block I/O writes 101585172


On Tue, Oct 7, 2008 at 1:33 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
> Problem:
> It seems that IO on one machine in the cluster (not always the same
> machine) will hang and all processes accessing clustered LVs will
> block.  Other machines will follow suit shortly thereafter until the
> machine that first exhibited the problem is rebooted (via fence_drac
> manually).  No messages in dmesg, syslog, etc.  Filesystems recently
> fsckd.
>
> Hardware:
> Dell 1950s (similar except memory -- 3x 16GB RAM, 1x 8GB RAM).
> Running RHEL4 ES U7.  Four machines
> Onboard gigabit NICs (Machines use little bandwidth, and all network
> traffic including DLM share NICs)
> QLogic 2462 PCI-Express dual channel FC HBAs
> QLogic SANBox 5200 FC switch
> Apple XRAID which presents as two LUNs (~4.5TB raw aggregate)
> Cisco Catalyst switch
>
> Simple four machine RHEL4 U7 cluster running kernel 2.6.9-78.0.1.ELsmp
> x86_64 with the following packages:
> ccs-1.0.12-1
> cman-1.0.24-1
> cman-kernel-smp-2.6.9-55.13.el4_7.1
> cman-kernheaders-2.6.9-55.13.el4_7.1
> dlm-kernel-smp-2.6.9-54.11.el4_7.1
> dlm-kernheaders-2.6.9-54.11.el4_7.1
> fence-1.32.63-1.el4_7.1
> GFS-6.1.18-1
> GFS-kernel-smp-2.6.9-80.9.el4_7.1
>
> One clustered VG.  Striped across two physical volumes, which
> correspond to each side of an Apple XRAID.
> Clustered volume group info:
>  --- Volume group ---
>  VG Name               hq-san
>  System ID
>  Format                lvm2
>  Metadata Areas        2
>  Metadata Sequence No  50
>  VG Access             read/write
>  VG Status             resizable
>  Clustered             yes
>  Shared                no
>  MAX LV                0
>  Cur LV                3
>  Open LV               3
>  Max PV                0
>  Cur PV                2
>  Act PV                2
>  VG Size               4.55 TB
>  PE Size               4.00 MB
>  Total PE              1192334
>  Alloc PE / Size       905216 / 3.45 TB
>  Free  PE / Size       287118 / 1.10 TB
>  VG UUID               hfeIhf-fzEq-clCf-b26M-cMy3-pphm-B6wmLv
>
> Logical volumes contained with hq-san VG:
>  cam_development   hq-san                          -wi-ao 500.00G
>  qa            hq-san                          -wi-ao   1.07T
>  svn_users         hq-san                          -wi-ao   1.89T
>
> All four machines mount svn_users, two machines mount qa, and one
> mounts cam_development.
>
> /etc/cluster/cluster.conf:
>
> <?xml version="1.0"?>
> <cluster alias="tungsten" config_version="31" name="qualia">
>        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>        <clusternodes>
>                <clusternode name="odin" votes="1">
>                        <fence>
>                                <method name="1">
>                    <device modulename="" name="odin-drac"/>
>                </method>
>                        </fence>
>                </clusternode>
>                <clusternode name="hugin" votes="1">
>                        <fence>
>                                <method name="1">
>                    <device modulename="" name="hugin-drac"/>
>                </method>
>                        </fence>
>                </clusternode>
>                <clusternode name="munin" votes="1">
>                        <fence>
>                                <method name="1">
>                    <device modulename="" name="munin-drac"/>
>                </method>
>                        </fence>
>                </clusternode>
>                <clusternode name="zeus" votes="1">
>                        <fence>
>                                <method name="1">
>                    <device modulename="" name="zeus-drac"/>
>                </method>
>                        </fence>
>                </clusternode>
>    </clusternodes>
>        <cman expected_votes="1" two_node="0"/>
>        <fencedevices>
>                <resources/>
>                <fencedevice name="odin-drac" agent="fence_drac"
> ipaddr="redacted" login="root" passwd="redacted"/>
>                <fencedevice name="hugin-drac" agent="fence_drac"
> ipaddr="redacted" login="root" passwd="redacted"/>
>                <fencedevice name="munin-drac" agent="fence_drac"
> ipaddr="redacted" login="root" passwd="redacted"/>
>                <fencedevice name="zeus-drac" agent="fence_drac"
> ipaddr="redacted" login="root" passwd="redacted"/>
>        </fencedevices>
>        <rm>
>        <failoverdomains/>
>        <resources/>
>    </rm>
> </cluster>
>
>
>
>
> --
> Shawn Hood
> 910.670.1819 m
>


-- 
Shawn Hood
910.670.1819 m


From shawnlhood at gmail.com  Tue Oct  7 17:43:07 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Tue, 7 Oct 2008 13:43:07 -0400
Subject: [Linux-cluster] Re: GFS hanging on 3 node RHEL4 cluster
In-Reply-To: <cfe2fc960810071040s7f0dd5e8qb60873f82e458d91@mail.gmail.com>
References: <cfe2fc960810071033x5abf179fw7f19fa5f56a4f8ed@mail.gmail.com>
	<cfe2fc960810071040s7f0dd5e8qb60873f82e458d91@mail.gmail.com>
Message-ID: <cfe2fc960810071043h1ef881feib198860b36d2766@mail.gmail.com>

And for another follow-up in the interest of full disclosure, I don't recall
the specifics, but it seems dlm_recvd was eating up all the CPU cycles on
one of the machines, and others seemed to follow suit shortly thereafter.
Sorry for the flood!

Shawn
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081007/d448c585/attachment.htm>

From michael.osullivan at auckland.ac.nz  Tue Oct  7 19:10:13 2008
From: michael.osullivan at auckland.ac.nz (michael.osullivan at auckland.ac.nz)
Date: Wed, 8 Oct 2008 08:10:13 +1300 (NZDT)
Subject: [Linux-cluster] RE: Can't create LV in 2-node cluster
Message-ID: <1393.128.187.171.228.1223406613.squirrel@mail.esc.auckland.ac.nz>

Hi Mark,

This is just an experimental cluster for now, not production, so 2-nodes
is sufficient (as long as it doesn't significantly alter the setup, which
I don;t think it does). I have two multi-pathed iSCSI targets for storage,
one each on two separate boxes. I have got this going previously on a
slightly different set-up elsewhere, but this is my next effort that
incorporates easy shutdown/startup of the storage and cluster. Except I
can't get the LV up and running...

Thanks, Mike

Date: Mon, 6 Oct 2008 11:03:57 -0500
From: "Mark Chaney" <macscr at macscr.com>
Subject: RE: [Linux-cluster] Can't create LV in 2-node cluster
To: "'linux clustering'" <linux-cluster at redhat.com>
Message-ID: <04cb01c927cd$24b23ed0$6e16bc70$@com>
Content-Type: text/plain;        charset="us-ascii"

What are you using for shared storage? Also, 2 node clusters are highly
discouraged from my experience and recommendations of others.

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of
michael.osullivan at auckland.ac.nz
Sent: Monday, October 06, 2008 10:28 AM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Can't create LV in 2-node cluster

Hi everyone,

I have created a 2-node cluster and (after a little difficulty) created a
clustered volume group visible on both nodes. However, I can't create a
logical volume on either node. I get the following error:

Error locking on node <other node>: Error backing up metadata, can't find
VG for group #global
Aborting. Failed to activate new LV to wipe the start of it.

I have locking_type = 3 and locking_library = liblvm2clusterlock.so. The
metadata areas and sequence nos are the same using vgdisplay on both
nodes.

Can anyone help me create LVs? I am happy to provide any extra info needed.

Thanks, Mike


From macscr at macscr.com  Tue Oct  7 19:24:43 2008
From: macscr at macscr.com (Mark Chaney)
Date: Tue, 7 Oct 2008 14:24:43 -0500
Subject: [Linux-cluster] RE: Can't create LV in 2-node cluster
In-Reply-To: <1393.128.187.171.228.1223406613.squirrel@mail.esc.auckland.ac.nz>
References: <1393.128.187.171.228.1223406613.squirrel@mail.esc.auckland.ac.nz>
Message-ID: <00b901c928b2$5b1b3810$1151a830$@com>

So is clvmd running fine on both nodes? If its not, your not going to be
able to do anything with the shared storage. After you have verified its
running, do a vgscan. If you get any errors, you have to fix those first
before you can move ahead to worrying about the lv issues.

I am far from a cluster expert. Im actually having my own issues right now,
but I am just passing on some info that I have learned along the way.

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of
michael.osullivan at auckland.ac.nz
Sent: Tuesday, October 07, 2008 2:10 PM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] RE: Can't create LV in 2-node cluster

Hi Mark,

This is just an experimental cluster for now, not production, so 2-nodes
is sufficient (as long as it doesn't significantly alter the setup, which
I don;t think it does). I have two multi-pathed iSCSI targets for storage,
one each on two separate boxes. I have got this going previously on a
slightly different set-up elsewhere, but this is my next effort that
incorporates easy shutdown/startup of the storage and cluster. Except I
can't get the LV up and running...

Thanks, Mike

Date: Mon, 6 Oct 2008 11:03:57 -0500
From: "Mark Chaney" <macscr at macscr.com>
Subject: RE: [Linux-cluster] Can't create LV in 2-node cluster
To: "'linux clustering'" <linux-cluster at redhat.com>
Message-ID: <04cb01c927cd$24b23ed0$6e16bc70$@com>
Content-Type: text/plain;        charset="us-ascii"

What are you using for shared storage? Also, 2 node clusters are highly
discouraged from my experience and recommendations of others.

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of
michael.osullivan at auckland.ac.nz
Sent: Monday, October 06, 2008 10:28 AM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Can't create LV in 2-node cluster

Hi everyone,

I have created a 2-node cluster and (after a little difficulty) created a
clustered volume group visible on both nodes. However, I can't create a
logical volume on either node. I get the following error:

Error locking on node <other node>: Error backing up metadata, can't find
VG for group #global
Aborting. Failed to activate new LV to wipe the start of it.

I have locking_type = 3 and locking_library = liblvm2clusterlock.so. The
metadata areas and sequence nos are the same using vgdisplay on both
nodes.

Can anyone help me create LVs? I am happy to provide any extra info needed.

Thanks, Mike


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


From caronc at navcanada.ca  Tue Oct  7 20:57:24 2008
From: caronc at navcanada.ca (Caron, Chris)
Date: Tue, 7 Oct 2008 16:57:24 -0400
Subject: [Linux-cluster] GFS & journal resources ... common ratio?
Message-ID: <474534909BE4064E853161350C47578E0C15999E@ncrmail1.corp.navcan.ca>

I just have a quick question regarding the amount of storage occupied by
the journals. Is there a common ratio to determine how much space will
be occupied?

I'm doing very rough map and experimenting with different conditions,
but I can't seem to get a common mechanism for predicting the usable
storage limit given the number of locks.

 
Currently we have an 8 node cluster with 9 journals defined (+1 just in
case)...  We are awaiting the external storage hardware; but in the time
being I am using exporting iscsi drives from another machine to work
with.

 
With 9 locks and creating a 1.75GB partition, I get 655 MB usable from
it.... (roughly calculated ratio: 3.37)

With 9 locks and creating a 1.50GB partition, I get 393 MB usable from
it.... (roughly calculated ratio: 2.36)

With 9 locks and creating a 1.25GB partition, I get 131 MB usable from
it.... (roughly calculated ratio: 0.90)

 
I'm obviously missing a figure during my calculations because the ratios
vary each time... I'd have expected them to be a bit more constant...

At the end of the day I want to be able to know ahead of time how much
hard-disk space I need to allocate for 'X usable' based on the number of
locks I'm using.

 
The simple equation I used was:

X/9=USABLE/ALLOCATED

 
ie: (X/9=.655/1.75 = 3.369)

 
Chris

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081007/a8eaf4ac/attachment.htm>

From kanderso at redhat.com  Tue Oct  7 21:11:37 2008
From: kanderso at redhat.com (Kevin Anderson)
Date: Tue, 07 Oct 2008 16:11:37 -0500
Subject: [Linux-cluster] GFS & journal resources ... common ratio?
In-Reply-To: <474534909BE4064E853161350C47578E0C15999E@ncrmail1.corp.navcan.ca>
References: <474534909BE4064E853161350C47578E0C15999E@ncrmail1.corp.navcan.ca>
Message-ID: <1223413897.4420.54.camel@dhcp80-204.msp.redhat.com>


On Tue, 2008-10-07 at 16:57 -0400, Caron, Chris wrote:
> I just have a quick question regarding the amount of storage occupied
> by the journals. Is there a common ratio to determine how much space
> will be occupied?
> 

Default Journal size is 128MB.

> I?m doing very rough map and experimenting with different conditions,
> but I can?t seem to get a common mechanism for predicting the usable
> storage limit given the number of locks.
> 
>  
> 
> Currently we have an 8 node cluster with 9 journals defined (+1 just
> in case)?  We are awaiting the external storage hardware; but in the
> time being I am using exporting iscsi drives from another machine to
> work with.

9 * 128MB = 1152MB - which is pretty consistent with your remaining
space below.

> 
>  
> 
> With 9 locks and creating a 1.75GB partition, I get 655 MB usable from
> it?. (roughly calculated ratio: 3.37)
> 
> With 9 locks and creating a 1.50GB partition, I get 393 MB usable from
> it?. (roughly calculated ratio: 2.36)
> 
> With 9 locks and creating a 1.25GB partition, I get 131 MB usable from
> it?. (roughly calculated ratio: 0.90)
> 
>  
> 
> I?m obviously missing a figure during my calculations because the
> ratios vary each time? I?d have expected them to be a bit more
> constant?
> 
> At the end of the day I want to be able to know ahead of time how much
> hard-disk space I need to allocate for ?X usable? based on the number
> of locks I?m using.
> 
>  
> 
> The simple equation I used was:
> 
> X/9=USABLE/ALLOCATED
> 
>  
> 
> ie: (X/9=.655/1.75 = 3.369)
> 
>  
> 
>  
> 
> Chris
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From stelgn at gmail.com  Wed Oct  8 09:15:27 2008
From: stelgn at gmail.com (Ernest Neo Wee Teck)
Date: Wed, 8 Oct 2008 17:15:27 +0800
Subject: [Linux-cluster] GNBD multi import?
Message-ID: <93af20e00810080215j6c8a625fwfef560f314468a52@mail.gmail.com>

I have 3 servers deployed with CLVM, GNBD, CMAN.

fencing using fence_gnbd

ServerA gnbd_export a logical block (LVM) named "r1"

Is it possible for ServerB and ServerC to import "r1" at the same time?

Can domU running on either ServerB and C use the imported r1
(/dev/gnbd/r1)? eg. phy:/dev/gnbd/r1

Is live migration feasible in this case if r1 could be imported on
ServerB and C?


Trying to test the concept of live migration with shared storage here

Cheers,

Ernest


From michael.osullivan at auckland.ac.nz  Wed Oct  8 16:00:16 2008
From: michael.osullivan at auckland.ac.nz (michael.osullivan at auckland.ac.nz)
Date: Thu, 9 Oct 2008 05:00:16 +1300 (NZDT)
Subject: [Linux-cluster] RE: Can't create LV in 2-node cluster
Message-ID: <1482.128.187.169.21.1223481616.squirrel@mail.esc.auckland.ac.nz>

Hi Mark,

clvmd is running fine on both nodes. The result of "service clvmd status" is

clvmd (pid xxxxx) is running...
active volumes: LogVol00 LogVol01

The result of vgscan is

Reading all physical volumes. This may take a while...
Found volume group "iscsi_raid_vg" using metadata type lvm2
Found volume group "VolGroup00" using metadata type lvm2

I just can't create a logical volume either from the command line or using
system-config-lvm...

Thanks, Mike

    * From: "Mark Chaney" <macscr macscr com>
    * To: "'linux clustering'" <linux-cluster redhat com>
    * Subject: RE: [Linux-cluster] RE: Can't create LV in 2-node cluster
    * Date: Tue, 7 Oct 2008 14:24:43 -0500

So is clvmd running fine on both nodes? If its not, your not going to be
able to do anything with the shared storage. After you have verified its
running, do a vgscan. If you get any errors, you have to fix those first
before you can move ahead to worrying about the lv issues.

I am far from a cluster expert. Im actually having my own issues right now,
but I am just passing on some info that I have learned along the way.

-----Original Message-----
From: linux-cluster-bounces redhat com
[mailto:linux-cluster-bounces redhat com] On Behalf Of
michael osullivan auckland ac nz
Sent: Tuesday, October 07, 2008 2:10 PM
To: linux-cluster redhat com
Subject: [Linux-cluster] RE: Can't create LV in 2-node cluster

Hi Mark,

This is just an experimental cluster for now, not production, so 2-nodes
is sufficient (as long as it doesn't significantly alter the setup, which
I don;t think it does). I have two multi-pathed iSCSI targets for storage,
one each on two separate boxes. I have got this going previously on a
slightly different set-up elsewhere, but this is my next effort that
incorporates easy shutdown/startup of the storage and cluster. Except I
can't get the LV up and running...

Thanks, Mike

Date: Mon, 6 Oct 2008 11:03:57 -0500
From: "Mark Chaney" <macscr macscr com>
Subject: RE: [Linux-cluster] Can't create LV in 2-node cluster
To: "'linux clustering'" <linux-cluster redhat com>
Message-ID: <04cb01c927cd$24b23ed0$6e16bc70$ com>
Content-Type: text/plain;        charset="us-ascii"

What are you using for shared storage? Also, 2 node clusters are highly
discouraged from my experience and recommendations of others.

-----Original Message-----
From: linux-cluster-bounces redhat com
[mailto:linux-cluster-bounces redhat com] On Behalf Of
michael osullivan auckland ac nz
Sent: Monday, October 06, 2008 10:28 AM
To: linux-cluster redhat com
Subject: [Linux-cluster] Can't create LV in 2-node cluster

Hi everyone,

I have created a 2-node cluster and (after a little difficulty) created a
clustered volume group visible on both nodes. However, I can't create a
logical volume on either node. I get the following error:

Error locking on node <other node>: Error backing up metadata, can't find
VG for group #global
Aborting. Failed to activate new LV to wipe the start of it.

I have locking_type = 3 and locking_library = liblvm2clusterlock.so. The
metadata areas and sequence nos are the same using vgdisplay on both
nodes.

Can anyone help me create LVs? I am happy to provide any extra info needed.

Thanks, Mike


--
Linux-cluster mailing list
Linux-cluster redhat com
https://www.redhat.com/mailman/listinfo/linux-cluster


From Dave.Jones at maritz.com  Wed Oct  8 16:04:04 2008
From: Dave.Jones at maritz.com (Jones, Dave)
Date: Wed, 8 Oct 2008 11:04:04 -0500
Subject: [Linux-cluster] 2nd ILO idea
In-Reply-To: <1482.128.187.169.21.1223481616.squirrel@mail.esc.auckland.ac.nz>
References: <1482.128.187.169.21.1223481616.squirrel@mail.esc.auckland.ac.nz>
Message-ID: <DDF0CD44B9EED841B556D44B7929813A05A87940@FENEXCH1602C.us.maritz.net>


Hello all.

Has anyone experimented with adding a second ILO card into HP servers,
reserving 1 for normal ILO access and the second for fencing?

Just curious.   I'm not even sure if they sell add-on ILO boards
anymore.  Or if 2 of them would work in the same server.

Thanks,
Dave

Confidentiality Warning:  This e-mail contains information intended only for the use of the individual or entity named above.  If the reader of this e-mail is not the intended recipient or the employee or agent responsible for delivering it to the intended recipient, any dissemination, publication or copying of this e-mail is strictly prohibited.  The sender does not accept any responsibility for any loss, disruption or damage to your data or computer system that may occur while using data contained in, or transmitted with, this e-mail.  
If you have received this e-mail in error, please immediately notify us by return e-mail.  Thank you.


From jruemker at redhat.com  Wed Oct  8 18:33:45 2008
From: jruemker at redhat.com (John Ruemker)
Date: Wed, 08 Oct 2008 14:33:45 -0400
Subject: [Linux-cluster] RE: Can't create LV in 2-node cluster
In-Reply-To: <1482.128.187.169.21.1223481616.squirrel@mail.esc.auckland.ac.nz>
References: <1482.128.187.169.21.1223481616.squirrel@mail.esc.auckland.ac.nz>
Message-ID: <48ECFD09.6010801@redhat.com>

michael.osullivan at auckland.ac.nz wrote:
> Hi Mark,
>
> clvmd is running fine on both nodes. The result of "service clvmd status" is
>
> clvmd (pid xxxxx) is running...
> active volumes: LogVol00 LogVol01
>
> The result of vgscan is
>
> Reading all physical volumes. This may take a while...
> Found volume group "iscsi_raid_vg" using metadata type lvm2
> Found volume group "VolGroup00" using metadata type lvm2
>
> I just can't create a logical volume either from the command line or using
> system-config-lvm...

Did you partition the device before adding a physical volume to it?  If 
so, did you run partprobe on both nodes?  A common scenario is to 
partition the device from node 1 and create a physical volume on it.  
However the partition table is not automatically read on the second node 
so it has no idea there is a partition there.  When clvmd tells the 
second node to activate a vg or lv on this unknown device, that node 
responds that it can't lock on to the device since it has no idea what 
it is.  If you do end up in this situation then usually the solution is 
to do this on both nodes

   # rm /etc/lvm/cache/.cache
   # partprobe
   # clvmd -R

Then from one node:

   # pvscan
   # vgscan
   # lvchange -ay vg/lv

Try this and see if it helps. 

-John


From jmartin at learningobjects.com  Wed Oct  8 19:53:55 2008
From: jmartin at learningobjects.com (James Martin)
Date: Wed, 08 Oct 2008 15:53:55 -0400
Subject: [Linux-cluster] 2nd ILO idea
In-Reply-To: <DDF0CD44B9EED841B556D44B7929813A05A87940@FENEXCH1602C.us.maritz.net>
References: <1482.128.187.169.21.1223481616.squirrel@mail.esc.auckland.ac.nz>
	<DDF0CD44B9EED841B556D44B7929813A05A87940@FENEXCH1602C.us.maritz.net>
Message-ID: <48ED0FD3.2000705@learningobjects.com>

I don't believe the sell them as add-ons except for older servers that 
didn't come with them integrated.  Why not just by a APC PDU or 
something similar that lets you power on/off specific outlets?


James


Jones, Dave wrote:
> Hello all.
> 
> Has anyone experimented with adding a second ILO card into HP servers,
> reserving 1 for normal ILO access and the second for fencing?
> 
> Just curious.   I'm not even sure if they sell add-on ILO boards
> anymore.  Or if 2 of them would work in the same server.
> 
> Thanks,
> Dave
> 
> Confidentiality Warning:  This e-mail contains information intended only for the use of the individual or entity named above.  If the reader of this e-mail is not the intended recipient or the employee or agent responsible for delivering it to the intended recipient, any dissemination, publication or copying of this e-mail is strictly prohibited.  The sender does not accept any responsibility for any loss, disruption or damage to your data or computer system that may occur while using data contained in, or transmitted with, this e-mail.  
> If you have received this e-mail in error, please immediately notify us by return e-mail.  Thank you.
> 
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 


From jamesc at exa.com  Wed Oct  8 20:56:35 2008
From: jamesc at exa.com (James Chamberlain)
Date: Wed, 8 Oct 2008 16:56:35 -0400
Subject: [Linux-cluster] gfs_grow
In-Reply-To: <81D8B57D-B9C8-4AA0-8BEC-F45212795FB6@exa.com>
References: <81D8B57D-B9C8-4AA0-8BEC-F45212795FB6@exa.com>
Message-ID: <DE800B5E-F5BB-48BE-A8C2-FF61592037EB@exa.com>

Hi all,

I'd like to thank Bob Peterson for helping me solve the last problem I  
was seeing with my storage cluster.  I've got a new one now.  A couple  
days ago, site ops plugged in a new storage shelf and this triggered  
some sort of error in the storage chassis.  I was able to sort that  
out with gfs_fsck, and have since gotten the new storage recognized by  
the cluster.  I'd like to make use of this new storage, and it's here  
that we run into trouble.

lvextend completed with no trouble, so I ran gfs_grow.  gfs_grow has  
been running for over an hour now and has not progressed past:

[root at s12n01 ~]# gfs_grow /dev/s12/scratch13
FS: Mount Point: /scratch13
FS: Device: /dev/s12/scratch13
FS: Options: rw,noatime,nodiratime
FS: Size: 4392290302
DEV: Size: 5466032128
Preparing to write new FS information...

The load average on this node has risen from its normal ~30-40 to 513  
(the number of nfsd threads, plus one), and the file system has become  
slow-to-inaccessible on client nodes.  I am seeing messages in my log  
files that indicate things like:

Oct  8 16:26:00 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when  
sending 140 bytes - shutting down socket
Oct  8 16:26:00 s12n01 last message repeated 4 times
Oct  8 16:26:00 s12n01 kernel: nfsd: peername failed (err 107)!
Oct  8 16:26:58 s12n01 kernel: nfsd: peername failed (err 107)!
Oct  8 16:27:56 s12n01 last message repeated 2 times
Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when  
sending 140 bytes - shutting down socket
Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when  
sending 140 bytes - shutting down socket
Oct  8 16:27:56 s12n01 kernel: nfsd: peername failed (err 107)!
Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when  
sending 140 bytes - shutting down socket
Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when  
sending 140 bytes - shutting down socket
Oct  8 16:27:56 s12n01 kernel: nfsd: peername failed (err 107)!
Oct  8 16:28:34 s12n01 last message repeated 2 times
Oct  8 16:30:29 s12n01 last message repeated 2 times

I was seeing similar messages this morning, but those went away when I  
mounted this file system on another node in the cluster, turned on  
statfs_fast, and then moved the service to that node.  I'm not sure  
what to do about it given that gfs_grow is running.  Is this something  
anyone else has seen?  Does anyone know what to do about this?  Do I  
have any option other than to wait until gfs_grow is done?  Given my  
recent experiences (see "lm_dlm_cancel" in the list archives), I'm  
very hesitant to hit ^C on this gfs_grow.  I'm running CentOS 4 for  
x86-64, kernel 2.6.9-67.0.20.ELsmp.

Thanks,

James


From andrew at ntsg.umt.edu  Wed Oct  8 21:12:39 2008
From: andrew at ntsg.umt.edu (Andrew A. Neuschwander)
Date: Wed, 08 Oct 2008 15:12:39 -0600
Subject: [Linux-cluster] gfs_grow
In-Reply-To: <DE800B5E-F5BB-48BE-A8C2-FF61592037EB@exa.com>
References: <81D8B57D-B9C8-4AA0-8BEC-F45212795FB6@exa.com>
	<DE800B5E-F5BB-48BE-A8C2-FF61592037EB@exa.com>
Message-ID: <48ED2247.8050406@ntsg.umt.edu>

James,

I have a CentOS 5.2 cluster where I would see the same nfs errors under 
certain conditions. If I did anything that introduced latency to my gfs 
operations on the node that served nfs, the nfs threads couldn't service 
requests faster than they came in from clients. Eventually my nfs 
threads would all be busy and start dropping nfs requests. I kept an eye 
on my nfsd thread utilization (/proc/net/rpc/nfsd) and kept bumping up 
the number of threads until they could handle all the requests while the 
gfs had a higher latency.

In my case, I had EMC Networker streaming data from my gfs filesystems 
to a local scsi tape device on the same node that served nfs. I 
eventually separated them onto different nodes.

I'm sure gfs_grow would slow down your gfs enough that your nfs threads 
couldn't keep up. NFS on gfs seems to be very latency sensitive. I have 
a quick an dirty perl script to generate a historgram image from nfs 
thread stats if you are interested.

-Andrew
--
Andrew A. Neuschwander, RHCE
Linux Systems/Software Engineer
College of Forestry and Conservation
The University of Montana
http://www.ntsg.umt.edu
andrew at ntsg.umt.edu - 406.243.6310


James Chamberlain wrote:
> Hi all,
> 
> I'd like to thank Bob Peterson for helping me solve the last problem I 
> was seeing with my storage cluster.  I've got a new one now.  A couple 
> days ago, site ops plugged in a new storage shelf and this triggered 
> some sort of error in the storage chassis.  I was able to sort that out 
> with gfs_fsck, and have since gotten the new storage recognized by the 
> cluster.  I'd like to make use of this new storage, and it's here that 
> we run into trouble.
> 
> lvextend completed with no trouble, so I ran gfs_grow.  gfs_grow has 
> been running for over an hour now and has not progressed past:
> 
> [root at s12n01 ~]# gfs_grow /dev/s12/scratch13
> FS: Mount Point: /scratch13
> FS: Device: /dev/s12/scratch13
> FS: Options: rw,noatime,nodiratime
> FS: Size: 4392290302
> DEV: Size: 5466032128
> Preparing to write new FS information...
> 
> The load average on this node has risen from its normal ~30-40 to 513 
> (the number of nfsd threads, plus one), and the file system has become 
> slow-to-inaccessible on client nodes.  I am seeing messages in my log 
> files that indicate things like:
> 
> Oct  8 16:26:00 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when 
> sending 140 bytes - shutting down socket
> Oct  8 16:26:00 s12n01 last message repeated 4 times
> Oct  8 16:26:00 s12n01 kernel: nfsd: peername failed (err 107)!
> Oct  8 16:26:58 s12n01 kernel: nfsd: peername failed (err 107)!
> Oct  8 16:27:56 s12n01 last message repeated 2 times
> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when 
> sending 140 bytes - shutting down socket
> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when 
> sending 140 bytes - shutting down socket
> Oct  8 16:27:56 s12n01 kernel: nfsd: peername failed (err 107)!
> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when 
> sending 140 bytes - shutting down socket
> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104 when 
> sending 140 bytes - shutting down socket
> Oct  8 16:27:56 s12n01 kernel: nfsd: peername failed (err 107)!
> Oct  8 16:28:34 s12n01 last message repeated 2 times
> Oct  8 16:30:29 s12n01 last message repeated 2 times
> 
> I was seeing similar messages this morning, but those went away when I 
> mounted this file system on another node in the cluster, turned on 
> statfs_fast, and then moved the service to that node.  I'm not sure what 
> to do about it given that gfs_grow is running.  Is this something anyone 
> else has seen?  Does anyone know what to do about this?  Do I have any 
> option other than to wait until gfs_grow is done?  Given my recent 
> experiences (see "lm_dlm_cancel" in the list archives), I'm very 
> hesitant to hit ^C on this gfs_grow.  I'm running CentOS 4 for x86-64, 
> kernel 2.6.9-67.0.20.ELsmp.
> 
> Thanks,
> 
> James
> 
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 


From janar.kartau at gmail.com  Wed Oct  8 23:24:51 2008
From: janar.kartau at gmail.com (Janar Kartau)
Date: Thu, 09 Oct 2008 02:24:51 +0300
Subject: [Linux-cluster] GFS lockups ?
Message-ID: <48ED4143.4070107@gmail.com>

Hi,
Recently our three-node webserver cluster started randomly crashing. I
never had time to investigate what the problem was, cause i needed to
bring them back online again. But it seemed like alla Apache processes
just hang (couldn't even kill them).. waiting for something. The only
thing that helped, was a reboot for all or couple of the nodes. Anyway,
today i encountered this problem at night and i could look into it a
little more. I noticed that some of the GFS filesystems were
unaccessable (we have 5 of them, mounted on every nide) and of the nodes
was completely unaccessable. So i guessed that this half-dead node was
holding locks on the filesystems or sth. Did a hard reset on this dead
node and all stabilized.
Absolutely no cluster/GFS errors in the logs (besides the ones which
tell that the half-dead node was leaving the cluster when i reset it).
Nodes have CentOS 4.6 installed (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use EMC CX3-10c for GFS storage
(over iSCSI) and EMC PowerPath for multipathing. Separate VLAN is used
for CMAN/DLM traffic.
Please give me ideas how to solve this or atleast some debugging tips as
it's happening twice a day now and seems i simply can't help it. :(

Janar Kartau


From grimme at atix.de  Thu Oct  9 06:40:58 2008
From: grimme at atix.de (Marc Grimme)
Date: Thu, 9 Oct 2008 08:40:58 +0200
Subject: [Linux-cluster] GFS lockups ?
In-Reply-To: <48ED4143.4070107@gmail.com>
References: <48ED4143.4070107@gmail.com>
Message-ID: <200810090840.58589.grimme@atix.de>

On Thursday 09 October 2008 01:24:51 Janar Kartau wrote:
> Hi,
> Recently our three-node webserver cluster started randomly crashing. I
> never had time to investigate what the problem was, cause i needed to
> bring them back online again. But it seemed like alla Apache processes
> just hang (couldn't even kill them).. waiting for something. The only
> thing that helped, was a reboot for all or couple of the nodes. Anyway,
> today i encountered this problem at night and i could look into it a
> little more. I noticed that some of the GFS filesystems were
> unaccessable (we have 5 of them, mounted on every nide) and of the nodes
> was completely unaccessable. So i guessed that this half-dead node was
> holding locks on the filesystems or sth. Did a hard reset on this dead
> node and all stabilized.
> Absolutely no cluster/GFS errors in the logs (besides the ones which
> tell that the half-dead node was leaving the cluster when i reset it).
> Nodes have CentOS 4.6 installed (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
> GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use EMC CX3-10c for GFS storage
> (over iSCSI) and EMC PowerPath for multipathing. Separate VLAN is used
> for CMAN/DLM traffic.
> Please give me ideas how to solve this or atleast some debugging tips as
> it's happening twice a day now and seems i simply can't help it. :(

Could you provide more information like relevant syslogs and console messages?

Are you using php with sessions?

-- 
Gruss / Regards,

Marc Grimme
http://www.atix.de/               http://www.open-sharedroot.org/


From shawnlhood at gmail.com  Thu Oct  9 05:00:00 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Thu, 9 Oct 2008 01:00:00 -0400
Subject: [Linux-cluster] GFS lockups ?
In-Reply-To: <48ED4143.4070107@gmail.com>
References: <48ED4143.4070107@gmail.com>
Message-ID: <0C318874-8DF9-4E40-BEC9-3CD90F8C85EA@gmail.com>

See my thread from yesterday.  Same general thing, but the dlm kernel  
threads were eating cycles.

Sent from my iPhone

On Oct 8, 2008, at 7:24 PM, Janar Kartau <janar.kartau at gmail.com> wrote:

> Hi,
> Recently our three-node webserver cluster started randomly crashing. I
> never had time to investigate what the problem was, cause i needed to
> bring them back online again. But it seemed like alla Apache processes
> just hang (couldn't even kill them).. waiting for something. The only
> thing that helped, was a reboot for all or couple of the nodes.  
> Anyway,
> today i encountered this problem at night and i could look into it a
> little more. I noticed that some of the GFS filesystems were
> unaccessable (we have 5 of them, mounted on every nide) and of the  
> nodes
> was completely unaccessable. So i guessed that this half-dead node was
> holding locks on the filesystems or sth. Did a hard reset on this dead
> node and all stabilized.
> Absolutely no cluster/GFS errors in the logs (besides the ones which
> tell that the half-dead node was leaving the cluster when i reset it).
> Nodes have CentOS 4.6 installed (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
> GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use EMC CX3-10c for GFS  
> storage
> (over iSCSI) and EMC PowerPath for multipathing. Separate VLAN is used
> for CMAN/DLM traffic.
> Please give me ideas how to solve this or atleast some debugging  
> tips as
> it's happening twice a day now and seems i simply can't help it. :(
>
> Janar Kartau
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From grimme at atix.de  Thu Oct  9 07:10:27 2008
From: grimme at atix.de (Marc Grimme)
Date: Thu, 9 Oct 2008 09:10:27 +0200
Subject: [Linux-cluster] Re: GFS hanging on 3 node RHEL4 cluster
In-Reply-To: <cfe2fc960810071043h1ef881feib198860b36d2766@mail.gmail.com>
References: <cfe2fc960810071033x5abf179fw7f19fa5f56a4f8ed@mail.gmail.com>
	<cfe2fc960810071040s7f0dd5e8qb60873f82e458d91@mail.gmail.com>
	<cfe2fc960810071043h1ef881feib198860b36d2766@mail.gmail.com>
Message-ID: <200810090910.27894.grimme@atix.de>

On Tuesday 07 October 2008 19:43:07 Shawn Hood wrote:
> And for another follow-up in the interest of full disclosure, I don't
> recall the specifics, but it seems dlm_recvd was eating up all the CPU
> cycles on one of the machines, and others seemed to follow suit shortly
> thereafter. Sorry for the flood!
>
> Shawn

You might want to enable glock_purging. This should reduce or even eliminate 
the problems (not sure yet what it is dependent on). But normally to enable 
glock_purging (in my experiance) reduces the likelyhood of gfs/dlm freezes.

You are sure you don't have any syslog or console message related to the 
cluster before?

-- 
Gruss / Regards,

Marc Grimme
http://www.atix.de/               http://www.open-sharedroot.org/


From federico.simoncelli at gmail.com  Thu Oct  9 11:20:45 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Thu, 9 Oct 2008 13:20:45 +0200
Subject: [Linux-cluster] Cluster monitoring
Message-ID: <a01fe36d0810090420q4aca69fapec895f0f8ee5ee90@mail.gmail.com>

Hi all,
  what is the best way to generate mail notification for cluster
events such as joins/leaves/fences?
I would rather not use an external monitor system like nagios and
ganglia but looks like those are the best practice for now.
Is there any other monitoring application/technique that I should consider?

Thanks in advance,
-- 
Federico.


From hrouamba at gmail.com  Thu Oct  9 11:52:45 2008
From: hrouamba at gmail.com (ROUAMBA Halidou)
Date: Thu, 9 Oct 2008 11:52:45 +0000
Subject: [Linux-cluster] RHEL AS 4.7 Cluster : unable to create HP ILO fence
	device
Message-ID: <df225e0c0810090452m358cdce0hd502f8cbd2dd8562@mail.gmail.com>

Hi , my pote

I just finish installing RHEL AS 4.7 cluster suite on HP DL580 G5 platform.
I'm create member node, cluster domain, but i can't create HP ILO fence
device
When i click on the OK button the bellow message is send to the
system-config-cluster line commande:
*[root at app-db2 ~]# system-config-cluster
Traceback (most recent call last):
  File "/usr/share/system-config-cluster/ConfigTabController.py", line 1232,
in on_fd_panel_ok
    return_list = self.fence_handler.validate_fencedevice(agent_type, None)
  File "/usr/share/system-config-cluster/FenceHandler.py", line 713, in
validate_fencedevice
    returnlist = apply(self.fd_validate[agent_type], args)
  File "/usr/share/system-config-cluster/FenceHandler.py", line 932, in
val_ilo_fd
    if self.ilo_ssh.get_active == True:
AttributeError: 'NoneType' object has no attribute 'get_active'
[root at app-db2 ~]#*

you can find attached the file contained the screen capture.

Thanks for all
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081009/f93b112e/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hp_ilo_screen.png
Type: image/png
Size: 52967 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081009/f93b112e/attachment.png>

From mgrac at redhat.com  Thu Oct  9 11:56:15 2008
From: mgrac at redhat.com (Marek 'marx' Grac)
Date: Thu, 09 Oct 2008 13:56:15 +0200
Subject: [Linux-cluster] My patch
In-Reply-To: <48EB3B1F.1010700@craigon.co.uk>
References: <48EB3B1F.1010700@craigon.co.uk>
Message-ID: <48EDF15F.9030000@redhat.com>

David wrote:
> Did this patch ever get merged in?
>
> https://www.redhat.com/archives/linux-cluster/2008-August/msg00026.html
No, can you please create a bug in bugzilla? If it works without any 
problem (as we don't have such device), I can apply it.

m,

-- 
Marek Grac
Red Hat Czech s.r.o.


From jamesc at exa.com  Thu Oct  9 15:18:11 2008
From: jamesc at exa.com (James Chamberlain)
Date: Thu, 9 Oct 2008 11:18:11 -0400
Subject: [Linux-cluster] gfs_grow
In-Reply-To: <48ED2247.8050406@ntsg.umt.edu>
References: <81D8B57D-B9C8-4AA0-8BEC-F45212795FB6@exa.com>
	<DE800B5E-F5BB-48BE-A8C2-FF61592037EB@exa.com>
	<48ED2247.8050406@ntsg.umt.edu>
Message-ID: <5C2F3859-F436-42DB-8FB2-94ABAD52CD73@exa.com>

Thanks Andrew.

What I'm really hoping for is anything I can do to make this gfs_grow  
go faster.  It's been running for 19 hours now, I have no idea when  
it'll complete, and the file system I'm trying to grow has been all  
but unusable for the duration.  This is a very busy file system, and I  
know it's best to run gfs_grow on a quiet file system, but there isn't  
too much I can do about that.  Alternatively, if anyone knows of a  
signal I could send to gfs_grow that would cause it to give a status  
report or increase verbosity, that would be helpful, too.  I have  
tried both increasing and decreasing the number of NFS threads, but  
since I can't tell where I am in the process or how quickly it's  
going, I have no idea what effect this has on operations.

Thanks,

James

On Oct 8, 2008, at 5:12 PM, Andrew A. Neuschwander wrote:

> James,
>
> I have a CentOS 5.2 cluster where I would see the same nfs errors  
> under certain conditions. If I did anything that introduced latency  
> to my gfs operations on the node that served nfs, the nfs threads  
> couldn't service requests faster than they came in from clients.  
> Eventually my nfs threads would all be busy and start dropping nfs  
> requests. I kept an eye on my nfsd thread utilization (/proc/net/rpc/ 
> nfsd) and kept bumping up the number of threads until they could  
> handle all the requests while the gfs had a higher latency.
>
> In my case, I had EMC Networker streaming data from my gfs  
> filesystems to a local scsi tape device on the same node that served  
> nfs. I eventually separated them onto different nodes.
>
> I'm sure gfs_grow would slow down your gfs enough that your nfs  
> threads couldn't keep up. NFS on gfs seems to be very latency  
> sensitive. I have a quick an dirty perl script to generate a  
> historgram image from nfs thread stats if you are interested.
>
> -Andrew
> --
> Andrew A. Neuschwander, RHCE
> Linux Systems/Software Engineer
> College of Forestry and Conservation
> The University of Montana
> http://www.ntsg.umt.edu
> andrew at ntsg.umt.edu - 406.243.6310
>
>
> James Chamberlain wrote:
>> Hi all,
>> I'd like to thank Bob Peterson for helping me solve the last  
>> problem I was seeing with my storage cluster.  I've got a new one  
>> now.  A couple days ago, site ops plugged in a new storage shelf  
>> and this triggered some sort of error in the storage chassis.  I  
>> was able to sort that out with gfs_fsck, and have since gotten the  
>> new storage recognized by the cluster.  I'd like to make use of  
>> this new storage, and it's here that we run into trouble.
>> lvextend completed with no trouble, so I ran gfs_grow.  gfs_grow  
>> has been running for over an hour now and has not progressed past:
>> [root at s12n01 ~]# gfs_grow /dev/s12/scratch13
>> FS: Mount Point: /scratch13
>> FS: Device: /dev/s12/scratch13
>> FS: Options: rw,noatime,nodiratime
>> FS: Size: 4392290302
>> DEV: Size: 5466032128
>> Preparing to write new FS information...
>> The load average on this node has risen from its normal ~30-40 to  
>> 513 (the number of nfsd threads, plus one), and the file system has  
>> become slow-to-inaccessible on client nodes.  I am seeing messages  
>> in my log files that indicate things like:
>> Oct  8 16:26:00 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>> when sending 140 bytes - shutting down socket
>> Oct  8 16:26:00 s12n01 last message repeated 4 times
>> Oct  8 16:26:00 s12n01 kernel: nfsd: peername failed (err 107)!
>> Oct  8 16:26:58 s12n01 kernel: nfsd: peername failed (err 107)!
>> Oct  8 16:27:56 s12n01 last message repeated 2 times
>> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>> when sending 140 bytes - shutting down socket
>> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>> when sending 140 bytes - shutting down socket
>> Oct  8 16:27:56 s12n01 kernel: nfsd: peername failed (err 107)!
>> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>> when sending 140 bytes - shutting down socket
>> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>> when sending 140 bytes - shutting down socket
>> Oct  8 16:27:56 s12n01 kernel: nfsd: peername failed (err 107)!
>> Oct  8 16:28:34 s12n01 last message repeated 2 times
>> Oct  8 16:30:29 s12n01 last message repeated 2 times
>> I was seeing similar messages this morning, but those went away  
>> when I mounted this file system on another node in the cluster,  
>> turned on statfs_fast, and then moved the service to that node.   
>> I'm not sure what to do about it given that gfs_grow is running.   
>> Is this something anyone else has seen?  Does anyone know what to  
>> do about this?  Do I have any option other than to wait until  
>> gfs_grow is done?  Given my recent experiences (see "lm_dlm_cancel"  
>> in the list archives), I'm very hesitant to hit ^C on this  
>> gfs_grow.  I'm running CentOS 4 for x86-64, kernel  
>> 2.6.9-67.0.20.ELsmp.
>> Thanks,
>> James
>> -- 
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081009/5b0e4836/attachment.htm>

From xavier.montagutelli at unilim.fr  Thu Oct  9 17:32:06 2008
From: xavier.montagutelli at unilim.fr (Xavier Montagutelli)
Date: Thu, 09 Oct 2008 19:32:06 +0200
Subject: [Linux-cluster] RHEL AS 4.7 Cluster : unable to create HP ILO
	fence	device
In-Reply-To: <df225e0c0810090452m358cdce0hd502f8cbd2dd8562@mail.gmail.com>
References: <df225e0c0810090452m358cdce0hd502f8cbd2dd8562@mail.gmail.com>
Message-ID: <48EE4016.402@unilim.fr>

ROUAMBA Halidou a ?crit :
>
> Hi , my pote
>
> I just finish installing RHEL AS 4.7 cluster suite on HP DL580 G5 
> platform.
> I'm create member node, cluster domain, but i can't create HP ILO 
> fence device
> When i click on the OK button the bellow message is send to the 
> system-config-cluster line commande:
> *[root at app-db2 ~]# system-config-cluster
> Traceback (most recent call last):
>   File "/usr/share/system-config-cluster/ConfigTabController.py", line 
> 1232, in on_fd_panel_ok
>
> *
If you have problem with the GUI, it's easy to modify the configuration 
file directly (/etc/cluster/cluster.conf).

1) increment the "config_version" attribute

2) add a fence device :
        <fencedevices>
                <fencedevice agent="fence_ilo" hostname="app-db1-ilo" 
login="Administrator" name="app-db1-ILO" passwd="orange......."/>
        </fencedevices>

3) modify your cluster node :
                <clusternode name="app-db1" ....>
                        <fence>
                                <method name="single">
                                        <device name="app-db1-ILO"/>
                                </method>
                        </fence>
                </clusternode>

4) inform ccs of the change :
   ccs_tool upgrade /etc/cluster/cluster.conf

After that, you can test your fence device with the command :
  fence_node app-db1

--
Xavier Montagutelli


From janar.kartau at gmail.com  Thu Oct  9 18:38:43 2008
From: janar.kartau at gmail.com (Janar Kartau)
Date: Thu, 09 Oct 2008 21:38:43 +0300
Subject: [Linux-cluster] GFS lockups ?
In-Reply-To: <200810090840.58589.grimme@atix.de>
References: <48ED4143.4070107@gmail.com> <200810090840.58589.grimme@atix.de>
Message-ID: <48EE4FB3.8040600@gmail.com>

Like i said, i couldn't find anything in the logs besides eviction
messages after i manually reset the server. Yes, we do use PHP and
sessions which use memcached as a backend.

Janar

Marc Grimme wrote:
> On Thursday 09 October 2008 01:24:51 Janar Kartau wrote:
>   
>> Hi,
>> Recently our three-node webserver cluster started randomly crashing. I
>> never had time to investigate what the problem was, cause i needed to
>> bring them back online again. But it seemed like alla Apache processes
>> just hang (couldn't even kill them).. waiting for something. The only
>> thing that helped, was a reboot for all or couple of the nodes. Anyway,
>> today i encountered this problem at night and i could look into it a
>> little more. I noticed that some of the GFS filesystems were
>> unaccessable (we have 5 of them, mounted on every nide) and of the nodes
>> was completely unaccessable. So i guessed that this half-dead node was
>> holding locks on the filesystems or sth. Did a hard reset on this dead
>> node and all stabilized.
>> Absolutely no cluster/GFS errors in the logs (besides the ones which
>> tell that the half-dead node was leaving the cluster when i reset it).
>> Nodes have CentOS 4.6 installed (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
>> GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use EMC CX3-10c for GFS storage
>> (over iSCSI) and EMC PowerPath for multipathing. Separate VLAN is used
>> for CMAN/DLM traffic.
>> Please give me ideas how to solve this or atleast some debugging tips as
>> it's happening twice a day now and seems i simply can't help it. :(
>>     
>
> Could you provide more information like relevant syslogs and console messages?
>
> Are you using php with sessions?
>
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081009/dcdc347e/attachment.htm>

From janar.kartau at gmail.com  Thu Oct  9 18:40:42 2008
From: janar.kartau at gmail.com (Janar Kartau)
Date: Thu, 09 Oct 2008 21:40:42 +0300
Subject: [Linux-cluster] GFS lockups ?
In-Reply-To: <0C318874-8DF9-4E40-BEC9-3CD90F8C85EA@gmail.com>
References: <48ED4143.4070107@gmail.com>
	<0C318874-8DF9-4E40-BEC9-3CD90F8C85EA@gmail.com>
Message-ID: <48EE502A.2040309@gmail.com>

Hm.. didn't notice it before. Anyway, i didn't notice that dlm was doing
any more job than usually. The most CPU-consuming processes on the alive
nodes was "top" itself (although the load was around 600 because of the
hang Apache procs).

Janar

Shawn Hood wrote:
> See my thread from yesterday.  Same general thing, but the dlm kernel
> threads were eating cycles.
>
> Sent from my iPhone
>
> On Oct 8, 2008, at 7:24 PM, Janar Kartau <janar.kartau at gmail.com> wrote:
>
>> Hi,
>> Recently our three-node webserver cluster started randomly crashing. I
>> never had time to investigate what the problem was, cause i needed to
>> bring them back online again. But it seemed like alla Apache processes
>> just hang (couldn't even kill them).. waiting for something. The only
>> thing that helped, was a reboot for all or couple of the nodes. Anyway,
>> today i encountered this problem at night and i could look into it a
>> little more. I noticed that some of the GFS filesystems were
>> unaccessable (we have 5 of them, mounted on every nide) and of the nodes
>> was completely unaccessable. So i guessed that this half-dead node was
>> holding locks on the filesystems or sth. Did a hard reset on this dead
>> node and all stabilized.
>> Absolutely no cluster/GFS errors in the logs (besides the ones which
>> tell that the half-dead node was leaving the cluster when i reset it).
>> Nodes have CentOS 4.6 installed (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
>> GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use EMC CX3-10c for GFS storage
>> (over iSCSI) and EMC PowerPath for multipathing. Separate VLAN is used
>> for CMAN/DLM traffic.
>> Please give me ideas how to solve this or atleast some debugging tips as
>> it's happening twice a day now and seems i simply can't help it. :(
>>
>> Janar Kartau
>>
>> -- 
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From grimme at atix.de  Thu Oct  9 19:19:57 2008
From: grimme at atix.de (Marc Grimme)
Date: Thu, 9 Oct 2008 21:19:57 +0200
Subject: [Linux-cluster] GFS lockups ?
In-Reply-To: <48EE4FB3.8040600@gmail.com>
References: <48ED4143.4070107@gmail.com> <200810090840.58589.grimme@atix.de>
	<48EE4FB3.8040600@gmail.com>
Message-ID: <200810092119.57501.grimme@atix.de>

On Thursday 09 October 2008 20:38:43 Janar Kartau wrote:
> Like i said, i couldn't find anything in the logs besides eviction
> messages after i manually reset the server. Yes, we do use PHP and
> sessions which use memcached as a backend.
Don't know much about memcached as a backend but I recall we finally patched 
php so it uses flocks (as far as I remember or you can at least configure how 
you want to use session-filelocking) and after it apache is pretty stable. No 
*D*s any more because of this.

I don't know what the status is with the php patch but I think it's still 
somewhere. I need to check back on this.

-marc.
>
> Janar
>
> Marc Grimme wrote:
> > On Thursday 09 October 2008 01:24:51 Janar Kartau wrote:
> >> Hi,
> >> Recently our three-node webserver cluster started randomly crashing. I
> >> never had time to investigate what the problem was, cause i needed to
> >> bring them back online again. But it seemed like alla Apache processes
> >> just hang (couldn't even kill them).. waiting for something. The only
> >> thing that helped, was a reboot for all or couple of the nodes. Anyway,
> >> today i encountered this problem at night and i could look into it a
> >> little more. I noticed that some of the GFS filesystems were
> >> unaccessable (we have 5 of them, mounted on every nide) and of the nodes
> >> was completely unaccessable. So i guessed that this half-dead node was
> >> holding locks on the filesystems or sth. Did a hard reset on this dead
> >> node and all stabilized.
> >> Absolutely no cluster/GFS errors in the logs (besides the ones which
> >> tell that the half-dead node was leaving the cluster when i reset it).
> >> Nodes have CentOS 4.6 installed (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
> >> GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use EMC CX3-10c for GFS storage
> >> (over iSCSI) and EMC PowerPath for multipathing. Separate VLAN is used
> >> for CMAN/DLM traffic.
> >> Please give me ideas how to solve this or atleast some debugging tips as
> >> it's happening twice a day now and seems i simply can't help it. :(
> >
> > Could you provide more information like relevant syslogs and console
> > messages?
> >
> > Are you using php with sessions?


-- 
Gruss / Regards,

Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/               http://www.open-sharedroot.org/

**
ATIX Informationstechnologie und Consulting AG
Einsteinstr. 10 
85716 Unterschleissheim
Deutschland/Germany

Phone: +49-89 452 3538-0
Fax:   +49-89 990 1766-0

Registergericht: Amtsgericht Muenchen
Registernummer: HRB 168930
USt.-Id.: DE209485962

Vorstand: 
Marc Grimme, Mark Hlawatschek, Thomas Merz (Vors.)

Vorsitzender des Aufsichtsrats:
Dr. Martin Buss


From terrybdavis at gmail.com  Thu Oct  9 20:01:04 2008
From: terrybdavis at gmail.com (Terry Davis)
Date: Thu, 9 Oct 2008 15:01:04 -0500
Subject: [Linux-cluster] clustering inside of vmware -- fencing
Message-ID: <14139e3a0810091301q3b1d006al829dc52157bcdb22@mail.gmail.com>

Hello,
I am trying to set up a RHEL5 cluster inside of VMware (client request).  I
see there are hints of a fence_vmware script floating around.  I found one
on sources.redhat.com but this, coupled with the latest toolkit from vmware
yields a missing VMware::VmPerl.pm file.  This got me to step back and think
about this a bit further.

1) is this a supported configuration?
2) what am I missing with the fence_vmware script?
3) can anyone share any working configurations with this?

Thanks!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081009/ea81d553/attachment.htm>

From michael.osullivan at auckland.ac.nz  Thu Oct  9 20:30:48 2008
From: michael.osullivan at auckland.ac.nz (michael.osullivan at auckland.ac.nz)
Date: Fri, 10 Oct 2008 09:30:48 +1300 (NZDT)
Subject: [Linux-cluster] RE: Can't create LV in 2-node cluster
Message-ID: <1385.128.187.136.18.1223584248.squirrel@mail.esc.auckland.ac.nz>

    * From: John Ruemker <jruemker redhat com>
    * To: linux clustering <linux-cluster redhat com>
    * Subject: Re: [Linux-cluster] RE: Can't create LV in 2-node cluster
    * Date: Wed, 08 Oct 2008 14:33:45 -0400

michael osullivan auckland ac nz wrote:

    Hi Mark,

    clvmd is running fine on both nodes. The result of "service clvmd
status" is

    clvmd (pid xxxxx) is running...
    active volumes: LogVol00 LogVol01

    The result of vgscan is

    Reading all physical volumes. This may take a while...
    Found volume group "iscsi_raid_vg" using metadata type lvm2
    Found volume group "VolGroup00" using metadata type lvm2

    I just can't create a logical volume either from the command line or
using
    system-config-lvm...

> Did you partition the device before adding a physical volume to it? If
so, did you run partprobe on both nodes? A common scenario is to
partition the device from node 1 and create a physical volume on it.
However the partition table is not automatically read on the second node
so it has no idea there is a partition there. When clvmd tells the
second node to activate a vg or lv on this unknown device, that node
responds that it can't lock on to the device since it has no idea what
it is. If you do end > up in this situation then usually the solution is
to do this on both nodes
>
>   # rm /etc/lvm/cache/.cache
>   # partprobe
>   # clvmd -R
>
> Then from one node:
>
>   # pvscan
>   # vgscan
>   # lvchange -ay vg/lv
>
>
> Try this and see if it helps.
>
> -John

Thanks for your help John,

I didn't partition the device before adding the physical volume, I just
used pvcreate. I tried you advice except for the last step as I have not
managed to create a logical volume to activate. The error changed a
little:

lvcreate -n iscsi_raid_lv -l 4882 iscsi_raid_vg gives

Error locking on node <other node>: Error backing up metadata, can't find
VG for group vg
Aborting. Failed to activate new LV to wipe the start of it.

Note that "group vg" used to be "group #global" before I tried your
solution. Any other ideas? This has got me really stumped. The commands I
used to create the physical volume and volume group were:

pvcreate /dev/iscsi_raid

vgcreate -cy iscsi_raid_vg /dev/iscsi_raid


But now lvcreate won't cooperate.

Thanks again for any help. Also thanks for your previous help too. Thanks,
Mike


From andrew at ntsg.umt.edu  Thu Oct  9 20:55:57 2008
From: andrew at ntsg.umt.edu (Andrew A. Neuschwander)
Date: Thu, 09 Oct 2008 14:55:57 -0600
Subject: [Linux-cluster] clustering inside of vmware -- fencing
In-Reply-To: <14139e3a0810091301q3b1d006al829dc52157bcdb22@mail.gmail.com>
References: <14139e3a0810091301q3b1d006al829dc52157bcdb22@mail.gmail.com>
Message-ID: <48EE6FDD.1050904@ntsg.umt.edu>

VMware::VmPerl.pm a deprecated api. The current vmware perl api is the 
VI Perl Toolkit (VMware::VIM2Runtime). I have a centos 5.2 gfs cluster 
running in a VMware ESX cluster (with virtualcenter). I have a mix of 
CentOS VMs and physical machines participating in the GFS cluster.

I modified the fence_vixel agent to use this new api, and called it 
fence_vi3. It's fairly basic and works, but could use some improvements. 
I've been using it for almost a year:

https://www.redhat.com/archives/cluster-devel/2007-November/msg00056.html

The old fence_vmware agent logs into a single ESX/GSX server using the 
old api and resets the targeted guest. fence_vi3 logs into the 
VirtualCenter and resets the requested VM. This is needed in a ESX 
cluster, since you don't know on which ESX machine your Centos/Rhel 
guest is running.

I don't know if this is a supported setup, but it works. I've done a lot 
of heavy testing and optimizing of gfs in this setup. My volume group 
which hold my gfs filesystems is 14TB and has been in production for a 
good 6 months. Obviously, YMMV.

-Andrew
--
Andrew A. Neuschwander, RHCE
Linux Systems/Software Engineer
College of Forestry and Conservation
The University of Montana
http://www.ntsg.umt.edu
andrew at ntsg.umt.edu - 406.243.6310


Terry Davis wrote:
> Hello,
> I am trying to set up a RHEL5 cluster inside of VMware (client request).  I
> see there are hints of a fence_vmware script floating around.  I found one
> on sources.redhat.com but this, coupled with the latest toolkit from vmware
> yields a missing VMware::VmPerl.pm file.  This got me to step back and think
> about this a bit further.
> 
> 1) is this a supported configuration?
> 2) what am I missing with the fence_vmware script?
> 3) can anyone share any working configurations with this?
> 
> Thanks!
> 
> 
> 
> ------------------------------------------------------------------------
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From kanderso at redhat.com  Thu Oct  9 21:18:42 2008
From: kanderso at redhat.com (Kevin Anderson)
Date: Thu, 09 Oct 2008 16:18:42 -0500
Subject: [Linux-cluster] clustering inside of vmware -- fencing
In-Reply-To: <48EE6FDD.1050904@ntsg.umt.edu>
References: <14139e3a0810091301q3b1d006al829dc52157bcdb22@mail.gmail.com>
	<48EE6FDD.1050904@ntsg.umt.edu>
Message-ID: <1223587122.3016.18.camel@dhcp80-204.msp.redhat.com>

There is a new implementation of the fence_vmware agent written in
python in the GIT tree:
http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=tree;f=fence/agents/vmware;h=d07bfd7a8d6f445f15f793626cf493a9edf11833;hb=refs/heads/master

This one uses the new libfence python infrastructure and hopefully does
what you need.  Check it out and let us know.

Kevin

On Thu, 2008-10-09 at 14:55 -0600, Andrew A. Neuschwander wrote:
> VMware::VmPerl.pm a deprecated api. The current vmware perl api is the 
> VI Perl Toolkit (VMware::VIM2Runtime). I have a centos 5.2 gfs cluster 
> running in a VMware ESX cluster (with virtualcenter). I have a mix of 
> CentOS VMs and physical machines participating in the GFS cluster.
> 
> I modified the fence_vixel agent to use this new api, and called it 
> fence_vi3. It's fairly basic and works, but could use some improvements. 
> I've been using it for almost a year:
> 
> https://www.redhat.com/archives/cluster-devel/2007-November/msg00056.html
> 
> The old fence_vmware agent logs into a single ESX/GSX server using the 
> old api and resets the targeted guest. fence_vi3 logs into the 
> VirtualCenter and resets the requested VM. This is needed in a ESX 
> cluster, since you don't know on which ESX machine your Centos/Rhel 
> guest is running.
> 
> I don't know if this is a supported setup, but it works. I've done a lot 
> of heavy testing and optimizing of gfs in this setup. My volume group 
> which hold my gfs filesystems is 14TB and has been in production for a 
> good 6 months. Obviously, YMMV.
> 
> -Andrew
> --
> Andrew A. Neuschwander, RHCE
> Linux Systems/Software Engineer
> College of Forestry and Conservation
> The University of Montana
> http://www.ntsg.umt.edu
> andrew at ntsg.umt.edu - 406.243.6310
> 
> 
> Terry Davis wrote:
> > Hello,
> > I am trying to set up a RHEL5 cluster inside of VMware (client request).  I
> > see there are hints of a fence_vmware script floating around.  I found one
> > on sources.redhat.com but this, coupled with the latest toolkit from vmware
> > yields a missing VMware::VmPerl.pm file.  This got me to step back and think
> > about this a bit further.
> > 
> > 1) is this a supported configuration?
> > 2) what am I missing with the fence_vmware script?
> > 3) can anyone share any working configurations with this?
> > 
> > Thanks!
> > 
> > 
> > 
> > ------------------------------------------------------------------------
> > 
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From jamesc at exa.com  Thu Oct  9 22:53:58 2008
From: jamesc at exa.com (James Chamberlain)
Date: Thu, 9 Oct 2008 18:53:58 -0400
Subject: [Linux-cluster] gfs_grow
In-Reply-To: <5C2F3859-F436-42DB-8FB2-94ABAD52CD73@exa.com>
References: <81D8B57D-B9C8-4AA0-8BEC-F45212795FB6@exa.com>
	<DE800B5E-F5BB-48BE-A8C2-FF61592037EB@exa.com>
	<48ED2247.8050406@ntsg.umt.edu>
	<5C2F3859-F436-42DB-8FB2-94ABAD52CD73@exa.com>
Message-ID: <0FA1858B-DA74-4D65-95CB-7EC21559FA6F@exa.com>

The gfs_grow did finally complete, but now I've got another problem:

Oct  9 17:55:49 s12n03 kernel: GFS: fsid=s12:scratch13.1: fatal:  
invalid metadata block
Oct  9 17:55:49 s12n03 kernel: GFS: fsid=s12:scratch13.1:   bh =  
4314413922 (type: exp=5, found=4)
Oct  9 17:55:49 s12n03 kernel: GFS: fsid=s12:scratch13.1:   function =  
gfs_get_meta_buffer
Oct  9 17:55:49 s12n03 kernel: GFS: fsid=s12:scratch13.1:   file = / 
builddir/build/BUILD/gfs-kernel-2.6.9-75/smp/src/gfs/dio.c, line = 1223
Oct  9 17:55:49 s12n03 kernel: GFS: fsid=s12:scratch13.1:   time =  
1223589349
Oct  9 17:55:49 s12n03 kernel: GFS: fsid=s12:scratch13.1: about to  
withdraw from the cluster
Oct  9 17:55:49 s12n03 kernel: GFS: fsid=s12:scratch13.1: waiting for  
outstanding I/O
Oct  9 17:55:49 s12n03 kernel: GFS: fsid=s12:scratch13.1: telling LM  
to withdraw
Oct  9 17:55:50 s12n01 kernel: GFS: fsid=s12:scratch13.2: jid=1:  
Trying to acquire journal lock...
Oct  9 17:55:50 s12n01 kernel: GFS: fsid=s12:scratch13.2: jid=1: Busy
Oct  9 17:55:50 s12n02 kernel: GFS: fsid=s12:scratch13.0: jid=1:  
Trying to acquire journal lock...
Oct  9 17:55:50 s12n02 kernel: GFS: fsid=s12:scratch13.0: jid=1:  
Looking at journal...
Oct  9 17:55:51 s12n02 kernel: GFS: fsid=s12:scratch13.0: jid=1:  
Acquiring the transaction lock...
Oct  9 17:55:51 s12n02 kernel: GFS: fsid=s12:scratch13.0: jid=1:  
Replaying journal...
Oct  9 17:55:52 s12n02 kernel: GFS: fsid=s12:scratch13.0: jid=1:  
Replayed 1637 of 3945 blocks
Oct  9 17:55:52 s12n02 kernel: GFS: fsid=s12:scratch13.0: jid=1:  
replays = 1637, skips = 115, sames = 2193
Oct  9 17:55:52 s12n03 kernel: lock_dlm: withdraw abandoned memory
Oct  9 17:55:52 s12n02 kernel: GFS: fsid=s12:scratch13.0: jid=1:  
Journal replayed in 2s
Oct  9 17:55:52 s12n03 kernel: GFS: fsid=s12:scratch13.1: withdrawn
Oct  9 17:55:52 s12n02 kernel: GFS: fsid=s12:scratch13.0: jid=1: Done
Oct  9 17:56:26 s12n03 clurgmgrd: [6611]: <err> clusterfs:gfs- 
scratch13: Mount point is not accessible!
Oct  9 17:56:26 s12n03 clurgmgrd[6611]: <notice> status on  
clusterfs:gfs-scratch13 returned 1 (generic error)
Oct  9 17:56:26 s12n03 clurgmgrd[6611]: <notice> Stopping service  
scratch13
Oct  9 17:56:26 s12n03 clurgmgrd: [6611]: <info> Removing IPv4 address  
10.14.12.5 from bond0
Oct  9 17:56:36 s12n03 clurgmgrd: [6611]: <err> /scratch13 is not a  
directory
Oct  9 17:56:36 s12n03 clurgmgrd[6611]: <notice> stop on nfsclient:nfs- 
scratch13 returned 2 (invalid argument(s))
Oct  9 17:56:36 s12n03 clurgmgrd[6611]: <crit> #12: RG scratch13  
failed to stop; intervention required
Oct  9 17:56:36 s12n03 clurgmgrd[6611]: <notice> Service scratch13 is  
failed

The history here is that a new storage shelf was added to the  
chassis.  This somehow triggered an error on the chassis - a timeout  
of some sort, as I understand it from Site Ops - which I presume  
triggered this problem on this file system, since the two events were  
coincident.  I have run gfs_fsck against this file system, but it  
didn't fix the problem - even when I used a newer version of gfs_fsck  
from RHEL 5 that had been back-ported to RHEL4.  I had done this a  
couple of times before running the gfs_grow, and had hoped that the  
problem had been taken care of.  Apparently not.  Does anyone have any  
thoughts here?  I can make the file system available again by killing  
off anything I suspect might be accessing that invalid metadata block,  
but that's not a good solution.

Thanks,

James

On Oct 9, 2008, at 11:18 AM, James Chamberlain wrote:

> Thanks Andrew.
>
> What I'm really hoping for is anything I can do to make this  
> gfs_grow go faster.  It's been running for 19 hours now, I have no  
> idea when it'll complete, and the file system I'm trying to grow has  
> been all but unusable for the duration.  This is a very busy file  
> system, and I know it's best to run gfs_grow on a quiet file system,  
> but there isn't too much I can do about that.  Alternatively, if  
> anyone knows of a signal I could send to gfs_grow that would cause  
> it to give a status report or increase verbosity, that would be  
> helpful, too.  I have tried both increasing and decreasing the  
> number of NFS threads, but since I can't tell where I am in the  
> process or how quickly it's going, I have no idea what effect this  
> has on operations.
>
> Thanks,
>
> James
>
> On Oct 8, 2008, at 5:12 PM, Andrew A. Neuschwander wrote:
>
>> James,
>>
>> I have a CentOS 5.2 cluster where I would see the same nfs errors  
>> under certain conditions. If I did anything that introduced latency  
>> to my gfs operations on the node that served nfs, the nfs threads  
>> couldn't service requests faster than they came in from clients.  
>> Eventually my nfs threads would all be busy and start dropping nfs  
>> requests. I kept an eye on my nfsd thread utilization (/proc/net/ 
>> rpc/nfsd) and kept bumping up the number of threads until they  
>> could handle all the requests while the gfs had a higher latency.
>>
>> In my case, I had EMC Networker streaming data from my gfs  
>> filesystems to a local scsi tape device on the same node that  
>> served nfs. I eventually separated them onto different nodes.
>>
>> I'm sure gfs_grow would slow down your gfs enough that your nfs  
>> threads couldn't keep up. NFS on gfs seems to be very latency  
>> sensitive. I have a quick an dirty perl script to generate a  
>> historgram image from nfs thread stats if you are interested.
>>
>> -Andrew
>> --
>> Andrew A. Neuschwander, RHCE
>> Linux Systems/Software Engineer
>> College of Forestry and Conservation
>> The University of Montana
>> http://www.ntsg.umt.edu
>> andrew at ntsg.umt.edu - 406.243.6310
>>
>>
>> James Chamberlain wrote:
>>> Hi all,
>>> I'd like to thank Bob Peterson for helping me solve the last  
>>> problem I was seeing with my storage cluster.  I've got a new one  
>>> now.  A couple days ago, site ops plugged in a new storage shelf  
>>> and this triggered some sort of error in the storage chassis.  I  
>>> was able to sort that out with gfs_fsck, and have since gotten the  
>>> new storage recognized by the cluster.  I'd like to make use of  
>>> this new storage, and it's here that we run into trouble.
>>> lvextend completed with no trouble, so I ran gfs_grow.  gfs_grow  
>>> has been running for over an hour now and has not progressed past:
>>> [root at s12n01 ~]# gfs_grow /dev/s12/scratch13
>>> FS: Mount Point: /scratch13
>>> FS: Device: /dev/s12/scratch13
>>> FS: Options: rw,noatime,nodiratime
>>> FS: Size: 4392290302
>>> DEV: Size: 5466032128
>>> Preparing to write new FS information...
>>> The load average on this node has risen from its normal ~30-40 to  
>>> 513 (the number of nfsd threads, plus one), and the file system  
>>> has become slow-to-inaccessible on client nodes.  I am seeing  
>>> messages in my log files that indicate things like:
>>> Oct  8 16:26:00 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>>> when sending 140 bytes - shutting down socket
>>> Oct  8 16:26:00 s12n01 last message repeated 4 times
>>> Oct  8 16:26:00 s12n01 kernel: nfsd: peername failed (err 107)!
>>> Oct  8 16:26:58 s12n01 kernel: nfsd: peername failed (err 107)!
>>> Oct  8 16:27:56 s12n01 last message repeated 2 times
>>> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>>> when sending 140 bytes - shutting down socket
>>> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>>> when sending 140 bytes - shutting down socket
>>> Oct  8 16:27:56 s12n01 kernel: nfsd: peername failed (err 107)!
>>> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>>> when sending 140 bytes - shutting down socket
>>> Oct  8 16:27:56 s12n01 kernel: rpc-srv/tcp: nfsd: got error -104  
>>> when sending 140 bytes - shutting down socket
>>> Oct  8 16:27:56 s12n01 kernel: nfsd: peername failed (err 107)!
>>> Oct  8 16:28:34 s12n01 last message repeated 2 times
>>> Oct  8 16:30:29 s12n01 last message repeated 2 times
>>> I was seeing similar messages this morning, but those went away  
>>> when I mounted this file system on another node in the cluster,  
>>> turned on statfs_fast, and then moved the service to that node.   
>>> I'm not sure what to do about it given that gfs_grow is running.   
>>> Is this something anyone else has seen?  Does anyone know what to  
>>> do about this?  Do I have any option other than to wait until  
>>> gfs_grow is done?  Given my recent experiences (see  
>>> "lm_dlm_cancel" in the list archives), I'm very hesitant to hit ^C  
>>> on this gfs_grow.  I'm running CentOS 4 for x86-64, kernel  
>>> 2.6.9-67.0.20.ELsmp.
>>> Thanks,
>>> James
>>> -- 
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081009/b66b9f8f/attachment.htm>

From fdinitto at redhat.com  Fri Oct 10 04:53:45 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Fri, 10 Oct 2008 06:53:45 +0200
Subject: [Linux-cluster] Cluster Summit Report
Message-ID: <1223614425.18797.17.camel@daitarn-fedora.int.fabbione.net>

Hi all,

The general feeling was that the Cluster Summit was a very good
experience for everybody and that the amount of work done during those
3 days would have taken months on normal communication media. Of the 3
days schedule only 2 and half were required as the people have been way
more efficient than expected. A lot of the pre-scheduled discussions
have been dropped in a natural fashion as they were absorbed, discussed
or deprecated at the source into other talks. People, coming from
different environments with different experience and use cases, made a
huge difference.

While we did discuss to a greater level of technical details, this is a
short summary of what will happen (in no particular order):

Tree's splitting:
- This item should be first and last at the same time.
  As a consequence of what has been decided, almost all trees will need
  to be divided and reorganized differently.
  As an example, RedHat specific bits will remain in one tree, while
  common components (such as dlm and fencing+fencing agents) will leave 
  in their own separate projects.
  Details of the split are still to be determined. Low hanging fruits
  will be done first (gnbd and gfs* for example).

- We discussed using clusterlabs.org as the go-to page for users,
  listing the versions of the latest (stable) components from all
  sources. The openSUSE Build Service could then be used as a hosting
  provider for this "community distro".

- For the heartbeat tree, all that will eventually remain in it is the
  heartbeat "cluster infrastructure layer" (can't drop for backwards
  compatibility for a while).

- Eventually some core libraries will migrate into corosync.

- fabbione to coordinate the splitting.

- lmb will coordinate the Linux-HA split and help with the build service
  stuff (if we go ahead with that).

Standard fencing:
- fencing daemon, libraries and agents will be merged (from RedHat and
  heartbeat) into two new projects (so that agents can be released
  independently from the daemon/libs).

- fencing project will grow a simulator for regression testing (honza). 
  The simulator will be a simple set of scripts that collect outputs 
  from all known fencing devices and pass them back to the agents to 
  test functionalities. While not perfect, it will still allow to do 
  basic regression testing. We discussed this in terms of rewriting the 
  RAs as simple python classes, which would interact with the world 
  through IO abstractions (which would then be easy to capture/replay).

- honzaf will write up an ABI/API for the agents which merges both
  functionalities and features.

- Possibly agents will need to be rewritten/re-factored as part of the
  merge; some of the C plug-ins might become python classes etc

- lmb, dejan, honza and dct to work on it.

Release time lines:
- As the trees will merge and split into separate projects, RM's will
  coordinate effort to make sure the new work will be available as 
  modular as possible.

- All releases will be available in neutral area for users to download 
  in one shot as discussed previously.

Standard logging:
- Everybody to standardize on logsys.

- The log recorder is worth mentioning here - buffering debug logging so
  that it can be dumped (retroactively) when a fault is encountered.
  Very useful feature.

- heartbeat has a hb_report feature to gather logs, configurations, 
  stack traces from core dumps etc from all cluster nodes, that'll be 
  extended over time to support all this too

- New features will be required in logsys to improve the user 
  experience.

Init scripts:
- agreed that all init scripts shipped from upstream need to be LSB 
  compliant and work in a distribution independent way. Users should 
  not need to care when installing from our tarballs.

- With portable packages, any differences should be hidden in there.

Packaging from upstream:
- in order to speed up adoption, our plan is to ship .spec and debian/
  packaging format directly from upstream and with support from   
  packagers. This will greatly reduce the time of propagation from 
  upstream release into users that do not like installing manually.
  Packages can be built using the openSUSE build service to avoid 
  requirement on new infrastructure.

Standard quorum service:
- Chrissie to implement the service within corosync/openais.

- API has been discussed and explained in depth.

Standard configuration:
- New stack will standardize on CIB (from pacemaker). CIB is approx. a 
  ccsd on steroids.

- fabbione to look into CIB, and port libccs to libcib.

- chrissie to port LDAP loader to CIB.

Common shell scripting library for RA's:
- Agreed to merge and review all RA's. This is a natural step as 
  rgmanager will be deprecated.

- lon and dejan to work on it.

Clustered Samba:
- More detailed investigation required but the short line is that 
  performance testing are required.

- Might require RA.

- Investigate benefit from infiniband.

- Nice to see samba integrated with corosync/openais.

Split site:
- There are 2 main scenarios for split site:
  - Metropolitan Area Clusters: "low" latency, redundancy affordable
  - Wide Area Clusters: high latency, expensive redundancy

  Each case has different problematic s (as latency and speed of the
  links). We will start tackling "remote" and only service/application 
  fail-over. Data Replication will come later as users will demand it.

- lmb to write the code for the "3rd site quorum" service tied into 
  pacemaker resource dependency framework.

- Identified need for some additional RAs to coordinate routing/address
  resolution switch-over; interfacing with routing protocols
  (BGP4/OSPF/etc) and DNS.

Misc:

- corosync release cycles
  - "Flatiron" to be released in time for February (+ Wilson/openAIS)
  - Need to understand effects of RDMA versus IP over infiniband

- openSharedRoot presentation
  - Lots of unsolved issues, mostly related to clunky CDSL emulation,
    and the need to bring up significant portions of the stack before
    mounting root

- NTT:
  - Raised lots of issues about supportability too
  - NTT will drive a stonith agent which works nicely with crashdumps 
    too


From wferi at niif.hu  Fri Oct 10 09:08:27 2008
From: wferi at niif.hu (Ferenc Wagner)
Date: Fri, 10 Oct 2008 11:08:27 +0200
Subject: [Linux-cluster] Cluster Summit Report
In-Reply-To: <1223614425.18797.17.camel@daitarn-fedora.int.fabbione.net>
	(Fabio M. Di Nitto's message of "Fri, 10 Oct 2008 06:53:45 +0200")
References: <1223614425.18797.17.camel@daitarn-fedora.int.fabbione.net>
Message-ID: <87hc7ku8ac.fsf@tac.ki.iif.hu>

"Fabio M. Di Nitto" <fdinitto at redhat.com> writes:

> - Agreed to merge and review all RA's. This is a natural step as
>   rgmanager will be deprecated.

Thanks for the report, it was a very interesting read.
And what will take the place of rgmanager?
-- 
Regards,
Feri.


From stefano.biagiotti at vola.it  Fri Oct 10 10:58:38 2008
From: stefano.biagiotti at vola.it (Stefano Biagiotti)
Date: Fri, 10 Oct 2008 12:58:38 +0200
Subject: [Linux-cluster] RILOE and fencing
Message-ID: <20081010105838.GA31738@palermo.priv2.gtn.it>

I'm testing GFS on a 2-node cluster with CentOS-5. I'm succesfully
running qdiskd, cman, clvmd, gfs and the GFS filesystem is up and running
on both nodes, but I have some issues with fencing when I simulate a
network outage.

The stonith device on both nodes is Compaq RILOE (Remote Insight
Lights-Out Edition). The fence_ilo command doesn't work to me...
 # fence_ilo -a node2-riloe -l admin -p $password -o off
 failed to turn off

Is fence_ilo the correct tool to use with RILOE?

I found fence_rib, but the manpage says it's deprecated:
 Name
   fence_rib - I/O Fencing agent for Compaq Remote Insight Lights Out card
 Description
   fence_rib is deprecated. fence_ilo should be used instead
 See Also
   fence_ilo(8)

Thank you in advance.
-- 
Stefano Biagiotti


From fdinitto at redhat.com  Sat Oct 11 07:20:12 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Sat, 11 Oct 2008 09:20:12 +0200 (CEST)
Subject: [Linux-cluster] Cluster Summit Report
In-Reply-To: <87hc7ku8ac.fsf@tac.ki.iif.hu>
References: <1223614425.18797.17.camel@daitarn-fedora.int.fabbione.net>
	<87hc7ku8ac.fsf@tac.ki.iif.hu>
Message-ID: <Pine.LNX.4.64.0810110919040.9830@trider-g7>

On Fri, 10 Oct 2008, Ferenc Wagner wrote:

> "Fabio M. Di Nitto" <fdinitto at redhat.com> writes:
>
>> - Agreed to merge and review all RA's. This is a natural step as
>>   rgmanager will be deprecated.
>
> Thanks for the report, it was a very interesting read.
> And what will take the place of rgmanager?

rgmanager will be around for backward compatibility for one release to 
allow people a smooth upgrade.

It will be replaced by pacemaker in the long run.

Fabio

--
I'm going to make him an offer he can't refuse.


From sanelson at gmail.com  Sat Oct 11 16:12:42 2008
From: sanelson at gmail.com (Stephen Nelson-Smith)
Date: Sat, 11 Oct 2008 17:12:42 +0100
Subject: [Linux-cluster] Cib.xml --> haresources
Message-ID: <b6131fdc0810110912v2819a612ka22ce148b37c4a1@mail.gmail.com>

Hi,

I want to managed a linux-ha cluster (heartbeat 2) with puppet,
however I think I'm likely to struggle with cib.xml, as it contains
runtime info, and the cluster software seems like it won't take kindly
to being deployed automatically.

I think the approach is to put the config in an haresources file, and
use the haresources conversion to cib.xml tool.  However, how do I put
the knowledge I've put into my cib.xml (attached) into haresources?

Or am I missing the point entirely and there's a much better/easier
way to do this?

S.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cib.xml
Type: text/xml
Size: 3390 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081011/6bb655a7/attachment.xml>

From jmacfarland at nexatech.com  Mon Oct 13 15:35:40 2008
From: jmacfarland at nexatech.com (Jeff Macfarland)
Date: Mon, 13 Oct 2008 10:35:40 -0500
Subject: [Linux-cluster] Cluster monitoring
In-Reply-To: <a01fe36d0810090420q4aca69fapec895f0f8ee5ee90@mail.gmail.com>
References: <a01fe36d0810090420q4aca69fapec895f0f8ee5ee90@mail.gmail.com>
Message-ID: <48F36ACC.3000307@nexatech.com>

Federico Simoncelli wrote:
> Hi all,
>   what is the best way to generate mail notification for cluster
> events such as joins/leaves/fences?
> I would rather not use an external monitor system like nagios and
> ganglia but looks like those are the best practice for now.
> Is there any other monitoring application/technique that I should consider?

I've been meaning to look into this as well. Best I can find is called 
"RIND" (are we tired of recursive acronyms yet?)

http://sources.redhat.com/cluster/wiki/EventScripting

-- 
Jeff Macfarland (jmacfarland at nexatech.com)
Nexa Technologies - 972.747.8879
Systems Administrator
GPG Key ID: 0x5F1CA61B
GPG Key Server: hkp://wwwkeys.pgp.net


From jparsons at redhat.com  Mon Oct 13 15:41:58 2008
From: jparsons at redhat.com (jim parsons)
Date: Mon, 13 Oct 2008 11:41:58 -0400
Subject: [Linux-cluster] Cluster monitoring
In-Reply-To: <48F36ACC.3000307@nexatech.com>
References: <a01fe36d0810090420q4aca69fapec895f0f8ee5ee90@mail.gmail.com>
	<48F36ACC.3000307@nexatech.com>
Message-ID: <1223912518.3298.3.camel@localhost.localdomain>

On Mon, 2008-10-13 at 10:35 -0500, Jeff Macfarland wrote:
> Federico Simoncelli wrote:
> > Hi all,
> >   what is the best way to generate mail notification for cluster
> > events such as joins/leaves/fences?
> > I would rather not use an external monitor system like nagios and
> > ganglia but looks like those are the best practice for now.
> > Is there any other monitoring application/technique that I should consider?
> 
> I've been meaning to look into this as well. Best I can find is called 
> "RIND" (are we tired of recursive acronyms yet?)
> 
> http://sources.redhat.com/cluster/wiki/EventScripting
> 
One easy way for just fence notifications would be to write a fence
agent that sent mail to a mail list you could include as one of its
cluster.conf attributes. Then place it in each fence block you define as
the first fence action.

Extra credit would be mailing the success/failure of the fence
attempt.  :)

-J


From gordan at bobich.net  Mon Oct 13 15:55:22 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 13 Oct 2008 16:55:22 +0100
Subject: [Linux-cluster] Cluster Aware Software RAID (md)
Message-ID: <48BF360208406B67@> (added by '')

Has any progress been made on this? I saw some posts from 3-4 years ago in the OpenGFS archives saying it was worked on, but not seen anything since.

What I'm tring to do is have 2 servers with RAID15 (or 16, or 10) between them. Have each disk mirrored with DRBD, and md RAID on top (and GFS on top of that).

I can see this would work with a fail-over configuration, but in active-active there would be RAID metadata inconsistencies. Is there a way to handle the active-active scenario?

I could invert the md and DRBD layers, but this would result in a massive reduction in fault tolerance (RAID51 instead RAID15), so I'd rather like to avoid this.

TIA.

Gordan


From shawnlhood at gmail.com  Mon Oct 13 19:33:42 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Mon, 13 Oct 2008 15:33:42 -0400
Subject: [Linux-cluster] GFS reserved blocks?
Message-ID: <cfe2fc960810131233n29f64eafj22f68a6695ba5dca@mail.gmail.com>

Does GFS reserve blocks for the superuser, a la ext3's "Reserved block
count"?  I've had a ~1.1TB FS report that it's full with df reporting
~100GB remaining.

-- 
Shawn Hood
910.670.1819 m


From jason.huddleston at verizon.com  Mon Oct 13 20:00:23 2008
From: jason.huddleston at verizon.com (Jason Huddleston)
Date: Mon, 13 Oct 2008 15:00:23 -0500
Subject: [Linux-cluster] GFS reserved blocks?
In-Reply-To: <cfe2fc960810131233n29f64eafj22f68a6695ba5dca@mail.gmail.com>
References: <cfe2fc960810131233n29f64eafj22f68a6695ba5dca@mail.gmail.com>
Message-ID: <48F3A8D7.3000309@verizon.com>

Shawn,
    I have been seeing the same thing on one of my clusters (shown 
below) under Red Hat 4.6. I found some details on this under an article 
on the open-shared root web site 
(http://www.open-sharedroot.org/faq/troubleshooting-guide/file-systems/gfs/file-system-full) 
and an article in Red Hat's knowledge base 
(http://kbase.redhat.com/faq/FAQ_78_10697.shtm). It seems to be a bug in 
the reclaim of metadata blocks when an inode is released. I saw a patch 
(bz298931) released for this in the 2.99.10 cluster release notes but it 
was reverted (bz298931) a few days after it was submitted. The only 
suggestion that I have gotten back from Red Hat is to shutdown the app 
so the GFS drives are not being accessed and then run the "gfs_tool 
reclaim <mount point>" command.

[root at omzdwcdrp003 ~]# gfs_tool df /l1load1
/l1load1:
 SB lock proto = "lock_dlm"
 SB lock table = "DWCDR_prod:l1load1"
 SB ondisk format = 1309
 SB multihost format = 1401
 Block size = 4096
 Journals = 20
 Resource Groups = 6936
 Mounted lock proto = "lock_dlm"
 Mounted lock table = "DWCDR_prod:l1load1"
 Mounted host data = ""
 Journal number = 13
 Lock module flags =
 Local flocks = FALSE
 Local caching = FALSE
 Oopses OK = FALSE

 Type           Total          Used           Free           use%
 ------------------------------------------------------------------------
 inodes         155300         155300         0              100%
 metadata       2016995        675430         1341565        33%
 data           452302809      331558847      120743962      73%
[root at omzdwcdrp003 ~]# df -h /l1load1
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/l1load1--vg-l1load1--lv
                     1.7T  1.3T  468G  74% /l1load1
[root at omzdwcdrp003 ~]# du -sh /l1load1
18G     /l1load1

----
Jason Huddleston, RHCE
----
PS-USE-Linux
Partner Support - Unix Support and Engineering
Verizon Information Processing Services


Shawn Hood wrote:
> Does GFS reserve blocks for the superuser, a la ext3's "Reserved block
> count"?  I've had a ~1.1TB FS report that it's full with df reporting
> ~100GB remaining.
>
>   


From shawnlhood at gmail.com  Mon Oct 13 20:02:59 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Mon, 13 Oct 2008 16:02:59 -0400
Subject: [Linux-cluster] GFS reserved blocks?
In-Reply-To: <48F3A8D7.3000309@verizon.com>
References: <cfe2fc960810131233n29f64eafj22f68a6695ba5dca@mail.gmail.com>
	<48F3A8D7.3000309@verizon.com>
Message-ID: <cfe2fc960810131302j3a3e4d40x7a787bc8c1951770@mail.gmail.com>

I actually just ran the reclaim on a live filesystem and it seems to
be working okay now.  Hopefully this isn't problematic, as a large
number of operations in the GFS tool suite operate on mounted
filesystems.

Shawn

On Mon, Oct 13, 2008 at 4:00 PM, Jason Huddleston
<jason.huddleston at verizon.com> wrote:
> Shawn,
>   I have been seeing the same thing on one of my clusters (shown below)
> under Red Hat 4.6. I found some details on this under an article on the
> open-shared root web site
> (http://www.open-sharedroot.org/faq/troubleshooting-guide/file-systems/gfs/file-system-full)
> and an article in Red Hat's knowledge base
> (http://kbase.redhat.com/faq/FAQ_78_10697.shtm). It seems to be a bug in the
> reclaim of metadata blocks when an inode is released. I saw a patch
> (bz298931) released for this in the 2.99.10 cluster release notes but it was
> reverted (bz298931) a few days after it was submitted. The only suggestion
> that I have gotten back from Red Hat is to shutdown the app so the GFS
> drives are not being accessed and then run the "gfs_tool reclaim <mount
> point>" command.
>
> [root at omzdwcdrp003 ~]# gfs_tool df /l1load1
> /l1load1:
> SB lock proto = "lock_dlm"
> SB lock table = "DWCDR_prod:l1load1"
> SB ondisk format = 1309
> SB multihost format = 1401
> Block size = 4096
> Journals = 20
> Resource Groups = 6936
> Mounted lock proto = "lock_dlm"
> Mounted lock table = "DWCDR_prod:l1load1"
> Mounted host data = ""
> Journal number = 13
> Lock module flags =
> Local flocks = FALSE
> Local caching = FALSE
> Oopses OK = FALSE
>
> Type           Total          Used           Free           use%
> ------------------------------------------------------------------------
> inodes         155300         155300         0              100%
> metadata       2016995        675430         1341565        33%
> data           452302809      331558847      120743962      73%
> [root at omzdwcdrp003 ~]# df -h /l1load1
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/mapper/l1load1--vg-l1load1--lv
>                    1.7T  1.3T  468G  74% /l1load1
> [root at omzdwcdrp003 ~]# du -sh /l1load1
> 18G     /l1load1
>
> ----
> Jason Huddleston, RHCE
> ----
> PS-USE-Linux
> Partner Support - Unix Support and Engineering
> Verizon Information Processing Services
>
>
>
> Shawn Hood wrote:
>>
>> Does GFS reserve blocks for the superuser, a la ext3's "Reserved block
>> count"?  I've had a ~1.1TB FS report that it's full with df reporting
>> ~100GB remaining.
>>
>>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


-- 
Shawn Hood
910.670.1819 m


From jason.huddleston at verizon.com  Mon Oct 13 20:18:51 2008
From: jason.huddleston at verizon.com (Jason Huddleston)
Date: Mon, 13 Oct 2008 15:18:51 -0500
Subject: [Linux-cluster] GFS reserved blocks?
In-Reply-To: <cfe2fc960810131302j3a3e4d40x7a787bc8c1951770@mail.gmail.com>
References: <cfe2fc960810131233n29f64eafj22f68a6695ba5dca@mail.gmail.com>
	<48F3A8D7.3000309@verizon.com>
	<cfe2fc960810131302j3a3e4d40x7a787bc8c1951770@mail.gmail.com>
Message-ID: <48F3AD2B.3090504@verizon.com>

I've been watching mine do this for about two months now. I think it 
started when I upgraded from RHEL 4.5 to 4.6. The app team only has 
about 18 gig used on that 1.7TB drive but they create and delete allot 
of files because that is the loading area they used when new data comes 
in. In the last month I have seen it go up to 70 to 85% used but it 
usually comes back down to about 50% within about 24 hours. Hopefully 
they will find a fix for this soon.

---
Jay

Shawn Hood wrote:
> I actually just ran the reclaim on a live filesystem and it seems to
> be working okay now.  Hopefully this isn't problematic, as a large
> number of operations in the GFS tool suite operate on mounted
> filesystems.
>
> Shawn
>
> On Mon, Oct 13, 2008 at 4:00 PM, Jason Huddleston
> <jason.huddleston at verizon.com> wrote:
>   
>> Shawn,
>>   I have been seeing the same thing on one of my clusters (shown below)
>> under Red Hat 4.6. I found some details on this under an article on the
>> open-shared root web site
>> (http://www.open-sharedroot.org/faq/troubleshooting-guide/file-systems/gfs/file-system-full)
>> and an article in Red Hat's knowledge base
>> (http://kbase.redhat.com/faq/FAQ_78_10697.shtm). It seems to be a bug in the
>> reclaim of metadata blocks when an inode is released. I saw a patch
>> (bz298931) released for this in the 2.99.10 cluster release notes but it was
>> reverted (bz298931) a few days after it was submitted. The only suggestion
>> that I have gotten back from Red Hat is to shutdown the app so the GFS
>> drives are not being accessed and then run the "gfs_tool reclaim <mount
>> point>" command.
>>
>> [root at omzdwcdrp003 ~]# gfs_tool df /l1load1
>> /l1load1:
>> SB lock proto = "lock_dlm"
>> SB lock table = "DWCDR_prod:l1load1"
>> SB ondisk format = 1309
>> SB multihost format = 1401
>> Block size = 4096
>> Journals = 20
>> Resource Groups = 6936
>> Mounted lock proto = "lock_dlm"
>> Mounted lock table = "DWCDR_prod:l1load1"
>> Mounted host data = ""
>> Journal number = 13
>> Lock module flags =
>> Local flocks = FALSE
>> Local caching = FALSE
>> Oopses OK = FALSE
>>
>> Type           Total          Used           Free           use%
>> ------------------------------------------------------------------------
>> inodes         155300         155300         0              100%
>> metadata       2016995        675430         1341565        33%
>> data           452302809      331558847      120743962      73%
>> [root at omzdwcdrp003 ~]# df -h /l1load1
>> Filesystem            Size  Used Avail Use% Mounted on
>> /dev/mapper/l1load1--vg-l1load1--lv
>>                    1.7T  1.3T  468G  74% /l1load1
>> [root at omzdwcdrp003 ~]# du -sh /l1load1
>> 18G     /l1load1
>>
>> ----
>> Jason Huddleston, RHCE
>> ----
>> PS-USE-Linux
>> Partner Support - Unix Support and Engineering
>> Verizon Information Processing Services
>>
>>
>>
>> Shawn Hood wrote:
>>     
>>> Does GFS reserve blocks for the superuser, a la ext3's "Reserved block
>>> count"?  I've had a ~1.1TB FS report that it's full with df reporting
>>> ~100GB remaining.
>>>
>>>
>>>       
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>     
>
>
>
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081013/bf9ea5b6/attachment.htm>

From beekhof at gmail.com  Mon Oct 13 20:20:04 2008
From: beekhof at gmail.com (Andrew Beekhof)
Date: Mon, 13 Oct 2008 22:20:04 +0200
Subject: [Linux-cluster] Cib.xml --> haresources
In-Reply-To: <b6131fdc0810110912v2819a612ka22ce148b37c4a1@mail.gmail.com>
References: <b6131fdc0810110912v2819a612ka22ce148b37c4a1@mail.gmail.com>
Message-ID: <26ef5e70810131320w167f96fcw8301573a86884c12@mail.gmail.com>

2008/10/11 Stephen Nelson-Smith <sanelson at gmail.com>:
> Hi,
>
> I want to managed a linux-ha cluster (heartbeat 2) with puppet,
> however I think I'm likely to struggle with cib.xml, as it contains
> runtime info, and the cluster software seems like it won't take kindly
> to being deployed automatically.
>
> I think the approach is to put the config in an haresources file, and
> use the haresources conversion to cib.xml tool.  However, how do I put
> the knowledge I've put into my cib.xml (attached) into haresources?

You can't and shouldn't even if you could.
That tool is intended to be run once per cluster and not on a recurring basis.

> Or am I missing the point entirely and there's a much better/easier
> way to do this?

cibadmin -R
http://clusterlabs.org/mw/Image:Configuration_Explained.pdf
>
> S.
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


From kanderso at redhat.com  Mon Oct 13 20:29:41 2008
From: kanderso at redhat.com (Kevin Anderson)
Date: Mon, 13 Oct 2008 15:29:41 -0500
Subject: [Linux-cluster] GFS reserved blocks?
In-Reply-To: <48F3AD2B.3090504@verizon.com>
References: <cfe2fc960810131233n29f64eafj22f68a6695ba5dca@mail.gmail.com>
	<48F3A8D7.3000309@verizon.com>
	<cfe2fc960810131302j3a3e4d40x7a787bc8c1951770@mail.gmail.com>
	<48F3AD2B.3090504@verizon.com>
Message-ID: <1223929781.2991.47.camel@dhcp80-204.msp.redhat.com>

For gfs, the recommended solution is to periodically run gfs_tool
reclaim on your filesystems at a time of your choosing.  Depending on
the frequency of your deletes, this might be once a day or once a week.
The only downside is the during the reclaim operation, the filesystem is
locked from other activities.  As the reclaim is relatively fast, this
doesn't really cause a problem.  But scheduling the command to be run
during "idle" times of the day will mitigate the impact.

We attempted to come up with a method of doing this automatically, but
there are deadlock lock issues between gfs and the vfs layer that
prevent it from being implemented.  In addition, there is still the
issue of when is the right time to do the reclaim, and this would be
application specific.

So, just run gfs_tool reclaim if your storage is getting consumed by
metadata storage.

Kevin

On Mon, 2008-10-13 at 15:18 -0500, Jason Huddleston wrote:
> I've been watching mine do this for about two months now. I think it
> started when I upgraded from RHEL 4.5 to 4.6. The app team only has
> about 18 gig used on that 1.7TB drive but they create and delete allot
> of files because that is the loading area they used when new data
> comes in. In the last month I have seen it go up to 70 to 85% used but
> it usually comes back down to about 50% within about 24 hours.
> Hopefully they will find a fix for this soon.
> 
> ---
> Jay
> 
> Shawn Hood wrote: 
> > I actually just ran the reclaim on a live filesystem and it seems to
> > be working okay now.  Hopefully this isn't problematic, as a large
> > number of operations in the GFS tool suite operate on mounted
> > filesystems.
> > 
> > Shawn
> > 
> > On Mon, Oct 13, 2008 at 4:00 PM, Jason Huddleston
> > <jason.huddleston at verizon.com> wrote:
> >   
> > > Shawn,
> > >   I have been seeing the same thing on one of my clusters (shown below)
> > > under Red Hat 4.6. I found some details on this under an article on the
> > > open-shared root web site
> > > (http://www.open-sharedroot.org/faq/troubleshooting-guide/file-systems/gfs/file-system-full)
> > > and an article in Red Hat's knowledge base
> > > (http://kbase.redhat.com/faq/FAQ_78_10697.shtm). It seems to be a bug in the
> > > reclaim of metadata blocks when an inode is released. I saw a patch
> > > (bz298931) released for this in the 2.99.10 cluster release notes but it was
> > > reverted (bz298931) a few days after it was submitted. The only suggestion
> > > that I have gotten back from Red Hat is to shutdown the app so the GFS
> > > drives are not being accessed and then run the "gfs_tool reclaim <mount
> > > point>" command.
> > > 
> > > [root at omzdwcdrp003 ~]# gfs_tool df /l1load1
> > > /l1load1:
> > > SB lock proto = "lock_dlm"
> > > SB lock table = "DWCDR_prod:l1load1"
> > > SB ondisk format = 1309
> > > SB multihost format = 1401
> > > Block size = 4096
> > > Journals = 20
> > > Resource Groups = 6936
> > > Mounted lock proto = "lock_dlm"
> > > Mounted lock table = "DWCDR_prod:l1load1"
> > > Mounted host data = ""
> > > Journal number = 13
> > > Lock module flags =
> > > Local flocks = FALSE
> > > Local caching = FALSE
> > > Oopses OK = FALSE
> > > 
> > > Type           Total          Used           Free           use%
> > > ------------------------------------------------------------------------
> > > inodes         155300         155300         0              100%
> > > metadata       2016995        675430         1341565        33%
> > > data           452302809      331558847      120743962      73%
> > > [root at omzdwcdrp003 ~]# df -h /l1load1
> > > Filesystem            Size  Used Avail Use% Mounted on
> > > /dev/mapper/l1load1--vg-l1load1--lv
> > >                    1.7T  1.3T  468G  74% /l1load1
> > > [root at omzdwcdrp003 ~]# du -sh /l1load1
> > > 18G     /l1load1
> > > 
> > > ----
> > > Jason Huddleston, RHCE
> > > ----
> > > PS-USE-Linux
> > > Partner Support - Unix Support and Engineering
> > > Verizon Information Processing Services
> > > 
> > > 
> > > 
> > > Shawn Hood wrote:
> > >     
> > > > Does GFS reserve blocks for the superuser, a la ext3's "Reserved block
> > > > count"?  I've had a ~1.1TB FS report that it's full with df reporting
> > > > ~100GB remaining.
> > > > 
> > > > 
> > > >       
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
> > > 
> > >     
> > 
> > 
> > 
> >   
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From shawnlhood at gmail.com  Mon Oct 13 20:33:25 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Mon, 13 Oct 2008 16:33:25 -0400
Subject: [Linux-cluster] GFS reserved blocks?
In-Reply-To: <1223929781.2991.47.camel@dhcp80-204.msp.redhat.com>
References: <cfe2fc960810131233n29f64eafj22f68a6695ba5dca@mail.gmail.com>
	<48F3A8D7.3000309@verizon.com>
	<cfe2fc960810131302j3a3e4d40x7a787bc8c1951770@mail.gmail.com>
	<48F3AD2B.3090504@verizon.com>
	<1223929781.2991.47.camel@dhcp80-204.msp.redhat.com>
Message-ID: <cfe2fc960810131333w18a57806tb734ebf0bc2807c@mail.gmail.com>

Someone give me write access to the FAQ!  I've been compiling these
undocumented (or hard to find) bits of knowledge for some time now.


Shawn

On Mon, Oct 13, 2008 at 4:29 PM, Kevin Anderson <kanderso at redhat.com> wrote:
> For gfs, the recommended solution is to periodically run gfs_tool
> reclaim on your filesystems at a time of your choosing.  Depending on
> the frequency of your deletes, this might be once a day or once a week.
> The only downside is the during the reclaim operation, the filesystem is
> locked from other activities.  As the reclaim is relatively fast, this
> doesn't really cause a problem.  But scheduling the command to be run
> during "idle" times of the day will mitigate the impact.
>
> We attempted to come up with a method of doing this automatically, but
> there are deadlock lock issues between gfs and the vfs layer that
> prevent it from being implemented.  In addition, there is still the
> issue of when is the right time to do the reclaim, and this would be
> application specific.
>
> So, just run gfs_tool reclaim if your storage is getting consumed by
> metadata storage.
>
> Kevin
>
> On Mon, 2008-10-13 at 15:18 -0500, Jason Huddleston wrote:
>> I've been watching mine do this for about two months now. I think it
>> started when I upgraded from RHEL 4.5 to 4.6. The app team only has
>> about 18 gig used on that 1.7TB drive but they create and delete allot
>> of files because that is the loading area they used when new data
>> comes in. In the last month I have seen it go up to 70 to 85% used but
>> it usually comes back down to about 50% within about 24 hours.
>> Hopefully they will find a fix for this soon.
>>
>> ---
>> Jay
>>
>> Shawn Hood wrote:
>> > I actually just ran the reclaim on a live filesystem and it seems to
>> > be working okay now.  Hopefully this isn't problematic, as a large
>> > number of operations in the GFS tool suite operate on mounted
>> > filesystems.
>> >
>> > Shawn
>> >
>> > On Mon, Oct 13, 2008 at 4:00 PM, Jason Huddleston
>> > <jason.huddleston at verizon.com> wrote:
>> >
>> > > Shawn,
>> > >   I have been seeing the same thing on one of my clusters (shown below)
>> > > under Red Hat 4.6. I found some details on this under an article on the
>> > > open-shared root web site
>> > > (http://www.open-sharedroot.org/faq/troubleshooting-guide/file-systems/gfs/file-system-full)
>> > > and an article in Red Hat's knowledge base
>> > > (http://kbase.redhat.com/faq/FAQ_78_10697.shtm). It seems to be a bug in the
>> > > reclaim of metadata blocks when an inode is released. I saw a patch
>> > > (bz298931) released for this in the 2.99.10 cluster release notes but it was
>> > > reverted (bz298931) a few days after it was submitted. The only suggestion
>> > > that I have gotten back from Red Hat is to shutdown the app so the GFS
>> > > drives are not being accessed and then run the "gfs_tool reclaim <mount
>> > > point>" command.
>> > >
>> > > [root at omzdwcdrp003 ~]# gfs_tool df /l1load1
>> > > /l1load1:
>> > > SB lock proto = "lock_dlm"
>> > > SB lock table = "DWCDR_prod:l1load1"
>> > > SB ondisk format = 1309
>> > > SB multihost format = 1401
>> > > Block size = 4096
>> > > Journals = 20
>> > > Resource Groups = 6936
>> > > Mounted lock proto = "lock_dlm"
>> > > Mounted lock table = "DWCDR_prod:l1load1"
>> > > Mounted host data = ""
>> > > Journal number = 13
>> > > Lock module flags =
>> > > Local flocks = FALSE
>> > > Local caching = FALSE
>> > > Oopses OK = FALSE
>> > >
>> > > Type           Total          Used           Free           use%
>> > > ------------------------------------------------------------------------
>> > > inodes         155300         155300         0              100%
>> > > metadata       2016995        675430         1341565        33%
>> > > data           452302809      331558847      120743962      73%
>> > > [root at omzdwcdrp003 ~]# df -h /l1load1
>> > > Filesystem            Size  Used Avail Use% Mounted on
>> > > /dev/mapper/l1load1--vg-l1load1--lv
>> > >                    1.7T  1.3T  468G  74% /l1load1
>> > > [root at omzdwcdrp003 ~]# du -sh /l1load1
>> > > 18G     /l1load1
>> > >
>> > > ----
>> > > Jason Huddleston, RHCE
>> > > ----
>> > > PS-USE-Linux
>> > > Partner Support - Unix Support and Engineering
>> > > Verizon Information Processing Services
>> > >
>> > >
>> > >
>> > > Shawn Hood wrote:
>> > >
>> > > > Does GFS reserve blocks for the superuser, a la ext3's "Reserved block
>> > > > count"?  I've had a ~1.1TB FS report that it's full with df reporting
>> > > > ~100GB remaining.
>> > > >
>> > > >
>> > > >
>> > > --
>> > > Linux-cluster mailing list
>> > > Linux-cluster at redhat.com
>> > > https://www.redhat.com/mailman/listinfo/linux-cluster
>> > >
>> > >
>> >
>> >
>> >
>> >
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


-- 
Shawn Hood
910.670.1819 m


From jason.huddleston at verizon.com  Mon Oct 13 20:39:26 2008
From: jason.huddleston at verizon.com (Jason Huddleston)
Date: Mon, 13 Oct 2008 15:39:26 -0500
Subject: [Linux-cluster] GFS reserved blocks?
In-Reply-To: <cfe2fc960810131333w18a57806tb734ebf0bc2807c@mail.gmail.com>
References: <cfe2fc960810131233n29f64eafj22f68a6695ba5dca@mail.gmail.com>
	<48F3A8D7.3000309@verizon.com>
	<cfe2fc960810131302j3a3e4d40x7a787bc8c1951770@mail.gmail.com>
	<48F3AD2B.3090504@verizon.com>
	<1223929781.2991.47.camel@dhcp80-204.msp.redhat.com>
	<cfe2fc960810131333w18a57806tb734ebf0bc2807c@mail.gmail.com>
Message-ID: <48F3B1FE.3090500@verizon.com>

Sweet. Maybe your notes will save someone else some time. I know it was 
a great resource for me when I set up my first GFS cluster.

---
Jay

Shawn Hood wrote:
> Someone give me write access to the FAQ!  I've been compiling these
> undocumented (or hard to find) bits of knowledge for some time now.
>
>
> Shawn
>
> On Mon, Oct 13, 2008 at 4:29 PM, Kevin Anderson <kanderso at redhat.com> wrote:
>   
>> For gfs, the recommended solution is to periodically run gfs_tool
>> reclaim on your filesystems at a time of your choosing.  Depending on
>> the frequency of your deletes, this might be once a day or once a week.
>> The only downside is the during the reclaim operation, the filesystem is
>> locked from other activities.  As the reclaim is relatively fast, this
>> doesn't really cause a problem.  But scheduling the command to be run
>> during "idle" times of the day will mitigate the impact.
>>
>> We attempted to come up with a method of doing this automatically, but
>> there are deadlock lock issues between gfs and the vfs layer that
>> prevent it from being implemented.  In addition, there is still the
>> issue of when is the right time to do the reclaim, and this would be
>> application specific.
>>
>> So, just run gfs_tool reclaim if your storage is getting consumed by
>> metadata storage.
>>
>> Kevin
>>
>> On Mon, 2008-10-13 at 15:18 -0500, Jason Huddleston wrote:
>>     
>>> I've been watching mine do this for about two months now. I think it
>>> started when I upgraded from RHEL 4.5 to 4.6. The app team only has
>>> about 18 gig used on that 1.7TB drive but they create and delete allot
>>> of files because that is the loading area they used when new data
>>> comes in. In the last month I have seen it go up to 70 to 85% used but
>>> it usually comes back down to about 50% within about 24 hours.
>>> Hopefully they will find a fix for this soon.
>>>
>>> ---
>>> Jay
>>>
>>> Shawn Hood wrote:
>>>       
>>>> I actually just ran the reclaim on a live filesystem and it seems to
>>>> be working okay now.  Hopefully this isn't problematic, as a large
>>>> number of operations in the GFS tool suite operate on mounted
>>>> filesystems.
>>>>
>>>> Shawn
>>>>
>>>> On Mon, Oct 13, 2008 at 4:00 PM, Jason Huddleston
>>>> <jason.huddleston at verizon.com> wrote:
>>>>
>>>>         
>>>>> Shawn,
>>>>>   I have been seeing the same thing on one of my clusters (shown below)
>>>>> under Red Hat 4.6. I found some details on this under an article on the
>>>>> open-shared root web site
>>>>> (http://www.open-sharedroot.org/faq/troubleshooting-guide/file-systems/gfs/file-system-full)
>>>>> and an article in Red Hat's knowledge base
>>>>> (http://kbase.redhat.com/faq/FAQ_78_10697.shtm). It seems to be a bug in the
>>>>> reclaim of metadata blocks when an inode is released. I saw a patch
>>>>> (bz298931) released for this in the 2.99.10 cluster release notes but it was
>>>>> reverted (bz298931) a few days after it was submitted. The only suggestion
>>>>> that I have gotten back from Red Hat is to shutdown the app so the GFS
>>>>> drives are not being accessed and then run the "gfs_tool reclaim <mount
>>>>> point>" command.
>>>>>
>>>>> [root at omzdwcdrp003 ~]# gfs_tool df /l1load1
>>>>> /l1load1:
>>>>> SB lock proto = "lock_dlm"
>>>>> SB lock table = "DWCDR_prod:l1load1"
>>>>> SB ondisk format = 1309
>>>>> SB multihost format = 1401
>>>>> Block size = 4096
>>>>> Journals = 20
>>>>> Resource Groups = 6936
>>>>> Mounted lock proto = "lock_dlm"
>>>>> Mounted lock table = "DWCDR_prod:l1load1"
>>>>> Mounted host data = ""
>>>>> Journal number = 13
>>>>> Lock module flags =
>>>>> Local flocks = FALSE
>>>>> Local caching = FALSE
>>>>> Oopses OK = FALSE
>>>>>
>>>>> Type           Total          Used           Free           use%
>>>>> ------------------------------------------------------------------------
>>>>> inodes         155300         155300         0              100%
>>>>> metadata       2016995        675430         1341565        33%
>>>>> data           452302809      331558847      120743962      73%
>>>>> [root at omzdwcdrp003 ~]# df -h /l1load1
>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>> /dev/mapper/l1load1--vg-l1load1--lv
>>>>>                    1.7T  1.3T  468G  74% /l1load1
>>>>> [root at omzdwcdrp003 ~]# du -sh /l1load1
>>>>> 18G     /l1load1
>>>>>
>>>>> ----
>>>>> Jason Huddleston, RHCE
>>>>> ----
>>>>> PS-USE-Linux
>>>>> Partner Support - Unix Support and Engineering
>>>>> Verizon Information Processing Services
>>>>>
>>>>>
>>>>>
>>>>> Shawn Hood wrote:
>>>>>
>>>>>           
>>>>>> Does GFS reserve blocks for the superuser, a la ext3's "Reserved block
>>>>>> count"?  I've had a ~1.1TB FS report that it's full with df reporting
>>>>>> ~100GB remaining.
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>> --
>>>>> Linux-cluster mailing list
>>>>> Linux-cluster at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>>>
>>>>>
>>>>>           
>>>>
>>>>
>>>>         
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>       
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>     
>
>
>
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081013/cfee2bd0/attachment.htm>

From shawnlhood at gmail.com  Mon Oct 13 21:32:42 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Mon, 13 Oct 2008 17:32:42 -0400
Subject: [Linux-cluster] Re: GFS hanging on 3 node RHEL4 cluster
In-Reply-To: <cfe2fc960810071040s7f0dd5e8qb60873f82e458d91@mail.gmail.com>
References: <cfe2fc960810071033x5abf179fw7f19fa5f56a4f8ed@mail.gmail.com>
	<cfe2fc960810071040s7f0dd5e8qb60873f82e458d91@mail.gmail.com>
Message-ID: <cfe2fc960810131432y4e04d24bo7e59efc927b1ec51@mail.gmail.com>

As a heads up, I'm about to open a high priority bug on this.  It's
crippling us.  Also, I meant to say it is a 4 node cluster, not a 3
node.

Please let me know if I can provide any more information in addition
to this.  I will provide the information from a time series of
gfs_tool counters commands with the support request.

Shawn

On Tue, Oct 7, 2008 at 1:40 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
> More info:
>
> All filesystems mounted using noatime,nodiratime,noquota.
>
> All filesystems report the same data from gfs_tool gettune:
>
> limit1 = 100
> ilimit1_tries = 3
> ilimit1_min = 1
> ilimit2 = 500
> ilimit2_tries = 10
> ilimit2_min = 3
> demote_secs = 300
> incore_log_blocks = 1024
> jindex_refresh_secs = 60
> depend_secs = 60
> scand_secs = 5
> recoverd_secs = 60
> logd_secs = 1
> quotad_secs = 5
> inoded_secs = 15
> glock_purge = 0
> quota_simul_sync = 64
> quota_warn_period = 10
> atime_quantum = 3600
> quota_quantum = 60
> quota_scale = 1.0000   (1, 1)
> quota_enforce = 0
> quota_account = 0
> new_files_jdata = 0
> new_files_directio = 0
> max_atomic_write = 4194304
> max_readahead = 262144
> lockdump_size = 131072
> stall_secs = 600
> complain_secs = 10
> reclaim_limit = 5000
> entries_per_readdir = 32
> prefetch_secs = 10
> statfs_slots = 64
> max_mhc = 10000
> greedy_default = 100
> greedy_quantum = 25
> greedy_max = 250
> rgrp_try_threshold = 100
> statfs_fast = 0
> seq_readahead = 0
>
>
> And data on the FS from gfs_tool counters:
>                                  locks 2948
>                             locks held 1352
>                           freeze count 0
>                          incore inodes 1347
>                       metadata buffers 0
>                        unlinked inodes 0
>                              quota IDs 0
>                     incore log buffers 0
>                         log space used 0.05%
>              meta header cache entries 0
>                     glock dependencies 0
>                 glocks on reclaim list 0
>                              log wraps 2
>                   outstanding LM calls 0
>                  outstanding BIO calls 0
>                       fh2dentry misses 0
>                       glocks reclaimed 223287
>                         glock nq calls 1812286
>                         glock dq calls 1810926
>                   glock prefetch calls 101158
>                          lm_lock calls 198294
>                        lm_unlock calls 142643
>                           lm callbacks 341621
>                     address operations 502691
>                      dentry operations 395330
>                      export operations 0
>                        file operations 199243
>                       inode operations 984276
>                       super operations 1727082
>                          vm operations 0
>                        block I/O reads 520531
>                       block I/O writes 130315
>
>                                  locks 171423
>                             locks held 85717
>                           freeze count 0
>                          incore inodes 85376
>                       metadata buffers 1474
>                        unlinked inodes 0
>                              quota IDs 0
>                     incore log buffers 24
>                         log space used 0.83%
>              meta header cache entries 6621
>                     glock dependencies 2037
>                 glocks on reclaim list 0
>                              log wraps 428
>                   outstanding LM calls 0
>                  outstanding BIO calls 0
>                       fh2dentry misses 0
>                       glocks reclaimed 45784677
>                         glock nq calls 962822941
>                         glock dq calls 962595532
>                   glock prefetch calls 20215922
>                          lm_lock calls 40708633
>                        lm_unlock calls 23410498
>                           lm callbacks 64156052
>                     address operations 705464659
>                      dentry operations 19701522
>                      export operations 0
>                        file operations 364990733
>                       inode operations 98910127
>                       super operations 440061034
>                          vm operations 7
>                        block I/O reads 90394984
>                       block I/O writes 131199864
>
>                                  locks 2916542
>                             locks held 1476005
>                           freeze count 0
>                          incore inodes 1454165
>                       metadata buffers 12539
>                        unlinked inodes 100
>                              quota IDs 0
>                     incore log buffers 11
>                         log space used 13.33%
>              meta header cache entries 9928
>                     glock dependencies 110
>                 glocks on reclaim list 0
>                              log wraps 2393
>                   outstanding LM calls 25
>                  outstanding BIO calls 0
>                       fh2dentry misses 55546
>                       glocks reclaimed 127341056
>                         glock nq calls 867427
>                         glock dq calls 867430
>                   glock prefetch calls 36679316
>                          lm_lock calls 110179878
>                        lm_unlock calls 84588424
>                           lm callbacks 194863553
>                     address operations 250891447
>                      dentry operations 359537343
>                      export operations 390941288
>                        file operations 399156716
>                       inode operations 537830
>                       super operations 1093798409
>                          vm operations 774785
>                        block I/O reads 258044208
>                       block I/O writes 101585172
>
>
>
> On Tue, Oct 7, 2008 at 1:33 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>> Problem:
>> It seems that IO on one machine in the cluster (not always the same
>> machine) will hang and all processes accessing clustered LVs will
>> block.  Other machines will follow suit shortly thereafter until the
>> machine that first exhibited the problem is rebooted (via fence_drac
>> manually).  No messages in dmesg, syslog, etc.  Filesystems recently
>> fsckd.
>>
>> Hardware:
>> Dell 1950s (similar except memory -- 3x 16GB RAM, 1x 8GB RAM).
>> Running RHEL4 ES U7.  Four machines
>> Onboard gigabit NICs (Machines use little bandwidth, and all network
>> traffic including DLM share NICs)
>> QLogic 2462 PCI-Express dual channel FC HBAs
>> QLogic SANBox 5200 FC switch
>> Apple XRAID which presents as two LUNs (~4.5TB raw aggregate)
>> Cisco Catalyst switch
>>
>> Simple four machine RHEL4 U7 cluster running kernel 2.6.9-78.0.1.ELsmp
>> x86_64 with the following packages:
>> ccs-1.0.12-1
>> cman-1.0.24-1
>> cman-kernel-smp-2.6.9-55.13.el4_7.1
>> cman-kernheaders-2.6.9-55.13.el4_7.1
>> dlm-kernel-smp-2.6.9-54.11.el4_7.1
>> dlm-kernheaders-2.6.9-54.11.el4_7.1
>> fence-1.32.63-1.el4_7.1
>> GFS-6.1.18-1
>> GFS-kernel-smp-2.6.9-80.9.el4_7.1
>>
>> One clustered VG.  Striped across two physical volumes, which
>> correspond to each side of an Apple XRAID.
>> Clustered volume group info:
>>  --- Volume group ---
>>  VG Name               hq-san
>>  System ID
>>  Format                lvm2
>>  Metadata Areas        2
>>  Metadata Sequence No  50
>>  VG Access             read/write
>>  VG Status             resizable
>>  Clustered             yes
>>  Shared                no
>>  MAX LV                0
>>  Cur LV                3
>>  Open LV               3
>>  Max PV                0
>>  Cur PV                2
>>  Act PV                2
>>  VG Size               4.55 TB
>>  PE Size               4.00 MB
>>  Total PE              1192334
>>  Alloc PE / Size       905216 / 3.45 TB
>>  Free  PE / Size       287118 / 1.10 TB
>>  VG UUID               hfeIhf-fzEq-clCf-b26M-cMy3-pphm-B6wmLv
>>
>> Logical volumes contained with hq-san VG:
>>  cam_development   hq-san                          -wi-ao 500.00G
>>  qa            hq-san                          -wi-ao   1.07T
>>  svn_users         hq-san                          -wi-ao   1.89T
>>
>> All four machines mount svn_users, two machines mount qa, and one
>> mounts cam_development.
>>
>> /etc/cluster/cluster.conf:
>>
>> <?xml version="1.0"?>
>> <cluster alias="tungsten" config_version="31" name="qualia">
>>        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>        <clusternodes>
>>                <clusternode name="odin" votes="1">
>>                        <fence>
>>                                <method name="1">
>>                    <device modulename="" name="odin-drac"/>
>>                </method>
>>                        </fence>
>>                </clusternode>
>>                <clusternode name="hugin" votes="1">
>>                        <fence>
>>                                <method name="1">
>>                    <device modulename="" name="hugin-drac"/>
>>                </method>
>>                        </fence>
>>                </clusternode>
>>                <clusternode name="munin" votes="1">
>>                        <fence>
>>                                <method name="1">
>>                    <device modulename="" name="munin-drac"/>
>>                </method>
>>                        </fence>
>>                </clusternode>
>>                <clusternode name="zeus" votes="1">
>>                        <fence>
>>                                <method name="1">
>>                    <device modulename="" name="zeus-drac"/>
>>                </method>
>>                        </fence>
>>                </clusternode>
>>    </clusternodes>
>>        <cman expected_votes="1" two_node="0"/>
>>        <fencedevices>
>>                <resources/>
>>                <fencedevice name="odin-drac" agent="fence_drac"
>> ipaddr="redacted" login="root" passwd="redacted"/>
>>                <fencedevice name="hugin-drac" agent="fence_drac"
>> ipaddr="redacted" login="root" passwd="redacted"/>
>>                <fencedevice name="munin-drac" agent="fence_drac"
>> ipaddr="redacted" login="root" passwd="redacted"/>
>>                <fencedevice name="zeus-drac" agent="fence_drac"
>> ipaddr="redacted" login="root" passwd="redacted"/>
>>        </fencedevices>
>>        <rm>
>>        <failoverdomains/>
>>        <resources/>
>>    </rm>
>> </cluster>
>>
>>
>>
>>
>> --
>> Shawn Hood
>> 910.670.1819 m
>>
>
>
>
> --
> Shawn Hood
> 910.670.1819 m
>


-- 
Shawn Hood
910.670.1819 m


From shawnlhood at gmail.com  Mon Oct 13 21:32:54 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Mon, 13 Oct 2008 17:32:54 -0400
Subject: [Linux-cluster] Re: GFS hanging on 3 node RHEL4 cluster
In-Reply-To: <cfe2fc960810131432y4e04d24bo7e59efc927b1ec51@mail.gmail.com>
References: <cfe2fc960810071033x5abf179fw7f19fa5f56a4f8ed@mail.gmail.com>
	<cfe2fc960810071040s7f0dd5e8qb60873f82e458d91@mail.gmail.com>
	<cfe2fc960810131432y4e04d24bo7e59efc927b1ec51@mail.gmail.com>
Message-ID: <cfe2fc960810131432i34c34277g60fc568580591d11@mail.gmail.com>

High priorty support request, I mean.

On Mon, Oct 13, 2008 at 5:32 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
> As a heads up, I'm about to open a high priority bug on this.  It's
> crippling us.  Also, I meant to say it is a 4 node cluster, not a 3
> node.
>
> Please let me know if I can provide any more information in addition
> to this.  I will provide the information from a time series of
> gfs_tool counters commands with the support request.
>
> Shawn
>
> On Tue, Oct 7, 2008 at 1:40 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>> More info:
>>
>> All filesystems mounted using noatime,nodiratime,noquota.
>>
>> All filesystems report the same data from gfs_tool gettune:
>>
>> limit1 = 100
>> ilimit1_tries = 3
>> ilimit1_min = 1
>> ilimit2 = 500
>> ilimit2_tries = 10
>> ilimit2_min = 3
>> demote_secs = 300
>> incore_log_blocks = 1024
>> jindex_refresh_secs = 60
>> depend_secs = 60
>> scand_secs = 5
>> recoverd_secs = 60
>> logd_secs = 1
>> quotad_secs = 5
>> inoded_secs = 15
>> glock_purge = 0
>> quota_simul_sync = 64
>> quota_warn_period = 10
>> atime_quantum = 3600
>> quota_quantum = 60
>> quota_scale = 1.0000   (1, 1)
>> quota_enforce = 0
>> quota_account = 0
>> new_files_jdata = 0
>> new_files_directio = 0
>> max_atomic_write = 4194304
>> max_readahead = 262144
>> lockdump_size = 131072
>> stall_secs = 600
>> complain_secs = 10
>> reclaim_limit = 5000
>> entries_per_readdir = 32
>> prefetch_secs = 10
>> statfs_slots = 64
>> max_mhc = 10000
>> greedy_default = 100
>> greedy_quantum = 25
>> greedy_max = 250
>> rgrp_try_threshold = 100
>> statfs_fast = 0
>> seq_readahead = 0
>>
>>
>> And data on the FS from gfs_tool counters:
>>                                  locks 2948
>>                             locks held 1352
>>                           freeze count 0
>>                          incore inodes 1347
>>                       metadata buffers 0
>>                        unlinked inodes 0
>>                              quota IDs 0
>>                     incore log buffers 0
>>                         log space used 0.05%
>>              meta header cache entries 0
>>                     glock dependencies 0
>>                 glocks on reclaim list 0
>>                              log wraps 2
>>                   outstanding LM calls 0
>>                  outstanding BIO calls 0
>>                       fh2dentry misses 0
>>                       glocks reclaimed 223287
>>                         glock nq calls 1812286
>>                         glock dq calls 1810926
>>                   glock prefetch calls 101158
>>                          lm_lock calls 198294
>>                        lm_unlock calls 142643
>>                           lm callbacks 341621
>>                     address operations 502691
>>                      dentry operations 395330
>>                      export operations 0
>>                        file operations 199243
>>                       inode operations 984276
>>                       super operations 1727082
>>                          vm operations 0
>>                        block I/O reads 520531
>>                       block I/O writes 130315
>>
>>                                  locks 171423
>>                             locks held 85717
>>                           freeze count 0
>>                          incore inodes 85376
>>                       metadata buffers 1474
>>                        unlinked inodes 0
>>                              quota IDs 0
>>                     incore log buffers 24
>>                         log space used 0.83%
>>              meta header cache entries 6621
>>                     glock dependencies 2037
>>                 glocks on reclaim list 0
>>                              log wraps 428
>>                   outstanding LM calls 0
>>                  outstanding BIO calls 0
>>                       fh2dentry misses 0
>>                       glocks reclaimed 45784677
>>                         glock nq calls 962822941
>>                         glock dq calls 962595532
>>                   glock prefetch calls 20215922
>>                          lm_lock calls 40708633
>>                        lm_unlock calls 23410498
>>                           lm callbacks 64156052
>>                     address operations 705464659
>>                      dentry operations 19701522
>>                      export operations 0
>>                        file operations 364990733
>>                       inode operations 98910127
>>                       super operations 440061034
>>                          vm operations 7
>>                        block I/O reads 90394984
>>                       block I/O writes 131199864
>>
>>                                  locks 2916542
>>                             locks held 1476005
>>                           freeze count 0
>>                          incore inodes 1454165
>>                       metadata buffers 12539
>>                        unlinked inodes 100
>>                              quota IDs 0
>>                     incore log buffers 11
>>                         log space used 13.33%
>>              meta header cache entries 9928
>>                     glock dependencies 110
>>                 glocks on reclaim list 0
>>                              log wraps 2393
>>                   outstanding LM calls 25
>>                  outstanding BIO calls 0
>>                       fh2dentry misses 55546
>>                       glocks reclaimed 127341056
>>                         glock nq calls 867427
>>                         glock dq calls 867430
>>                   glock prefetch calls 36679316
>>                          lm_lock calls 110179878
>>                        lm_unlock calls 84588424
>>                           lm callbacks 194863553
>>                     address operations 250891447
>>                      dentry operations 359537343
>>                      export operations 390941288
>>                        file operations 399156716
>>                       inode operations 537830
>>                       super operations 1093798409
>>                          vm operations 774785
>>                        block I/O reads 258044208
>>                       block I/O writes 101585172
>>
>>
>>
>> On Tue, Oct 7, 2008 at 1:33 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>>> Problem:
>>> It seems that IO on one machine in the cluster (not always the same
>>> machine) will hang and all processes accessing clustered LVs will
>>> block.  Other machines will follow suit shortly thereafter until the
>>> machine that first exhibited the problem is rebooted (via fence_drac
>>> manually).  No messages in dmesg, syslog, etc.  Filesystems recently
>>> fsckd.
>>>
>>> Hardware:
>>> Dell 1950s (similar except memory -- 3x 16GB RAM, 1x 8GB RAM).
>>> Running RHEL4 ES U7.  Four machines
>>> Onboard gigabit NICs (Machines use little bandwidth, and all network
>>> traffic including DLM share NICs)
>>> QLogic 2462 PCI-Express dual channel FC HBAs
>>> QLogic SANBox 5200 FC switch
>>> Apple XRAID which presents as two LUNs (~4.5TB raw aggregate)
>>> Cisco Catalyst switch
>>>
>>> Simple four machine RHEL4 U7 cluster running kernel 2.6.9-78.0.1.ELsmp
>>> x86_64 with the following packages:
>>> ccs-1.0.12-1
>>> cman-1.0.24-1
>>> cman-kernel-smp-2.6.9-55.13.el4_7.1
>>> cman-kernheaders-2.6.9-55.13.el4_7.1
>>> dlm-kernel-smp-2.6.9-54.11.el4_7.1
>>> dlm-kernheaders-2.6.9-54.11.el4_7.1
>>> fence-1.32.63-1.el4_7.1
>>> GFS-6.1.18-1
>>> GFS-kernel-smp-2.6.9-80.9.el4_7.1
>>>
>>> One clustered VG.  Striped across two physical volumes, which
>>> correspond to each side of an Apple XRAID.
>>> Clustered volume group info:
>>>  --- Volume group ---
>>>  VG Name               hq-san
>>>  System ID
>>>  Format                lvm2
>>>  Metadata Areas        2
>>>  Metadata Sequence No  50
>>>  VG Access             read/write
>>>  VG Status             resizable
>>>  Clustered             yes
>>>  Shared                no
>>>  MAX LV                0
>>>  Cur LV                3
>>>  Open LV               3
>>>  Max PV                0
>>>  Cur PV                2
>>>  Act PV                2
>>>  VG Size               4.55 TB
>>>  PE Size               4.00 MB
>>>  Total PE              1192334
>>>  Alloc PE / Size       905216 / 3.45 TB
>>>  Free  PE / Size       287118 / 1.10 TB
>>>  VG UUID               hfeIhf-fzEq-clCf-b26M-cMy3-pphm-B6wmLv
>>>
>>> Logical volumes contained with hq-san VG:
>>>  cam_development   hq-san                          -wi-ao 500.00G
>>>  qa            hq-san                          -wi-ao   1.07T
>>>  svn_users         hq-san                          -wi-ao   1.89T
>>>
>>> All four machines mount svn_users, two machines mount qa, and one
>>> mounts cam_development.
>>>
>>> /etc/cluster/cluster.conf:
>>>
>>> <?xml version="1.0"?>
>>> <cluster alias="tungsten" config_version="31" name="qualia">
>>>        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>>        <clusternodes>
>>>                <clusternode name="odin" votes="1">
>>>                        <fence>
>>>                                <method name="1">
>>>                    <device modulename="" name="odin-drac"/>
>>>                </method>
>>>                        </fence>
>>>                </clusternode>
>>>                <clusternode name="hugin" votes="1">
>>>                        <fence>
>>>                                <method name="1">
>>>                    <device modulename="" name="hugin-drac"/>
>>>                </method>
>>>                        </fence>
>>>                </clusternode>
>>>                <clusternode name="munin" votes="1">
>>>                        <fence>
>>>                                <method name="1">
>>>                    <device modulename="" name="munin-drac"/>
>>>                </method>
>>>                        </fence>
>>>                </clusternode>
>>>                <clusternode name="zeus" votes="1">
>>>                        <fence>
>>>                                <method name="1">
>>>                    <device modulename="" name="zeus-drac"/>
>>>                </method>
>>>                        </fence>
>>>                </clusternode>
>>>    </clusternodes>
>>>        <cman expected_votes="1" two_node="0"/>
>>>        <fencedevices>
>>>                <resources/>
>>>                <fencedevice name="odin-drac" agent="fence_drac"
>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>                <fencedevice name="hugin-drac" agent="fence_drac"
>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>                <fencedevice name="munin-drac" agent="fence_drac"
>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>                <fencedevice name="zeus-drac" agent="fence_drac"
>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>        </fencedevices>
>>>        <rm>
>>>        <failoverdomains/>
>>>        <resources/>
>>>    </rm>
>>> </cluster>
>>>
>>>
>>>
>>>
>>> --
>>> Shawn Hood
>>> 910.670.1819 m
>>>
>>
>>
>>
>> --
>> Shawn Hood
>> 910.670.1819 m
>>
>
>
>
> --
> Shawn Hood
> 910.670.1819 m
>


-- 
Shawn Hood
910.670.1819 m


From jason.huddleston at verizon.com  Mon Oct 13 21:38:15 2008
From: jason.huddleston at verizon.com (Jason Huddleston)
Date: Mon, 13 Oct 2008 16:38:15 -0500
Subject: [Linux-cluster] Re: GFS hanging on 3 node RHEL4 cluster
In-Reply-To: <cfe2fc960810131432i34c34277g60fc568580591d11@mail.gmail.com>
References: <cfe2fc960810071033x5abf179fw7f19fa5f56a4f8ed@mail.gmail.com>
	<cfe2fc960810071040s7f0dd5e8qb60873f82e458d91@mail.gmail.com>
	<cfe2fc960810131432y4e04d24bo7e59efc927b1ec51@mail.gmail.com>
	<cfe2fc960810131432i34c34277g60fc568580591d11@mail.gmail.com>
Message-ID: <48F3BFC7.9090307@verizon.com>

Shawn,
    Looking at the output below you may want to try and increase 
statfs_slots to 256. Also, if you have any disk monitoring utilities 
that monitor drive usage you may want to set statfs_fast equal to 1.

---
Jay
Shawn Hood wrote:
> High priorty support request, I mean.
>
> On Mon, Oct 13, 2008 at 5:32 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>   
>> As a heads up, I'm about to open a high priority bug on this.  It's
>> crippling us.  Also, I meant to say it is a 4 node cluster, not a 3
>> node.
>>
>> Please let me know if I can provide any more information in addition
>> to this.  I will provide the information from a time series of
>> gfs_tool counters commands with the support request.
>>
>> Shawn
>>
>> On Tue, Oct 7, 2008 at 1:40 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>>     
>>> More info:
>>>
>>> All filesystems mounted using noatime,nodiratime,noquota.
>>>
>>> All filesystems report the same data from gfs_tool gettune:
>>>
>>> limit1 = 100
>>> ilimit1_tries = 3
>>> ilimit1_min = 1
>>> ilimit2 = 500
>>> ilimit2_tries = 10
>>> ilimit2_min = 3
>>> demote_secs = 300
>>> incore_log_blocks = 1024
>>> jindex_refresh_secs = 60
>>> depend_secs = 60
>>> scand_secs = 5
>>> recoverd_secs = 60
>>> logd_secs = 1
>>> quotad_secs = 5
>>> inoded_secs = 15
>>> glock_purge = 0
>>> quota_simul_sync = 64
>>> quota_warn_period = 10
>>> atime_quantum = 3600
>>> quota_quantum = 60
>>> quota_scale = 1.0000   (1, 1)
>>> quota_enforce = 0
>>> quota_account = 0
>>> new_files_jdata = 0
>>> new_files_directio = 0
>>> max_atomic_write = 4194304
>>> max_readahead = 262144
>>> lockdump_size = 131072
>>> stall_secs = 600
>>> complain_secs = 10
>>> reclaim_limit = 5000
>>> entries_per_readdir = 32
>>> prefetch_secs = 10
>>> statfs_slots = 64
>>> max_mhc = 10000
>>> greedy_default = 100
>>> greedy_quantum = 25
>>> greedy_max = 250
>>> rgrp_try_threshold = 100
>>> statfs_fast = 0
>>> seq_readahead = 0
>>>
>>>
>>> And data on the FS from gfs_tool counters:
>>>                                  locks 2948
>>>                             locks held 1352
>>>                           freeze count 0
>>>                          incore inodes 1347
>>>                       metadata buffers 0
>>>                        unlinked inodes 0
>>>                              quota IDs 0
>>>                     incore log buffers 0
>>>                         log space used 0.05%
>>>              meta header cache entries 0
>>>                     glock dependencies 0
>>>                 glocks on reclaim list 0
>>>                              log wraps 2
>>>                   outstanding LM calls 0
>>>                  outstanding BIO calls 0
>>>                       fh2dentry misses 0
>>>                       glocks reclaimed 223287
>>>                         glock nq calls 1812286
>>>                         glock dq calls 1810926
>>>                   glock prefetch calls 101158
>>>                          lm_lock calls 198294
>>>                        lm_unlock calls 142643
>>>                           lm callbacks 341621
>>>                     address operations 502691
>>>                      dentry operations 395330
>>>                      export operations 0
>>>                        file operations 199243
>>>                       inode operations 984276
>>>                       super operations 1727082
>>>                          vm operations 0
>>>                        block I/O reads 520531
>>>                       block I/O writes 130315
>>>
>>>                                  locks 171423
>>>                             locks held 85717
>>>                           freeze count 0
>>>                          incore inodes 85376
>>>                       metadata buffers 1474
>>>                        unlinked inodes 0
>>>                              quota IDs 0
>>>                     incore log buffers 24
>>>                         log space used 0.83%
>>>              meta header cache entries 6621
>>>                     glock dependencies 2037
>>>                 glocks on reclaim list 0
>>>                              log wraps 428
>>>                   outstanding LM calls 0
>>>                  outstanding BIO calls 0
>>>                       fh2dentry misses 0
>>>                       glocks reclaimed 45784677
>>>                         glock nq calls 962822941
>>>                         glock dq calls 962595532
>>>                   glock prefetch calls 20215922
>>>                          lm_lock calls 40708633
>>>                        lm_unlock calls 23410498
>>>                           lm callbacks 64156052
>>>                     address operations 705464659
>>>                      dentry operations 19701522
>>>                      export operations 0
>>>                        file operations 364990733
>>>                       inode operations 98910127
>>>                       super operations 440061034
>>>                          vm operations 7
>>>                        block I/O reads 90394984
>>>                       block I/O writes 131199864
>>>
>>>                                  locks 2916542
>>>                             locks held 1476005
>>>                           freeze count 0
>>>                          incore inodes 1454165
>>>                       metadata buffers 12539
>>>                        unlinked inodes 100
>>>                              quota IDs 0
>>>                     incore log buffers 11
>>>                         log space used 13.33%
>>>              meta header cache entries 9928
>>>                     glock dependencies 110
>>>                 glocks on reclaim list 0
>>>                              log wraps 2393
>>>                   outstanding LM calls 25
>>>                  outstanding BIO calls 0
>>>                       fh2dentry misses 55546
>>>                       glocks reclaimed 127341056
>>>                         glock nq calls 867427
>>>                         glock dq calls 867430
>>>                   glock prefetch calls 36679316
>>>                          lm_lock calls 110179878
>>>                        lm_unlock calls 84588424
>>>                           lm callbacks 194863553
>>>                     address operations 250891447
>>>                      dentry operations 359537343
>>>                      export operations 390941288
>>>                        file operations 399156716
>>>                       inode operations 537830
>>>                       super operations 1093798409
>>>                          vm operations 774785
>>>                        block I/O reads 258044208
>>>                       block I/O writes 101585172
>>>
>>>
>>>
>>> On Tue, Oct 7, 2008 at 1:33 PM, Shawn Hood <shawnlhood at gmail.com> wrote:
>>>       
>>>> Problem:
>>>> It seems that IO on one machine in the cluster (not always the same
>>>> machine) will hang and all processes accessing clustered LVs will
>>>> block.  Other machines will follow suit shortly thereafter until the
>>>> machine that first exhibited the problem is rebooted (via fence_drac
>>>> manually).  No messages in dmesg, syslog, etc.  Filesystems recently
>>>> fsckd.
>>>>
>>>> Hardware:
>>>> Dell 1950s (similar except memory -- 3x 16GB RAM, 1x 8GB RAM).
>>>> Running RHEL4 ES U7.  Four machines
>>>> Onboard gigabit NICs (Machines use little bandwidth, and all network
>>>> traffic including DLM share NICs)
>>>> QLogic 2462 PCI-Express dual channel FC HBAs
>>>> QLogic SANBox 5200 FC switch
>>>> Apple XRAID which presents as two LUNs (~4.5TB raw aggregate)
>>>> Cisco Catalyst switch
>>>>
>>>> Simple four machine RHEL4 U7 cluster running kernel 2.6.9-78.0.1.ELsmp
>>>> x86_64 with the following packages:
>>>> ccs-1.0.12-1
>>>> cman-1.0.24-1
>>>> cman-kernel-smp-2.6.9-55.13.el4_7.1
>>>> cman-kernheaders-2.6.9-55.13.el4_7.1
>>>> dlm-kernel-smp-2.6.9-54.11.el4_7.1
>>>> dlm-kernheaders-2.6.9-54.11.el4_7.1
>>>> fence-1.32.63-1.el4_7.1
>>>> GFS-6.1.18-1
>>>> GFS-kernel-smp-2.6.9-80.9.el4_7.1
>>>>
>>>> One clustered VG.  Striped across two physical volumes, which
>>>> correspond to each side of an Apple XRAID.
>>>> Clustered volume group info:
>>>>  --- Volume group ---
>>>>  VG Name               hq-san
>>>>  System ID
>>>>  Format                lvm2
>>>>  Metadata Areas        2
>>>>  Metadata Sequence No  50
>>>>  VG Access             read/write
>>>>  VG Status             resizable
>>>>  Clustered             yes
>>>>  Shared                no
>>>>  MAX LV                0
>>>>  Cur LV                3
>>>>  Open LV               3
>>>>  Max PV                0
>>>>  Cur PV                2
>>>>  Act PV                2
>>>>  VG Size               4.55 TB
>>>>  PE Size               4.00 MB
>>>>  Total PE              1192334
>>>>  Alloc PE / Size       905216 / 3.45 TB
>>>>  Free  PE / Size       287118 / 1.10 TB
>>>>  VG UUID               hfeIhf-fzEq-clCf-b26M-cMy3-pphm-B6wmLv
>>>>
>>>> Logical volumes contained with hq-san VG:
>>>>  cam_development   hq-san                          -wi-ao 500.00G
>>>>  qa            hq-san                          -wi-ao   1.07T
>>>>  svn_users         hq-san                          -wi-ao   1.89T
>>>>
>>>> All four machines mount svn_users, two machines mount qa, and one
>>>> mounts cam_development.
>>>>
>>>> /etc/cluster/cluster.conf:
>>>>
>>>> <?xml version="1.0"?>
>>>> <cluster alias="tungsten" config_version="31" name="qualia">
>>>>        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>>>        <clusternodes>
>>>>                <clusternode name="odin" votes="1">
>>>>                        <fence>
>>>>                                <method name="1">
>>>>                    <device modulename="" name="odin-drac"/>
>>>>                </method>
>>>>                        </fence>
>>>>                </clusternode>
>>>>                <clusternode name="hugin" votes="1">
>>>>                        <fence>
>>>>                                <method name="1">
>>>>                    <device modulename="" name="hugin-drac"/>
>>>>                </method>
>>>>                        </fence>
>>>>                </clusternode>
>>>>                <clusternode name="munin" votes="1">
>>>>                        <fence>
>>>>                                <method name="1">
>>>>                    <device modulename="" name="munin-drac"/>
>>>>                </method>
>>>>                        </fence>
>>>>                </clusternode>
>>>>                <clusternode name="zeus" votes="1">
>>>>                        <fence>
>>>>                                <method name="1">
>>>>                    <device modulename="" name="zeus-drac"/>
>>>>                </method>
>>>>                        </fence>
>>>>                </clusternode>
>>>>    </clusternodes>
>>>>        <cman expected_votes="1" two_node="0"/>
>>>>        <fencedevices>
>>>>                <resources/>
>>>>                <fencedevice name="odin-drac" agent="fence_drac"
>>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>>                <fencedevice name="hugin-drac" agent="fence_drac"
>>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>>                <fencedevice name="munin-drac" agent="fence_drac"
>>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>>                <fencedevice name="zeus-drac" agent="fence_drac"
>>>> ipaddr="redacted" login="root" passwd="redacted"/>
>>>>        </fencedevices>
>>>>        <rm>
>>>>        <failoverdomains/>
>>>>        <resources/>
>>>>    </rm>
>>>> </cluster>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Shawn Hood
>>>> 910.670.1819 m
>>>>
>>>>         
>>>
>>> --
>>> Shawn Hood
>>> 910.670.1819 m
>>>
>>>       
>>
>> --
>> Shawn Hood
>> 910.670.1819 m
>>
>>     
>
>
>
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081013/9a65324b/attachment.htm>

From pradhanparas at gmail.com  Mon Oct 13 22:19:57 2008
From: pradhanparas at gmail.com (Paras pradhan)
Date: Mon, 13 Oct 2008 17:19:57 -0500
Subject: [Linux-cluster] debuggin
Message-ID: <8b711df40810131519i56cd4f60k794d84587965cabe@mail.gmail.com>

My ha.cf entry looks like:
node1:

logfacility local0
keepalive 2
udpport 694
deadtime 15
warntime 5
initdead 60
ucast eth0 10.42.40.198
ucast eth0 10.42.40.26
auto_failback off
stonith_host * suicide ha1.domain.local
watchdog /dev/watchdog
debugfile /var/log/ha-debug
node ha1.domain.local
node ha2.domain.local


node2:

logfacility local0
keepalive 2
udpport 694
deadtime 15
warntime 5
initdead 60
ucast eth0 10.42.40.198
ucast eth0 10.42.40.26
auto_failback off
stonith_host * suicide ha2.domain.local
watchdog /dev/watchdog
debugfile /var/log/ha-debug
node ha1.domain.local
node ha2.domain.local

What does the below log file on node2 means when I turn off the eth0 on
node1.

Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: node ha1.domain.local: is dead
Oct 13 17:09:25 ha2 heartbeat: [6841]: info: Link ha1.domain.local:eth0
dead.
Oct 13 17:09:25 ha2 heartbeat: [6980]: info: Resetting node ha1.domain.local
with [Suicide STONITH device]
Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: glib: ha2.domain.local doesn't
control host [ha1.domain.local]
Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: Host ha1.domain.local not
reset!
Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: Managed STONITH
ha1.domain.local process 6980 exited with return code 1.
Oct 13 17:09:25 ha2 heartbeat: [6841]: ERROR: STONITH of ha1.domain.local
failed.  Retrying...
Oct 13 17:09:30 ha2 heartbeat: [6981]: info: Resetting node ha1.domain.local
with [Suicide STONITH device]
Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: glib: ha2.domain.local doesn't
control host [ha1.domain.local]
Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: Host ha1.domain.local not
reset!


I need node1 to be shutdown when eth0 on node1 is down.


Any help will be greatly appreciated.


Paras.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081013/f608b208/attachment.htm>

From beekhof at gmail.com  Tue Oct 14 10:00:16 2008
From: beekhof at gmail.com (Andrew Beekhof)
Date: Tue, 14 Oct 2008 12:00:16 +0200
Subject: [Linux-cluster] debuggin
In-Reply-To: <8b711df40810131519i56cd4f60k794d84587965cabe@mail.gmail.com>
References: <8b711df40810131519i56cd4f60k794d84587965cabe@mail.gmail.com>
Message-ID: <26ef5e70810140300y5ef4b044vd0211e6d90dd5d8d@mail.gmail.com>

You;re better off asking about the (old) heartbeat resource manager on
the heartbeat mailing list.

2008/10/14 Paras pradhan <pradhanparas at gmail.com>:
> My ha.cf entry looks like:
> node1:
>
> logfacility local0
> keepalive 2
> udpport 694
> deadtime 15
> warntime 5
> initdead 60
> ucast eth0 10.42.40.198
> ucast eth0 10.42.40.26
> auto_failback off
> stonith_host * suicide ha1.domain.local
> watchdog /dev/watchdog
> debugfile /var/log/ha-debug
> node ha1.domain.local
> node ha2.domain.local
>
> node2:
> logfacility local0
> keepalive 2
> udpport 694
> deadtime 15
> warntime 5
> initdead 60
> ucast eth0 10.42.40.198
> ucast eth0 10.42.40.26
> auto_failback off
> stonith_host * suicide ha2.domain.local
> watchdog /dev/watchdog
> debugfile /var/log/ha-debug
> node ha1.domain.local
> node ha2.domain.local
> What does the below log file on node2 means when I turn off the eth0 on
> node1.
> Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: node ha1.domain.local: is dead
> Oct 13 17:09:25 ha2 heartbeat: [6841]: info: Link ha1.domain.local:eth0
> dead.
> Oct 13 17:09:25 ha2 heartbeat: [6980]: info: Resetting node ha1.domain.local
> with [Suicide STONITH device]
> Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: glib: ha2.domain.local doesn't
> control host [ha1.domain.local]
> Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: Host ha1.domain.local not
> reset!
> Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: Managed STONITH
> ha1.domain.local process 6980 exited with return code 1.
> Oct 13 17:09:25 ha2 heartbeat: [6841]: ERROR: STONITH of ha1.domain.local
> failed.  Retrying...
> Oct 13 17:09:30 ha2 heartbeat: [6981]: info: Resetting node ha1.domain.local
> with [Suicide STONITH device]
> Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: glib: ha2.domain.local doesn't
> control host [ha1.domain.local]
> Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: Host ha1.domain.local not
> reset!
>
>
> I need node1 to be shutdown when eth0 on node1 is down.
>
>
> Any help will be greatly appreciated.
>
> Paras.
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


From pradhanparas at gmail.com  Tue Oct 14 13:48:02 2008
From: pradhanparas at gmail.com (Paras pradhan)
Date: Tue, 14 Oct 2008 08:48:02 -0500
Subject: [Linux-cluster] debuggin
In-Reply-To: <26ef5e70810140300y5ef4b044vd0211e6d90dd5d8d@mail.gmail.com>
References: <8b711df40810131519i56cd4f60k794d84587965cabe@mail.gmail.com>
	<26ef5e70810140300y5ef4b044vd0211e6d90dd5d8d@mail.gmail.com>
Message-ID: <8b711df40810140648y2c39032dq948fefb5442ff76c@mail.gmail.com>

ok ! that was a mistake. sorry.

Paras.

On Tue, Oct 14, 2008 at 5:00 AM, Andrew Beekhof <beekhof at gmail.com> wrote:

> You;re better off asking about the (old) heartbeat resource manager on
> the heartbeat mailing list.
>
> 2008/10/14 Paras pradhan <pradhanparas at gmail.com>:
> > My ha.cf entry looks like:
> > node1:
> >
> > logfacility local0
> > keepalive 2
> > udpport 694
> > deadtime 15
> > warntime 5
> > initdead 60
> > ucast eth0 10.42.40.198
> > ucast eth0 10.42.40.26
> > auto_failback off
> > stonith_host * suicide ha1.domain.local
> > watchdog /dev/watchdog
> > debugfile /var/log/ha-debug
> > node ha1.domain.local
> > node ha2.domain.local
> >
> > node2:
> > logfacility local0
> > keepalive 2
> > udpport 694
> > deadtime 15
> > warntime 5
> > initdead 60
> > ucast eth0 10.42.40.198
> > ucast eth0 10.42.40.26
> > auto_failback off
> > stonith_host * suicide ha2.domain.local
> > watchdog /dev/watchdog
> > debugfile /var/log/ha-debug
> > node ha1.domain.local
> > node ha2.domain.local
> > What does the below log file on node2 means when I turn off the eth0 on
> > node1.
> > Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: node ha1.domain.local: is
> dead
> > Oct 13 17:09:25 ha2 heartbeat: [6841]: info: Link ha1.domain.local:eth0
> > dead.
> > Oct 13 17:09:25 ha2 heartbeat: [6980]: info: Resetting node
> ha1.domain.local
> > with [Suicide STONITH device]
> > Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: glib: ha2.domain.local
> doesn't
> > control host [ha1.domain.local]
> > Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: Host ha1.domain.local not
> > reset!
> > Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: Managed STONITH
> > ha1.domain.local process 6980 exited with return code 1.
> > Oct 13 17:09:25 ha2 heartbeat: [6841]: ERROR: STONITH of ha1.domain.local
> > failed.  Retrying...
> > Oct 13 17:09:30 ha2 heartbeat: [6981]: info: Resetting node
> ha1.domain.local
> > with [Suicide STONITH device]
> > Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: glib: ha2.domain.local
> doesn't
> > control host [ha1.domain.local]
> > Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: Host ha1.domain.local not
> > reset!
> >
> >
> > I need node1 to be shutdown when eth0 on node1 is down.
> >
> >
> > Any help will be greatly appreciated.
> >
> > Paras.
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081014/cad6ecb9/attachment.htm>

From jstoner at opsource.net  Tue Oct 14 15:32:09 2008
From: jstoner at opsource.net (Jeff Stoner)
Date: Tue, 14 Oct 2008 16:32:09 +0100
Subject: [Linux-cluster] Fencing quandry
Message-ID: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>

We had a "that totally sucks" event the other night involving fencing.
In short - Red Hat 4.7, 2 node cluster using iLO fencing with HP blade
servers:

- passive node detemined active node was unresponsive (missed too many
heartbeats)
- passive node initiates take-over and begins fencing process
- fencing agent successfully powers off blade server
- fencing agent sits in an endless loop trying to power on the blade,
which won't power up
- the cluster appears "stalled" at this point because fencing won't
complete

I was able to complete the failover by swapping out the fencing agent
with a shell script that does "exit 0". This allowed the fencing agent
to complete so the resource manager could successfully relocate the
service.

My question becomes: why isn't a successful power off considered
sufficient for a take-over of a service? If the power is off, you've
guaranteed that all resources are released by that node. By requiring a
successful power on (which may never happen due to hardware failure,)
the fencing agent becomes a single point of failure in the cluster. The
fencing agent should make an attempt to power on a down node but it
shouldn't hold up the failover process if that attempt fails.


--Jeff
Performance Engineer

OpSource, Inc.
http://www.opsource.net
"Your Success is Our Success"
 

From james.hofmeister at hp.com  Tue Oct 14 17:39:48 2008
From: james.hofmeister at hp.com (Hofmeister, James (WTEC Linux))
Date: Tue, 14 Oct 2008 17:39:48 +0000
Subject: [Linux-cluster] RE: Fencing quandry
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>
Message-ID: <EC61DD7B6048464AB0E1B713AF7521BC165687B42E@GVW0676EXC.americas.hpqcorp.net>

Hello Jeff,

I am working with RedHat on a RHEL-5 fencing issue with c-class blades...  We have bugzilla 433864 opened for this and my notes state to be resolved in RHEL-5.3.

We had a workaround in the RHEL-5 cluster configuration:

  In the /etc/cluster/cluster.conf

  *Update version number by 1.
  *Then edit the fence device section for "each" node for example:

                        <fence>
                                <method name="1">
                                        <device name="ilo01"/>
                                </method>
                        </fence>
  change to  -->
                        <fence>
                                <method name="1">
                                        <device name="ilo01" action="off"/>
                                        <device name="ilo01" action="on"/>
                                </method>
                        </fence>

Regards,
James Hofmeister
Hewlett Packard Linux Solutions Engineer


|-----Original Message-----
|From: linux-cluster-bounces at redhat.com
|[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Stoner
|Sent: Tuesday, October 14, 2008 8:32 AM
|To: linux clustering
|Subject: [Linux-cluster] Fencing quandry
|
|We had a "that totally sucks" event the other night involving fencing.
|In short - Red Hat 4.7, 2 node cluster using iLO fencing with HP blade
|servers:
|
|- passive node detemined active node was unresponsive (missed too many
|heartbeats)
|- passive node initiates take-over and begins fencing process
|- fencing agent successfully powers off blade server
|- fencing agent sits in an endless loop trying to power on the
|blade, which won't power up
|- the cluster appears "stalled" at this point because fencing
|won't complete
|
|I was able to complete the failover by swapping out the
|fencing agent with a shell script that does "exit 0". This
|allowed the fencing agent to complete so the resource manager
|could successfully relocate the service.
|
|My question becomes: why isn't a successful power off
|considered sufficient for a take-over of a service? If the
|power is off, you've guaranteed that all resources are
|released by that node. By requiring a successful power on
|(which may never happen due to hardware failure,) the fencing
|agent becomes a single point of failure in the cluster. The
|fencing agent should make an attempt to power on a down node
|but it shouldn't hold up the failover process if that attempt fails.
|
|
|
|--Jeff
|Performance Engineer
|
|OpSource, Inc.
|http://www.opsource.net
|"Your Success is Our Success"
|
|
|--
|Linux-cluster mailing list
|Linux-cluster at redhat.com
|https://www.redhat.com/mailman/listinfo/linux-cluster
|


From jstoner at opsource.net  Tue Oct 14 22:43:14 2008
From: jstoner at opsource.net (Jeff Stoner)
Date: Tue, 14 Oct 2008 23:43:14 +0100
Subject: [Linux-cluster] RE: Fencing quandry
In-Reply-To: <EC61DD7B6048464AB0E1B713AF7521BC165687B42E@GVW0676EXC.americas.hpqcorp.net>
References: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC165687B42E@GVW0676EXC.americas.hpqcorp.net>
Message-ID: <38A48FA2F0103444906AD22E14F1B5A3082D8867@mailxchg01.corp.opsource.net>

Thanks for the response, James. Unfortunately, it doesn't fully answer
my question or at least, I'm not following the logic. The bug report
would seem to indicate a problem with using the default "reboot" method
of the agent. The work around simply replaces the single fence device
('reboot') with 2 fence devices ('off' followed by 'on') in the same
fence method. If the server fails to power on, then, according to the
FAQ, fencing still fails ("All fence devices within a fence method must
succeed in order for the method to succeed").

I'm back to fenced being a SPoF if hardware failures prevent a fenced
node from powering on.

--Jeff
Performance Engineer

OpSource, Inc.
http://www.opsource.net
"Your Success is Our Success"
  

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of 
> Hofmeister, James (WTEC Linux)
> Sent: Tuesday, October 14, 2008 1:40 PM
> To: linux clustering
> Subject: [Linux-cluster] RE: Fencing quandry
> 
> Hello Jeff,
> 
> I am working with RedHat on a RHEL-5 fencing issue with 
> c-class blades...  We have bugzilla 433864 opened for this 
> and my notes state to be resolved in RHEL-5.3.
> 
> We had a workaround in the RHEL-5 cluster configuration:
> 
>   In the /etc/cluster/cluster.conf
> 
>   *Update version number by 1.
>   *Then edit the fence device section for "each" node for example:
> 
>                         <fence>
>                                 <method name="1">
>                                         <device name="ilo01"/>
>                                 </method>
>                         </fence>
>   change to  -->
>                         <fence>
>                                 <method name="1">
>                                         <device name="ilo01" 
> action="off"/>
>                                         <device name="ilo01" 
> action="on"/>
>                                 </method>
>                         </fence>
> 
> Regards,
> James Hofmeister
> Hewlett Packard Linux Solutions Engineer
> 
> 
> 
> |-----Original Message-----
> |From: linux-cluster-bounces at redhat.com
> |[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Stoner
> |Sent: Tuesday, October 14, 2008 8:32 AM
> |To: linux clustering
> |Subject: [Linux-cluster] Fencing quandry
> |
> |We had a "that totally sucks" event the other night 
> involving fencing.
> |In short - Red Hat 4.7, 2 node cluster using iLO fencing 
> with HP blade
> |servers:
> |
> |- passive node detemined active node was unresponsive 
> (missed too many
> |heartbeats)
> |- passive node initiates take-over and begins fencing process
> |- fencing agent successfully powers off blade server
> |- fencing agent sits in an endless loop trying to power on the
> |blade, which won't power up
> |- the cluster appears "stalled" at this point because fencing
> |won't complete
> |
> |I was able to complete the failover by swapping out the
> |fencing agent with a shell script that does "exit 0". This
> |allowed the fencing agent to complete so the resource manager
> |could successfully relocate the service.
> |
> |My question becomes: why isn't a successful power off
> |considered sufficient for a take-over of a service? If the
> |power is off, you've guaranteed that all resources are
> |released by that node. By requiring a successful power on
> |(which may never happen due to hardware failure,) the fencing
> |agent becomes a single point of failure in the cluster. The
> |fencing agent should make an attempt to power on a down node
> |but it shouldn't hold up the failover process if that attempt fails.
> |
> |
> |
> |--Jeff
> |Performance Engineer
> |
> |OpSource, Inc.
> |http://www.opsource.net
> |"Your Success is Our Success"
> |
> |
> |--
> |Linux-cluster mailing list
> |Linux-cluster at redhat.com
> |https://www.redhat.com/mailman/listinfo/linux-cluster
> |
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 


From andres.mujica at seaq.com.co  Tue Oct 14 22:56:48 2008
From: andres.mujica at seaq.com.co (Andres Mauricio Mujica Zalamea)
Date: Tue, 14 Oct 2008 17:56:48 -0500 (COT)
Subject: [Linux-cluster] 2 phy hosts (domain0) hosting 3 vm (domU) with
 clustered services between them and san storage
Message-ID: <1420.200.1.81.99.1224025008.squirrel@webmail.seaq.com.co>


Hi, all

I've got this deployment. We've got 2 physical servers that are hosting 3
domUs with clustered services between them. We're presenting one multipath
device from the same disks of the SAN to each guest, so i've got
/dev/mpath/mpath0 mapped from phy node 1 to guest 1 and /dev/mpath/mpath0
mapped from phy node 2 to guest 2.

Using luci i've configured the storage but i'm seeing an odd behaviour,
for example the partition table is not inmediately seen by the other guest
node from the cluster. For the VG to be seen i need to restart the remote
guest.

And the worst part is that after formatting with GFS2 if i touch a file in
one guest node, i would expect the same file to be seen at the other node,
but the truth is not, the file is not seen after a reboot or
mount/remount...

any ideas?

i hope i could explained myself a litle bit...


Andr?s Mauricio Mujica Zalamea


From vipcert at yahoo.com  Wed Oct 15 04:33:19 2008
From: vipcert at yahoo.com (Vipin Sharma)
Date: Tue, 14 Oct 2008 21:33:19 -0700 (PDT)
Subject: [Linux-cluster] lock issue with gfs and gfs2
Message-ID: <358177.12602.qm@web56701.mail.re3.yahoo.com>

Hi,

Let me try to explain my issue with gfs/gfs2 filesystem. I have two node cluster and a shared gfs filesystem which is mounted on both the nodes at the same time. I can access the filesystem from both nodes. 
1. From node A I put a lock on a file called testfile and tried to put the lock on testfile from node B. I get message, file is already locked, which is good since file is locked from ndoe A.
2. Now unmount the filesystem on node B while lock is still there on testfile from node A and mount it back. Now try to put lock on the testfile from node B which is locked from node A. Expected result would be not to succeed in puting lock from node B, but "NO" I am able to put the lock from node B.
3. Node B does not know that there is some lock on testfile form node A but now if you release the lock from node A and put it again and then try to put lock on testfile from node B it works as expected means you will not be able to put lock on testfile. It says file is already locked.

It does not make any difference if I use gfs or gfs2 test works same way I tried on Oracle enterprise linux 5.1 and 5.2, which is nothing but redhat. Also node A or node B test results are same.

I have lock file which is compiled one but the following program also works the same way.
===========================================================================
/*
** lockdemo.c -- shows off your system's file locking.  Rated R.
*/

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
                     /* l_type   l_whence  l_start  l_len  l_pid   */
    struct flock fl = { F_WRLCK, SEEK_SET, 0,       0,     0 };
    int fd;

    fl.l_pid = getpid();

    if (argc > 1)
        fl.l_type = F_RDLCK;

    if ((fd = open("lockdemo.c", O_RDWR)) == -1) {
        perror("open");
        exit(1);
    }

    printf("Press <RETURN> to try to get lock: ");
    getchar();
    printf("Trying to get lock...");

    if (fcntl(fd, F_SETLKW, &fl) == -1) {
        perror("fcntl");
        exit(1);
    }

    printf("got lock\n");
    printf("Press <RETURN> to release lock: ");
    getchar();

    fl.l_type = F_UNLCK;  /* set to unlock same region */

    if (fcntl(fd, F_SETLK, &fl) == -1) {
        perror("fcntl");
        exit(1);
    }

    printf("Unlocked.\n");

    close(fd);
}

=============================================================================

I will be happy to provide more details as requested.

TIA
vip


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081014/5d742854/attachment.htm>

From swhiteho at redhat.com  Wed Oct 15 07:15:27 2008
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 15 Oct 2008 08:15:27 +0100
Subject: [Linux-cluster] lock issue with gfs and gfs2
In-Reply-To: <358177.12602.qm@web56701.mail.re3.yahoo.com>
References: <358177.12602.qm@web56701.mail.re3.yahoo.com>
Message-ID: <1224054927.25004.66.camel@quoit>

Hi,

Please file a bug at  bugzilla.redhat.com,

Steve.

On Tue, 2008-10-14 at 21:33 -0700, Vipin Sharma wrote:
> Hi,
> 
> Let me try to explain my issue with gfs/gfs2 filesystem. I have two
> node cluster and a shared gfs filesystem which is mounted on both the
> nodes at the same time. I can access the filesystem from both nodes. 
> 1. From node A I put a lock on a file called testfile and tried to put
> the lock on testfile from node B. I get message, file is already
> locked, which is good since file is locked from ndoe A.
> 2. Now unmount the filesystem on node B while lock is still there on
> testfile from node A and mount it back. Now try to put lock on the
> testfile from node B which is locked from node A. Expected result
> would be not to succeed in puting lock from node B, but "NO" I am able
> to put the lock from node B.
> 3. Node B does not know that there is some lock on testfile form node
> A but now if you release the lock from node A and put it again and
> then try to put lock on testfile from node B it works as expected
> means you will not be able to put lock on testfile. It says file is
> already locked.
> 
> It does not make any difference if I use gfs or gfs2 test works same
> way I tried on Oracle enterprise linux 5.1 and 5.2, which is nothing
> but redhat. Also node A or node B test results are same.
> 
> I have lock file which is compiled one but the following program also
> works the same way.
> ===========================================================================
> /*
> ** lockdemo.c -- shows off your system's file locking.  Rated R.
> */
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <unistd.h>
> 
> int main(int argc, char *argv[])
> {
>                      /* l_type   l_whence  l_start  l_len  l_pid   */
>     struct flock fl = { F_WRLCK, SEEK_SET, 0,       0,     0 };
>     int fd;
> 
>     fl.l_pid = getpid();
> 
>     if (argc > 1)
>         fl.l_type = F_RDLCK;
> 
>     if ((fd = open("lockdemo.c", O_RDWR)) == -1) {
>         perror("open");
>         exit(1);
>     }
> 
>     printf("Press <RETURN> to try to get lock: ");
>     getchar();
>     printf("Trying to get lock...");
> 
>     if (fcntl(fd, F_SETLKW, &fl) == -1) {
>         perror("fcntl");
>         exit(1);
>     }
> 
>     printf("got lock\n");
>     printf("Press <RETURN> to release lock: ");
>     getchar();
> 
>     fl.l_type = F_UNLCK;  /* set to unlock same region */
> 
>     if (fcntl(fd, F_SETLK, &fl) == -1) {
>         perror("fcntl");
>         exit(1);
>     }
> 
>     printf("Unlocked.\n");
> 
>     close(fd);
> }
> 
> =============================================================================
> 
> I will be happy to provide more details as requested.
> 
> TIA
> vip
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From erling.nygaard at gmail.com  Wed Oct 15 07:53:04 2008
From: erling.nygaard at gmail.com (Erling Nygaard)
Date: Wed, 15 Oct 2008 09:53:04 +0200
Subject: [Linux-cluster] RE: Fencing quandry
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A3082D8867@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC165687B42E@GVW0676EXC.americas.hpqcorp.net>
	<38A48FA2F0103444906AD22E14F1B5A3082D8867@mailxchg01.corp.opsource.net>
Message-ID: <adb721b40810150053i351e9d30ycaff7c1cffc1396d@mail.gmail.com>

Jeff

If you do not need the fenced node to come back (in your case it can
not come back due to the hardware issues)
you can remove the "on" fence action and simply have the fence device
issue a "off" command.
This should return a success.

In this case the fenced node will never return to life without human
interaction, but that is no worse than the situation you are in now.

Erling


On Wed, Oct 15, 2008 at 12:43 AM, Jeff Stoner <jstoner at opsource.net> wrote:
> Thanks for the response, James. Unfortunately, it doesn't fully answer
> my question or at least, I'm not following the logic. The bug report
> would seem to indicate a problem with using the default "reboot" method
> of the agent. The work around simply replaces the single fence device
> ('reboot') with 2 fence devices ('off' followed by 'on') in the same
> fence method. If the server fails to power on, then, according to the
> FAQ, fencing still fails ("All fence devices within a fence method must
> succeed in order for the method to succeed").
>
> I'm back to fenced being a SPoF if hardware failures prevent a fenced
> node from powering on.
>
> --Jeff
> Performance Engineer
>
> OpSource, Inc.
> http://www.opsource.net
> "Your Success is Our Success"
>
>
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com
>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of
>> Hofmeister, James (WTEC Linux)
>> Sent: Tuesday, October 14, 2008 1:40 PM
>> To: linux clustering
>> Subject: [Linux-cluster] RE: Fencing quandry
>>
>> Hello Jeff,
>>
>> I am working with RedHat on a RHEL-5 fencing issue with
>> c-class blades...  We have bugzilla 433864 opened for this
>> and my notes state to be resolved in RHEL-5.3.
>>
>> We had a workaround in the RHEL-5 cluster configuration:
>>
>>   In the /etc/cluster/cluster.conf
>>
>>   *Update version number by 1.
>>   *Then edit the fence device section for "each" node for example:
>>
>>                         <fence>
>>                                 <method name="1">
>>                                         <device name="ilo01"/>
>>                                 </method>
>>                         </fence>
>>   change to  -->
>>                         <fence>
>>                                 <method name="1">
>>                                         <device name="ilo01"
>> action="off"/>
>>                                         <device name="ilo01"
>> action="on"/>
>>                                 </method>
>>                         </fence>
>>
>> Regards,
>> James Hofmeister
>> Hewlett Packard Linux Solutions Engineer
>>
>>
>>
>> |-----Original Message-----
>> |From: linux-cluster-bounces at redhat.com
>> |[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Stoner
>> |Sent: Tuesday, October 14, 2008 8:32 AM
>> |To: linux clustering
>> |Subject: [Linux-cluster] Fencing quandry
>> |
>> |We had a "that totally sucks" event the other night
>> involving fencing.
>> |In short - Red Hat 4.7, 2 node cluster using iLO fencing
>> with HP blade
>> |servers:
>> |
>> |- passive node detemined active node was unresponsive
>> (missed too many
>> |heartbeats)
>> |- passive node initiates take-over and begins fencing process
>> |- fencing agent successfully powers off blade server
>> |- fencing agent sits in an endless loop trying to power on the
>> |blade, which won't power up
>> |- the cluster appears "stalled" at this point because fencing
>> |won't complete
>> |
>> |I was able to complete the failover by swapping out the
>> |fencing agent with a shell script that does "exit 0". This
>> |allowed the fencing agent to complete so the resource manager
>> |could successfully relocate the service.
>> |
>> |My question becomes: why isn't a successful power off
>> |considered sufficient for a take-over of a service? If the
>> |power is off, you've guaranteed that all resources are
>> |released by that node. By requiring a successful power on
>> |(which may never happen due to hardware failure,) the fencing
>> |agent becomes a single point of failure in the cluster. The
>> |fencing agent should make an attempt to power on a down node
>> |but it shouldn't hold up the failover process if that attempt fails.
>> |
>> |
>> |
>> |--Jeff
>> |Performance Engineer
>> |
>> |OpSource, Inc.
>> |http://www.opsource.net
>> |"Your Success is Our Success"
>> |
>> |
>> |--
>> |Linux-cluster mailing list
>> |Linux-cluster at redhat.com
>> |https://www.redhat.com/mailman/listinfo/linux-cluster
>> |
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


From virginian at blueyonder.co.uk  Wed Oct 15 15:01:45 2008
From: virginian at blueyonder.co.uk (Virginian)
Date: Wed, 15 Oct 2008 16:01:45 +0100
Subject: [Linux-cluster] Strange error messages in /var/log/messages
Message-ID: <A0FC83F6B19340539B25CAB729EFAF64@Desktop>

Hi all,

I am running Centos 5.2 on a two node physical cluster with Xen virtualisation and 4 domains clustered underneath. I am see the following in /var/log/messages on one of the physical nodes:

Oct 15 15:53:13 xen2 avahi-daemon[3363]: New relevant interface eth0.IPv4 for mDNS.
Oct 15 15:53:13 xen2 avahi-daemon[3363]: Joining mDNS multicast group on interface eth0.IPv4 with address 10.199.10.170.
Oct 15 15:53:14 xen2 avahi-daemon[3363]: Network interface enumeration completed.
Oct 15 15:53:14 xen2 avahi-daemon[3363]: Registering new address record for fe80::200:ff:fe00:0 on virbr0.
Oct 15 15:53:14 xen2 avahi-daemon[3363]: Registering new address record for 192.168.122.1 on virbr0.
Oct 15 15:53:14 xen2 avahi-daemon[3363]: Registering new address record for fe80::202:a5ff:fed9:ef74 on eth0.
Oct 15 15:53:14 xen2 avahi-daemon[3363]: Registering new address record for 10.199.10.170 on eth0.
Oct 15 15:53:14 xen2 avahi-daemon[3363]: Registering HINFO record with values 'I686'/'LINUX'.
Oct 15 15:53:15 xen2 avahi-daemon[3363]: Server startup complete. Host name is xen2.local. Local service cookie is 3231388299.
Oct 15 15:53:16 xen2 avahi-daemon[3363]: Service "SFTP File Transfer on xen2" (/services/sftp-ssh.service) successfully established.
Oct 15 15:53:23 xen2 xenstored: Checking store ...
Oct 15 15:53:23 xen2 xenstored: Checking store complete.
Oct 15 15:53:24 xen2 ccsd[2806]: Error: unable to evaluate xpath query "/cluster/fence_xvmd/@(null) "
Oct 15 15:53:24 xen2 ccsd[2806]: Error while processing get: Invalid argument
Oct 15 15:53:24 xen2 ccsd[2806]: Error: unable to evaluate xpath query "/cluster/fence_xvmd/@(null) "
Oct 15 15:53:24 xen2 ccsd[2806]: Error while processing get: Invalid argument
Oct 15 15:53:24 xen2 ccsd[2806]: Error: unable to evaluate xpath query "/cluster/fence_xvmd/@(null) "
Oct 15 15:53:24 xen2 ccsd[2806]: Error while processing get: Invalid argument
Oct 15 15:53:24 xen2 ccsd[2806]: Error: unable to evaluate xpath query "/cluster/fence_xvmd/@(null) "
Oct 15 15:53:24 xen2 ccsd[2806]: Error while processing get: Invalid argument
Oct 15 15:53:24 xen2 ccsd[2806]: Error: unable to evaluate xpath query "/cluster/fence_xvmd/@(null) "
Oct 15 15:53:24 xen2 ccsd[2806]: Error while processing get: Invalid argument
Oct 15 15:53:24 xen2 ccsd[2806]: Error: unable to evaluate xpath query "/cluster/fence_xvmd/@(null) "
Oct 15 15:53:24 xen2 ccsd[2806]: Error while processing get: Invalid argument
Oct 15 15:53:24 xen2 modclusterd: startup succeeded
Oct 15 15:53:24 xen2 clurgmgrd[3531]: <notice> Resource Group Manager Starting
Oct 15 15:53:25 xen2 oddjobd: oddjobd startup succeeded
Oct 15 15:53:26 xen2 saslauthd[3885]: detach_tty      : master pid is: 3885
Oct 15 15:53:26 xen2 saslauthd[3885]: ipc_init        : listening on socket: /var/run/saslauthd/mux
Oct 15 15:53:26 xen2 ricci: startup succeeded
Oct 15 15:53:39 xen2 clurgmgrd[3531]: <notice> Starting stopped service vm:hermes
Oct 15 15:53:39 xen2 clurgmgrd[3531]: <notice> Starting stopped service vm:hestia
Oct 15 15:53:43 xen2 kernel: tap tap-1-51712: 2 getting info
Oct 15 15:53:44 xen2 kernel: tap tap-1-51728: 2 getting info
Oct 15 15:53:45 xen2 kernel: device vif1.0 entered promiscuous mode
Oct 15 15:53:45 xen2 kernel: ADDRCONF(NETDEV_UP): vif1.0: link is not ready
Oct 15 15:53:47 xen2 kernel: tap tap-2-51712: 2 getting info
Oct 15 15:53:48 xen2 kernel: tap tap-2-51728: 2 getting info
Oct 15 15:53:48 xen2 kernel: device vif2.0 entered promiscuous mode
Oct 15 15:53:48 xen2 kernel: ADDRCONF(NETDEV_UP): vif2.0: link is not ready
Oct 15 15:53:49 xen2 clurgmgrd[3531]: <notice> Service vm:hestia started
Oct 15 15:53:49 xen2 clurgmgrd[3531]: <notice> Service vm:hermes started
Oct 15 15:53:53 xen2 kernel: blktap: ring-ref 8, event-channel 11, protocol 1 (x86_32-abi)
Oct 15 15:53:53 xen2 kernel: blktap: ring-ref 9, event-channel 12, protocol 1 (x86_32-abi)
Oct 15 15:53:53 xen2 kernel: blktap: ring-ref 8, event-channel 11, protocol 1 (x86_32-abi)
Oct 15 15:53:53 xen2 kernel: blktap: ring-ref 9, event-channel 12, protocol 1 (x86_32-abi)
Oct 15 15:54:23 xen2 kernel: ADDRCONF(NETDEV_CHANGE): vif2.0: link becomes ready
Oct 15 15:54:23 xen2 kernel: xenbr0: topology change detected, propagating
Oct 15 15:54:23 xen2 kernel: xenbr0: port 4(vif2.0) entering forwarding state
Oct 15 15:54:27 xen2 kernel: ADDRCONF(NETDEV_CHANGE): vif1.0: link becomes ready
Oct 15 15:54:27 xen2 kernel: xenbr0: topology change detected, propagating
Oct 15 15:54:27 xen2 kernel: xenbr0: port 3(vif1.0) entering forwarding state
Oct 15 15:56:15 xen2 clurgmgrd[3531]: <notice> Resource Groups Locked

My cluster.conf is as follows:

cluster.conf                                                                               100% 1734     1.7KB/s   00:00
[root at xen1 cluster]# cat /etc/cluster/cluster.conf.15102008
<?xml version="1.0"?>
<cluster alias="XENCluster1" config_version="42" name="XENCluster1">
        <cman expected_votes="1" two_node="1"/>
        <clusternodes>
                <clusternode name="xen1" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="xen1-ilo"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="xen2" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="xen2-ilo"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <fencedevices>
                <fencedevice agent="fence_ilo" hostname="xen1-ilo" login="root" name="xen1-ilo" passwd="deckard1"/>
                <fencedevice agent="fence_ilo" hostname="xen2-ilo" login="root" name="xen2-ilo" passwd="deckard1"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="xen1-failover" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="xen1" priority="1"/>
                        </failoverdomain>
                        <failoverdomain name="xen2-failover" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="xen2" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources/>
                <vm autostart="1" domain="xen2-failover" exclusive="0" migrate="live" name="hermes" path="/guests" recovery="relocate"/>
                <vm autostart="1" domain="xen2-failover" exclusive="0" migrate="live" name="hestia" path="/guests" recovery="relocate"/>
                <vm autostart="1" domain="xen1-failover" exclusive="0" migrate="live" name="aether" path="/guests" recovery="relocate"/>
                <vm autostart="1" domain="xen1-failover" exclusive="0" migrate="live" name="athena" path="/guests" recovery="relocate"/>
        </rm>
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <totem token="21000"/>
        <fence_xvmd/>
</cluster>


Does anybody know what these messages mean?

My domain cluster.conf is as follows:

<?xml version="1.0"?>
<cluster alias="XENCluster2" config_version="13" name="XENCluster2">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="athena.private.lan" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="athena" name="virtual_fence"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="aether.private.lan" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="aether" name="virtual_fence"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="hermes.private.lan" nodeid="3" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="hermes" name="virtual_fence"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="hestia.private.lan" nodeid="4" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="hestia" name="virtual_fence"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
                <fencedevice agent="fence_xvm" name="virtual_fence"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
        <fence_xvmd/>
</cluster>

Thanks

John

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081015/b367da8a/attachment.htm>

From jparsons at redhat.com  Wed Oct 15 15:42:25 2008
From: jparsons at redhat.com (jim parsons)
Date: Wed, 15 Oct 2008 11:42:25 -0400
Subject: [Linux-cluster] Strange error messages in /var/log/messages
In-Reply-To: <A0FC83F6B19340539B25CAB729EFAF64@Desktop>
References: <A0FC83F6B19340539B25CAB729EFAF64@Desktop>
Message-ID: <1224085345.3277.2.camel@localhost.localdomain>

On Wed, 2008-10-15 at 16:01 +0100, Virginian wrote:

>         <fence_xvmd/>

This tag does not need to be in the inner clusters' (dom u cluster) conf file, only the cluster set up on the physical hosts.

That might be the problem - easy enough to check! :)

-j


From jparsons at redhat.com  Wed Oct 15 15:48:45 2008
From: jparsons at redhat.com (jim parsons)
Date: Wed, 15 Oct 2008 11:48:45 -0400
Subject: [Linux-cluster] Strange error messages in /var/log/messages
In-Reply-To: <1224085345.3277.2.camel@localhost.localdomain>
References: <A0FC83F6B19340539B25CAB729EFAF64@Desktop>
	<1224085345.3277.2.camel@localhost.localdomain>
Message-ID: <1224085725.3277.4.camel@localhost.localdomain>

On Wed, 2008-10-15 at 11:42 -0400, jim parsons wrote:
> On Wed, 2008-10-15 at 16:01 +0100, Virginian wrote:
> 
> >         <fence_xvmd/>
> 
> This tag does not need to be in the inner clusters' (dom u cluster) conf file, only the cluster set up on the physical hosts.
> 
> That might be the problem - easy enough to check! :)

It would be fun to know if the above fixes the issue. Please let me
know.

-j


From teigland at redhat.com  Wed Oct 15 17:16:15 2008
From: teigland at redhat.com (David Teigland)
Date: Wed, 15 Oct 2008 12:16:15 -0500
Subject: [Linux-cluster] lock issue with gfs and gfs2
In-Reply-To: <358177.12602.qm@web56701.mail.re3.yahoo.com>
References: <358177.12602.qm@web56701.mail.re3.yahoo.com>
Message-ID: <20081015171615.GD30528@redhat.com>

On Tue, Oct 14, 2008 at 09:33:19PM -0700, Vipin Sharma wrote:
> Hi,
> 

> Let me try to explain my issue with gfs/gfs2 filesystem. I have two node
> cluster and a shared gfs filesystem which is mounted on both the nodes
> at the same time. I can access the filesystem from both nodes.  1. From
> node A I put a lock on a file called testfile and tried to put the lock
> on testfile from node B. I get message, file is already locked, which is
> good since file is locked from ndoe A.  2. Now unmount the filesystem on
> node B while lock is still there on testfile from node A and mount it
> back. Now try to put lock on the testfile from node B which is locked
> from node A. Expected result would be not to succeed in puting lock from
> node B, but "NO" I am able to put the lock from node B.  3. Node B does
> not know that there is some lock on testfile form node A but now if you
> release the lock from node A and put it again and then try to put lock
> on testfile from node B it works as expected means you will not be able
> to put lock on testfile. It says file is already locked.
> 
> It does not make any difference if I use gfs or gfs2 test works same way
> I tried on Oracle enterprise linux 5.1 and 5.2, which is nothing but
> redhat. Also node A or node B test results are same.

gfs_controld uses checkpoints to sync plock state to new nodes; it appears
that there's something going wrong with that.  After running your simple
test, run 'group_tool dump gfs <name>' from both nodes and include the
output in the bz.

Thanks,
Dave


From virginian at blueyonder.co.uk  Wed Oct 15 19:10:44 2008
From: virginian at blueyonder.co.uk (Virginian)
Date: Wed, 15 Oct 2008 20:10:44 +0100
Subject: [Linux-cluster] Strange error messages in /var/log/messages
References: <A0FC83F6B19340539B25CAB729EFAF64@Desktop><1224085345.3277.2.camel@localhost.localdomain>
	<1224085725.3277.4.camel@localhost.localdomain>
Message-ID: <E242BA2B867D4AF4BCC5DD477BB0DCC2@Desktop>

Hi Jim,

I changed the domU cluster config as you suggested then rebooted the whole 
caboodle but still get the same messages:


Oct 15 18:17:23 xen2 ccsd[2818]: Error: unable to evaluate xpath query 
"/cluster/fence_xvmd/@(null) "
Oct 15 18:17:23 xen2 ccsd[2818]: Error while processing get: Invalid 
argument
Oct 15 18:17:23 xen2 ccsd[2818]: Error: unable to evaluate xpath query 
"/cluster/fence_xvmd/@(null) "
Oct 15 18:17:23 xen2 ccsd[2818]: Error while processing get: Invalid 
argument
Oct 15 18:17:23 xen2 ccsd[2818]: Error: unable to evaluate xpath query 
"/cluster/fence_xvmd/@(null) "
Oct 15 18:17:23 xen2 ccsd[2818]: Error while processing get: Invalid 
argument
Oct 15 18:17:23 xen2 ccsd[2818]: Error: unable to evaluate xpath query 
"/cluster/fence_xvmd/@(null) "
Oct 15 18:17:23 xen2 ccsd[2818]: Error while processing get: Invalid 
argument
Oct 15 18:17:23 xen2 ccsd[2818]: Error: unable to evaluate xpath query 
"/cluster/fence_xvmd/@(null) "
Oct 15 18:17:23 xen2 ccsd[2818]: Error while processing get: Invalid 
argument
Oct 15 18:17:23 xen2 ccsd[2818]: Error: unable to evaluate xpath query 
"/cluster/fence_xvmd/@(null) "
Oct 15 18:17:23 xen2 ccsd[2818]: Error while processing get: Invalid 
argument

Thanks for the suggestion though and at least the domU configs are now 
corrected which is a plus.

Regards

John
----- Original Message ----- 
From: "jim parsons" <jparsons at redhat.com>
To: "linux clustering" <linux-cluster at redhat.com>
Sent: Wednesday, October 15, 2008 4:48 PM
Subject: Re: [Linux-cluster] Strange error messages in /var/log/messages


> On Wed, 2008-10-15 at 11:42 -0400, jim parsons wrote:
>> On Wed, 2008-10-15 at 16:01 +0100, Virginian wrote:
>>
>> >         <fence_xvmd/>
>>
>> This tag does not need to be in the inner clusters' (dom u cluster) conf 
>> file, only the cluster set up on the physical hosts.
>>
>> That might be the problem - easy enough to check! :)
>
> It would be fun to know if the above fixes the issue. Please let me
> know.
>
> -j
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 


From lhh at redhat.com  Wed Oct 15 19:18:09 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 15 Oct 2008 15:18:09 -0400
Subject: [Linux-cluster] Strange error messages in /var/log/messages
In-Reply-To: <1224085725.3277.4.camel@localhost.localdomain>
References: <A0FC83F6B19340539B25CAB729EFAF64@Desktop>
	<1224085345.3277.2.camel@localhost.localdomain>
	<1224085725.3277.4.camel@localhost.localdomain>
Message-ID: <1224098289.5912.5.camel@ayanami>

On Wed, 2008-10-15 at 11:48 -0400, jim parsons wrote:
> On Wed, 2008-10-15 at 11:42 -0400, jim parsons wrote:
> > On Wed, 2008-10-15 at 16:01 +0100, Virginian wrote:
> > 
> > >         <fence_xvmd/>
> > 
> > This tag does not need to be in the inner clusters' (dom u cluster) conf file, only the cluster set up on the physical hosts.
> > 
> > That might be the problem - easy enough to check! :)
> 
> It would be fun to know if the above fixes the issue. Please let me
> know.

I think I see it.

-- Lon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fence_xvmd-ccs.patch
Type: text/x-patch
Size: 427 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081015/efd666f9/attachment.bin>

From james.hofmeister at hp.com  Wed Oct 15 20:45:46 2008
From: james.hofmeister at hp.com (Hofmeister, James (WTEC Linux))
Date: Wed, 15 Oct 2008 20:45:46 +0000
Subject: [Linux-cluster] RE: Fencing quandry
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A3082D8867@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC165687B42E@GVW0676EXC.americas.hpqcorp.net>
	<38A48FA2F0103444906AD22E14F1B5A3082D8867@mailxchg01.corp.opsource.net>
Message-ID: <EC61DD7B6048464AB0E1B713AF7521BC1656AD7DED@GVW0676EXC.americas.hpqcorp.net>

Hello Jeff,

RE: [Linux-cluster] RE: Fencing quandary

The root issue is the ILO scripts are not up to date with the current firmware rev in the c-class and p-class blades.

The method of '<device name="ilo01"/>' for a "reboot" is not working with this ILO firmware rev and the workaround is to send 2 commands to ILO under a single method... 'action="off"/' and 'action="on"/'.

I had tested this with my p-class blades and it was successful.  I am still waiting for my customers test results on their c-class blades.

...yes this is the root issue to the ILO problem, but it does not completely address your concern.  I believe you are saying: That the RHCS does not accept a "power off" as a fence, but is requiring both "power off" followed by "power on".

Regards,
James Hofmeister
Hewlett Packard Linux Solutions Engineer

|-----Original Message-----
|From: linux-cluster-bounces at redhat.com
|[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Stoner
|Sent: Tuesday, October 14, 2008 3:43 PM
|To: linux clustering
|Subject: RE: [Linux-cluster] RE: Fencing quandry
|
|Thanks for the response, James. Unfortunately, it doesn't
|fully answer my question or at least, I'm not following the
|logic. The bug report would seem to indicate a problem with
|using the default "reboot" method of the agent. The work
|around simply replaces the single fence device
|('reboot') with 2 fence devices ('off' followed by 'on') in
|the same fence method. If the server fails to power on, then,
|according to the FAQ, fencing still fails ("All fence devices
|within a fence method must succeed in order for the method to
|succeed").
|
|I'm back to fenced being a SPoF if hardware failures prevent a
|fenced node from powering on.
|
|--Jeff
|Performance Engineer
|
|OpSource, Inc.
|http://www.opsource.net
|"Your Success is Our Success"
|
|
|> -----Original Message-----
|> From: linux-cluster-bounces at redhat.com
|> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Hofmeister,
|> James (WTEC Linux)
|> Sent: Tuesday, October 14, 2008 1:40 PM
|> To: linux clustering
|> Subject: [Linux-cluster] RE: Fencing quandry
|>
|> Hello Jeff,
|>
|> I am working with RedHat on a RHEL-5 fencing issue with c-class
|> blades...  We have bugzilla 433864 opened for this and my
|notes state
|> to be resolved in RHEL-5.3.
|>
|> We had a workaround in the RHEL-5 cluster configuration:
|>
|>   In the /etc/cluster/cluster.conf
|>
|>   *Update version number by 1.
|>   *Then edit the fence device section for "each" node for example:
|>
|>                         <fence>
|>                                 <method name="1">
|>                                         <device name="ilo01"/>
|>                                 </method>
|>                         </fence>
|>   change to  -->
|>                         <fence>
|>                                 <method name="1">
|>                                         <device name="ilo01"
|> action="off"/>
|>                                         <device name="ilo01"
|> action="on"/>
|>                                 </method>
|>                         </fence>
|>
|> Regards,
|> James Hofmeister
|> Hewlett Packard Linux Solutions Engineer
|>
|>
|>
|> |-----Original Message-----
|> |From: linux-cluster-bounces at redhat.com
|> |[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Stoner
|> |Sent: Tuesday, October 14, 2008 8:32 AM
|> |To: linux clustering
|> |Subject: [Linux-cluster] Fencing quandry
|> |
|> |We had a "that totally sucks" event the other night
|> involving fencing.
|> |In short - Red Hat 4.7, 2 node cluster using iLO fencing
|> with HP blade
|> |servers:
|> |
|> |- passive node detemined active node was unresponsive
|> (missed too many
|> |heartbeats)
|> |- passive node initiates take-over and begins fencing process
|> |- fencing agent successfully powers off blade server
|> |- fencing agent sits in an endless loop trying to power on
|the blade,
|> |which won't power up
|> |- the cluster appears "stalled" at this point because fencing won't
|> |complete
|> |
|> |I was able to complete the failover by swapping out the
|fencing agent
|> |with a shell script that does "exit 0". This allowed the fencing
|> |agent to complete so the resource manager could
|successfully relocate
|> |the service.
|> |
|> |My question becomes: why isn't a successful power off considered
|> |sufficient for a take-over of a service? If the power is
|off, you've
|> |guaranteed that all resources are released by that node. By
|requiring
|> |a successful power on (which may never happen due to hardware
|> |failure,) the fencing agent becomes a single point of
|failure in the
|> |cluster. The fencing agent should make an attempt to power
|on a down
|> |node but it shouldn't hold up the failover process if that attempt
|> |fails.
|> |
|> |
|> |
|> |--Jeff
|> |Performance Engineer
|> |
|> |OpSource, Inc.
|> |http://www.opsource.net
|> |"Your Success is Our Success"
|> |
|> |
|> |--
|> |Linux-cluster mailing list
|> |Linux-cluster at redhat.com
|> |https://www.redhat.com/mailman/listinfo/linux-cluster
|> |
|>
|> --
|> Linux-cluster mailing list
|> Linux-cluster at redhat.com
|> https://www.redhat.com/mailman/listinfo/linux-cluster
|>
|>
|
|--
|Linux-cluster mailing list
|Linux-cluster at redhat.com
|https://www.redhat.com/mailman/listinfo/linux-cluster
|


From jparsons at redhat.com  Wed Oct 15 21:38:11 2008
From: jparsons at redhat.com (jim parsons)
Date: Wed, 15 Oct 2008 17:38:11 -0400
Subject: [Linux-cluster] RE: Fencing quandry
In-Reply-To: <EC61DD7B6048464AB0E1B713AF7521BC1656AD7DED@GVW0676EXC.americas.hpqcorp.net>
References: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC165687B42E@GVW0676EXC.americas.hpqcorp.net>
	<38A48FA2F0103444906AD22E14F1B5A3082D8867@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC1656AD7DED@GVW0676EXC.americas.hpqcorp.net>
Message-ID: <1224106691.3367.19.camel@localhost.localdomain>

On Wed, 2008-10-15 at 20:45 +0000, Hofmeister, James (WTEC Linux) wrote:
> Hello Jeff,
> 
> RE: [Linux-cluster] RE: Fencing quandary
> 
> The root issue is the ILO scripts are not up to date with the current firmware rev in the c-class and p-class blades.
> 
> The method of '<device name="ilo01"/>' for a "reboot" is not working with this ILO firmware rev and the workaround is to send 2 commands to ILO under a single method... 'action="off"/' and 'action="on"/'.
> 
> I had tested this with my p-class blades and it was successful.  I am still waiting for my customers test results on their c-class blades.
> 
> ...yes this is the root issue to the ILO problem, but it does not completely address your concern.  I believe you are saying: That the RHCS does not accept a "power off" as a fence, but is requiring both "power off" followed by "power on".
Right. It is failing because the 'power on' portion is not completing
because the fence agent is unable to send the correct power on command.

With all due respect to HP's iLO, along with DRAC, RSA, RSB, etc,
keeping up wee little delta's between firmware versions of baseboard
management devices is challenging. Please pull down the very latest
version of the agent and try it. For the time being, you could just use
the power off command and walk over and turn it back on if it is
convenient :). You could also run the agent from the command line with
the verbose output switch set (man fence_ilo) and see if you can
determine why the command is failing. Post what you find here. The agent
is written in Perl and pretty easy to understand I think, if you are
adventurous.

The upcoming 5.3 ilo agent has been rewritten to include additional
connection types, and is being heavily tested now on many firmware
versions. The beta is close to release. Grab it when you can.
-j
> 
> Regards,
> James Hofmeister
> Hewlett Packard Linux Solutions Engineer
> 
> |-----Original Message-----
> |From: linux-cluster-bounces at redhat.com
> |[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Stoner
> |Sent: Tuesday, October 14, 2008 3:43 PM
> |To: linux clustering
> |Subject: RE: [Linux-cluster] RE: Fencing quandry
> |
> |Thanks for the response, James. Unfortunately, it doesn't
> |fully answer my question or at least, I'm not following the
> |logic. The bug report would seem to indicate a problem with
> |using the default "reboot" method of the agent. The work
> |around simply replaces the single fence device
> |('reboot') with 2 fence devices ('off' followed by 'on') in
> |the same fence method. If the server fails to power on, then,
> |according to the FAQ, fencing still fails ("All fence devices
> |within a fence method must succeed in order for the method to
> |succeed").
> |
> |I'm back to fenced being a SPoF if hardware failures prevent a
> |fenced node from powering on.
> |
> |--Jeff
> |Performance Engineer
> |
> |OpSource, Inc.
> |http://www.opsource.net
> |"Your Success is Our Success"
> |
> |
> |> -----Original Message-----
> |> From: linux-cluster-bounces at redhat.com
> |> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Hofmeister,
> |> James (WTEC Linux)
> |> Sent: Tuesday, October 14, 2008 1:40 PM
> |> To: linux clustering
> |> Subject: [Linux-cluster] RE: Fencing quandry
> |>
> |> Hello Jeff,
> |>
> |> I am working with RedHat on a RHEL-5 fencing issue with c-class
> |> blades...  We have bugzilla 433864 opened for this and my
> |notes state
> |> to be resolved in RHEL-5.3.
> |>
> |> We had a workaround in the RHEL-5 cluster configuration:
> |>
> |>   In the /etc/cluster/cluster.conf
> |>
> |>   *Update version number by 1.
> |>   *Then edit the fence device section for "each" node for example:
> |>
> |>                         <fence>
> |>                                 <method name="1">
> |>                                         <device name="ilo01"/>
> |>                                 </method>
> |>                         </fence>
> |>   change to  -->
> |>                         <fence>
> |>                                 <method name="1">
> |>                                         <device name="ilo01"
> |> action="off"/>
> |>                                         <device name="ilo01"
> |> action="on"/>
> |>                                 </method>
> |>                         </fence>
> |>
> |> Regards,
> |> James Hofmeister
> |> Hewlett Packard Linux Solutions Engineer
> |>
> |>
> |>
> |> |-----Original Message-----
> |> |From: linux-cluster-bounces at redhat.com
> |> |[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Stoner
> |> |Sent: Tuesday, October 14, 2008 8:32 AM
> |> |To: linux clustering
> |> |Subject: [Linux-cluster] Fencing quandry
> |> |
> |> |We had a "that totally sucks" event the other night
> |> involving fencing.
> |> |In short - Red Hat 4.7, 2 node cluster using iLO fencing
> |> with HP blade
> |> |servers:
> |> |
> |> |- passive node detemined active node was unresponsive
> |> (missed too many
> |> |heartbeats)
> |> |- passive node initiates take-over and begins fencing process
> |> |- fencing agent successfully powers off blade server
> |> |- fencing agent sits in an endless loop trying to power on
> |the blade,
> |> |which won't power up
> |> |- the cluster appears "stalled" at this point because fencing won't
> |> |complete
> |> |
> |> |I was able to complete the failover by swapping out the
> |fencing agent
> |> |with a shell script that does "exit 0". This allowed the fencing
> |> |agent to complete so the resource manager could
> |successfully relocate
> |> |the service.
> |> |
> |> |My question becomes: why isn't a successful power off considered
> |> |sufficient for a take-over of a service? If the power is
> |off, you've
> |> |guaranteed that all resources are released by that node. By
> |requiring
> |> |a successful power on (which may never happen due to hardware
> |> |failure,) the fencing agent becomes a single point of
> |failure in the
> |> |cluster. The fencing agent should make an attempt to power
> |on a down
> |> |node but it shouldn't hold up the failover process if that attempt
> |> |fails.
> |> |
> |> |
> |> |
> |> |--Jeff
> |> |Performance Engineer
> |> |
> |> |OpSource, Inc.
> |> |http://www.opsource.net
> |> |"Your Success is Our Success"
> |> |
> |> |
> |> |--
> |> |Linux-cluster mailing list
> |> |Linux-cluster at redhat.com
> |> |https://www.redhat.com/mailman/listinfo/linux-cluster
> |> |
> |>
> |> --
> |> Linux-cluster mailing list
> |> Linux-cluster at redhat.com
> |> https://www.redhat.com/mailman/listinfo/linux-cluster
> |>
> |>
> |
> |--
> |Linux-cluster mailing list
> |Linux-cluster at redhat.com
> |https://www.redhat.com/mailman/listinfo/linux-cluster
> |
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From kanderso at redhat.com  Wed Oct 15 21:42:23 2008
From: kanderso at redhat.com (Kevin Anderson)
Date: Wed, 15 Oct 2008 16:42:23 -0500
Subject: [Linux-cluster] RE: Fencing quandry
In-Reply-To: <1224106691.3367.19.camel@localhost.localdomain>
References: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC165687B42E@GVW0676EXC.americas.hpqcorp.net>
	<38A48FA2F0103444906AD22E14F1B5A3082D8867@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC1656AD7DED@GVW0676EXC.americas.hpqcorp.net>
	<1224106691.3367.19.camel@localhost.localdomain>
Message-ID: <1224106943.2991.59.camel@dhcp80-204.msp.redhat.com>

On Wed, 2008-10-15 at 17:38 -0400, jim parsons wrote:
> On Wed, 2008-10-15 at 20:45 +0000, Hofmeister, James (WTEC Linux) wrote:
> > Hello Jeff,
> > 
> > RE: [Linux-cluster] RE: Fencing quandary
> > 
> > The root issue is the ILO scripts are not up to date with the current firmware rev in the c-class and p-class blades.
> > 
> > The method of '<device name="ilo01"/>' for a "reboot" is not working with this ILO firmware rev and the workaround is to send 2 commands to ILO under a single method... 'action="off"/' and 'action="on"/'.
> > 
> > I had tested this with my p-class blades and it was successful.  I am still waiting for my customers test results on their c-class blades.
> > 
> > ...yes this is the root issue to the ILO problem, but it does not completely address your concern.  I believe you are saying: That the RHCS does not accept a "power off" as a fence, but is requiring both "power off" followed by "power on".
> Right. It is failing because the 'power on' portion is not completing
> because the fence agent is unable to send the correct power on command.
> 

But the point is, even if the power on command fails, the fencing agent
should report success, since the real need is to ensure the machine is
no longer participating in the cluster and not bring it back up.

So, is it proper to report success if part of the request fails as long
as the critical part succeeds?

Kevin


From andres.mujica at seaq.com.co  Wed Oct 15 21:42:52 2008
From: andres.mujica at seaq.com.co (Andres Mauricio Mujica Zalamea)
Date: Wed, 15 Oct 2008 16:42:52 -0500 (COT)
Subject: [Linux-cluster] 2 phy hosts (domain0) hosting 3 vm (domU) with 
	clustered services between them and san storage
Message-ID: <18247.200.1.81.99.1224106972.squirrel@webmail.seaq.com.co>


Hi, i?ve narrowed down the problem to something similar if not the same as
posted recently on this list.

It seems that the problem lies within the LV creation.

My systemis using RHEL 5.1 and when i?ve manually create a LV i?ve
received was something like

Error locking on node node1 Volume group for uuid not found:
0PfAdiZHlULoLDX3gw3OGFrwsbPS8io1SWPEFXXOI0VzhYJLy8nGpBFdT7Oi25bF
  Failed to activate new LV.

but the LV appearead on the creation node and after a while (several
reboots) in the another node.

That leads me to think that my gfs2 problem lies there.

I?ve upgraded to RHEL 5.2 in order to use the lvm2 and kernel upgraded
packages that seemed to solve similar issues.

However, the error changed a bit as well as the behaviour at the LV creation.

When i execute the lvcreate command i get this error

lvcreate -n ems_lv -L +5.99G ems_vg
  Rounding up size to full physical extent 5.99 GB
  Error locking on node ems88clu2.bvc.com.co: Error backing up metadata,
can't find VG for group #global
  Aborting. Failed to activate new LV to wipe the start of it.

the difference with 5.1 is that previously the lv was created only at the
creation node , with 5.2 besides the different error message the LV is NOT
created at either both nodes.

This is happening inside a guest accessing the san as a virtual block
device presented by the domain-0 (the domain-o exports the mpath device). 
The nodes are in different domain-0 accessing the same SAN

Thanks for your help


-- 
Andr?s Mauricio Mujica Zalamea


From mockey.chen at nsn.com  Thu Oct 16 09:10:51 2008
From: mockey.chen at nsn.com (Chen, Mockey (NSN - CN/Cheng Du))
Date: Thu, 16 Oct 2008 17:10:51 +0800
Subject: [Linux-cluster] Two nodes cluster issue without shared storage issue
Message-ID: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net>

Hi,

I want to set up a two node cluster, I use active/standby mode to run my
service. I need even one node's hardware failure such as power cut,
another node still can handover from failure node and the provide the
service. 

In my environment, I have no shared storage, so I can not use quorum
disk. Is there any other way to implement it? I searched and found
'tiebreaker IP' may feed my request, but I can not found any hints on
how to configure it ?
Any suggestion ?
Thanks in advance.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081016/51c8f7eb/attachment.htm>

From bkyoung at gmail.com  Thu Oct 16 14:52:50 2008
From: bkyoung at gmail.com (Brandon Young)
Date: Thu, 16 Oct 2008 09:52:50 -0500
Subject: [Linux-cluster] GFS Tunables
Message-ID: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>

Hi all,

I currently have a GFS deployment consisting of eight servers and several
GFS volumes.  One of my GFS servers is a dedicated backup server with a
second replica SAN attached to it through a second HBA.  My approach to
backups has been with tools such as rsync and rdiff-backup, run on a nightly
basis.  I am having a particular problem with one or two of my filesystems
taking a *very* long time to backup.  For example, I have /home living on
GFS.  Day-to-day performance is acceptable, but backups are hideously slow.
Every night, I kick off an rdiff-backup of /home from my backup server,
which dumps the backup onto an XFS filesystem on the replica SAN.  This
backup can take days in some cases.

We have done some investigating, and found that it appears that getdents(2)
calls (which give the list of filenames present in a directory) are
spectacularly slow on GFS, irrespective of the size of the directory in
question.  In particular, with 'strace -r', I'm seeing a rate below 100
filenames per second.  The filesystem /home has at least 10 million files in
it, which doing the math means 29.5 hours just to do the getdents calls to
scan them, which is more than a third of wall-clock time.  And that's before
we even start stat'ing.

I google'd around a bit and I can't see any discussion of slow getdents
calls under GFS.  Is there any chance we have some sort of tunable turned
on/off that might be causing this?  I'm not sure which tunables to consider
tweaking, even.  This seems awfully slow, even with sub-optimal locking.  Is
there perhaps some tunable I can try tweaking to improve this situation?
Any insights would be much appreciated.

--
Brandon
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081016/aa7f4f0e/attachment.htm>

From Greg.Caetano at hp.com  Thu Oct 16 15:05:13 2008
From: Greg.Caetano at hp.com (Caetano, Greg)
Date: Thu, 16 Oct 2008 15:05:13 +0000
Subject: [Linux-cluster] RE: Fencing quandry
In-Reply-To: <1224106943.2991.59.camel@dhcp80-204.msp.redhat.com>
References: <38A48FA2F0103444906AD22E14F1B5A3082D8723@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC165687B42E@GVW0676EXC.americas.hpqcorp.net>
	<38A48FA2F0103444906AD22E14F1B5A3082D8867@mailxchg01.corp.opsource.net>
	<EC61DD7B6048464AB0E1B713AF7521BC1656AD7DED@GVW0676EXC.americas.hpqcorp.net>
	<1224106691.3367.19.camel@localhost.localdomain>
	<1224106943.2991.59.camel@dhcp80-204.msp.redhat.com>
Message-ID: <DC4F4A611E70464E9D33EE2A18A8D07F36EC3BDC9B@GVW1158EXB.americas.hpqcorp.net>

As mentioned the version of the ilo firmware caused some issues for cluster admins because additional features/commands were incorporated. This topic was discussed at the Red Hat Summit and a single command of "COLD_BOOT_SERVER" would perform a power off/wait 4 seconds/cold boot the server. This directive was suggested as a replacement for the "HOLD_PWR_BTN" directive in the scripts


Greg Caetano
Hewlett-Packard Company
ESS Software
Platform & Business Enablement Solutions Engineering
Chicago, IL
greg.caetano at hp.com
Red Hat Certified Engineer
RHCE#805007310328754

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Kevin Anderson
Sent: Wednesday, October 15, 2008 4:42 PM
To: linux clustering
Subject: RE: [Linux-cluster] RE: Fencing quandry

On Wed, 2008-10-15 at 17:38 -0400, jim parsons wrote:
> On Wed, 2008-10-15 at 20:45 +0000, Hofmeister, James (WTEC Linux) wrote:
> > Hello Jeff,
> >
> > RE: [Linux-cluster] RE: Fencing quandary
> >
> > The root issue is the ILO scripts are not up to date with the current firmware rev in the c-class and p-class blades.
> >
> > The method of '<device name="ilo01"/>' for a "reboot" is not working with this ILO firmware rev and the workaround is to send 2 commands to ILO under a single method... 'action="off"/' and 'action="on"/'.
> >
> > I had tested this with my p-class blades and it was successful.  I am still waiting for my customers test results on their c-class blades.
> >
> > ...yes this is the root issue to the ILO problem, but it does not completely address your concern.  I believe you are saying: That the RHCS does not accept a "power off" as a fence, but is requiring both "power off" followed by "power on".
> Right. It is failing because the 'power on' portion is not completing
> because the fence agent is unable to send the correct power on command.
>

But the point is, even if the power on command fails, the fencing agent
should report success, since the real need is to ensure the machine is
no longer participating in the cluster and not bring it back up.

So, is it proper to report success if part of the request fails as long
as the critical part succeeds?

Kevin


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


From shawnlhood at gmail.com  Thu Oct 16 15:29:23 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Thu, 16 Oct 2008 11:29:23 -0400
Subject: [Linux-cluster] fencing problem
Message-ID: <cfe2fc960810160829l3bf5d9e1o22ce6c1e9dc69aa3@mail.gmail.com>

All,

I'll provide some more config details a little later, but thought
maybe some cursory information could yield a response.  Simple four
node cluster running RHEL4U7, latest RHEL cluster packages.  Three GFS
filesystems.  This morning one of our nodes remained responsive, but
was having some problems that required a reboot.  Unfortunately, most
commands from the command line were unsuccessful (Input/Output error,
seems the root filesystem may have been remounted read only).  I
decided to fence the node from another node in the cluster -- using
fence_node <nodename>.  This calls fence_drac.  The operation returned
successful, the node was fenced and rebooted.

After this fencing operation, all nodes reporting their Membership
state (as reported by cman_tool status) as Transition-Master.  Per
http://sources.redhat.com/cluster/faq.html#gfs_fencefreeze, I
understand that GFS will freeze briefly after fencing is performed.
The filesystems did not return to a responsive state.  After many
transition restarts, all nodes leave the cluster (as expected).  Some
logs and cluster.conf below.

Shawn

Oct 16 10:09:12 hugin fence_node[3512]: Fence of "munin" was successful
Oct 16 10:09:32 hugin kernel: CMAN: removing node munin from the
cluster : Missed too many heartbeats
Oct 16 10:09:32 hugin kernel: CMAN: Initiating transition, generation 69
Oct 16 10:09:47 hugin kernel: CMAN: Initiating transition, generation 70
Oct 16 10:10:02 hugin kernel: CMAN: Initiating transition, generation 71
Oct 16 10:10:17 hugin kernel: CMAN: Initiating transition, generation 72
Oct 16 10:10:32 hugin kernel: CMAN: Initiating transition, generation 73
Oct 16 10:10:47 hugin kernel: CMAN: Initiating transition, generation 74
Oct 16 10:11:02 hugin kernel: CMAN: Initiating transition, generation 75
Oct 16 10:11:17 hugin kernel: CMAN: Initiating transition, generation 76
Oct 16 10:11:32 hugin kernel: CMAN: Initiating transition, generation 77
Oct 16 10:11:47 hugin kernel: CMAN: Initiating transition, generation 78
Oct 16 10:12:02 hugin kernel: CMAN: Initiating transition, generation 79
Oct 16 10:12:14 hugin kernel: CMAN: removing node odin from the
cluster : Inconsistent cluster view
Oct 16 10:12:14 hugin kernel: CMAN: Initiating transition, generation 80
Oct 16 10:12:14 hugin kernel: CMAN: removing node odin from the
cluster : Inconsistent cluster view
Oct 16 10:12:14 hugin kernel: CMAN: Initiating transition, generation 81
Oct 16 10:12:16 hugin kernel: CMAN: removing node zeus from the
cluster : Inconsistent cluster view
Oct 16 10:12:16 hugin kernel: CMAN: quorum lost, blocking activity
Oct 16 10:12:16 hugin clurgmgrd[8799]: <emerg> #1: Quorum Dissolved
Oct 16 10:12:16 hugin kernel: CMAN: removing node zeus from the
cluster : Inconsistent cluster view
Oct 16 10:12:19 hugin ccsd[6330]: Cluster is not quorate.  Refusing connection.
Oct 16 10:12:19 hugin ccsd[6330]: Error while processing connect:
Connection refused
Oct 16 10:12:29 hugin ccsd[6330]: Cluster is not quorate.  Refusing connection.
Oct 16 10:12:29 hugin ccsd[6330]: Error while processing connect:
Connection refused
Oct 16 10:12:39 hugin ccsd[6330]: Cluster is not quorate.  Refusing connection.
Oct 16 10:13:47 hugin kernel: CMAN: node munin rejoining
Oct 16 10:13:47 hugin kernel: CMAN: Completed transition, generation 81
Oct 16 10:13:49 hugin ccsd[6330]: Cluster is not quorate.  Refusing connection.
Oct 16 10:13:49 hugin ccsd[6330]: Error while processing connect:
Connection refused
-- previous error message repeated several times ---

Another node in the same cluster, after fencing munin from hugin:
Oct 16 10:09:31 zeus kernel: CMAN: removing node munin from the
cluster : Missed too many heartbeats
Oct 16 10:09:31 zeus kernel: CMAN: Initiating transition, generation 69
Oct 16 10:09:46 zeus kernel: CMAN: Initiating transition, generation 70
Oct 16 10:10:01 zeus kernel: CMAN: Initiating transition, generation 71

cluster.conf:

<?xml version="1.0"?>
<cluster alias="tungsten" config_version="31" name="qualia">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="odin" votes="1">
                        <fence>
                                <method name="1">
                                        <device modulename="" name="odin-drac"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="hugin" votes="1">
                        <fence>
                                <method name="1">
                                        <device modulename=""
name="hugin-drac"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="munin" votes="1">
                        <fence>
                                <method name="1">
                                        <device modulename=""
name="munin-drac"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="zeus" votes="1">
                        <fence>
                                <method name="1">
                                        <device modulename="" name="zeus-drac"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="0"/>
        <fencedevices>
                <resources/>
                <fencedevice name="odin-drac" agent="fence_drac" <redacted>/>
                <fencedevice name="hugin-drac" agent="fence_drac" <redacted>/>
                <fencedevice name="munin-drac" agent="fence_drac" <redacted>/>
                <fencedevice name="zeus-drac" agent="fence_drac" <redacted>/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
</cluster>


From kanderso at redhat.com  Thu Oct 16 15:40:23 2008
From: kanderso at redhat.com (Kevin Anderson)
Date: Thu, 16 Oct 2008 10:40:23 -0500
Subject: [Linux-cluster] fencing problem
In-Reply-To: <cfe2fc960810160829l3bf5d9e1o22ce6c1e9dc69aa3@mail.gmail.com>
References: <cfe2fc960810160829l3bf5d9e1o22ce6c1e9dc69aa3@mail.gmail.com>
Message-ID: <1224171623.2982.14.camel@dhcp80-204.msp.redhat.com>

Shawn,

Not sure about your problem, but there is an issue with your
cluster.conf file.  You should remove this line:

        <cman expected_votes="1" two_node="0"/>

Since you have more than 2 nodes in your cluster.  Looks like an
artifact from running a two node cluster and then upgrading.  When
turning two_node off, you need to also remove the expected_votes setting
as well.

Kevin


From shawnlhood at gmail.com  Thu Oct 16 15:42:59 2008
From: shawnlhood at gmail.com (Shawn Hood)
Date: Thu, 16 Oct 2008 11:42:59 -0400
Subject: [Linux-cluster] fencing problem
In-Reply-To: <1224171623.2982.14.camel@dhcp80-204.msp.redhat.com>
References: <cfe2fc960810160829l3bf5d9e1o22ce6c1e9dc69aa3@mail.gmail.com>
	<1224171623.2982.14.camel@dhcp80-204.msp.redhat.com>
Message-ID: <cfe2fc960810160842g3922ec1aocf939edd2d440d3b@mail.gmail.com>

It is indeed an artifact of times past.  Thanks for pointing this out!

Shawn

On Thu, Oct 16, 2008 at 11:40 AM, Kevin Anderson <kanderso at redhat.com> wrote:
> Shawn,
>
> Not sure about your problem, but there is an issue with your
> cluster.conf file.  You should remove this line:
>
>        <cman expected_votes="1" two_node="0"/>
>
> Since you have more than 2 nodes in your cluster.  Looks like an
> artifact from running a two node cluster and then upgrading.  When
> turning two_node off, you need to also remove the expected_votes setting
> as well.
>
> Kevin
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


-- 
Shawn Hood
910.670.1819 m


From s.wendy.cheng at gmail.com  Thu Oct 16 15:50:53 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Thu, 16 Oct 2008 10:50:53 -0500
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>
Message-ID: <48F762DD.8040102@gmail.com>

Brandon Young wrote:
> Hi all,
>
> I currently have a GFS deployment consisting of eight servers and 
> several GFS volumes.  One of my GFS servers is a dedicated backup 
> server with a second replica SAN attached to it through a second HBA.  
> My approach to backups has been with tools such as rsync and 
> rdiff-backup, run on a nightly basis.  I am having a particular 
> problem with one or two of my filesystems taking a *very* long time to 
> backup.  For example, I have /home living on GFS.  Day-to-day 
> performance is acceptable, but backups are hideously slow.  Every 
> night, I kick off an rdiff-backup of /home from my backup server, 
> which dumps the backup onto an XFS filesystem on the replica SAN.  
> This backup can take days in some cases.

Not only GFS, the "getdents()" has been more than annoying on many
filesystems if entries count within the directory is high - but, yes,
GFS is particularly bloody slow with its directory read. There have been
efforts contributed by Red Hat POSIX and LIBC folks to have new
standardized light-weight directory operations. Unfortunately I lost
tracks of their progress ... On the other hand, integrating these new
calls into GFS would take time anyway (if they are available) - so
unlikely it can meet your need. There were also few experimental GFS
patches but none of them made into the production code.

Unless other GFS folks can give you more ideas, I think your best bet at
this moment is to think "outside" the box. That is, don't do
file-to-file backup if all possible. Check out other block level backup
strategies. Are Linux LVM mirroring and/or snapshots workable for you ?
Does your SAN vendor provide embedded features (e.g. Netapp SAN box
offers snapshot, snapmirror, syncmirror, etc) ?

-- Wendy

>
> We have done some investigating, and found that it appears that 
> getdents(2) calls (which give the list of filenames present in a 
> directory) are spectacularly slow on GFS, irrespective of the size of 
> the directory in question.  In particular, with 'strace -r', I'm 
> seeing a rate below 100 filenames per second.  The filesystem /home 
> has at least 10 million files in it, which doing the math means 29.5 
> hours just to do the getdents calls to scan them, which is more than a 
> third of wall-clock time.  And that's before we even start stat'ing.
>
> I google'd around a bit and I can't see any discussion of slow 
> getdents calls under GFS.  Is there any chance we have some sort of 
> tunable turned on/off that might be causing this?  I'm not sure which 
> tunables to consider tweaking, even.  This seems awfully slow, even 
> with sub-optimal locking.  Is there perhaps some tunable I can try 
> tweaking to improve this situation?  Any insights would be much 
> appreciated.
>
> --
> Brandon
> ------------------------------------------------------------------------
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From jos at xos.nl  Thu Oct 16 16:30:56 2008
From: jos at xos.nl (Jos Vos)
Date: Thu, 16 Oct 2008 18:30:56 +0200
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <48F762DD.8040102@gmail.com>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>
	<48F762DD.8040102@gmail.com>
Message-ID: <20081016163056.GA14934@jasmine.xos.nl>

On Thu, Oct 16, 2008 at 10:50:53AM -0500, Wendy Cheng wrote:

> Unless other GFS folks can give you more ideas, I think your best bet at
> this moment is to think "outside" the box. That is, don't do
> file-to-file backup if all possible. Check out other block level backup
> strategies. Are Linux LVM mirroring and/or snapshots workable for you ?
> Does your SAN vendor provide embedded features (e.g. Netapp SAN box
> offers snapshot, snapmirror, syncmirror, etc) ?

What about GFS2?

We have similar problems, using GFS on a ftp server, where (for example)
doing rsync's is almost impossible for large trees.

We tried some of the tuning suggestions you made in earlier mails and on
your web pages on RHEL 5.l, but none of them had a substantial effect,
only the tuning for making "df" more responsive worked.

We (while already having put part of our volumes on ext3 with NFS, a
situation that is far from ideal for the cluster) are about to do some
new tests.  One of the is trying GFS2 on one volume.

I'd appreciate if you can summarize (references to) the current
(RHEL 5.2) tuning possibilities for GFS.  If there is nothing new,
we want to start a test with GFS2.

-- 
--    Jos Vos <jos at xos.nl>
--    X/OS Experts in Open Systems BV   |   Phone: +31 20 6938364
--    Amsterdam, The Netherlands        |     Fax: +31 20 6948204


From gordan at bobich.net  Thu Oct 16 16:44:20 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 16 Oct 2008 17:44:20 +0100
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <20081016163056.GA14934@jasmine.xos.nl>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>	<48F762DD.8040102@gmail.com>
	<20081016163056.GA14934@jasmine.xos.nl>
Message-ID: <48F76F64.80704@bobich.net>

Jos Vos wrote:
> On Thu, Oct 16, 2008 at 10:50:53AM -0500, Wendy Cheng wrote:
> 
>> Unless other GFS folks can give you more ideas, I think your best bet at
>> this moment is to think "outside" the box. That is, don't do
>> file-to-file backup if all possible. Check out other block level backup
>> strategies. Are Linux LVM mirroring and/or snapshots workable for you ?
>> Does your SAN vendor provide embedded features (e.g. Netapp SAN box
>> offers snapshot, snapmirror, syncmirror, etc) ?
> 
> What about GFS2?
> 
> We have similar problems, using GFS on a ftp server, where (for example)
> doing rsync's is almost impossible for large trees.
> 
> We tried some of the tuning suggestions you made in earlier mails and on
> your web pages on RHEL 5.l, but none of them had a substantial effect,
> only the tuning for making "df" more responsive worked.
> 
> We (while already having put part of our volumes on ext3 with NFS, a
> situation that is far from ideal for the cluster) are about to do some
> new tests.  One of the is trying GFS2 on one volume.
> 
> I'd appreciate if you can summarize (references to) the current
> (RHEL 5.2) tuning possibilities for GFS.  If there is nothing new,
> we want to start a test with GFS2.

Since you're experimenting, OCFS2 might be worth trying, on the 
offchance that it works better for your specific usage pattern.

Gordan


From jos at xos.nl  Thu Oct 16 17:58:18 2008
From: jos at xos.nl (Jos Vos)
Date: Thu, 16 Oct 2008 19:58:18 +0200
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <48F76F64.80704@bobich.net>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>
	<48F762DD.8040102@gmail.com>
	<20081016163056.GA14934@jasmine.xos.nl> <48F76F64.80704@bobich.net>
Message-ID: <20081016175818.GA16742@jasmine.xos.nl>

On Thu, Oct 16, 2008 at 05:44:20PM +0100, Gordan Bobic wrote:

> Since you're experimenting, OCFS2 might be worth trying, on the 
> offchance that it works better for your specific usage pattern.

Not that kind of experimenting ;-), I want it to be based on RHEL5.

-- 
--    Jos Vos <jos at xos.nl>
--    X/OS Experts in Open Systems BV   |   Phone: +31 20 6938364
--    Amsterdam, The Netherlands        |     Fax: +31 20 6948204


From gordan at bobich.net  Thu Oct 16 18:16:32 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 16 Oct 2008 19:16:32 +0100
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <20081016175818.GA16742@jasmine.xos.nl>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>	<48F762DD.8040102@gmail.com>	<20081016163056.GA14934@jasmine.xos.nl>
	<48F76F64.80704@bobich.net> <20081016175818.GA16742@jasmine.xos.nl>
Message-ID: <48F78500.5000100@bobich.net>

Jos Vos wrote:
> On Thu, Oct 16, 2008 at 05:44:20PM +0100, Gordan Bobic wrote:
> 
>> Since you're experimenting, OCFS2 might be worth trying, on the 
>> offchance that it works better for your specific usage pattern.
> 
> Not that kind of experimenting ;-), I want it to be based on RHEL5.

OCFS2 will run on RHEL5, and there is no need to abandon RHCS. It's just 
a different file system, and there are RPMs for RHEL5 available.

Gordan


From bkyoung at gmail.com  Thu Oct 16 20:21:05 2008
From: bkyoung at gmail.com (Brandon Young)
Date: Thu, 16 Oct 2008 15:21:05 -0500
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <48F762DD.8040102@gmail.com>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>
	<48F762DD.8040102@gmail.com>
Message-ID: <824ffea00810161321n569e4fd3wa707c96d884588e2@mail.gmail.com>

Wendy,

We have searched high and low for an alternative to file-to-file backups,
especially looking block level backups.  The only product we've found that
"supports" GFS is Bak Bone Replicator.  My first crack at installing it was
late last week.  The experience was worrisome.  The replicator service
inserts a kernel module, which by itself is livable; but in our particular
case, we found a changed behavior in error codes the kernel returns for
things like non existent files, while this module is loaded.  Ultimately, if
the kernel module was the root cause of that behavior (we're still
investigating), that's unworkable.

As for LVM snapshotting ... I am under the impression that those features
are unavailable in GFS (and are slated for GFS2?  Which is not "production
ready", yet?)  It has certainly occured to me to try that feature, if only
it were available.  Am I misinformed?  Perhaps I need some more education on
how exactly LVM mirroring will help me.  I am *attempting* to approximate a
traditional backup scheme, atleast on this particular filesystem.  Am I
correct in believing that I could snapshot a volume (assuming the feature is
available) and run a traditional backup (using, say, rdiff-backup) in a
shorter time than I can now, where I'm running it straight off a live GFS
volume?

--
Brandon

On Thu, Oct 16, 2008 at 10:50 AM, Wendy Cheng <s.wendy.cheng at gmail.com>wrote:

> Brandon Young wrote:
>
>> Hi all,
>>
>> I currently have a GFS deployment consisting of eight servers and several
>> GFS volumes.  One of my GFS servers is a dedicated backup server with a
>> second replica SAN attached to it through a second HBA.  My approach to
>> backups has been with tools such as rsync and rdiff-backup, run on a nightly
>> basis.  I am having a particular problem with one or two of my filesystems
>> taking a *very* long time to backup.  For example, I have /home living on
>> GFS.  Day-to-day performance is acceptable, but backups are hideously slow.
>>  Every night, I kick off an rdiff-backup of /home from my backup server,
>> which dumps the backup onto an XFS filesystem on the replica SAN.  This
>> backup can take days in some cases.
>>
>
> Not only GFS, the "getdents()" has been more than annoying on many
> filesystems if entries count within the directory is high - but, yes,
> GFS is particularly bloody slow with its directory read. There have been
> efforts contributed by Red Hat POSIX and LIBC folks to have new
> standardized light-weight directory operations. Unfortunately I lost
> tracks of their progress ... On the other hand, integrating these new
> calls into GFS would take time anyway (if they are available) - so
> unlikely it can meet your need. There were also few experimental GFS
> patches but none of them made into the production code.
>
> Unless other GFS folks can give you more ideas, I think your best bet at
> this moment is to think "outside" the box. That is, don't do
> file-to-file backup if all possible. Check out other block level backup
> strategies. Are Linux LVM mirroring and/or snapshots workable for you ?
> Does your SAN vendor provide embedded features (e.g. Netapp SAN box
> offers snapshot, snapmirror, syncmirror, etc) ?
>
> -- Wendy
>
>
>> We have done some investigating, and found that it appears that
>> getdents(2) calls (which give the list of filenames present in a directory)
>> are spectacularly slow on GFS, irrespective of the size of the directory in
>> question.  In particular, with 'strace -r', I'm seeing a rate below 100
>> filenames per second.  The filesystem /home has at least 10 million files in
>> it, which doing the math means 29.5 hours just to do the getdents calls to
>> scan them, which is more than a third of wall-clock time.  And that's before
>> we even start stat'ing.
>>
>> I google'd around a bit and I can't see any discussion of slow getdents
>> calls under GFS.  Is there any chance we have some sort of tunable turned
>> on/off that might be causing this?  I'm not sure which tunables to consider
>> tweaking, even.  This seems awfully slow, even with sub-optimal locking.  Is
>> there perhaps some tunable I can try tweaking to improve this situation?
>>  Any insights would be much appreciated.
>>
>> --
>> Brandon
>> ------------------------------------------------------------------------
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081016/3325a7a8/attachment.htm>

From gordan at bobich.net  Thu Oct 16 20:29:52 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 16 Oct 2008 21:29:52 +0100
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <824ffea00810161321n569e4fd3wa707c96d884588e2@mail.gmail.com>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>	<48F762DD.8040102@gmail.com>
	<824ffea00810161321n569e4fd3wa707c96d884588e2@mail.gmail.com>
Message-ID: <48F7A440.9050604@bobich.net>

Brandon Young wrote:

> As for LVM snapshotting ... I am under the impression that those 
> features are unavailable in GFS (and are slated for GFS2?  Which is not 
> "production ready", yet?)  It has certainly occured to me to try that 
> feature, if only it were available.  Am I misinformed?  Perhaps I need 
> some more education on how exactly LVM mirroring will help me.  I am 
> *attempting* to approximate a traditional backup scheme, atleast on this 
> particular filesystem.  Am I correct in believing that I could snapshot 
> a volume (assuming the feature is available) and run a traditional 
> backup (using, say, rdiff-backup) in a shorter time than I can now, 
> where I'm running it straight off a live GFS volume?

You can use CLVM (Cluser aware LVM) and create GFS on top of that 
volume. You can them use CLVM to take a snapshot of the block device, 
mount it read-only with lock_nolock and back that up. That should go at 
non-clustered FS speeds.

Gordan


From siddiqut at gmail.com  Thu Oct 16 20:32:31 2008
From: siddiqut at gmail.com (Tajdar Siddiqui)
Date: Thu, 16 Oct 2008 16:32:31 -0400
Subject: [Linux-cluster] gfs cluster server question
Message-ID: <3abaa1ce0810161332u38e20545j9fd03cc6169eeeea@mail.gmail.com>

I apologize in advance i have not worded my question correctly:

What kind of issues need to be considered is some of the servers connecting
to gfs san are spread over WAN.
Assume that data will be written/read by all the servers in the cluster and
also that there will be cross talk: meaning data written by Server1 which
connects to SAN over WAN will be read by Server 2 which connects to SAN over
LAN and vice versa.

Thanx,
Tajdar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081016/3245ef5e/attachment.htm>

From kanderso at redhat.com  Thu Oct 16 20:35:02 2008
From: kanderso at redhat.com (Kevin Anderson)
Date: Thu, 16 Oct 2008 15:35:02 -0500
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <48F7A440.9050604@bobich.net>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>
	<48F762DD.8040102@gmail.com>
	<824ffea00810161321n569e4fd3wa707c96d884588e2@mail.gmail.com>
	<48F7A440.9050604@bobich.net>
Message-ID: <1224189302.2982.52.camel@dhcp80-204.msp.redhat.com>

On Thu, 2008-10-16 at 21:29 +0100, Gordan Bobic wrote:
> Brandon Young wrote:
> 
> > As for LVM snapshotting ... I am under the impression that those 
> > features are unavailable in GFS (and are slated for GFS2?  Which is not 
> > "production ready", yet?)  It has certainly occured to me to try that 
> > feature, if only it were available.  Am I misinformed?  Perhaps I need 
> > some more education on how exactly LVM mirroring will help me.  I am 
> > *attempting* to approximate a traditional backup scheme, atleast on this 
> > particular filesystem.  Am I correct in believing that I could snapshot 
> > a volume (assuming the feature is available) and run a traditional 
> > backup (using, say, rdiff-backup) in a shorter time than I can now, 
> > where I'm running it straight off a live GFS volume?
> 
> You can use CLVM (Cluser aware LVM) and create GFS on top of that 
> volume. You can them use CLVM to take a snapshot of the block device, 
> mount it read-only with lock_nolock and back that up. That should go at 
> non-clustered FS speeds.
> 
We don't have support for cluster snapshots as of yet even though it has
been on the todo list for about 5 years now :(.

Kevin


From gordan at bobich.net  Thu Oct 16 20:47:50 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 16 Oct 2008 21:47:50 +0100
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <1224189302.2982.52.camel@dhcp80-204.msp.redhat.com>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>	<48F762DD.8040102@gmail.com>	<824ffea00810161321n569e4fd3wa707c96d884588e2@mail.gmail.com>	<48F7A440.9050604@bobich.net>
	<1224189302.2982.52.camel@dhcp80-204.msp.redhat.com>
Message-ID: <48F7A876.6010300@bobich.net>

Kevin Anderson wrote:
> On Thu, 2008-10-16 at 21:29 +0100, Gordan Bobic wrote:
>> Brandon Young wrote:
>>
>>> As for LVM snapshotting ... I am under the impression that those 
>>> features are unavailable in GFS (and are slated for GFS2?  Which is not 
>>> "production ready", yet?)  It has certainly occured to me to try that 
>>> feature, if only it were available.  Am I misinformed?  Perhaps I need 
>>> some more education on how exactly LVM mirroring will help me.  I am 
>>> *attempting* to approximate a traditional backup scheme, atleast on this 
>>> particular filesystem.  Am I correct in believing that I could snapshot 
>>> a volume (assuming the feature is available) and run a traditional 
>>> backup (using, say, rdiff-backup) in a shorter time than I can now, 
>>> where I'm running it straight off a live GFS volume?
>> You can use CLVM (Cluser aware LVM) and create GFS on top of that 
>> volume. You can them use CLVM to take a snapshot of the block device, 
>> mount it read-only with lock_nolock and back that up. That should go at 
>> non-clustered FS speeds.
>>
> We don't have support for cluster snapshots as of yet even though it has
> been on the todo list for about 5 years now :(.

Joy... My mistake. Sorry I mentioned it.

Gordan


From swplotner at amherst.edu  Thu Oct 16 21:22:04 2008
From: swplotner at amherst.edu (Steffen Plotner)
Date: Thu, 16 Oct 2008 17:22:04 -0400
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <48F7A876.6010300@bobich.net>
Message-ID: <D0CAF0E9BC817D418E7E929CA8933C0F0955AC@mail8.amherst.edu>

 
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Gordan Bobic
> Sent: Thursday, October 16, 2008 4:48 PM
> To: linux clustering
> Subject: Re: [Linux-cluster] GFS Tunables
> 
> Kevin Anderson wrote:
> > On Thu, 2008-10-16 at 21:29 +0100, Gordan Bobic wrote:
> >> Brandon Young wrote:
> >>
> >>> As for LVM snapshotting ... I am under the impression that those 
> >>> features are unavailable in GFS (and are slated for GFS2? 
>  Which is 
> >>> not "production ready", yet?)  It has certainly occured 
> to me to try 
> >>> that feature, if only it were available.  Am I 
> misinformed?  Perhaps 
> >>> I need some more education on how exactly LVM mirroring will help 
> >>> me.  I am
> >>> *attempting* to approximate a traditional backup scheme, 
> atleast on 
> >>> this particular filesystem.  Am I correct in believing 
> that I could 
> >>> snapshot a volume (assuming the feature is available) and run a 
> >>> traditional backup (using, say, rdiff-backup) in a 
> shorter time than 
> >>> I can now, where I'm running it straight off a live GFS volume?
> >> You can use CLVM (Cluser aware LVM) and create GFS on top of that 
> >> volume. You can them use CLVM to take a snapshot of the 
> block device, 
> >> mount it read-only with lock_nolock and back that up. That 
> should go 
> >> at non-clustered FS speeds.
> >>
> > We don't have support for cluster snapshots as of yet even 
> though it 
> > has been on the todo list for about 5 years now :(.
> 
> Joy... My mistake. Sorry I mentioned it.
> 

How about snapshotting at the backend storage device? We usually use
linux as the backed, hand out storage via iscsi and snapshot at the
backend - this eliminates the need for doing snaps at the GFS level - I
agree that if there was snapshotting at the LVM/GFS level we could get a
clean snapshot.... another problem..

> Gordan
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 


From gordan at bobich.net  Thu Oct 16 21:42:07 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 16 Oct 2008 22:42:07 +0100
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <D0CAF0E9BC817D418E7E929CA8933C0F0955AC@mail8.amherst.edu>
References: <D0CAF0E9BC817D418E7E929CA8933C0F0955AC@mail8.amherst.edu>
Message-ID: <48F7B52F.4000907@bobich.net>

Steffen Plotner wrote:

>> Kevin Anderson wrote:
>>> On Thu, 2008-10-16 at 21:29 +0100, Gordan Bobic wrote:
>>>> Brandon Young wrote:
>>>>
>>>>> As for LVM snapshotting ... I am under the impression that those 
>>>>> features are unavailable in GFS (and are slated for GFS2? 
>>  Which is 
>>>>> not "production ready", yet?)  It has certainly occured 
>> to me to try 
>>>>> that feature, if only it were available.  Am I 
>> misinformed?  Perhaps 
>>>>> I need some more education on how exactly LVM mirroring will help 
>>>>> me.  I am
>>>>> *attempting* to approximate a traditional backup scheme, 
>> atleast on 
>>>>> this particular filesystem.  Am I correct in believing 
>> that I could 
>>>>> snapshot a volume (assuming the feature is available) and run a 
>>>>> traditional backup (using, say, rdiff-backup) in a 
>> shorter time than 
>>>>> I can now, where I'm running it straight off a live GFS volume?
>>>> You can use CLVM (Cluser aware LVM) and create GFS on top of that 
>>>> volume. You can them use CLVM to take a snapshot of the 
>> block device, 
>>>> mount it read-only with lock_nolock and back that up. That 
>> should go 
>>>> at non-clustered FS speeds.
>>>>
>>> We don't have support for cluster snapshots as of yet even 
>> though it 
>>> has been on the todo list for about 5 years now :(.
>> Joy... My mistake. Sorry I mentioned it.
>>
> 
> How about snapshotting at the backend storage device? We usually use
> linux as the backed, hand out storage via iscsi and snapshot at the
> backend - this eliminates the need for doing snaps at the GFS level - I
> agree that if there was snapshotting at the LVM/GFS level we could get a
> clean snapshot.... another problem..

The problem with that being that you have your storage system as a 
single point of failure which rather defeats the point of clustering.

Gordan


From pbruna at it-linux.cl  Fri Oct 17 04:03:14 2008
From: pbruna at it-linux.cl (Patricio A. Bruna)
Date: Fri, 17 Oct 2008 01:03:14 -0300 (CLST)
Subject: [Linux-cluster] Email alert
Message-ID: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>

Its possible to configure Cluster Suite to send an email when a service change host or faild to failover? 

------------------------------------ 
Patricio Bruna V. 
IT Linux Ltda. 
http://www.it-linux.cl 
Fono : (+56-2) 333 0578 - Chile 
Fono: (+54-11) 6632 2760 - Argentina 
M?vil : (+56-09) 8827 0342 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081017/502e7762/attachment.htm>

From tom at netspot.com.au  Fri Oct 17 04:58:06 2008
From: tom at netspot.com.au (Tom Lanyon)
Date: Fri, 17 Oct 2008 15:28:06 +1030
Subject: [Linux-cluster] GFS Tunables
In-Reply-To: <1224189302.2982.52.camel@dhcp80-204.msp.redhat.com>
References: <824ffea00810160752l1aa6df1el68032eba4ac87df5@mail.gmail.com>
	<48F762DD.8040102@gmail.com>
	<824ffea00810161321n569e4fd3wa707c96d884588e2@mail.gmail.com>
	<48F7A440.9050604@bobich.net>
	<1224189302.2982.52.camel@dhcp80-204.msp.redhat.com>
Message-ID: <EC3112E6-EE56-4F18-927C-DAD02E35C755@netspot.com.au>


On 17/10/2008, at 7:05 AM, Kevin Anderson wrote:
> We don't have support for cluster snapshots as of yet even though it  
> has
> been on the todo list for about 5 years now :(.
>
> Kevin


I asked recently on the lvm list what the status of CLVM snapshotting  
was and got no response... :)


From macscr at macscr.com  Fri Oct 17 05:19:06 2008
From: macscr at macscr.com (Mark Chaney)
Date: Fri, 17 Oct 2008 00:19:06 -0500
Subject: [Linux-cluster] Email alert
In-Reply-To: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>
References: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>
Message-ID: <028901c93017$e230f490$a692ddb0$@com>

No, but you can use a monitoring service like nagios to do that.

 
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Patricio A. Bruna
Sent: Thursday, October 16, 2008 11:03 PM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Email alert

 
Its possible to configure Cluster Suite to send an email when a service change host or faild to failover?

------------------------------------
Patricio Bruna V.
IT Linux Ltda.
http://www.it-linux.cl
Fono : (+56-2) 333 0578 - Chile
Fono: (+54-11) 6632 2760 - Argentina
M?vil : (+56-09) 8827 0342

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081017/d2c00f5a/attachment.htm>

From bsd_daemon at msn.com  Fri Oct 17 07:29:29 2008
From: bsd_daemon at msn.com (Mehmet CELIK)
Date: Fri, 17 Oct 2008 07:29:29 +0000
Subject: [Linux-cluster] Email alert
In-Reply-To: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>
References: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>
Message-ID: <BLU129-W24699DA9EA614E0669BA92E3320@phx.gbl>


Its not possible, right now. Maybe in the future.. But, you can use the mon service.-- Mehmet CELIK
Istanbul/TURKEY


Date: Fri, 17 Oct 2008 01:03:14 -0300From: pbruna at it-linux.clTo: linux-cluster at redhat.comSubject: [Linux-cluster] Email alert


Its possible to configure Cluster Suite to send an email when a service change host or faild to failover?------------------------------------Patricio Bruna V.IT Linux Ltda.http://www.it-linux.clFono : (+56-2) 333 0578 - ChileFono: (+54-11) 6632 2760 - ArgentinaM?vil : (+56-09) 8827 0342
_________________________________________________________________
Store, manage and share up to 5GB with Windows Live SkyDrive.
http://skydrive.live.com/welcome.aspx?provision=1?ocid=TXT_TAGLM_WL_skydrive_102008
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081017/60573513/attachment.htm>

From federico.simoncelli at gmail.com  Fri Oct 17 11:27:02 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Fri, 17 Oct 2008 13:27:02 +0200
Subject: [Linux-cluster] Avoiding fencing loops with startup quorum (patch)
Message-ID: <a01fe36d0810170427i7cb8485v806a899f80a3b715@mail.gmail.com>

Hi all,
  I am managing a two-node cluster without qdisk and to avoid the
fencing loop I thought to implement a startup quorum.
The startup quorum would be the minimum total votes a node requires to
remain in the cluster after the startup process.
When cman is started on a node it joins the cluster and if the total
votes are less than the CMAN_STARTUP_QUORUM it leaves preventing
fencing to be executed.

I configured my CMAN_STARTUP_QUORUM to 2:
- When the two nodes joins the cluster together the total votes are 2;
everything is normal.
- When a node get fenced the remaining one is in quorate (cman real quorum: 1).
- When the fenced node boots up and finds the other node the total
votes are 2; everything is normal.
- When the fenced node boots up and doesn't find the other node the
total votes are 1 (< CMAN_STARTUP_QUORUM); the node leaves the
cluster, stop cman and prevent fencing.

This might be handy for booting up a remote node for maintenance and
not being worried about fencing loops.
The downside is that you can't boot a single node and having it
working alone; this situation can be considered an emergency and can
be handled manually.

The startup quorum might resolve also:

https://www.redhat.com/archives/linux-cluster/2008-June/msg00143.html
https://bugzilla.redhat.com/show_bug.cgi?id=452234

Patch in attachment. Can anyone review it?
Is anyone interested to integrate this same behaviour into cman and
the cluster.conf?

Ex:
  <cman two_node="1" expected_votes="1" startup_votes="2" />

Thanks,
-- 
Federico.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cman_startup_quorum.patch
Type: application/octet-stream
Size: 979 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081017/e0d019d4/attachment.obj>

From jmacfarland at nexatech.com  Fri Oct 17 13:36:04 2008
From: jmacfarland at nexatech.com (Jeff Macfarland)
Date: Fri, 17 Oct 2008 08:36:04 -0500
Subject: [Linux-cluster] Email alert
In-Reply-To: <028901c93017$e230f490$a692ddb0$@com>
References: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>
	<028901c93017$e230f490$a692ddb0$@com>
Message-ID: <48F894C4.9010506@nexatech.com>

Mark Chaney wrote:
> No, but you can use a monitoring service like nagios to do that.

Is "RIND" (http://sources.redhat.com/cluster/wiki/EventScripting) not 
applicable? Or, if implemented in <rm/>, will it prevent the system from 
automated failover of services, etc? I dunno much about slang, but it 
looks like it at least supports system() for a quick email if nothing else.

> 
>  
> 
> *From:* linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] *On Behalf Of *Patricio A. Bruna
> *Sent:* Thursday, October 16, 2008 11:03 PM
> *To:* linux-cluster at redhat.com
> *Subject:* [Linux-cluster] Email alert
> 
>  
> 
> Its possible to configure Cluster Suite to send an email when a service 
> change host or faild to failover?
> 
> ------------------------------------
> Patricio Bruna V.
> IT Linux Ltda.
> http://www.it-linux.cl
> Fono : (+56-2) 333 0578 - Chile
> Fono: (+54-11) 6632 2760 - Argentina
> M?vil : (+56-09) 8827 0342
> 


-- 
Jeff Macfarland (jmacfarland at nexatech.com)
Nexa Technologies - 972.747.8879
Systems Administrator
GPG Key ID: 0x5F1CA61B
GPG Key Server: hkp://wwwkeys.pgp.net


From teigland at redhat.com  Fri Oct 17 15:41:42 2008
From: teigland at redhat.com (David Teigland)
Date: Fri, 17 Oct 2008 10:41:42 -0500
Subject: [Linux-cluster] Avoiding fencing loops with startup quorum (patch)
In-Reply-To: <a01fe36d0810170427i7cb8485v806a899f80a3b715@mail.gmail.com>
References: <a01fe36d0810170427i7cb8485v806a899f80a3b715@mail.gmail.com>
Message-ID: <20081017154142.GE3299@redhat.com>

On Fri, Oct 17, 2008 at 01:27:02PM +0200, Federico Simoncelli wrote:
> Hi all,
>   I am managing a two-node cluster without qdisk and to avoid the
> fencing loop I thought to implement a startup quorum.

Sounds like you want quorum of 2 when nodes are joining, but quorum of 1
after a node fails.  That sounds reasonable.

To do that manually, you *don't* set two_node/expected_votes in
cluster.conf, and then manually run cman_tool expected -e 1 after a node
fails.

Here's another possibility I hadn't thought of before:

. don't set two_node/expteced_votes in cluster.conf
. edit init.d/cman and possibly /etc/sysconfig/cman to do
  cman_tool join -w        (joins cluster and waits to be a member)
  cman_tool wait -q        (waits for quorum, both nodes to be members)
  cman_tool expected -e 1  (change expected votes to 1)

The effects of this will be:

. a node needs to see the other to get quorum and start up
. after both nodes see each other, if one fails, the other will fence
  it and continue
. after both nodes see each other, if they become partitioned, they
  will race to fence each other
. if both nodes are restarted while they are still partitioned, neither
  of them will be able to start

Dave


From federico.simoncelli at gmail.com  Fri Oct 17 16:11:53 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Fri, 17 Oct 2008 18:11:53 +0200
Subject: [Linux-cluster] Avoiding fencing loops with startup quorum (patch)
In-Reply-To: <20081017154142.GE3299@redhat.com>
References: <a01fe36d0810170427i7cb8485v806a899f80a3b715@mail.gmail.com>
	<20081017154142.GE3299@redhat.com>
Message-ID: <a01fe36d0810170911k5ca2a98cqc6d3ee3a31e583a0@mail.gmail.com>

On Fri, Oct 17, 2008 at 5:41 PM, David Teigland <teigland at redhat.com> wrote:
> Here's another possibility I hadn't thought of before:
>
> . don't set two_node/expteced_votes in cluster.conf
> . edit init.d/cman and possibly /etc/sysconfig/cman to do
>  cman_tool join -w        (joins cluster and waits to be a member)
>  cman_tool wait -q        (waits for quorum, both nodes to be members)
>  cman_tool expected -e 1  (change expected votes to 1)

I tried this before but the downsides were:

- long waits due not  being in quorum (fence timeout is 600 seconds by default)
- you have to rewrite the cluster.conf to make it work
- break legacies with distros/systems not using this init script

Let me know what you think about the problems I listed above because
this solution looks much cleaner.
Thanks,
-- 
Federico.


From teigland at redhat.com  Fri Oct 17 16:09:23 2008
From: teigland at redhat.com (David Teigland)
Date: Fri, 17 Oct 2008 11:09:23 -0500
Subject: [Linux-cluster] Avoiding fencing loops with startup quorum (patch)
In-Reply-To: <a01fe36d0810170911k5ca2a98cqc6d3ee3a31e583a0@mail.gmail.com>
References: <a01fe36d0810170427i7cb8485v806a899f80a3b715@mail.gmail.com>
	<20081017154142.GE3299@redhat.com>
	<a01fe36d0810170911k5ca2a98cqc6d3ee3a31e583a0@mail.gmail.com>
Message-ID: <20081017160923.GG3299@redhat.com>

On Fri, Oct 17, 2008 at 06:11:53PM +0200, Federico Simoncelli wrote:
> On Fri, Oct 17, 2008 at 5:41 PM, David Teigland <teigland at redhat.com> wrote:
> > Here's another possibility I hadn't thought of before:
> >
> > . don't set two_node/expteced_votes in cluster.conf
> > . edit init.d/cman and possibly /etc/sysconfig/cman to do
> >  cman_tool join -w        (joins cluster and waits to be a member)
> >  cman_tool wait -q        (waits for quorum, both nodes to be members)
> >  cman_tool expected -e 1  (change expected votes to 1)
> 
> I tried this before but the downsides were:
> 
> - long waits due not  being in quorum (fence timeout is 600 seconds by
> default)

cman_tool join -w can will possibly wait a long time if the other node
is not up or is partitioned.  That's the price you pay for avoiding the
potential back-and-forth fencing loop.

> - you have to rewrite the cluster.conf to make it work

I don't know what you're refering to.  You simply don't include the
two_node/expected_votes line.

> - break legacies with distros/systems not using this init script

Yes, you have to hack the init script.  There's a change coming in RHEL5.3
that's very similar to this, where you won't need to hack anything:

http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commitdiff;h=5ea416d26ec2b6bf605c573a5173736d0f8cd27c

http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commitdiff;h=397b8111d2d69b9dd25e7b074822be571f274032


From federico.simoncelli at gmail.com  Fri Oct 17 16:56:32 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Fri, 17 Oct 2008 18:56:32 +0200
Subject: [Linux-cluster] Avoiding fencing loops with startup quorum (patch)
In-Reply-To: <20081017160923.GG3299@redhat.com>
References: <a01fe36d0810170427i7cb8485v806a899f80a3b715@mail.gmail.com>
	<20081017154142.GE3299@redhat.com>
	<a01fe36d0810170911k5ca2a98cqc6d3ee3a31e583a0@mail.gmail.com>
	<20081017160923.GG3299@redhat.com>
Message-ID: <a01fe36d0810170956m3a9cd5berc5b8b99860f53ef2@mail.gmail.com>

On Fri, Oct 17, 2008 at 6:09 PM, David Teigland <teigland at redhat.com> wrote:
>> I tried this before but the downsides were:
>>
>> - long waits due not  being in quorum (fence timeout is 600 seconds by
>> default)
>
> cman_tool join -w can will possibly wait a long time if the other node
> is not up or is partitioned.  That's the price you pay for avoiding the
> potential back-and-forth fencing loop.

I'll pay :-)
I quickly made a patch (in attachment).

Tested with:

# cat /etc/sysconfig/cman
CMAN_QUORUM_TIMEOUT=10
CMAN_EXPECTED_QUORUM=1

Working fine for now. More testing after the weekend. :-)
Comments are welcome.

--
Federico.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cman_expected_quorum.patch
Type: application/octet-stream
Size: 858 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081017/99195039/attachment.obj>

From virginian at blueyonder.co.uk  Sat Oct 18 12:42:24 2008
From: virginian at blueyonder.co.uk (Virginian)
Date: Sat, 18 Oct 2008 13:42:24 +0100
Subject: [Linux-cluster] Strange error messages in /var/log/messages
References: <A0FC83F6B19340539B25CAB729EFAF64@Desktop><1224085345.3277.2.camel@localhost.localdomain><1224085725.3277.4.camel@localhost.localdomain>
	<1224098289.5912.5.camel@ayanami>
Message-ID: <5108FACF53854CBF98D6CA1C01542AEC@Desktop>

Hi Lon,

I see the attached patch but I am not a programmer so I am not sure what it 
means?

Thanks

John
----- Original Message ----- 
From: "Lon Hohberger" <lhh at redhat.com>
To: "linux clustering" <linux-cluster at redhat.com>
Sent: Wednesday, October 15, 2008 8:18 PM
Subject: Re: [Linux-cluster] Strange error messages in /var/log/messages


> On Wed, 2008-10-15 at 11:48 -0400, jim parsons wrote:
>> On Wed, 2008-10-15 at 11:42 -0400, jim parsons wrote:
>> > On Wed, 2008-10-15 at 16:01 +0100, Virginian wrote:
>> >
>> > >         <fence_xvmd/>
>> >
>> > This tag does not need to be in the inner clusters' (dom u cluster) 
>> > conf file, only the cluster set up on the physical hosts.
>> >
>> > That might be the problem - easy enough to check! :)
>>
>> It would be fun to know if the above fixes the issue. Please let me
>> know.
>
> I think I see it.
>
> -- Lon
>


--------------------------------------------------------------------------------


> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster 


From fdinitto at redhat.com  Mon Oct 20 10:07:26 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Mon, 20 Oct 2008 12:07:26 +0200 (CEST)
Subject: [Linux-cluster] Cluster 2.99.11 (development snapshot) released
Message-ID: <Pine.LNX.4.64.0810201203200.9830@trider-g7>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


The cluster team and its community are proud to announce the 2.99.11
release from the master branch.

Important note: If you are running 2.99.xx series, please upgrade 
immediatly to this version.

This release addresses a major bug in GFS1 kernel module and also contains 
a security fix for fence_egenera.

The development cycle for 3.0 is proceeding at a very good speed and
mostlikely one of the next releases will be 3.0alpha1. All features
designed for 3.0 are being completed and taking a proper shape, the
library API has been stable for sometime (and will soon be marked as 3.0
soname). Stay tuned for upcoming updates!

The 2.99.XX releases are _NOT_ meant to be used for production
environments.. yet.

The master branch is the main development tree that receives all new
features, code, clean up and a whole brand new set of bugs,

At some point in time this code will become the 3.0 stable release.

Everybody with test equipment and time to spare, is highly encouraged to
download, install and test the 2.99 releases and more important report
problems.

In order to build the 2.99.11 release you will need:

- - corosync svn r1667.
- - openais svn r1651.
- - linux kernel (2.6.27)

The new source tarball can be downloaded here:

   ftp://sources.redhat.com/pub/cluster/releases/cluster-2.99.11.tar.gz
   https://fedorahosted.org/releases/c/l/cluster/cluster-2.99.11.tar.gz

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Happy clustering,
Fabio

Under the hood (from 2.99.10):

Abhijith Das (5):
       gfs-kernel: GFS: madvise system call causes assertion
       gfs-kernel: bz457473 - GFS ignore the noatime and nodiratime mount options
       Revert "gfs-kernel: bz457473 - GFS ignore the noatime and nodiratime mount options"
       gfs-kernel and mount.gfs2: GFS ignore the noatime and nodiratime mount options
       gfs-kernel: bz 458765 - In linux-2.6.26 / 2.03.06, GFS1 can't create more than 4kb file

Bob Peterson (2):
       GFS: gfs_fsck invalid response to question changes the question
       gfs-kmod: GFS corruption after forced withdraw

Christine Caulfield (1):
       cman: fix a couple of unhandled malloc failures

David Teigland (8):
       dlm_controld: add protocol negotiation
       fenced: add protocol negotiation
       fenced/fence_tool: improve list info
       fence_tool/dlm_tool/gfs_control: remove error message
       daemons/tools: misc minor cleanups and improvements
       dlm/fence: daemon fixes and tool improvements
       gfs_control: improve ls output
       fenced/dlm_controld/gfs_controld: modify a debug message

Fabio M. Di Nitto (2):
       fence egenera: fix logging file
       rgmanager: fix build after port to logsys

Jan Friesse (1):
       fence: New fence agent for Logical Domains (LDOMs)

Lon Hohberger (5):
       rgmanager: First pass at port to logsys
       group: Allow group_tool ls <name> <level> to be scriptable
       rgmanager: make clulog build even though it's incomplete
       rgmanager: don't change the build target just yet
       [fence] Fix fence_xvmd trying to read wrong args from ccs

Marek 'marx' Grac (3):
       [FENCE] Fix #290231 - "Switch (optional)" param does not default to "1" and program fails
       [RGMANAGER] - Fix #462910 postgres-8.sh and metadata fixes
       [fence] Operation 'list' for APC fence agent

Ryan McCabe (1):
       cman: Fix typo that caused start-up to fail

Ryan O'Hara (3):
       cman: allow custom xen network bridge scripts
       fence_scsi: improve logging for debugging
       fence_scsi: correctly declare key_list

Steven Whitehouse (1):
       libgfs2: Add support for UUID generation to gfs2_mkfs

rohara (2):
       fence_scsi.pl: check if nodeid is zero
       scsi_reserve: add restart option

  cman/daemon/daemon.c                        |    2 +-
  cman/init.d/cman                            |   17 +-
  cman/lib/libcman.c                          |    2 +
  dlm/libdlmcontrol/libdlmcontrol.h           |    1 -
  dlm/tool/main.c                             |  105 +++--
  fence/agents/apc/fence_apc.py               |   20 +-
  fence/agents/egenera/fence_egenera.pl       |    2 +-
  fence/agents/ldom/Makefile                  |    5 +
  fence/agents/ldom/fence_ldom.py             |  101 +++++
  fence/agents/lib/fencing.py.py              |   26 +-
  fence/agents/scsi/fence_scsi.pl             |   97 +++--
  fence/agents/scsi/scsi_reserve              |   55 +++
  fence/agents/xvm/options-ccs.c              |    3 +
  fence/fence_tool/fence_tool.c               |   61 +++-
  fence/fenced/cpg.c                          |  515 +++++++++++++++++++---
  fence/fenced/fd.h                           |   12 +-
  fence/fenced/main.c                         |    8 +
  fence/fenced/member_cman.c                  |    2 +-
  fence/fenced/recover.c                      |   10 +-
  fence/man/Makefile                          |    1 +
  fence/man/fence_ldom.8                      |  114 +++++
  gfs-kernel/src/gfs/glock.h                  |   15 +-
  gfs-kernel/src/gfs/incore.h                 |    1 +
  gfs-kernel/src/gfs/log.c                    |   27 +-
  gfs-kernel/src/gfs/mount.c                  |    3 +
  gfs-kernel/src/gfs/ops_address.c            |   34 +--
  gfs-kernel/src/gfs/ops_fstype.c             |    2 +-
  gfs/gfs_fsck/log.c                          |   10 +-
  gfs2/libgfs2/ondisk.c                       |    3 +
  gfs2/libgfs2/structures.c                   |   12 +-
  gfs2/mount/mount.gfs2.c                     |    1 +
  gfs2/mount/util.c                           |    7 +
  group/daemon/main.c                         |    4 +-
  group/dlm_controld/cpg.c                    |  632 +++++++++++++++++++++++++--
  group/dlm_controld/dlm_daemon.h             |    7 +-
  group/dlm_controld/group.c                  |    7 +-
  group/dlm_controld/main.c                   |   14 +-
  group/gfs_control/main.c                    |  134 ++++--
  group/gfs_controld/cpg-new.c                |   27 +-
  group/gfs_controld/cpg-old.c                |    4 +-
  group/gfs_controld/gfs_daemon.h             |    1 +
  group/gfs_controld/main.c                   |    2 +
  group/tool/main.c                           |   37 ++-
  rgmanager/include/clulog.h                  |  139 ------
  rgmanager/include/logging.h                 |   10 +
  rgmanager/include/resgroup.h                |    4 +-
  rgmanager/src/clulib/Makefile               |    6 +-
  rgmanager/src/clulib/clulog.c               |  281 ------------
  rgmanager/src/clulib/logging.c              |  225 ++++++++++
  rgmanager/src/clulib/msg_cluster.c          |    6 +-
  rgmanager/src/daemons/Makefile              |   19 +-
  rgmanager/src/daemons/clurmtabd.c           |   52 ++--
  rgmanager/src/daemons/depends.c             |   14 +-
  rgmanager/src/daemons/event_config.c        |   18 +-
  rgmanager/src/daemons/fo_domain.c           |   90 ++---
  rgmanager/src/daemons/groups.c              |  104 +++---
  rgmanager/src/daemons/main.c                |  120 +++---
  rgmanager/src/daemons/reslist.c             |    7 +-
  rgmanager/src/daemons/resrules.c            |    6 +-
  rgmanager/src/daemons/restree.c             |   11 +-
  rgmanager/src/daemons/rg_event.c            |   40 +-
  rgmanager/src/daemons/rg_forward.c          |   26 +-
  rgmanager/src/daemons/rg_state.c            |  185 ++++----
  rgmanager/src/daemons/rg_thread.c           |   12 +-
  rgmanager/src/daemons/service_op.c          |   16 +-
  rgmanager/src/daemons/slang_event.c         |   32 +-
  rgmanager/src/daemons/test.c                |    1 +
  rgmanager/src/daemons/watchdog.c            |    8 +-
  rgmanager/src/resources/ocf-shellfuncs      |    3 +-
  rgmanager/src/resources/postgres-8.metadata |    2 +-
  rgmanager/src/resources/postgres-8.sh       |   16 +-
  rgmanager/src/resources/utils/ra-skelet.sh  |    5 +
  rgmanager/src/utils/Makefile                |   13 +-
  rgmanager/src/utils/clubufflush.c           |   12 +-
  rgmanager/src/utils/clulog.c                |  123 ++----
  rgmanager/src/utils/clunfsops.c             |   18 +-
  76 files changed, 2482 insertions(+), 1285 deletions(-)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iQIVAwUBSPxYZQgUGcMLQ3qJAQJT8A//aFXIad5tlL3can9qHKKU01VJ4JGZ3SOv
pBmALC/S6Z+QWzw8e1Uawbu8iuxHUH4rG87GFV+792SBdn9UXotP045UsJVqX3O5
Zcat0T0TkwqSJGD+afT8GAeH7jsw9d92nN30E5THqdwv96EXkLZGDmhQVxrJgTd4
dmNV+010UJCY3Btgu088twv2ggRDOyHKDmAAj0r4vvsm/B5TqXe5Vk2DJrGsLOcL
GA7/GxbrgcporBme7dgGBbJFdBLIGDa9UeHsF2GZTilVvSKdYU5LpnM0yo+Sh1Y6
kse5hb7zDzAYm+Ns/9S3skb+N4rQT7ZIYoYaBxZuHSgNVwzbQvTtqgxGNKn3LZe3
oWcub94agRvlJM6vFkITspxfa3Wfg+w3F07qeOCWOUSeEy4cyfrTbf0Q2pMT+YXh
jZM5MUEEIgjtPcmL3TYOjj2xhAkzPhF4pQODtuBy4LDNIcVuFav6/22VWzqpwfan
lQRAqf+ep5uZA5w9okuUtXfiRdRkQtSu1McW8zgvV0lZ9NdmsFMVbutkzO7DDKLY
hA0rQTtsN96Rr+wAVrVrFTjTlkEDK5zVmbrYi5rNxm/2C8961DM/PEz5lizLZiGa
c9Ijtc43PPNlhiXUYPNQLmZ3Ynrh7kA5sB+Zyg2TbnjuY73963UY5ksb+t2WpzcQ
D8ePL9urQHo=
=bsgx
-----END PGP SIGNATURE-----


From federico.simoncelli at gmail.com  Mon Oct 20 11:27:11 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Mon, 20 Oct 2008 13:27:11 +0200
Subject: [Linux-cluster] Avoiding fencing loops with startup quorum (patch)
In-Reply-To: <a01fe36d0810170956m3a9cd5berc5b8b99860f53ef2@mail.gmail.com>
References: <a01fe36d0810170427i7cb8485v806a899f80a3b715@mail.gmail.com>
	<20081017154142.GE3299@redhat.com>
	<a01fe36d0810170911k5ca2a98cqc6d3ee3a31e583a0@mail.gmail.com>
	<20081017160923.GG3299@redhat.com>
	<a01fe36d0810170956m3a9cd5berc5b8b99860f53ef2@mail.gmail.com>
Message-ID: <a01fe36d0810200427v455f256bqdea1860efa75c299@mail.gmail.com>

On Fri, Oct 17, 2008 at 6:56 PM, Federico Simoncelli
<federico.simoncelli at gmail.com> wrote:
> I quickly made a patch (in attachment).
>
> Tested with:
>
> # cat /etc/sysconfig/cman
> CMAN_QUORUM_TIMEOUT=10
> CMAN_EXPECTED_QUORUM=1
>
> Working fine for now. More testing after the weekend.

I confirm that the patch works fine. I just need to say that the
two_node flag is required anyway:

<cman two_node="1">

-- 
Federico.


From pbruna at it-linux.cl  Mon Oct 20 14:04:39 2008
From: pbruna at it-linux.cl (Patricio A. Bruna)
Date: Mon, 20 Oct 2008 11:04:39 -0300 (CLST)
Subject: [Linux-cluster] Email alert
In-Reply-To: <48F894C4.9010506@nexatech.com>
Message-ID: <30509963.1181224511479439.JavaMail.root@lisa.itlinux.cl>

I think this is an importan fetaure tha should be in the core of Cluster Suite. 
As an admin i must know when my servers do a failover. 

------------------------------------ 
Patricio Bruna V. 
IT Linux Ltda. 
http://www.it-linux.cl 
Fono : (+56-2) 333 0578 - Chile 
Fono: (+54-11) 6632 2760 - Argentina 
M?vil : (+56-09) 8827 0342 

----- "Jeff Macfarland" <jmacfarland at nexatech.com> escribi?: 
> Mark Chaney wrote: 
> > No, but you can use a monitoring service like nagios to do that. 
> 
> Is "RIND" (http://sources.redhat.com/cluster/wiki/EventScripting) not 
> applicable? Or, if implemented in <rm/>, will it prevent the system from 
> automated failover of services, etc? I dunno much about slang, but it 
> looks like it at least supports system() for a quick email if nothing else. 
> 
> > 
> > 
> > 
> > *From:* linux-cluster-bounces at redhat.com 
> > [mailto:linux-cluster-bounces at redhat.com] *On Behalf Of *Patricio A. Bruna 
> > *Sent:* Thursday, October 16, 2008 11:03 PM 
> > *To:* linux-cluster at redhat.com 
> > *Subject:* [Linux-cluster] Email alert 
> > 
> > 
> > 
> > Its possible to configure Cluster Suite to send an email when a service 
> > change host or faild to failover? 
> > 
> > ------------------------------------ 
> > Patricio Bruna V. 
> > IT Linux Ltda. 
> > http://www.it-linux.cl 
> > Fono : (+56-2) 333 0578 - Chile 
> > Fono: (+54-11) 6632 2760 - Argentina 
> > M?vil : (+56-09) 8827 0342 
> > 
> 
> 
> -- 
> Jeff Macfarland (jmacfarland at nexatech.com) 
> Nexa Technologies - 972.747.8879 
> Systems Administrator 
> GPG Key ID: 0x5F1CA61B 
> GPG Key Server: hkp://wwwkeys.pgp.net 
> 
> -- 
> Linux-cluster mailing list 
> Linux-cluster at redhat.com 
> https://www.redhat.com/mailman/listinfo/linux-cluster 
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081020/53759b28/attachment.htm>

From scooter at cgl.ucsf.edu  Mon Oct 20 17:15:40 2008
From: scooter at cgl.ucsf.edu (Scooter Morris)
Date: Mon, 20 Oct 2008 10:15:40 -0700
Subject: [Linux-cluster] GFS2 Test setup
Message-ID: <48FCBCBC.50801@cgl.ucsf.edu>

We are in the process of building a cluster, which will hope to put into 
production when RHEL 5.3 is released.  Our plan is to use GFS2, which 
we've been experimenting with for some time, but we're having some 
problems.  The cluster has 3 nodes, two HP DL580's and one HP DL585 -- 
we're using ILO for fencing.  We want to share a couple of filesystems 
using GFS2 which are connected to our SAN (an EVA 5000).  I've set 
everything up and it all works as expected, although on occasion, GFS2 
just seems to hang.  This happens 1-4 times/week.  What I note in the 
logs are a series of dlm messages.  On node 1 (for example) I see:

dlm: connecting to 3
dlm: connecting to 2
dlm: connecting to 2
dlm: connecting to 2
dlm: connecting to 3
dlm: connecting to 2
dlm: connecting to 2
dlm: connecting to 2
dlm: connecting to 3
dlm: connecting to 3
dlm: connecting to 3
dlm: connecting to 3

On node 2, I see:

dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted

and on node 3, I see:
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted
dlm: got connection from 1
Extra connection from node 1 attempted


Now for my questions.  I know that GFS2 isn't officially released, yet, 
and I've been seeing logs of checkins on cluster-devel.  Should I 
updated to the latest GFS2 to continue my testing?  Is the dlm condition 
outlined above a known bug in GFS2 that's been fixed in later releases, 
or have I tripped over something new?

Any suggestions would be appreciated!

-- scooter

-------------- next part --------------
A non-text attachment was scrubbed...
Name: scooter.vcf
Type: text/x-vcard
Size: 378 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081020/7af9c161/attachment.vcf>

From ccaulfie at redhat.com  Tue Oct 21 07:31:25 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 21 Oct 2008 08:31:25 +0100
Subject: [Linux-cluster] GFS2 Test setup
In-Reply-To: <48FCBCBC.50801@cgl.ucsf.edu>
References: <48FCBCBC.50801@cgl.ucsf.edu>
Message-ID: <48FD854D.7030409@redhat.com>

Scooter Morris wrote:
> We are in the process of building a cluster, which will hope to put into
> production when RHEL 5.3 is released.  Our plan is to use GFS2, which
> we've been experimenting with for some time, but we're having some
> problems.  The cluster has 3 nodes, two HP DL580's and one HP DL585 --
> we're using ILO for fencing.  We want to share a couple of filesystems
> using GFS2 which are connected to our SAN (an EVA 5000).  I've set
> everything up and it all works as expected, although on occasion, GFS2
> just seems to hang.  This happens 1-4 times/week.  What I note in the
> logs are a series of dlm messages.  On node 1 (for example) I see:
> 
> dlm: connecting to 3
> dlm: connecting to 2
> dlm: connecting to 2
> dlm: connecting to 2
> dlm: connecting to 3
> dlm: connecting to 2
> dlm: connecting to 2
> dlm: connecting to 2
> dlm: connecting to 3
> dlm: connecting to 3
> dlm: connecting to 3
> dlm: connecting to 3
> 
> On node 2, I see:
> 
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> 
> and on node 3, I see:
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> dlm: got connection from 1
> Extra connection from node 1 attempted
> 

Those messages are usually caused by routing problems. The DLM binds to
the address it is given by cman (see the output of cman_tool status for
that) and receiving nodes check incoming packets against that address to
make sure that only valid cluster nodes try to make connections.

What is happening here (I think - it sounds like a problem I've seen
before) is that the packets are being routed though another interface
than the one cman is using and the remote node sees them as coming from
a different address. This can happen if you have two ethernet interfaces
connected to the same physical segment for example.

There was a also a bug that could cause this if the routing was not
quite so broken but a little odd, though I don't have the bugzilla
number to hand, sorry.
-- 

Chrissie


From fdinitto at redhat.com  Tue Oct 21 07:37:57 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Tue, 21 Oct 2008 09:37:57 +0200 (CEST)
Subject: [Linux-cluster] Cluster 2.03.08 released
Message-ID: <Pine.LNX.4.64.0810210932280.9830@trider-g7>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


The cluster team and its vibrant community are proud to announce the 
2.03.08 release from the STABLE2 branch.

The STABLE2 branch collects, on a daily base, all bug fixes and the bare
minimal changes required to run the cluster on top of the most recent Linux
kernel (2.6.27) and rock solid openais (0.80.3).

This release includes some major fixes and addresses 2 security issues in 
fence agents (apc_snmp and egenera). Please consider upgrading as soon as 
possible.

- From this release GFS1 kernel module is now totally standalone and does 
not require GFS2 nor a patched upstream kernel to run.

The new source tarball can be downloaded here:

   ftp://sources.redhat.com/pub/cluster/releases/cluster-2.03.08.tar.gz
   https://fedorahosted.org/releases/c/l/cluster/cluster-2.03.08.tar.gz

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Happy clustering,
Fabio

Under the hood (from 2.03.07):

Abhijith Das (10):
       libgfs2:  Bug 459630 -  GFS2: changes needed to gfs2-utils due to gfs2meta fs changes in bz 457798
       gfs-kernel: bz298931 - GFS unlinked inode metadata leak
       Revert "gfs-kernel: bz298931 - GFS unlinked inode metadata leak"
       gfs-kernel: GFS: madvise system call causes assertion
       gfs-kernel: bz457473 - GFS ignore the noatime and nodiratime mount options
       gfs-kernel: Bug 450209: Create gfs1-specific lock modules + minor fixes to build with 2.6.27
       Revert "gfs-kernel: bz457473 - GFS ignore the noatime and nodiratime mount options"
       gfs-kernel and mount.gfs2: GFS ignore the noatime and nodiratime mount options
       gfs-kernel: bz 458765 - In linux-2.6.26 / 2.03.06, GFS1 can't create more than 4kb file
       gfs-kernel: bug 450209 - addendum to previous patch. Removes extraneous lock_dlm_plock.c

Andrew Price (1):
       [GFS2] libgfs2: Build with -fPIC

Bob Peterson (7):
       GFS2: Make gfs2_fsck accept UNLINKED metadata blocks
       GFS2: sync buffers to disk when rewriting superblock
       Changes needed to stay compatible with libvolume_id.
       Changes needed to stay current with libvolume_id.
       GFS2: gfs2_fsck: fix segfault while running special block lists.
       GFS: gfs_fsck invalid response to question changes the question
       gfs-kmod: GFS corruption after forced withdraw

Chris Feist (2):
       fence: fixed a fence storm with fence_egenera
       cman: fixed makefiles to actually install the vmware manpage

Christine Caulfield (10):
       cman: Return quorum state in a STATECHANGE notification
       cman: Allow a recently left node to join cleanly.
       cman: initialise key_filename variable.
       cman: honour the dirty flag on a node we haven't seen before
       cman: Clean shutdown_con if the controlling process is killed.
       dlm: add dlm_tcpdump tool
       dlm: Make dlm_tcpdump compile for RHEL5 too
       dlm: make dlm_tcpdump cope with length==0 packets
       dlm: Add timestamp and full cmdline to dlm_tcpdump
       dlm: Add dlmtop

David Teigland (7):
       groupd: ignore nolock gfs fix
       fenced: add skip_undefined option
       fenced: add skip_undefined option fix
       groupd: send and check version messages
       fence_tool: new option to delay before join
       init.d/cman: use fence_tool -m for two node clusters
       groupd: send and check version messages fix

Fabio M. Di Nitto (11):
       qdisk: allow scan of sysfs to dive into first level symlinks
       qdisk: fix sysfs path diving
       rgmanger: fix handling of VIP v6
       ccs: deal with xml file format special case
       cman: fix broken init script
       fence: update alom description
       fence: install fence_alom man page
       build: bump kernel requirement to 2.6.27
       [BUILD] Allow users to set default log dir
       [FENCE] Fix fence_apc_snmp logging
       fence egenera: fix logging file

Jan Friesse (6):
       fence: Fence agent for VMware ESX
       cman: Removed old Perl version of VMware fence agent, so new version is built.
       fence: Fix fence agent for VMware ESX.
       fence: Fix fence agent for VMware ESX.
       Fence: Added fence agent for Sun Advanced Lights Out Manager (ALOM)
       fence: New fence agent for Logical Domains (LDOMs)

Lon Hohberger (20):
       rgmanager: Ancillary fix for rhbz #453000
       cman: Fix qdiskd file descriptor leak
       rgmanager: Make freeze/unfreeze work with central_processing
       rgmanager: Detect restricted failover domain crash
       rgmanager: Permit careful restart w/o disturbing services
       rgmanager: Wait for fence domain join to complete
       rgmanager: Fix up clusvcadm.8 manual page to show -M option
       rgmanager: make status poll interval configurable
       rgmanager: Clean up build
       rgmanager: Implement enforcement of timeouts on a per-resource basis
       rgmanager: Make clustat and clusvcadm work faster
       rgmanager: Resolve hostnames->IPs and back when checking NFS clients
       cman: Fix broken qdisk main.c patch reverted with scandisk merge
       cman: Don't let qdiskd update cman if the disk is unavailable
       cman: show '-d' option in mkqdisk -h and mkqdisk.8
       [fence] Make fence_xvmd support reloading of key files on the fly.
       [rgmanager] Apply patch from Marcelo Azevedo to make migration more robust
       [rgmanager] Fix live migration option (broken in last commit)
       group: Allow group_tool ls <name> <level> to be scriptable
       [fence] Fix fence_xvmd trying to read wrong args from ccs

Marek 'marx' Grac (4):
       [FENCE] Fix #237266 - LPAR/HMC fence agent
       [FENCE] Fix #460054 - fence_apc fails with pexpect exception
       [FENCE] Fix #290231 - "Switch (optional)" param does not default to "1" and program fails
       [RGMANAGER] - Fix #462910 postgres-8.sh and metadata fixes

Ryan McCabe (1):
       cman: Fix typo that caused start-up to fail

Ryan O'Hara (5):
       cman: allow custom xen network bridge scripts
       groupd: detect dead daemons and remove node from cluster
       fence_scsi: improve logging for debugging
       groupd.8: update man page with information about -s option
       fence_scsi: correctly declare key_list

Satoru SATOH (1):
       fence: Add network interface select option for fence_xvmd

Simone Gotti (1):
       [rgmanager] Fix fuser parsing on later versions of psmisc

rohara (2):
       fence_scsi.pl: check if nodeid is zero
       scsi_reserve: add restart option

  ccs/daemon/cnx_mgr.c                            |   20 +-
  cman/daemon/ais.c                               |    3 +-
  cman/daemon/cmanccs.c                           |    2 +-
  cman/daemon/commands.c                          |   15 +-
  cman/init.d/cman.in                             |   32 +-
  cman/lib/libcman.h                              |    2 +-
  cman/man/mkqdisk.8                              |    5 +-
  cman/man/qdisk.5                                |   16 +
  cman/qdisk/disk.c                               |    3 +
  cman/qdisk/disk.h                               |    2 +
  cman/qdisk/main.c                               |   83 +++-
  cman/qdisk/mkqdisk.c                            |    2 +-
  cman/qdisk/scandisk.c                           |   13 +-
  configure                                       |    9 +-
  dlm/tests/tcpdump/Makefile                      |   23 +
  dlm/tests/tcpdump/README                        |   21 +
  dlm/tests/tcpdump/dlm_tcpdump.c                 |  370 ++++++++++++++
  dlm/tests/tcpdump/dlmtop.c                      |  613 +++++++++++++++++++++++
  fence/agents/alom/Makefile                      |    5 +
  fence/agents/alom/fence_alom.py                 |   69 +++
  fence/agents/apc/fence_apc.py                   |   15 +-
  fence/agents/apc_snmp/fence_apc_snmp.py         |    4 +-
  fence/agents/egenera/fence_egenera.pl           |    9 +-
  fence/agents/ldom/Makefile                      |    5 +
  fence/agents/ldom/fence_ldom.py                 |  101 ++++
  fence/agents/lib/fencing.py.py                  |   54 ++-
  fence/agents/lpar/fence_lpar.py                 |    3 +-
  fence/agents/scsi/fence_scsi.pl                 |   97 +++--
  fence/agents/scsi/scsi_reserve                  |   55 ++
  fence/agents/vmware/fence_vmware.pl             |  322 ------------
  fence/agents/vmware/fence_vmware.py             |  111 ++++
  fence/agents/xvm/fence_xvm.c                    |    2 +-
  fence/agents/xvm/fence_xvmd.c                   |   37 ++-
  fence/agents/xvm/mcast.c                        |    9 +-
  fence/agents/xvm/mcast.h                        |    4 +-
  fence/agents/xvm/options-ccs.c                  |    3 +
  fence/agents/xvm/options.c                      |   13 +
  fence/agents/xvm/options.h                      |    1 +
  fence/agents/xvm/simple_auth.c                  |    2 +
  fence/agents/xvm/xvm.h                          |    1 +
  fence/fence_tool/fence_tool.c                   |   93 ++++-
  fence/fenced/agent.c                            |    2 +-
  fence/fenced/fd.h                               |    4 +
  fence/fenced/main.c                             |   32 ++-
  fence/man/Makefile                              |    3 +
  fence/man/fence_alom.8                          |   90 ++++
  fence/man/fence_ldom.8                          |  114 +++++
  fence/man/fence_tool.8                          |    7 +-
  fence/man/fence_vmware.8                        |  137 +++++
  fence/man/fence_xvmd.8                          |    3 +
  gfs-kernel/src/gfs/Makefile                     |    7 +
  gfs-kernel/src/gfs/acl.c                        |    2 +-
  gfs-kernel/src/gfs/bits.c                       |    2 +-
  gfs-kernel/src/gfs/bmap.c                       |    2 +-
  gfs-kernel/src/gfs/dio.c                        |    2 +-
  gfs-kernel/src/gfs/dir.c                        |    2 +-
  gfs-kernel/src/gfs/eaops.c                      |    2 +-
  gfs-kernel/src/gfs/eattr.c                      |    2 +-
  gfs-kernel/src/gfs/file.c                       |    2 +-
  gfs-kernel/src/gfs/gfs.h                        |    2 +-
  gfs-kernel/src/gfs/glock.c                      |    2 +-
  gfs-kernel/src/gfs/glock.h                      |   15 +-
  gfs-kernel/src/gfs/glops.c                      |    2 +-
  gfs-kernel/src/gfs/incore.h                     |    1 +
  gfs-kernel/src/gfs/inode.c                      |   10 +-
  gfs-kernel/src/gfs/ioctl.c                      |    2 +-
  gfs-kernel/src/gfs/lm.c                         |    8 +-
  gfs-kernel/src/gfs/lm_interface.h               |  278 ++++++++++
  gfs-kernel/src/gfs/lock_dlm.h                   |  182 +++++++
  gfs-kernel/src/gfs/lock_dlm_lock.c              |  527 +++++++++++++++++++
  gfs-kernel/src/gfs/lock_dlm_main.c              |   40 ++
  gfs-kernel/src/gfs/lock_dlm_mount.c             |  279 ++++++++++
  gfs-kernel/src/gfs/lock_dlm_sysfs.c             |  225 +++++++++
  gfs-kernel/src/gfs/lock_dlm_thread.c            |  367 ++++++++++++++
  gfs-kernel/src/gfs/lock_nolock_main.c           |  230 +++++++++
  gfs-kernel/src/gfs/locking.c                    |  180 +++++++
  gfs-kernel/src/gfs/log.c                        |   29 +-
  gfs-kernel/src/gfs/lops.c                       |    2 +-
  gfs-kernel/src/gfs/lvb.c                        |    2 +-
  gfs-kernel/src/gfs/main.c                       |   12 +-
  gfs-kernel/src/gfs/mount.c                      |    5 +-
  gfs-kernel/src/gfs/ondisk.c                     |    2 +-
  gfs-kernel/src/gfs/ops_address.c                |   36 +-
  gfs-kernel/src/gfs/ops_dentry.c                 |    2 +-
  gfs-kernel/src/gfs/ops_export.c                 |    2 +-
  gfs-kernel/src/gfs/ops_file.c                   |    6 +-
  gfs-kernel/src/gfs/ops_fstype.c                 |    2 +-
  gfs-kernel/src/gfs/ops_inode.c                  |   16 +-
  gfs-kernel/src/gfs/ops_super.c                  |    2 +-
  gfs-kernel/src/gfs/ops_vm.c                     |    2 +-
  gfs-kernel/src/gfs/page.c                       |    2 +-
  gfs-kernel/src/gfs/proc.c                       |    2 +-
  gfs-kernel/src/gfs/quota.c                      |    2 +-
  gfs-kernel/src/gfs/recovery.c                   |    2 +-
  gfs-kernel/src/gfs/rgrp.c                       |    2 +-
  gfs-kernel/src/gfs/super.c                      |    2 +-
  gfs-kernel/src/gfs/sys.c                        |    2 +-
  gfs-kernel/src/gfs/trans.c                      |    2 +-
  gfs-kernel/src/gfs/unlinked.c                   |    2 +-
  gfs-kernel/src/gfs/util.c                       |    2 +-
  gfs/gfs_fsck/log.c                              |   10 +-
  gfs/gfs_mkfs/main.c                             |   28 +-
  gfs2/fsck/pass1b.c                              |    4 +-
  gfs2/fsck/pass1c.c                              |    4 +-
  gfs2/fsck/pass5.c                               |   14 +-
  gfs2/libgfs2/Makefile                           |    1 +
  gfs2/libgfs2/buf.c                              |    1 +
  gfs2/libgfs2/misc.c                             |    2 +-
  gfs2/mkfs/main_mkfs.c                           |   30 +-
  gfs2/mount/mount.gfs2.c                         |    1 +
  gfs2/mount/util.c                               |    7 +
  group/daemon/cman.c                             |    4 +
  group/daemon/cpg.c                              |  104 ++++-
  group/daemon/gd_internal.h                      |    5 +-
  group/daemon/main.c                             |   47 ++-
  group/man/groupd.8                              |    5 +
  group/tool/main.c                               |   20 +-
  make/defines.mk.input                           |    2 +
  make/fencebuild.mk                              |    1 +
  rgmanager/include/members.h                     |    3 +
  rgmanager/include/resgroup.h                    |    9 +-
  rgmanager/include/reslist.h                     |    3 +-
  rgmanager/man/clurgmgrd.8                       |   13 +-
  rgmanager/man/clusvcadm.8                       |   66 ++-
  rgmanager/src/clulib/members.c                  |   29 ++
  rgmanager/src/clulib/rg_strings.c               |   23 +-
  rgmanager/src/daemons/clurmtabd.c               |    4 +-
  rgmanager/src/daemons/event_config.c            |    8 +
  rgmanager/src/daemons/fo_domain.c               |   23 +-
  rgmanager/src/daemons/groups.c                  |  123 ++++--
  rgmanager/src/daemons/main.c                    |   55 ++-
  rgmanager/src/daemons/reslist.c                 |    7 +-
  rgmanager/src/daemons/restree.c                 |  101 ++++-
  rgmanager/src/daemons/rg_event.c                |   58 ++-
  rgmanager/src/daemons/rg_forward.c              |    6 +-
  rgmanager/src/daemons/rg_locks.c                |    3 +-
  rgmanager/src/daemons/rg_state.c                |   51 ++-
  rgmanager/src/daemons/rg_thread.c               |    3 +-
  rgmanager/src/daemons/service_op.c              |   13 +-
  rgmanager/src/daemons/slang_event.c             |   52 ++-
  rgmanager/src/daemons/test.c                    |    3 +-
  rgmanager/src/resources/clusterfs.sh            |    4 +-
  rgmanager/src/resources/default_event_script.sl |   16 +-
  rgmanager/src/resources/fs.sh                   |    4 +-
  rgmanager/src/resources/ip.sh                   |    6 +-
  rgmanager/src/resources/netfs.sh                |    4 +-
  rgmanager/src/resources/nfsclient.sh            |   94 ++++-
  rgmanager/src/resources/postgres-8.metadata     |    2 +-
  rgmanager/src/resources/postgres-8.sh           |   16 +-
  rgmanager/src/resources/service.sh              |   21 +
  rgmanager/src/resources/utils/ra-skelet.sh      |    5 +
  rgmanager/src/resources/vm.sh                   |   20 +-
  152 files changed, 5562 insertions(+), 730 deletions(-)

- --
I'm going to make him an offer he can't refuse.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iQIVAwUBSP2G3AgUGcMLQ3qJAQJqPQ//bgQo9WmIXcAsHJMHpb5qEUemFlIOXfXQ
YkeJM2agZiJK+/WMA2SrhI2FjPg5Q09Cxxi0KQH0F1XVKQiRuTwA2sGf9CJqLYps
HhZ2pOlN01ixNCJgfbRrxIqvvnel2nnRmlEpdU4FgHiSgJmFoEvao+Oy8dOVt+/s
b2zB8niYXWXsgC+Zx9QH9OmWygsf68pGozwnZ0UBOJluXcVZUdsfKn0WMvYSBTfP
fBObpwfK0F3Gpko3747tPYQEyFz6vrsrK2GVqivPuhCTP7ZLSsrUCg8Q9CFWCyA0
El+cBjXBgXRGQsIiWz6bbnnjeos/vM1N7fV9KSqoAxljrb6peSyT8SpBDVTdebF4
6IZNdxrxRPtPMgDjz3wnHwbhF8HhPAcQgWIqHdOBPvEFFkYaSF+1m0WFJHgIdWma
zZ6OWiNf+H5SNPZ9t9F0UBZAUbciugORfWUvhPOYJNk4HMSjuxAMCjjBzfNWkIed
G8XK8Xtq8g3aNv3CvD54Jl9NGZjQTwJFMwNu2u4RmXYH0L+PgF7fOjfD7P+0WEEB
E9uIzCqYv0svvPVCbLVXfk2qdJ2u2veW7REEvZSg2BT1bj4uS+sK7Tv3FK7aaoAx
CFOGb2Y6I4vqaJbunPTVWCyVsubtvQJSMBqRMJBhCXKE8o/YLWoyvUrAB+PB2j1d
er9d1M5+23g=
=ixnW
-----END PGP SIGNATURE-----


From nick at javacat.f2s.com  Tue Oct 21 07:55:41 2008
From: nick at javacat.f2s.com (nick at javacat.f2s.com)
Date: Tue, 21 Oct 2008 08:55:41 +0100
Subject: [Linux-cluster] 4 node GFS cluster sanity check
Message-ID: <1224575741.48fd8afd581cd@webmail.freedom2surf.net>

Hi,

RHEL 5.2 32bit kernel 2.6.18-92.1.10.el5PAE
kmod-gfs-0.1.23-5.el5
gfs2-utils-0.1.44-1.el5_2.1
gfs-utils-0.1.17-1.el5
cman-2.0.84-2.el5
kmod-gfs2-PAE-1.92-1.1.el5
kmod-gfs2-1.92-1.1.el5
kmod-gfs-PAE-0.1.23-5.el5
rgmanager-2.0.38-2.el5

I have a 4 node cluster. All I want to use is GFS so that each node can read/write to the same directory. I don't want failover. I want to enable as
few cluster daemons as possible.

Here is my cluster.conf

<?xml version="1.0"?>
<cluster alias="TEST" config_version="14" name="TEST">
	<fence_daemon post_fail_delay="0" post_join_delay="3"/>
	<clusternodes>
		<clusternode name="fintestapp1" nodeid="1" votes="1">
			<fence>
				<method name="dummy"/>
			</fence>
		</clusternode>
		<clusternode name="fintestapp2" nodeid="2" votes="1">
			<fence>
				<method name="dummy"/>
			</fence>
		</clusternode>
		<clusternode name="fintestapp3" nodeid="3" votes="1">
			<fence>
				<method name="dummy"/>
			</fence>
		</clusternode>
		<clusternode name="fintestapp4" nodeid="4" votes="1">
			<fence>
				<method name="dummy"/>
			</fence>
		</clusternode>
	</clusternodes>
	<cman/>
	<fencedevices>
		<fencedevice agent="fence_manual" name="dummy"/>
	</fencedevices>
	<rm>
		<failoverdomains/>
		<resources/>
	</rm>
</cluster>

Here' is the output of cman_tool services:
type             level name       id       state
fence            0     default    00010001 none
[1 3 4]
dlm              1     clvmd      00020001 none
[1 3 4]
dlm              1     GFS1       00040001 none
[1 4]
dlm              1     rgmanager  00010003 none
[1 3 4]

Here is the output of cman_tool status:
Version: 6.1.0
Config Version: 14
Cluster Name: TEST
Cluster Id: 1198
Cluster Member: Yes
Cluster Generation: 496
Membership state: Cluster-Member
Nodes: 7
Expected votes: 6
Total votes: 4
Quorum: 4
Active subsystems: 9
Flags: Dirty
Ports Bound: 0 11 177
Node name: fintestapp4
Node ID: 4
Multicast addresses: 239.192.4.178
Node addresses: 192.168.10.68

As you can see Expected votes is 6 while Total votes is 4 - whats wrong here ?

I would like confirmation that my cluster.conf is adequate please because after a few reboots last week expected votes and total votes give unexpected
results.

If any more info is needed, please ask.

Many thanks,
Nick .


From federico.simoncelli at gmail.com  Tue Oct 21 11:13:56 2008
From: federico.simoncelli at gmail.com (Federico Simoncelli)
Date: Tue, 21 Oct 2008 13:13:56 +0200
Subject: [Linux-cluster] Avoiding fencing loops with startup quorum (patch)
In-Reply-To: <a01fe36d0810200427v455f256bqdea1860efa75c299@mail.gmail.com>
References: <a01fe36d0810170427i7cb8485v806a899f80a3b715@mail.gmail.com>
	<20081017154142.GE3299@redhat.com>
	<a01fe36d0810170911k5ca2a98cqc6d3ee3a31e583a0@mail.gmail.com>
	<20081017160923.GG3299@redhat.com>
	<a01fe36d0810170956m3a9cd5berc5b8b99860f53ef2@mail.gmail.com>
	<a01fe36d0810200427v455f256bqdea1860efa75c299@mail.gmail.com>
Message-ID: <a01fe36d0810210413h8080f34v5490b96c0540af7a@mail.gmail.com>

On Mon, Oct 20, 2008 at 1:27 PM, Federico Simoncelli
<federico.simoncelli at gmail.com> wrote:
> I confirm that the patch works fine. I just need to say that the
> two_node flag is required anyway:
>
> <cman two_node="1">

After some testing I discovered that we need a couple of patches to
achieve the behaviour we wanted.
Basically if you set two_node=1 the quorum is locked to 1 (no matter
what expected_votes you configure). I unlocked the quorum value with
the patch "cman-2.0.84-2node2expected.patch" (in attachment). Now we
can change the quorum using the expected_votes:

# cman_tool expected -e 2 && cman_tool status | grep Quorum
Quorum: 2
# cman_tool expected -e 1 && cman_tool status | grep Quorum
Quorum: 1

In the same patch I fixed what I believe is a bug. Basically in the
file /cman/daemon/cmanccs.c the values node_count and vote_sum are
computed only if expected_votes == 0 but those values are used
afterwards:

                if (two_node) {
                        if (node_count != 2 || vote_sum != 2) {

To quickly verify the bug:

# cman_tool join -w -e 1

It should generate the error "two_node set but there are more than 2
nodes" on any cman version.

The second patch "cman-2.0.84-startupquorum.patch" is the init patch
to take advantage of the expected_votes and the quorum.
Using the following configuration:

# cat /etc/sysconfig/cman
CMAN_QUORUM_TIMEOUT=10
CMAN_PREJOIN_EXPECTED=2
CMAN_POSTJOIN_EXPECTED=1

Your cluster needs the quorum to be 2 (CMAN_PREJOIN_EXPECTED) within
10 seconds to start. No fencing (and no fencing loops) will be
performed if the quorum is less than CMAN_PREJOIN_EXPECTED. After the
join session the expected_votes are set back to 1
(CMAN_POSTJOIN_EXPECTED) and the quorum goes back to 1 too.

Comments are welcome,
-- 
Federico.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cman-2.0.84-2node2expected.patch
Type: application/octet-stream
Size: 1135 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081021/2551996c/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cman-2.0.84-startupquorum.patch
Type: application/octet-stream
Size: 1086 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081021/2551996c/attachment-0001.obj>

From scooter at cgl.ucsf.edu  Tue Oct 21 13:55:06 2008
From: scooter at cgl.ucsf.edu (Scooter Morris)
Date: Tue, 21 Oct 2008 06:55:06 -0700
Subject: [Linux-cluster] GFS2 Test setup
In-Reply-To: <48FD854D.7030409@redhat.com>
References: <48FCBCBC.50801@cgl.ucsf.edu> <48FD854D.7030409@redhat.com>
Message-ID: <48FDDF3A.1000102@cgl.ucsf.edu>

Christine,
    Thanks for the information.  I checked my routing, and other than 
the zero conf route on the
same interface as my private network, everything seems clean.  I moved 
the zero conf route to
the public network, so we'll see if that fixes anything.  Also, the 
multicast route doesn't get involved,
does it?  The default route is on our public network (obviously) and the 
nodes should be talking to
each other over the private network (according to cman_tool status), but 
I don't know what interface
the multicasts will be sent out from.  I wouldn't think that would 
impact dlm, only the heartbeat, right?

Thanks again!

-- scooter

Christine Caulfield wrote:
> Scooter Morris wrote:
>   
>> We are in the process of building a cluster, which will hope to put into
>> production when RHEL 5.3 is released.  Our plan is to use GFS2, which
>> we've been experimenting with for some time, but we're having some
>> problems.  The cluster has 3 nodes, two HP DL580's and one HP DL585 --
>> we're using ILO for fencing.  We want to share a couple of filesystems
>> using GFS2 which are connected to our SAN (an EVA 5000).  I've set
>> everything up and it all works as expected, although on occasion, GFS2
>> just seems to hang.  This happens 1-4 times/week.  What I note in the
>> logs are a series of dlm messages.  On node 1 (for example) I see:
>>
>> dlm: connecting to 3
>> dlm: connecting to 2
>> dlm: connecting to 2
>> dlm: connecting to 2
>> dlm: connecting to 3
>> dlm: connecting to 2
>> dlm: connecting to 2
>> dlm: connecting to 2
>> dlm: connecting to 3
>> dlm: connecting to 3
>> dlm: connecting to 3
>> dlm: connecting to 3
>>
>> On node 2, I see:
>>
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>>
>> and on node 3, I see:
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>> dlm: got connection from 1
>> Extra connection from node 1 attempted
>>
>>     
>
> Those messages are usually caused by routing problems. The DLM binds to
> the address it is given by cman (see the output of cman_tool status for
> that) and receiving nodes check incoming packets against that address to
> make sure that only valid cluster nodes try to make connections.
>
> What is happening here (I think - it sounds like a problem I've seen
> before) is that the packets are being routed though another interface
> than the one cman is using and the remote node sees them as coming from
> a different address. This can happen if you have two ethernet interfaces
> connected to the same physical segment for example.
>
> There was a also a bug that could cause this if the routing was not
> quite so broken but a little odd, though I don't have the bugzilla
> number to hand, sorry.
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081021/0991caba/attachment.htm>

From max.liccardo at gmail.com  Wed Oct 22 21:50:46 2008
From: max.liccardo at gmail.com (max liccardo)
Date: Wed, 22 Oct 2008 23:50:46 +0200
Subject: [Linux-cluster] ipfails
Message-ID: <a900f4140810221450l3e7263f2r980874e39d7fc308@mail.gmail.com>

hi cluster masters,
I'm using linux-HA and linux-cluster on separate project.
I'm wondering if I can use with linux-cluster something like the
linux-ha ping nodes, in order to have some sort of "network quorum".
bye

 GnuPG public key available on wwwkeys.eu.pgp.net
 Key ID: D01F1CAD
 Key fingerprint:  992D 91B7 9682 9735 12C9 402D AD3F E4BB D01F 1CAD

"la velocit? induce all'oblio,
 la lentezza al ricordo"


From jallgood at ohl.com  Thu Oct 23 14:08:21 2008
From: jallgood at ohl.com (Allgood, John)
Date: Thu, 23 Oct 2008 09:08:21 -0500
Subject: [Linux-cluster] Cluster/GFS issue.
Message-ID: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>

Hello All

 
I am having some issues with building an eight node Xen cluster. Let me
give some background first. We have 8 dell PE 1950 with 32GB RAM
connected via dual brocade fiber switchs to an EMC CX-310. The guests
images are being stored on the SAN.  We are using EMC Powerpath to hand
the multipathing. The Operating system is Redhat Advanced Platform 5.2 .
The filesystems on the SAN were created using Conga CLVM/GFS1. We have
the heartbeat on an separate private network. The fence devices are Dell
DRAC's.

 Here is the problem that we are having. We can't on an consistent basic
get the GFS filesystem mounted. On the nodes that don't connect it will
just hang on bootup trying to mount the GFS filesystem. All nodes come
up and join the cluster at this point but only 1 or 2 will completely
come up with the GFS filesystem mounted. If we do an interactive startup
and skip the GFS part all systems will come up on the cluster but
without the gfs mounted.

At this point I am not sure what to do next.  I am thinking it may be a
problem with the way the GFS filesystem was created. We just used the
default settings. The LVM is 668GB created from an RAID10.

 
Best Regards

 
John Allgood
Senior Systems Administrator
Turbo, division of OHL 
2251 Jesse Jewell Pky. NE 
Gainesville, GA 30507
tel: (678) 989-3051  fax: (770) 531-7878 
jallgood at ohl.com

www.ohl.com

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081023/89c34201/attachment.htm>

From lhh at redhat.com  Thu Oct 23 15:49:59 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 23 Oct 2008 11:49:59 -0400
Subject: [Linux-cluster] Email alert
In-Reply-To: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>
References: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>
Message-ID: <1224776999.32460.79.camel@ayanami>

On Fri, 2008-10-17 at 01:03 -0300, Patricio A. Bruna wrote:
> Its possible to configure Cluster Suite to send an email when a
> service change host or faild to failover?

Syslog-ng can be configured to do this.

-- Lon

> 


From lhh at redhat.com  Thu Oct 23 15:51:16 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 23 Oct 2008 11:51:16 -0400
Subject: [Linux-cluster] Email alert
In-Reply-To: <48F894C4.9010506@nexatech.com>
References: <12438484.3551224216194223.JavaMail.root@lisa.itlinux.cl>
	<028901c93017$e230f490$a692ddb0$@com> <48F894C4.9010506@nexatech.com>
Message-ID: <1224777076.32460.82.camel@ayanami>

On Fri, 2008-10-17 at 08:36 -0500, Jeff Macfarland wrote:
> Mark Chaney wrote:
> > No, but you can use a monitoring service like nagios to do that.
> 
> Is "RIND" (http://sources.redhat.com/cluster/wiki/EventScripting) not 
> applicable? Or, if implemented in <rm/>, will it prevent the system from 
> automated failover of services, etc? I dunno much about slang, but it 
> looks like it at least supports system() for a quick email if nothing else.

Fork/exec during failover needs to be managed carefully.  I haven't
tried system() from within a RIND script.  It should probably work (or,
maybe we should provide an email interface).

-- Lon


From lhh at redhat.com  Thu Oct 23 15:55:41 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 23 Oct 2008 11:55:41 -0400
Subject: [Linux-cluster] 4 node GFS cluster sanity check
In-Reply-To: <1224575741.48fd8afd581cd@webmail.freedom2surf.net>
References: <1224575741.48fd8afd581cd@webmail.freedom2surf.net>
Message-ID: <1224777341.32460.85.camel@ayanami>

On Tue, 2008-10-21 at 08:55 +0100, nick at javacat.f2s.com wrote:
> Hi,
> 
> RHEL 5.2 32bit kernel 2.6.18-92.1.10.el5PAE

> Here is the output of cman_tool status:
> Version: 6.1.0
> Config Version: 14
> Cluster Name: TEST
> Cluster Id: 1198
> Cluster Member: Yes
> Cluster Generation: 496
> Membership state: Cluster-Member
> Nodes: 7
> Expected votes: 6
> Total votes: 4
> Quorum: 4
> Active subsystems: 9
> Flags: Dirty
> Ports Bound: 0 11 177
> Node name: fintestapp4
> Node ID: 4
> Multicast addresses: 239.192.4.178
> Node addresses: 192.168.10.68
> 
> As you can see Expected votes is 6 while Total votes is 4 - whats wrong here ?
> 
> I would like confirmation that my cluster.conf is adequate please because after a few reboots last week expected votes and total votes give unexpected
> results.
> 
> If any more info is needed, please ask.

?

How can you have 7 nodes, expected 6 with a 4-node cluster
configuration.  It looks like you have two clusters with the same name
on the same subnet.

Also, you should chkconfig --del rgmanager if you're not doing failover.
You don't need it.

-- Lon


From lhh at redhat.com  Thu Oct 23 15:56:49 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 23 Oct 2008 11:56:49 -0400
Subject: [Linux-cluster] ipfails
In-Reply-To: <a900f4140810221450l3e7263f2r980874e39d7fc308@mail.gmail.com>
References: <a900f4140810221450l3e7263f2r980874e39d7fc308@mail.gmail.com>
Message-ID: <1224777409.32460.87.camel@ayanami>

On Wed, 2008-10-22 at 23:50 +0200, max liccardo wrote:
> hi cluster masters,
> I'm using linux-HA and linux-cluster on separate project.
> I'm wondering if I can use with linux-cluster something like the
> linux-ha ping nodes, in order to have some sort of "network quorum".
> bye

Currently, no, but you could build a daemon which did this and talked to
the CMAN quorum API to do this.

-- Lon


From lhh at redhat.com  Thu Oct 23 16:01:57 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 23 Oct 2008 12:01:57 -0400
Subject: [Linux-cluster] Two nodes cluster issue without shared storage
	issue
In-Reply-To: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net>
References: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net>
Message-ID: <1224777717.32460.92.camel@ayanami>

On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
wrote:
> Hi,
> 
> I want to set up a two node cluster, I use active/standby mode to run
> my service. I need even one node's hardware failure such as power cut,
> another node still can handover from failure node and the provide the
> service. 
> 
> In my environment, I have no shared storage, so I can not use quorum
> disk. Is there any other way to implement it? I searched and found
> 'tiebreaker IP' may feed my request, but I can not found any hints on
> how to configure it ?

Since you have no shared data, you may be able to run without fencing. 

That should be pretty straightforward, but you might need to comment out
the "fenced" startup from the cman init script.

In this case, the worst that will happen is both nodes will end up
running the service at the same time in the event of a network
partition.

The other down side is that if the cluster divides into two partitions
and later merges back into one partition, I don't think certain things
will work right; you will need to detect this event and reboot one of
the nodes.

-- Lon


From billpp at gmail.com  Thu Oct 23 16:42:40 2008
From: billpp at gmail.com (Flavio Junior)
Date: Thu, 23 Oct 2008 14:42:40 -0200
Subject: [Linux-cluster] Two nodes cluster issue without shared storage
	issue
In-Reply-To: <1224777717.32460.92.camel@ayanami>
References: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net>
	<1224777717.32460.92.camel@ayanami>
Message-ID: <58aa8d780810230942s421d74dfqaf61190be764b57@mail.gmail.com>

Well.. If you are using an active/standby scenario, without a shared
storage, probably you can make use of CARP/UCARP

http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol
http://www.ucarp.org/project/ucarp


--

Fl?vio do Carmo J?nior aka waKKu


On Thu, Oct 23, 2008 at 2:01 PM, Lon Hohberger <lhh at redhat.com> wrote:

> On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
> wrote:
> > Hi,
> >
> > I want to set up a two node cluster, I use active/standby mode to run
> > my service. I need even one node's hardware failure such as power cut,
> > another node still can handover from failure node and the provide the
> > service.
> >
> > In my environment, I have no shared storage, so I can not use quorum
> > disk. Is there any other way to implement it? I searched and found
> > 'tiebreaker IP' may feed my request, but I can not found any hints on
> > how to configure it ?
>
> Since you have no shared data, you may be able to run without fencing.
>
> That should be pretty straightforward, but you might need to comment out
> the "fenced" startup from the cman init script.
>
> In this case, the worst that will happen is both nodes will end up
> running the service at the same time in the event of a network
> partition.
>
> The other down side is that if the cluster divides into two partitions
> and later merges back into one partition, I don't think certain things
> will work right; you will need to detect this event and reboot one of
> the nodes.
>
> -- Lon
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081023/0fdd088f/attachment.htm>

From mockey.chen at nsn.com  Fri Oct 24 02:35:48 2008
From: mockey.chen at nsn.com (Chen, Mockey (NSN - CN/Cheng Du))
Date: Fri, 24 Oct 2008 10:35:48 +0800
Subject: [Linux-cluster] Two nodes cluster issue without shared
	storageissue
In-Reply-To: <1224777717.32460.92.camel@ayanami>
References: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net>
	<1224777717.32460.92.camel@ayanami>
Message-ID: <174CED94DD8DC54AB888B56E103B118742183A@CNBEEXC007.nsn-intra.net>

 
>-----Original Message-----
>From: linux-cluster-bounces at redhat.com 
>[mailto:linux-cluster-bounces at redhat.com] On Behalf Of ext Lon 
>Hohberger
>Sent: 2008?10?24? 0:02
>To: linux clustering
>Subject: Re: [Linux-cluster] Two nodes cluster issue without 
>shared storageissue
>
>On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
>wrote:
>> Hi,
>> 
>> I want to set up a two node cluster, I use active/standby 
>mode to run 
>> my service. I need even one node's hardware failure such as 
>power cut, 
>> another node still can handover from failure node and the 
>provide the 
>> service.
>> 
>> In my environment, I have no shared storage, so I can not use quorum 
>> disk. Is there any other way to implement it? I searched and found 
>> 'tiebreaker IP' may feed my request, but I can not found any 
>hints on 
>> how to configure it ?
>
>Since you have no shared data, you may be able to run without fencing. 
>
>That should be pretty straightforward, but you might need to 
>comment out the "fenced" startup from the cman init script.
>
>In this case, the worst that will happen is both nodes will 
>end up running the service at the same time in the event of a 
>network partition.
>
>The other down side is that if the cluster divides into two 
>partitions and later merges back into one partition, I don't 
>think certain things will work right; you will need to detect 
>this event and reboot one of the nodes.
>
>-- Lon

I know such defects in two node cluster.  
Since our service is mission critical, I want to know how to avoid such failure case ?

Thanks.


From mockey.chen at nsn.com  Fri Oct 24 02:41:11 2008
From: mockey.chen at nsn.com (Chen, Mockey (NSN - CN/Cheng Du))
Date: Fri, 24 Oct 2008 10:41:11 +0800
Subject: [Linux-cluster] Two nodes cluster issue without shared
	storageissue
In-Reply-To: <58aa8d780810230942s421d74dfqaf61190be764b57@mail.gmail.com>
References: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net><1224777717.32460.92.camel@ayanami>
	<58aa8d780810230942s421d74dfqaf61190be764b57@mail.gmail.com>
Message-ID: <174CED94DD8DC54AB888B56E103B118742183D@CNBEEXC007.nsn-intra.net>

>
>	From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of ext Flavio Junior
>	Sent: 2008?10?24? 0:43
>	To: linux clustering
>	Subject: Re: [Linux-cluster] Two nodes cluster issue without shared storageissue
>	
>	
>	Well.. If you are using an active/standby scenario, without a shared storage, probably you can make use of CARP/UCARP
>	
>	http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol
>	http://www.ucarp.org/project/ucarp
>	
	
I think CARP will fullfill my current request, but we have choose RHCS as our cluster suite. It is very difficult to change it.
Anyhow, Thanks for your suggestion. 


From raju.rajsand at gmail.com  Fri Oct 24 07:11:33 2008
From: raju.rajsand at gmail.com (Rajagopal Swaminathan)
Date: Fri, 24 Oct 2008 12:41:33 +0530
Subject: [Linux-cluster] Cluster/GFS issue.
In-Reply-To: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>
Message-ID: <8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com>

Greetings,

2008/10/23 Allgood, John <jallgood at ohl.com>
<snip>

>  Here is the problem that we are having. We can't on an consistent basic
> get the GFS filesystem mounted. On
>
<snip>

Just a hunch... Cant say if it will help...

Have you tried putting the mount command in rc.local instead of /etc/fstab?

Regards,

Rajagopal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/c28769a4/attachment.htm>

From raju.rajsand at gmail.com  Fri Oct 24 07:14:16 2008
From: raju.rajsand at gmail.com (Rajagopal Swaminathan)
Date: Fri, 24 Oct 2008 12:44:16 +0530
Subject: [Linux-cluster] Cluster/GFS issue.
In-Reply-To: <8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>
	<8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com>
Message-ID: <8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c@mail.gmail.com>

>
> 2008/10/23 Allgood, John <jallgood at ohl.com>
> <snip>
>
>>  Here is the problem that we are having. We can't on an consistent basic
>> get the GFS filesystem mounted. On
>>
> <snip>
>
> Just a hunch... Cant say if it will help...
>
> Have you tried putting the mount command in rc.local instead of /etc/fstab?
>
start clvmd too in rc.local. of course before mounting and use the commands
in chain using &&

>
> Regards,
>
> Rajagopal
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/c01faa0e/attachment.htm>

From Santosh.Panigrahi at in.unisys.com  Fri Oct 24 11:59:05 2008
From: Santosh.Panigrahi at in.unisys.com (Panigrahi, Santosh Kumar)
Date: Fri, 24 Oct 2008 17:29:05 +0530
Subject: [Linux-cluster] cluster between 2 Xen guests where guests are on
	different hosts 
In-Reply-To: <476A18A2.2080406@wasko.pl>
References: <45824.79.10.137.147.1197661952.squirrel@picard.linux.it>	<1197672129.18614.2.camel@localhost.localdomain>	<26275.62.101.100.5.1197887615.squirrel@picard.linux.it><1197915660.4959.24.camel@ayanami.boston.devel.redhat.com>
	<476A18A2.2080406@wasko.pl>
Message-ID: <D566E8CF3538B54D95B925CB69CB4D2A16BC0485@inblr-exch1.eu.uis.unisys.com>

Hello,

I am using RHEL5.2+RHCS and configured a 2 node cluster in XEN virtual
environment for testing purpose only. These 2 cluster nodes are 2
virtual guests (p6pv1, p7pv1) and each virtual guest is on different
hosts/ Dom-0s (p6 & p7). I have already gone through the older questions
on this forum with similar problems and also the wiki page
(http://sources.redhat.com/cluster/wiki/VMClusterCookbook ).
But still I have confused a bit regarding the Xen fencing in this
scenarios.
I don't want to do any live migration here and only to do a
failover/failback services between 2 cluster nodes. I want to know
whether I have to configure fencing only between the 2 guests (using
fence_xvm) or also between the 2 hosts (using fence_xvmd) as well, where
as my cluster nodes are 2 Xen guests.

I am configuring the cluster using luci and there options are as
follows.

Fence Daemon Properties:
Post Fail Delay		- 0
Post Join Delay		- 3
Run XVM fence daemon	- tick mark selected

XVM fence daemon key distribution:
Enter a node hostname from the host cluster			- ?	
Enter a node hostname from the hosted (virtual) cluster	_ ?	

Can someone please help me in this regard?

Regards,
Santosh


From jeff.sturm at eprize.com  Fri Oct 24 14:09:57 2008
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Fri, 24 Oct 2008 10:09:57 -0400
Subject: [Linux-cluster] cluster between 2 Xen guests where guests are
	ondifferent hosts 
In-Reply-To: <D566E8CF3538B54D95B925CB69CB4D2A16BC0485@inblr-exch1.eu.uis.unisys.com>
References: <45824.79.10.137.147.1197661952.squirrel@picard.linux.it>	<1197672129.18614.2.camel@localhost.localdomain>	<26275.62.101.100.5.1197887615.squirrel@picard.linux.it><1197915660.4959.24.camel@ayanami.boston.devel.redhat.com><476A18A2.2080406@wasko.pl>
	<D566E8CF3538B54D95B925CB69CB4D2A16BC0485@inblr-exch1.eu.uis.unisys.com>
Message-ID: <64D0546C5EBBD147B75DE133D798665F0180693B@hugo.eprize.local>

Santosh,

The hosts are responsible for fencing the guests, so, as far as I know
it is not possible to use fence_xvm without also configuring fence_xvmd.

In our configuration we run an "inner" cluster amongst the DomU guests,
and an "outer" cluster amongst the Dom0 hosts.  The outer cluster starts
fence_xvmd whenever cman starts.  The fence_xvmd daemon listens for
multicast traffic from fence_xvm.  We have a dedicated VLAN for this
traffic in our configuration.  (Make sure your routing tables are
adjusted for this, if needed--whereas aisexec figures out what
interfaces to use for multicast automatically based on the bind address,
fence_xvm does not.)

If your Dom0 hosts are not part of a cluster, it may be possible to run
fence_xvmd standalone.  We have not attempted to do so, so I can't say
whether it can work.

Jeff

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of 
> Panigrahi, Santosh Kumar
> Sent: Friday, October 24, 2008 7:59 AM
> To: linux clustering
> Subject: [Linux-cluster] cluster between 2 Xen guests where 
> guests are ondifferent hosts 
> 
> Hello,
> 
> I am using RHEL5.2+RHCS and configured a 2 node cluster in 
> XEN virtual environment for testing purpose only. These 2 
> cluster nodes are 2 virtual guests (p6pv1, p7pv1) and each 
> virtual guest is on different hosts/ Dom-0s (p6 & p7). I have 
> already gone through the older questions on this forum with 
> similar problems and also the wiki page 
> (http://sources.redhat.com/cluster/wiki/VMClusterCookbook ).
> But still I have confused a bit regarding the Xen fencing in 
> this scenarios.
> I don't want to do any live migration here and only to do a 
> failover/failback services between 2 cluster nodes. I want to 
> know whether I have to configure fencing only between the 2 
> guests (using
> fence_xvm) or also between the 2 hosts (using fence_xvmd) as 
> well, where as my cluster nodes are 2 Xen guests.
> 
> I am configuring the cluster using luci and there options are 
> as follows.
> 
> Fence Daemon Properties:
> Post Fail Delay		- 0
> Post Join Delay		- 3
> Run XVM fence daemon	- tick mark selected
> 
> XVM fence daemon key distribution:
> Enter a node hostname from the host cluster			- ?	
> Enter a node hostname from the hosted (virtual) cluster	_ ?	
> 
> Can someone please help me in this regard?
> 
> Regards,
> Santosh
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 


From cedwards at smartechcorp.net  Fri Oct 24 14:13:07 2008
From: cedwards at smartechcorp.net (Chris Edwards)
Date: Fri, 24 Oct 2008 10:13:07 -0400
Subject: [Linux-cluster] Cluster and LVG/LV
In-Reply-To: <8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c@mail.gmail.com>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>
	<8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com>
	<8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c@mail.gmail.com>
Message-ID: <61252CC53A97634BA52256DCF2344FBC66C68DE2FF@OFFICEEXCHANGE.office.smartechcorp.net>

If I am installing multiple Xen VM's in a cluster with shared iSCSI space with Logical Volumes for each virtual machine should I put each LV in its own logical volume group or should I use one logical volume group for all of the LV's?

Thanks!

---

Chris Edwards


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/7e71e22c/attachment.htm>

From jeff.sturm at eprize.com  Fri Oct 24 14:18:08 2008
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Fri, 24 Oct 2008 10:18:08 -0400
Subject: [Linux-cluster] Two nodes cluster issue without sharedstorageissue
In-Reply-To: <174CED94DD8DC54AB888B56E103B118742183A@CNBEEXC007.nsn-intra.net>
References: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net><1224777717.32460.92.camel@ayanami>
	<174CED94DD8DC54AB888B56E103B118742183A@CNBEEXC007.nsn-intra.net>
Message-ID: <64D0546C5EBBD147B75DE133D798665F0180693C@hugo.eprize.local>

For what it's worth, considerations like these have caused us to abandon any efforts to build a 2-node cluster.

>From this point forward all our RHCS deployments will have a minimum of 3 nodes, even if the 3rd node is a small node that provides no resources and only exists for arbitration purposes.  (It was going to be that, or a quorum disk for our application, but we have no experience running a quorum disk over the long-haul in a production envrironment.)

Hope this helps someone.

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chen, 
> Mockey (NSN - CN/Cheng Du)
> Sent: Thursday, October 23, 2008 10:36 PM
> To: linux clustering
> Subject: RE: [Linux-cluster] Two nodes cluster issue without 
> sharedstorageissue
> 
>  
> 
> >-----Original Message-----
> >From: linux-cluster-bounces at redhat.com 
> >[mailto:linux-cluster-bounces at redhat.com] On Behalf Of ext Lon 
> >Hohberger
> >Sent: 2008?10?24? 0:02
> >To: linux clustering
> >Subject: Re: [Linux-cluster] Two nodes cluster issue without shared 
> >storageissue
> >
> >On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
> >wrote:
> >> Hi,
> >> 
> >> I want to set up a two node cluster, I use active/standby
> >mode to run
> >> my service. I need even one node's hardware failure such as
> >power cut,
> >> another node still can handover from failure node and the
> >provide the
> >> service.
> >> 
> >> In my environment, I have no shared storage, so I can not 
> use quorum 
> >> disk. Is there any other way to implement it? I searched and found 
> >> 'tiebreaker IP' may feed my request, but I can not found any
> >hints on
> >> how to configure it ?
> >
> >Since you have no shared data, you may be able to run 
> without fencing. 
> >
> >That should be pretty straightforward, but you might need to comment 
> >out the "fenced" startup from the cman init script.
> >
> >In this case, the worst that will happen is both nodes will end up 
> >running the service at the same time in the event of a network 
> >partition.
> >
> >The other down side is that if the cluster divides into two 
> partitions 
> >and later merges back into one partition, I don't think 
> certain things 
> >will work right; you will need to detect this event and 
> reboot one of 
> >the nodes.
> >
> >-- Lon
> 
> I know such defects in two node cluster.  
> Since our service is mission critical, I want to know how to 
> avoid such failure case ?
> 
> Thanks.
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 


From jeff.sturm at eprize.com  Fri Oct 24 14:20:04 2008
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Fri, 24 Oct 2008 10:20:04 -0400
Subject: [Linux-cluster] Cluster and LVG/LV
In-Reply-To: <61252CC53A97634BA52256DCF2344FBC66C68DE2FF@OFFICEEXCHANGE.office.smartechcorp.net>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com><8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com><8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c@mail.gmail.com>
	<61252CC53A97634BA52256DCF2344FBC66C68DE2FF@OFFICEEXCHANGE.office.smartechcorp.net>
Message-ID: <64D0546C5EBBD147B75DE133D798665F0180693D@hugo.eprize.local>

Chris,
 
Are you running a clustered LVM, and do you expect to be able to use Xen
migration?
 
Jeff


________________________________

	From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
	Sent: Friday, October 24, 2008 10:13 AM
	To: linux clustering
	Subject: [Linux-cluster] Cluster and LVG/LV
	
	
	If I am installing multiple Xen VM's in a cluster with shared
iSCSI space with Logical Volumes for each virtual machine should I put
each LV in its own logical volume group or should I use one logical
volume group for all of the LV's?

	 
	Thanks!

	 
	---

	 
	Chris Edwards
	
	
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/5650b4ab/attachment.htm>

From cedwards at smartechcorp.net  Fri Oct 24 14:28:35 2008
From: cedwards at smartechcorp.net (Chris Edwards)
Date: Fri, 24 Oct 2008 10:28:35 -0400
Subject: [Linux-cluster] Cluster and LVG/LV
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F0180693D@hugo.eprize.local>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com><8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com><8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c@mail.gmail.com>
	<61252CC53A97634BA52256DCF2344FBC66C68DE2FF@OFFICEEXCHANGE.office.smartechcorp.net>
	<64D0546C5EBBD147B75DE133D798665F0180693D@hugo.eprize.local>
Message-ID: <61252CC53A97634BA52256DCF2344FBC66C68DE303@OFFICEEXCHANGE.office.smartechcorp.net>

Yes to both.  Right now the cluster is running GFS and I can migrate VM's between the nodes.

This question is coming up because I have been trying to do a snap shot and I realized the snapshot is stored on the Volume Group that the LV is located on.  I did not realize  this and I cannot do a snapshot because I did not leave enough space in each of the Volume Groups for each of the VM's.

---

Chris Edwards
Smartech Corp.
Div. of AirNet Group
http://www.airnetgroup.com<http://www.airnetgroup.com/>
http://www.smartechcorp.net<http://www.smartechcorp.net/>
cedwards at smartechcorp.net<mailto:agarrison at smartechcorp.net>
P:  423-664-7678 x114
C:  423-593-6964
F:  423-664-7680

From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Sturm
Sent: Friday, October 24, 2008 10:20 AM
To: linux clustering
Subject: RE: [Linux-cluster] Cluster and LVG/LV

Chris,

Are you running a clustered LVM, and do you expect to be able to use Xen migration?

Jeff

________________________________
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
Sent: Friday, October 24, 2008 10:13 AM
To: linux clustering
Subject: [Linux-cluster] Cluster and LVG/LV
If I am installing multiple Xen VM's in a cluster with shared iSCSI space with Logical Volumes for each virtual machine should I put each LV in its own logical volume group or should I use one logical volume group for all of the LV's?

Thanks!

---

Chris Edwards


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/7bf865ff/attachment.htm>

From rodrique.heron at baruch.cuny.edu  Fri Oct 24 14:33:07 2008
From: rodrique.heron at baruch.cuny.edu (Rodrique Heron)
Date: Fri, 24 Oct 2008 10:33:07 -0400
Subject: [Linux-cluster] Cluster and LVG/LV
In-Reply-To: <61252CC53A97634BA52256DCF2344FBC66C68DE303@OFFICEEXCHANGE.office.smartechcorp.net>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com><8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com><8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c@mail.gmail.com><61252CC53A97634BA52256DCF2344FBC66C68DE2FF@OFFICEEXCHANGE.office.smartechcorp.net><64D0546C5EBBD147B75DE133D798665F0180693D@hugo.eprize.local>
	<61252CC53A97634BA52256DCF2344FBC66C68DE303@OFFICEEXCHANGE.office.smartechcorp.net>
Message-ID: <4901DCA3.3050501@baruch.cuny.edu>

I am would be interested what others have to say as well, but I have one 
VG that I carved a LV from for each VM.


Chris Edwards wrote:
>
> Yes to both.  Right now the cluster is running GFS and I can migrate 
> VM's between the nodes.  
>
>  
>
> This question is coming up because I have been trying to do a snap 
> shot and I realized the snapshot is stored on the Volume Group that 
> the LV is located on.  I did not realize  this and I cannot do a 
> snapshot because I did not leave enough space in each of the Volume 
> Groups for each of the VM's.
>
>  
>
> ---
>
>  
>
> Chris Edwards
> Smartech Corp.
> Div. of AirNet Group
>
> http://www.airnetgroup.com <http://www.airnetgroup.com/>
>
> http://www.smartechcorp.net <http://www.smartechcorp.net/>
>
> cedwards at smartechcorp.net <mailto:agarrison at smartechcorp.net>
> P:  423-664-7678 x114
>
> C:  423-593-6964
>
> F:  423-664-7680
>
>  
>
> *From:* linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] *On Behalf Of *Jeff Sturm
> *Sent:* Friday, October 24, 2008 10:20 AM
> *To:* linux clustering
> *Subject:* RE: [Linux-cluster] Cluster and LVG/LV
>
>  
>
> Chris,
>
>  
>
> Are you running a clustered LVM, and do you expect to be able to use 
> Xen migration?
>
>  
>
> Jeff
>
>      
>
>     ------------------------------------------------------------------------
>
>     *From:* linux-cluster-bounces at redhat.com
>     [mailto:linux-cluster-bounces at redhat.com] *On Behalf Of *Chris Edwards
>     *Sent:* Friday, October 24, 2008 10:13 AM
>     *To:* linux clustering
>     *Subject:* [Linux-cluster] Cluster and LVG/LV
>
>     If I am installing multiple Xen VM's in a cluster with shared
>     iSCSI space with Logical Volumes for each virtual machine should I
>     put each LV in its own logical volume group or should I use one
>     logical volume group for all of the LV's?
>
>      
>
>     Thanks!
>
>      
>
>     ---
>
>      
>
>     Chris Edwards
>
>      
>
>      
>

-- 
Rodrique Heron 


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/19e9655b/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rodrique_heron.vcf
Type: text/x-vcard
Size: 328 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/19e9655b/attachment.vcf>

From v.galande at gmail.com  Fri Oct 24 14:50:46 2008
From: v.galande at gmail.com (varun)
Date: Fri, 24 Oct 2008 18:50:46 +0400
Subject: [Linux-cluster] RE:Two nodes cluster issue without shared storage
	issue
Message-ID: <7e19e5b90810240750n1e5aa2abq8a5af976f1677703@mail.gmail.com>

Hi Lon

I think you should try Linux Virtual Server (  LVS ) here this will
definitely  help you.
You can see the details over here .

www.linuxvirtualserver.org

Br,Varun

On Fri, Oct 24, 2008 at 6:20 PM, <linux-cluster-request at redhat.com> wrote:

> Send Linux-cluster mailing list submissions to
>        linux-cluster at redhat.com
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        https://www.redhat.com/mailman/listinfo/linux-cluster
> or, via email, send a message with subject or body 'help' to
>        linux-cluster-request at redhat.com
>
> You can reach the person managing the list at
>        linux-cluster-owner at redhat.com
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Linux-cluster digest..."
>
>
> Today's Topics:
>
>   1. Re: Two nodes cluster issue without shared storage        issue
>      (Lon Hohberger)
>   2. Re: Two nodes cluster issue without shared storage        issue
>      (Flavio Junior)
>   3. RE: Two nodes cluster issue without shared        storageissue
>      (Chen, Mockey (NSN - CN/Cheng Du))
>   4. RE: Two nodes cluster issue without shared        storageissue
>      (Chen, Mockey (NSN - CN/Cheng Du))
>   5. Re: Cluster/GFS issue. (Rajagopal Swaminathan)
>   6. Re: Cluster/GFS issue. (Rajagopal Swaminathan)
>   7. cluster between 2 Xen guests where guests are on  different
>      hosts  (Panigrahi, Santosh Kumar)
>   8. RE: cluster between 2 Xen guests where guests are ondifferent
>      hosts  (Jeff Sturm)
>   9. Cluster and LVG/LV (Chris Edwards)
>  10. RE: Two nodes cluster issue without sharedstorageissue
>      (Jeff Sturm)
>  11. RE: Cluster and LVG/LV (Jeff Sturm)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 23 Oct 2008 12:01:57 -0400
> From: Lon Hohberger <lhh at redhat.com>
> Subject: Re: [Linux-cluster] Two nodes cluster issue without shared
>        storage issue
> To: linux clustering <linux-cluster at redhat.com>
> Message-ID: <1224777717.32460.92.camel at ayanami>
> Content-Type: text/plain
>
> On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
> wrote:
> > Hi,
> >
> > I want to set up a two node cluster, I use active/standby mode to run
> > my service. I need even one node's hardware failure such as power cut,
> > another node still can handover from failure node and the provide the
> > service.
> >
> > In my environment, I have no shared storage, so I can not use quorum
> > disk. Is there any other way to implement it? I searched and found
> > 'tiebreaker IP' may feed my request, but I can not found any hints on
> > how to configure it ?
>
> Since you have no shared data, you may be able to run without fencing.
>
> That should be pretty straightforward, but you might need to comment out
> the "fenced" startup from the cman init script.
>
> In this case, the worst that will happen is both nodes will end up
> running the service at the same time in the event of a network
> partition.
>
> The other down side is that if the cluster divides into two partitions
> and later merges back into one partition, I don't think certain things
> will work right; you will need to detect this event and reboot one of
> the nodes.
>
> -- Lon
>
>
>
> ------------------------------
>
> Message: 2
> Date: Thu, 23 Oct 2008 14:42:40 -0200
> From: "Flavio Junior" <billpp at gmail.com>
> Subject: Re: [Linux-cluster] Two nodes cluster issue without shared
>        storage issue
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <58aa8d780810230942s421d74dfqaf61190be764b57 at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Well.. If you are using an active/standby scenario, without a shared
> storage, probably you can make use of CARP/UCARP
>
> http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol
> http://www.ucarp.org/project/ucarp
>
>
> --
>
> Fl?vio do Carmo J?nior aka waKKu
>
>
> On Thu, Oct 23, 2008 at 2:01 PM, Lon Hohberger <lhh at redhat.com> wrote:
>
> > On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
> > wrote:
> > > Hi,
> > >
> > > I want to set up a two node cluster, I use active/standby mode to run
> > > my service. I need even one node's hardware failure such as power cut,
> > > another node still can handover from failure node and the provide the
> > > service.
> > >
> > > In my environment, I have no shared storage, so I can not use quorum
> > > disk. Is there any other way to implement it? I searched and found
> > > 'tiebreaker IP' may feed my request, but I can not found any hints on
> > > how to configure it ?
> >
> > Since you have no shared data, you may be able to run without fencing.
> >
> > That should be pretty straightforward, but you might need to comment out
> > the "fenced" startup from the cman init script.
> >
> > In this case, the worst that will happen is both nodes will end up
> > running the service at the same time in the event of a network
> > partition.
> >
> > The other down side is that if the cluster divides into two partitions
> > and later merges back into one partition, I don't think certain things
> > will work right; you will need to detect this event and reboot one of
> > the nodes.
> >
> > -- Lon
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> https://www.redhat.com/archives/linux-cluster/attachments/20081023/0fdd088f/attachment.html
>
> ------------------------------
>
> Message: 3
> Date: Fri, 24 Oct 2008 10:35:48 +0800
> From: "Chen, Mockey (NSN - CN/Cheng Du)" <mockey.chen at nsn.com>
> Subject: RE: [Linux-cluster] Two nodes cluster issue without shared
>        storageissue
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <174CED94DD8DC54AB888B56E103B118742183A at CNBEEXC007.nsn-intra.net>
> Content-Type: text/plain;       charset="gb2312"
>
>
>
> >-----Original Message-----
> >From: linux-cluster-bounces at redhat.com
> >[mailto:linux-cluster-bounces at redhat.com] On Behalf Of ext Lon
> >Hohberger
> >Sent: 2008??10??24?? 0:02
> >To: linux clustering
> >Subject: Re: [Linux-cluster] Two nodes cluster issue without
> >shared storageissue
> >
> >On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
> >wrote:
> >> Hi,
> >>
> >> I want to set up a two node cluster, I use active/standby
> >mode to run
> >> my service. I need even one node's hardware failure such as
> >power cut,
> >> another node still can handover from failure node and the
> >provide the
> >> service.
> >>
> >> In my environment, I have no shared storage, so I can not use quorum
> >> disk. Is there any other way to implement it? I searched and found
> >> 'tiebreaker IP' may feed my request, but I can not found any
> >hints on
> >> how to configure it ?
> >
> >Since you have no shared data, you may be able to run without fencing.
> >
> >That should be pretty straightforward, but you might need to
> >comment out the "fenced" startup from the cman init script.
> >
> >In this case, the worst that will happen is both nodes will
> >end up running the service at the same time in the event of a
> >network partition.
> >
> >The other down side is that if the cluster divides into two
> >partitions and later merges back into one partition, I don't
> >think certain things will work right; you will need to detect
> >this event and reboot one of the nodes.
> >
> >-- Lon
>
> I know such defects in two node cluster.
> Since our service is mission critical, I want to know how to avoid such
> failure case ?
>
> Thanks.
>
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Fri, 24 Oct 2008 10:41:11 +0800
> From: "Chen, Mockey (NSN - CN/Cheng Du)" <mockey.chen at nsn.com>
> Subject: RE: [Linux-cluster] Two nodes cluster issue without shared
>        storageissue
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <174CED94DD8DC54AB888B56E103B118742183D at CNBEEXC007.nsn-intra.net>
> Content-Type: text/plain;       charset="gb2312"
>
> >
> >       From: linux-cluster-bounces at redhat.com [mailto:
> linux-cluster-bounces at redhat.com] On Behalf Of ext Flavio Junior
> >       Sent: 2008??10??24?? 0:43
> >       To: linux clustering
> >       Subject: Re: [Linux-cluster] Two nodes cluster issue without shared
> storageissue
> >
> >
> >       Well.. If you are using an active/standby scenario, without a
> shared storage, probably you can make use of CARP/UCARP
> >
> >       http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol
> >       http://www.ucarp.org/project/ucarp
> >
>
> I think CARP will fullfill my current request, but we have choose RHCS as
> our cluster suite. It is very difficult to change it.
> Anyhow, Thanks for your suggestion.
>
>
>
> ------------------------------
>
> Message: 5
> Date: Fri, 24 Oct 2008 12:41:33 +0530
> From: "Rajagopal Swaminathan" <raju.rajsand at gmail.com>
> Subject: Re: [Linux-cluster] Cluster/GFS issue.
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <8786b91c0810240011u71e91161ia374c591d5f3cadb at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Greetings,
>
> 2008/10/23 Allgood, John <jallgood at ohl.com>
> <snip>
>
> >  Here is the problem that we are having. We can't on an consistent basic
> > get the GFS filesystem mounted. On
> >
> <snip>
>
> Just a hunch... Cant say if it will help...
>
> Have you tried putting the mount command in rc.local instead of /etc/fstab?
>
> Regards,
>
> Rajagopal
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> https://www.redhat.com/archives/linux-cluster/attachments/20081024/c28769a4/attachment.html
>
> ------------------------------
>
> Message: 6
> Date: Fri, 24 Oct 2008 12:44:16 +0530
> From: "Rajagopal Swaminathan" <raju.rajsand at gmail.com>
> Subject: Re: [Linux-cluster] Cluster/GFS issue.
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> >
> > 2008/10/23 Allgood, John <jallgood at ohl.com>
> > <snip>
> >
> >>  Here is the problem that we are having. We can't on an consistent basic
> >> get the GFS filesystem mounted. On
> >>
> > <snip>
> >
> > Just a hunch... Cant say if it will help...
> >
> > Have you tried putting the mount command in rc.local instead of
> /etc/fstab?
> >
> start clvmd too in rc.local. of course before mounting and use the commands
> in chain using &&
>
> >
> > Regards,
> >
> > Rajagopal
> >
> >
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> https://www.redhat.com/archives/linux-cluster/attachments/20081024/c01faa0e/attachment.html
>
> ------------------------------
>
> Message: 7
> Date: Fri, 24 Oct 2008 17:29:05 +0530
> From: "Panigrahi, Santosh Kumar" <Santosh.Panigrahi at in.unisys.com>
> Subject: [Linux-cluster] cluster between 2 Xen guests where guests are
>        on      different hosts
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <
> D566E8CF3538B54D95B925CB69CB4D2A16BC0485 at inblr-exch1.eu.uis.unisys.com>
>
> Content-Type: text/plain;       charset="us-ascii"
>
> Hello,
>
> I am using RHEL5.2+RHCS and configured a 2 node cluster in XEN virtual
> environment for testing purpose only. These 2 cluster nodes are 2
> virtual guests (p6pv1, p7pv1) and each virtual guest is on different
> hosts/ Dom-0s (p6 & p7). I have already gone through the older questions
> on this forum with similar problems and also the wiki page
> (http://sources.redhat.com/cluster/wiki/VMClusterCookbook ).
> But still I have confused a bit regarding the Xen fencing in this
> scenarios.
> I don't want to do any live migration here and only to do a
> failover/failback services between 2 cluster nodes. I want to know
> whether I have to configure fencing only between the 2 guests (using
> fence_xvm) or also between the 2 hosts (using fence_xvmd) as well, where
> as my cluster nodes are 2 Xen guests.
>
> I am configuring the cluster using luci and there options are as
> follows.
>
> Fence Daemon Properties:
> Post Fail Delay         - 0
> Post Join Delay         - 3
> Run XVM fence daemon    - tick mark selected
>
> XVM fence daemon key distribution:
> Enter a node hostname from the host cluster                     - ?
> Enter a node hostname from the hosted (virtual) cluster _ ?
>
> Can someone please help me in this regard?
>
> Regards,
> Santosh
>
>
>
> ------------------------------
>
> Message: 8
> Date: Fri, 24 Oct 2008 10:09:57 -0400
> From: "Jeff Sturm" <jeff.sturm at eprize.com>
> Subject: RE: [Linux-cluster] cluster between 2 Xen guests where guests
>        are     ondifferent hosts
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <64D0546C5EBBD147B75DE133D798665F0180693B at hugo.eprize.local>
> Content-Type: text/plain;       charset="us-ascii"
>
> Santosh,
>
> The hosts are responsible for fencing the guests, so, as far as I know
> it is not possible to use fence_xvm without also configuring fence_xvmd.
>
> In our configuration we run an "inner" cluster amongst the DomU guests,
> and an "outer" cluster amongst the Dom0 hosts.  The outer cluster starts
> fence_xvmd whenever cman starts.  The fence_xvmd daemon listens for
> multicast traffic from fence_xvm.  We have a dedicated VLAN for this
> traffic in our configuration.  (Make sure your routing tables are
> adjusted for this, if needed--whereas aisexec figures out what
> interfaces to use for multicast automatically based on the bind address,
> fence_xvm does not.)
>
> If your Dom0 hosts are not part of a cluster, it may be possible to run
> fence_xvmd standalone.  We have not attempted to do so, so I can't say
> whether it can work.
>
> Jeff
>
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of
> > Panigrahi, Santosh Kumar
> > Sent: Friday, October 24, 2008 7:59 AM
> > To: linux clustering
> > Subject: [Linux-cluster] cluster between 2 Xen guests where
> > guests are ondifferent hosts
> >
> > Hello,
> >
> > I am using RHEL5.2+RHCS and configured a 2 node cluster in
> > XEN virtual environment for testing purpose only. These 2
> > cluster nodes are 2 virtual guests (p6pv1, p7pv1) and each
> > virtual guest is on different hosts/ Dom-0s (p6 & p7). I have
> > already gone through the older questions on this forum with
> > similar problems and also the wiki page
> > (http://sources.redhat.com/cluster/wiki/VMClusterCookbook ).
> > But still I have confused a bit regarding the Xen fencing in
> > this scenarios.
> > I don't want to do any live migration here and only to do a
> > failover/failback services between 2 cluster nodes. I want to
> > know whether I have to configure fencing only between the 2
> > guests (using
> > fence_xvm) or also between the 2 hosts (using fence_xvmd) as
> > well, where as my cluster nodes are 2 Xen guests.
> >
> > I am configuring the cluster using luci and there options are
> > as follows.
> >
> > Fence Daemon Properties:
> > Post Fail Delay               - 0
> > Post Join Delay               - 3
> > Run XVM fence daemon  - tick mark selected
> >
> > XVM fence daemon key distribution:
> > Enter a node hostname from the host cluster                   - ?
> > Enter a node hostname from the hosted (virtual) cluster       _ ?
> >
> > Can someone please help me in this regard?
> >
> > Regards,
> > Santosh
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
>
>
>
> ------------------------------
>
> Message: 9
> Date: Fri, 24 Oct 2008 10:13:07 -0400
> From: Chris Edwards <cedwards at smartechcorp.net>
> Subject: [Linux-cluster] Cluster and LVG/LV
> To: linux clustering <linux-cluster at redhat.com>
> Message-ID:
>        <
> 61252CC53A97634BA52256DCF2344FBC66C68DE2FF at OFFICEEXCHANGE.office.smartechcorp.net
> >
>
> Content-Type: text/plain; charset="us-ascii"
>
> If I am installing multiple Xen VM's in a cluster with shared iSCSI space
> with Logical Volumes for each virtual machine should I put each LV in its
> own logical volume group or should I use one logical volume group for all of
> the LV's?
>
> Thanks!
>
> ---
>
> Chris Edwards
>
>
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> https://www.redhat.com/archives/linux-cluster/attachments/20081024/7e71e22c/attachment.html
>
> ------------------------------
>
> Message: 10
> Date: Fri, 24 Oct 2008 10:18:08 -0400
> From: "Jeff Sturm" <jeff.sturm at eprize.com>
> Subject: RE: [Linux-cluster] Two nodes cluster issue without
>        sharedstorageissue
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <64D0546C5EBBD147B75DE133D798665F0180693C at hugo.eprize.local>
> Content-Type: text/plain;       charset="iso-2022-jp"
>
> For what it's worth, considerations like these have caused us to abandon
> any efforts to build a 2-node cluster.
>
> >From this point forward all our RHCS deployments will have a minimum of 3
> nodes, even if the 3rd node is a small node that provides no resources and
> only exists for arbitration purposes.  (It was going to be that, or a quorum
> disk for our application, but we have no experience running a quorum disk
> over the long-haul in a production envrironment.)
>
> Hope this helps someone.
>
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chen,
> > Mockey (NSN - CN/Cheng Du)
> > Sent: Thursday, October 23, 2008 10:36 PM
> > To: linux clustering
> > Subject: RE: [Linux-cluster] Two nodes cluster issue without
> > sharedstorageissue
> >
> >
> >
> > >-----Original Message-----
> > >From: linux-cluster-bounces at redhat.com
> > >[mailto:linux-cluster-bounces at redhat.com] On Behalf Of ext Lon
> > >Hohberger
> > >Sent: 2008 $BG/ (J10 $B7n (J24 $BF| (J 0:02
> > >To: linux clustering
> > >Subject: Re: [Linux-cluster] Two nodes cluster issue without shared
> > >storageissue
> > >
> > >On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
> > >wrote:
> > >> Hi,
> > >>
> > >> I want to set up a two node cluster, I use active/standby
> > >mode to run
> > >> my service. I need even one node's hardware failure such as
> > >power cut,
> > >> another node still can handover from failure node and the
> > >provide the
> > >> service.
> > >>
> > >> In my environment, I have no shared storage, so I can not
> > use quorum
> > >> disk. Is there any other way to implement it? I searched and found
> > >> 'tiebreaker IP' may feed my request, but I can not found any
> > >hints on
> > >> how to configure it ?
> > >
> > >Since you have no shared data, you may be able to run
> > without fencing.
> > >
> > >That should be pretty straightforward, but you might need to comment
> > >out the "fenced" startup from the cman init script.
> > >
> > >In this case, the worst that will happen is both nodes will end up
> > >running the service at the same time in the event of a network
> > >partition.
> > >
> > >The other down side is that if the cluster divides into two
> > partitions
> > >and later merges back into one partition, I don't think
> > certain things
> > >will work right; you will need to detect this event and
> > reboot one of
> > >the nodes.
> > >
> > >-- Lon
> >
> > I know such defects in two node cluster.
> > Since our service is mission critical, I want to know how to
> > avoid such failure case ?
> >
> > Thanks.
> >
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
>
>
>
> ------------------------------
>
> Message: 11
> Date: Fri, 24 Oct 2008 10:20:04 -0400
> From: "Jeff Sturm" <jeff.sturm at eprize.com>
> Subject: RE: [Linux-cluster] Cluster and LVG/LV
> To: "linux clustering" <linux-cluster at redhat.com>
> Message-ID:
>        <64D0546C5EBBD147B75DE133D798665F0180693D at hugo.eprize.local>
> Content-Type: text/plain; charset="us-ascii"
>
> Chris,
>
> Are you running a clustered LVM, and do you expect to be able to use Xen
> migration?
>
> Jeff
>
>
> ________________________________
>
>        From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
>        Sent: Friday, October 24, 2008 10:13 AM
>        To: linux clustering
>        Subject: [Linux-cluster] Cluster and LVG/LV
>
>
>
>        If I am installing multiple Xen VM's in a cluster with shared
> iSCSI space with Logical Volumes for each virtual machine should I put
> each LV in its own logical volume group or should I use one logical
> volume group for all of the LV's?
>
>
>
>        Thanks!
>
>
>
>        ---
>
>
>
>        Chris Edwards
>
>
>
>
>
>
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> https://www.redhat.com/archives/linux-cluster/attachments/20081024/5650b4ab/attachment.html
>
> ------------------------------
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> End of Linux-cluster Digest, Vol 54, Issue 31
> *********************************************
>


-- 
Regards,

Varun Galande
+971505589029
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/d152b775/attachment.htm>

From jeff.sturm at eprize.com  Fri Oct 24 14:59:44 2008
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Fri, 24 Oct 2008 10:59:44 -0400
Subject: [Linux-cluster] Cluster and LVG/LV
In-Reply-To: <61252CC53A97634BA52256DCF2344FBC66C68DE303@OFFICEEXCHANGE.office.smartechcorp.net>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com><8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com><8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c@mail.gmail.com><61252CC53A97634BA52256DCF2344FBC66C68DE2FF@OFFICEEXCHANGE.office.smartechcorp.net><64D0546C5EBBD147B75DE133D798665F0180693D@hugo.eprize.local>
	<61252CC53A97634BA52256DCF2344FBC66C68DE303@OFFICEEXCHANGE.office.smartechcorp.net>
Message-ID: <64D0546C5EBBD147B75DE133D798665F01806944@hugo.eprize.local>

Okay.  For CLVM it probably makes the most sense to run one big volume
group across your cluster, but there's also the option of running a
non-clustered LVM on each Dom0 host.  The latter would only work for you
however if you don't require Xen migration.
 
I see 3 options for central storage in a Xen cluster, each with their
own drawbacks:
 
1) Run a single clustered volume group across all hosts, containing one
or more PV's from your shared storage.
 
2) Run a non-clustered volume group on each host, each with a distinct
PV carved out of your shared storage.
 
3) Export storage for each host individually from your SAN, i.e. rely
completely on your SAN for volume management.  With this you don't need
LVM at all.
 
Both 1) and 3) allow you to use Xen migration.  2) is feasible if you
don't need to migrate guests online.
 
Our problem with 1) is snapshot support, and that we could not get
pvmove to work acceptably well.  (We had to make the entire volume group
inactive before pvmove would even run--I'm not sure if it is expected,
or what we did wrong.)
 
We've tried and failed at 1), and will now be attempting 3).  This gives
us a lot of flexibility on a storage appliance that supports snapshots.
I'd still like to have pvmove work so we could migrate online from one
SAN to another, if needed, but I haven't been able to get it to work
acceptably well.
 
Also I thought I had read that snapshots are not supported by a
clustered LVM?  That would be difficult for us too, as we are relying on
snapshots for a backup mechanism.
 
Jeff


________________________________

	From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
	Sent: Friday, October 24, 2008 10:29 AM
	To: linux clustering
	Subject: RE: [Linux-cluster] Cluster and LVG/LV
	
	
	Yes to both.  Right now the cluster is running GFS and I can
migrate VM's between the nodes.   

	 
	This question is coming up because I have been trying to do a
snap shot and I realized the snapshot is stored on the Volume Group that
the LV is located on.  I did not realize  this and I cannot do a
snapshot because I did not leave enough space in each of the Volume
Groups for each of the VM's.

	 
	---

	 
	Chris Edwards
	Smartech Corp.
	Div. of AirNet Group

	http://www.airnetgroup.com <http://www.airnetgroup.com/> 

	http://www.smartechcorp.net <http://www.smartechcorp.net/> 

	cedwards at smartechcorp.net <mailto:agarrison at smartechcorp.net> 
	P:  423-664-7678 x114

	C:  423-593-6964

	F:  423-664-7680

	 
	From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Sturm
	Sent: Friday, October 24, 2008 10:20 AM
	To: linux clustering
	Subject: RE: [Linux-cluster] Cluster and LVG/LV

	 
	Chris,

	 
	Are you running a clustered LVM, and do you expect to be able to
use Xen migration?

	 
	Jeff

		 
________________________________

		From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
		Sent: Friday, October 24, 2008 10:13 AM
		To: linux clustering
		Subject: [Linux-cluster] Cluster and LVG/LV

		If I am installing multiple Xen VM's in a cluster with
shared iSCSI space with Logical Volumes for each virtual machine should
I put each LV in its own logical volume group or should I use one
logical volume group for all of the LV's?

		 
		Thanks!

		 
		---

		 
		Chris Edwards

		 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/60be6460/attachment.htm>

From lhh at redhat.com  Fri Oct 24 15:37:30 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 24 Oct 2008 11:37:30 -0400
Subject: [Linux-cluster] cluster between 2 Xen guests where guests are
	ondifferent hosts
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F0180693B@hugo.eprize.local>
References: <45824.79.10.137.147.1197661952.squirrel@picard.linux.it>
	<1197672129.18614.2.camel@localhost.localdomain>
	<26275.62.101.100.5.1197887615.squirrel@picard.linux.it>
	<1197915660.4959.24.camel@ayanami.boston.devel.redhat.com>
	<476A18A2.2080406@wasko.pl>
	<D566E8CF3538B54D95B925CB69CB4D2A16BC0485@inblr-exch1.eu.uis.unisys.com>
	<64D0546C5EBBD147B75DE133D798665F0180693B@hugo.eprize.local>
Message-ID: <1224862650.32460.126.camel@ayanami>

On Fri, 2008-10-24 at 10:09 -0400, Jeff Sturm wrote:
> Santosh,
> 
> The hosts are responsible for fencing the guests, so, as far as I know
> it is not possible to use fence_xvm without also configuring fence_xvmd.

Correct.

> In our configuration we run an "inner" cluster amongst the DomU guests,
> and an "outer" cluster amongst the Dom0 hosts.  The outer cluster starts
> fence_xvmd whenever cman starts.  The fence_xvmd daemon listens for
> multicast traffic from fence_xvm.  We have a dedicated VLAN for this
> traffic in our configuration.  (Make sure your routing tables are
> adjusted for this, if needed--whereas aisexec figures out what
> interfaces to use for multicast automatically based on the bind address,
> fence_xvm does not.)


> If your Dom0 hosts are not part of a cluster, it may be possible to run
> fence_xvmd standalone.  We have not attempted to do so, so I can't say
> whether it can work.

fence_xvmd -LX (need to add to rc.local or something)

You could (in theory) do fencing using multiple fence_xvm agent
instances to try different keys (one per physical host) so that if
fencing a host on one key succeeds, you also ensure the other guest
isn't running the node.

For example, if you had two keys on the guests, you could do the
following:

  * dd if=/dev/urandom of=/etc/cluster/fence_xvm-host1.key bs=4k count=1
  * dd if=/dev/urandom of=/etc/cluster/fence_xvm-host2.key bs=4k count=1
  * scp /etc/cluster/fence_xvm-host1.key
        host1:/etc/cluster/fence_xvm.key
  * scp /etc/cluster/fence_xvm-host2.key
        host2:/etc/cluster/fence_xvm.key

(don't forget to copy /etc/cluster/fence_xvm* to the other virtual guest
too!)

Set up two fencing devices:

  <fencedevices>
    <fencedevice agent="fence_xvm" name="host1" 
      key_file="/etc/cluster/fence_xvm-host1.key" />
    <fencedevice agent="fence_xvm" name="host2" 
      key_file="/etc/cluster/fence_xvm-host2.key" />
  </fencedevices>

Set up the nodes to fence both:
  <clusternodes>
    <clusternode name="virt1.mydomain.com">
      <fence>
        <method name="hack-xvm">
          <device name="host1" domain="virt1"/>
          <device name="host2" domain="virt1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="virt2.mydomain.com">
      <fence>
        <method name="hack-xvm">
          <device name="host2" domain="virt2"/>
          <device name="host1" domain="virt2"/>
        </method>
      </fence>
    </clusternode>

... maybe that would work.

The reason you need a cluster in dom0 typically is because we use
Checkpointing to distribute the states of VMs cluster-wide.  If there's
no cluster, then you can't distribute the states.  Now, key files are,
well, key here - fence_xvmd assumes that the admin does the correct
thing (not reusing key files on multiple clusters), so therefore it
returns "ok" if it's not got information about a guest...

Suppose virt1 (on guest1) fails:

 * virt2 sends a request that only host2 listens to to try to fence
virt1.
   - "Never heard of that domain, so it must be safe"
 * virt2 sends a request that only host1 listens to to try to fence
virt1.
   - "Ok, it's running locally -> kill it and return success"

-- Lon


From rodrique.heron at baruch.cuny.edu  Fri Oct 24 15:54:48 2008
From: rodrique.heron at baruch.cuny.edu (Rodrique Heron)
Date: Fri, 24 Oct 2008 11:54:48 -0400
Subject: [Linux-cluster] Cluster and LVG/LV
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F01806944@hugo.eprize.local>
Message-ID: <856BC630A1FD6540B94C17D96A17843F157BDA@mb01.baruch.local>

Jeff-

 
Thanks for your thoughts, until now I never really considered exporting
storage from the SAN to my domU's. I can definitely see the advantage
here, using the SAN snapshot utilities, it most cases it can be
automated. I am interested in how you would accomplish similar
functionality to the SAN snapshot, using LVM snapshots (let's say lvm
snapshot support worked well).

 
________________________________

From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Sturm
Sent: Friday, October 24, 2008 11:00 AM
To: linux clustering
Subject: RE: [Linux-cluster] Cluster and LVG/LV

 
Okay.  For CLVM it probably makes the most sense to run one big volume
group across your cluster, but there's also the option of running a
non-clustered LVM on each Dom0 host.  The latter would only work for you
however if you don't require Xen migration.

 
I see 3 options for central storage in a Xen cluster, each with their
own drawbacks:

 
1) Run a single clustered volume group across all hosts, containing one
or more PV's from your shared storage.

 
2) Run a non-clustered volume group on each host, each with a distinct
PV carved out of your shared storage.

 
3) Export storage for each host individually from your SAN, i.e. rely
completely on your SAN for volume management.  With this you don't need
LVM at all.

 
Both 1) and 3) allow you to use Xen migration.  2) is feasible if you
don't need to migrate guests online.

 
Our problem with 1) is snapshot support, and that we could not get
pvmove to work acceptably well.  (We had to make the entire volume group
inactive before pvmove would even run--I'm not sure if it is expected,
or what we did wrong.)

 
We've tried and failed at 1), and will now be attempting 3).  This gives
us a lot of flexibility on a storage appliance that supports snapshots.
I'd still like to have pvmove work so we could migrate online from one
SAN to another, if needed, but I haven't been able to get it to work
acceptably well.

 
Also I thought I had read that snapshots are not supported by a
clustered LVM?  That would be difficult for us too, as we are relying on
snapshots for a backup mechanism.

 
Jeff

 
________________________________


	From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
	Sent: Friday, October 24, 2008 10:29 AM
	To: linux clustering
	Subject: RE: [Linux-cluster] Cluster and LVG/LV

	Yes to both.  Right now the cluster is running GFS and I can
migrate VM's between the nodes.   

	 
	This question is coming up because I have been trying to do a
snap shot and I realized the snapshot is stored on the Volume Group that
the LV is located on.  I did not realize  this and I cannot do a
snapshot because I did not leave enough space in each of the Volume
Groups for each of the VM's.

	 
	---

	 
	Chris Edwards
	Smartech Corp.
	Div. of AirNet Group

	http://www.airnetgroup.com <http://www.airnetgroup.com/> 

	http://www.smartechcorp.net <http://www.smartechcorp.net/> 

	cedwards at smartechcorp.net <mailto:agarrison at smartechcorp.net> 
	P:  423-664-7678 x114

	C:  423-593-6964

	F:  423-664-7680

	 
	From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Sturm
	Sent: Friday, October 24, 2008 10:20 AM
	To: linux clustering
	Subject: RE: [Linux-cluster] Cluster and LVG/LV

	 
	Chris,

	 
	Are you running a clustered LVM, and do you expect to be able to
use Xen migration?

	 
	Jeff

		 
________________________________


		From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
		Sent: Friday, October 24, 2008 10:13 AM
		To: linux clustering
		Subject: [Linux-cluster] Cluster and LVG/LV

		If I am installing multiple Xen VM's in a cluster with
shared iSCSI space with Logical Volumes for each virtual machine should
I put each LV in its own logical volume group or should I use one
logical volume group for all of the LV's?

		 
		Thanks!

		 
		---

		 
		Chris Edwards

		 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/6352b482/attachment.htm>

From rodrique.heron at baruch.cuny.edu  Fri Oct 24 16:00:01 2008
From: rodrique.heron at baruch.cuny.edu (Rodrique Heron)
Date: Fri, 24 Oct 2008 12:00:01 -0400
Subject: [Linux-cluster] Two nodes cluster issue without sharedstorageissue
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F0180693C@hugo.eprize.local>
References: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net><1224777717.32460.92.camel@ayanami><174CED94DD8DC54AB888B56E103B118742183A@CNBEEXC007.nsn-intra.net>
	<64D0546C5EBBD147B75DE133D798665F0180693C@hugo.eprize.local>
Message-ID: <20081024155620.BAE1B15EC7D@smtp25.baruch.cuny.edu>

Jeff

I have two node cluster only because my storage array only supports two
nodes, can I add a third node without it having access to the storage? I
am using CLVM to run domU's.


Jeff Sturm wrote:
>
> For what it's worth, considerations like these have caused us to
> abandon any efforts to build a 2-node cluster.
>
> >From this point forward all our RHCS deployments will have a minimum
> of 3 nodes, even if the 3rd node is a small node that provides no
> resources and only exists for arbitration purposes. (It was going to
> be that, or a quorum disk for our application, but we have no
> experience running a quorum disk over the long-haul in a production
> envrironment.)
>
> Hope this helps someone.
>
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chen,
> > Mockey (NSN - CN/Cheng Du)
> > Sent: Thursday, October 23, 2008 10:36 PM
> > To: linux clustering
> > Subject: RE: [Linux-cluster] Two nodes cluster issue without
> > sharedstorageissue
> >
> >
> >
> > >-----Original Message-----
> > >From: linux-cluster-bounces at redhat.com
> > >[mailto:linux-cluster-bounces at redhat.com] On Behalf Of ext Lon
> > >Hohberger
> > >Sent: 2008?10?24? 0:02
> > >To: linux clustering
> > >Subject: Re: [Linux-cluster] Two nodes cluster issue without shared
> > >storageissue
> > >
> > >On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
> > >wrote:
> > >> Hi,
> > >>
> > >> I want to set up a two node cluster, I use active/standby
> > >mode to run
> > >> my service. I need even one node's hardware failure such as
> > >power cut,
> > >> another node still can handover from failure node and the
> > >provide the
> > >> service.
> > >>
> > >> In my environment, I have no shared storage, so I can not
> > use quorum
> > >> disk. Is there any other way to implement it? I searched and found
> > >> 'tiebreaker IP' may feed my request, but I can not found any
> > >hints on
> > >> how to configure it ?
> > >
> > >Since you have no shared data, you may be able to run
> > without fencing.
> > >
> > >That should be pretty straightforward, but you might need to comment
> > >out the "fenced" startup from the cman init script.
> > >
> > >In this case, the worst that will happen is both nodes will end up
> > >running the service at the same time in the event of a network
> > >partition.
> > >
> > >The other down side is that if the cluster divides into two
> > partitions
> > >and later merges back into one partition, I don't think
> > certain things
> > >will work right; you will need to detect this event and
> > reboot one of
> > >the nodes.
> > >
> > >-- Lon
> >
> > I know such defects in two node cluster.
> > Since our service is mission critical, I want to know how to
> > avoid such failure case ?
> >
> > Thanks.
> >
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

-- 
Rodrique Heron 


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/766be5bf/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rodrique_heron.vcf
Type: text/x-vcard
Size: 342 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/766be5bf/attachment.vcf>

From jeff.sturm at eprize.com  Fri Oct 24 16:29:29 2008
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Fri, 24 Oct 2008 12:29:29 -0400
Subject: [Linux-cluster] Two nodes cluster issue without sharedstorageissue
In-Reply-To: <20081024155620.BAE1B15EC7D@smtp25.baruch.cuny.edu>
References: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net><1224777717.32460.92.camel@ayanami><174CED94DD8DC54AB888B56E103B118742183A@CNBEEXC007.nsn-intra.net><64D0546C5EBBD147B75DE133D798665F0180693C@hugo.eprize.local>
	<20081024155620.BAE1B15EC7D@smtp25.baruch.cuny.edu>
Message-ID: <64D0546C5EBBD147B75DE133D798665F0180694D@hugo.eprize.local>

Certainly.  That third node need not run any cluster services at all other than fencing, and yet would guarantee a quorum in the even of loss of any single node.
 
A quorum disk would theoretically solve this as well, but for reasons I can't quite articulate I suspect the three-node cluster is superior.  (Besides, we have stockpiles of cheap hardware where I'm at, so there's little reason for us not to do it.)

________________________________

	From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Rodrique Heron
	Sent: Friday, October 24, 2008 12:00 PM
	To: linux clustering
	Subject: Re: [Linux-cluster] Two nodes cluster issue without sharedstorageissue
	
	
	Jeff
	
	I have two node cluster only because my storage array only supports two nodes, can I add a third node without it having access to the storage? I am using CLVM to run domU's.
	
	
	Jeff Sturm wrote: 

		For what it's worth, considerations like these have caused us to abandon any efforts to build a 2-node cluster.
		
		>From this point forward all our RHCS deployments will have a minimum of 3 nodes, even if the 3rd node is a small node that provides no resources and only exists for arbitration purposes.  (It was going to be that, or a quorum disk for our application, but we have no experience running a quorum disk over the long-haul in a production envrironment.)
		
		Hope this helps someone.
		
		> -----Original Message-----
		> From: linux-cluster-bounces at redhat.com
		> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chen,
		> Mockey (NSN - CN/Cheng Du)
		> Sent: Thursday, October 23, 2008 10:36 PM
		> To: linux clustering
		> Subject: RE: [Linux-cluster] Two nodes cluster issue without
		> sharedstorageissue
		>
		> 
		>
		> >-----Original Message-----
		> >From: linux-cluster-bounces at redhat.com
		> >[mailto:linux-cluster-bounces at redhat.com] On Behalf Of ext Lon
		> >Hohberger
		> >Sent: 2008?10?24? 0:02
		> >To: linux clustering
		> >Subject: Re: [Linux-cluster] Two nodes cluster issue without shared
		> >storageissue
		> >
		> >On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN - CN/Cheng Du)
		> >wrote:
		> >> Hi,
		> >>
		> >> I want to set up a two node cluster, I use active/standby
		> >mode to run
		> >> my service. I need even one node's hardware failure such as
		> >power cut,
		> >> another node still can handover from failure node and the
		> >provide the
		> >> service.
		> >>
		> >> In my environment, I have no shared storage, so I can not
		> use quorum
		> >> disk. Is there any other way to implement it? I searched and found
		> >> 'tiebreaker IP' may feed my request, but I can not found any
		> >hints on
		> >> how to configure it ?
		> >
		> >Since you have no shared data, you may be able to run
		> without fencing.
		> >
		> >That should be pretty straightforward, but you might need to comment
		> >out the "fenced" startup from the cman init script.
		> >
		> >In this case, the worst that will happen is both nodes will end up
		> >running the service at the same time in the event of a network
		> >partition.
		> >
		> >The other down side is that if the cluster divides into two
		> partitions
		> >and later merges back into one partition, I don't think
		> certain things
		> >will work right; you will need to detect this event and
		> reboot one of
		> >the nodes.
		> >
		> >-- Lon
		>
		> I know such defects in two node cluster. 
		> Since our service is mission critical, I want to know how to
		> avoid such failure case ?
		>
		> Thanks.
		>
		>
		>
		> --
		> Linux-cluster mailing list
		> Linux-cluster at redhat.com
		> https://www.redhat.com/mailman/listinfo/linux-cluster
		>
		>
		
		--
		Linux-cluster mailing list
		Linux-cluster at redhat.com
		https://www.redhat.com/mailman/listinfo/linux-cluster
		

	-- 
	Rodrique Heron 
	

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/600e846f/attachment.htm>

From rodrique.heron at baruch.cuny.edu  Fri Oct 24 16:52:00 2008
From: rodrique.heron at baruch.cuny.edu (Rodrique Heron)
Date: Fri, 24 Oct 2008 12:52:00 -0400
Subject: [Linux-cluster] Two nodes cluster issue without sharedstorageissue
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F0180694D@hugo.eprize.local>
References: <174CED94DD8DC54AB888B56E103B11873C2CE1@CNBEEXC007.nsn-intra.net><1224777717.32460.92.camel@ayanami><174CED94DD8DC54AB888B56E103B118742183A@CNBEEXC007.nsn-intra.net><64D0546C5EBBD147B75DE133D798665F0180693C@hugo.eprize.local>
	<20081024155620.BAE1B15EC7D@smtp25.baruch.cuny.edu>
	<64D0546C5EBBD147B75DE133D798665F0180694D@hugo.eprize.local>
Message-ID: <20081024164819.2A9B315EC49@smtp25.baruch.cuny.edu>

Thanks Jeff, I share the same reasons.


Jeff Sturm wrote:
> Certainly. That third node need not run any clusterservices atall
> other than fencing, and yet would guarantee a quorum in the even of
> loss of any single node.
> A quorum disk would theoretically solve this as well, but for reasons
> I can't quite articulate I suspect the three-node cluster is superior.
> (Besides, we have stockpiles of cheap hardware where I'm at, so
> there's little reason for usnot to do it.)
>
>     ------------------------------------------------------------------------
>     *From:* linux-cluster-bounces at redhat.com
>     [mailto:linux-cluster-bounces at redhat.com] *On Behalf Of *Rodrique
>     Heron
>     *Sent:* Friday, October 24, 2008 12:00 PM
>     *To:* linux clustering
>     *Subject:* Re: [Linux-cluster] Two nodes cluster issue without
>     sharedstorageissue
>
>     Jeff
>
>     I have two node cluster only because my storage array only
>     supports two nodes, can I add a third node without it having
>     access to the storage? I am using CLVM to run domU's.
>
>
>
>     Jeff Sturm wrote:
>>
>>     For what it's worth, considerations like these have caused us to
>>     abandon any efforts to build a 2-node cluster.
>>
>>     >From this point forward all our RHCS deployments will have a
>>     minimum of 3 nodes, even if the 3rd node is a small node that
>>     provides no resources and only exists for arbitration purposes.
>>     (It was going to be that, or a quorum disk for our application,
>>     but we have no experience running a quorum disk over the
>>     long-haul in a production envrironment.)
>>
>>     Hope this helps someone.
>>
>>     > -----Original Message-----
>>     > From: linux-cluster-bounces at redhat.com
>>     > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chen,
>>     > Mockey (NSN - CN/Cheng Du)
>>     > Sent: Thursday, October 23, 2008 10:36 PM
>>     > To: linux clustering
>>     > Subject: RE: [Linux-cluster] Two nodes cluster issue without
>>     > sharedstorageissue
>>     >
>>     >
>>     >
>>     > >-----Original Message-----
>>     > >From: linux-cluster-bounces at redhat.com
>>     > >[mailto:linux-cluster-bounces at redhat.com] On Behalf Of ext Lon
>>     > >Hohberger
>>     > >Sent: 2008?10?24? 0:02
>>     > >To: linux clustering
>>     > >Subject: Re: [Linux-cluster] Two nodes cluster issue without
>>     shared
>>     > >storageissue
>>     > >
>>     > >On Thu, 2008-10-16 at 17:10 +0800, Chen, Mockey (NSN -
>>     CN/Cheng Du)
>>     > >wrote:
>>     > >> Hi,
>>     > >>
>>     > >> I want to set up a two node cluster, I use active/standby
>>     > >mode to run
>>     > >> my service. I need even one node's hardware failure such as
>>     > >power cut,
>>     > >> another node still can handover from failure node and the
>>     > >provide the
>>     > >> service.
>>     > >>
>>     > >> In my environment, I have no shared storage, so I can not
>>     > use quorum
>>     > >> disk. Is there any other way to implement it? I searched and
>>     found
>>     > >> 'tiebreaker IP' may feed my request, but I can not found any
>>     > >hints on
>>     > >> how to configure it ?
>>     > >
>>     > >Since you have no shared data, you may be able to run
>>     > without fencing.
>>     > >
>>     > >That should be pretty straightforward, but you might need to
>>     comment
>>     > >out the "fenced" startup from the cman init script.
>>     > >
>>     > >In this case, the worst that will happen is both nodes will end up
>>     > >running the service at the same time in the event of a network
>>     > >partition.
>>     > >
>>     > >The other down side is that if the cluster divides into two
>>     > partitions
>>     > >and later merges back into one partition, I don't think
>>     > certain things
>>     > >will work right; you will need to detect this event and
>>     > reboot one of
>>     > >the nodes.
>>     > >
>>     > >-- Lon
>>     >
>>     > I know such defects in two node cluster.
>>     > Since our service is mission critical, I want to know how to
>>     > avoid such failure case ?
>>     >
>>     > Thanks.
>>     >
>>     >
>>     >
>>     > --
>>     > Linux-cluster mailing list
>>     > Linux-cluster at redhat.com
>>     > https://www.redhat.com/mailman/listinfo/linux-cluster
>>     >
>>     >
>>
>>     --
>>     Linux-cluster mailing list
>>     Linux-cluster at redhat.com
>>     https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>     -- 
>     Rodrique Heron 
>
>         
>

-- 
Rodrique Heron 
Systems Administrator/
Red Hat Certified Engineer
Baruch College 
1 Bernard Baruch Way,
Box H-0910
New York, NY 10010
Phone: (646) 312-1055 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/57ba3351/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rodrique_heron.vcf
Type: text/x-vcard
Size: 342 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/57ba3351/attachment.vcf>

From cedwards at smartechcorp.net  Fri Oct 24 17:26:10 2008
From: cedwards at smartechcorp.net (Chris Edwards)
Date: Fri, 24 Oct 2008 13:26:10 -0400
Subject: [Linux-cluster] Cluster and LVG/LV
In-Reply-To: <856BC630A1FD6540B94C17D96A17843F157BDA@mb01.baruch.local>
References: <64D0546C5EBBD147B75DE133D798665F01806944@hugo.eprize.local>
	<856BC630A1FD6540B94C17D96A17843F157BDA@mb01.baruch.local>
Message-ID: <61252CC53A97634BA52256DCF2344FBC66C68DE314@OFFICEEXCHANGE.office.smartechcorp.net>

Thanks for the advice!

So could I use vgmerge to merge all of my volume groups into one large volume group then?

---

Chris Edwards


From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Rodrique Heron
Sent: Friday, October 24, 2008 11:55 AM
To: linux clustering
Subject: RE: [Linux-cluster] Cluster and LVG/LV

Jeff-

Thanks for your thoughts, until now I never really considered exporting storage from the SAN to my domU's. I can definitely see the advantage here, using the SAN snapshot utilities, it most cases it can be automated. I am interested in how you would accomplish similar functionality to the SAN snapshot, using LVM snapshots (let's say lvm snapshot support worked well).

________________________________
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Sturm
Sent: Friday, October 24, 2008 11:00 AM
To: linux clustering
Subject: RE: [Linux-cluster] Cluster and LVG/LV

Okay.  For CLVM it probably makes the most sense to run one big volume group across your cluster, but there's also the option of running a non-clustered LVM on each Dom0 host.  The latter would only work for you however if you don't require Xen migration.

I see 3 options for central storage in a Xen cluster, each with their own drawbacks:

1) Run a single clustered volume group across all hosts, containing one or more PV's from your shared storage.

2) Run a non-clustered volume group on each host, each with a distinct PV carved out of your shared storage.

3) Export storage for each host individually from your SAN, i.e. rely completely on your SAN for volume management.  With this you don't need LVM at all.

Both 1) and 3) allow you to use Xen migration.  2) is feasible if you don't need to migrate guests online.

Our problem with 1) is snapshot support, and that we could not get pvmove to work acceptably well.  (We had to make the entire volume group inactive before pvmove would even run--I'm not sure if it is expected, or what we did wrong.)

We've tried and failed at 1), and will now be attempting 3).  This gives us a lot of flexibility on a storage appliance that supports snapshots.  I'd still like to have pvmove work so we could migrate online from one SAN to another, if needed, but I haven't been able to get it to work acceptably well.

Also I thought I had read that snapshots are not supported by a clustered LVM?  That would be difficult for us too, as we are relying on snapshots for a backup mechanism.

Jeff

________________________________
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
Sent: Friday, October 24, 2008 10:29 AM
To: linux clustering
Subject: RE: [Linux-cluster] Cluster and LVG/LV
Yes to both.  Right now the cluster is running GFS and I can migrate VM's between the nodes.

This question is coming up because I have been trying to do a snap shot and I realized the snapshot is stored on the Volume Group that the LV is located on.  I did not realize  this and I cannot do a snapshot because I did not leave enough space in each of the Volume Groups for each of the VM's.

---

From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Sturm
Sent: Friday, October 24, 2008 10:20 AM
To: linux clustering
Subject: RE: [Linux-cluster] Cluster and LVG/LV

Chris,

Are you running a clustered LVM, and do you expect to be able to use Xen migration?

Jeff

________________________________
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Chris Edwards
Sent: Friday, October 24, 2008 10:13 AM
To: linux clustering
Subject: [Linux-cluster] Cluster and LVG/LV
If I am installing multiple Xen VM's in a cluster with shared iSCSI space with Logical Volumes for each virtual machine should I put each LV in its own logical volume group or should I use one logical volume group for all of the LV's?

Thanks!

---

Chris Edwards


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081024/c6751ef0/attachment.htm>

From lhh at redhat.com  Fri Oct 24 20:36:41 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 24 Oct 2008 16:36:41 -0400
Subject: [Linux-cluster] ipfails
In-Reply-To: <1224777409.32460.87.camel@ayanami>
References: <a900f4140810221450l3e7263f2r980874e39d7fc308@mail.gmail.com>
	<1224777409.32460.87.camel@ayanami>
Message-ID: <1224880601.32460.138.camel@ayanami>

On Thu, 2008-10-23 at 11:56 -0400, Lon Hohberger wrote:
> On Wed, 2008-10-22 at 23:50 +0200, max liccardo wrote:
> > hi cluster masters,
> > I'm using linux-HA and linux-cluster on separate project.
> > I'm wondering if I can use with linux-cluster something like the
> > linux-ha ping nodes, in order to have some sort of "network quorum".
> > bye
> 
> Currently, no, but you could build a daemon which did this and talked
> to
> the CMAN quorum API to do this.

Actually, I have something partially prototyped to do "simple IP
tiebreaker" sort of thing like this.  It's based on what we had in
clumanager a few years ago, and only works in limited cases (i.e. 2 node
clusters).

It kind of plugs in the same way as qdiskd but is far simpler (and, of
course, doesn't require a disk).  I could finish up pretty quickly if
you cared to test it.

-- Lon


From greg.hellings at harcourt.com  Fri Oct 24 22:41:15 2008
From: greg.hellings at harcourt.com (Greg Hellings)
Date: Fri, 24 Oct 2008 15:41:15 -0700
Subject: [Linux-cluster] LVS-DR question
Message-ID: <C5279D1B.1AE69%greg.hellings@harcourt.com>

Does anyone know if the VIP in a LVS-DR config has to be on the same subnet
as the RIP?  And If not, is there some reason that all the RIPs would need
to be in the same subnet?

--
Greg


From wferi at niif.hu  Sun Oct 26 10:36:52 2008
From: wferi at niif.hu (Ferenc Wagner)
Date: Sun, 26 Oct 2008 11:36:52 +0100
Subject: [Linux-cluster] Cluster and LVG/LV
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F01806944@hugo.eprize.local>
	(Jeff Sturm's message of "Fri, 24 Oct 2008 10:59:44 -0400")
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>
	<8786b91c0810240011u71e91161ia374c591d5f3cadb@mail.gmail.com>
	<8786b91c0810240014g7ff6ed7cw5b8fd853b9ca256c@mail.gmail.com>
	<61252CC53A97634BA52256DCF2344FBC66C68DE2FF@OFFICEEXCHANGE.office.smartechcorp.net>
	<64D0546C5EBBD147B75DE133D798665F0180693D@hugo.eprize.local>
	<61252CC53A97634BA52256DCF2344FBC66C68DE303@OFFICEEXCHANGE.office.smartechcorp.net>
	<64D0546C5EBBD147B75DE133D798665F01806944@hugo.eprize.local>
Message-ID: <87mygrr6bf.fsf@szonett.ki.iif.hu>

"Jeff Sturm" <jeff.sturm at eprize.com> writes:

> 1) Run a single clustered volume group across all hosts, containing
> one or more PV's from your shared storage.
>  
> 3) Export storage for each host individually from your SAN,
> i.e. rely completely on your SAN for volume management.  With this
> you don't need LVM at all.
>  
> Our problem with 1) is snapshot support, and that we could not get
> pvmove to work acceptably well.  (We had to make the entire volume
> group inactive before pvmove would even run--I'm not sure if it is
> expected, or what we did wrong.)

It helps if you do LVM in your domU's, too.  Or only there, if you use 3).
-- 
Feri.


From linux-cluster at via-rs.net  Mon Oct 27 02:22:41 2008
From: linux-cluster at via-rs.net (CR Lou)
Date: Mon, 27 Oct 2008 00:22:41 -0200
Subject: [Linux-cluster] fence_ilo + HP ProLiant DL580 G5
Message-ID: <000301c937da$e86a0430$0200a8c0@beta>

Hi cluster men,

we are in the process of building a cluster to virtualization a lot
of  low-end servers using xen. Our plan is to use rhcs and clvm
for this but iLO insists on not working...  :-|

The cluster has 2 nodes, two HP ProLiant DL580 G5 (x86_64).
We're using multi-vlan access to reach a lot of networks
and EMC symmetrix more multipath to share the disks. Well,
everything is ok except when I need to use iLO to provide
one secure way for ha. Follows my cluster.conf:

<?xml version="1.0"?>
<cluster name="alpha" config_version="3">
<cman two_node="0" expected_votes="3"/>
<clusternodes>
        <clusternode name="node1.ha" votes="1" nodeid="1"/>
               <fence>
                    <method name="1">
                        <device name="ilo-node1"/>
                     </method>
                     <method name="2">
                        <device name="manual" nodename="node1.ha"/>
                     </method>
                </fence>
        <clusternode name="node2.ha" votes="1" nodeid="2"/>
               <fence>
                    <method name="1">
                        <device name="ilo-node2"/>
                     </method>
                     <method name="2">
                        <device name="manual" nodename="node2.ha"/>
                     </method>
               </fence>
</clusternodes>

<fencedevices>
  <fencedevice agent="fence_ilo" hostname="10.127.255.129" 
login="Administrator" name="ilo-node1" passwd="xxxx"/>
  <fencedevice agent="fence_ilo" hostname="10.127.255.130" 
login="Administrator" name="ilo-node2" passwd="xxxx"/>
  <fencedevice agent="fence_manual" name="manual"/>
</fencedevices>

<quorumd device="/dev/mapper/3600604800002877515624d4630383434p1" tko="10" 
votes="1" log_facility="local6" log_level="7" min_score="1" interval="1">
   <heuristic interval="4" tko="3" program="ping -c1 -t3 10.10.10.1" 
score="1"/>
   <heuristic interval="4" tko="3" program="ping -c1 -t3 10.10.10.2" 
score="1"/>
</quorumd>

<rm log_facility="local5" log_level="7">
 <failoverdomains>
     <failoverdomain name="para_dom" nofailback="1" ordered="1" 
restricted="0">
        <failoverdomainnode name="node1.ha" priority="1"/>
        <failoverdomainnode name="node2.ha" priority="2"/>
     </failoverdomain>
     <failoverdomain name="hvm_dom" nofailback="1" ordered="1" 
restricted="0">
        <failoverdomainnode name="node1.ha" priority="2"/>
        <failoverdomainnode name="node2.ha" priority="1"/>
     </failoverdomain>
 </failoverdomains>
 <resources/>

 <vm autostart="1" domain="para_dom" exclusive="0" migrate="live" 
name="rh52-para-virt01" path="/etc/xen"/>
 <vm autostart="1" domain="hvm_dom" exclusive="0" migrate="live" 
name="w2003-vm01" path="/etc/xen"/>
</rm>
</cluster>

node1# clustat
Cluster Status for alpha @ Sun Oct 26 21:32:52 2008
Member Status: Quorate

 Member Name                                                    ID   Status
 ------ ----                                                    ---- ------
 node1.ha                                                           1 
Online, Local, rgmanager
 node2.ha                                                           2 
Online, rgmanager
 /dev/mapper/3600604800002877515624d4630383434p1                    0 
Online, Quorum Disk

 Service Name                                          Owner (Last) 
State
 ------- ----                                          ----- ------          
                                 -----
 vm:rh52-para-virt01                                   node1.ha 
started
 vm:w2003-vm01                                         node2.ha 
started


Look, when I try to fence the another node it doesn't works.
node1# fence_node node2.ha
node1# echo $?
1
node1# tail -1 /var/log/messages
Oct 26 21:44:44 xxxxx fence_node[1480]: Fence of "node2.ha" was unsuccessful

But if I try to fence via agent it works fine.
node1# ./fence_ilo  -o off  -l Administrator -p xxxx -a 10.127.255.130
success
echo $?
0

# clustat
Cluster Status for alpha @ Sun Oct 26 21:56:36 2008
Member Status: Quorate

 Member Name                                                    ID   Status
 ------ ----                                                    ---- ------
 node1.ha                                                           1 
Online, Local, rgmanager
 node2.ha                                                           2 
Offline
 /dev/mapper/3600604800002877515624d4630383434p1                    0 
Online, Quorum Disk

 Service Name                                          Owner (Last) 
State
 ------- ----                                          ----- ------          
                                 -----
 vm:rh52-para-virt01                                   node1.ha 
started
 vm:w2003-vm01                                         node2.ha 
started

Now node2 is offline but the service remains there,
that is, node1 doesn't take over the vm:w2003-vm01
from node2. Follow the messages.log.
node1# tail -50 /var/log/messages
Oct 26 21:44:44 xxxxx fence_node[1480]: Fence of "node2.ha" was unsuccessful
Oct 26 21:54:49 xxxxx openais[31517]: [TOTEM] The token was lost in the 
OPERATIONAL state.
Oct 26 21:54:49 xxxxx openais[31517]: [TOTEM] Receive multicast socket recv 
buffer size (288000 bytes).
Oct 26 21:54:49 xxxxx openais[31517]: [TOTEM] Transmit multicast socket send 
buffer size (262142 bytes).
Oct 26 21:54:49 xxxxx openais[31517]: [TOTEM] entering GATHER state from 2.
Oct 26 21:54:50 xxxxx qdiskd[31565]: <notice> Writing eviction notice for 
node 2
Oct 26 21:54:51 xxxxx qdiskd[31565]: <notice> Node 2 evicted
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] entering GATHER state from 0.
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] Creating commit token because 
I am the rep.
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] Saving state aru 75 high seq 
received 75
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] Storing new sequence id for 
ring 14ac
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] entering COMMIT state.
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] entering RECOVERY state.
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] position [0] member 
10.127.255.137:
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] previous ring seq 5288 rep 
10.127.255.137
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] aru 75 high delivered 75 
received flag 1
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] Did not need to originate any 
messages in recovery.
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] Sending initial ORF token
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] New Configuration:
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ]  r(0) ip(10.127.255.137)
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] Members Left:
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ]  r(0) ip(10.127.255.138)
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] Members Joined:
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] New Configuration:
Oct 26 21:54:54 xxxxx clurgmgrd[31715]: <info> State change: node2.ha DOWN
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ]  r(0) ip(10.127.255.137)
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] Members Left:
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] Members Joined:
Oct 26 21:54:54 xxxxx openais[31517]: [SYNC ] This node is within the 
primary component and will provide service.
Oct 26 21:54:54 xxxxx openais[31517]: [TOTEM] entering OPERATIONAL state.
Oct 26 21:54:54 xxxxx openais[31517]: [CLM  ] got nodejoin message 
10.127.255.137
Oct 26 21:54:54 xxxxx openais[31517]: [CPG  ] got joinlist message from node 
1
Oct 26 21:54:54 xxxxx kernel: dlm: closing connection to node 2
Oct 26 21:54:54 xxxxx fenced[31533]: node2.ha not a cluster member after 0 
sec post_fail_delay
Oct 26 21:54:54 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 21:54:54 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 21:54:59 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 21:54:59 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 21:55:04 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 21:55:04 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 21:55:09 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 21:55:09 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 21:55:14 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 21:55:14 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 21:55:19 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 21:55:19 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 21:55:24 xxxxx fenced[31533]: fencing node "node2.ha"

Until I to force via fenced_override
node1# echo node2.ha  > /var/run/cluster/fenced_override
tail -1 /var/log/messages
Oct 26 22:05:08 xxxxx clurgmgrd[31715]: <notice> Taking over service 
vm:w2003-vm01 from down member node2.ha

Another example, if I simply to put the iface of heartbeat to
off on node2 (for simulate the problem), the same thing happens.
node2#  ifconfig  eth1 down
node1# tail -50 /var/log/messages
Oct 26 23:39:07 xxxxx openais[31517]: [TOTEM] The token was lost in the 
OPERATIONAL state.
Oct 26 23:39:07 xxxxx openais[31517]: [TOTEM] Receive multicast socket recv 
buffer size (288000 bytes).
Oct 26 23:39:07 xxxxx openais[31517]: [TOTEM] Transmit multicast socket send 
buffer size (262142 bytes).
Oct 26 23:39:07 xxxxx openais[31517]: [TOTEM] entering GATHER state from 2.
Oct 26 23:39:12 xxxxx openais[31517]: [TOTEM] entering GATHER state from 0.
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] Creating commit token because 
I am the rep.
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] Saving state aru 52 high seq 
received 52
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] Storing new sequence id for 
ring 14b4
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] entering COMMIT state.
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] entering RECOVERY state.
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] position [0] member 
10.127.255.137:
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] previous ring seq 5296 rep 
10.127.255.137
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] aru 52 high delivered 52 
received flag 1
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] Did not need to originate any 
messages in recovery.
Oct 26 23:39:12 xxxxx  openais[31517]: [TOTEM] Sending initial ORF token
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] New Configuration:
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ]  r(0) ip(10.127.255.137)
Oct 26 23:39:12 xxxxx clurgmgrd[31715]: <info> State change: node2.ha DOWN
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] Members Left:
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ]  r(0) ip(10.127.255.138)
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] Members Joined:
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] New Configuration:
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ]  r(0) ip(10.127.255.137)
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] Members Left:
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] Members Joined:
Oct 26 23:39:12 xxxxx openais[31517]: [SYNC ] This node is within the 
primary component and will provide service.
Oct 26 23:39:12 xxxxx openais[31517]: [TOTEM] entering OPERATIONAL state.
Oct 26 23:39:12 xxxxx openais[31517]: [CLM  ] got nodejoin message 
10.127.255.137
Oct 26 23:39:12 xxxxx openais[31517]: [CPG  ] got joinlist message from node 
1
Oct 26 23:39:12 xxxxx kernel: dlm: closing connection to node 2
Oct 26 23:39:12 xxxxx fenced[31533]: node2.ha not a cluster member after 0 
sec post_fail_delay
Oct 26 23:39:12 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 23:39:12 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 23:39:17 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 23:39:17 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 23:39:22 xxxxx fenced[31533]: fencing node "node2.ha"
Oct 26 23:39:22 xxxxx fenced[31533]: fence "node2.ha" failed
Oct 26 23:39:27 xxxxx fenced[31533]: fencing node "node2.ha"

node1# clustat
Cluster Status for alpha @ Sun Oct 26 23:41:20 2008
Member Status: Quorate

 Member Name                                                    ID   Status
 ------ ----                                                    ---- ------
 node1.ha                                                           1 
Online, Local, rgmanager
 node2.ha                                                           2 
Offline
 /dev/mapper/3600604800002877515624d4630383434p1                    0 
Online, Quorum Disk

 Service Name                                          Owner (Last) 
State
 ------- ----                                          ----- ------          
                                 -----
 vm:rh52-para-virt01                                   node1.ha 
started
 vm:w2003-vm01                                         node2.ha 
started

I believe that node1 had power off node2 via iLO because node2
don't responded anymore but node1 didn't take over the service
like it should to do.

Finally for try to solve this problem I loaded these modules on both nodes
from hp-OpenIPMI-8.1.0-104.rhel5.rpm package but nothing changed.
/opt/hp/hp-OpenIPMI/bin/2.6.18-92.el5xen/ipmi_devintf.ko
/opt/hp/hp-OpenIPMI/bin/2.6.18-92.el5xen/ipmi_msghandler.ko
/opt/hp/hp-OpenIPMI/bin/2.6.18-92.el5xen/ipmi_poweroff.ko
/opt/hp/hp-OpenIPMI/bin/2.6.18-92.el5xen/ipmi_si.ko
/opt/hp/hp-OpenIPMI/bin/2.6.18-92.el5xen/ipmi_watchdog.ko


ps. I'm using one rh5.2, kernel-2.6.18-92.el5, cman-2.0.84-2.el5,
rgmanager-2.0.38-2.el5 and iLO 1.50 on HPs.


tks a lot.

--
Renan


From oioi at cableplus.com.cn  Mon Oct 27 02:39:54 2008
From: oioi at cableplus.com.cn (Lu Wen-yan)
Date: Mon, 27 Oct 2008 10:39:54 +0800
Subject: [Linux-cluster] cman killed by node 2 for reason 2
Message-ID: <804362282.20081027103954@cableplus.com.cn>

Hello linux-cluster,

Oct 26 18:45:08 cms2 openais[13904]: [SYNC ] This node is within the primary component and will provide service. 
Oct 26 18:45:08 cms2 openais[13904]: [TOTEM] entering OPERATIONAL state. 
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.1 
Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.2 
Oct 26 18:45:08 cms2 openais[13904]: [CPG  ] got joinlist message from node 1 
[b][color=Red]Oct 26 18:45:08 cms2 openais[13904]: [CMAN ] cman killed by node 2 for reason 2 

 I get an error msg when I restart cman. Anyone know what is " reason 2 " ?

Thanks


-- 
Best regards,
 Lu                          mailto:oioi at cableplus.com.cn


From tom at netspot.com.au  Mon Oct 27 06:20:21 2008
From: tom at netspot.com.au (Tom Lanyon)
Date: Mon, 27 Oct 2008 16:50:21 +1030
Subject: [Linux-cluster] SELinux contexts not propagating between GFS nodes
Message-ID: <88A3D32D-1F53-4CAC-950A-D3EBCAE47547@netspot.com.au>

Hi list,

I'm seeing an occasional issue where an SELinux file context is  
applied on a cluster node to a file on a GFS1 filesystem, but the old  
context remains on one (or more) other nodes.

A simple 'restorecon /path/to/file' fixes the context on the "broken"  
node.

We're running CentOS 5.2 x86_64 with all the latest stable cluster and  
GFS versions.

Any ideas why this could be happening and/or how to debug it?

Thanks,
Tom

--
Tom Lanyon
Systems Administrator
NetSpot Pty Ltd


From ccaulfie at redhat.com  Mon Oct 27 09:36:34 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Mon, 27 Oct 2008 09:36:34 +0000
Subject: [Linux-cluster] cman killed by node 2 for reason 2
In-Reply-To: <804362282.20081027103954@cableplus.com.cn>
References: <804362282.20081027103954@cableplus.com.cn>
Message-ID: <49058BA2.5000505@redhat.com>

Lu Wen-yan wrote:
> Hello linux-cluster,
> 
> Oct 26 18:45:08 cms2 openais[13904]: [SYNC ] This node is within the primary component and will provide service. 
> Oct 26 18:45:08 cms2 openais[13904]: [TOTEM] entering OPERATIONAL state. 
> Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.1 
> Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.2 
> Oct 26 18:45:08 cms2 openais[13904]: [CPG  ] got joinlist message from node 1 
> [b][color=Red]Oct 26 18:45:08 cms2 openais[13904]: [CMAN ] cman killed by node 2 for reason 2 
> 
>  I get an error msg when I restart cman. Anyone know what is " reason 2 " ?
> 

It means you have a very old version of cman that needs updating ;-)
That message as from 5.0 and lots of things have been fixed (including
that error) since then .

Chrissie


From lhh at redhat.com  Mon Oct 27 15:00:56 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 27 Oct 2008 11:00:56 -0400
Subject: [Linux-cluster] ipfails
In-Reply-To: <1224880601.32460.138.camel@ayanami>
References: <a900f4140810221450l3e7263f2r980874e39d7fc308@mail.gmail.com>
	<1224777409.32460.87.camel@ayanami>
	<1224880601.32460.138.camel@ayanami>
Message-ID: <1225119656.32460.139.camel@ayanami>

On Fri, 2008-10-24 at 16:36 -0400, Lon Hohberger wrote:
> On Thu, 2008-10-23 at 11:56 -0400, Lon Hohberger wrote:
> > On Wed, 2008-10-22 at 23:50 +0200, max liccardo wrote:
> > > hi cluster masters,
> > > I'm using linux-HA and linux-cluster on separate project.
> > > I'm wondering if I can use with linux-cluster something like the
> > > linux-ha ping nodes, in order to have some sort of "network quorum".
> > > bye
> > 
> > Currently, no, but you could build a daemon which did this and talked
> > to
> > the CMAN quorum API to do this.
> 
> Actually, I have something partially prototyped to do "simple IP
> tiebreaker" sort of thing like this.  It's based on what we had in
> clumanager a few years ago, and only works in limited cases (i.e. 2 node
> clusters).
> 
> It kind of plugs in the same way as qdiskd but is far simpler (and, of
> course, doesn't require a disk).  I could finish up pretty quickly if
> you cared to test it.

Fun with the CMAN quorum API - an IPv4 tiebreaker a la RHCS3 /
clumanager 1.2.x

http://people.redhat.com/lhh/qnet.tar.gz

[sha256sum]

769a35d8ec7b2ebdec9ba1439d6ff98a5d6b5dddf5f9c3ce7cb3d97fd4e7d1ad

-- Lon


From lhh at redhat.com  Mon Oct 27 18:06:42 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 27 Oct 2008 14:06:42 -0400
Subject: [Linux-cluster] LVS-DR question
In-Reply-To: <C5279D1B.1AE69%greg.hellings@harcourt.com>
References: <C5279D1B.1AE69%greg.hellings@harcourt.com>
Message-ID: <1225130802.32460.159.camel@ayanami>

On Fri, 2008-10-24 at 15:41 -0700, Greg Hellings wrote:
> Does anyone know if the VIP in a LVS-DR config has to be on the same subnet
> as the RIP?  

If I understand the question....

Yes.  All realservers and the director's VIP need to be on the same
subnet.

I usually put the VIP on the realservers' public NICs and use
arptables_jf to prevent the VIPs from sending/receiving ARP requests for
the VIP.

One trick you can do lets you put the VIP on the realservers on lo:0,
but I've never done it.  Either way, the realservers' "real" IP needs to
be on the same subnet as the VIP.

-- Lon


From greg.hellings at harcourt.com  Mon Oct 27 20:33:33 2008
From: greg.hellings at harcourt.com (Greg Hellings)
Date: Mon, 27 Oct 2008 13:33:33 -0700
Subject: [Linux-cluster] LVS-DR question
In-Reply-To: <1225130802.32460.159.camel@ayanami>
Message-ID: <C52B73AD.1B639%greg.hellings@harcourt.com>

Thank you.  That directly answers my question.  BTW, I am doing the lo:0
trick with 
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2

And it works great.

--
Greg


On 10/27/08 11:06 AM, "Lon Hohberger" <lhh at redhat.com> wrote:

> On Fri, 2008-10-24 at 15:41 -0700, Greg Hellings wrote:
>> Does anyone know if the VIP in a LVS-DR config has to be on the same subnet
>> as the RIP?  
> 
> If I understand the question....
> 
> Yes.  All realservers and the director's VIP need to be on the same
> subnet.
> 
> I usually put the VIP on the realservers' public NICs and use
> arptables_jf to prevent the VIPs from sending/receiving ARP requests for
> the VIP.
> 
> One trick you can do lets you put the VIP on the realservers on lo:0,
> but I've never done it.  Either way, the realservers' "real" IP needs to
> be on the same subnet as the VIP.
> 
> -- Lon
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From oioi at cableplus.com.cn  Tue Oct 28 05:46:19 2008
From: oioi at cableplus.com.cn (Lu Wen-yan)
Date: Tue, 28 Oct 2008 13:46:19 +0800
Subject: [Linux-cluster] cman killed by node 2 for reason 2
In-Reply-To: <49058BA2.5000505@redhat.com>
References: <804362282.20081027103954@cableplus.com.cn>
	<49058BA2.5000505@redhat.com>
Message-ID: <55153690.20081028134619@cableplus.com.cn>

Hello Christine,

Can you tell me what is the problem?
I have many servers in production. Is it safe to upgrade cluster?

Thanks


Monday, October 27, 2008, 5:36:34 PM, you wrote:

CC> Lu Wen-yan wrote:
>> Hello linux-cluster,
>> 
>> Oct 26 18:45:08 cms2 openais[13904]: [SYNC ] This node is within the primary component and will provide service. 
>> Oct 26 18:45:08 cms2 openais[13904]: [TOTEM] entering OPERATIONAL state. 
>> Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.1 
>> Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.2 
>> Oct 26 18:45:08 cms2 openais[13904]: [CPG  ] got joinlist message from node 1 
>> [b][color=Red]Oct 26 18:45:08 cms2 openais[13904]: [CMAN ] cman killed by node 2 for reason 2 
>> 
>>  I get an error msg when I restart cman. Anyone know what is " reason 2 " ?
>> 

CC> It means you have a very old version of cman that needs updating ;-)
CC> That message as from 5.0 and lots of things have been fixed (including
CC> that error) since then .

CC> Chrissie


-- 
Best regards,
 Lu                            mailto:oioi at cableplus.com.cn


From ccaulfie at redhat.com  Tue Oct 28 08:39:25 2008
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 28 Oct 2008 08:39:25 +0000
Subject: [Linux-cluster] cman killed by node 2 for reason 2
In-Reply-To: <55153690.20081028134619@cableplus.com.cn>
References: <804362282.20081027103954@cableplus.com.cn>
	<49058BA2.5000505@redhat.com>
	<55153690.20081028134619@cableplus.com.cn>
Message-ID: <4906CFBD.2030603@redhat.com>

Lu Wen-yan wrote:
> Hello Christine,
> 
> Can you tell me what is the problem?
> I have many servers in production. Is it safe to upgrade cluster?
> 
> Thanks
> 
> 
> Monday, October 27, 2008, 5:36:34 PM, you wrote:
> 
> CC> Lu Wen-yan wrote:
>>> Hello linux-cluster,
>>>
>>> Oct 26 18:45:08 cms2 openais[13904]: [SYNC ] This node is within the primary component and will provide service. 
>>> Oct 26 18:45:08 cms2 openais[13904]: [TOTEM] entering OPERATIONAL state. 
>>> Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.1 
>>> Oct 26 18:45:08 cms2 openais[13904]: [CLM  ] got nodejoin message 192.168.201.2 
>>> Oct 26 18:45:08 cms2 openais[13904]: [CPG  ] got joinlist message from node 1 
>>> [b][color=Red]Oct 26 18:45:08 cms2 openais[13904]: [CMAN ] cman killed by node 2 for reason 2 
>>>
>>>  I get an error msg when I restart cman. Anyone know what is " reason 2 " ?
>>>

Reason "2" is that someone issued a cman_tool kill command on another
node. So it's nothing wrong with the cluster that has caused that message.

I do strongly recommend you upgrade. There have been a substantial
number of fixes to all aspects of cluster suite since RHEL 5.0.


> CC> It means you have a very old version of cman that needs updating ;-)
> CC> That message as from 5.0 and lots of things have been fixed (including
> CC> that error) since then .
> 
> CC> Chrissie
> 
> 
> 


-- 

Chrissie


From afahounko at gmail.com  Tue Oct 28 15:27:24 2008
From: afahounko at gmail.com (AFAHOUNKO Danny)
Date: Tue, 28 Oct 2008 15:27:24 +0000
Subject: [Linux-cluster] Cluster Two nodes - Software Installation
Message-ID: <49072F5C.4060002@gmail.com>

Hi,
I'm newbees in Clustering. I've installed a cluster with two nodes 
without a share storage.
I want i know if it's possible to install a software (apache, exim,...) 
once, and it will be automaticaly deployed on the two nodes ?!
I'm using RedHat 5.1 Advanced Plateform with RedHat Cluster Suite.

Thanks for helps.

-- 
Cordialement AFAHOUNKO Danny
Administrateur R?seaux & Syst?me d'Information - CICA-RE
Gsm: +228 914.55.89
Tel: +228 223.62.62


From raju.rajsand at gmail.com  Tue Oct 28 17:03:28 2008
From: raju.rajsand at gmail.com (Rajagopal Swaminathan)
Date: Tue, 28 Oct 2008 22:33:28 +0530
Subject: [Linux-cluster] Cluster Two nodes - Software Installation
In-Reply-To: <49072F5C.4060002@gmail.com>
References: <49072F5C.4060002@gmail.com>
Message-ID: <8786b91c0810281003w7aa893f3h9a5e71f16f6ea81d@mail.gmail.com>

Greetings

On Tue, Oct 28, 2008 at 8:57 PM, AFAHOUNKO Danny <afahounko at gmail.com>wrote:

> I'm newbees in Clustering. I've installed a cluster with two nodes without
> a share storage.
>
I want i know if it's possible to install a software (apache, exim,...)
> once, and it will be automaticaly deployed on the two nodes ?!
> I'm using RedHat 5.1 Advanced Plateform with RedHat Cluster Suite.
>

On the face of it, No.

C'mon, How can two different OS images find the same binary when the storage
is not available in shared mode?

(NFS, one of the possible options in your case, is considered Shared
Storage, But then it could be a pain to configure to do what you want)

But it is just not worth it

Assuming both the nodes are identical in hardware, why not install all the
packages on say node 1. using yumand then copy the /var/cache/yum from that
node and do a yum install in node 2.

This may not be painful for small number of nodes but can can get unwieldy
if cluster nodes are large. In which case one should contemplete using
kickstart _before_ RHEL install. IOW, Plan properly


HTH

Regards

Rajagopal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081028/77ee0076/attachment.htm>

From gordan at bobich.net  Tue Oct 28 17:32:07 2008
From: gordan at bobich.net (Gordan Bobic)
Date: Tue, 28 Oct 2008 17:32:07 +0000
Subject: [Linux-cluster] Cluster Two nodes - Software Installation
Message-ID: <4906F1690017E8BB@> (added by postmaster@mail.o2.co.uk)

You can set two machines up with shared root storage using Open Shared Root


From jds at techma.com  Tue Oct 28 20:40:52 2008
From: jds at techma.com (Simmons, Dan A)
Date: Tue, 28 Oct 2008 16:40:52 -0400
Subject: [Linux-cluster] Cluster with kernel-smp nodes and hugemem nodes
Message-ID: <79CEFE3C5C43714D9170E3138DC09935A36895@TMAEMAIL.techma.com>

Hi All,

I have a 12 node Redhat 4.7 cluster and I want to run 3 nodes with the
hugemem kernel while keeping the rest of the nodes running the smp kernel.

 
Is there anything I have to worry about if I do this?  

 
J. Dan

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081028/3c082b11/attachment.htm>

From rodrique.heron at baruch.cuny.edu  Tue Oct 28 22:45:05 2008
From: rodrique.heron at baruch.cuny.edu (Rodrique Heron)
Date: Tue, 28 Oct 2008 18:45:05 -0400
Subject: [Linux-cluster] Multiple network path for cluster traffic
Message-ID: <20081028224115.E98BF15EC27@smtp25.baruch.cuny.edu>

Hello all-

Is it necessary to provide redundant paths for cluster traffic?

My server as six network interface, I would like to dedicate two for cluster traffic, both interfaces will be connected to separate switches. Is there a recommended way of setting this up so I can restrict all cluster traffic through the two interfaces? Should I bond both interfaces?

Thanks
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081028/c13aaac3/attachment.htm>

From david.costakos at gmail.com  Wed Oct 29 01:22:50 2008
From: david.costakos at gmail.com (Dave Costakos)
Date: Tue, 28 Oct 2008 18:22:50 -0700
Subject: [Linux-cluster] Cluster/GFS issue.
In-Reply-To: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>
Message-ID: <6b6836c60810281822g554650c8y5a54fcfec9ea9520@mail.gmail.com>

Usually, when I hear about problems like this, it is often a multicast issue
-- at least from my experience.

Can you confirm that cman is able to talk on your multicast address?

If not, I suggest specifying a multicast address in the 224.0.0.111 - 250.

This will require that the whole cluster be reset.  I would avoid putting
startup commands in /etc/rc.local as some suggest -- seems like a red
herring to me. The init scripts should work fine (they do for me on our 3
8-node clusters.

-Dave.

2008/10/23 Allgood, John <jallgood at ohl.com>

>  Hello All
>
>
>
> I am having some issues with building an eight node Xen cluster. Let me
> give some background first. We have 8 dell PE 1950 with 32GB RAM connected
> via dual brocade fiber switchs to an EMC CX-310. The guests images are being
> stored on the SAN.  We are using EMC Powerpath to hand the multipathing. The
> Operating system is Redhat Advanced Platform 5.2 . The filesystems on the
> SAN were created using Conga CLVM/GFS1. We have the heartbeat on an separate
> private network. The fence devices are Dell DRAC's.
>
>  Here is the problem that we are having. We can't on an consistent basic
> get the GFS filesystem mounted. On the nodes that don't connect it will just
> hang on bootup trying to mount the GFS filesystem. All nodes come up and
> join the cluster at this point but only 1 or 2 will completely come up with
> the GFS filesystem mounted. If we do an interactive startup and skip the GFS
> part all systems will come up on the cluster but without the gfs mounted.
>
> At this point I am not sure what to do next.  I am thinking it may be a
> problem with the way the GFS filesystem was created. We just used the
> default settings. The LVM is 668GB created from an RAID10.
>
>
>
> Best Regards
>
>
>
> *John Allgood**
> **Senior Systems Administrator**
> **Turbo, division of OHL**
> **2251 Jesse Jewell Pky. NE**
> **Gainesville, GA 30507**
> **tel: (678) 989-3051  fax: (770) 531-7878**
> **jallgood at ohl.com*
>
> *www.ohl.com***
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


-- 
Dave Costakos
mailto:david.costakos at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081028/d3fa0ae2/attachment.htm>

From raju.rajsand at gmail.com  Wed Oct 29 04:58:07 2008
From: raju.rajsand at gmail.com (Rajagopal Swaminathan)
Date: Wed, 29 Oct 2008 10:28:07 +0530
Subject: [Linux-cluster] Cluster Two nodes - Software Installation
In-Reply-To: <7378413047924619566@unknownmsgid>
References: <7378413047924619566@unknownmsgid>
Message-ID: <8786b91c0810282158l3aef4edfq322eece085cd1e1f@mail.gmail.com>

On Tue, Oct 28, 2008 at 11:02 PM, Gordan Bobic <gordan at bobich.net> wrote:

> You can set two machines up with shared root storage using Open Shared Root
>

AFAIK, the prerequisite for Open Shared Root is a shared storage

from URL:
http://www.open-sharedroot.org/documentation/the-opensharedroot-mini-howto#prerequesits

[quote]
1. You should have at least two servers connected to some kind of storage
network. Both servers need to have concurrent access to at least one better
two logical units (LUNS).
[unquote]

So, IMHO, without some type of storage accessible to both nodes, as is the
case quoted in the original post, It is impossible.

Regards,

Rajagopal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081029/86a8385d/attachment.htm>

From raju.rajsand at gmail.com  Wed Oct 29 05:07:06 2008
From: raju.rajsand at gmail.com (Rajagopal Swaminathan)
Date: Wed, 29 Oct 2008 10:37:06 +0530
Subject: [Linux-cluster] Cluster/GFS issue.
In-Reply-To: <6b6836c60810281822g554650c8y5a54fcfec9ea9520@mail.gmail.com>
References: <82E499DEBAB95F4E91140984379FB1C6046DD7@NOC-ML-09.ohlogistics.com>
	<6b6836c60810281822g554650c8y5a54fcfec9ea9520@mail.gmail.com>
Message-ID: <8786b91c0810282207y44d40a5brf150cc1b6c97d90c@mail.gmail.com>

Greetings,


2008/10/29 Dave Costakos <david.costakos at gmail.com>

> Usually, when I hear about problems like this, it is often a multicast
> issue -- at least from my experience.
>
> Yes, that is one possibility that must be checked.


> I would avoid putting startup commands in /etc/rc.local as some suggest --
> seems like a red herring to me.


Trust me,it is not a Red Herring. This method worked for a three node
cluster (one node had a different configuration) as this ensures that any
device drivers (Like SAS DAS box Which I came across once) which are not
"burnt" into initrd but are later loaded when the full system is booted.

The  rc.local method worked reliably compared to the /etc/fstab entries. But
then YMMV.

Regards,

Rajagopal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081029/dbeae84a/attachment.htm>

From raju.rajsand at gmail.com  Wed Oct 29 07:01:08 2008
From: raju.rajsand at gmail.com (Rajagopal Swaminathan)
Date: Wed, 29 Oct 2008 12:31:08 +0530
Subject: [Linux-cluster] cmirror
Message-ID: <8786b91c0810290001h1b9c9534k947a9e8e8299d151@mail.gmail.com>

Greetings,

Could somebody point to some introductory material/doc on cmirror please

Regards

Rajagopal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081029/6ec0a69a/attachment.htm>

From jmacfarland at nexatech.com  Wed Oct 29 13:07:45 2008
From: jmacfarland at nexatech.com (Jeff Macfarland)
Date: Wed, 29 Oct 2008 08:07:45 -0500
Subject: [Linux-cluster] Multiple network path for cluster traffic
In-Reply-To: <20081028224115.E98BF15EC27@smtp25.baruch.cuny.edu>
References: <20081028224115.E98BF15EC27@smtp25.baruch.cuny.edu>
Message-ID: <49086021.3080505@nexatech.com>

Rodrique Heron wrote:
> Hello all-
> 
> Is it necessary to provide redundant paths for cluster traffic?
> 
> My server as six network interface, I would like to dedicate two for 
> cluster traffic, both interfaces will be connected to separate switches. 
> Is there a recommended way of setting this up so I can restrict all 
> cluster traffic through the two interfaces? Should I bond both interfaces?
> 
> Thanks
> 

Red Hat clustering currently only supports once interface for cluster 
traffic. If you want to use multiple interfaces, you must use bonding.

-- 
Jeff Macfarland (jmacfarland at nexatech.com)
Nexa Technologies - 972.747.8879
Systems Administrator
GPG Key ID: 0x5F1CA61B
GPG Key Server: hkp://wwwkeys.pgp.net


From jralph at intertechmedia.com  Thu Oct 30 17:37:14 2008
From: jralph at intertechmedia.com (Jason Ralph)
Date: Thu, 30 Oct 2008 13:37:14 -0400
Subject: [Linux-cluster] Data Loss / Files and Folders "2-Node_GFS-Cluster"
Message-ID: <2fd157df0810301037jf985e3bne5ca25e91dd74872@mail.gmail.com>

Hello List,

We currently have in production a two node cluster with a shared SAS storage
device.  Both nodes are running RHEL5 AP and are connected directly to the
storage device via SAS.  We also have configured a high availability NFS
service directory that is being exported out and is mounted on multiple
other linux servers.

The problem that I am seeing is:
FIle and folders that are using the GFS filesystem and live on the storage
device are mysteriously getting lost.  My first thought was that maybe one
of our many users has deleted them. So I have revoked the users privilleges
and it is still happening.  My other tought was that a rsync script may have
overwrote these files or deleted them.  I have stopped all scripting and
crons and it has happened again.

Can someone help me with a command or a log to view that would show me where
any of these folders may have gone?  Or has anyone else ever run into this
type of data loss using the similar setup?

Regards,
-- 
Jason R. Ralph
Systems Administrator
Intertech Media LLC
20 Summer Street - Floor 5
Stamford CT 06901
(203) 967 - 1800 x 122
jralph at intertechmedia.com


This transmittal may be a confidential communication or may otherwise be
privileged or confidential. If it is not clear that you are the intended
recipient, you are hereby notified that you have received this transmittal
in error; any review, dissemination, distribution or copying of this
transmittal is strictly prohibited. If you suspect that you have received
this communication in error, please notify us immediately by telephone at
1-203-967-1800 x 114, or e-mail at it at intertechmedia.com and immediately
delete this message and all its attachments.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081030/301fb910/attachment.htm>

From alan.zg at gmail.com  Thu Oct 30 20:57:29 2008
From: alan.zg at gmail.com (Alan A)
Date: Thu, 30 Oct 2008 15:57:29 -0500
Subject: [Linux-cluster] APC Power switch question
Message-ID: <fac531740810301357l7ac12924l9e3095f25d913ea9@mail.gmail.com>

Hello everyone!

I have a few short questions. We just acquired 2 APC Power Switches. Our
clustered servers have two power supplies so each APC switch
supplies/supports one server power supply. Example:
dev02 power supply 1 - APC switch 1
dev02 power supply 2 - APC switch 2

Question:
I am trying to complete CONGA setup - and all is clear in the first box:
Name - got it
IP - got it
Login - got it
Password got it

What I do not understand is what is: 'port' stand for - is that the port
fence_apc is connecting to APC power switch - or is that the number of the
outlet.
What is switch(optional) mean?
I repeat this is in CONGA!

Thanks for the fast help.
-- 
Alan A.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081030/ba62e1b5/attachment.htm>

From jparsons at redhat.com  Thu Oct 30 21:32:02 2008
From: jparsons at redhat.com (jim parsons)
Date: Thu, 30 Oct 2008 17:32:02 -0400
Subject: [Linux-cluster] APC Power switch question
In-Reply-To: <fac531740810301357l7ac12924l9e3095f25d913ea9@mail.gmail.com>
References: <fac531740810301357l7ac12924l9e3095f25d913ea9@mail.gmail.com>
Message-ID: <1225402322.3319.4.camel@localhost.localdomain>

On Thu, 2008-10-30 at 15:57 -0500, Alan A wrote:
> 
> Hello everyone!
> 
> I have a few short questions. We just acquired 2 APC Power Switches.
> Our clustered servers have two power supplies so each APC switch
> supplies/supports one server power supply. Example:
> dev02 power supply 1 - APC switch 1
> dev02 power supply 2 - APC switch 2
> 
> Question:
> I am trying to complete CONGA setup - and all is clear in the first
> box:
> Name - got it
> IP - got it
> Login - got it
> Password got it
> 
> What I do not understand is what is: 'port' stand for - is that the
> port fence_apc is connecting to APC power switch - or is that the
> number of the outlet.
It is the outlet number on the switch...or the name of the outlet if you
have assigned a name to it using the APC firmware application
> What is switch(optional) mean? 
Certain APC switch models can be ganged together. If you are using the
switches standalone (which you are, it seems from the above) just leave
this field blank.

-j


From alan.zg at gmail.com  Thu Oct 30 21:38:34 2008
From: alan.zg at gmail.com (Alan A)
Date: Thu, 30 Oct 2008 16:38:34 -0500
Subject: [Linux-cluster] APC Power switch question
In-Reply-To: <1225402322.3319.4.camel@localhost.localdomain>
References: <fac531740810301357l7ac12924l9e3095f25d913ea9@mail.gmail.com>
	<1225402322.3319.4.camel@localhost.localdomain>
Message-ID: <fac531740810301438l40727ccaj265550efc363947f@mail.gmail.com>

Thanks for the answer. I have actually named switches in a 3 node cluster
and will set them up accordingly.

THis is how cluster.conf looks like, I am finishing the setup for dev03.

    <fencedevices>
        <fencedevice agent="fence_scsi" name="dev02_scsi"/>
        <fencedevice agent="fence_scsi" name="dev03_scsi"/>
        <fencedevice agent="fence_scsi" name="dev04_scsi"/>
        <fencedevice agent="fence_apc" ipaddr="xxx.xxx.xxx.xxx"
login="xxxxxx" name="PCPS01_dev02" passwd="xxxxxx1"/>
        <fencedevice agent="fence_apc" ipaddr="xxx.xxx.xxx.xxx"
login="xxxxxx" name="PCPS02_dev02" passwd="xxxxxx1"/>
        <fencedevice agent="fence_apc" ipaddr="xxx.xxx.xxx.xxx"
login="xxxxxx" name="PCPS01_dev04" passwd="xxxxxx1"/>
        <fencedevice agent="fence_apc" ipaddr="xxx.xxx.xxx.xxx"
login="xxxxxx" name="PCPS02_dev04" passwd="xxxxxx1"/>
    </fencedevices>


On Thu, Oct 30, 2008 at 4:32 PM, jim parsons <jparsons at redhat.com> wrote:

> On Thu, 2008-10-30 at 15:57 -0500, Alan A wrote:
> >
> > Hello everyone!
> >
> > I have a few short questions. We just acquired 2 APC Power Switches.
> > Our clustered servers have two power supplies so each APC switch
> > supplies/supports one server power supply. Example:
> > dev02 power supply 1 - APC switch 1
> > dev02 power supply 2 - APC switch 2
> >
> > Question:
> > I am trying to complete CONGA setup - and all is clear in the first
> > box:
> > Name - got it
> > IP - got it
> > Login - got it
> > Password got it
> >
> > What I do not understand is what is: 'port' stand for - is that the
> > port fence_apc is connecting to APC power switch - or is that the
> > number of the outlet.
> It is the outlet number on the switch...or the name of the outlet if you
> have assigned a name to it using the APC firmware application
> > What is switch(optional) mean?
> Certain APC switch models can be ganged together. If you are using the
> switches standalone (which you are, it seems from the above) just leave
> this field blank.
>
> -j
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


-- 
Alan A.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081030/31b4885d/attachment.htm>

From pronix.service at gmail.com  Thu Oct 30 22:13:09 2008
From: pronix.service at gmail.com (pronix pronix)
Date: Thu, 30 Oct 2008 22:13:09 -0000
Subject: [Linux-cluster] Can clustered RHEL 5 use a SAN with different
	access rights for different nodes in the cluster?
In-Reply-To: <C47D7762.128A%RichardW@iodynamix.com>
References: <C47D7762.128A%RichardW@iodynamix.com>
Message-ID: <639ce0480806171423u4503665ewd7426080145309ea@mail.gmail.com>

yes , you can deploy than without gfs,but with gfs2 better
readonly access implement by anonymous (read only) users.
failover possible create - enough 2 nodes and drbd


2008/6/18 Richard Williams - IoDynamix <RichardW at iodynamix.com>:

> Please advise and/or redirect this posting if this is not the correct forum
> for my question - thanks.
>
> A company wants to use clustered rhel5 systems as inside/outside ftp
> servers. Users on the inside (LAN) cluster nodes can read and write to the
> SAN, while users on the outside (DMZ) cluster can only read.
>
> Is this application possible without GFS?
>
> If one node in the cluster fails, can the other node be provisioned to
> provide all services until recovery?
>
> Can a SAN be used as the "single" ftp location for both services (inside
> FTP
> & outside FTP?)
>
> Does the customer need more than four systems (i.e. 2 inside - 2 outside) -
> is a separate "command" system required?
>
>
> Have Dell's m1000e & 600 series blades been certified for this operating
> system?
>
> Is there any documentation available regarding separate access rights for
> multiple nodes in a cluster available?
>
> Thanks for your constructive reply.
>
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081030/92d21ea9/attachment.htm>

From s.wendy.cheng at gmail.com  Fri Oct 31 02:02:00 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Thu, 30 Oct 2008 21:02:00 -0500
Subject: [Linux-cluster] Data Loss / Files and Folders "2-Node_GFS-Cluster"
In-Reply-To: <2fd157df0810301037jf985e3bne5ca25e91dd74872@mail.gmail.com>
References: <2fd157df0810301037jf985e3bne5ca25e91dd74872@mail.gmail.com>
Message-ID: <490A6718.8000700@gmail.com>

Jason Ralph wrote:
> Hello List,
>
> We currently have in production a two node cluster with a shared SAS 
> storage device.  Both nodes are running RHEL5 AP and are connected 
> directly to the storage device via SAS.  We also have configured a 
> high availability NFS service directory that is being exported out and 
> is mounted on multiple other linux servers. 
>
> The problem that I am seeing is:
> FIle and folders that are using the GFS filesystem and live on the 
> storage device are mysteriously getting lost.  My first thought was 
> that maybe one of our many users has deleted them. So I have revoked 
> the users privilleges and it is still happening.  My other tought was 
> that a rsync script may have overwrote these files or deleted them.  I 
> have stopped all scripting and crons and it has happened again.
>
> Can someone help me with a command or a log to view that would show me 
> where any of these folders may have gone?  Or has anyone else ever run 
> into this type of data loss using the similar setup?
>


I don't (or "didn't") have adequate involvements with RHEL5 GFS. I may
not know enough to response. However, ......

Before RHEL 5.1 and/or community version 2.6.22 kernels, NFS lock (via
flock, fcntl, etc from client ends) is not populated into filesystem 
layer. It only reaches Linux VFS layer (local to one particular server). 
If your file access needs to get synchronized by either flock or posix 
fcntl *between multiple hosts (NFS servers)*, data loss could occur.
Newer versions of RHEL and 2.6.22-and-after kernels should have the fixes.

There was an old write-up in section 4.1 of
"http://people.redhat.com/wcheng/Project/nfs.htm" about this issue.


-- Wendy


From fdinitto at redhat.com  Fri Oct 31 08:27:06 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Fri, 31 Oct 2008 09:27:06 +0100 (CET)
Subject: [Linux-cluster] Cluster 2.99.12 (development snapshot) released
Message-ID: <Pine.LNX.4.64.0810310916060.2932@trider-g7>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


The cluster team and its community are proud to announce the 2.99.12
release from the master branch.

Important note: If you are running 2.99.xx series, please upgrade
immediatly to this version.

This release addresses several security issues.

The development cycle for 3.0 is proceeding at a very good speed and
mostlikely one of the next releases will be 3.0alpha1. All features
designed for 3.0 are being completed and taking a proper shape, the
library API has been stable for sometime (and will soon be marked as 3.0
soname). Stay tuned for upcoming updates!

The 2.99.XX releases are _NOT_ meant to be used for production
environments.. yet.

The master branch is the main development tree that receives all new
features, code, clean up and a whole brand new set of bugs,

At some point in time this code will become the 3.0 stable release.

Everybody with test equipment and time to spare, is highly encouraged to
download, install and test the 2.99 releases and more important report
problems.

In order to build the 2.99.11 release you will need:

- - corosync svn r1677 (porting to newer corosync is in progress).
- - openais svn r1656.
- - linux kernel (2.6.27)

The new source tarball can be downloaded here:

   ftp://sources.redhat.com/pub/cluster/releases/cluster-2.99.12.tar.gz
   https://fedorahosted.org/releases/c/l/cluster/cluster-2.99.12.tar.gz

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Happy clustering,
Fabio

Under the hood (from 2.99.11):

Christine Caulfield (1):
       cman: fix two_node startup if -e is specified

David Teigland (4):
       dlm_controld: fix plock dump
       groupd/fenced/dlm_controld/gfs_controld: init logging after fork
       gfs_controld: move log_error message
       fenced/dlm_controld/gfs_controld: query thread mutex

Fabio M. Di Nitto (40):
       misc: cleanup copyright.... again
       misc: fix gfs2_edit build
       fence: update man page for fence_apc
       gfs2: randomize debugfs mount point
       gfs2: randomize file for savemeta operations
       gfs2: remove unused define
       rgmanager: randomize file for automatic data dump
       rgmanager: randomize ASEHAagent temp files
       rgmanager: move fs.sh log file where they belong
       rgmanager: move nfsclient.sh cache files where they belong
       rgmanager: move oracledb.sh log files where they belong
       build: reinstate targets in rgmanager metadata check
       rgmanager: randomize SAPDatabase temp file
       libgfs2: randomize creation of temporary directories for metafs mount
       xmlconfig: remove debugging fprintf
       ccs: implement config reload in legacy ccs
       cman: add /libccs/@next_handle support
       ccs: libccs major rework pass 1
       ccs: libccs split ccs_lookup_nodename into extras.c
       ccs: libccs major rework pass 2
       ccs: libccs major rework pass 3
       ccs: libccs major rework pass 4
       ccs: remove duplicate entry in internal header file
       ccs: libccs major rework pass 5
       common: plug liblogthread in the system
       build: use standard syslog priority name rather than corosync
       ccs: add ccs_read_logging
       ccsais: fix buffer overflow when reading huge config files
       xmlconfig: fix buffer overflow when reading huge config files
       ccs: cleanup ccs_read_logging
       gfs2: randomize debugfs mount point even more
       gfs2: randomize file for savemeta operations even more
       rgmanager: move state dump file where it belongs
       rgmanager: randomize ASEHAagent temp files even more
       rgmanager: randomize SAPDatabase temp file even more
       rgmanager: randomize oracledb.sh temp file
       misc: fix mktemp usage
       rgmanager: randomize smb.sh temp file
       rgmanager: randomize svclib_nfslock temp dir
       gfs2: randomize creation of temporary directories for metafs mount more

Jan Friesse (6):
       [fence] Fence agent for ePowerSwitch 8M+ (fence_eps)
       [fence] Fixed man pages makefile, so fence_eps.8 is now installed.
       fence: Added support for no_password in fence agents library and fence_eps.
       fence: Fixed case sensitives in action parameter.
       fence: Fix -C switch description in Python library
       fence: Operation 'list' and 'monitor' for Alom, LDOM, VMware and ePowerSwitch

Jim Meyering (8):
       don't dereference NULL upon failed realloc
       * fence/agents/xvm/ip_lookup.c (add_ip): Handle malloc failure.
       * gfs/gfs_fsck/inode.c (check_inode): handle failed malloc
       remove dead code (useless test of memset return value)
       add comments marking unchecked malloc calls
       Remove unused local variable, buf,
       add comments marking unchecked strdup calls
       handle some malloc failures

Jonathan Brassow (1):
       rgmanager (HALVM): Stop dumping debug output to /tmp

Marek 'marx' Grac (4):
       [fence]	Operation 'list' and 'monitor' for iLO, DRAC5 and APC
       [fence]     Operation 'list' and 'monitor' for WTI IPS 800-CE
       [fence] WTI should not power on/off plug if it is unable to get status
       [fence] WTI should not power on/off plug if it is unable to get status

Simone Gotti (1):
       [rgmanager] Fix fuser parsing on later versions of psmisc

  Makefile                                      |    4 +-
  cman/daemon/cman-preconfig.c                  |   17 +
  cman/daemon/cmanconfig.c                      |   44 +-
  common/Makefile                               |    4 +
  common/liblogthread/Makefile                  |   13 +
  common/liblogthread/liblogthread.c            |  222 +++++
  common/liblogthread/liblogthread.h            |   17 +
  config/libs/libccsconfdb/Makefile             |    7 +-
  config/libs/libccsconfdb/ccs.h                |    5 +
  config/libs/libccsconfdb/ccs_internal.h       |   29 +
  config/libs/libccsconfdb/extras.c             |  382 ++++++++
  config/libs/libccsconfdb/fullxpath.c          |  334 +++++++
  config/libs/libccsconfdb/libccs.c             | 1181 ++++++++---------------
  config/libs/libccsconfdb/xpathlite.c          |  424 ++++++++
  config/plugins/ccsais/config.c                |   28 +-
  config/plugins/ldap/configldap.c              |   11 -
  config/plugins/xml/config.c                   |  100 +-
  config/tools/ccs_tool/editconf.c              |    1 +
  config/tools/ldap/confdb2ldif.c               |    9 -
  configure                                     |   25 +-
  doc/COPYRIGHT                                 |    9 +-
  fence/agents/alom/fence_alom.py               |    2 +-
  fence/agents/apc/fence_apc.py                 |    4 +-
  fence/agents/apc_snmp/fence_apc_snmp.py       |    2 +-
  fence/agents/baytech/fence_baytech.pl         |    2 +-
  fence/agents/bladecenter/Makefile             |   13 -
  fence/agents/bladecenter/fence_bladecenter.py |    3 +-
  fence/agents/drac/fence_drac5.py              |    2 +-
  fence/agents/eps/Makefile                     |    5 +
  fence/agents/eps/fence_eps.py                 |  112 +++
  fence/agents/ilo/fence_ilo.py                 |    2 +-
  fence/agents/ldom/fence_ldom.py               |   34 +-
  fence/agents/lib/fencing.py.py                |   53 +-
  fence/agents/rsa/fence_rsa.py                 |   18 +-
  fence/agents/rsb/fence_rsb.py                 |   18 +-
  fence/agents/vmware/fence_vmware.py           |   66 +-
  fence/agents/wti/fence_wti.py                 |   16 +-
  fence/agents/xcat/fence_xcat.pl               |    2 +
  fence/agents/xvm/ip_lookup.c                  |    2 +
  fence/fenced/main.c                           |   13 +-
  fence/fenced/recover.c                        |   33 +-
  fence/man/Makefile                            |    3 +-
  fence/man/fence_alom.8                        |   10 +-
  fence/man/fence_apc.8                         |    8 +-
  fence/man/fence_baytech.8                     |    4 +-
  fence/man/fence_eps.8                         |  106 ++
  fence/man/fence_ibmblade.8                    |    2 +-
  fence/man/fence_rsa.8                         |    4 +-
  fence/man/fence_rsb.8                         |    4 +-
  fence/man/fence_vmware.8                      |   10 +-
  gfs-kernel/src/gfs/lm_interface.h             |    9 -
  gfs-kernel/src/gfs/lock_dlm.h                 |    9 -
  gfs-kernel/src/gfs/lock_dlm_lock.c            |    9 -
  gfs-kernel/src/gfs/lock_dlm_main.c            |    9 -
  gfs-kernel/src/gfs/lock_dlm_mount.c           |    9 -
  gfs-kernel/src/gfs/lock_dlm_sysfs.c           |    9 -
  gfs-kernel/src/gfs/lock_dlm_thread.c          |    9 -
  gfs-kernel/src/gfs/lock_nolock_main.c         |    9 -
  gfs-kernel/src/gfs/locking.c                  |    9 -
  gfs/gfs_fsck/block_list.c                     |    5 +-
  gfs/gfs_fsck/fs_dir.c                         |    4 +
  gfs/gfs_fsck/inode.c                          |    6 +-
  gfs/gfs_fsck/super.c                          |    3 +
  gfs/libgfs/fs_dir.c                           |    4 +
  gfs/libgfs/inode.c                            |    1 +
  gfs/libgfs/super.c                            |    1 +
  gfs/tests/filecon2/filecon2_server.c          |    2 +-
  gfs2/edit/hexedit.c                           |    3 +-
  gfs2/edit/hexedit.h                           |    4 +-
  gfs2/edit/savemeta.c                          |   19 +-
  gfs2/fsck/initialize.c                        |    1 +
  gfs2/libgfs2/libgfs2.h                        |    4 -
  gfs2/libgfs2/misc.c                           |  117 +--
  gfs2/libgfs2/super.c                          |    1 +
  gfs2/mkfs/main_grow.c                         |    4 +-
  gfs2/mkfs/main_jadd.c                         |    7 +-
  gfs2/quota/check.c                            |   12 +-
  gfs2/quota/gfs2_quota.h                       |    3 -
  gfs2/quota/main.c                             |   25 +-
  gfs2/tool/df.c                                |    4 +-
  gfs2/tool/misc.c                              |   36 +-
  gnbd/tools/gnbd_export/gnbd_export.c          |    4 +
  group/daemon/app.c                            |    3 +
  group/daemon/cpg.c                            |    2 +
  group/daemon/joinleave.c                      |    1 +
  group/daemon/main.c                           |    7 +-
  group/dlm_controld/deadlock.c                 |   10 +-
  group/dlm_controld/main.c                     |    9 +-
  group/dlm_controld/plock.c                    |    3 +
  group/gfs_controld/cpg-new.c                  |   22 +-
  group/gfs_controld/crc.c                      |   12 -
  group/gfs_controld/main.c                     |   10 +-
  group/gfs_controld/plock.c                    |    2 +
  make/copyright.cf                             |    2 +-
  make/defines.mk.input                         |    3 +
  rgmanager/src/daemons/clurmtabd_lib.c         |    1 +
  rgmanager/src/daemons/dtest.c                 |    2 +
  rgmanager/src/daemons/main.c                  |    4 +-
  rgmanager/src/resources/ASEHAagent.sh         |   11 +-
  rgmanager/src/resources/Makefile              |   33 +-
  rgmanager/src/resources/SAPDatabase           |    2 +-
  rgmanager/src/resources/clusterfs.sh          |    4 +-
  rgmanager/src/resources/fs.sh                 | 1304 -------------------------
  rgmanager/src/resources/fs.sh.in              | 1304 +++++++++++++++++++++++++
  rgmanager/src/resources/lvm_by_vg.sh          |    2 +-
  rgmanager/src/resources/netfs.sh              |    4 +-
  rgmanager/src/resources/nfsclient.sh          |    7 +-
  rgmanager/src/resources/oracledb.sh           |  887 -----------------
  rgmanager/src/resources/oracledb.sh.in        |  888 +++++++++++++++++
  rgmanager/src/resources/smb.sh                |    2 +-
  rgmanager/src/resources/svclib_nfslock        |    3 +-
  111 files changed, 4805 insertions(+), 3509 deletions(-)

- --
I'm going to make him an offer he can't refuse.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iQIVAwUBSQrBYQgUGcMLQ3qJAQIk0BAAlZFrEgXWy8RxvHrXuvScIgutOjd+Bjgj
2cPdoaFZjeLSroWifZJNHjjYfSG/FcpZug/NJxall3xVicmwc/CUljJtgqRJeN6x
0VWTFyC7GJrg6pnzEnTyriggBpaGDZZgnbLisV2gmIqFuDmiEVHAqnoYWl+dU9dj
xaPq01LrVXzYhVb18DYqglCWl5LmHQQyTmhDh5pvUwbwZd/fsdr4WI7gkcgxA1Uw
Hy+pbVMWIgRTBH+YEDH2j28pynvaNLvUopLBPHFLGY971vLhYldGzUmQubm04J7O
ocQl0Q9qxuSVCqrCIpQ/Ty+V0x0begzahaczccdJAXVyxti2owKS4FX8OqLQPHo0
plFIx4g8hJSxX4zgfh/P7Fb48ePlGN6WE07o/2mO1vplfEOpnQ2xoWYFsDCaoSjO
W2bETI+xT+E+UpKTI0w1j5/mfo/8kJ79WmDlZZujuwrM6/1iMJVTWbffqZkbGMcj
ukl0B3q5VkFo4NOTtZJHOUfhC3+2QglfyhT09Fxhp1eqMiAFZDWWqEQxHC7dbtAv
xu8KRCQiR4hVEZLNnLaoAIYlWABVAz1Ltux52uDFuul/jusxDpqjlp1cT54+j+ss
h1wwlxgyFyisYCXxnAiRkECKjttcOG4FrVAA4k3fOl8u6F0Suw+GJdS2cESeXtce
D2lFBpyTapQ=
=JxFs
-----END PGP SIGNATURE-----


From fdinitto at redhat.com  Fri Oct 31 08:58:52 2008
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Fri, 31 Oct 2008 09:58:52 +0100 (CET)
Subject: [Linux-cluster] Cluster 2.03.09 released
Message-ID: <Pine.LNX.4.64.0810310956450.2932@trider-g7>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


The cluster team and its vibrant community are proud to announce the
2.03.09 release from the STABLE2 branch.

The STABLE2 branch collects, on a daily base, all bug fixes and the bare
minimal changes required to run the cluster on top of the most recent Linux
kernel (2.6.27) and rock solid openais (0.80.3).

This release addresses several security issues. Please consider upgrading 
as soon as possible.

The new source tarball can be downloaded here:

   ftp://sources.redhat.com/pub/cluster/releases/cluster-2.03.09.tar.gz
   https://fedorahosted.org/releases/c/l/cluster/cluster-2.03.09.tar.gz

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Happy clustering,
Fabio

Under the hood (from 2.03.08):

Christine Caulfield (3):
       dlmtop: Add usage message
       cman: fix two_node startup if -e is specified
       dlmtop: fix some typos.

Fabio M. Di Nitto (26):
       misc: cleanup copyright.... again... and again...
       misc: fix gfs2_edit build
       cman: fix buffer overflow when reading huge config files
       fence: update man page for fence_apc
       gfs2: randomize debugfs mount point
       gfs2: randomize debugfs mount point even more
       gfs2: randomize file for savemeta operations
       gfs2: randomize file for savemeta operations even more
       gfs2: remove unused define
       rgmanager: randomize file for automatic data dump
       rgmanager: move state dump file where it belongs
       rgmanager: randomize ASEHAagent temp files
       rgmanager: randomize ASEHAagent temp files even more
       rgmanager: move fs.sh log file where they belong
       rgmanager: move nfsclient.sh cache files where they belong
       rgmanager: move oracledb.sh log files where they belong
       build: reinstate targets in rgmanager metadata check
       rgmanager: randomize SAPDatabase temp file
       rgmanager: randomize SAPDatabase temp file even more
       libgfs2: randomize creation of temporary directories for metafs mount
       rgmanager: randomize oracledb.sh temp file
       misc: fix mktemp usage
       rgmanager: randomize smb.sh temp file
       rgmanager: randomize svclib_nfslock temp dir
       ccs_tool: randomize temporary file
       gfs2: randomize creation of temporary directories for metafs mount more

Jan Friesse (4):
       [fence] Fence agent for ePowerSwitch 8M+ (fence_eps)
       [fence] Fixed man pages makefile, so fence_eps.8 is now installed.
       fence: Added support for no_password in fence agents library and fence_eps.
       fence: Fixed case sensitives in action parameter.

Jonathan Brassow (1):
       rgmanager (HALVM): Stop dumping debug output to /tmp

Marek 'marx' Grac (2):
       [fence] WTI should not power on/off plug if it is unable to get status
       [fence] WTI should not power on/off plug if it is unable to get status

  ccs/ccs_tool/upgrade.c                  |    7 +-
  cman/daemon/ais.c                       |    8 +-
  cman/daemon/cmanccs.c                   |   70 +-
  cman/daemon/config.c                    |   49 +-
  config/copyright.cf                     |    2 +-
  dlm/tests/tcpdump/dlmtop.c              |   84 ++-
  fence/agents/apc_snmp/README            |    2 -
  fence/agents/apc_snmp/fence_apc_snmp.py |    2 +-
  fence/agents/baytech/fence_baytech.pl   |    2 +-
  fence/agents/eps/Makefile               |    5 +
  fence/agents/eps/fence_eps.py           |  108 +++
  fence/agents/lib/fencing.py.py          |   27 +-
  fence/agents/lpar/Makefile              |   13 -
  fence/agents/lpar/fence_lpar.py         |    3 +-
  fence/agents/rsa/fence_rsa.py           |   18 +-
  fence/agents/rsb/fence_rsb.py           |   18 +-
  fence/agents/vmware/fence_vmware.py     |    3 +-
  fence/agents/xcat/fence_xcat.pl         |    2 +
  fence/man/Makefile                      |    3 +-
  fence/man/fence_alom.8                  |   10 +-
  fence/man/fence_apc.8                   |    8 +-
  fence/man/fence_baytech.8               |    4 +-
  fence/man/fence_eps.8                   |  106 +++
  fence/man/fence_ibmblade.8              |    2 +-
  fence/man/fence_rsa.8                   |    4 +-
  fence/man/fence_rsb.8                   |    4 +-
  fence/man/fence_vmware.8                |   10 +-
  gfs-kernel/src/gfs/lm_interface.h       |    9 -
  gfs-kernel/src/gfs/lock_dlm.h           |    9 -
  gfs-kernel/src/gfs/lock_dlm_lock.c      |    9 -
  gfs-kernel/src/gfs/lock_dlm_main.c      |    9 -
  gfs-kernel/src/gfs/lock_dlm_mount.c     |    9 -
  gfs-kernel/src/gfs/lock_dlm_sysfs.c     |    9 -
  gfs-kernel/src/gfs/lock_dlm_thread.c    |    9 -
  gfs-kernel/src/gfs/lock_nolock_main.c   |    9 -
  gfs-kernel/src/gfs/locking.c            |    9 -
  gfs2/edit/hexedit.c                     |    2 +-
  gfs2/edit/hexedit.h                     |    4 +-
  gfs2/edit/savemeta.c                    |   15 +-
  gfs2/libgfs2/libgfs2.h                  |    4 -
  gfs2/libgfs2/misc.c                     |  117 +---
  gfs2/mkfs/main_grow.c                   |    4 +-
  gfs2/mkfs/main_jadd.c                   |    7 +-
  gfs2/quota/check.c                      |   12 +-
  gfs2/quota/gfs2_quota.h                 |    3 -
  gfs2/quota/main.c                       |   25 +-
  gfs2/tool/df.c                          |    4 +-
  gfs2/tool/misc.c                        |   36 +-
  rgmanager/src/daemons/main.c            |    4 +-
  rgmanager/src/resources/ASEHAagent.sh   |   11 +-
  rgmanager/src/resources/Makefile        |   33 +-
  rgmanager/src/resources/SAPDatabase     |    2 +-
  rgmanager/src/resources/fs.sh           | 1304 -------------------------------
  rgmanager/src/resources/fs.sh.in        | 1304 +++++++++++++++++++++++++++++++
  rgmanager/src/resources/lvm_by_vg.sh    |    2 +-
  rgmanager/src/resources/nfsclient.sh    |    7 +-
  rgmanager/src/resources/oracledb.sh     |  888 ---------------------
  rgmanager/src/resources/oracledb.sh.in  |  889 +++++++++++++++++++++
  rgmanager/src/resources/smb.sh          |    2 +-
  rgmanager/src/resources/svclib_nfslock  |    3 +-
  60 files changed, 2732 insertions(+), 2605 deletions(-)

- --
I'm going to make him an offer he can't refuse.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iQIVAwUBSQrI0QgUGcMLQ3qJAQJWPw/8Cx56Zw5mdKNUBTHqgGEF5UTySV3GLMMq
KiLS8L2sAaKvAjP2hqntTJn+iKBQdH02hCLo0PLDKdXwY8TSFY8Ryu04340ElUKw
ShnC6mxKs0Kc44X+jUiqG4gH7zEoeW0KdW514NFdyY41Jd7X6IXmIyGDgE+kCxjm
T/n3HJv+3sNyCYbtHMBnnCnXj4e5Bp9lj5Dd0u0QiJWunDucX4x5DQrDZF7SmUaF
QnyIEDU3AUB6TI2Yzg24BuWhXX4upaBX9LGOVn0Y4sLappZqrI/RFgN1A05mXnsd
fRQWRDjsDcKBlU/+YKKNZaE2uefVHzshza0VOxlvqFtEbbDmjIRv+Bkw7L51C/nG
Vxe4xNvXukg8GhiZCsPsP3Iv84nJaLnHkS1JqKAf8iZRfHGvlHXmzYBj462j+T/i
RrpF3qmcCiwz12HI+MUkCNgkbVTA3LagSZKbiB1AYFWA+I+vksBTD1d9VgYSUIub
vrrn2IhpsSRVbAsvVGO4lCGZJYRNza/d6c3bi8O0GG7JjN2I4ucGZs3yCgyjei1O
1bJSIxhL0COWhmJYaZnwhll1mYQ9td+BTu4BzF2Wd1NE94G2wE+/OnT4Xu4xzQiK
Wse4BjuezGWbjooG0BLpAnZbiZfOHZnUGNMAlrkTELtg2c9ed3vvYZy4jdkmoxcY
zkdU2QK+8xs=
=Fqdo
-----END PGP SIGNATURE-----


From pk at nodex.ru  Fri Oct 31 10:52:12 2008
From: pk at nodex.ru (Pavel Kuzin)
Date: Fri, 31 Oct 2008 13:52:12 +0300
Subject: [Linux-cluster] Building error in  Cluster 2.03.09
References: <Pine.LNX.4.64.0810310956450.2932@trider-g7>
Message-ID: <0d5f01c93b46$baf5a4e0$a401a8c0@mainoffice.nodex.ru>

Hello!

I`m tryig to build cluster 2.03.09 against linux 2.6.27.4.

When building have a error:

upgrade.o: In function `upgrade_device_archive':
/root/newcluster/cluster-2.03.09/ccs/ccs_tool/upgrade.c:226: undefined reference to `mkostemp'
collect2: ld returned 1 exit status
make[2]: *** [ccs_tool] Error 1
make[2]: Leaving directory `/root/newcluster/cluster-2.03.09/ccs/ccs_tool'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/root/newcluster/cluster-2.03.09/ccs'
make: *** [ccs] Error 2

node2:~/newcluster/cluster-2.03.09# uname -a
Linux node2 2.6.27.4 #2 SMP Fri Oct 31 13:42:09 MSK 2008 i686 GNU/Linux

--
Pavel D.Kuzin
Nodex LTD.


From pk at nodex.ru  Fri Oct 31 11:13:05 2008
From: pk at nodex.ru (Pavel Kuzin)
Date: Fri, 31 Oct 2008 14:13:05 +0300
Subject: [Linux-cluster] Fw: Building error in  Cluster 2.03.09
Message-ID: <0d9d01c93b49$a5a81fc0$a401a8c0@mainoffice.nodex.ru>

Hello!

I`m trying to build cluster 2.03.09 against linux 2.6.27.4.

When building have a error:
 
 upgrade.o: In function `upgrade_device_archive':
 /root/newcluster/cluster-2.03.09/ccs/ccs_tool/upgrade.c:226: undefined reference to `mkostemp'
 collect2: ld returned 1 exit status
 make[2]: *** [ccs_tool] Error 1
 make[2]: Leaving directory `/root/newcluster/cluster-2.03.09/ccs/ccs_tool'
 make[1]: *** [all] Error 2
 make[1]: Leaving directory `/root/newcluster/cluster-2.03.09/ccs'
 make: *** [ccs] Error 2
 
 node2:~/newcluster/cluster-2.03.09# uname -a
 Linux node2 2.6.27.4 #2 SMP Fri Oct 31 13:42:09 MSK 2008 i686 GNU/Linux

Distro  - Debian Etch 

Seems mkostemp is available since glibc 2.7.
I have 2.6.
Can "mkostemp" be changed to another similar function?


 --
 Pavel D.Kuzin
 Nodex LTD.


From mad at wol.de  Fri Oct 31 11:19:35 2008
From: mad at wol.de (Marc - A. Dahlhaus [ Administration | Westermann GmbH ])
Date: Fri, 31 Oct 2008 12:19:35 +0100
Subject: [Linux-cluster] Building error in  Cluster 2.03.09
In-Reply-To: <0d5f01c93b46$baf5a4e0$a401a8c0@mainoffice.nodex.ru>
References: <Pine.LNX.4.64.0810310956450.2932@trider-g7>
	<0d5f01c93b46$baf5a4e0$a401a8c0@mainoffice.nodex.ru>
Message-ID: <1225451975.3666.10.camel@marc>

Hello Pavel,


2.03.09 builds just fine against kernel 2.6.27.4, openais 0.84 and glibc
2.8 here.

As mkostemp should be defined inside of /usr/include/stdlib.h this must
be a problem with your local build-environment.


Marc

Am Freitag, den 31.10.2008, 13:52 +0300 schrieb Pavel Kuzin:
> Hello!
> 
> I`m tryig to build cluster 2.03.09 against linux 2.6.27.4.
> 
> When building have a error:
> 
> upgrade.o: In function `upgrade_device_archive':
> /root/newcluster/cluster-2.03.09/ccs/ccs_tool/upgrade.c:226: undefined reference to `mkostemp'
> collect2: ld returned 1 exit status
> make[2]: *** [ccs_tool] Error 1
> make[2]: Leaving directory `/root/newcluster/cluster-2.03.09/ccs/ccs_tool'
> make[1]: *** [all] Error 2
> make[1]: Leaving directory `/root/newcluster/cluster-2.03.09/ccs'
> make: *** [ccs] Error 2
> 
> node2:~/newcluster/cluster-2.03.09# uname -a
> Linux node2 2.6.27.4 #2 SMP Fri Oct 31 13:42:09 MSK 2008 i686 GNU/Linux
> 
> --
> Pavel D.Kuzin
> Nodex LTD.
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From rodrique.heron at baruch.cuny.edu  Fri Oct 31 13:27:23 2008
From: rodrique.heron at baruch.cuny.edu (Rodrique Heron)
Date: Fri, 31 Oct 2008 09:27:23 -0400
Subject: [Linux-cluster] APC Power switch question
Message-ID: <20081031132328.BDDC715EC52@smtp25.baruch.cuny.edu>

I just acquired 2 APC Power Switches my self, my servers are dell, so my plan is to use the APC as a primary and drac as a secondary fencingdevice.

My cluster as one interface for production traffic and another for cluster traffic which is a private non routed network. 

Can the fencing devices be connected to the production? Or they have to been on the same network for cluster traffic?

Thanks
 

----- Original Message -----
From: linux-cluster-bounces at redhat.com <linux-cluster-bounces at redhat.com>
To: linux clustering <linux-cluster at redhat.com>
Sent: Thu Oct 30 17:38:34 2008
Subject: Re: [Linux-cluster] APC Power switch question

Thanks for the answer. I have actually named switches in a 3 node cluster and will set them up accordingly. 

THis is how cluster.conf looks like, I am finishing the setup for dev03.

    <fencedevices>
        <fencedevice agent="fence_scsi" name="dev02_scsi"/>
        <fencedevice agent="fence_scsi" name="dev03_scsi"/>
        <fencedevice agent="fence_scsi" name="dev04_scsi"/>
        <fencedevice agent="fence_apc" ipaddr="xxx.xxx.xxx.xxx" login="xxxxxx" name="PCPS01_dev02" passwd="xxxxxx1"/>
        <fencedevice agent="fence_apc" ipaddr="xxx.xxx.xxx.xxx" login="xxxxxx" name="PCPS02_dev02" passwd="xxxxxx1"/>
        <fencedevice agent="fence_apc" ipaddr="xxx.xxx.xxx.xxx" login="xxxxxx" name="PCPS01_dev04" passwd="xxxxxx1"/>
        <fencedevice agent="fence_apc" ipaddr="xxx.xxx.xxx.xxx" login="xxxxxx" name="PCPS02_dev04" passwd="xxxxxx1"/>
    </fencedevices>


On Thu, Oct 30, 2008 at 4:32 PM, jim parsons <jparsons at redhat.com> wrote:


	On Thu, 2008-10-30 at 15:57 -0500, Alan A wrote:
	>
	> Hello everyone!
	>
	> I have a few short questions. We just acquired 2 APC Power Switches.
	> Our clustered servers have two power supplies so each APC switch
	> supplies/supports one server power supply. Example:
	> dev02 power supply 1 - APC switch 1
	> dev02 power supply 2 - APC switch 2
	>
	> Question:
	> I am trying to complete CONGA setup - and all is clear in the first
	> box:
	> Name - got it
	> IP - got it
	> Login - got it
	> Password got it
	>
	> What I do not understand is what is: 'port' stand for - is that the
	> port fence_apc is connecting to APC power switch - or is that the
	> number of the outlet.
	
	It is the outlet number on the switch...or the name of the outlet if you
	have assigned a name to it using the APC firmware application
	
	> What is switch(optional) mean?
	
	Certain APC switch models can be ganged together. If you are using the
	switches standalone (which you are, it seems from the above) just leave
	this field blank.
	
	-j
	
	--
	Linux-cluster mailing list
	Linux-cluster at redhat.com
	https://www.redhat.com/mailman/listinfo/linux-cluster
	

-- 
Alan A.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081031/87720caa/attachment.htm>

From raju.rajsand at gmail.com  Fri Oct 31 17:18:29 2008
From: raju.rajsand at gmail.com (Rajagopal Swaminathan)
Date: Fri, 31 Oct 2008 22:48:29 +0530
Subject: [Linux-cluster] one click to start httpd on all nodes - possible?
In-Reply-To: <200809011154.43460.linux@vfemail.net>
References: <200808271301.57414.linux@vfemail.net>
	<38A48FA2F0103444906AD22E14F1B5A307F20245@mailxchg01.corp.opsource.net>
	<200809011154.43460.linux@vfemail.net>
Message-ID: <8786b91c0810311018v75fa7cfdyc0508baad7a29ba6@mail.gmail.com>

Greetings

On Mon, Sep 1, 2008 at 2:24 PM, Alex <linux at vfemail.net> wrote:

> i need a "command center" to control (start/stop) a
> resourse/service globally in 2nd thier, on all N nodes.


Have you tried cssh?

Not exactly a bells and whistles stuff, but can do what you are describing

Regards

Rajagopal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081031/943a640d/attachment.htm>

From pk at nodex.ru  Fri Oct 31 17:32:45 2008
From: pk at nodex.ru (Pavel Kuzin)
Date: Fri, 31 Oct 2008 20:32:45 +0300
Subject: [Linux-cluster] Strange fenced error
References: <Pine.LNX.4.64.0810310956450.2932@trider-g7>
Message-ID: <1a9e01c93b7e$afbc2170$a401a8c0@mainoffice.nodex.ru>

When node trying to fence another

Oct 31 20:36:01 node1 fenced[2634]: fencing node "node2"
Oct 31 20:36:05 node1 fenced[2634]: can't get node number for node ??^U^I~a^F?^P
Oct 31 20:36:05 node1 fenced[2634]: fence "node2" success

--
Pavel D.Kuzin
Nodex LTD.


From pk at nodex.ru  Fri Oct 31 17:32:45 2008
From: pk at nodex.ru (Pavel Kuzin)
Date: Fri, 31 Oct 2008 20:32:45 +0300
Subject: [Linux-cluster] Strange fenced error
References: <Pine.LNX.4.64.0810310956450.2932@trider-g7>
Message-ID: <1a9e01c93b7e$afbc2170$a401a8c0@mainoffice.nodex.ru>

When node trying to fence another

Oct 31 20:36:01 node1 fenced[2634]: fencing node "node2"
Oct 31 20:36:05 node1 fenced[2634]: can't get node number for node ??^U^I~a^F?^P
Oct 31 20:36:05 node1 fenced[2634]: fence "node2" success

--
Pavel D.Kuzin
Nodex LTD.


From lhh at redhat.com  Fri Oct 31 18:31:54 2008
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 31 Oct 2008 14:31:54 -0400
Subject: [Linux-cluster] Fw: Building error in  Cluster 2.03.09
In-Reply-To: <0d9d01c93b49$a5a81fc0$a401a8c0@mainoffice.nodex.ru>
References: <0d9d01c93b49$a5a81fc0$a401a8c0@mainoffice.nodex.ru>
Message-ID: <1225477914.3194.159.camel@ayanami>

On Fri, 2008-10-31 at 14:13 +0300, Pavel Kuzin wrote:
>  node2:~/newcluster/cluster-2.03.09# uname -a
>  Linux node2 2.6.27.4 #2 SMP Fri Oct 31 13:42:09 MSK 2008 i686 GNU/Linux
> 
> Distro  - Debian Etch 
> 

Maybe...

> Seems mkostemp is available since glibc 2.7.
> I have 2.6.
> Can "mkostemp" be changed to another similar function?

#define mkostemp(val, flags) mkstemp(val)

?

Man page:

int mkstemp(char *template);
int mkostemp (char *template, int flags);

...

mkostemp() is like mkstemp(), with the difference that flags as for
open(2) may be specified in flags (e.g., O_APPEND, O_SYNC).

Not sure the implications of doing this; I didn't analyze the open flags
used.

-- Lon


From alan.zg at gmail.com  Fri Oct 31 19:39:01 2008
From: alan.zg at gmail.com (Alan A)
Date: Fri, 31 Oct 2008 14:39:01 -0500
Subject: [Linux-cluster] Node won't fence APC switch strange error
Message-ID: <fac531740810311239u2e7c9149y877fe6a770f3491c@mail.gmail.com>

Does anyone have any idea what this means? Any suggestions?


>fence_node fenmrdev03


agent "fence_apc" reports: Traceback (most recent call last):
  File "/sbin/fence_apc", line 829, in ?
    main()
  File "/sbin/fence_apc", line 303, in main
    do_power_off(sock)
  File "/sbin/fence_apc", line 813, in do_power_off
    x = do_power_switch(sock, "off")
  File "/sbi
agent "fence_apc" reports: n/fence_apc", line 611, in do_power_switch
    result_code, response = power_off(txt + ndbuf)
  File "/sbin/fence_apc", line 817, in power_off
    x = power_switch(buffer, False, "2", "3");
  File "/sbin/fence_apc", line 810, in power_switch
    raise "un
agent "fence_apc" reports: known screen encountered in \n" + str(lines) +
"\n"
unknown screen encountered in
['', '> 2', '', '', '------- Configure Outlet
------------------------------------------------------', '', '    #  State
Ph  Name                     Pwr On Dly  Pwr Off D
agent "fence_apc" reports: ly  Reboot Dur.', '
----------------------------------------------------------------------------',
'    2  ON     1   Fenmrdev03               0 sec       0 sec        5 sec',
'', '     1- Outlet Name         : Fenmrdev03', '     2- Power On Delay(sec)

agent "fence_apc" reports: : 0', '     3- Power Off Delay(sec): 0', '     4-
Reboot Duration(sec): 5', '     5- Accept Changes      : ', '', '     ?-
Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log']


-- 
Alan A.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081031/e39407ae/attachment.htm>

From alan.zg at gmail.com  Fri Oct 31 19:42:04 2008
From: alan.zg at gmail.com (Alan A)
Date: Fri, 31 Oct 2008 14:42:04 -0500
Subject: [Linux-cluster] Re: Node won't fence APC switch strange error
In-Reply-To: <fac531740810311239u2e7c9149y877fe6a770f3491c@mail.gmail.com>
References: <fac531740810311239u2e7c9149y877fe6a770f3491c@mail.gmail.com>
Message-ID: <fac531740810311242n2d34338cpb4853a5a82724dd5@mail.gmail.com>

clvmd hangs when trying to get status or restart it - I am not sure how
related is this?

On Fri, Oct 31, 2008 at 2:39 PM, Alan A <alan.zg at gmail.com> wrote:

> Does anyone have any idea what this means? Any suggestions?
>
>
> >fence_node fenmrdev03
>
>
>
> agent "fence_apc" reports: Traceback (most recent call last):
>   File "/sbin/fence_apc", line 829, in ?
>     main()
>   File "/sbin/fence_apc", line 303, in main
>     do_power_off(sock)
>   File "/sbin/fence_apc", line 813, in do_power_off
>     x = do_power_switch(sock, "off")
>   File "/sbi
> agent "fence_apc" reports: n/fence_apc", line 611, in do_power_switch
>     result_code, response = power_off(txt + ndbuf)
>   File "/sbin/fence_apc", line 817, in power_off
>     x = power_switch(buffer, False, "2", "3");
>   File "/sbin/fence_apc", line 810, in power_switch
>     raise "un
> agent "fence_apc" reports: known screen encountered in \n" + str(lines) +
> "\n"
> unknown screen encountered in
> ['', '> 2', '', '', '------- Configure Outlet
> ------------------------------------------------------', '', '    #  State
> Ph  Name                     Pwr On Dly  Pwr Off D
> agent "fence_apc" reports: ly  Reboot Dur.', '
> ----------------------------------------------------------------------------',
> '    2  ON     1   Fenmrdev03               0 sec       0 sec        5 sec',
> '', '     1- Outlet Name         : Fenmrdev03', '     2- Power On Delay(sec)
>
> agent "fence_apc" reports: : 0', '     3- Power Off Delay(sec): 0', '
> 4- Reboot Duration(sec): 5', '     5- Accept Changes      : ', '', '     ?-
> Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log']
>
>
> --
> Alan A.
>


-- 
Alan A.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20081031/906891b8/attachment.htm>

From s.wendy.cheng at gmail.com  Thu Oct 30 18:49:37 2008
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Thu, 30 Oct 2008 14:49:37 -0400
Subject: [Linux-cluster] Data Loss / Files and Folders "2-Node_GFS-Cluster"
In-Reply-To: <2fd157df0810301037jf985e3bne5ca25e91dd74872@mail.gmail.com>
References: <2fd157df0810301037jf985e3bne5ca25e91dd74872@mail.gmail.com>
Message-ID: <490A01C1.5030003@gmail.com>

Jason Ralph wrote:
> Hello List,
>
> We currently have in production a two node cluster with a shared SAS 
> storage device.  Both nodes are running RHEL5 AP and are connected 
> directly to the storage device via SAS.  We also have configured a 
> high availability NFS service directory that is being exported out and 
> is mounted on multiple other linux servers. 
>
> The problem that I am seeing is:
> FIle and folders that are using the GFS filesystem and live on the 
> storage device are mysteriously getting lost.  My first thought was 
> that maybe one of our many users has deleted them. So I have revoked 
> the users privilleges and it is still happening.  My other tought was 
> that a rsync script may have overwrote these files or deleted them.  I 
> have stopped all scripting and crons and it has happened again.
>
> Can someone help me with a command or a log to view that would show me 
> where any of these folders may have gone?  Or has anyone else ever run 
> into this type of data loss using the similar setup?
>

I don't (or "didn't") have adequate involvements with RHEL5 GFS. I may 
not know enough to response. However, users should be aware of ...

Before RHEL 5.1 and community version 2.6.22 kernels, NFS locks (i.e. 
flock, posix lock, etc) is not populated into filesystem layer. It only 
reaches Linux VFS layer (local to one particular server). If your file 
access needs to get synchronized via either flock or posix locks 
*between multiple hosts (i.e. NFS servers)*,  data loss could occur. 
Newer versions of RHEL and 2.6.22-and-above kernels should have the code 
to support this new feature.

There was an old write-up in section 4.1 of 
"http://people.redhat.com/wcheng/Project/nfs.htm" about this issue.

-- Wendy