From stefan at lsd.co.za  Sat Jan  1 15:00:07 2011
From: stefan at lsd.co.za (Stefan Lesicnik)
Date: Sat, 1 Jan 2011 17:00:07 +0200 (SAST)
Subject: [Linux-cluster] Multiple communication channels
In-Reply-To: <4D1B92C1.2030900@alteeve.com>
Message-ID: <1043183573.3.1293894007922.JavaMail.root@zimbra>



----- Original Message -----
> On 12/29/2010 02:49 PM, Kit Gerrits wrote:
> > Hello,
> >
> > AFAIK, Multi-interface heartbeat is something that was only recently
> > added to RHCS (earlier this year, if I recall correctly).
> >
> > Until then, the failover part was usually achieved by using a bonded
> > interface as heartbeat interface.
> > If possible, I would suggest using 2 (connected) Multicast switches
> > and
> > running a bond from each server to each switch.
> > Or 2 regular switches and broadcast heartbeat (switches only
> > connected
> > to eachother)
> > Otherwise, using an active-active bond (channel?) with 2 crossover
> > cables might also work, but offers less protection against interface
> > failures.
> >
> >
> > Regards,
> >
> > Kit
> 
> Hi,
> 
> It was around in el5. Perhaps not in the early versions, I am not sure
> exactly when it was added, but certainly by 5.4.
> 
> In the recent 3.x branch, openais was replaced by corosync (for core
> cluster communications), which is where rrp is controlled.
> 
> Of course, I could always be wrong. :)
> 
> Cheers.


Thanks everybody for the replies. I am running 5.5 and followed the documentation on http://sources.redhat.com/cluster/wiki/MultiHome
The config seems to be working with cman status showing both IP addresses. I will do proper testing to see if it actually works.

Thanks for the assistance.

Stefan



From jacob.ishak at gmail.com  Mon Jan  3 08:27:19 2011
From: jacob.ishak at gmail.com (jacob ishak)
Date: Mon, 3 Jan 2011 10:27:19 +0200
Subject: [Linux-cluster] Multiple communication channels
In-Reply-To: <1218306684.1.1293649373746.JavaMail.root@zimbra>
References: <2119444257.0.1293649365513.JavaMail.root@zimbra>
	<1218306684.1.1293649373746.JavaMail.root@zimbra>
Message-ID: <AANLkTikgLM_jnsC0Jvh21=1MRwY1OQGvNP0CfT-6SwnX@mail.gmail.com>

Hi

using the public interface for cluster communication is never recommended ,

you can add another crossover cable between the two servers , and apply bond
configuration between the crossovers connections in this way you can have
redundancy for cluster communication

Br,
jacob

On Wed, Dec 29, 2010 at 9:02 PM, Stefan Lesicnik <stefan at lsd.co.za> wrote:

> Hi all,
>
> I am running RHCS 5 and have a two node cluster with a shared qdisk. I have
> a bonded network bond0 and a back to back crossover eth1.
>
> Currently I have multicast cluster communication over the crossover, but
> was wondering if it was possible to use bond0 as an alternative / failover.
> So if eth1 was down, it could still communicate?
>
> I havent been able to find anything in the FAQ / documentation that would
> suggest this, so I thought I would ask.
>
> Thanks alot and I hope everyone has a great new year :)
>
> Stefan
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110103/2dddca4d/attachment.htm>

From hostmaster at inwx.de  Mon Jan  3 23:54:45 2011
From: hostmaster at inwx.de (InterNetworX | Hostmaster)
Date: Tue, 04 Jan 2011 00:54:45 +0100
Subject: [Linux-cluster] Processes in D state
Message-ID: <4D2261C5.8000802@inwx.de>

Hello,

we are using GFS2 but sometimes there are processes hanging in D state:

# ps axl | grep D
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
0     0 14220 14219  20   0  19624  1916 -      Ds   ?          0:00
/usr/lib/postfix/master -t
0     0 14555 14498  20   0  16608  1716 -      D+
/mnt/storage/openvz/root/129/dev/pts/0   0:00 apt-get install less
0     0 15068 15067  19  -1  36844  2156 -      D<s  ?          0:00
/usr/lib/postfix/master -t
0     0 16603 16602  19  -1  36844  2156 -      D<s  ?          0:00
/usr/lib/postfix/master -t
4   101 19534 13238  19  -1  33132  2984 -      D<   ?          0:00
smtpd -n smtp -t inet -u -c
4   101 19542 13238  19  -1  33116  2976 -      D<   ?          0:00
smtpd -n smtp -t inet -u -c
0     0 19735 13068  20   0   7548   880 -      S+   pts/0      0:00 grep D

dmesg shows this message many times:

[11142.334229] INFO: task master:14220 blocked for more than 120 seconds.
[11142.334266] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[11142.334310] master        D ffff88032b644800     0 14220  14219
0x00000000
[11142.334315]  ffff88062dd40000 0000000000000086 0000000000000000
ffffffffa02628d9
[11142.334318]  ffff88017a517ef8 000000000000fa40 ffff88017a517fd8
0000000000016940
[11142.334322]  0000000000016940 ffff88032b644800 ffff88032b644af8
0000000b7a517cd8
[11142.334325] Call Trace:
[11142.334340]  [<ffffffffa02628d9>] ? gfs2_glock_put+0xf9/0x118 [gfs2]
[11142.334347]  [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2]
[11142.334353]  [<ffffffffa0261db9>] ? gfs2_glock_holder_wait+0x9/0xd [gfs2]
[11142.334358]  [<ffffffff812e9897>] ? __wait_on_bit+0x41/0x70
[11142.334363]  [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2]
[11142.334367]  [<ffffffff812e9931>] ? out_of_line_wait_on_bit+0x6b/0x77
[11142.334370]  [<ffffffff81066808>] ? wake_bit_function+0x0/0x23
[11142.334376]  [<ffffffffa0261d9e>] ? gfs2_glock_wait+0x23/0x28 [gfs2]
[11142.334383]  [<ffffffffa026b2b0>] ? gfs2_flock+0x17c/0x1f9 [gfs2]
[11142.334386]  [<ffffffff810e735d>] ? virt_to_head_page+0x9/0x2a
[11142.334389]  [<ffffffff810e743e>] ? ub_slab_ptr+0x22/0x65
[11142.334393]  [<ffffffff8112221b>] ? sys_flock+0xff/0x12a
[11142.334396]  [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b

Any idea what is going wrong? Do you need any more informations?

Mario



From pradhanparas at gmail.com  Tue Jan  4 01:23:25 2011
From: pradhanparas at gmail.com (Paras pradhan)
Date: Mon, 3 Jan 2011 19:23:25 -0600
Subject: [Linux-cluster] Processes in D state
In-Reply-To: <4D2261C5.8000802@inwx.de>
References: <4D2261C5.8000802@inwx.de>
Message-ID: <AANLkTik-=E2S0FG2psa2=TsBaq=T3H2d1pJHCiKAbSyj@mail.gmail.com>

I had the same problem. it locked the whole gfs cluster and had to
reboot the node. after reboot all is fine now but still trying to find
out what has caused it.

Paras

On Monday, January 3, 2011, InterNetworX | Hostmaster
<hostmaster at inwx.de> wrote:
> Hello,
>
> we are using GFS2 but sometimes there are processes hanging in D state:
>
> # ps axl | grep D
> F ? UID ? PID ?PPID PRI ?NI ? ?VSZ ? RSS WCHAN ?STAT TTY ? ? ? ?TIME COMMAND
> 0 ? ? 0 14220 14219 ?20 ? 0 ?19624 ?1916 - ? ? ?Ds ? ? ? ? ? ? ?0:00
> /usr/lib/postfix/master -t
> 0 ? ? 0 14555 14498 ?20 ? 0 ?16608 ?1716 - ? ? ?D+
> /mnt/storage/openvz/root/129/dev/pts/0 ? 0:00 apt-get install less
> 0 ? ? 0 15068 15067 ?19 ?-1 ?36844 ?2156 - ? ? ?D<s ?? ? ? ? ? ?0:00
> /usr/lib/postfix/master -t
> 0 ? ? 0 16603 16602 ?19 ?-1 ?36844 ?2156 - ? ? ?D<s ?? ? ? ? ? ?0:00
> /usr/lib/postfix/master -t
> 4 ? 101 19534 13238 ?19 ?-1 ?33132 ?2984 - ? ? ?D< ? ? ? ? ? ? ?0:00
> smtpd -n smtp -t inet -u -c
> 4 ? 101 19542 13238 ?19 ?-1 ?33116 ?2976 - ? ? ?D< ? ? ? ? ? ? ?0:00
> smtpd -n smtp -t inet -u -c
> 0 ? ? 0 19735 13068 ?20 ? 0 ? 7548 ? 880 - ? ? ?S+ ? pts/0 ? ? ?0:00 grep D
>
> dmesg shows this message many times:
>
> [11142.334229] INFO: task master:14220 blocked for more than 120 seconds.
> [11142.334266] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [11142.334310] master ? ? ? ?D ffff88032b644800 ? ? 0 14220 ?14219
> 0x00000000
> [11142.334315] ?ffff88062dd40000 0000000000000086 0000000000000000
> ffffffffa02628d9
> [11142.334318] ?ffff88017a517ef8 000000000000fa40 ffff88017a517fd8
> 0000000000016940
> [11142.334322] ?0000000000016940 ffff88032b644800 ffff88032b644af8
> 0000000b7a517cd8
> [11142.334325] Call Trace:
> [11142.334340] ?[<ffffffffa02628d9>] ? gfs2_glock_put+0xf9/0x118 [gfs2]
> [11142.334347] ?[<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2]
> [11142.334353] ?[<ffffffffa0261db9>] ? gfs2_glock_holder_wait+0x9/0xd [gfs2]
> [11142.334358] ?[<ffffffff812e9897>] ? __wait_on_bit+0x41/0x70
> [11142.334363] ?[<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2]
> [11142.334367] ?[<ffffffff812e9931>] ? out_of_line_wait_on_bit+0x6b/0x77
> [11142.334370] ?[<ffffffff81066808>] ? wake_bit_function+0x0/0x23
> [11142.334376] ?[<ffffffffa0261d9e>] ? gfs2_glock_wait+0x23/0x28 [gfs2]
> [11142.334383] ?[<ffffffffa026b2b0>] ? gfs2_flock+0x17c/0x1f9 [gfs2]
> [11142.334386] ?[<ffffffff810e735d>] ? virt_to_head_page+0x9/0x2a
> [11142.334389] ?[<ffffffff810e743e>] ? ub_slab_ptr+0x22/0x65
> [11142.334393] ?[<ffffffff8112221b>] ? sys_flock+0xff/0x12a
> [11142.334396] ?[<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
>
> Any idea what is going wrong? Do you need any more informations?
>
> Mario
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From thomas at sjolshagen.net  Tue Jan  4 10:00:56 2011
From: thomas at sjolshagen.net (Thomas Sjolshagen)
Date: Tue, 04 Jan 2011 05:00:56 -0500
Subject: [Linux-cluster] GFS2 locking in a VM based cluster (KVM)
Message-ID: <20110104050056.96196p5ratbwjie0@www.sjolshagen.net>

Hi,

Posted this on IRC just now, but was recommended to try another approach.

I'm wondering if the locks/sec rate I'm seeing between two virtual  
machines is what I should expect - details below:

I've got a 2-node cluster that is actually two KVM VMs (Fedora 14 in  
the guests/vms). Between the two VM's, I'm sharing - iSCSI based  
external array using virtio_net drivers in the guests/vms - two GFS2  
file systems. After setting <dlm> and <gfs_controld> as listed below  
in cluster.conf (and, of course, rebooting the whole cluster), I'm  
still only seeing 600-650 locks/sec when using the ping_pong utility.

<dlm plock_ownership="1" plock_rate_limit="0" protocol="sctp"/>
<gfs_controld plock_rate_limit="0"/>

The same settings & configuration on the physical hosts (a separate  
cluster, but using the same iSCSI array) is seeing 5.5-6K locks/sec.

I'm also curious as to why, when starting ping_pong on the virtualized  
node1, I see ~20-40K locks/sec, but if I start and then stop ping_pong  
on the virtualized node2, node1 never returns to anything close to the  
20-40K locks/sec. Is that (also) expected?

Both physical & virtual machines are running Fedora 14  
w/gfs2-utils-3.1.0-3.fc14.x86_64 & gfs2-cluster-3.1.0-3.fc14.x86_64.

// Thomas

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.




From swhiteho at redhat.com  Tue Jan  4 10:24:03 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Tue, 04 Jan 2011 10:24:03 +0000
Subject: [Linux-cluster] GFS2 locking in a VM based cluster (KVM)
In-Reply-To: <20110104050056.96196p5ratbwjie0@www.sjolshagen.net>
References: <20110104050056.96196p5ratbwjie0@www.sjolshagen.net>
Message-ID: <1294136643.2455.1.camel@dolmen>

Hi,

On Tue, 2011-01-04 at 05:00 -0500, Thomas Sjolshagen wrote:
> Hi,
> 
> Posted this on IRC just now, but was recommended to try another approach.
> 
> I'm wondering if the locks/sec rate I'm seeing between two virtual  
> machines is what I should expect - details below:
> 
> I've got a 2-node cluster that is actually two KVM VMs (Fedora 14 in  
> the guests/vms). Between the two VM's, I'm sharing - iSCSI based  
> external array using virtio_net drivers in the guests/vms - two GFS2  
> file systems. After setting <dlm> and <gfs_controld> as listed below  
> in cluster.conf (and, of course, rebooting the whole cluster), I'm  
> still only seeing 600-650 locks/sec when using the ping_pong utility.
> 
> <dlm plock_ownership="1" plock_rate_limit="0" protocol="sctp"/>
> <gfs_controld plock_rate_limit="0"/>
> 
> The same settings & configuration on the physical hosts (a separate  
> cluster, but using the same iSCSI array) is seeing 5.5-6K locks/sec.
> 
> I'm also curious as to why, when starting ping_pong on the virtualized  
> node1, I see ~20-40K locks/sec, but if I start and then stop ping_pong  
> on the virtualized node2, node1 never returns to anything close to the  
> 20-40K locks/sec. Is that (also) expected?
> 
Did you have the second node mounted when you got the faster locking
rates on the first node?

Steve.




From thomas at sjolshagen.net  Tue Jan  4 10:45:35 2011
From: thomas at sjolshagen.net (Thomas Sjolshagen)
Date: Tue, 04 Jan 2011 05:45:35 -0500
Subject: [Linux-cluster] GFS2 locking in a VM based cluster (KVM)
In-Reply-To: <1294136643.2455.1.camel@dolmen>
References: <20110104050056.96196p5ratbwjie0@www.sjolshagen.net>
	<1294136643.2455.1.camel@dolmen>
Message-ID: <d83d2aa32f318913f000415dee42f73a@sjolshagen.net>

 On Tue, 04 Jan 2011 10:24:03 +0000, Steven Whitehouse 
 <swhiteho at redhat.com> wrote:
> Hi,
>
> On Tue, 2011-01-04 at 05:00 -0500, Thomas Sjolshagen wrote:
>> Hi,
>>
>> Posted this on IRC just now, but was recommended to try another 
>> approach.
>>
>> I'm wondering if the locks/sec rate I'm seeing between two virtual
>> machines is what I should expect - details below:
>>
>> I've got a 2-node cluster that is actually two KVM VMs (Fedora 14 in
>> the guests/vms). Between the two VM's, I'm sharing - iSCSI based
>> external array using virtio_net drivers in the guests/vms - two GFS2
>> file systems. After setting <dlm> and <gfs_controld> as listed below
>> in cluster.conf (and, of course, rebooting the whole cluster), I'm
>> still only seeing 600-650 locks/sec when using the ping_pong 
>> utility.
>>
>> <dlm plock_ownership="1" plock_rate_limit="0" protocol="sctp"/>
>> <gfs_controld plock_rate_limit="0"/>
>>
>> The same settings & configuration on the physical hosts (a separate
>> cluster, but using the same iSCSI array) is seeing 5.5-6K locks/sec.
>>
>> I'm also curious as to why, when starting ping_pong on the 
>> virtualized
>> node1, I see ~20-40K locks/sec, but if I start and then stop 
>> ping_pong
>> on the virtualized node2, node1 never returns to anything close to 
>> the
>> 20-40K locks/sec. Is that (also) expected?
>>
> Did you have the second node mounted when you got the faster locking
> rates on the first node?
>
 Yes, both nodes mounted (before and after starting ping_pong on the 2nd 
 node, no umount inbetween)

 // Thomas



From emilio at ugr.es  Tue Jan  4 11:27:52 2011
From: emilio at ugr.es (Emilio Arjona)
Date: Tue, 4 Jan 2011 12:27:52 +0100
Subject: [Linux-cluster] Processes in D state
In-Reply-To: <AANLkTik-=E2S0FG2psa2=TsBaq=T3H2d1pJHCiKAbSyj@mail.gmail.com>
References: <4D2261C5.8000802@inwx.de>
	<AANLkTik-=E2S0FG2psa2=TsBaq=T3H2d1pJHCiKAbSyj@mail.gmail.com>
Message-ID: <AANLkTindMJXuFhAt_ELHPk=62QPKRi3M_dSVrrEJ2XoL@mail.gmail.com>

Same problem here,

in a webserver cluster httpd run into D state sometimes. I have to restart
the node or even the whole cluster if there are more than one node locked.
I'm using REDHAT 5.4 and HP hardware.

Regards,

2011/1/4 Paras pradhan <pradhanparas at gmail.com>

> I had the same problem. it locked the whole gfs cluster and had to
> reboot the node. after reboot all is fine now but still trying to find
> out what has caused it.
>
> Paras
>
> On Monday, January 3, 2011, InterNetworX | Hostmaster
> <hostmaster at inwx.de> wrote:
> > Hello,
> >
> > we are using GFS2 but sometimes there are processes hanging in D state:
> >
> > # ps axl | grep D
> > F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME
> COMMAND
> > 0     0 14220 14219  20   0  19624  1916 -      Ds   ?          0:00
> > /usr/lib/postfix/master -t
> > 0     0 14555 14498  20   0  16608  1716 -      D+
> > /mnt/storage/openvz/root/129/dev/pts/0   0:00 apt-get install less
> > 0     0 15068 15067  19  -1  36844  2156 -      D<s  ?          0:00
> > /usr/lib/postfix/master -t
> > 0     0 16603 16602  19  -1  36844  2156 -      D<s  ?          0:00
> > /usr/lib/postfix/master -t
> > 4   101 19534 13238  19  -1  33132  2984 -      D<   ?          0:00
> > smtpd -n smtp -t inet -u -c
> > 4   101 19542 13238  19  -1  33116  2976 -      D<   ?          0:00
> > smtpd -n smtp -t inet -u -c
> > 0     0 19735 13068  20   0   7548   880 -      S+   pts/0      0:00 grep
> D
> >
> > dmesg shows this message many times:
> >
> > [11142.334229] INFO: task master:14220 blocked for more than 120 seconds.
> > [11142.334266] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [11142.334310] master        D ffff88032b644800     0 14220  14219
> > 0x00000000
> > [11142.334315]  ffff88062dd40000 0000000000000086 0000000000000000
> > ffffffffa02628d9
> > [11142.334318]  ffff88017a517ef8 000000000000fa40 ffff88017a517fd8
> > 0000000000016940
> > [11142.334322]  0000000000016940 ffff88032b644800 ffff88032b644af8
> > 0000000b7a517cd8
> > [11142.334325] Call Trace:
> > [11142.334340]  [<ffffffffa02628d9>] ? gfs2_glock_put+0xf9/0x118 [gfs2]
> > [11142.334347]  [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd
> [gfs2]
> > [11142.334353]  [<ffffffffa0261db9>] ? gfs2_glock_holder_wait+0x9/0xd
> [gfs2]
> > [11142.334358]  [<ffffffff812e9897>] ? __wait_on_bit+0x41/0x70
> > [11142.334363]  [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd
> [gfs2]
> > [11142.334367]  [<ffffffff812e9931>] ? out_of_line_wait_on_bit+0x6b/0x77
> > [11142.334370]  [<ffffffff81066808>] ? wake_bit_function+0x0/0x23
> > [11142.334376]  [<ffffffffa0261d9e>] ? gfs2_glock_wait+0x23/0x28 [gfs2]
> > [11142.334383]  [<ffffffffa026b2b0>] ? gfs2_flock+0x17c/0x1f9 [gfs2]
> > [11142.334386]  [<ffffffff810e735d>] ? virt_to_head_page+0x9/0x2a
> > [11142.334389]  [<ffffffff810e743e>] ? ub_slab_ptr+0x22/0x65
> > [11142.334393]  [<ffffffff8112221b>] ? sys_flock+0xff/0x12a
> > [11142.334396]  [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
> >
> > Any idea what is going wrong? Do you need any more informations?
> >
> > Mario
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



-- 
*******************************************
Emilio Arjona Heredia
Centro de Ense?anzas Virtuales de la Universidad de Granada
C/ Real de Cartuja 36-38
http://cevug.ugr.es
Tlfno.: 958-241000 ext. 20206
*******************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110104/96c37afe/attachment.htm>

From thomas at sjolshagen.net  Tue Jan  4 12:16:11 2011
From: thomas at sjolshagen.net (Thomas Sjolshagen)
Date: Tue, 04 Jan 2011 07:16:11 -0500
Subject: [Linux-cluster] GFS2 locking in a VM based cluster (KVM)
In-Reply-To: <d83d2aa32f318913f000415dee42f73a@sjolshagen.net>
References: <20110104050056.96196p5ratbwjie0@www.sjolshagen.net>
	<1294136643.2455.1.camel@dolmen>
	<d83d2aa32f318913f000415dee42f73a@sjolshagen.net>
Message-ID: <4e54d6ded4d62fa4fa0f8498384a4448@sjolshagen.net>

 On Tue, 04 Jan 2011 05:45:35 -0500, Thomas Sjolshagen 
 <thomas at sjolshagen.net> wrote:

>>
> Yes, both nodes mounted (before and after starting ping_pong on the
> 2nd node, no umount inbetween)
>

 Well, disabling the 2nd corosync ring ( <altname /> in cluster.conf) 
 seems to have almost doubled the # of locks/sec (from 590-650 to 
 1000-1100ish)

 // Thomas



From adrew at redhat.com  Tue Jan  4 15:27:53 2011
From: adrew at redhat.com (Adam Drew)
Date: Tue, 4 Jan 2011 10:27:53 -0500 (EST)
Subject: [Linux-cluster] Processes in D state
In-Reply-To: <AANLkTindMJXuFhAt_ELHPk=62QPKRi3M_dSVrrEJ2XoL@mail.gmail.com>
Message-ID: <1599288175.144252.1294154873625.JavaMail.root@zmail01.collab.prod.int.phx2.redhat.com>

Hello,

Processes accessing a GFS2 filesystem falling into D state is typically indicative of lock contention; however, other causes are also possible. D state is uninterruptable sleep waiting on IO. With regards to GFS2 this means that a PID has requested access to some object on disk and has not yet gained access to that object. As the PID cannot proceed until granted access it is hung in D state.

The most common cause of D state PIDs on GFS2 is lock contention. GFS2's shared locking system is more complex than traditional single-node filesystems. You can run into a situation where a given PID is locking a resource but is waiting in line for a lock on another resource to be released where the holder of that second resource is waiting on the PID holding the first to release it as well. This causes a deadlock where neither process can make process, both end up in D state, and so will any process that requests access to either of those resources as well. In other cases PIDs requesting access to a resource on disk may build up faster than than they release them. In this case the queue of waiters will build and build until the filesystem grinds to a halt and appears to "hang." In other cases bugs or design issues may lead to locking bottlenecks.

GFS2 locks are arbitrated in the glock (pronounced gee-lock) layer. The glock subsystem is exposed via debugfs. You can mount debugfs, look in the gfs2 directory, and view the glocks. You can then match up the glocks to the process list on the system and to the messages logs. Doing this for every node in the cluster can reveal problems. If you have Red Hat support I encourage you to engage them as learning to read glocks can be non-trivial process but it is not impossible. They are documented to a degree in the following documents:

"Testing and verification of cluster filesystems" by Steven Whitehouse
http://www.kernel.org/doc/ols/2009/ols2009-pages-311-318.pdf

Global File System 2, Edition 7, section 1.4. "GFS2 Node Locking"
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Global_File_System_2/index.html#s1-ov-lockbounce

More information is available out on the web.

Regards,
Adam Drew

----- Original Message -----
From: "Emilio Arjona" <emilio at ugr.es>
To: "linux clustering" <linux-cluster at redhat.com>
Sent: Tuesday, January 4, 2011 6:27:52 AM
Subject: Re: [Linux-cluster] Processes in D state


Same problem here, 


in a webserver cluster httpd run into D state sometimes. I have to restart the node or even the whole cluster if there are more than one node locked. I'm using REDHAT 5.4 and HP hardware. 


Regards, 


2011/1/4 Paras pradhan < pradhanparas at gmail.com > 


I had the same problem. it locked the whole gfs cluster and had to 
reboot the node. after reboot all is fine now but still trying to find 
out what has caused it. 

Paras 

On Monday, January 3, 2011, InterNetworX | Hostmaster 



< hostmaster at inwx.de > wrote: 
> Hello, 
> 
> we are using GFS2 but sometimes there are processes hanging in D state: 
> 
> # ps axl | grep D 
> F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND 
> 0 0 14220 14219 20 0 19624 1916 - Ds ? 0:00 
> /usr/lib/postfix/master -t 
> 0 0 14555 14498 20 0 16608 1716 - D+ 
> /mnt/storage/openvz/root/129/dev/pts/0 0:00 apt-get install less 
> 0 0 15068 15067 19 -1 36844 2156 - D<s ? 0:00 
> /usr/lib/postfix/master -t 
> 0 0 16603 16602 19 -1 36844 2156 - D<s ? 0:00 
> /usr/lib/postfix/master -t 
> 4 101 19534 13238 19 -1 33132 2984 - D< ? 0:00 
> smtpd -n smtp -t inet -u -c 
> 4 101 19542 13238 19 -1 33116 2976 - D< ? 0:00 
> smtpd -n smtp -t inet -u -c 
> 0 0 19735 13068 20 0 7548 880 - S+ pts/0 0:00 grep D 
> 
> dmesg shows this message many times: 
> 
> [11142.334229] INFO: task master:14220 blocked for more than 120 seconds. 
> [11142.334266] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message. 
> [11142.334310] master D ffff88032b644800 0 14220 14219 
> 0x00000000 
> [11142.334315] ffff88062dd40000 0000000000000086 0000000000000000 
> ffffffffa02628d9 
> [11142.334318] ffff88017a517ef8 000000000000fa40 ffff88017a517fd8 
> 0000000000016940 
> [11142.334322] 0000000000016940 ffff88032b644800 ffff88032b644af8 
> 0000000b7a517cd8 
> [11142.334325] Call Trace: 
> [11142.334340] [<ffffffffa02628d9>] ? gfs2_glock_put+0xf9/0x118 [gfs2] 
> [11142.334347] [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2] 
> [11142.334353] [<ffffffffa0261db9>] ? gfs2_glock_holder_wait+0x9/0xd [gfs2] 
> [11142.334358] [<ffffffff812e9897>] ? __wait_on_bit+0x41/0x70 
> [11142.334363] [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2] 
> [11142.334367] [<ffffffff812e9931>] ? out_of_line_wait_on_bit+0x6b/0x77 
> [11142.334370] [<ffffffff81066808>] ? wake_bit_function+0x0/0x23 
> [11142.334376] [<ffffffffa0261d9e>] ? gfs2_glock_wait+0x23/0x28 [gfs2] 
> [11142.334383] [<ffffffffa026b2b0>] ? gfs2_flock+0x17c/0x1f9 [gfs2] 
> [11142.334386] [<ffffffff810e735d>] ? virt_to_head_page+0x9/0x2a 
> [11142.334389] [<ffffffff810e743e>] ? ub_slab_ptr+0x22/0x65 
> [11142.334393] [<ffffffff8112221b>] ? sys_flock+0xff/0x12a 
> [11142.334396] [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b 
> 
> Any idea what is going wrong? Do you need any more informations? 
> 
> Mario 
> 
> -- 
> Linux-cluster mailing list 
> Linux-cluster at redhat.com 
> https://www.redhat.com/mailman/listinfo/linux-cluster 
> 

-- 
Linux-cluster mailing list 
Linux-cluster at redhat.com 
https://www.redhat.com/mailman/listinfo/linux-cluster 



-- 
******************************************* 
Emilio Arjona Heredia 
Centro de Ense?anzas Virtuales de la Universidad de Granada 
C/ Real de Cartuja 36-38 
http://cevug.ugr.es 
Tlfno.: 958-241000 ext. 20206 
******************************************* 


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From nukejun at gmail.com  Tue Jan  4 18:42:45 2011
From: nukejun at gmail.com (juncheol park)
Date: Tue, 4 Jan 2011 11:42:45 -0700
Subject: [Linux-cluster] GFS block size
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F06A12904@hugo.eprize.local>
References: <AcuePESZzbywCbvARqOcjjroAVoIxQ==>
	<64D0546C5EBBD147B75DE133D798665F06A12904@hugo.eprize.local>
Message-ID: <AANLkTi=mmdGchaJrOLzdtv3th0rtcuWscnsQTStScufw@mail.gmail.com>

I also experimented 1k block size on GFS1. Although you can improve
the disk usage using a smaller block size, typically it is recommended
to use the block size same as the page size, which is 4k in Linux.

I don't remember all the details of results. However, for large files,
the overall performance of read/write operations with 1k block size
was much worse than the one with 4k block size. This is obvious,
though. If you don't care any performance degradation for large files,
it would be fine for you to use 1k.

Just my two cents,

-Jun


On Fri, Dec 17, 2010 at 3:53 PM, Jeff Sturm <jeff.sturm at eprize.com> wrote:
> One of our GFS filesystems tends to have a large number of very small files,
> on average about 1000 bytes each.
>
>
>
> I realized this week we'd created our filesystems with default options.? As
> an experiment on a test system, I've recreated a GFS filesystem with "-b
> 1024" to reduce overall disk usage and disk bandwidth.
>
>
>
> Initially, tests look very good?single file creates are less than one
> millisecond on average (down from about 5ms each).? Before I go very far
> with this, I wanted to ask:? Has anyone else experimented with the block
> size option, and are there any tricks or gotchas to report?
>
>
>
> (This is with CentOS 5.5, GFS 1.)
>
>
>
> -Jeff
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From adrew at redhat.com  Tue Jan  4 19:17:38 2011
From: adrew at redhat.com (Adam Drew)
Date: Tue, 4 Jan 2011 14:17:38 -0500 (EST)
Subject: [Linux-cluster] GFS block size
In-Reply-To: <AANLkTi=mmdGchaJrOLzdtv3th0rtcuWscnsQTStScufw@mail.gmail.com>
Message-ID: <102121344.150135.1294168658370.JavaMail.root@zmail01.collab.prod.int.phx2.redhat.com>

If your average file size is less than 1K then using a block size of 1k may be a good option. If you can fit your data in a single block you get the minor performance boost of using a stuffed inode so you never have to walk a list from your inode to your data block. The performance boost should be small but could add up to larger gains over time with lots of transactions. If your average data payload is less than the default block-size however, you'll end up losing the delta. So, from a filesystem perspective, using a 1k blocksize to store mostly sub-1k files may be a good idea. 

You additionally may want to experiment with reducing your resource group size. Blocks are organized into resource groups. If you are using 1k blocks and sub-1k files then you'll end up with tons of stuffed inodes per resource group. Some operations in GFS require locking the resource group metadata (such as deletes) so you may start to experience performance bottle-necks depending on usage patterns and disk layout.

All-in-all I'd be skeptical of the claim of large performance gains over time by changing rg size and block size but modest gains may be had. Still, some access patterns and filesystem layouts may experience greater performance gains with such tweaking. However, I would expect to see the most significant gains (in GFS1 at least) made by mount options and tuneables.

Regards,
Adam Drew

----- Original Message -----
From: "juncheol park" <nukejun at gmail.com>
To: "linux clustering" <linux-cluster at redhat.com>
Sent: Tuesday, January 4, 2011 1:42:45 PM
Subject: Re: [Linux-cluster] GFS block size

I also experimented 1k block size on GFS1. Although you can improve
the disk usage using a smaller block size, typically it is recommended
to use the block size same as the page size, which is 4k in Linux.

I don't remember all the details of results. However, for large files,
the overall performance of read/write operations with 1k block size
was much worse than the one with 4k block size. This is obvious,
though. If you don't care any performance degradation for large files,
it would be fine for you to use 1k.

Just my two cents,

-Jun


On Fri, Dec 17, 2010 at 3:53 PM, Jeff Sturm <jeff.sturm at eprize.com> wrote:
> One of our GFS filesystems tends to have a large number of very small files,
> on average about 1000 bytes each.
>
>
>
> I realized this week we'd created our filesystems with default options.? As
> an experiment on a test system, I've recreated a GFS filesystem with "-b
> 1024" to reduce overall disk usage and disk bandwidth.
>
>
>
> Initially, tests look very good?single file creates are less than one
> millisecond on average (down from about 5ms each).? Before I go very far
> with this, I wanted to ask:? Has anyone else experimented with the block
> size option, and are there any tricks or gotchas to report?
>
>
>
> (This is with CentOS 5.5, GFS 1.)
>
>
>
> -Jeff
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From sdake at redhat.com  Tue Jan  4 20:59:29 2011
From: sdake at redhat.com (Steven Dake)
Date: Tue, 04 Jan 2011 13:59:29 -0700
Subject: [Linux-cluster] [Openais] packet dissectors for totempg, cman,
 clvmd, rgmanager, cpg, 
In-Reply-To: <4D07A04C.9020703@redhat.com>
References: <20100527.133950.593311767624382812.yamato@redhat.com>	<20101214.210216.512133496326900668.yamato@redhat.com>	<4D0776DC.9080003@redhat.com>
	<20101214.231525.648039044490713397.yamato@redhat.com>
	<4D078465.3020509@redhat.com> <4D079B49.8070009@redhat.com>
	<4D07A04C.9020703@redhat.com>
Message-ID: <4D238A31.5040206@redhat.com>

On 12/14/2010 09:50 AM, Jan Friesse wrote:
> Steven Dake napsal(a):
>> On 12/14/2010 07:51 AM, Jan Friesse wrote:
>>> Masatake,
>>>
> 
> ....
> 
>>>> Thank you.
>>> Regards,
>>>   Honza
>>
>>
>> I am not changing corosync license to GPL.  I think the separate plugin
>> works fine, and we can even take up packaging of it in fedora and Red
>> Hat variants, if it is maintained in an upstream repo.
>>
>> Regards
>> -steve
> 
> Steve,
> I'm not talking about relicensing corosync (it doesn't make any sense
> and I would be first against that), but give permissions to that portion
> of code (seems to be more or less header files) to use GPL (which also
> seems to me like old version without support for NSS). It's same as what
> we did for libqb.
> 
> Separate plugin works fine for Fedora, but I'm not sure if it works also
> for other distributions.
> 
> 
> Regards,
>   Honza

What headers are needed?  We can likely provide a GPL version of the
headers for third party projects to use.  One issue is likely the use of
libtomcrypt which has a "public domain" license which we did not write
nor can re-license.

Regards
-steve



From sdake at redhat.com  Tue Jan  4 21:02:56 2011
From: sdake at redhat.com (Steven Dake)
Date: Tue, 04 Jan 2011 14:02:56 -0700
Subject: [Linux-cluster] [Openais] packet dissectors for totempg, cman,
 clvmd, rgmanager, cpg, 
In-Reply-To: <20101215.000429.721897046580218183.yamato@redhat.com>
References: <4D0776DC.9080003@redhat.com>	<20101214.231525.648039044490713397.yamato@redhat.com>	<4D078465.3020509@redhat.com>
	<20101215.000429.721897046580218183.yamato@redhat.com>
Message-ID: <4D238B00.5010103@redhat.com>

On 12/14/2010 08:04 AM, Masatake YAMATO wrote:
> Thank you for replying.
> 
>> Masatake,
>>
>> Masatake YAMATO napsal(a):
>>> I'd like to your advice more detail seriously.
>>> I've been developing this code for three years.
>>> I don't want to make this code garbage.
>>>
>>>> Masatake,
>>>> I'm pretty sure that biggest problem of your code was that it was
>>>> licensed under BSD (three clause, same as Corosync has)
>>>> license. Wireshark is licensed under GPL and even I like BSD licenses
>>>> much more, I would recommend you to try to relicense code under GPL
>>>> and send them this code.
>>>>
>>>> Regards,
>>>>   Honza
>>> I got the similar comment from wireshark developer.
>>> Please, read the discussion:
>>> 	https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=3232
>>>
>>
>> I've read that thread long time before I've sent previous mail, so
>> thats reason why I think that Wireshark developers just feel MUCH more
>> comfortable with GPL and thats reason why they just ignoring it.
> 
> I see.
>  
>>> In my understanding there is no legal problem in putting 3-clause BSD
>>> code into GPL code.  Acutally wireshark includes some 3-clause BSD
>>> code:
>>>
>>
>> Actually there is really not. BSD to GPL works without problem, but
>> many people just don't know it...
> 
> ...it is too bad. I strongly believe FOSS developers should know the
> intent behind of the both licenses.
>  
>>> epan/dissectors/packet-radiotap-defs.h:
>>> /*-
>>>  * Copyright (c) 2003, 2004 David Young.  All rights reserved.
>>>  *
>>>  * $Id: packet-radiotap-defs.h 34554 2010-10-18 13:24:10Z morriss $
>>>  *
>>>  * Redistribution and use in source and binary forms, with or without
>>>  * modification, are permitted provided that the following conditions
>>>  * are met:
>>>  * 1. Redistributions of source code must retain the above copyright
>>>  *    notice, this list of conditions and the following disclaimer.
>>>  * 2. Redistributions in binary form must reproduce the above copyright
>>>  *    notice, this list of conditions and the following disclaimer in the
>>>  *    documentation and/or other materials provided with the distribution.
>>>  * 3. The name of David Young may not be used to endorse or promote
>>>  *    products derived from this software without specific prior
>>>  *    written permission.
>>>  *
>>>  * THIS SOFTWARE IS PROVIDED BY DAVID YOUNG ``AS IS'' AND ANY
>>>  * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
>>>  * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
>>>  * PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL DAVID
>>>  * YOUNG BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
>>>  * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
>>>  * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>>>  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
>>>  * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
>>>  * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
>>>  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
>>>  * OF SUCH DAMAGE.
>>>  */
>>> I'd like to separate the legal issue and preference. I think I
>>> understand the importance of preference of upstream
>>> developers. However, I'd like to clear the legal issue first.
>>>
>>
>> Legally it's ok. But as you said, developers preference are
>> different. And because you are trying to change THEIR code it's
>> sometimes better to play they rules.
> 
> I see.
>  
>>> I can image there are people who prefer to GPL as the license covering
>>> their software. But here I've taken some corosync code in my
>>> dissector. It is essential part of my dissector. And corosync is
>>
>> ^^^ This may be problem. Question is how big is that part and if it
>> can be possible to make exception there. Can you point that code?
>>
>> Steve, we were able to relicense HUGE portion of code in case of
>> libqb, are we able to make the same for Wireshark dissector?
> 
> Could you see https://github.com/masatake/wireshark-plugin-rhcs/blob/master/src/packet-corosync-totemnet.c#L156
> I refer totemnet.c to write dissect_corosynec_totemnet_with_decryption() function. 
> 
>>> licensed in 3-clause BSD, as you know. I'd like to change the license
>>> to merge my code to upstream project. I cannot do it in this context.
>>> See https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=3232#c13
>>> Thank you.
>>
>> Regards,
>>   Honza
> 
> Masatake YAMATO

Masatake,

Red Hat is the author of the totemnet file and can provide that code
under GPL if you like.  We cannot modify the license for libtomcrypt as
we are not the authors.  Feel free to change the license for that
particular code you rewrote in the link

> Could you see
https://github.com/masatake/wireshark-plugin-rhcs/blob/master/src/packet-corosync-totemnet.c#L156

under a GPL license if it helps move things along.

Regards
-steveu



From yamato at redhat.com  Wed Jan  5 05:56:35 2011
From: yamato at redhat.com (Masatake YAMATO)
Date: Wed, 05 Jan 2011 14:56:35 +0900 (JST)
Subject: [Linux-cluster] [Openais] packet dissectors for totempg, cman,
 clvmd, rgmanager, cpg, 
In-Reply-To: <4D238B00.5010103@redhat.com>
References: <4D078465.3020509@redhat.com>
	<20101215.000429.721897046580218183.yamato@redhat.com>
	<4D238B00.5010103@redhat.com>
Message-ID: <20110105.145635.1000595217822844587.yamato@redhat.com>

Thank you very much.
I'll push my patch again.

Masatake YAMATO

> On 12/14/2010 08:04 AM, Masatake YAMATO wrote:
>> Thank you for replying.
>> 
>>> Masatake,
>>>
>>> Masatake YAMATO napsal(a):
>>>> I'd like to your advice more detail seriously.
>>>> I've been developing this code for three years.
>>>> I don't want to make this code garbage.
>>>>
>>>>> Masatake,
>>>>> I'm pretty sure that biggest problem of your code was that it was
>>>>> licensed under BSD (three clause, same as Corosync has)
>>>>> license. Wireshark is licensed under GPL and even I like BSD licenses
>>>>> much more, I would recommend you to try to relicense code under GPL
>>>>> and send them this code.
>>>>>
>>>>> Regards,
>>>>>   Honza
>>>> I got the similar comment from wireshark developer.
>>>> Please, read the discussion:
>>>> 	https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=3232
>>>>
>>>
>>> I've read that thread long time before I've sent previous mail, so
>>> thats reason why I think that Wireshark developers just feel MUCH more
>>> comfortable with GPL and thats reason why they just ignoring it.
>> 
>> I see.
>>  
>>>> In my understanding there is no legal problem in putting 3-clause BSD
>>>> code into GPL code.  Acutally wireshark includes some 3-clause BSD
>>>> code:
>>>>
>>>
>>> Actually there is really not. BSD to GPL works without problem, but
>>> many people just don't know it...
>> 
>> ...it is too bad. I strongly believe FOSS developers should know the
>> intent behind of the both licenses.
>>  
>>>> epan/dissectors/packet-radiotap-defs.h:
>>>> /*-
>>>>  * Copyright (c) 2003, 2004 David Young.  All rights reserved.
>>>>  *
>>>>  * $Id: packet-radiotap-defs.h 34554 2010-10-18 13:24:10Z morriss $
>>>>  *
>>>>  * Redistribution and use in source and binary forms, with or without
>>>>  * modification, are permitted provided that the following conditions
>>>>  * are met:
>>>>  * 1. Redistributions of source code must retain the above copyright
>>>>  *    notice, this list of conditions and the following disclaimer.
>>>>  * 2. Redistributions in binary form must reproduce the above copyright
>>>>  *    notice, this list of conditions and the following disclaimer in the
>>>>  *    documentation and/or other materials provided with the distribution.
>>>>  * 3. The name of David Young may not be used to endorse or promote
>>>>  *    products derived from this software without specific prior
>>>>  *    written permission.
>>>>  *
>>>>  * THIS SOFTWARE IS PROVIDED BY DAVID YOUNG ``AS IS'' AND ANY
>>>>  * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
>>>>  * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
>>>>  * PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL DAVID
>>>>  * YOUNG BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
>>>>  * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
>>>>  * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>>>>  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
>>>>  * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
>>>>  * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
>>>>  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
>>>>  * OF SUCH DAMAGE.
>>>>  */
>>>> I'd like to separate the legal issue and preference. I think I
>>>> understand the importance of preference of upstream
>>>> developers. However, I'd like to clear the legal issue first.
>>>>
>>>
>>> Legally it's ok. But as you said, developers preference are
>>> different. And because you are trying to change THEIR code it's
>>> sometimes better to play they rules.
>> 
>> I see.
>>  
>>>> I can image there are people who prefer to GPL as the license covering
>>>> their software. But here I've taken some corosync code in my
>>>> dissector. It is essential part of my dissector. And corosync is
>>>
>>> ^^^ This may be problem. Question is how big is that part and if it
>>> can be possible to make exception there. Can you point that code?
>>>
>>> Steve, we were able to relicense HUGE portion of code in case of
>>> libqb, are we able to make the same for Wireshark dissector?
>> 
>> Could you see https://github.com/masatake/wireshark-plugin-rhcs/blob/master/src/packet-corosync-totemnet.c#L156
>> I refer totemnet.c to write dissect_corosynec_totemnet_with_decryption() function. 
>> 
>>>> licensed in 3-clause BSD, as you know. I'd like to change the license
>>>> to merge my code to upstream project. I cannot do it in this context.
>>>> See https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=3232#c13
>>>> Thank you.
>>>
>>> Regards,
>>>   Honza
>> 
>> Masatake YAMATO
> 
> Masatake,
> 
> Red Hat is the author of the totemnet file and can provide that code
> under GPL if you like.  We cannot modify the license for libtomcrypt as
> we are not the authors.  Feel free to change the license for that
> particular code you rewrote in the link
> 
>> Could you see
> https://github.com/masatake/wireshark-plugin-rhcs/blob/master/src/packet-corosync-totemnet.c#L156
> 
> under a GPL license if it helps move things along.
> 
> Regards
> -steveu



From hostmaster at inwx.de  Wed Jan  5 12:56:25 2011
From: hostmaster at inwx.de (InterNetworX | Hostmaster)
Date: Wed, 05 Jan 2011 13:56:25 +0100
Subject: [Linux-cluster] Processes in D state
In-Reply-To: <1599288175.144252.1294154873625.JavaMail.root@zmail01.collab.prod.int.phx2.redhat.com>
References: <1599288175.144252.1294154873625.JavaMail.root@zmail01.collab.prod.int.phx2.redhat.com>
Message-ID: <4D246A79.7020807@inwx.de>

Hi Adam,

thanks for your help. One problem was, that we did not mounted the GFS2
file system with no noatime and nodiratime options.

We still have a problem with postfix. The gfs2 hang analyzer says:

There is 1 glock with waiters.
node4, pid 20902 is waiting for glock 6/11486739, which is held by pid 12382

Both PIDs are on the some node:

root     12382  0.0  0.0  36844  2300 ?        Ss   12:39   0:00
/usr/lib/postfix/master
root     20902  0.0  0.0  36844  2156 ?        Ds   12:45   0:00
/usr/lib/postfix/master -t

I have no idea what Postfix is trying to do here?!

Mario

Am 04.01.11 16:27, schrieb Adam Drew:
> Hello,
> 
> Processes accessing a GFS2 filesystem falling into D state is typically indicative of lock contention; however, other causes are also possible. D state is uninterruptable sleep waiting on IO. With regards to GFS2 this means that a PID has requested access to some object on disk and has not yet gained access to that object. As the PID cannot proceed until granted access it is hung in D state.
> 
> The most common cause of D state PIDs on GFS2 is lock contention. GFS2's shared locking system is more complex than traditional single-node filesystems. You can run into a situation where a given PID is locking a resource but is waiting in line for a lock on another resource to be released where the holder of that second resource is waiting on the PID holding the first to release it as well. This causes a deadlock where neither process can make process, both end up in D state, and so will any process that requests access to either of those resources as well. In other cases PIDs requesting access to a resource on disk may build up faster than than they release them. In this case the queue of waiters will build and build until the filesystem grinds to a halt and appears to "hang." In other cases bugs or design issues may lead to locking bottlenecks.
> 
> GFS2 locks are arbitrated in the glock (pronounced gee-lock) layer. The glock subsystem is exposed via debugfs. You can mount debugfs, look in the gfs2 directory, and view the glocks. You can then match up the glocks to the process list on the system and to the messages logs. Doing this for every node in the cluster can reveal problems. If you have Red Hat support I encourage you to engage them as learning to read glocks can be non-trivial process but it is not impossible. They are documented to a degree in the following documents:
> 
> "Testing and verification of cluster filesystems" by Steven Whitehouse
> http://www.kernel.org/doc/ols/2009/ols2009-pages-311-318.pdf
> 
> Global File System 2, Edition 7, section 1.4. "GFS2 Node Locking"
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Global_File_System_2/index.html#s1-ov-lockbounce
> 
> More information is available out on the web.
> 
> Regards,
> Adam Drew
> 
> ----- Original Message -----
> From: "Emilio Arjona" <emilio at ugr.es>
> To: "linux clustering" <linux-cluster at redhat.com>
> Sent: Tuesday, January 4, 2011 6:27:52 AM
> Subject: Re: [Linux-cluster] Processes in D state
> 
> 
> Same problem here, 
> 
> 
> in a webserver cluster httpd run into D state sometimes. I have to restart the node or even the whole cluster if there are more than one node locked. I'm using REDHAT 5.4 and HP hardware. 
> 
> 
> Regards, 
> 
> 
> 2011/1/4 Paras pradhan < pradhanparas at gmail.com > 
> 
> 
> I had the same problem. it locked the whole gfs cluster and had to 
> reboot the node. after reboot all is fine now but still trying to find 
> out what has caused it. 
> 
> Paras 
> 
> On Monday, January 3, 2011, InterNetworX | Hostmaster 
> 
> 
> 
> < hostmaster at inwx.de > wrote: 
>> Hello, 
>>
>> we are using GFS2 but sometimes there are processes hanging in D state: 
>>
>> # ps axl | grep D 
>> F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND 
>> 0 0 14220 14219 20 0 19624 1916 - Ds ? 0:00 
>> /usr/lib/postfix/master -t 
>> 0 0 14555 14498 20 0 16608 1716 - D+ 
>> /mnt/storage/openvz/root/129/dev/pts/0 0:00 apt-get install less 
>> 0 0 15068 15067 19 -1 36844 2156 - D<s ? 0:00 
>> /usr/lib/postfix/master -t 
>> 0 0 16603 16602 19 -1 36844 2156 - D<s ? 0:00 
>> /usr/lib/postfix/master -t 
>> 4 101 19534 13238 19 -1 33132 2984 - D< ? 0:00 
>> smtpd -n smtp -t inet -u -c 
>> 4 101 19542 13238 19 -1 33116 2976 - D< ? 0:00 
>> smtpd -n smtp -t inet -u -c 
>> 0 0 19735 13068 20 0 7548 880 - S+ pts/0 0:00 grep D 
>>
>> dmesg shows this message many times: 
>>
>> [11142.334229] INFO: task master:14220 blocked for more than 120 seconds. 
>> [11142.334266] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
>> disables this message. 
>> [11142.334310] master D ffff88032b644800 0 14220 14219 
>> 0x00000000 
>> [11142.334315] ffff88062dd40000 0000000000000086 0000000000000000 
>> ffffffffa02628d9 
>> [11142.334318] ffff88017a517ef8 000000000000fa40 ffff88017a517fd8 
>> 0000000000016940 
>> [11142.334322] 0000000000016940 ffff88032b644800 ffff88032b644af8 
>> 0000000b7a517cd8 
>> [11142.334325] Call Trace: 
>> [11142.334340] [<ffffffffa02628d9>] ? gfs2_glock_put+0xf9/0x118 [gfs2] 
>> [11142.334347] [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2] 
>> [11142.334353] [<ffffffffa0261db9>] ? gfs2_glock_holder_wait+0x9/0xd [gfs2] 
>> [11142.334358] [<ffffffff812e9897>] ? __wait_on_bit+0x41/0x70 
>> [11142.334363] [<ffffffffa0261db0>] ? gfs2_glock_holder_wait+0x0/0xd [gfs2] 
>> [11142.334367] [<ffffffff812e9931>] ? out_of_line_wait_on_bit+0x6b/0x77 
>> [11142.334370] [<ffffffff81066808>] ? wake_bit_function+0x0/0x23 
>> [11142.334376] [<ffffffffa0261d9e>] ? gfs2_glock_wait+0x23/0x28 [gfs2] 
>> [11142.334383] [<ffffffffa026b2b0>] ? gfs2_flock+0x17c/0x1f9 [gfs2] 
>> [11142.334386] [<ffffffff810e735d>] ? virt_to_head_page+0x9/0x2a 
>> [11142.334389] [<ffffffff810e743e>] ? ub_slab_ptr+0x22/0x65 
>> [11142.334393] [<ffffffff8112221b>] ? sys_flock+0xff/0x12a 
>> [11142.334396] [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b 
>>
>> Any idea what is going wrong? Do you need any more informations? 
>>
>> Mario 
>>
>> -- 
>> Linux-cluster mailing list 
>> Linux-cluster at redhat.com 
>> https://www.redhat.com/mailman/listinfo/linux-cluster 
>>
> 



From jeff.sturm at eprize.com  Wed Jan  5 13:47:44 2011
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Wed, 5 Jan 2011 08:47:44 -0500
Subject: [Linux-cluster] GFS block size
In-Reply-To: <102121344.150135.1294168658370.JavaMail.root@zmail01.collab.prod.int.phx2.redhat.com>
References: <AANLkTi=mmdGchaJrOLzdtv3th0rtcuWscnsQTStScufw@mail.gmail.com>
	<102121344.150135.1294168658370.JavaMail.root@zmail01.collab.prod.int.phx2.redhat.com>
Message-ID: <64D0546C5EBBD147B75DE133D798665F06A1296E@hugo.eprize.local>

Adam,

Thank you for the background on stuffed inodes and resource groups, it is much appreciated.

For this specific application most files are under 1k.  A few are larger (20-30k) but they are rare and so I think we can accommodate a small performance hit for these.  Overall the file system may contain 500,000 or more of these small files at a time.

The improvement we measured is a bit more than "modest".  Our benchmark finishes about 30% faster with the 1k block size compared to 4k.  That's a nice win for a simple change.  Disk bandwidth to/from shared storage might be a factor--we have 12 nodes accessing this storage, so the aggregate bandwidth is considerable.

It has been suggested to me that NFS would yield more performance gains, but I have not attempted this.  RHCS has so far met our expectations of high availability.  Given that NFS is not a cluster file system I'm nervous that such a setup could introduce new points of failure.  (I realize that NFS could be coupled with e.g. DRBD+pacemaker for failover purposes.)

We implemented the typical GFS1 tuneables long ago (noatime, noquota, statfs_fast).  Disabling SELinux also helped.  Checking block size was truly an afterthought, and we had not given any consideration to resource group size either.

I've learned a ton about disk storage by implementing shared storage and clustered filesystems over the past 3 years.  Block devices are a bit "magical" in general, and widely misunderstood by system administrators and software engineers.  (For example, I've heard some fantastic performance claims on ext3 file systems that turned out to demonstrate how effective Linux is at hiding disk latency.)  Thanks again to you and this list for providing continued insight.

-Jeff

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of Adam Drew
> Sent: Tuesday, January 04, 2011 2:18 PM
> To: linux clustering
> Subject: Re: [Linux-cluster] GFS block size
> 
> If your average file size is less than 1K then using a block size of 1k may be a good
> option. If you can fit your data in a single block you get the minor performance boost
> of using a stuffed inode so you never have to walk a list from your inode to your data
> block. The performance boost should be small but could add up to larger gains over
> time with lots of transactions. If your average data payload is less than the default
> block-size however, you'll end up losing the delta. So, from a filesystem perspective,
> using a 1k blocksize to store mostly sub-1k files may be a good idea.
> 
> You additionally may want to experiment with reducing your resource group size.
> Blocks are organized into resource groups. If you are using 1k blocks and sub-1k files
> then you'll end up with tons of stuffed inodes per resource group. Some operations in
> GFS require locking the resource group metadata (such as deletes) so you may start
> to experience performance bottle-necks depending on usage patterns and disk layout.
> 
> All-in-all I'd be skeptical of the claim of large performance gains over time by changing
> rg size and block size but modest gains may be had. Still, some access patterns and
> filesystem layouts may experience greater performance gains with such tweaking.
> However, I would expect to see the most significant gains (in GFS1 at least) made by
> mount options and tuneables.
> 
> Regards,
> Adam Drew
> 
> ----- Original Message -----
> From: "juncheol park" <nukejun at gmail.com>
> To: "linux clustering" <linux-cluster at redhat.com>
> Sent: Tuesday, January 4, 2011 1:42:45 PM
> Subject: Re: [Linux-cluster] GFS block size
> 
> I also experimented 1k block size on GFS1. Although you can improve the disk usage
> using a smaller block size, typically it is recommended to use the block size same as
> the page size, which is 4k in Linux.
> 
> I don't remember all the details of results. However, for large files, the overall
> performance of read/write operations with 1k block size was much worse than the one
> with 4k block size. This is obvious, though. If you don't care any performance
> degradation for large files, it would be fine for you to use 1k.
> 
> Just my two cents,
> 
> -Jun
> 
> 
> On Fri, Dec 17, 2010 at 3:53 PM, Jeff Sturm <jeff.sturm at eprize.com> wrote:
> > One of our GFS filesystems tends to have a large number of very small
> > files, on average about 1000 bytes each.
> >
> >
> >
> > I realized this week we'd created our filesystems with default
> > options.? As an experiment on a test system, I've recreated a GFS
> > filesystem with "-b 1024" to reduce overall disk usage and disk bandwidth.
> >
> >
> >
> > Initially, tests look very good?single file creates are less than one
> > millisecond on average (down from about 5ms each).? Before I go very
> > far with this, I wanted to ask:? Has anyone else experimented with the
> > block size option, and are there any tricks or gotchas to report?
> >
> >
> >
> > (This is with CentOS 5.5, GFS 1.)
> >
> >
> >
> > -Jeff
> >
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From luiceur at gmail.com  Wed Jan  5 16:08:28 2011
From: luiceur at gmail.com (Luis Cebamanos)
Date: Wed, 05 Jan 2011 16:08:28 +0000
Subject: [Linux-cluster] How to re-store my lost data
Message-ID: <4D24977C.1080301@gmail.com>

Dear everyone!

we have recently had an unknown problem with our cluster and we have 
lost some data, including the latest user accounts created.
Does anyone have any idea of how to recover those user accounts and data?
The data haven't been deleted so it should be in somewhere in the disk!!!

Please, any help would be much more appreciated.


Best,

Luis



From linux at alteeve.com  Wed Jan  5 16:28:24 2011
From: linux at alteeve.com (Digimer)
Date: Wed, 05 Jan 2011 11:28:24 -0500
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D24977C.1080301@gmail.com>
References: <4D24977C.1080301@gmail.com>
Message-ID: <4D249C28.7050406@alteeve.com>

On 01/05/2011 11:08 AM, Luis Cebamanos wrote:
> Dear everyone!
> 
> we have recently had an unknown problem with our cluster and we have
> lost some data, including the latest user accounts created.
> Does anyone have any idea of how to recover those user accounts and data?
> The data haven't been deleted so it should be in somewhere in the disk!!!
> 
> Please, any help would be much more appreciated.
> 
> 
> Best,
> 
> Luis

Please provide more details. Specifically, what file system? How did the
data loss occur (as best as you know)? What versions of what cluster
applications? What has been done since then? Where and how was the data
stored? etc.

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



From rpeterso at redhat.com  Wed Jan  5 16:55:08 2011
From: rpeterso at redhat.com (Bob Peterson)
Date: Wed, 5 Jan 2011 11:55:08 -0500 (EST)
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D24977C.1080301@gmail.com>
Message-ID: <1776791552.141901.1294246508264.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>

----- Original Message -----
| Dear everyone!
| 
| we have recently had an unknown problem with our cluster and we have
| lost some data, including the latest user accounts created.
| Does anyone have any idea of how to recover those user accounts and
| data?
| The data haven't been deleted so it should be in somewhere in the
| disk!!!
| 
| Please, any help would be much more appreciated.
| 
| 
| Best,
| 
| Luis

We need lots of more details before we can help.  For example:

Is this clustered storage?  Is it your root mount point?
What file system is this?  GFS?  GFS2?  ext3?  ext4?  Other?
What happened to it?  Did you lose drives in a RAID array?
Did you notice the missing data before or after fsck was run?
We need lots of details.

Regards,

Bob Peterson
Red Hat File Systems



From jcasale at activenetwerx.com  Wed Jan  5 23:37:07 2011
From: jcasale at activenetwerx.com (Joseph L. Casale)
Date: Wed, 5 Jan 2011 23:37:07 +0000
Subject: [Linux-cluster] Service State via snmp
Message-ID: <CA5A491E9DEFBE4CB777DE97E21575E90754CB7D@prato.activenetwerx.local>

Is there any way to query if a service is frozen with snmp, it doesn't appear that
"running" is distinguished from "frozen" in rhcServiceStatusCode and I was hoping
to avoid anything not native to snmp in order to deduce this. I could use an extend
or exec for example but that's not really desirable...

Thanks,
jlc



From parvez.h.shaikh at gmail.com  Thu Jan  6 05:24:40 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Thu, 6 Jan 2011 10:54:40 +0530
Subject: [Linux-cluster] Determining red hat cluster version
Message-ID: <AANLkTinv_bXD2ok_uqAsTLghpT6URGu3CxG-vvVCkYvG@mail.gmail.com>

Hi all,

Is there any command which states Red Hat cluster version?

I tried cman_tool version, and ccs_tool -V both produce different
results, most likely reporting version of their own (not of Cluster
suite)

yum list installed *Cluster*

produces following -

Installed Packages
Cluster_Administration-en-US.noarch
              5.2-1
      installed
.....

Does it mean cluster version is 5.2-1?

Any help will be appreciated.

Thanks



From fdinitto at redhat.com  Thu Jan  6 07:04:53 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Thu, 06 Jan 2011 08:04:53 +0100
Subject: [Linux-cluster] Determining red hat cluster version
In-Reply-To: <AANLkTinv_bXD2ok_uqAsTLghpT6URGu3CxG-vvVCkYvG@mail.gmail.com>
References: <AANLkTinv_bXD2ok_uqAsTLghpT6URGu3CxG-vvVCkYvG@mail.gmail.com>
Message-ID: <4D256995.5040706@redhat.com>

On 1/6/2011 6:24 AM, Parvez Shaikh wrote:
> Hi all,
> 
> Is there any command which states Red Hat cluster version?
> 
> I tried cman_tool version, and ccs_tool -V both produce different
> results, most likely reporting version of their own (not of Cluster
> suite)
> 

rpm -q -f $(which cman_tool) is one option, otherwise you need to parse
cman_tool protocol version manually.

Fabio



From parvez.h.shaikh at gmail.com  Thu Jan  6 07:28:29 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Thu, 6 Jan 2011 12:58:29 +0530
Subject: [Linux-cluster] Determining red hat cluster version
In-Reply-To: <4D256995.5040706@redhat.com>
References: <AANLkTinv_bXD2ok_uqAsTLghpT6URGu3CxG-vvVCkYvG@mail.gmail.com>
	<4D256995.5040706@redhat.com>
Message-ID: <AANLkTik=7sEGRN3Lgq5zLyNUEvu6vq+a71d8iqMEm=C6@mail.gmail.com>

Hi Fabio

This produces output -

cman-2.0.115-29.el5

So does it indicate 2.0.115-29 is version?

On Thu, Jan 6, 2011 at 12:34 PM, Fabio M. Di Nitto <fdinitto at redhat.com> wrote:
> On 1/6/2011 6:24 AM, Parvez Shaikh wrote:
>> Hi all,
>>
>> Is there any command which states Red Hat cluster version?
>>
>> I tried cman_tool version, and ccs_tool -V both produce different
>> results, most likely reporting version of their own (not of Cluster
>> suite)
>>
>
> rpm -q -f $(which cman_tool) is one option, otherwise you need to parse
> cman_tool protocol version manually.
>
> Fabio
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From fdinitto at redhat.com  Thu Jan  6 07:44:52 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Thu, 06 Jan 2011 08:44:52 +0100
Subject: [Linux-cluster] Determining red hat cluster version
In-Reply-To: <AANLkTik=7sEGRN3Lgq5zLyNUEvu6vq+a71d8iqMEm=C6@mail.gmail.com>
References: <AANLkTinv_bXD2ok_uqAsTLghpT6URGu3CxG-vvVCkYvG@mail.gmail.com>	<4D256995.5040706@redhat.com>
	<AANLkTik=7sEGRN3Lgq5zLyNUEvu6vq+a71d8iqMEm=C6@mail.gmail.com>
Message-ID: <4D2572F4.8070903@redhat.com>

On 1/6/2011 8:28 AM, Parvez Shaikh wrote:
> Hi Fabio
> 
> This produces output -
> 
> cman-2.0.115-29.el5
> 
> So does it indicate 2.0.115-29 is version?

yes

Fabio



From parvez.h.shaikh at gmail.com  Thu Jan  6 08:02:52 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Thu, 6 Jan 2011 13:32:52 +0530
Subject: [Linux-cluster] Determining red hat cluster version
In-Reply-To: <4D2572F4.8070903@redhat.com>
References: <AANLkTinv_bXD2ok_uqAsTLghpT6URGu3CxG-vvVCkYvG@mail.gmail.com>
	<4D256995.5040706@redhat.com>
	<AANLkTik=7sEGRN3Lgq5zLyNUEvu6vq+a71d8iqMEm=C6@mail.gmail.com>
	<4D2572F4.8070903@redhat.com>
Message-ID: <AANLkTimFpo95a=StcOMLQ=kp0tMk5H3UF0cUSC3LLRsJ@mail.gmail.com>

Thanks Fabio

Is this version same as what can be referred as version of "Red Hat
Cluster Suite"?

The reason I am asking is, as a part of RHCS there are various
components (Cluster_Administration-en-US, cluster-cim, cluster-snmp,
cman, rgmanager, luci, ricci etc etc) and each of which shows its own
version -

cman has version as below, rgmanager as version 2.0.52. cluster-cim
and cluster-snmp,modcluster has version 0.12.1, system-config-cluster
has 1.0.57 version.

Is there  one version number referring to "Cluster Suite" which would
have encompassed entire set of components (with their own versions may
be)

Gratefully yours

On Thu, Jan 6, 2011 at 1:14 PM, Fabio M. Di Nitto <fdinitto at redhat.com> wrote:
> On 1/6/2011 8:28 AM, Parvez Shaikh wrote:
>> Hi Fabio
>>
>> This produces output -
>>
>> cman-2.0.115-29.el5
>>
>> So does it indicate 2.0.115-29 is version?
>
> yes
>
> Fabio
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From fdinitto at redhat.com  Thu Jan  6 08:18:37 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Thu, 06 Jan 2011 09:18:37 +0100
Subject: [Linux-cluster] Determining red hat cluster version
In-Reply-To: <AANLkTimFpo95a=StcOMLQ=kp0tMk5H3UF0cUSC3LLRsJ@mail.gmail.com>
References: <AANLkTinv_bXD2ok_uqAsTLghpT6URGu3CxG-vvVCkYvG@mail.gmail.com>	<4D256995.5040706@redhat.com>	<AANLkTik=7sEGRN3Lgq5zLyNUEvu6vq+a71d8iqMEm=C6@mail.gmail.com>	<4D2572F4.8070903@redhat.com>
	<AANLkTimFpo95a=StcOMLQ=kp0tMk5H3UF0cUSC3LLRsJ@mail.gmail.com>
Message-ID: <4D257ADD.3080306@redhat.com>

On 1/6/2011 9:02 AM, Parvez Shaikh wrote:
> Thanks Fabio
> 
> Is this version same as what can be referred as version of "Red Hat
> Cluster Suite"?
> 
> The reason I am asking is, as a part of RHCS there are various
> components (Cluster_Administration-en-US, cluster-cim, cluster-snmp,
> cman, rgmanager, luci, ricci etc etc) and each of which shows its own
> version -

Each component is separate for a reason. In general it?s safe enough to
refer to RHCS to the cman version, but when filing bugs, it?s always
best to get the correct version for a specific component.

Fabio



From parvez.h.shaikh at gmail.com  Thu Jan  6 13:53:20 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Thu, 6 Jan 2011 19:23:20 +0530
Subject: [Linux-cluster] configuring bladecenter fence device
Message-ID: <AANLkTinEUqRWFc+GvbbhTR3VCVu-7Q0nikDU6aEywFw0@mail.gmail.com>

Hi all,

>From RHCS documentation, I could see that bladecenter is one of the
fence devices -
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/ap-fence-device-param-CA.html

Table B.9. IBM Blade Center
Field 	Description
Name 	A name for the IBM BladeCenter device connected to the cluster.
IP Address 	The IP address assigned to the device.
Login 	The login name used to access the device.
Password 	The password used to authenticate the connection to the device.
Password Script (optional) 	The script that supplies a password for
access to the fence device. Using this supersedes the Password
parameter.
Blade 	The blade of the device.
Use SSH 	(Rhel 5.4 and later) Indicates that system will use SSH to
access the device.

As per my understanding, IP address is IP address of management module
of IBM blade center, login/password represent credentials to access
the same.

However did not get the parameter 'Blade'. How does it play role in fencing?

In a situation where there  are two blades - Blade-1 and Blade-2 and
if Blade-1 goes down(hardware node failure), Blade-2 should fence out
Blade-1, in that situation fenced on Blade-2 should power off(?)
blade-2 using fence_bladecenter, so how should below sniplet of
cluster.conf file should look like? -


        <clusternodes>
                <clusternode name="blade1" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device blade="?????"
name="BLADECENTER"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="blade2" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device blade="????"
name="BLADECENTER"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>

In which situation fence_bladecenter would be used to power on the blade?

Your gratefully



From bturner at redhat.com  Thu Jan  6 16:06:41 2011
From: bturner at redhat.com (Ben Turner)
Date: Thu, 6 Jan 2011 11:06:41 -0500 (EST)
Subject: [Linux-cluster] configuring bladecenter fence device
In-Reply-To: <AANLkTinEUqRWFc+GvbbhTR3VCVu-7Q0nikDU6aEywFw0@mail.gmail.com>
Message-ID: <752752065.118339.1294330001557.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>

To address:

As per my understanding, IP address is IP address of management module
of IBM blade center, login/password represent credentials to access
the same.

>> Correct.

However did not get the parameter 'Blade'. How does it play role in
fencing?

>> If I recall correctly the blade= is the identifier used to identify the blade in the AMM.  I can't remember if it is a number of a slot or a user defined name.  It corresponds to 

# fence_bladecenter -h 

   -n, --plug=<id>                Physical plug number on device or
                                        name of virtual machine
If the fencing code:

        "port" : {
                "getopt" : "n:",
                "longopt" : "plug",
                "help" : "-n, --plug=<id>                Physical plug number on device or\n" + 
                "                                        name of virtual machine",
                "required" : "1",
                "shortdesc" : "Physical plug number or name of virtual machine",
                "order" : 1 },

To test this try running:

/sbin/fence_bladecenter -a <ip or hostname of bladecenter> -l <login> -p <passwd> -n <blade number of the blade you want to fence> -o status -v

An example cluster.conf looks like:

                <clusternode name="node1" votes="1">
                        <fence>
                                <method name="1">
                                        <device blade="2" name="chassis_fence"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node2" votes="1">
                        <fence>
                                <method name="1">
                                        <device blade="3" name="chassis_fence"/>
                                </method>
                        </fence>

       <fencedevices>
                <fencedevice agent="fence_bladecenter" ipaddr="XXX.XXX.1.143" login="rchs_fence" name="chassis_fence" 
passwd="XXXXXXX"/>
        </fencedevices>

-Ben




----- Original Message -----
> Hi all,
> 
> >From RHCS documentation, I could see that bladecenter is one of the
> fence devices -
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/ap-fence-device-param-CA.html
> 
> Table B.9. IBM Blade Center
> Field Description
> Name A name for the IBM BladeCenter device connected to the cluster.
> IP Address The IP address assigned to the device.
> Login The login name used to access the device.
> Password The password used to authenticate the connection to the
> device.
> Password Script (optional) The script that supplies a password for
> access to the fence device. Using this supersedes the Password
> parameter.
> Blade The blade of the device.
> Use SSH (Rhel 5.4 and later) Indicates that system will use SSH to
> access the device.
> 
> As per my understanding, IP address is IP address of management module
> of IBM blade center, login/password represent credentials to access
> the same.
> 
> However did not get the parameter 'Blade'. How does it play role in
> fencing?
> 
> In a situation where there are two blades - Blade-1 and Blade-2 and
> if Blade-1 goes down(hardware node failure), Blade-2 should fence out
> Blade-1, in that situation fenced on Blade-2 should power off(?)
> blade-2 using fence_bladecenter, so how should below sniplet of
> cluster.conf file should look like? -
> 
> 
> <clusternodes>
> <clusternode name="blade1" nodeid="1" votes="1">
> <fence>
> <method name="1">
> <device blade="?????"
> name="BLADECENTER"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="blade2" nodeid="2" votes="1">
> <fence>
> <method name="1">
> <device blade="????"
> name="BLADECENTER"/>
> </method>
> </fence>
> </clusternode>
> </clusternodes>
> 
> In which situation fence_bladecenter would be used to power on the
> blade?
> 
> Your gratefully
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From bradley.a.morrison at jpmchase.com  Thu Jan  6 17:01:32 2011
From: bradley.a.morrison at jpmchase.com (Morrison, Bradley A)
Date: Thu, 6 Jan 2011 12:01:32 -0500
Subject: [Linux-cluster] reboot to rejoin RAC cluster?
Message-ID: <DDED3C1645C03C4EB881075AE494889405A950A92B@EMASC202VS01.exchad.jpmchase.net>

Q: Can I reboot one node in a two-node cluster and have it rejoin the cluster?

I've a two-node cluster which recently had HBAs replaced on both cluster nodes.
Node 1 was ejected sometime after its latest reboot, and now won't mount its OCFS volumes. The volumes' headers are verified from n1, i.e., it can see the volumes, but mounting fails. Restarting o2cb yields "modprobe: FATAL: Module ocfs2_stackglue not found."

I want to reboot n1 IFF this will have it rejoin the cluster - unless there's another way to have n1 rejoin the cluster w/o reboot.

Status for n1:
n1# service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 61
  Network idle timeout: 30000
  Network keepalive delay: 2000
  Network reconnect delay: 2000
Checking O2CB heartbeat: Not active
n1#

Status for n2:
n2# service o2cb status
service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 61
  Network idle timeout: 30000
  Network keepalive delay: 2000
  Network reconnect delay: 2000
Checking O2CB heartbeat: Active
n2#


This communication is for informational purposes only. It is not
intended as an offer or solicitation for the purchase or sale of
any financial instrument or as an official confirmation of any
transaction. All market prices, data and other information are not
warranted as to completeness or accuracy and are subject to change
without notice. Any comments or statements made herein do not
necessarily reflect those of JPMorgan Chase & Co., its subsidiaries
and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase &
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.

Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to European legal entities.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110106/d8d596e0/attachment.htm>

From james.hofmeister at hp.com  Thu Jan  6 18:00:30 2011
From: james.hofmeister at hp.com (Hofmeister, James (WTEC Linux))
Date: Thu, 6 Jan 2011 18:00:30 +0000
Subject: [Linux-cluster] What is the current recommendation concerning
 shutting down a cluster node?
Message-ID: <EC61DD7B6048464AB0E1B713AF7521BC191D1DFE95@GVW0676EXC.americas.hpqcorp.net>

What is the current recommendation concerning shutting down a cluster node?

Is it acceptable to use shutdown or init to reboot an active/running cluster node?

I have reviewed the RHCS admin guide and it does not state to *not* use shutdown or init.

RHCS RH436 training manual (page 305) says the way to remove a node from the cluster is: 

umount gfs
service rgmanager stop
service gfs stop
service clvmd stop
service cman stop

...and I have found other references that say:

Use the "leave cluster" functionality from luci.  Also "cman_tool leave".  Also fence_ command.

In the RH436 class it was verbally discussed during the fencing chapter that it was a bad thing to shutdown a node, but to instead power down the machine which has been implemented in many of the fence_ scripts.

So the question is:  Is it currently acceptable to use shutdown or init to reboot an active/running cluster node?
~and~ if not, is this documented?

Regards, 
????? James Hofmeister? Hewlett Packard Linux Solutions Engineer




From linux at alteeve.com  Thu Jan  6 18:51:21 2011
From: linux at alteeve.com (Digimer)
Date: Thu, 06 Jan 2011 13:51:21 -0500
Subject: [Linux-cluster] What is the current recommendation concerning
 shutting down a cluster node?
In-Reply-To: <EC61DD7B6048464AB0E1B713AF7521BC191D1DFE95@GVW0676EXC.americas.hpqcorp.net>
References: <EC61DD7B6048464AB0E1B713AF7521BC191D1DFE95@GVW0676EXC.americas.hpqcorp.net>
Message-ID: <4D260F29.2070706@alteeve.com>

On 01/06/2011 01:00 PM, Hofmeister, James (WTEC Linux) wrote:
> umount gfs
> service rgmanager stop
> service gfs stop
> service clvmd stop
> service cman stop

This is my method. However, note that stopping GFS unmounts the volumes,
so you can skip the manual unmount.

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



From yvette at dbtgroup.com  Thu Jan  6 19:30:16 2011
From: yvette at dbtgroup.com (yvette hirth)
Date: Thu, 06 Jan 2011 19:30:16 +0000
Subject: [Linux-cluster] [DRBD-user] DRBD and KVM for a HA-Cluster ?
In-Reply-To: <4D2611DC.10205@alteeve.com>
References: <15785B7E063D464C86DD482FCAE4EBA5012E8F22C05E@XCH11.scidom.de>	<4D24BC39.50102@alteeve.com>	<AANLkTikM13MnGXNURw2_Sgo2uHCRcsTKsaWNbivP+Qd0@mail.gmail.com>	<15785B7E063D464C86DD482FCAE4EBA5012E8FCD44A9@XCH11.scidom.de>	<4D25C92F.3020104@alteeve.com>	<15785B7E063D464C86DD482FCAE4EBA5012E8FCD44AB@XCH11.scidom.de>	<4D25E7F8.6090101@alteeve.com>	<15785B7E063D464C86DD482FCAE4EBA5012E8FD78613@XCH11.scidom.de>
	<4D2611DC.10205@alteeve.com>
Message-ID: <4D261848.9030109@dbtgroup.com>

Digimer wrote:

(snippage)
> In fact, it's a benefit because, last I checked, snapshot'ing of
> clvm was not possible.

and it still isn't.  i tried to slapshot a gfs2 volume and it refused, 
which makes tar'ing a gfs2 directory - without getting "source volume 
changed during processing" messages - impossible, at least in my setup 
(RHEL/Centos 5.5, five-way clustah).

so i created xfs filesystems, rsync the gfs2 -> xfs stuff hourly, and 
backup from the xfs.  it's "not-really-wonderful" but works fine.

yvette



From luiceur at gmail.com  Thu Jan  6 19:36:16 2011
From: luiceur at gmail.com (Luis Cebamanos)
Date: Thu, 06 Jan 2011 19:36:16 +0000
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D249C28.7050406@alteeve.com>
References: <4D24977C.1080301@gmail.com> <4D249C28.7050406@alteeve.com>
Message-ID: <4D2619B0.2040001@gmail.com>

Is a cluster with 16 nodes and I suspect the problem is in the head node:
$cat /proc/version
Linux version 2.6.11.4-21.11-smp (geeko at buildhost) (gcc version 3.3.5 
20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Feb 2 20:54:26 GMT 2006

$cat /proc/cpuinfo
processor    : 0
vendor_id    : AuthenticAMD
cpu family    : 15
model        : 37
model name    : AMD Opteron(tm) Processor 246
stepping    : 1
cpu MHz        : 1994.349
cache size    : 1024 KB
physical id    : 255
siblings    : 1
fpu        : yes
fpu_exception    : yes
cpuid level    : 1
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext fxsr_opt 
lm 3dnowext 3dnow pni lahf_lm
bogomips    : 3923.96
TLB size    : 1024 4K pages
clflush size    : 64
cache_alignment    : 64
address sizes    : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp


processor    : 1
vendor_id    : AuthenticAMD
cpu family    : 15
model        : 37
model name    : AMD Opteron(tm) Processor 246
stepping    : 1
cpu MHz        : 1994.349
cache size    : 1024 KB
physical id    : 255
siblings    : 1
fpu        : yes
fpu_exception    : yes
cpuid level    : 1
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext fxsr_opt 
lm 3dnowext 3dnow pni lahf_lm
bogomips    : 3981.31
TLB size    : 1024 4K pages
clflush size    : 64
cache_alignment    : 64
address sizes    : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

cat /proc/meminfo #
MemTotal:      2055264 kB
MemFree:       1781708 kB
Buffers:         96696 kB
Cached:          99892 kB
SwapCached:          0 kB
Active:         114836 kB
Inactive:        98388 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      2055264 kB
LowFree:       1781708 kB
SwapTotal:     4192924 kB
SwapFree:      4192924 kB
Dirty:              52 kB
Writeback:           0 kB
Mapped:          28104 kB
Slab:            43688 kB
CommitLimit:   5220556 kB
Committed_AS:   232052 kB
PageTables:       1512 kB
VmallocTotal: 34359738367 kB
VmallocUsed:      2412 kB
VmallocChunk: 34359735867 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB

df -T
Filesystem    Type   1K-blocks      Used Available Use% Mounted on
/dev/sda2     ext3    33032228  19971720  11382520  64% /
tmpfs        tmpfs     1027632         0   1027632   0% /dev/shm
/dev/sda1     ext3      124427      9412    108591   8% /boot
/dev/sda6     ext3     2063504     33820   1924864   2% /tmp
/dev/sda9     ext3   166698068 119970708  38259504  76% /users
/dev/sda8     ext3    32250392    966320  29645848   4% /usr/local
/dev/sda7     ext3     2063504    901804   1056880  47% /var

We were trying to install new hard drives to the system but something 
that we don't know went wrong and it ended up in almost 4 years of work 
lost!!!
Please, let me know what else can I do to be able to get the data back!

Best
On 01/05/2011 04:28 PM, Digimer wrote:
> On 01/05/2011 11:08 AM, Luis Cebamanos wrote:
>> Dear everyone!
>>
>> we have recently had an unknown problem with our cluster and we have
>> lost some data, including the latest user accounts created.
>> Does anyone have any idea of how to recover those user accounts and data?
>> The data haven't been deleted so it should be in somewhere in the disk!!!
>>
>> Please, any help would be much more appreciated.
>>
>>
>> Best,
>>
>> Luis
> Please provide more details. Specifically, what file system? How did the
> data loss occur (as best as you know)? What versions of what cluster
> applications? What has been done since then? Where and how was the data
> stored? etc.
>



From thomas at sjolshagen.net  Thu Jan  6 19:42:37 2011
From: thomas at sjolshagen.net (Thomas Sjolshagen)
Date: Thu, 06 Jan 2011 14:42:37 -0500
Subject: [Linux-cluster]
 =?utf-8?q?What_is_the_current_recommendation_conc?=
 =?utf-8?q?erning_shutting_down_a_cluster_node=3F?=
In-Reply-To: <4D260F29.2070706@alteeve.com>
References: <EC61DD7B6048464AB0E1B713AF7521BC191D1DFE95@GVW0676EXC.americas.hpqcorp.net>
	<4D260F29.2070706@alteeve.com>
Message-ID: <ae66b13f1d9f18f2f2a6195bb72149b5@sjolshagen.net>

 On Thu, 06 Jan 2011 13:51:21 -0500, Digimer <linux at alteeve.com> wrote:
> On 01/06/2011 01:00 PM, Hofmeister, James (WTEC Linux) wrote:
>> umount gfs
>> service rgmanager stop
>> service gfs stop
>> service clvmd stop
>> service cman stop
>
> This is my method. However, note that stopping GFS unmounts the 
> volumes,
> so you can skip the manual unmount.

 Being a Fedora 14 user (at this point), rebooting the cluster node(s) 
 with shutdown -r|h now has been working fine for me since transitioning 
 to F14 (should be the same cluster stack as what RHEL 6 uses now, I 
 believe)

 Migrating any rgmanager services off of the node I'm bringing down 
 first seems to be required (httpd/apache is pretty slow at shutting down 
 and thus causes my shutdown -r generated umount of the gfs|gfs2 file 
 systems to fail)

 // Thomas

 PS: is there a specific reason why the GFS/GFS2 service shutdown option 
 doesn't do something akin to "fuser -k" for the mountpoints in order to 
 kill off any processes that would cause the umount to fail? Yes it's 
 brutal, and certain databases would be a little unhappy, but I'd argue 
 any db shutdown script that doesn't stall until the DB is actually down 
 are buggy by definition.



From linux at alteeve.com  Thu Jan  6 19:44:17 2011
From: linux at alteeve.com (Digimer)
Date: Thu, 06 Jan 2011 14:44:17 -0500
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D2619B0.2040001@gmail.com>
References: <4D24977C.1080301@gmail.com> <4D249C28.7050406@alteeve.com>
	<4D2619B0.2040001@gmail.com>
Message-ID: <4D261B91.3050308@alteeve.com>

On 01/06/2011 02:36 PM, Luis Cebamanos wrote:
> Is a cluster with 16 nodes and I suspect the problem is in the head node:
> $cat /proc/version
> Linux version 2.6.11.4-21.11-smp (geeko at buildhost) (gcc version 3.3.5
> 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Feb 2 20:54:26 GMT 2006
> 
> df -T
> Filesystem    Type   1K-blocks      Used Available Use% Mounted on
> /dev/sda2     ext3    33032228  19971720  11382520  64% /
> tmpfs        tmpfs     1027632         0   1027632   0% /dev/shm
> /dev/sda1     ext3      124427      9412    108591   8% /boot
> /dev/sda6     ext3     2063504     33820   1924864   2% /tmp
> /dev/sda9     ext3   166698068 119970708  38259504  76% /users
> /dev/sda8     ext3    32250392    966320  29645848   4% /usr/local
> /dev/sda7     ext3     2063504    901804   1056880  47% /var
> 
> We were trying to install new hard drives to the system but something
> that we don't know went wrong and it ended up in almost 4 years of work
> lost!!!
> Please, let me know what else can I do to be able to get the data back!
> 
> Best

It's a bit late now, but I suppose you don't have backups?

As for how to help, what you provided was only marginally helpful. We
need a much more extensive overview of your setup and configuration,
versions, etc. before we can have any idea if we can help.

In the short term, don't try anything yourself without careful thought.
If the data is very valuable, consider hiring a data recovery firm near
you who can come and look at your setup.

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



From luiceur at gmail.com  Thu Jan  6 19:51:18 2011
From: luiceur at gmail.com (Luis Cebamanos)
Date: Thu, 06 Jan 2011 19:51:18 +0000
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D261B91.3050308@alteeve.com>
References: <4D24977C.1080301@gmail.com> <4D249C28.7050406@alteeve.com>
	<4D2619B0.2040001@gmail.com> <4D261B91.3050308@alteeve.com>
Message-ID: <4D261D36.7080008@gmail.com>

Well, there are not backups of that valuable lost data but the cluster 
was using disk mirroring but no one has a clue of how to take advantage 
of that. Nobody here is a cluster expert and that has been the problem I 
guess.
We haven't really "touch" any important system file, after physically 
installed the hard disk, we realized that the cluster wasn't properly 
working. We rebooted the cluster without the old disks and that has been 
the result.
Worst scenario, we will need to call an expert, but we think it can not 
be a big deal as I said, we haven't modified the previous configuration...
On 01/06/2011 07:44 PM, Digimer wrote:
> On 01/06/2011 02:36 PM, Luis Cebamanos wrote:
>> Is a cluster with 16 nodes and I suspect the problem is in the head node:
>> $cat /proc/version
>> Linux version 2.6.11.4-21.11-smp (geeko at buildhost) (gcc version 3.3.5
>> 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Feb 2 20:54:26 GMT 2006
>>
>> df -T
>> Filesystem    Type   1K-blocks      Used Available Use% Mounted on
>> /dev/sda2     ext3    33032228  19971720  11382520  64% /
>> tmpfs        tmpfs     1027632         0   1027632   0% /dev/shm
>> /dev/sda1     ext3      124427      9412    108591   8% /boot
>> /dev/sda6     ext3     2063504     33820   1924864   2% /tmp
>> /dev/sda9     ext3   166698068 119970708  38259504  76% /users
>> /dev/sda8     ext3    32250392    966320  29645848   4% /usr/local
>> /dev/sda7     ext3     2063504    901804   1056880  47% /var
>>
>> We were trying to install new hard drives to the system but something
>> that we don't know went wrong and it ended up in almost 4 years of work
>> lost!!!
>> Please, let me know what else can I do to be able to get the data back!
>>
>> Best
> It's a bit late now, but I suppose you don't have backups?
>
> As for how to help, what you provided was only marginally helpful. We
> need a much more extensive overview of your setup and configuration,
> versions, etc. before we can have any idea if we can help.
>
> In the short term, don't try anything yourself without careful thought.
> If the data is very valuable, consider hiring a data recovery firm near
> you who can come and look at your setup.
>



From gordan at bobich.net  Thu Jan  6 20:03:04 2011
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 06 Jan 2011 20:03:04 +0000
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D261D36.7080008@gmail.com>
References: <4D24977C.1080301@gmail.com>
	<4D249C28.7050406@alteeve.com>	<4D2619B0.2040001@gmail.com>
	<4D261B91.3050308@alteeve.com> <4D261D36.7080008@gmail.com>
Message-ID: <4D261FF8.4050704@bobich.net>

Luis,

You still haven't provided the relevant details of your configuration. 
The df output you provided isn't relevant in terms of the data recovery. 
You haven't even mentioned what file system the lost data was on.

You mention mirroring - what was doing the mirroring? Hardware RAID? 
Software MD RAID? Software DM RAID? LVM?

Gordan

On 01/06/2011 07:51 PM, Luis Cebamanos wrote:
> Well, there are not backups of that valuable lost data but the cluster
> was using disk mirroring but no one has a clue of how to take advantage
> of that. Nobody here is a cluster expert and that has been the problem I
> guess.
> We haven't really "touch" any important system file, after physically
> installed the hard disk, we realized that the cluster wasn't properly
> working. We rebooted the cluster without the old disks and that has been
> the result.
> Worst scenario, we will need to call an expert, but we think it can not
> be a big deal as I said, we haven't modified the previous configuration...
> On 01/06/2011 07:44 PM, Digimer wrote:
>> On 01/06/2011 02:36 PM, Luis Cebamanos wrote:
>>> Is a cluster with 16 nodes and I suspect the problem is in the head
>>> node:
>>> $cat /proc/version
>>> Linux version 2.6.11.4-21.11-smp (geeko at buildhost) (gcc version 3.3.5
>>> 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Feb 2 20:54:26 GMT 2006
>>>
>>> df -T
>>> Filesystem Type 1K-blocks Used Available Use% Mounted on
>>> /dev/sda2 ext3 33032228 19971720 11382520 64% /
>>> tmpfs tmpfs 1027632 0 1027632 0% /dev/shm
>>> /dev/sda1 ext3 124427 9412 108591 8% /boot
>>> /dev/sda6 ext3 2063504 33820 1924864 2% /tmp
>>> /dev/sda9 ext3 166698068 119970708 38259504 76% /users
>>> /dev/sda8 ext3 32250392 966320 29645848 4% /usr/local
>>> /dev/sda7 ext3 2063504 901804 1056880 47% /var
>>>
>>> We were trying to install new hard drives to the system but something
>>> that we don't know went wrong and it ended up in almost 4 years of work
>>> lost!!!
>>> Please, let me know what else can I do to be able to get the data back!
>>>
>>> Best
>> It's a bit late now, but I suppose you don't have backups?
>>
>> As for how to help, what you provided was only marginally helpful. We
>> need a much more extensive overview of your setup and configuration,
>> versions, etc. before we can have any idea if we can help.
>>
>> In the short term, don't try anything yourself without careful thought.
>> If the data is very valuable, consider hiring a data recovery firm near
>> you who can come and look at your setup.
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From work at fajar.net  Thu Jan  6 20:05:34 2011
From: work at fajar.net (Fajar A. Nugraha)
Date: Fri, 7 Jan 2011 03:05:34 +0700
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D2619B0.2040001@gmail.com>
References: <4D24977C.1080301@gmail.com> <4D249C28.7050406@alteeve.com>
	<4D2619B0.2040001@gmail.com>
Message-ID: <AANLkTim4d5MJcJnjP9bumPWq65mCQSn9AG_LtgD7_-oK@mail.gmail.com>

On Fri, Jan 7, 2011 at 2:36 AM, Luis Cebamanos <luiceur at gmail.com> wrote:
> Is a cluster with 16 nodes and I suspect the problem is in the head node:

> We were trying to install new hard drives to the system but something that
> we don't know went wrong and it ended up in almost 4 years of work lost!!!
> Please, let me know what else can I do to be able to get the data back!

You still haven't answered these questions

>> Please provide more details. Specifically, what file system?

If only ext3 is involved, and you're sharing it to other nodes from
the head node via nfs, then this list is probably the wrong place to
ask.

>> How did the
>> data loss occur (as best as you know)? What versions of what cluster
>> applications?

... and if you don't even know what cluster application you use, no
one will be able to help even if they want to.

>> What has been done since then? Where and how was the data
>> stored? etc.

... continuing the part of "how was the data stored", since you df
output only shows sda mounted, you can start by checking :
- how many disks you have
- are they all detected (e.g 4 disks usually show up as sda - sdd).
- do you use LVM
- how are the disks mounted. Is it through fstab, or does the cluster
takes care of mounting the resource as well

/etc/fstab and /var/log/messages should be a good place to start looking.

Simply saying "something that we don't know went wrong" won't be
getting you anywhere. If you think you need more expertise, getting
help from a local expert who can get hands-on to your servers and know
what to look for would be a good start.

-- 
Fajar



From linux at alteeve.com  Thu Jan  6 20:09:52 2011
From: linux at alteeve.com (Digimer)
Date: Thu, 06 Jan 2011 15:09:52 -0500
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D261D36.7080008@gmail.com>
References: <4D24977C.1080301@gmail.com> <4D249C28.7050406@alteeve.com>
	<4D2619B0.2040001@gmail.com> <4D261B91.3050308@alteeve.com>
	<4D261D36.7080008@gmail.com>
Message-ID: <4D262190.1090609@alteeve.com>

On 01/06/2011 02:51 PM, Luis Cebamanos wrote:
> Well, there are not backups of that valuable lost data but the cluster
> was using disk mirroring but no one has a clue of how to take advantage
> of that. Nobody here is a cluster expert and that has been the problem I
> guess.
> We haven't really "touch" any important system file, after physically
> installed the hard disk, we realized that the cluster wasn't properly
> working. We rebooted the cluster without the old disks and that has been
> the result.
> Worst scenario, we will need to call an expert, but we think it can not
> be a big deal as I said, we haven't modified the previous configuration...

Luis,

  Stop doing anything and call an expert right away. If your data is
valuable and you are not familiar with clustering and/or systems
administration, you will very likely remove any remaining chance for
data recovery.

In the future, *always* have external backups!

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



From luiceur at gmail.com  Thu Jan  6 20:07:45 2011
From: luiceur at gmail.com (Luis Cebamanos)
Date: Thu, 06 Jan 2011 20:07:45 +0000
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D261B91.3050308@alteeve.com>
References: <4D24977C.1080301@gmail.com> <4D249C28.7050406@alteeve.com>
	<4D2619B0.2040001@gmail.com> <4D261B91.3050308@alteeve.com>
Message-ID: <4D262111.6040903@gmail.com>

It is not only data, it is like the system were booting from a 4 years 
time configuration as everything that we have done, included other kind 
of configuration files, just disappeared!!!

On 01/06/2011 07:44 PM, Digimer wrote:
> On 01/06/2011 02:36 PM, Luis Cebamanos wrote:
>> Is a cluster with 16 nodes and I suspect the problem is in the head node:
>> $cat /proc/version
>> Linux version 2.6.11.4-21.11-smp (geeko at buildhost) (gcc version 3.3.5
>> 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Feb 2 20:54:26 GMT 2006
>>
>> df -T
>> Filesystem    Type   1K-blocks      Used Available Use% Mounted on
>> /dev/sda2     ext3    33032228  19971720  11382520  64% /
>> tmpfs        tmpfs     1027632         0   1027632   0% /dev/shm
>> /dev/sda1     ext3      124427      9412    108591   8% /boot
>> /dev/sda6     ext3     2063504     33820   1924864   2% /tmp
>> /dev/sda9     ext3   166698068 119970708  38259504  76% /users
>> /dev/sda8     ext3    32250392    966320  29645848   4% /usr/local
>> /dev/sda7     ext3     2063504    901804   1056880  47% /var
>>
>> We were trying to install new hard drives to the system but something
>> that we don't know went wrong and it ended up in almost 4 years of work
>> lost!!!
>> Please, let me know what else can I do to be able to get the data back!
>>
>> Best
> It's a bit late now, but I suppose you don't have backups?
>
> As for how to help, what you provided was only marginally helpful. We
> need a much more extensive overview of your setup and configuration,
> versions, etc. before we can have any idea if we can help.
>
> In the short term, don't try anything yourself without careful thought.
> If the data is very valuable, consider hiring a data recovery firm near
> you who can come and look at your setup.
>



From thomas at sjolshagen.net  Thu Jan  6 20:18:17 2011
From: thomas at sjolshagen.net (Thomas Sjolshagen)
Date: Thu, 06 Jan 2011 15:18:17 -0500
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D261D36.7080008@gmail.com>
References: <4D24977C.1080301@gmail.com> <4D249C28.7050406@alteeve.com>
	<4D2619B0.2040001@gmail.com> <4D261B91.3050308@alteeve.com>
	<4D261D36.7080008@gmail.com>
Message-ID: <099b08dbe663c998c8b14a7bf84bd0ad@sjolshagen.net>

 We'd need to know how your storage is configured.

 Everything from how storage is connected to the system (if it's 
 external) or if you're using an internal RAID controller. What sort of 
 mirroring are you using, how many locally connected HDD drives do you 
 have, how are the clustered file systems connected to this system 
 (iSCSI, Fibre Channel, something else), the contents of /etc/fstab, 
 /etc/cluster/cluster.conf, output from the lvm commands (IIRC, #pvs, 
 #lvs and #vgs will work on RHEL to display any LVM configuration 
 hightlights). Also, what version of the cluster stack are you using (and 
 what components - things like chkconfig --list, rpm -qa | grep cman, 
 etc, etc). There are a couple of documents on the web that will provide 
 you with examples/suggestions for the types of data you can/should 
 collect before reporting a problem with the Red Hat Cluster Stack.

 Also, what caused you to change the drive, how did it get changed and 
 was anything done as part of that change in order to get the system back 
 online again.

 If you don't know what you need to look at, I'm not sure you're the 
 optimal person to try and fix this problem. Data recovery can be a 
 fairly low-level thing in the right (wrong) set of circumstances.

 // Thomas

 On Thu, 06 Jan 2011 19:51:18 +0000, Luis Cebamanos <luiceur at gmail.com> 
 wrote:
> Well, there are not backups of that valuable lost data but the
> cluster was using disk mirroring but no one has a clue of how to take
> advantage of that. Nobody here is a cluster expert and that has been
> the problem I guess.
> We haven't really "touch" any important system file, after physically
> installed the hard disk, we realized that the cluster wasn't properly
> working. We rebooted the cluster without the old disks and that has
> been the result.
> Worst scenario, we will need to call an expert, but we think it can
> not be a big deal as I said, we haven't modified the previous
> configuration...
> On 01/06/2011 07:44 PM, Digimer wrote:
>> On 01/06/2011 02:36 PM, Luis Cebamanos wrote:
>>> Is a cluster with 16 nodes and I suspect the problem is in the head 
>>> node:
>>> $cat /proc/version
>>> Linux version 2.6.11.4-21.11-smp (geeko at buildhost) (gcc version 
>>> 3.3.5
>>> 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Feb 2 20:54:26 GMT 
>>> 2006
>>>
>>> df -T
>>> Filesystem    Type   1K-blocks      Used Available Use% Mounted on
>>> /dev/sda2     ext3    33032228  19971720  11382520  64% /
>>> tmpfs        tmpfs     1027632         0   1027632   0% /dev/shm
>>> /dev/sda1     ext3      124427      9412    108591   8% /boot
>>> /dev/sda6     ext3     2063504     33820   1924864   2% /tmp
>>> /dev/sda9     ext3   166698068 119970708  38259504  76% /users
>>> /dev/sda8     ext3    32250392    966320  29645848   4% /usr/local
>>> /dev/sda7     ext3     2063504    901804   1056880  47% /var
>>>
>>> We were trying to install new hard drives to the system but 
>>> something
>>> that we don't know went wrong and it ended up in almost 4 years of 
>>> work
>>> lost!!!
>>> Please, let me know what else can I do to be able to get the data 
>>> back!
>>>
>>> Best
>> It's a bit late now, but I suppose you don't have backups?
>>
>> As for how to help, what you provided was only marginally helpful. 
>> We
>> need a much more extensive overview of your setup and configuration,
>> versions, etc. before we can have any idea if we can help.
>>
>> In the short term, don't try anything yourself without careful 
>> thought.
>> If the data is very valuable, consider hiring a data recovery firm 
>> near
>> you who can come and look at your setup.
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From linux at alteeve.com  Thu Jan  6 20:19:50 2011
From: linux at alteeve.com (Digimer)
Date: Thu, 06 Jan 2011 15:19:50 -0500
Subject: [Linux-cluster] How to re-store my lost data
In-Reply-To: <4D262111.6040903@gmail.com>
References: <4D24977C.1080301@gmail.com>
	<4D249C28.7050406@alteeve.com>	<4D2619B0.2040001@gmail.com>
	<4D261B91.3050308@alteeve.com> <4D262111.6040903@gmail.com>
Message-ID: <4D2623E6.4080708@alteeve.com>

On 01/06/2011 03:07 PM, Luis Cebamanos wrote:
> It is not only data, it is like the system were booting from a 4 years
> time configuration as everything that we have done, included other kind
> of configuration files, just disappeared!!!

I apologize now for being blunt;

None of that matters. What you are telling us is useless information.
You have not even told us yet what operating system you are running,
never mind other applications or configuration information.

Step back, take a deep breath, and come back with actual application
names, their versions and how they are configured. If you can not do
this then you should not be attempting to recover the data as you will
only make things worse.

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



From jlbec at evilplan.org  Thu Jan  6 22:03:29 2011
From: jlbec at evilplan.org (Joel Becker)
Date: Thu, 6 Jan 2011 14:03:29 -0800
Subject: [Linux-cluster] reboot to rejoin RAC cluster?
In-Reply-To: <DDED3C1645C03C4EB881075AE494889405A950A92B@EMASC202VS01.exchad.jpmchase.net>
References: <DDED3C1645C03C4EB881075AE494889405A950A92B@EMASC202VS01.exchad.jpmchase.net>
Message-ID: <20110106220329.GA3312@mail.oracle.com>

On Thu, Jan 06, 2011 at 12:01:32PM -0500, Morrison, Bradley A wrote:
> Q: Can I reboot one node in a two-node cluster and have it rejoin the cluster?

	You certainly should be able to.  You should not need a reboot
either if you just want to rejoin.

> I've a two-node cluster which recently had HBAs replaced on both cluster nodes.
> Node 1 was ejected sometime after its latest reboot, and now won't mount its OCFS volumes. The volumes' headers are verified from n1, i.e., it can see the volumes, but mounting fails. Restarting o2cb yields "modprobe: FATAL: Module ocfs2_stackglue not found."

	This sounds like a kernel configuration problem.  If you can't
load ocfs2_stackglue, you can't get the filesystem started.  What kernel
are you running?

Joel

-- 

"When arrows don't penetrate, see...
 Cupid grabs the pistol."

			http://www.jlbec.org/
			jlbec at evilplan.org



From rhel-cluster at feystorm.net  Fri Jan  7 01:58:18 2011
From: rhel-cluster at feystorm.net (Patrick H.)
Date: Thu, 06 Jan 2011 18:58:18 -0700
Subject: [Linux-cluster] RHEL6 & IP load balancing
Message-ID: <4D26733A.9010109@feystorm.net>

So I just started setting up a RHEL6 box for use in a load balanced 
cluster and have run across a problem. The way you set up a virtual IP 
on the back end realhost side is to add an interface alias to the 
loopback device (such as lo:0). Well the ifup-eth script in RHEL6 
refuses to add aliases to the loopback interface. Additionally if you 
try to add the alias to the real ethX device instead it fails because 
the arping check it does finds that the IP is already running on the 
IPVS director.

So, how is one supposed to setup a realhost now?
The difference from RHEL5 is that RHEL5 doesnt check to see if youre 
adding an alias to the loopback device or not. Why does RHEL6 even care 
about that anyway? Theres nothing wrong with it...



From bradley.a.morrison at jpmchase.com  Fri Jan  7 04:37:17 2011
From: bradley.a.morrison at jpmchase.com (Morrison, Bradley A)
Date: Thu, 6 Jan 2011 23:37:17 -0500
Subject: [Linux-cluster] reboot to rejoin RAC cluster?
In-Reply-To: <20110106220329.GA3312@mail.oracle.com>
References: <DDED3C1645C03C4EB881075AE494889405A950A92B@EMASC202VS01.exchad.jpmchase.net>
	<20110106220329.GA3312@mail.oracle.com>
Message-ID: <DDED3C1645C03C4EB881075AE494889405A950B57C@EMASC202VS01.exchad.jpmchase.net>

Thanks for responding, Joel. It turned out to be a VLAN change which required a change with the bonded NIC. Immediately after issuing

   ifenslave -c bond-hb eth2

on the problem node, it joined the cluster and the OCFS volumes could be mounted.

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Joel Becker
Sent: Thursday, January 06, 2011 4:03 PM
To: linux clustering
Subject: Re: [Linux-cluster] reboot to rejoin RAC cluster?

On Thu, Jan 06, 2011 at 12:01:32PM -0500, Morrison, Bradley A wrote:
> Q: Can I reboot one node in a two-node cluster and have it rejoin the cluster?

	You certainly should be able to.  You should not need a reboot
either if you just want to rejoin.

> I've a two-node cluster which recently had HBAs replaced on both cluster nodes.
> Node 1 was ejected sometime after its latest reboot, and now won't mount its OCFS volumes. The volumes' headers are verified from n1, i.e., it can see the volumes, but mounting fails. Restarting o2cb yields "modprobe: FATAL: Module ocfs2_stackglue not found."

	This sounds like a kernel configuration problem.  If you can't
load ocfs2_stackglue, you can't get the filesystem started.  What kernel
are you running?

Joel

-- 

"When arrows don't penetrate, see...
 Cupid grabs the pistol."

			http://www.jlbec.org/
			jlbec at evilplan.org

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
This communication is for informational purposes only. It is not
intended as an offer or solicitation for the purchase or sale of
any financial instrument or as an official confirmation of any
transaction. All market prices, data and other information are not
warranted as to completeness or accuracy and are subject to change
without notice. Any comments or statements made herein do not
necessarily reflect those of JPMorgan Chase & Co., its subsidiaries
and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase &
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.

Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to European legal entities.



From parvez.h.shaikh at gmail.com  Fri Jan  7 04:42:16 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Fri, 7 Jan 2011 10:12:16 +0530
Subject: [Linux-cluster] configuring bladecenter fence device
In-Reply-To: <752752065.118339.1294330001557.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
References: <AANLkTinEUqRWFc+GvbbhTR3VCVu-7Q0nikDU6aEywFw0@mail.gmail.com>
	<752752065.118339.1294330001557.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <AANLkTikzc7mTLaFsxOcu0+bPLA-hN78fYU2APoh=C7Eh@mail.gmail.com>

Hi Ben

Thanks a ton for below information. But I have doubt on cluster.conf
file snippet below -

               <clusternode name="node1" votes="1">
                       <fence>
                               <method name="1">
                                       <device blade="2" name="chassis_fence"/>
                               </method>
                       </fence>
               </clusternode>

Here for "node1" device blade is "2". Does it mean node1 is blade[2]
from AMM perspective? So in order to fence out node1 fence_bladecenter
would turn off blade[2]?

Thanks

On Thu, Jan 6, 2011 at 9:36 PM, Ben Turner <bturner at redhat.com> wrote:
> To address:
>
> As per my understanding, IP address is IP address of management module
> of IBM blade center, login/password represent credentials to access
> the same.
>
>>> Correct.
>
> However did not get the parameter 'Blade'. How does it play role in
> fencing?
>
>>> If I recall correctly the blade= is the identifier used to identify the blade in the AMM. ?I can't remember if it is a number of a slot or a user defined name. ?It corresponds to
>
> # fence_bladecenter -h
>
> ? -n, --plug=<id> ? ? ? ? ? ? ? ?Physical plug number on device or
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?name of virtual machine
> If the fencing code:
>
> ? ? ? ?"port" : {
> ? ? ? ? ? ? ? ?"getopt" : "n:",
> ? ? ? ? ? ? ? ?"longopt" : "plug",
> ? ? ? ? ? ? ? ?"help" : "-n, --plug=<id> ? ? ? ? ? ? ? ?Physical plug number on device or\n" +
> ? ? ? ? ? ? ? ?" ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?name of virtual machine",
> ? ? ? ? ? ? ? ?"required" : "1",
> ? ? ? ? ? ? ? ?"shortdesc" : "Physical plug number or name of virtual machine",
> ? ? ? ? ? ? ? ?"order" : 1 },
>
> To test this try running:
>
> /sbin/fence_bladecenter -a <ip or hostname of bladecenter> -l <login> -p <passwd> -n <blade number of the blade you want to fence> -o status -v
>
> An example cluster.conf looks like:
>
> ? ? ? ? ? ? ? ?<clusternode name="node1" votes="1">
> ? ? ? ? ? ? ? ? ? ? ? ?<fence>
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?<method name="1">
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?<device blade="2" name="chassis_fence"/>
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?</method>
> ? ? ? ? ? ? ? ? ? ? ? ?</fence>
> ? ? ? ? ? ? ? ?</clusternode>
> ? ? ? ? ? ? ? ?<clusternode name="node2" votes="1">
> ? ? ? ? ? ? ? ? ? ? ? ?<fence>
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?<method name="1">
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?<device blade="3" name="chassis_fence"/>
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?</method>
> ? ? ? ? ? ? ? ? ? ? ? ?</fence>
>
> ? ? ? <fencedevices>
> ? ? ? ? ? ? ? ?<fencedevice agent="fence_bladecenter" ipaddr="XXX.XXX.1.143" login="rchs_fence" name="chassis_fence"
> passwd="XXXXXXX"/>
> ? ? ? ?</fencedevices>
>
> -Ben
>
>
>
>
> ----- Original Message -----
>> Hi all,
>>
>> >From RHCS documentation, I could see that bladecenter is one of the
>> fence devices -
>> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/ap-fence-device-param-CA.html
>>
>> Table B.9. IBM Blade Center
>> Field Description
>> Name A name for the IBM BladeCenter device connected to the cluster.
>> IP Address The IP address assigned to the device.
>> Login The login name used to access the device.
>> Password The password used to authenticate the connection to the
>> device.
>> Password Script (optional) The script that supplies a password for
>> access to the fence device. Using this supersedes the Password
>> parameter.
>> Blade The blade of the device.
>> Use SSH (Rhel 5.4 and later) Indicates that system will use SSH to
>> access the device.
>>
>> As per my understanding, IP address is IP address of management module
>> of IBM blade center, login/password represent credentials to access
>> the same.
>>
>> However did not get the parameter 'Blade'. How does it play role in
>> fencing?
>>
>> In a situation where there are two blades - Blade-1 and Blade-2 and
>> if Blade-1 goes down(hardware node failure), Blade-2 should fence out
>> Blade-1, in that situation fenced on Blade-2 should power off(?)
>> blade-2 using fence_bladecenter, so how should below sniplet of
>> cluster.conf file should look like? -
>>
>>
>> <clusternodes>
>> <clusternode name="blade1" nodeid="1" votes="1">
>> <fence>
>> <method name="1">
>> <device blade="?????"
>> name="BLADECENTER"/>
>> </method>
>> </fence>
>> </clusternode>
>> <clusternode name="blade2" nodeid="2" votes="1">
>> <fence>
>> <method name="1">
>> <device blade="????"
>> name="BLADECENTER"/>
>> </method>
>> </fence>
>> </clusternode>
>> </clusternodes>
>>
>> In which situation fence_bladecenter would be used to power on the
>> blade?
>>
>> Your gratefully
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From hal at elizium.za.net  Fri Jan  7 05:39:28 2011
From: hal at elizium.za.net (Hugo Lombard)
Date: Fri, 7 Jan 2011 07:39:28 +0200
Subject: [Linux-cluster] configuring bladecenter fence device
In-Reply-To: <AANLkTikzc7mTLaFsxOcu0+bPLA-hN78fYU2APoh=C7Eh@mail.gmail.com>
References: <AANLkTinEUqRWFc+GvbbhTR3VCVu-7Q0nikDU6aEywFw0@mail.gmail.com>
	<752752065.118339.1294330001557.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
	<AANLkTikzc7mTLaFsxOcu0+bPLA-hN78fYU2APoh=C7Eh@mail.gmail.com>
Message-ID: <20110107053928.GZ16798@squishy.elizium.za.net>

On Fri, Jan 07, 2011 at 10:12:16AM +0530, Parvez Shaikh wrote:
> 
>                <clusternode name="node1" votes="1">
>                        <fence>
>                                <method name="1">
>                                        <device blade="2" name="chassis_fence"/>
>                                </method>
>                        </fence>
>                </clusternode>
> 
> Here for "node1" device blade is "2". Does it mean node1 is blade[2]
> from AMM perspective? So in order to fence out node1 fence_bladecenter
> would turn off blade[2]?
> 

Hi Parvez

We use BladeCenters for our clusters, and I can confirm that the
'blade="2"' parameter will translate to 'blade[2]' on the AMM.  IOW, the
'2' is the slot number that the blade is in.

Two more things that might be of help:

- The user specified in the 'login' parameter under the fencedevice
  should be a 'Blade Administrator' for the slots in question.

- If you're running with SELinux enabled, check that the
  'fenced_can_network_connect' boolean is set to 'on'.

Regards

-- 
Hugo Lombard



From parvez.h.shaikh at gmail.com  Fri Jan  7 07:14:19 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Fri, 7 Jan 2011 12:44:19 +0530
Subject: [Linux-cluster] configuring bladecenter fence device
In-Reply-To: <20110107053928.GZ16798@squishy.elizium.za.net>
References: <AANLkTinEUqRWFc+GvbbhTR3VCVu-7Q0nikDU6aEywFw0@mail.gmail.com>
	<752752065.118339.1294330001557.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
	<AANLkTikzc7mTLaFsxOcu0+bPLA-hN78fYU2APoh=C7Eh@mail.gmail.com>
	<20110107053928.GZ16798@squishy.elizium.za.net>
Message-ID: <AANLkTikzPScD=m0OnpVc3=-XfFdL71oabx_aibxmZroh@mail.gmail.com>

Thanks Hugo

Your gratefully


On Fri, Jan 7, 2011 at 11:09 AM, Hugo Lombard <hal at elizium.za.net> wrote:
> On Fri, Jan 07, 2011 at 10:12:16AM +0530, Parvez Shaikh wrote:
>>
>> ? ? ? ? ? ? ? ?<clusternode name="node1" votes="1">
>> ? ? ? ? ? ? ? ? ? ? ? ?<fence>
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?<method name="1">
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?<device blade="2" name="chassis_fence"/>
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?</method>
>> ? ? ? ? ? ? ? ? ? ? ? ?</fence>
>> ? ? ? ? ? ? ? ?</clusternode>
>>
>> Here for "node1" device blade is "2". Does it mean node1 is blade[2]
>> from AMM perspective? So in order to fence out node1 fence_bladecenter
>> would turn off blade[2]?
>>
>
> Hi Parvez
>
> We use BladeCenters for our clusters, and I can confirm that the
> 'blade="2"' parameter will translate to 'blade[2]' on the AMM. ?IOW, the
> '2' is the slot number that the blade is in.
>
> Two more things that might be of help:
>
> - The user specified in the 'login' parameter under the fencedevice
> ?should be a 'Blade Administrator' for the slots in question.
>
> - If you're running with SELinux enabled, check that the
> ?'fenced_can_network_connect' boolean is set to 'on'.
>
> Regards
>
> --
> Hugo Lombard
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From hal at elizium.za.net  Fri Jan  7 10:27:41 2011
From: hal at elizium.za.net (Hugo Lombard)
Date: Fri, 7 Jan 2011 12:27:41 +0200
Subject: [Linux-cluster] What is the current recommendation
	concerning	shutting down a cluster node?
In-Reply-To: <EC61DD7B6048464AB0E1B713AF7521BC191D1DFE95@GVW0676EXC.americas.hpqcorp.net>
References: <EC61DD7B6048464AB0E1B713AF7521BC191D1DFE95@GVW0676EXC.americas.hpqcorp.net>
Message-ID: <20110107102741.GA16798@squishy.elizium.za.net>

On Thu, Jan 06, 2011 at 06:00:30PM +0000, Hofmeister, James (WTEC Linux) wrote:
> 
> In the RH436 class it was verbally discussed during the fencing
> chapter that it was a bad thing to shutdown a node, but to instead
> power down the machine which has been implemented in many of the
> fence_ scripts.
> 

FWIW, I suspect the context of that comment is different from that of
your question.  In the context of fencing, the cluster has already
decided that the box being fenced is in a bad state, and should go down
as quickly as possible.  Seeing that the box is being fenced, it likely
means that it's not communicating with the cluster, so when an orderly
shutdown tries and stop the cluster and associated services on the node
that's being fenced, it will hang, and the shutdown won't continue, so
the fencing won't be successful, and so your cluster services like GFS
will remain hung.  Better to power kill the node so that the cluster can
continue as soon as possible.

-- 
Hugo Lombard



From rossnick-lists at cybercat.ca  Fri Jan  7 14:14:12 2011
From: rossnick-lists at cybercat.ca (Nicolas Ross)
Date: Fri, 7 Jan 2011 09:14:12 -0500
Subject: [Linux-cluster] What is the current recommendation concerning
	shutting down a cluster node?
References: <EC61DD7B6048464AB0E1B713AF7521BC191D1DFE95@GVW0676EXC.americas.hpqcorp.net>
	<4D260F29.2070706@alteeve.com>
Message-ID: <39D54A755A74441FA5FEDCF5AC9EC4A1@versa>

> On 01/06/2011 01:00 PM, Hofmeister, James (WTEC Linux) wrote:
>> umount gfs
>> service rgmanager stop
>> service gfs stop
>> service clvmd stop
>> service cman stop
> 
> This is my method. However, note that stopping GFS unmounts the volumes,
> so you can skip the manual unmount.

Would a regular shutdown (-r/-h) now do the exact same thing ?



From linux at alteeve.com  Fri Jan  7 15:23:36 2011
From: linux at alteeve.com (Digimer)
Date: Fri, 07 Jan 2011 10:23:36 -0500
Subject: [Linux-cluster] What is the current recommendation concerning
 shutting down a cluster node?
In-Reply-To: <39D54A755A74441FA5FEDCF5AC9EC4A1@versa>
References: <EC61DD7B6048464AB0E1B713AF7521BC191D1DFE95@GVW0676EXC.americas.hpqcorp.net>	<4D260F29.2070706@alteeve.com>
	<39D54A755A74441FA5FEDCF5AC9EC4A1@versa>
Message-ID: <4D272FF8.7080002@alteeve.com>

On 01/07/2011 09:14 AM, Nicolas Ross wrote:
>> On 01/06/2011 01:00 PM, Hofmeister, James (WTEC Linux) wrote:
>>> umount gfs
>>> service rgmanager stop
>>> service gfs stop
>>> service clvmd stop
>>> service cman stop
>>
>> This is my method. However, note that stopping GFS unmounts the volumes,
>> so you can skip the manual unmount.
> 
> Would a regular shutdown (-r/-h) now do the exact same thing ?

Technically it should, I suppose, but I've had it not work enough times
that I've now created a little 'stop_cluster.sh' and 'start_cluster.sh'
scripts that I run.

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



From parvez.h.shaikh at gmail.com  Mon Jan 10 08:51:14 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Mon, 10 Jan 2011 14:21:14 +0530
Subject: [Linux-cluster] Error while manual fencing and output of clustat
Message-ID: <AANLkTimtnjBK-k_Ux4pjAOPxHoXg12-TZAy6Eiw_VWxR@mail.gmail.com>

Dear experts,

I have two node cluster(node1 and node2), and manual fencing is
configured. Service S2 is running on node2. To ensure failover happen,
I shutdown node2.. I see following messages in /var/log/messages -

                    agent "fence_manual" reports: failed: fence_manual
no node name

fence_ack_manual -n node2 doesn't work saying there is no FIFO in
/tmp. fence_ack_manual -n node2 -e do work and then service S2 fails
over to node2.

Trying to find out why fence_manual is reporting error? node2 is
pingable hostname and its entry is in /etc/hosts of node1 (and vice
versa).  I also see that after failover when I do "clustat -x" I get
cluster status (in XML format) with -

<?xml version="1.0"?>
<clustat version="4.1.1">
  <groups>
    <group name="service:S" state="111" state_str="starting" flags="0"
flags_str="" owner="node1" last_owner="node1" restarts="0"
last_transition="1294676678" last_transition_str="xxxxxxxxxx"/>
  </groups>
</clustat>

I was expecting last_owner would correspond to node2(because this is
node which was running service S and has failed); which would indicate
that service is failing over FROM node2. Is there a way that node in
cluster (a node on which service is failing over) could determine from
which node the given service is failing over?

Any inputs would be greatly appreciated.

Thanks

Yours gratefully



From hostmaster at inwx.de  Mon Jan 10 09:05:38 2011
From: hostmaster at inwx.de (InterNetworX | Hostmaster)
Date: Mon, 10 Jan 2011 10:05:38 +0100
Subject: [Linux-cluster] Virtualization software with GFS2
Message-ID: <4D2ACBE2.9090202@inwx.de>

Hello,

what virtualization software are you using with GFS2? KVM, XEN, OpenVZ?

I think we are the first one who tries to use OpenVZ. We have many lock
problems. We can not recommend to use.

Regards,

Mario



From ableisch at redhat.com  Mon Jan 10 10:24:59 2011
From: ableisch at redhat.com (Andreas Bleischwitz)
Date: Mon, 10 Jan 2011 11:24:59 +0100
Subject: [Linux-cluster] Howto define two-node cluster in enterprise
	environment
Message-ID: <4D2ADE7B.3010500@redhat.com>

Hello list,

I recently ran into some questions regarding a two-node cluster in an
enterprise environment, where single-point-of-failures were tried to be
eliminated whenever possible.

The situation is the following:
Two-node cluster, SAN-based shared storage - multipathed; host-based
mirrored, bonded NICS, Quorum device as tie-breaker.

Problem:
The quorum device is the single-point-of-failure as the SAN-device could
fail and hence the quorum-disc wouldn't be accessible.
The quorum-disc can't be host-based mirrored, as this would require
cmirror - which depends on a quorate cluster.
One solution: use storage-based mirroring - with extra costs, limited to
no support with mixed storage vendors.
Another solution: Use a third - no service - node which has to have the
same SAN-connections as the other two nodes out of cluster reasons. This
node will idle most of the time and therefore be very uneconomic.

How are such situations usually solved using RHCS? There must be a way
of configuring a two-nodecluster without having a SPOF defined.

HP had a quorum-host with their no longer maintained Service Guard,
which could do quorum for more than on cluster at once.

Any suggestions appreciated.

Best regrads,

Andreas Bleischwitz



From gcharles at ups.com  Mon Jan 10 12:38:25 2011
From: gcharles at ups.com (gcharles at ups.com)
Date: Mon, 10 Jan 2011 07:38:25 -0500
Subject: [Linux-cluster] Howto define two-node cluster in
	enterprise	environment
In-Reply-To: <4D2ADE7B.3010500@redhat.com>
References: <4D2ADE7B.3010500@redhat.com>
Message-ID: <49CCA172B74C1B4D916CB9B71FB952DA27D409CB73@njrarsvr3bef.us.ups.com>

While a third idle node in the cluster is a way to regulate the quorum votes, you're right in that it's not very economical.

A way to keep the quorum device from being an SPOF is to assure it is multipathed as well.  However, by default, the quorum code does not define the device from its multipathed name.  Instead, it defaults to the dm-# which we've proven in the past does not retain its name through reboots or rescans.  What you need to do is get the disk ID number of the shared quorum disk itself, and create an alias name for it in multipath.conf:
...
multipaths {
         multipath {
                 wwid                    36006016338602300ca974b4b1b7edf11
                 alias                   qdisk
         }
...

...then define it in cluster.conf with a device="/dev/mapper/qdisk" in the quorumd stanza.  When you enter a clustat, your qdisk should show up like this:

/dev/mapper/qdisk         0 Online, Quorum Disk

You can test this by disconnecting one of your SAN connections, and watch the cluster log.  It will show a loss of communication with the quorum disk for a few seconds and then return to normal.


Regards;
Greg Charles

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Andreas Bleischwitz
Sent: Monday, January 10, 2011 5:25 AM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Howto define two-node cluster in enterprise environment

Hello list,

I recently ran into some questions regarding a two-node cluster in an enterprise environment, where single-point-of-failures were tried to be eliminated whenever possible.

The situation is the following:
Two-node cluster, SAN-based shared storage - multipathed; host-based mirrored, bonded NICS, Quorum device as tie-breaker.

Problem:
The quorum device is the single-point-of-failure as the SAN-device could fail and hence the quorum-disc wouldn't be accessible.
The quorum-disc can't be host-based mirrored, as this would require cmirror - which depends on a quorate cluster.
One solution: use storage-based mirroring - with extra costs, limited to no support with mixed storage vendors.
Another solution: Use a third - no service - node which has to have the same SAN-connections as the other two nodes out of cluster reasons. This node will idle most of the time and therefore be very uneconomic.

How are such situations usually solved using RHCS? There must be a way of configuring a two-nodecluster without having a SPOF defined.

HP had a quorum-host with their no longer maintained Service Guard, which could do quorum for more than on cluster at once.

Any suggestions appreciated.

Best regrads,

Andreas Bleischwitz

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From hostmaster at inwx.de  Mon Jan 10 12:48:05 2011
From: hostmaster at inwx.de (InterNetworX | Hostmaster)
Date: Mon, 10 Jan 2011 13:48:05 +0100
Subject: [Linux-cluster] waiting for glock: pid does not exists
Message-ID: <4D2B0005.1050807@inwx.de>

Hello,

we are trying to run OpenVZ on a GFS2. We copied a virtual machine to
the GFS2 storage (on node1) and added the service to cluster.conf. After
reloading the configuration on all nodes, rgmanager was trying to start
the virtual machine on node3. That is not working and now the machine is
hanging with a lock.

This is the result of the gfs2 hang analyzer:

There are 4 glocks with waiters.
node1, pid 2674 is waiting for glock 3/8389396, which is held by pid 6821
node3, pid 7024 is waiting for glock 3/8389396, which is held by pid 6821


node1, pid 10188 is waiting for glock 2/1857345, which is held by pid 6821
node3, pid 6772 is waiting for glock 2/1857345, which is held by pid 6821
node3, pid 7251 is waiting for glock 2/1857345, which is held by pid 6821
node3, pid 7289 is waiting for glock 2/1857345, which is held by pid 6821


carl, pid 23817 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 4243 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7055 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7090 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7129 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7176 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7230 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7270 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7306 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7345 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7369 is waiting for glock 2/394135, which is held by pid 7024
node3, pid 7402 is waiting for glock 2/394135, which is held by pid 7024


node3, pid 6821 is waiting for glock 5/8425127, which is held by pid 7258



The pid 6821 is still running on node3:

root      6821  0.0  0.0  12216   696 ?        D<   08:29   0:00 /bin/cp
-fp /etc/hosts /etc/hosts.12

The problem pid is 7258 - but I can not find this process running on any
node. Any idea what is the problem here?

Mario



From ableisch at redhat.com  Mon Jan 10 12:54:41 2011
From: ableisch at redhat.com (Andreas Bleischwitz)
Date: Mon, 10 Jan 2011 13:54:41 +0100
Subject: [Linux-cluster] Howto define two-node cluster in
	enterprise	environment
In-Reply-To: <49CCA172B74C1B4D916CB9B71FB952DA27D409CB73@njrarsvr3bef.us.ups.com>
References: <4D2ADE7B.3010500@redhat.com>
	<49CCA172B74C1B4D916CB9B71FB952DA27D409CB73@njrarsvr3bef.us.ups.com>
Message-ID: <4D2B0191.10607@redhat.com>

Hello Greg,

On 01/10/2011 01:38 PM, gcharles at ups.com wrote:
> While a third idle node in the cluster is a way to regulate the quorum votes, you're right in that it's not very economical.
> 
> A way to keep the quorum device from being an SPOF is to assure it is multipathed as well.  However, by default, the quorum code does not define the device from its multipathed name.  Instead, it defaults to the dm-# which we've proven in the past does not retain its name through reboots or rescans.  What you need to do is get the disk ID number of the shared quorum disk itself, and create an alias name for it in multipath.conf:
> ...
> multipaths {
>          multipath {
>                  wwid                    36006016338602300ca974b4b1b7edf11
>                  alias                   qdisk
>          }
> ...
This will not eleminate the SPOF using one ONE storage... What I meant
is using at least two storage devices in different locations i.e. DCs.
I'll assume serious SAN connections are always using multipath ;) And by
the way, I would prefer using the qdisc-label - which should be unique
and is scanned during start of qdiscd.
> 
> ...then define it in cluster.conf with a device="/dev/mapper/qdisk" in the quorumd stanza.  When you enter a clustat, your qdisk should show up like this:
> 
> /dev/mapper/qdisk         0 Online, Quorum Disk
> 
> You can test this by disconnecting one of your SAN connections, and watch the cluster log.  It will show a loss of communication with the quorum disk for a few seconds and then return to normal.
... and qdisc will fail if the underlaying SAN device got lost - f.e.
power failure in one DC. As already said, SAN-storage will be mirrored
to two DCs using cmirorror, but this is not possible for the qdisc.

Thanks for your answer, anyways.
> 
> 
> Regards;
> Greg Charles
> 
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Andreas Bleischwitz
> Sent: Monday, January 10, 2011 5:25 AM
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] Howto define two-node cluster in enterprise environment
> 
> Hello list,
> 
> I recently ran into some questions regarding a two-node cluster in an enterprise environment, where single-point-of-failures were tried to be eliminated whenever possible.
> 
> The situation is the following:
> Two-node cluster, SAN-based shared storage - multipathed; host-based mirrored, bonded NICS, Quorum device as tie-breaker.
> 
> Problem:
> The quorum device is the single-point-of-failure as the SAN-device could fail and hence the quorum-disc wouldn't be accessible.
> The quorum-disc can't be host-based mirrored, as this would require cmirror - which depends on a quorate cluster.
> One solution: use storage-based mirroring - with extra costs, limited to no support with mixed storage vendors.
> Another solution: Use a third - no service - node which has to have the same SAN-connections as the other two nodes out of cluster reasons. This node will idle most of the time and therefore be very uneconomic.
> 
> How are such situations usually solved using RHCS? There must be a way of configuring a two-nodecluster without having a SPOF defined.
> 
> HP had a quorum-host with their no longer maintained Service Guard, which could do quorum for more than on cluster at once.
> 
> Any suggestions appreciated.
> 
> Best regrads,
> 
> Andreas Bleischwitz
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From xavier.montagutelli at unilim.fr  Mon Jan 10 13:28:19 2011
From: xavier.montagutelli at unilim.fr (Xavier Montagutelli)
Date: Mon, 10 Jan 2011 14:28:19 +0100
Subject: [Linux-cluster] Error while manual fencing and output of clustat
In-Reply-To: <AANLkTimtnjBK-k_Ux4pjAOPxHoXg12-TZAy6Eiw_VWxR@mail.gmail.com>
References: <AANLkTimtnjBK-k_Ux4pjAOPxHoXg12-TZAy6Eiw_VWxR@mail.gmail.com>
Message-ID: <201101101428.19680.xavier.montagutelli@unilim.fr>

Hello Parvez,

On Monday 10 January 2011 09:51:14 Parvez Shaikh wrote:
> Dear experts,
> 
> I have two node cluster(node1 and node2), and manual fencing is
> configured. Service S2 is running on node2. To ensure failover happen,
> I shutdown node2.. I see following messages in /var/log/messages -
> 
>                     agent "fence_manual" reports: failed: fence_manual
> no node name

I am not an expert, but could you show us your cluster.conf file ?

You need to give a "nodename" attribute to the fence_manual agent somewhere, 
the error message makes me think it's missing. 

For example :

        <fencedevices>
                <fencedevice agent="fence_manual" name="my_fence_manual"/>
        </fencedevices>
...
<clusternode name="node2" ...>
  <fence>
     <method name="1">
         <device name="my_fence_manual" nodename="node2"/>
       </method>
   </fence>
</clusternode>

> 
> fence_ack_manual -n node2 doesn't work saying there is no FIFO in
> /tmp. fence_ack_manual -n node2 -e do work and then service S2 fails
> over to node2.
> 
> Trying to find out why fence_manual is reporting error? node2 is
> pingable hostname and its entry is in /etc/hosts of node1 (and vice
> versa).  I also see that after failover when I do "clustat -x" I get
> cluster status (in XML format) with -
> 
> <?xml version="1.0"?>
> <clustat version="4.1.1">
>   <groups>
>     <group name="service:S" state="111" state_str="starting" flags="0"
> flags_str="" owner="node1" last_owner="node1" restarts="0"
> last_transition="1294676678" last_transition_str="xxxxxxxxxx"/>
>   </groups>
> </clustat>
> 
> I was expecting last_owner would correspond to node2(because this is
> node which was running service S and has failed); which would indicate
> that service is failing over FROM node2. Is there a way that node in
> cluster (a node on which service is failing over) could determine from
> which node the given service is failing over?
> 
> Any inputs would be greatly appreciated.
> 
> Thanks
> 
> Yours gratefully
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 

-- 
Xavier Montagutelli                      Tel : +33 (0)5 55 45 77 20
Service Commun Informatique              Fax : +33 (0)5 55 45 75 95
Universite de Limoges
123, avenue Albert Thomas
87060 Limoges cedex



From swhiteho at redhat.com  Mon Jan 10 14:07:46 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Mon, 10 Jan 2011 14:07:46 +0000
Subject: [Linux-cluster] waiting for glock: pid does not exists
In-Reply-To: <4D2B0005.1050807@inwx.de>
References: <4D2B0005.1050807@inwx.de>
Message-ID: <1294668466.2450.59.camel@dolmen>

Hi,

On Mon, 2011-01-10 at 13:48 +0100, InterNetworX | Hostmaster wrote:
> Hello,
> 
> we are trying to run OpenVZ on a GFS2. We copied a virtual machine to
> the GFS2 storage (on node1) and added the service to cluster.conf. After
> reloading the configuration on all nodes, rgmanager was trying to start
> the virtual machine on node3. That is not working and now the machine is
> hanging with a lock.
> 
> This is the result of the gfs2 hang analyzer:
> 
> There are 4 glocks with waiters.
> node1, pid 2674 is waiting for glock 3/8389396, which is held by pid 6821
> node3, pid 7024 is waiting for glock 3/8389396, which is held by pid 6821
> 
> 
> node1, pid 10188 is waiting for glock 2/1857345, which is held by pid 6821
> node3, pid 6772 is waiting for glock 2/1857345, which is held by pid 6821
> node3, pid 7251 is waiting for glock 2/1857345, which is held by pid 6821
> node3, pid 7289 is waiting for glock 2/1857345, which is held by pid 6821
> 
> 
> carl, pid 23817 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 4243 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7055 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7090 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7129 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7176 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7230 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7270 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7306 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7345 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7369 is waiting for glock 2/394135, which is held by pid 7024
> node3, pid 7402 is waiting for glock 2/394135, which is held by pid 7024
> 
> 
> node3, pid 6821 is waiting for glock 5/8425127, which is held by pid 7258
> 
> 
> 
> The pid 6821 is still running on node3:
> 
> root      6821  0.0  0.0  12216   696 ?        D<   08:29   0:00 /bin/cp
> -fp /etc/hosts /etc/hosts.12
> 
> The problem pid is 7258 - but I can not find this process running on any
> node. Any idea what is the problem here?
> 
> Mario
> 
If pid 7528 has exited, then it is almost certainly not a problem. What
makes you think that this is the issue? Since it is a type 5 glock, it
should not be blocking access to anything,

Steve.

> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From kitgerrits at gmail.com  Mon Jan 10 20:17:43 2011
From: kitgerrits at gmail.com (Kit Gerrits)
Date: Mon, 10 Jan 2011 21:17:43 +0100
Subject: [Linux-cluster] What is the current recommendation concerning
	shutting down a cluster node?
In-Reply-To: <4D272FF8.7080002@alteeve.com>
Message-ID: <4d2b6967.0607cc0a.7087.ffff8fa2@mx.google.com>


Hello,

I have found the same behaviour.
Shutting down a cluster node without first stopping the cluster services
results in a node that has not logged out from the cluster.
The cluster may even fence the node and bring it back on-line.
(it won't stay dead!)


You can also leave the cluster with 'cman_tool leave':

http://linux.die.net/man/8/cman_tool

Tells CMAN to leave the cluster. You cannot do this if there are subsystems
(eg DLM, GFS) active. You should dismount all GFS filesystems, shutdown
CLVM, fenced and anything else using the cluster manager before using
cman_tool leave. Look at 'cman_tool status|services' to see how many (and
which) services are running.
When a node leaves the cluster, the remaining nodes recalculate quorum and
this may block cluster activity if the required number of votes is not
present. If this node is to be down for an extended period of time and you
need to keep the cluster running, add the remove option, and the remaining
nodes will recalculate quorum such that activity can continue. 


Regards,

Kit


-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer
Sent: vrijdag 7 januari 2011 16:24
To: linux clustering
Subject: Re: [Linux-cluster] What is the current recommendation concerning
shutting down a cluster node?

On 01/07/2011 09:14 AM, Nicolas Ross wrote:
>> On 01/06/2011 01:00 PM, Hofmeister, James (WTEC Linux) wrote:
>>> umount gfs
>>> service rgmanager stop
>>> service gfs stop
>>> service clvmd stop
>>> service cman stop
>>
>> This is my method. However, note that stopping GFS unmounts the 
>> volumes, so you can skip the manual unmount.
> 
> Would a regular shutdown (-r/-h) now do the exact same thing ?

Technically it should, I suppose, but I've had it not work enough times that
I've now created a little 'stop_cluster.sh' and 'start_cluster.sh'
scripts that I run.

--
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From kitgerrits at gmail.com  Mon Jan 10 20:21:45 2011
From: kitgerrits at gmail.com (Kit Gerrits)
Date: Mon, 10 Jan 2011 21:21:45 +0100
Subject: [Linux-cluster] Howto define two-node cluster in
	enterpriseenvironment
In-Reply-To: <4D2ADE7B.3010500@redhat.com>
Message-ID: <4d2b6a59.0607cc0a.7080.ffff8f43@mx.google.com>



Hello fellow administrator,

If you have a SAN...
Why can't you have the SAN publish the same LUN to the two cluster nodes
simultaneously?
It is only used as a raw device, so there should be no ugly filesystem
side-effects.


Regards,

Kit

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Andreas Bleischwitz
Sent: maandag 10 januari 2011 11:25
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Howto define two-node cluster in
enterpriseenvironment

Hello list,

I recently ran into some questions regarding a two-node cluster in an
enterprise environment, where single-point-of-failures were tried to be
eliminated whenever possible.

The situation is the following:
Two-node cluster, SAN-based shared storage - multipathed; host-based
mirrored, bonded NICS, Quorum device as tie-breaker.

Problem:
The quorum device is the single-point-of-failure as the SAN-device could
fail and hence the quorum-disc wouldn't be accessible.
The quorum-disc can't be host-based mirrored, as this would require cmirror
- which depends on a quorate cluster.
One solution: use storage-based mirroring - with extra costs, limited to no
support with mixed storage vendors.
Another solution: Use a third - no service - node which has to have the same
SAN-connections as the other two nodes out of cluster reasons. This node
will idle most of the time and therefore be very uneconomic.

How are such situations usually solved using RHCS? There must be a way of
configuring a two-nodecluster without having a SPOF defined.

HP had a quorum-host with their no longer maintained Service Guard, which
could do quorum for more than on cluster at once.

Any suggestions appreciated.

Best regrads,

Andreas Bleischwitz

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From thomas at sjolshagen.net  Mon Jan 10 20:38:12 2011
From: thomas at sjolshagen.net (Thomas Sjolshagen)
Date: Mon, 10 Jan 2011 15:38:12 -0500
Subject: [Linux-cluster] Howto define two-node cluster in
 enterpriseenvironment
In-Reply-To: <4d2b6a59.0607cc0a.7080.ffff8f43@mx.google.com>
References: <4d2b6a59.0607cc0a.7080.ffff8f43@mx.google.com>
Message-ID: <4a614ae3080a5ce8f908a9b877081986@sjolshagen.net>

 On Mon, 10 Jan 2011 21:21:45 +0100, "Kit Gerrits" 
 <kitgerrits at gmail.com> wrote:
> Hello fellow administrator,
>
> If you have a SAN...
> Why can't you have the SAN publish the same LUN to the two cluster 
> nodes
> simultaneously?

 You can, but you minimally need to guarantee (not believe or think, but 
 guarantee!) that both nodes do not

 a) write to the same sectors, file systems or LVM volumes at the same 
 time (this is actually a whole lot more difficult to do than most people 
 think) - including boot sectors, partition tables, LVM metadata, etc, 
 etc,

 b) think they're exclusively accessing the LUN I.e. there must be 
 something on the nodes - an application, OS tool or something else - 
 that understands that there is more than one reader & writer to a LUN 
 and thus synchronizes this.

> It is only used as a raw device, so there should be no ugly 
> filesystem
> side-effects.

 File systems only serve to make this a lot more obvious to the end user 
 or administrator since it's integrity tends to get shot fairly quickly 
 and there are integrity checks in place. On raw devices, you get the 
 "benefit" of ignorance about the fact that your data is corrupt, unless 
 b) above is true.

 Hth,

 // Thomas

>
>
> Regards,
>
> Kit
>
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Andreas 
> Bleischwitz
> Sent: maandag 10 januari 2011 11:25
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] Howto define two-node cluster in
> enterpriseenvironment
>
> Hello list,
>
> I recently ran into some questions regarding a two-node cluster in an
> enterprise environment, where single-point-of-failures were tried to 
> be
> eliminated whenever possible.
>
> The situation is the following:
> Two-node cluster, SAN-based shared storage - multipathed; host-based
> mirrored, bonded NICS, Quorum device as tie-breaker.
>
> Problem:
> The quorum device is the single-point-of-failure as the SAN-device 
> could
> fail and hence the quorum-disc wouldn't be accessible.
> The quorum-disc can't be host-based mirrored, as this would require 
> cmirror
> - which depends on a quorate cluster.
> One solution: use storage-based mirroring - with extra costs, limited 
> to no
> support with mixed storage vendors.
> Another solution: Use a third - no service - node which has to have 
> the same
> SAN-connections as the other two nodes out of cluster reasons. This 
> node
> will idle most of the time and therefore be very uneconomic.
>
> How are such situations usually solved using RHCS? There must be a 
> way of
> configuring a two-nodecluster without having a SPOF defined.
>
> HP had a quorum-host with their no longer maintained Service Guard, 
> which
> could do quorum for more than on cluster at once.
>
> Any suggestions appreciated.
>
> Best regrads,
>
> Andreas Bleischwitz
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From carlopmart at gmail.com  Mon Jan 10 21:16:51 2011
From: carlopmart at gmail.com (carlopmart)
Date: Mon, 10 Jan 2011 22:16:51 +0100
Subject: [Linux-cluster] Problems with a script when is launched via
	rgmanager
Message-ID: <4D2B7743.2060800@gmail.com>

Hi all,

  I am trying to set up a splunk cluster service on two RHEL5.5 hosts (fully 
updated). My problems becomes when I trying to setup this service under rgmanager: 
script ever fails. If I launch the script manually, all works as expected. If I test 
the service using rg_test comand, all works ok as expected.

  This is the error when rgmanager tries to launch the service:

Jan 10 17:50:55 lorien clurgmgrd[25394]: <notice> Starting disabled service 
service:siemmgmt-svc
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <debug> Link for eth0: Detected
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <info> Adding IPv4 address 
172.25.70.22/28 to eth0
Jan 10 17:50:55 lorien clurgmgrd: [25394]: <debug> Pinging addr 172.25.70.22 from 
dev eth0
Jan 10 17:50:57 lorien clurgmgrd: [25394]: <debug> Sending gratuitous ARP: 
172.25.70.22 00:50:56:14:5a:1e brd ff:ff:ff:ff:ff:ff
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <warning> Unknown file system type 'ext4' 
for device /dev/inasvol/splunkvol.  Assuming fsck is required.
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <debug> Running fsck on 
/dev/inasvol/splunkvol
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> mounting /dev/inasvol/splunkvol on 
/data/services/siem/splunk
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <debug> mount -t ext4 -o rw 
/dev/inasvol/splunkvol /data/services/siem/splunk
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> Executing 
/data/config/etc/init.d/splunk-cluster start
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <err> script:splunk-cluster: start of 
/data/config/etc/init.d/splunk-cluster failed (returned 1)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> start on script "splunk-cluster" 
returned 1 (generic error)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <warning> #68: Failed to start 
service:siemmgmt-svc; return value: 1
Jan 10 17:50:58 lorien clurgmgrd[25394]: <debug> Stopping failed service 
service:siemmgmt-svc
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> Stopping service service:siemmgmt-svc
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <info> Executing 
/data/config/etc/init.d/splunk-cluster stop
Jan 10 17:50:58 lorien clurgmgrd: [25394]: <err> script:splunk-cluster: stop of 
/data/config/etc/init.d/splunk-cluster failed (returned 1)
Jan 10 17:50:58 lorien clurgmgrd[25394]: <notice> stop on script "splunk-cluster" 
returned 1 (generic error)
Jan 10 17:50:59 lorien clurgmgrd: [25394]: <info> unmounting /data/services/siem/splunk
Jan 10 17:50:59 lorien clurgmgrd: [25394]: <info> Removing IPv4 address 
172.25.70.22/28 from eth0
Jan 10 17:51:09 lorien clurgmgrd[25394]: <crit> #12: RG service:siemmgmt-svc failed 
to stop; intervention required
Jan 10 17:51:09 lorien clurgmgrd[25394]: <notice> Service service:siemmgmt-svc is failed
Jan 10 17:51:09 lorien clurgmgrd[25394]: <crit> #13: Service service:siemmgmt-svc 
failed to stop cleanly
Jan 10 17:51:09 lorien clurgmgrd[25394]: <debug> Handling failure request for RG 
service:siemmgmt-svc
Jan 10 17:51:19 lorien clurgmgrd[25394]: <debug> 2 events processed


And this is the output using rg_test command:

[root at lorien ~]# rg_test test /etc/cluster/cluster.conf start service siemmgmt-svc
Running in test mode.
Starting siemmgmt-svc...
<warn>   Unknown file system type 'ext4' for device /dev/inasvol/splunkvol. 
Assuming fsck is required.
<debug>  Running fsck on /dev/inasvol/splunkvol
<info>   mounting /dev/inasvol/splunkvol on /data/services/siem/splunk
<debug>  mount -t ext4 -o rw /dev/inasvol/splunkvol /data/services/siem/splunk
<debug>  Link for eth0: Detected
<info>   Adding IPv4 address 172.25.70.22/28 to eth0
<debug>  Pinging addr 172.25.70.22 from dev eth0
<debug>  Sending gratuitous ARP: 172.25.70.22 00:50:56:14:5a:1e brd ff:ff:ff:ff:ff:ff
<info>   Executing /data/config/etc/init.d/splunk-cluster start
+ . /etc/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
++ export PATH
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -f /etc/sysconfig/i18n -a -z '' ']'
++ . /etc/profile.d/lang.sh
+++ sourced=0
+++ for langfile in /etc/sysconfig/i18n '$HOME/.i18n'
+++ '[' -f /etc/sysconfig/i18n ']'
+++ . /etc/sysconfig/i18n
++++ LANG=en_US.UTF-8
++++ SYSFONT=latarcyrheb-sun16
+++ sourced=1
+++ for langfile in /etc/sysconfig/i18n '$HOME/.i18n'
+++ '[' -f /.i18n ']'
+++ '[' -n '' ']'
+++ '[' 1 = 1 ']'
+++ '[' -n en_US.UTF-8 ']'
+++ export LANG
+++ '[' -n '' ']'
+++ unset LC_ADDRESS
+++ '[' -n '' ']'
+++ unset LC_CTYPE
+++ '[' -n '' ']'
+++ unset LC_COLLATE
+++ '[' -n '' ']'
+++ unset LC_IDENTIFICATION
+++ '[' -n '' ']'
+++ unset LC_MEASUREMENT
+++ '[' -n '' ']'
+++ unset LC_MESSAGES
+++ '[' -n '' ']'
+++ unset LC_MONETARY
+++ '[' -n '' ']'
+++ unset LC_NAME
+++ '[' -n '' ']'
+++ unset LC_NUMERIC
+++ '[' -n '' ']'
+++ unset LC_PAPER
+++ '[' -n '' ']'
+++ unset LC_TELEPHONE
+++ '[' -n '' ']'
+++ unset LC_TIME
+++ '[' -n C ']'
+++ '[' C '!=' en_US.UTF-8 ']'
+++ export LC_ALL
+++ '[' -n '' ']'
+++ unset LANGUAGE
+++ '[' -n '' ']'
+++ unset LINGUAS
+++ '[' -n '' ']'
+++ unset _XKB_CHARSET
+++ consoletype=pty
+++ '[' -z pty ']'
+++ '[' -n '' ']'
+++ '[' -n '' ']'
+++ '[' -n en_US.UTF-8 ']'
+++ case $LANG in
+++ '[' dumb = linux ']'
+++ unset SYSFONTACM SYSFONT
+++ unset sourced
+++ unset langfile
++ '[' -z '' ']'
++ '[' -f /etc/sysconfig/init ']'
++ . /etc/sysconfig/init
+++ BOOTUP=color
+++ GRAPHICAL=yes
+++ RES_COL=60
+++ MOVE_TO_COL='echo -en \033[60G'
+++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
+++ SETCOLOR_FAILURE='echo -en \033[0;31m'
+++ SETCOLOR_WARNING='echo -en \033[0;33m'
+++ SETCOLOR_NORMAL='echo -en \033[0;39m'
+++ LOGLEVEL=3
+++ PROMPT=yes
+++ AUTOSWAP=no
++ '[' pty = serial ']'
++ '[' color '!=' verbose ']'
++ INITLOG_ARGS=-q
++ 
__sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'
+ '[' '!' -d /data/services/siem/splunk/etc ']'
+ HOME=/data/services/siem/splunk
+ DIRECTORY=/data/services/siem/splunk
+ export HOME
+ case "$1" in
+ start
+ echo -n 'Starting Splunk: '
Starting Splunk:
+ sudo -H -u splunk /data/services/siem/splunk/bin/splunk start

Splunk> The IT Search Engine.

Checking prerequisites...
	Checking http port [172.25.70.22:9000]: open
	Checking mgmt port [172.25.70.22:9089]: open
	Checking configuration...  Done.
	Checking index directory...  Done.
	Checking databases...
	Validated databases: _audit, _blocksignature, _internal, _thefishbucket, history, 
main, sample, summary
	Checking for SELinux.
All preliminary checks passed.

                                                            [  OK  ]
                                                            [  OK  ]
Starting splunk server daemon (splunkd)... Done.Starting splunkweb... Done.
If you get stuck, we're here to help.
Look for answers here: http://www.splunk.com/base/Documentation

The Splunk web interface is at https://172.25.70.22:9000

+ RETVAL=0
+ '[' 0 -eq 0 ']'
+ success
+ '[' color '!=' verbose -a -z '' ']'
+ echo_success
+ '[' color = color ']'
+ echo -en '\033[60G'
                                                            + echo -n '['
[+ '[' color = color ']'
+ echo -en '\033[0;32m'
+ echo -n '  OK  '
   OK  + '[' color = color ']'
+ echo -en '\033[0;39m'
+ echo -n ']'
]+ echo -ne '\r'
+ return 0
+ return 0
+ echo

+ return 0
+ exit 0
Start of siemmgmt-svc complete

  As you can see, all works ok.

Service configuration under cluster.conf:

<service autostart="0" domain="PriCluster2" name="siemmgmt-svc" recovery="relocate">
              <fs ref="siemdata">
                 <ip ref="172.25.70.22">
                        <script ref="splunk-cluster"/>
                 </ip>
              </fs>
</service>


  Script:

#!/bin/sh -x
# Splunk:       Controls Splunk on Redhat-based systems
#
# chkconfig: 2345 99 15
# description: Starts and stops Splunk
#
# This will work on Redhat systems (maybe others too)

# Source function library.
. /etc/init.d/functions

if [ ! -d /data/services/siem/splunk/etc ]; then
         exit 1
fi

HOME="/data/services/siem/splunk"
DIRECTORY="/data/services/siem/splunk"

export HOME



start() {
         echo -n "Starting Splunk: "
         sudo -H -u splunk ${DIRECTORY}/bin/splunk start > /dev/null
         RETVAL=$?
         if [ $RETVAL -eq 0 ]; then
                 success
         else
                 failure
         fi
         echo
         return $RETVAL
}

stop() {
         echo -n "Stopping Splunk: "
         sudo -H -u splunk ${DIRECTORY}/bin/splunk stop > /dev/null
         RETVAL=$?
         if [ $RETVAL -eq 0 ]; then
                 success
         else
                 failure
         fi
         echo
         return $RETVAL
}

status() {
         exit 0
}

case "$1" in
   start)
         start
         ;;
   stop)
         stop
         ;;
   restart)
         stop
         start
         ;;
   status)
         status
         ;;
   *)
         echo $"Usage: $0 {start|stop|restart|status}"
         exit 1
esac

exit $?


  How can I debug this error?? I don't why fails when is launched via rgmanager ...

-- 
CL Martinez
carlopmart {at} gmail {d0t} com



From thomas at sjolshagen.net  Mon Jan 10 21:45:00 2011
From: thomas at sjolshagen.net (Thomas Sjolshagen)
Date: Mon, 10 Jan 2011 16:45:00 -0500
Subject: [Linux-cluster] Problems with a script when is launched via
 rgmanager
In-Reply-To: <4D2B7743.2060800@gmail.com>
References: <4D2B7743.2060800@gmail.com>
Message-ID: <073cfbcf8c8ec69c639ff93b9ea2d274@sjolshagen.net>


 On Mon, 10 Jan 2011 22:16:51 +0100, carlopmart <carlopmart at gmail.com> 
 wrote:
> Hi all,
>
>  I am trying to set up a splunk cluster service on two RHEL5.5 hosts
> (fully updated). My problems becomes when I trying to setup this
> service under rgmanager: script ever fails. If I launch the script
> manually, all works as expected. If I test the service using rg_test
> comand, all works ok as expected.
>

 Often, issues like these (works when running interactively, but not as 
 part of a cluster manager or cron job) seem to be environment variables 
 not having been set (by the crond or cluster manager), or the 
 script/application expects to have a tty, but one isn't provided when 
 running non-interactively.

 May be worth adding some debug statements that get logged to a file on 
 the node to see where the splunk-cluster script fails.

 // Thomas



From parvez.h.shaikh at gmail.com  Tue Jan 11 05:45:36 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Tue, 11 Jan 2011 11:15:36 +0530
Subject: [Linux-cluster] Error while manual fencing and output of clustat
In-Reply-To: <201101101428.19680.xavier.montagutelli@unilim.fr>
References: <AANLkTimtnjBK-k_Ux4pjAOPxHoXg12-TZAy6Eiw_VWxR@mail.gmail.com>
	<201101101428.19680.xavier.montagutelli@unilim.fr>
Message-ID: <AANLkTinC0PkqTi97NZk6vH0B20-KhXemq=h+-oR7TxEy@mail.gmail.com>

Thanks Xaviar.

It resolved the error on fencing.

However I still am grappling with issue of finding name of "Failed
cluster node" on another cluster node to which service on failed node
has failed over to.

I was using output of "clustat -x -S service name" and was parsing XML
file to obtain value of "last_owner" field.

Any input on how to find out name of failed node on another cluster
node, over which services from failed node are starting?

Thanks

On Mon, Jan 10, 2011 at 6:58 PM, Xavier Montagutelli
<xavier.montagutelli at unilim.fr> wrote:
> Hello Parvez,
>
> On Monday 10 January 2011 09:51:14 Parvez Shaikh wrote:
>> Dear experts,
>>
>> I have two node cluster(node1 and node2), and manual fencing is
>> configured. Service S2 is running on node2. To ensure failover happen,
>> I shutdown node2.. I see following messages in /var/log/messages -
>>
>> ? ? ? ? ? ? ? ? ? ? agent "fence_manual" reports: failed: fence_manual
>> no node name
>
> I am not an expert, but could you show us your cluster.conf file ?
>
> You need to give a "nodename" attribute to the fence_manual agent somewhere,
> the error message makes me think it's missing.
>
> For example :
>
> ? ? ? ?<fencedevices>
> ? ? ? ? ? ? ? ?<fencedevice agent="fence_manual" name="my_fence_manual"/>
> ? ? ? ?</fencedevices>
> ...
> <clusternode name="node2" ...>
> ?<fence>
> ? ? <method name="1">
> ? ? ? ? <device name="my_fence_manual" nodename="node2"/>
> ? ? ? </method>
> ? </fence>
> </clusternode>
>
>>
>> fence_ack_manual -n node2 doesn't work saying there is no FIFO in
>> /tmp. fence_ack_manual -n node2 -e do work and then service S2 fails
>> over to node2.
>>
>> Trying to find out why fence_manual is reporting error? node2 is
>> pingable hostname and its entry is in /etc/hosts of node1 (and vice
>> versa). ?I also see that after failover when I do "clustat -x" I get
>> cluster status (in XML format) with -
>>
>> <?xml version="1.0"?>
>> <clustat version="4.1.1">
>> ? <groups>
>> ? ? <group name="service:S" state="111" state_str="starting" flags="0"
>> flags_str="" owner="node1" last_owner="node1" restarts="0"
>> last_transition="1294676678" last_transition_str="xxxxxxxxxx"/>
>> ? </groups>
>> </clustat>
>>
>> I was expecting last_owner would correspond to node2(because this is
>> node which was running service S and has failed); which would indicate
>> that service is failing over FROM node2. Is there a way that node in
>> cluster (a node on which service is failing over) could determine from
>> which node the given service is failing over?
>>
>> Any inputs would be greatly appreciated.
>>
>> Thanks
>>
>> Yours gratefully
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
> --
> Xavier Montagutelli ? ? ? ? ? ? ? ? ? ? ?Tel : +33 (0)5 55 45 77 20
> Service Commun Informatique ? ? ? ? ? ? ?Fax : +33 (0)5 55 45 75 95
> Universite de Limoges
> 123, avenue Albert Thomas
> 87060 Limoges cedex
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From fdinitto at redhat.com  Tue Jan 11 05:59:11 2011
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Tue, 11 Jan 2011 06:59:11 +0100
Subject: [Linux-cluster] Problems with a script when is launched
	via	rgmanager
In-Reply-To: <4D2B7743.2060800@gmail.com>
References: <4D2B7743.2060800@gmail.com>
Message-ID: <4D2BF1AF.8030304@redhat.com>

On 01/10/2011 10:16 PM, carlopmart wrote:
> Hi all,
> 
>  I am trying to set up a splunk cluster service on two RHEL5.5 hosts
> (fully updated). My problems becomes when I trying to setup this service
> under rgmanager: script ever fails. If I launch the script manually, all
> works as expected. If I test the service using rg_test comand, all works
> ok as expected.

Do you have selinux enabled? If so you probably need to allow rgmanager
to run the service or set selinux in permissive mode and check for avc.

> #!/bin/sh -x
> # Splunk:       Controls Splunk on Redhat-based systems
> #
> # chkconfig: 2345 99 15
> # description: Starts and stops Splunk
> #
> # This will work on Redhat systems (maybe others too)
> 
> # Source function library.
> . /etc/init.d/functions
> 
> if [ ! -d /data/services/siem/splunk/etc ]; then
>         exit 1
> fi
> 
> HOME="/data/services/siem/splunk"
> DIRECTORY="/data/services/siem/splunk"
> 
> export HOME
> 
> 
> 
> start() {
>         echo -n "Starting Splunk: "
>         sudo -H -u splunk ${DIRECTORY}/bin/splunk start > /dev/null
>         RETVAL=$?
>         if [ $RETVAL -eq 0 ]; then
>                 success
>         else
>                 failure
>         fi
>         echo
>         return $RETVAL
> }
> 
> stop() {
>         echo -n "Stopping Splunk: "
>         sudo -H -u splunk ${DIRECTORY}/bin/splunk stop > /dev/null
>         RETVAL=$?
>         if [ $RETVAL -eq 0 ]; then
>                 success
>         else
>                 failure
>         fi
>         echo
>         return $RETVAL
> }
> 
> status() {
>         exit 0
> }
> 
> case "$1" in
>   start)
>         start
>         ;;
>   stop)
>         stop
>         ;;
>   restart)
>         stop
>         start
>         ;;
>   status)
>         status
>         ;;
>   *)
>         echo $"Usage: $0 {start|stop|restart|status}"
>         exit 1
> esac
> 
> exit $?
> 
> 
>  How can I debug this error?? I don't why fails when is launched via
> rgmanager ...
> 

This script is dangerous. It doesn't implement a proper status check,
returning always 0. rgmanager will not be able to monitor the status of
the application and in case of application failure, it cannot take any
action to recover it.

Fabio



From carlopmart at gmail.com  Tue Jan 11 08:44:05 2011
From: carlopmart at gmail.com (carlopmart)
Date: Tue, 11 Jan 2011 09:44:05 +0100
Subject: [Linux-cluster] Problems with a script when is launched via
 rgmanager (SOLVED)
In-Reply-To: <4D2BF1AF.8030304@redhat.com>
References: <4D2B7743.2060800@gmail.com> <4D2BF1AF.8030304@redhat.com>
Message-ID: <4D2C1855.9010608@gmail.com>

On 01/11/2011 06:59 AM, Fabio M. Di Nitto wrote:
> On 01/10/2011 10:16 PM, carlopmart wrote:
>> Hi all,
>>
>>   I am trying to set up a splunk cluster service on two RHEL5.5 hosts
>> (fully updated). My problems becomes when I trying to setup this service
>> under rgmanager: script ever fails. If I launch the script manually, all
>> works as expected. If I test the service using rg_test comand, all works
>> ok as expected.
>
> Do you have selinux enabled? If so you probably need to allow rgmanager
> to run the service or set selinux in permissive mode and check for avc.
>
>> #!/bin/sh -x
>> # Splunk:       Controls Splunk on Redhat-based systems
>> #
>> # chkconfig: 2345 99 15
>> # description: Starts and stops Splunk
>> #
>> # This will work on Redhat systems (maybe others too)
>>
>> # Source function library.
>> . /etc/init.d/functions
>>
>> if [ ! -d /data/services/siem/splunk/etc ]; then
>>          exit 1
>> fi
>>
>> HOME="/data/services/siem/splunk"
>> DIRECTORY="/data/services/siem/splunk"
>>
>> export HOME
>>
>>
>>
>> start() {
>>          echo -n "Starting Splunk: "
>>          sudo -H -u splunk ${DIRECTORY}/bin/splunk start>  /dev/null
>>          RETVAL=$?
>>          if [ $RETVAL -eq 0 ]; then
>>                  success
>>          else
>>                  failure
>>          fi
>>          echo
>>          return $RETVAL
>> }
>>
>> stop() {
>>          echo -n "Stopping Splunk: "
>>          sudo -H -u splunk ${DIRECTORY}/bin/splunk stop>  /dev/null
>>          RETVAL=$?
>>          if [ $RETVAL -eq 0 ]; then
>>                  success
>>          else
>>                  failure
>>          fi
>>          echo
>>          return $RETVAL
>> }
>>
>> status() {
>>          exit 0
>> }
>>
>> case "$1" in
>>    start)
>>          start
>>          ;;
>>    stop)
>>          stop
>>          ;;
>>    restart)
>>          stop
>>          start
>>          ;;
>>    status)
>>          status
>>          ;;
>>    *)
>>          echo $"Usage: $0 {start|stop|restart|status}"
>>          exit 1
>> esac
>>
>> exit $?
>>
>>
>>   How can I debug this error?? I don't why fails when is launched via
>> rgmanager ...
>>
>
> This script is dangerous. It doesn't implement a proper status check,
> returning always 0. rgmanager will not be able to monitor the status of
> the application and in case of application failure, it cannot take any
> action to recover it.
>
> Fabio
>

Finally I have found the problem. I have setup splunk user with /sbin/nologin shell 
and I have changed to /bin/sh and the script works now.

Yes, I know that using exit 0 under status flag is dangerous, but I have configured 
like this to test the script first.

Many thanks.


-- 
CL Martinez
carlopmart {at} gmail {d0t} com



From emilio at ugr.es  Tue Jan 11 09:14:57 2011
From: emilio at ugr.es (Emilio Arjona)
Date: Tue, 11 Jan 2011 10:14:57 +0100
Subject: [Linux-cluster] Red Hat Cluster 5.4 to 5.5 Update / gfs2-hangalyzer
Message-ID: <AANLkTikx0WgyZrPMeBRqTyDNa5dmL=zrniO-Y0GE_7-p@mail.gmail.com>

Hi experts,

we are planning to update our cluster packages but we don't know what's the
best procedure to do it:
Must we shut down the whole cluster, update all the nodes individually while
the cluster is off, and then join them all again?
What are the packages to update? Order?

And another question:
Where can I download gfs2-hangalyzer?

Regards,

-- 
*******************************************
Emilio Arjona Heredia
*******************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110111/6df4e361/attachment.htm>

From kclo2000 at gmail.com  Tue Jan 11 09:19:17 2011
From: kclo2000 at gmail.com (KC LO)
Date: Tue, 11 Jan 2011 17:19:17 +0800
Subject: [Linux-cluster] unfreeze a node of the cluster and cause reboot
	remaining nodes
Message-ID: <AANLkTinD6OpD0eKJO7i3rsb7w_n6f2_hvR1Sg7ztL83X@mail.gmail.com>

Dear all,

We have set up a 3 + 1 cluster which is 3 active node and 1 standby nodes
and quorum disks.

clustat
Member Status: Quorate
 Member Name                             ID   Status
 ------ ----                             ---- ------
 servera                                1 Online, rgmanager
 serverb                                2 Online, rgmanager
 serverc                                3 Online, rgmanager
 standby                               4 Online, Local, rgmanager
 /dev/emcpowers                    0 Online, Quorum Disk

 Service Name                 Owner (Last)                   State
 service:servicea              servera                   started
 service:serviceb              serverb                   started
 service:servicec              serverc                   started

Any server failure and cause server relocate to the standby server and
basically all cluster functions properly.

However, when I type clusvcadm -Z servera, it can sucessfully freeze the
nodes.  However, if I type clusvcadm -U servera to unfreeze the node, it
will check the status of the running application under cluster monitoring.
But don't know why it return status failed while the application is running
properly.  It will then try to stop the application and reported that it
failed to unmount the partition and cause servera rebooted.  During servera
reboot, servicea can not failover to standby node and the service state
shows "recoverable".  After servera rebooted successfully, servicea can run
on servera but then serverb and serverc reboot togeter.

Do you have any idea?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110111/f83b5370/attachment.htm>

From mailtoaneeshvs at gmail.com  Tue Jan 11 09:51:16 2011
From: mailtoaneeshvs at gmail.com (aneesh vs)
Date: Tue, 11 Jan 2011 15:21:16 +0530
Subject: [Linux-cluster] unfreeze a node of the cluster and cause reboot
 remaining nodes
In-Reply-To: <AANLkTinD6OpD0eKJO7i3rsb7w_n6f2_hvR1Sg7ztL83X@mail.gmail.com>
References: <AANLkTinD6OpD0eKJO7i3rsb7w_n6f2_hvR1Sg7ztL83X@mail.gmail.com>
Message-ID: <AANLkTik+3+h-oTw-_s36+B6O3+HWee5RSnuKBexacM2o@mail.gmail.com>

Hello,

Read "man clusvcadm" to know about freeze feature . clusvcadm -Z doesn't
freeze a node, instead it will freeze monitoring the service.
Please refer https://access.redhat.com/kb/docs/DOC-43505 and
http://sources.redhat.com/cluster/wiki/ServiceFreeze more details about this
feature.

The service going in failed may have many reasons. If any of resource's
status check fails, rgmanager will use recovery policy and try to restart or
relocate. But as you said, if there is any issue in unmounting fs resource,
rgmanager will move service to failed state. If you set self_fence=1 in fs
resource section and if force_unmount to fs resource is not successful, node
will do self fence.


>>During servera reboot, servicea can not failover to standby node and the
service state shows "recoverable"
You atleasr need paste snip of /var/log/messages from all nodes for this
time to explain exactly what is happening.

>>After servera rebooted successfully, servicea can run on servera but then
serverb and serverc reboot togeter.
Need to check logs to know reason for this.



On Tue, Jan 11, 2011 at 2:49 PM, KC LO <kclo2000 at gmail.com> wrote:

> Dear all,
>
> We have set up a 3 + 1 cluster which is 3 active node and 1 standby nodes
> and quorum disks.
>
> clustat
> Member Status: Quorate
>  Member Name                             ID   Status
>  ------ ----                             ---- ------
>  servera                                1 Online, rgmanager
>  serverb                                2 Online, rgmanager
>  serverc                                3 Online, rgmanager
>  standby                               4 Online, Local, rgmanager
>  /dev/emcpowers                    0 Online, Quorum Disk
>
>  Service Name                 Owner (Last)                   State
>  service:servicea              servera                   started
>  service:serviceb              serverb                   started
>  service:servicec              serverc                   started
>
> Any server failure and cause server relocate to the standby server and
> basically all cluster functions properly.
>
> However, when I type clusvcadm -Z servera, it can sucessfully freeze the
> nodes.  However, if I type clusvcadm -U servera to unfreeze the node, it
> will check the status of the running application under cluster monitoring.
> But don't know why it return status failed while the application is running
> properly.  It will then try to stop the application and reported that it
> failed to unmount the partition and cause servera rebooted.  During servera
> reboot, servicea can not failover to standby node and the service state
> shows "recoverable".  After servera rebooted successfully, servicea can run
> on servera but then serverb and serverc reboot togeter.
>
> Do you have any idea?
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110111/45192f50/attachment.htm>

From mailtoaneeshvs at gmail.com  Tue Jan 11 10:02:44 2011
From: mailtoaneeshvs at gmail.com (aneesh vs)
Date: Tue, 11 Jan 2011 15:32:44 +0530
Subject: [Linux-cluster] Red Hat Cluster 5.4 to 5.5 Update /
	gfs2-hangalyzer
In-Reply-To: <AANLkTikx0WgyZrPMeBRqTyDNa5dmL=zrniO-Y0GE_7-p@mail.gmail.com>
References: <AANLkTikx0WgyZrPMeBRqTyDNa5dmL=zrniO-Y0GE_7-p@mail.gmail.com>
Message-ID: <AANLkTin+N1BQerRgBRqEjmT6w3_rO6pK6E-x5g+3-DEg@mail.gmail.com>

Hello,

You may do the update in multiple ways. If you don't want to take full
cluster down, you can stop cluster services on one node and perform package
update on it and rejoin it. Like this way do the update on one by one.

Instead if you can take maintenance window, you can stop cluster on all
nodes and perform update. Do a reboot of nodes at a time and allow it to
join fresh.

You will need to update cman,openais,rgmanager,lvm2-cluster(if you are using
clvm) gfs-utils, gfs2-utils, kmod-gfs (only if you are using gfs). If the
system is registered with rhn you give something like "yum install cman
openais rgmanager ......." it will install it in proper order. I think you
can do yum update using cluster group also so that it will update all of
packages in cluster group.

The kbase article https://access.redhat.com/kb/docs/DOC-43210 may be helpful
to you.

I would recommend to do this update in a test cluster first. If you don;t
have test cluster, perform this in off business hours or maintenance window

Regards
Aneesh

On Tue, Jan 11, 2011 at 2:44 PM, Emilio Arjona <emilio at ugr.es> wrote:

> Hi experts,
>
> we are planning to update our cluster packages but we don't know what's the
> best procedure to do it:
> Must we shut down the whole cluster, update all the nodes individually
> while the cluster is off, and then join them all again?
> What are the packages to update? Order?
>
> And another question:
> Where can I download gfs2-hangalyzer?
>
> Regards,
>
> --
> *******************************************
> Emilio Arjona Heredia
> *******************************************
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110111/15231446/attachment.htm>

From kclo2000 at gmail.com  Wed Jan 12 04:02:52 2011
From: kclo2000 at gmail.com (KC LO)
Date: Wed, 12 Jan 2011 12:02:52 +0800
Subject: [Linux-cluster] Cluster service monitoring on application
Message-ID: <AANLkTinHgcmYcfY03FkmGbgJBWaUTShsmKmtCnRkwL09@mail.gmail.com>

Hi all,

For our cluster service group, we have setup mount point, ip address and an
application(like apache).  For the current setup, if any of the component
failure, it will trigger failover.  For application mointoring, can we
configure in cluster which only restart the application during application
failure instead of trigger failover.

THanks!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110112/6b4b1347/attachment.htm>

From parvez.h.shaikh at gmail.com  Wed Jan 12 06:04:24 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Wed, 12 Jan 2011 11:34:24 +0530
Subject: [Linux-cluster] Determining failed node on another node of cluster
	during failover
Message-ID: <AANLkTikug-LqvrM7bN7+fNXH3k0WGK=d3gz978dTk8my@mail.gmail.com>

Hi all,

Taking this question from another thread, here is a challenge that I am facing -

Following is simple cluster configuration -

Node 1, node 2, node 3, and node4 are part of cluster, its
unrestricted unordered fail-over domain with active - active nxn
configuration

So a node 2 can get services from node1, node3 or node4 when any of
these(1,3,4) node fails(e.g. power failure).

In that event I want to find out which of the node has failed over
node2, I was invoking "clustat -x -S service name" on node2 in my
custom agent and was parsing for "last_owner" field to obtain name of
node on which service was previously running.

This however doesn't seem to be working in case if I shutdown node(but
works if I migrate service from one node to another using clusvcadm)

Is there anyway that I can find out which node has failed during
failover of service on a standby node? Any tool which I might have
missed or some command which I can send to ccsd to get this
information

Thanks



From mailtoaneeshvs at gmail.com  Wed Jan 12 06:50:01 2011
From: mailtoaneeshvs at gmail.com (aneesh vs)
Date: Wed, 12 Jan 2011 12:20:01 +0530
Subject: [Linux-cluster] Cluster service monitoring on application
In-Reply-To: <AANLkTinHgcmYcfY03FkmGbgJBWaUTShsmKmtCnRkwL09@mail.gmail.com>
References: <AANLkTinHgcmYcfY03FkmGbgJBWaUTShsmKmtCnRkwL09@mail.gmail.com>
Message-ID: <AANLkTi=1SAgTn2eMhkKxLDaX7DpGJqA_uCgsL6MyMp62@mail.gmail.com>

Hello,

The explanation is not so clear. Which cluster suite version? RHEL5 or
RHEL6?
Are you saying that if cluster(rgmanager) status check fails on your
application script, you want to restart this script instead of full service
failover or restart?
If this is what you are trying to achieve, please refer "Failure Recovery &
Independent Subtrees" section in
http://sources.redhat.com/cluster/wiki/ResourceTrees

Regards
Aneesh

On Wed, Jan 12, 2011 at 9:32 AM, KC LO <kclo2000 at gmail.com> wrote:

> Hi all,
>
> For our cluster service group, we have setup mount point, ip address and an
> application(like apache).  For the current setup, if any of the component
> failure, it will trigger failover.  For application mointoring, can we
> configure in cluster which only restart the application during application
> failure instead of trigger failover.
>
> THanks!
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110112/388d70ad/attachment.htm>

From kitgerrits at gmail.com  Wed Jan 12 09:28:47 2011
From: kitgerrits at gmail.com (Kit Gerrits)
Date: Wed, 12 Jan 2011 10:28:47 +0100
Subject: [Linux-cluster] Determining failed node on another node of
	clusterduring failover
In-Reply-To: <AANLkTikug-LqvrM7bN7+fNXH3k0WGK=d3gz978dTk8my@mail.gmail.com>
Message-ID: <4d2d7453.857a0e0a.5c5f.1722@mx.google.com>


Hello,

If you want to find out which cluster node has failed, you could either
check /var/log/messages and see which member has left the cluster, or you
can set up monitoring to check if your servers are all in good shape.

If you are running a cluster, I would suggest also setting up monitoring.
The monitoring package can then notify you if any cluster member fails.


Regards,

Kit 

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Parvez Shaikh
Sent: woensdag 12 januari 2011 7:04
To: linux clustering
Subject: [Linux-cluster] Determining failed node on another node of
clusterduring failover

Hi all,

Taking this question from another thread, here is a challenge that I am
facing -

Following is simple cluster configuration -

Node 1, node 2, node 3, and node4 are part of cluster, its unrestricted
unordered fail-over domain with active - active nxn configuration

So a node 2 can get services from node1, node3 or node4 when any of
these(1,3,4) node fails(e.g. power failure).

In that event I want to find out which of the node has failed over node2, I
was invoking "clustat -x -S service name" on node2 in my custom agent and
was parsing for "last_owner" field to obtain name of node on which service
was previously running.

This however doesn't seem to be working in case if I shutdown node(but works
if I migrate service from one node to another using clusvcadm)

Is there anyway that I can find out which node has failed during failover of
service on a standby node? Any tool which I might have missed or some
command which I can send to ccsd to get this information

Thanks

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From kitgerrits at gmail.com  Wed Jan 12 09:43:15 2011
From: kitgerrits at gmail.com (Kit Gerrits)
Date: Wed, 12 Jan 2011 10:43:15 +0100
Subject: [Linux-cluster] Howto define two-node cluster in
	enterpriseenvironment
In-Reply-To: <4a614ae3080a5ce8f908a9b877081986@sjolshagen.net>
Message-ID: <4d2d77b9.5989cc0a.6546.0ea5@mx.google.com>


FYI:

http://magazine.redhat.com/2007/12/19/enhancing-cluster-quorum-with-qdisk/
QDisk uses a small 10MB disk partition shared across the cluster. Qdiskd
runs on each node in the cluster, periodically evaluating its own health and
then placing its state information into an assigned portion of the shared
disk area. Each qdiskd then looks at the state of the other nodes in the
cluster as posted in their area of the QDisk partition. When in a healthy
state, the quorum of the cluster adds the count for each node plus the value
of the QDisk partition. In the example above, the total quorum count is 7;
one for each node and 3 for the QDisk partition.

If, on a particular node, QDisk is unable to access its shared disk area
after several attempts, then the qdiskd running on another node in the
cluster will request that the troubled node be fenced. This will reset that
machine and get it into a operational state.


http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Ad
ministration/s1-qdisk-considerations-CA.html
A quorum disk device should be a shared block device with concurrent
read/write access by all nodes in a cluster. The minimum size of the block
device is 10 Megabytes. Examples of shared block devices that can be used by
qdiskd are a multi-port SCSI RAID array, a Fibre Channel RAID SAN, or a
RAID-configured iSCSI target. You can create a quorum disk device with
mkqdisk, the Cluster Quorum Disk Utility. For information about using the
utility refer to the mkqdisk(8) man page. 


A/
The Quorum disk is a raw device, NOT a filesystem, which all servers use
together.
It is meant to be written to from multiple servers at the same time.
(really!)
There is no filesystem on there to kill.
Should the Quorum disk get corrupted, I assume the Qdisk Service will let
you know.

AFAIK, all nodes update a little chunk of the Qdisk to let the others know
they are still alive.
Upon a "split brain" cluster failure, the first node to write to the Qdisk
gets an extra vote.


B/
If you want to share a REAL filesystem LUN between cluster nodes, you can
set is up as a Cluster LVM.
Clustering will then guarantee that only one server can mount it at a time.
Should clustering fail for some reason, the LVM will be disabled.
Should the machine running the LVM volume crash, it will be fenced
(rebooted) and the LVM disabled.
Once the machine has been successfully fenced, the LVM will be re-enabled,
so other cluster members can mount volumes on there.


Regards,

Kit


-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Thomas Sjolshagen
Sent: maandag 10 januari 2011 21:38
To: linux clustering
Subject: Re: [Linux-cluster] Howto define two-node cluster in
enterpriseenvironment

 On Mon, 10 Jan 2011 21:21:45 +0100, "Kit Gerrits" 
 <kitgerrits at gmail.com> wrote:
> Hello fellow administrator,
>
> If you have a SAN...
> Why can't you have the SAN publish the same LUN to the two cluster 
> nodes simultaneously?

 You can, but you minimally need to guarantee (not believe or think, but
 guarantee!) that both nodes do not

 a) write to the same sectors, file systems or LVM volumes at the same  time
(this is actually a whole lot more difficult to do than most people
 think) - including boot sectors, partition tables, LVM metadata, etc,  etc,

 b) think they're exclusively accessing the LUN I.e. there must be
something on the nodes - an application, OS tool or something else -  that
understands that there is more than one reader & writer to a LUN  and thus
synchronizes this.

> It is only used as a raw device, so there should be no ugly filesystem 
> side-effects.

 File systems only serve to make this a lot more obvious to the end user  or
administrator since it's integrity tends to get shot fairly quickly  and
there are integrity checks in place. On raw devices, you get the  "benefit"
of ignorance about the fact that your data is corrupt, unless
 b) above is true.

 Hth,

 // Thomas

>
>
> Regards,
>
> Kit
>
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Andreas 
> Bleischwitz
> Sent: maandag 10 januari 2011 11:25
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] Howto define two-node cluster in 
> enterpriseenvironment
>
> Hello list,
>
> I recently ran into some questions regarding a two-node cluster in an 
> enterprise environment, where single-point-of-failures were tried to 
> be eliminated whenever possible.
>
> The situation is the following:
> Two-node cluster, SAN-based shared storage - multipathed; host-based 
> mirrored, bonded NICS, Quorum device as tie-breaker.
>
> Problem:
> The quorum device is the single-point-of-failure as the SAN-device 
> could fail and hence the quorum-disc wouldn't be accessible.
> The quorum-disc can't be host-based mirrored, as this would require 
> cmirror
> - which depends on a quorate cluster.
> One solution: use storage-based mirroring - with extra costs, limited 
> to no support with mixed storage vendors.
> Another solution: Use a third - no service - node which has to have 
> the same SAN-connections as the other two nodes out of cluster 
> reasons. This node will idle most of the time and therefore be very 
> uneconomic.
>
> How are such situations usually solved using RHCS? There must be a way 
> of configuring a two-nodecluster without having a SPOF defined.
>
> HP had a quorum-host with their no longer maintained Service Guard, 
> which could do quorum for more than on cluster at once.
>
> Any suggestions appreciated.
>
> Best regrads,
>
> Andreas Bleischwitz
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From parvez.h.shaikh at gmail.com  Wed Jan 12 10:01:03 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Wed, 12 Jan 2011 15:31:03 +0530
Subject: [Linux-cluster] Determining failed node on another node of
 clusterduring failover
In-Reply-To: <4d2d7453.857a0e0a.5c5f.1722@mx.google.com>
References: <AANLkTikug-LqvrM7bN7+fNXH3k0WGK=d3gz978dTk8my@mail.gmail.com>
	<4d2d7453.857a0e0a.5c5f.1722@mx.google.com>
Message-ID: <AANLkTi=3=VBixWTroZc==4MLTpYJbTAiARnhvj8kG2e8@mail.gmail.com>

Hi

Is monitoring package part of RHCS? What is name of this component?

Is there any other mechanism which doesn't require to parse
log/messages to determine which node has left the cluster on stand by
node before failover is complee?

Thanks

On Wed, Jan 12, 2011 at 2:58 PM, Kit Gerrits <kitgerrits at gmail.com> wrote:
>
> Hello,
>
> If you want to find out which cluster node has failed, you could either
> check /var/log/messages and see which member has left the cluster, or you
> can set up monitoring to check if your servers are all in good shape.
>
> If you are running a cluster, I would suggest also setting up monitoring.
> The monitoring package can then notify you if any cluster member fails.
>
>
> Regards,
>
> Kit
>
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Parvez Shaikh
> Sent: woensdag 12 januari 2011 7:04
> To: linux clustering
> Subject: [Linux-cluster] Determining failed node on another node of
> clusterduring failover
>
> Hi all,
>
> Taking this question from another thread, here is a challenge that I am
> facing -
>
> Following is simple cluster configuration -
>
> Node 1, node 2, node 3, and node4 are part of cluster, its unrestricted
> unordered fail-over domain with active - active nxn configuration
>
> So a node 2 can get services from node1, node3 or node4 when any of
> these(1,3,4) node fails(e.g. power failure).
>
> In that event I want to find out which of the node has failed over node2, I
> was invoking "clustat -x -S service name" on node2 in my custom agent and
> was parsing for "last_owner" field to obtain name of node on which service
> was previously running.
>
> This however doesn't seem to be working in case if I shutdown node(but works
> if I migrate service from one node to another using clusvcadm)
>
> Is there anyway that I can find out which node has failed during failover of
> service on a standby node? Any tool which I might have missed or some
> command which I can send to ccsd to get this information
>
> Thanks
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From stefan at lsd.co.za  Wed Jan 12 12:41:26 2011
From: stefan at lsd.co.za (Stefan Lesicnik)
Date: Wed, 12 Jan 2011 14:41:26 +0200 (SAST)
Subject: [Linux-cluster] RHCS Multipath / Fence
In-Reply-To: <759846465.318.1294835102074.JavaMail.root@zimbra>
Message-ID: <1227419517.11.1294836086720.JavaMail.root@zimbra>


Hi,

I have a RHEL 5.5 2 node cluster running with a qdisk a 3/2 quorum.
>From the OS i have dual HBAs with 4 paths. Failover works if i remove a fiber it reports down and i can still access my disks over the other 2 paths.

When i activate the cluster and remove one fiber, it fences the node. Is there a config or setting I need to tell the cluster to tolerate 1 fiber failure?

(Thinking about it now, it may be a timeout issue - as i think my multipath has a 60sec port retry value that may be to high for the cluster)

Thanks

Stefan



From mailtoaneeshvs at gmail.com  Wed Jan 12 13:32:02 2011
From: mailtoaneeshvs at gmail.com (aneesh vs)
Date: Wed, 12 Jan 2011 19:02:02 +0530
Subject: [Linux-cluster] RHCS Multipath / Fence
In-Reply-To: <1227419517.11.1294836086720.JavaMail.root@zimbra>
References: <759846465.318.1294835102074.JavaMail.root@zimbra>
	<1227419517.11.1294836086720.JavaMail.root@zimbra>
Message-ID: <AANLkTinWbGuMrAiNN=PZ_Dt=VO=t1_Hj6YKMxiaHr_vi@mail.gmail.com>

Hello,

please check
https://access.redhat.com/kb/docs/DOC-37204
&
 https://access.redhat.com/kb/docs/DOC-2882

Aneesh

On Wed, Jan 12, 2011 at 6:11 PM, Stefan Lesicnik <stefan at lsd.co.za> wrote:

>
> Hi,
>
> I have a RHEL 5.5 2 node cluster running with a qdisk a 3/2 quorum.
> >From the OS i have dual HBAs with 4 paths. Failover works if i remove a
> fiber it reports down and i can still access my disks over the other 2
> paths.
>
> When i activate the cluster and remove one fiber, it fences the node. Is
> there a config or setting I need to tell the cluster to tolerate 1 fiber
> failure?
>
> (Thinking about it now, it may be a timeout issue - as i think my multipath
> has a 60sec port retry value that may be to high for the cluster)
>
> Thanks
>
> Stefan
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110112/7a2ca01f/attachment.htm>

From keith.schincke at gmail.com  Wed Jan 12 13:46:37 2011
From: keith.schincke at gmail.com (Keith Schincke)
Date: Wed, 12 Jan 2011 07:46:37 -0600
Subject: [Linux-cluster] Howto define two-node cluster in
	enterpriseenvironment
In-Reply-To: <4d2d77b9.5989cc0a.6546.0ea5@mx.google.com>
References: <4a614ae3080a5ce8f908a9b877081986@sjolshagen.net>
	<4d2d77b9.5989cc0a.6546.0ea5@mx.google.com>
Message-ID: <AANLkTim4qmcDFYq6oRjfSZKBcRe2Ov=vv_hoZNS_C8Ji@mail.gmail.com>

On Wed, Jan 12, 2011 at 3:43 AM, Kit Gerrits <kitgerrits at gmail.com> wrote:
>
>
> B/
> If you want to share a REAL filesystem LUN between cluster nodes, you can
> set is up as a Cluster LVM.
> Clustering will then guarantee that only one server can mount it at a time.
> Should clustering fail for some reason, the LVM will be disabled.
> Should the machine running the LVM volume crash, it will be fenced
> (rebooted) and the LVM disabled.
> Once the machine has been successfully fenced, the LVM will be re-enabled,
> so other cluster members can mount volumes on there.
>
>

As a quick note, you should not use part of the LVM as the qdisk. The
cluster needs
to be quorate for clvmd to run. In a two node cluster, you need the
first node plus the
qdisk to be quorate. You end up on a chicken and the egg situation.

My two  node cluster uses part of a SAN disk as the qdisk. The rest of
the SAN disk
is a PV used by the LVM.

Keith



From kitgerrits at gmail.com  Wed Jan 12 23:05:28 2011
From: kitgerrits at gmail.com (Kit Gerrits)
Date: Thu, 13 Jan 2011 00:05:28 +0100
Subject: [Linux-cluster] Determining failed node on another node of
	clusterduring failover
In-Reply-To: <AANLkTi=3=VBixWTroZc==4MLTpYJbTAiARnhvj8kG2e8@mail.gmail.com>
Message-ID: <4d2e33b3.ce7c0e0a.266f.467a@mx.google.com>


Hello,

The Clustering software itself monitors nodes and devices in use by cluster
services, but logs to /var/log/messages.
A quick overview is presented by the 'clustat' command.

Monitoring tools are freely available for any platform.
Basic monitoring in Linux is available with Big Brother, Cacti, OpenNMS or
Nagios (in order of increasing complexity).
If you're bound to windows, maybe try ServersCheck .


Parsing logs can be trivial, once you know how.
What do you want to know and when do you want to know it?

Have you looked at 'clustat' and 'cman_tool'?


Regards,

Kit

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Parvez Shaikh
Sent: woensdag 12 januari 2011 11:01
To: linux clustering
Subject: Re: [Linux-cluster] Determining failed node on another node of
clusterduring failover

Hi

Is monitoring package part of RHCS? What is name of this component?

Is there any other mechanism which doesn't require to parse log/messages to
determine which node has left the cluster on stand by node before failover
is complee?

Thanks

On Wed, Jan 12, 2011 at 2:58 PM, Kit Gerrits <kitgerrits at gmail.com> wrote:
>
> Hello,
>
> If you want to find out which cluster node has failed, you could 
> either check /var/log/messages and see which member has left the 
> cluster, or you can set up monitoring to check if your servers are all in
good shape.
>
> If you are running a cluster, I would suggest also setting up monitoring.
> The monitoring package can then notify you if any cluster member fails.
>
>
> Regards,
>
> Kit
>
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Parvez Shaikh
> Sent: woensdag 12 januari 2011 7:04
> To: linux clustering
> Subject: [Linux-cluster] Determining failed node on another node of 
> clusterduring failover
>
> Hi all,
>
> Taking this question from another thread, here is a challenge that I 
> am facing -
>
> Following is simple cluster configuration -
>
> Node 1, node 2, node 3, and node4 are part of cluster, its 
> unrestricted unordered fail-over domain with active - active nxn 
> configuration
>
> So a node 2 can get services from node1, node3 or node4 when any of
> these(1,3,4) node fails(e.g. power failure).
>
> In that event I want to find out which of the node has failed over 
> node2, I was invoking "clustat -x -S service name" on node2 in my 
> custom agent and was parsing for "last_owner" field to obtain name of 
> node on which service was previously running.
>
> This however doesn't seem to be working in case if I shutdown node(but 
> works if I migrate service from one node to another using clusvcadm)
>
> Is there anyway that I can find out which node has failed during 
> failover of service on a standby node? Any tool which I might have 
> missed or some command which I can send to ccsd to get this 
> information
>
> Thanks
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From parvez.h.shaikh at gmail.com  Thu Jan 13 04:34:03 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Thu, 13 Jan 2011 10:04:03 +0530
Subject: [Linux-cluster] Determining failed node on another node of
 clusterduring failover
In-Reply-To: <4d2e33b3.ce7c0e0a.266f.467a@mx.google.com>
References: <AANLkTi=3=VBixWTroZc==4MLTpYJbTAiARnhvj8kG2e8@mail.gmail.com>
	<4d2e33b3.ce7c0e0a.266f.467a@mx.google.com>
Message-ID: <AANLkTimRa56VgPaYsSR5c+Lj=9n3oUhfifoSUpBzrsuB@mail.gmail.com>

Hi,

I have been using clustat command. clustat -x -s servicename to get
following XML file -

<?xml version="1.0"?>
<clustat version="4.1.1">
  <groups>
    <group name="service:service_on_node1" state="112"
state_str="started" flags="0" flags_str="" owner="node1"
last_owner="none" restarts="0" last_transition="1294752663"
last_transition_str="Tue Jan 11 19:01:03 2011"/>
  </groups>
</clustat>

I was under impression that "last_owner" field in the above XML file
should give me node name where service was last running. I was parsing
this XML file to obtain this information.

Note that, this holds true if you migrate or relocate service from one
node to another using clusvcadm or from conga or system-config-luster
BUT if node is shutdown and service relocate to another node,
last_owner is either 'none' or same as current node on which service
is relocated.

Parsing var/messages/log is easy but not optimal solution, it will
need "grep"ing entire log file for some specific message where failed
node name is appearing in clumgr messages.



On Thu, Jan 13, 2011 at 4:35 AM, Kit Gerrits <kitgerrits at gmail.com> wrote:
>
> Hello,
>
> The Clustering software itself monitors nodes and devices in use by cluster
> services, but logs to /var/log/messages.
> A quick overview is presented by the 'clustat' command.
>
> Monitoring tools are freely available for any platform.
> Basic monitoring in Linux is available with Big Brother, Cacti, OpenNMS or
> Nagios (in order of increasing complexity).
> If you're bound to windows, maybe try ServersCheck .
>
>
> Parsing logs can be trivial, once you know how.
> What do you want to know and when do you want to know it?
>
> Have you looked at 'clustat' and 'cman_tool'?
>
>
> Regards,
>
> Kit
>
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Parvez Shaikh
> Sent: woensdag 12 januari 2011 11:01
> To: linux clustering
> Subject: Re: [Linux-cluster] Determining failed node on another node of
> clusterduring failover
>
> Hi
>
> Is monitoring package part of RHCS? What is name of this component?
>
> Is there any other mechanism which doesn't require to parse log/messages to
> determine which node has left the cluster on stand by node before failover
> is complee?
>
> Thanks
>
> On Wed, Jan 12, 2011 at 2:58 PM, Kit Gerrits <kitgerrits at gmail.com> wrote:
>>
>> Hello,
>>
>> If you want to find out which cluster node has failed, you could
>> either check /var/log/messages and see which member has left the
>> cluster, or you can set up monitoring to check if your servers are all in
> good shape.
>>
>> If you are running a cluster, I would suggest also setting up monitoring.
>> The monitoring package can then notify you if any cluster member fails.
>>
>>
>> Regards,
>>
>> Kit
>>
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com
>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Parvez Shaikh
>> Sent: woensdag 12 januari 2011 7:04
>> To: linux clustering
>> Subject: [Linux-cluster] Determining failed node on another node of
>> clusterduring failover
>>
>> Hi all,
>>
>> Taking this question from another thread, here is a challenge that I
>> am facing -
>>
>> Following is simple cluster configuration -
>>
>> Node 1, node 2, node 3, and node4 are part of cluster, its
>> unrestricted unordered fail-over domain with active - active nxn
>> configuration
>>
>> So a node 2 can get services from node1, node3 or node4 when any of
>> these(1,3,4) node fails(e.g. power failure).
>>
>> In that event I want to find out which of the node has failed over
>> node2, I was invoking "clustat -x -S service name" on node2 in my
>> custom agent and was parsing for "last_owner" field to obtain name of
>> node on which service was previously running.
>>
>> This however doesn't seem to be working in case if I shutdown node(but
>> works if I migrate service from one node to another using clusvcadm)
>>
>> Is there anyway that I can find out which node has failed during
>> failover of service on a standby node? Any tool which I might have
>> missed or some command which I can send to ccsd to get this
>> information
>>
>> Thanks
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From stefan at lsd.co.za  Thu Jan 13 05:17:31 2011
From: stefan at lsd.co.za (Stefan Lesicnik)
Date: Thu, 13 Jan 2011 07:17:31 +0200 (SAST)
Subject: [Linux-cluster] RHCS Multipath / Fence
In-Reply-To: <AANLkTinWbGuMrAiNN=PZ_Dt=VO=t1_Hj6YKMxiaHr_vi@mail.gmail.com>
Message-ID: <231016294.3.1294895851234.JavaMail.root@zimbra>

Hi Aneesh, Exactly what I was looking for. Thanks for the pointing it out. Stefan ----- Original Message -----
> Hello,
> please check
> https://access.redhat.com/kb/docs/DOC-37204
> &
> https://access.redhat.com/kb/docs/DOC-2882
> Aneesh
> On Wed, Jan 12, 2011 at 6:11 PM, Stefan Lesicnik < stefan at lsd.co.za >
> wrote:
> > Hi,
> > I have a RHEL 5.5 2 node cluster running with a qdisk a 3/2 quorum.
> > >From the OS i have dual HBAs with 4 paths. Failover works if i
> > >remove
> > >a fiber it reports down and i can still access my disks over the
> > >other 2 paths.
> > When i activate the cluster and remove one fiber, it fences the
> > node.
> > Is there a config or setting I need to tell the cluster to tolerate
> > 1
> > fiber failure?
> > (Thinking about it now, it may be a timeout issue - as i think my
> > multipath has a 60sec port retry value that may be to high for the
> > cluster)
> > Thanks
> > Stefan
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110113/7bfbfcc9/attachment.htm>

From shariq.siddiqui at yahoo.com  Thu Jan 13 14:51:35 2011
From: shariq.siddiqui at yahoo.com (Shariq Siddiqui)
Date: Thu, 13 Jan 2011 06:51:35 -0800 (PST)
Subject: [Linux-cluster] Need a good start with RHCS
In-Reply-To: <AANLkTim4qmcDFYq6oRjfSZKBcRe2Ov=vv_hoZNS_C8Ji@mail.gmail.com>
References: <4a614ae3080a5ce8f908a9b877081986@sjolshagen.net>
	<4d2d77b9.5989cc0a.6546.0ea5@mx.google.com>
	<AANLkTim4qmcDFYq6oRjfSZKBcRe2Ov=vv_hoZNS_C8Ji@mail.gmail.com>
Message-ID: <970156.33359.qm@web39802.mail.mud.yahoo.com>

Dear All,
This is Shariq Siddiqui

I am going to work with RHCS, so I need some good documentations related to this 
to take a start.

also please advice about the hardware constraints in it.

 
Best Regards,

Shariq Siddiqui


      
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110113/53a34783/attachment.htm>

From erenfro at gmail.com  Thu Jan 13 14:56:41 2011
From: erenfro at gmail.com (Eric Renfro)
Date: Thu, 13 Jan 2011 09:56:41 -0500
Subject: [Linux-cluster] Problems with GFS2 faulting.
Message-ID: <4D2F12A9.5080600@gmail.com>

Hello,

I've been having MAJOR issues with GFS2 faulting while doing extremely 
simple operations. My test-bed is my /home being GFS2, storage servers 
being two iSCSI targets running DRBD accross both replicating the data, 
and the client side using open-iscsi to bring it in, multipath'ing both 
nodes. I've done this setup as well without DRBD and multipath and just 
going straight iSCSI with no difference in the problem.

What is being done to cause this error is, I have an Apache server 
running my home page which is simply just Dokuwiki. Dokuwiki uses 
flatfiles for everything so it uses file locking techniques on the wiki 
data files.

After editing documents a few times, the same document or even mixed 
documents, within a short period of time, for example within about 30 
minutes, it causes GFS2 to dump a stack trace and fault and crowbar both 
nodes to a complete halt.

There's 2 client servers using the same GFS2 mountpoint are KVM guests 
on two physical virtual server machines; the storage servers are bare 
metal iSCSI targets with DRBD replication on them over 1 GB ethernet.

The cluster glue is comprised of the clustering PPA stack for Ubuntu 
10.04.1, pacemaker, dlm_controld.pcmk, gfs_controld.pcmk.

The only recovery that can be done to get even one node back online is 
to forcefully take down ALL nodes at once, because gfs_controld.pcmk 
will not die, dlm_controld.pcmk will not die, can't umount /home, can't 
even stop open-iscsi as they just throw more stack traces and time outs. 
After taking them down, I bring one node back up, setup to not 
re-activate the gfs2, but to load gfs2_controld.pcmk, fsck it, and 
finally re-enable the mount before bring the secondary node back online.

Obviously taking all nodes down means 100% downtime during that period 
of recovery, so this completely fails.

gfs_controld.pcmk is setup to start with the command-line arguments: -g 
0 -l 0 -o 0

Here's the stack traces I'm getting when it faults:

Jan 13 03:31:27 cweb1 kernel: [1387920.160141] INFO: task 
flush-251:1:27497 blocked for more than 120 seconds.
Jan 13 03:31:27 cweb1 kernel: [1387920.160802] "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 03:31:27 cweb1 kernel: [1387920.161474] flush-251:1   D 
0000000000000002     0 27497      2 0x00000000
Jan 13 03:31:27 cweb1 kernel: [1387920.161479]  ffff88004854fa10 
0000000000000046 0000000000015bc0 0000000000015bc0
Jan 13 03:31:27 cweb1 kernel: [1387920.161483]  ffff880060bf03b8 
ffff88004854ffd8 0000000000015bc0 ffff880060bf0000
Jan 13 03:31:27 cweb1 kernel: [1387920.161485]  0000000000015bc0 
ffff88004854ffd8 0000000000015bc0 ffff880060bf03b8
Jan 13 03:31:27 cweb1 kernel: [1387920.161488] Call Trace:
Jan 13 03:31:27 cweb1 kernel: [1387920.161508]  [<ffffffffa022e730>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161516]  [<ffffffffa022e73e>] 
gfs2_glock_holder_wait+0xe/0x20 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161540]  [<ffffffff81558fcf>] 
__wait_on_bit+0x5f/0x90
Jan 13 03:31:27 cweb1 kernel: [1387920.161547]  [<ffffffffa022fd9d>] ? 
do_promote+0xcd/0x290 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161555]  [<ffffffffa022e730>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161558]  [<ffffffff81559078>] 
out_of_line_wait_on_bit+0x78/0x90
Jan 13 03:31:27 cweb1 kernel: [1387920.161575]  [<ffffffff810843c0>] ? 
wake_bit_function+0x0/0x40
Jan 13 03:31:27 cweb1 kernel: [1387920.161582]  [<ffffffffa022f971>] 
gfs2_glock_wait+0x31/0x40 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161590]  [<ffffffffa0230975>] 
gfs2_glock_nq+0x2a5/0x360 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161597]  [<ffffffffa022f064>] ? 
gfs2_glock_put+0x104/0x130 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161606]  [<ffffffffa02497f2>] 
gfs2_write_inode+0x82/0x190 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161614]  [<ffffffffa02497ea>] ? 
gfs2_write_inode+0x7a/0x190 [gfs2]
Jan 13 03:31:27 cweb1 kernel: [1387920.161629]  [<ffffffff811661d4>] 
writeback_single_inode+0x2b4/0x3d0
Jan 13 03:31:27 cweb1 kernel: [1387920.161631]  [<ffffffff81166745>] 
writeback_sb_inodes+0x195/0x280
Jan 13 03:31:27 cweb1 kernel: [1387920.161638]  [<ffffffff81061671>] ? 
dequeue_entity+0x1a1/0x1e0
Jan 13 03:31:27 cweb1 kernel: [1387920.161641]  [<ffffffff81166f60>] 
writeback_inodes_wb+0xa0/0x1b0
Jan 13 03:31:27 cweb1 kernel: [1387920.161643]  [<ffffffff811672ab>] 
wb_writeback+0x23b/0x2a0
Jan 13 03:31:27 cweb1 kernel: [1387920.161648]  [<ffffffff81075f3c>] ? 
lock_timer_base+0x3c/0x70
Jan 13 03:31:27 cweb1 kernel: [1387920.161651]  [<ffffffff8116748c>] 
wb_do_writeback+0x17c/0x190
Jan 13 03:31:27 cweb1 kernel: [1387920.161653]  [<ffffffff81076050>] ? 
process_timeout+0x0/0x10
Jan 13 03:31:27 cweb1 kernel: [1387920.161656]  [<ffffffff811674f3>] 
bdi_writeback_task+0x53/0xf0
Jan 13 03:31:27 cweb1 kernel: [1387920.161667]  [<ffffffff8110e9c6>] 
bdi_start_fn+0x86/0x100
Jan 13 03:31:27 cweb1 kernel: [1387920.161669]  [<ffffffff8110e940>] ? 
bdi_start_fn+0x0/0x100
Jan 13 03:31:27 cweb1 kernel: [1387920.161671]  [<ffffffff81084006>] 
kthread+0x96/0xa0
Jan 13 03:31:27 cweb1 kernel: [1387920.161680]  [<ffffffff810131ea>] 
child_rip+0xa/0x20
Jan 13 03:31:27 cweb1 kernel: [1387920.161683]  [<ffffffff81083f70>] ? 
kthread+0x0/0xa0
Jan 13 03:31:27 cweb1 kernel: [1387920.161685]  [<ffffffff810131e0>] ? 
child_rip+0x0/0x20

-- 
Eric Renfro



From swhiteho at redhat.com  Thu Jan 13 15:08:43 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Thu, 13 Jan 2011 15:08:43 +0000
Subject: [Linux-cluster] Reporting GFS2 bugs
Message-ID: <1294931323.2401.52.camel@dolmen>

Hi,

Just a quick clarification on the bast way to report bugs and problems
relating to GFS2.

Firstly, if you are a paying Red Hat customer, all issues should be
reported via the Red Hat support team. That doesn't prevent you from
monitoring any bugzilla opened as part of that process directly, if you
want to. The information below applies to those who are _not_ paying
customers (i.e. members of the community)

All bugs (or potential bugs) should be reported via the Red Hat bugzilla
http://bugzilla.redhat.com and the appropriate distro selected. If you
are using CentOS, report it against the relevant RHEL version and
clearly mark the description of the bug as being from CentOS. If you are
using a non-Red Hat distro or an upstream kernel, then please report it
under Fedora/rawhide and again clearly mark the description with the
exact versions of the kernel and userland in question.

The reason that we ask for things to be reported in this way is so that
we have a single place where all the GFS2 issues can be queried. That
means both kernel and userland issues. Also it will automatically appear
on my todo list, it will be available for others to reference in case
they have the same or a similar issue, and for us to query in order to
look for patterns which may help us track down a problem faster.

Don't worry about whether your issue is a duplicate too much. If it is
obviously a duplicate, then please use the existing open bug, if you are
not sure, then report it as a new issue and we'll mark it as a duplicate
if required. Sometimes it isn't always obvious whether a bug is a
duplicate of an existing issue until the issue is found and fixed.

Bugs reported via IRC or mailing lists are tricky to keep track of, and
are difficult for other people to query in order to see whether they
have the same issue. By all means use those methods to ask about whether
a particular item is a bug, but please always file a bugzilla if that
turns out to be the case.

Questions sent to the mailing list are better since they are more likely
to be easily available via the archives for future reference by others.

One thing which is often missing from bug reports, but is usually vital
when working out what went wrong, is a basic description of the cluster
and the workload in question: how many nodes? What kind of storage? What
was the application which was running when the problem occurred? just
enough to give a picture of the basic architecture involved.

Don't just restrict bug reports to the code itself. If there are errors
in documents or things on the web site, or whatever, then we'd like to
know about those too.

Finally, don't forget that bugzilla can also be used to request new
features. In that case open your bug against fedora/rawhide and put
[RFE] in the description, and "FutureFeature" in the key words.

Do let us know if you have any questions or comments,

Steve.




From finnzi at finnzi.com  Thu Jan 13 15:08:57 2011
From: finnzi at finnzi.com (=?ISO-8859-1?Q?Finnur_=D6rn_Gu=F0mundsson?=)
Date: Thu, 13 Jan 2011 15:08:57 +0000
Subject: [Linux-cluster] Need a good start with RHCS
In-Reply-To: <970156.33359.qm@web39802.mail.mud.yahoo.com>
References: <4a614ae3080a5ce8f908a9b877081986@sjolshagen.net>	<4d2d77b9.5989cc0a.6546.0ea5@mx.google.com>	<AANLkTim4qmcDFYq6oRjfSZKBcRe2Ov=vv_hoZNS_C8Ji@mail.gmail.com>
	<970156.33359.qm@web39802.mail.mud.yahoo.com>
Message-ID: <4D2F1589.1050708@finnzi.com>

On 13.1.2011 14:51, Shariq Siddiqui wrote:
> Dear All,
> This is Shariq Siddiqui
>
> I am going to work with RHCS, so I need some good documentations 
> related to this to take a start.
>
> also please advice about the hardware constraints in it.
>
> Best Regards,
>
> Shariq Siddiqui
>
> View Shariq Siddiqui's profile on LinkedIn 
> <http://www.linkedin.com/in/shariqsiddiqui>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
Hi there,

Then why don't you start with the documentation?
See:
(RHEL5): 
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/index.html
(RHEL6): 
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/index.html

Bgrds,
Finnur
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110113/a26c67bd/attachment.htm>

From work at fajar.net  Thu Jan 13 15:12:16 2011
From: work at fajar.net (Fajar A. Nugraha)
Date: Thu, 13 Jan 2011 22:12:16 +0700
Subject: [Linux-cluster] Need a good start with RHCS
In-Reply-To: <970156.33359.qm@web39802.mail.mud.yahoo.com>
References: <4a614ae3080a5ce8f908a9b877081986@sjolshagen.net>
	<4d2d77b9.5989cc0a.6546.0ea5@mx.google.com>
	<AANLkTim4qmcDFYq6oRjfSZKBcRe2Ov=vv_hoZNS_C8Ji@mail.gmail.com>
	<970156.33359.qm@web39802.mail.mud.yahoo.com>
Message-ID: <AANLkTin=5=B1maRzjfND9+oWgK-Z73Xuvy3yjZYrbPRd@mail.gmail.com>

On Thu, Jan 13, 2011 at 9:51 PM, Shariq Siddiqui
<shariq.siddiqui at yahoo.com>wrote:

> Dear All,
> This is Shariq Siddiqui
>
> I am going to work with RHCS, so I need some good documentations related to
> this to take a start.
>
>
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Suite_Overview/index.html
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/index.html

http://www.redhat.com/rhel/ (there's free evaluation link on that page).

-- 
Fajar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110113/78be93ba/attachment.htm>

From gordan at bobich.net  Thu Jan 13 15:13:23 2011
From: gordan at bobich.net (Gordan Bobic)
Date: Thu, 13 Jan 2011 15:13:23 +0000
Subject: [Linux-cluster] Problems with GFS2 faulting.
In-Reply-To: <4D2F12A9.5080600@gmail.com>
References: <4D2F12A9.5080600@gmail.com>
Message-ID: <4D2F1693.5050009@bobich.net>

Eric Renfro wrote:

> Here's the stack traces I'm getting when it faults:
> 
> Jan 13 03:31:27 cweb1 kernel: [1387920.160141] INFO: task 
> flush-251:1:27497 blocked for more than 120 seconds.
> Jan 13 03:31:27 cweb1 kernel: [1387920.160802] "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.

As the error says, this isn't an actual kernel ooops, it means that 
something in your stack (likely the iSCSI implementation since that is 
the fixed thing throughout what you tested) is blocking somewhere.

What version of RHEL, gfs2 and iscsi are you using? My guess is that 
iscsi might be getting into a race somewhere and locks up. Have you 
tried connecting both clients to just a single server via iscsi to see 
if the problem goes away?

Gordan



From erenfro at gmail.com  Thu Jan 13 15:41:44 2011
From: erenfro at gmail.com (Eric Renfro)
Date: Thu, 13 Jan 2011 10:41:44 -0500
Subject: [Linux-cluster] Problems with GFS2 faulting.
In-Reply-To: <4D2F1693.5050009@bobich.net>
References: <4D2F12A9.5080600@gmail.com> <4D2F1693.5050009@bobich.net>
Message-ID: <4D2F1D38.1030803@gmail.com>

It's not RHEL, as I stated in my post. It's Ubuntu 10.04.1 with the 
ubuntu-ha-maintainer PPA for pacemaker 1.0.8, open-iscsi 2.0.871, and 
gfs2-tools 3.0.7.

I have, also stated, tried without multipathed iSCSI and just used a 
singular iSCSI target for the nodes having problems, with the same 
situation. The issue is strictly with GFS2 somehow after locking and 
unlocking files. In fact, it can't be iSCSI at all, because the root 
filesystem of both nodes are iSCSI targets provided by kvm on the host 
OS, and they have given no issues as a result to iSCSI related issues. 
If it would be caused by iSCSI blocking, it would happen to the root 
filesystem as well I'm sure.

Eric Renfro

On 1/13/2011 10:13 AM, Gordan Bobic wrote:
> Eric Renfro wrote:
>
>> Here's the stack traces I'm getting when it faults:
>>
>> Jan 13 03:31:27 cweb1 kernel: [1387920.160141] INFO: task 
>> flush-251:1:27497 blocked for more than 120 seconds.
>> Jan 13 03:31:27 cweb1 kernel: [1387920.160802] "echo 0 > 
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> As the error says, this isn't an actual kernel ooops, it means that 
> something in your stack (likely the iSCSI implementation since that is 
> the fixed thing throughout what you tested) is blocking somewhere.
>
> What version of RHEL, gfs2 and iscsi are you using? My guess is that 
> iscsi might be getting into a race somewhere and locks up. Have you 
> tried connecting both clients to just a single server via iscsi to see 
> if the problem goes away?
>
> Gordan
>
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From swhiteho at redhat.com  Thu Jan 13 20:52:38 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Thu, 13 Jan 2011 20:52:38 +0000
Subject: [Linux-cluster] Problems with GFS2 faulting.
In-Reply-To: <4D2F1D38.1030803@gmail.com>
References: <4D2F12A9.5080600@gmail.com> <4D2F1693.5050009@bobich.net>
	<4D2F1D38.1030803@gmail.com>
Message-ID: <1294951958.2401.67.camel@dolmen>

Hi,

On Thu, 2011-01-13 at 10:41 -0500, Eric Renfro wrote:
> It's not RHEL, as I stated in my post. It's Ubuntu 10.04.1 with the 
> ubuntu-ha-maintainer PPA for pacemaker 1.0.8, open-iscsi 2.0.871, and 
> gfs2-tools 3.0.7.
> 
> I have, also stated, tried without multipathed iSCSI and just used a 
> singular iSCSI target for the nodes having problems, with the same 
> situation. The issue is strictly with GFS2 somehow after locking and 
> unlocking files. In fact, it can't be iSCSI at all, because the root 
> filesystem of both nodes are iSCSI targets provided by kvm on the host 
> OS, and they have given no issues as a result to iSCSI related issues. 
> If it would be caused by iSCSI blocking, it would happen to the root 
> filesystem as well I'm sure.
> 
> Eric Renfro
> 
I don't know a lot about pacemaker, and in fact the pcmk versions of
dlm_controld and gfs_controld have been removed from more recent
packages. Can you recreate the issue using the "normal" dlm and gfs
controlds?

The back trace you posted earlier seemed to show that there was a
process stuck waiting for a glock. The next part of the debug process is
to figure out which glock is being waited for and why. If you could get
a dump of the glocks from debugfs, that should help point the way,

Steve.


> On 1/13/2011 10:13 AM, Gordan Bobic wrote:
> > Eric Renfro wrote:
> >
> >> Here's the stack traces I'm getting when it faults:
> >>
> >> Jan 13 03:31:27 cweb1 kernel: [1387920.160141] INFO: task 
> >> flush-251:1:27497 blocked for more than 120 seconds.
> >> Jan 13 03:31:27 cweb1 kernel: [1387920.160802] "echo 0 > 
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >
> > As the error says, this isn't an actual kernel ooops, it means that 
> > something in your stack (likely the iSCSI implementation since that is 
> > the fixed thing throughout what you tested) is blocking somewhere.
> >
> > What version of RHEL, gfs2 and iscsi are you using? My guess is that 
> > iscsi might be getting into a race somewhere and locks up. Have you 
> > tried connecting both clients to just a single server via iscsi to see 
> > if the problem goes away?
> >
> > Gordan
> >
> > -- 
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From karlesven at gmail.com  Thu Jan 13 21:39:21 2011
From: karlesven at gmail.com (Sven Karlsson)
Date: Thu, 13 Jan 2011 22:39:21 +0100
Subject: [Linux-cluster] nolock and dlm nodes in the same cluster
Message-ID: <AANLkTinUxoe18imu8FUB0wBhBr+bxnsnxBkP7_o79Yit@mail.gmail.com>

G'day,

We have a GFS2 cluster on a fibre-SAN with three machines, of which
one machine is used for remote backups.

The cluster contains a lot of small files, and the backup operation
takes about a day to complete. When investigating, we found that the
major performance bottleneck was the file locking operations. We
stopped the cluster and mounted the backup-node with the lock_nolock
option, and now backups were blazing.

After careful consideration of the nolock-warning in the documentation
(i.e. corruption and kernel panics may happen), I wonder if that is
still the case if spectator mode is used?

Or is there some other options that are available? The files will not
be modified by the other nodes during this time, so there is no actual
need for file-level locking... but perhaps the DLM is also handling
meta-data and other locking that is necessary and it is therefore not
possible to use nolock?

/Sven



From parvez.h.shaikh at gmail.com  Fri Jan 14 04:27:48 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Fri, 14 Jan 2011 09:57:48 +0530
Subject: [Linux-cluster] Determining failed node on another node of
 clusterduring failover
In-Reply-To: <AANLkTimRa56VgPaYsSR5c+Lj=9n3oUhfifoSUpBzrsuB@mail.gmail.com>
References: <AANLkTi=3=VBixWTroZc==4MLTpYJbTAiARnhvj8kG2e8@mail.gmail.com>
	<4d2e33b3.ce7c0e0a.266f.467a@mx.google.com>
	<AANLkTimRa56VgPaYsSR5c+Lj=9n3oUhfifoSUpBzrsuB@mail.gmail.com>
Message-ID: <AANLkTinbf3EypFLB_xF4+Z07pM=ueU_7KjdEsMCE0Xs7@mail.gmail.com>

Hi

Any idea on how to get name of failed node using available cluster
tools or commands? I have tried clustat but it seems to be producing
unexpected output.

I will have to obtain this information on target host/node; to which
service is relocating as a part of failover.

Thanks in advance

Gratefully yours



On 1/13/11, Parvez Shaikh <parvez.h.shaikh at gmail.com> wrote:
> Hi,
>
> I have been using clustat command. clustat -x -s servicename to get
> following XML file -
>
> <?xml version="1.0"?>
> <clustat version="4.1.1">
>   <groups>
>     <group name="service:service_on_node1" state="112"
> state_str="started" flags="0" flags_str="" owner="node1"
> last_owner="none" restarts="0" last_transition="1294752663"
> last_transition_str="Tue Jan 11 19:01:03 2011"/>
>   </groups>
> </clustat>
>
> I was under impression that "last_owner" field in the above XML file
> should give me node name where service was last running. I was parsing
> this XML file to obtain this information.
>
> Note that, this holds true if you migrate or relocate service from one
> node to another using clusvcadm or from conga or system-config-luster
> BUT if node is shutdown and service relocate to another node,
> last_owner is either 'none' or same as current node on which service
> is relocated.
>
> Parsing var/messages/log is easy but not optimal solution, it will
> need "grep"ing entire log file for some specific message where failed
> node name is appearing in clumgr messages.
>
>
>
> On Thu, Jan 13, 2011 at 4:35 AM, Kit Gerrits <kitgerrits at gmail.com> wrote:
>>
>> Hello,
>>
>> The Clustering software itself monitors nodes and devices in use by
>> cluster
>> services, but logs to /var/log/messages.
>> A quick overview is presented by the 'clustat' command.
>>
>> Monitoring tools are freely available for any platform.
>> Basic monitoring in Linux is available with Big Brother, Cacti, OpenNMS
>> or
>> Nagios (in order of increasing complexity).
>> If you're bound to windows, maybe try ServersCheck .
>>
>>
>> Parsing logs can be trivial, once you know how.
>> What do you want to know and when do you want to know it?
>>
>> Have you looked at 'clustat' and 'cman_tool'?
>>
>>
>> Regards,
>>
>> Kit
>>
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com
>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Parvez Shaikh
>> Sent: woensdag 12 januari 2011 11:01
>> To: linux clustering
>> Subject: Re: [Linux-cluster] Determining failed node on another node of
>> clusterduring failover
>>
>> Hi
>>
>> Is monitoring package part of RHCS? What is name of this component?
>>
>> Is there any other mechanism which doesn't require to parse log/messages
>> to
>> determine which node has left the cluster on stand by node before
>> failover
>> is complee?
>>
>> Thanks
>>
>> On Wed, Jan 12, 2011 at 2:58 PM, Kit Gerrits <kitgerrits at gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> If you want to find out which cluster node has failed, you could
>>> either check /var/log/messages and see which member has left the
>>> cluster, or you can set up monitoring to check if your servers are all
>>> in
>> good shape.
>>>
>>> If you are running a cluster, I would suggest also setting up
>>> monitoring.
>>> The monitoring package can then notify you if any cluster member fails.
>>>
>>>
>>> Regards,
>>>
>>> Kit
>>>
>>> -----Original Message-----
>>> From: linux-cluster-bounces at redhat.com
>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Parvez Shaikh
>>> Sent: woensdag 12 januari 2011 7:04
>>> To: linux clustering
>>> Subject: [Linux-cluster] Determining failed node on another node of
>>> clusterduring failover
>>>
>>> Hi all,
>>>
>>> Taking this question from another thread, here is a challenge that I
>>> am facing -
>>>
>>> Following is simple cluster configuration -
>>>
>>> Node 1, node 2, node 3, and node4 are part of cluster, its
>>> unrestricted unordered fail-over domain with active - active nxn
>>> configuration
>>>
>>> So a node 2 can get services from node1, node3 or node4 when any of
>>> these(1,3,4) node fails(e.g. power failure).
>>>
>>> In that event I want to find out which of the node has failed over
>>> node2, I was invoking "clustat -x -S service name" on node2 in my
>>> custom agent and was parsing for "last_owner" field to obtain name of
>>> node on which service was previously running.
>>>
>>> This however doesn't seem to be working in case if I shutdown node(but
>>> works if I migrate service from one node to another using clusvcadm)
>>>
>>> Is there anyway that I can find out which node has failed during
>>> failover of service on a standby node? Any tool which I might have
>>> missed or some command which I can send to ccsd to get this
>>> information
>>>
>>> Thanks
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>



From kclo2000 at gmail.com  Fri Jan 14 04:39:37 2011
From: kclo2000 at gmail.com (KC LO)
Date: Fri, 14 Jan 2011 12:39:37 +0800
Subject: [Linux-cluster] Testing cluster failover
Message-ID: <AANLkTikG_9n2qR1TRGEo7ZYmbhim5YKK2wzG33=dMyZV@mail.gmail.com>

Dear Friends,

Thanks for your advise on my new cluster setup.  I am going into failover
testing of the cluster.

My configuraiton involves 3 active member and 1 standby servers and running
Redhat 5.5 with Cluster installed.
In my active server, when I type "reboot", it can successfully relocate the
service to the standby node.
However, If I type "init 0" to simulate server failure, it can't relocate to
the standby node.  Is it the correct behaviour?
Anything that I should check.  Any advice?

Thanks for your help!

Below is the rgmanager log file from the standby server.  It can detect the
server down.

*Jan 14 11:24:53 yzbstb01 clurgmgrd[16286]: <notice> Member 1 shutting down
*Jan 14 11:25:17 yzbstb01 openais[15365]: [TOTEM] The token was lost in the
OPERATIONAL state.
Jan 14 11:25:17 yzbstb01 openais[15365]: [TOTEM] Receive multicast socket
recv buffer size (320000 bytes).
Jan 14 11:25:17 yzbstb01 openais[15365]: [TOTEM] Transmit multicast socket
send buffer size (262142 bytes).
Jan 14 11:25:17 yzbstb01 openais[15365]: [TOTEM] entering GATHER state from
2.
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] entering GATHER state from
0.
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] Saving state aru c8 high
seq received c8
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] Storing new sequence id for
ring 5e8
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] entering COMMIT state.
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] entering RECOVERY state.
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] position [0] member
10.10.10.164:
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] previous ring seq 1508 rep
10.10.10.164
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] aru c8 high delivered c8
received flag 1
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] position [1] member
10.10.10.165:
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] previous ring seq 1508 rep
10.10.10.164
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] aru c8 high delivered c8
received flag 1
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] position [2] member
10.10.10.167:
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] previous ring seq 1508 rep
10.10.10.164
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] aru c8 high delivered c8
received flag 1
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] Did not need to originate
any messages in recovery.
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] CLM CONFIGURATION CHANGE
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] New Configuration:
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.164)
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.165)
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.167)
*Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] Members Left:
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.166)
*Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] Members Joined:
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] CLM CONFIGURATION CHANGE
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] New Configuration:
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.164)
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.165)
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.167)
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] Members Left:
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] Members Joined:
Jan 14 11:25:37 yzbstb01 openais[15365]: [SYNC ] This node is within the
primary component and will provide service.
Jan 14 11:25:37 yzbstb01 openais[15365]: [TOTEM] entering OPERATIONAL state.
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] got nodejoin message
10.10.10.164
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] got nodejoin message
10.10.10.165
Jan 14 11:25:37 yzbstb01 openais[15365]: [CLM  ] got nodejoin message
10.10.10.167
Jan 14 11:25:37 yzbstb01 openais[15365]: [CPG  ] got joinlist message from
node 4
Jan 14 11:25:37 yzbstb01 openais[15365]: [CPG  ] got joinlist message from
node 3
Jan 14 11:25:37 yzbstb01 openais[15365]: [CPG  ] got joinlist message from
node 2
Jan 14 11:51:34 yzbstb01 openais[15365]: [TOTEM] entering GATHER state from
9.
Jan 14 11:51:37 yzbstb01 openais[15365]: [TOTEM] Saving state aru 25 high
seq received 25
Jan 14 11:51:37 yzbstb01 openais[15365]: [TOTEM] Storing new sequence id for
ring 5ec
Jan 14 11:51:37 yzbstb01 openais[15365]: [TOTEM] entering COMMIT state.
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] The token was lost in the
COMMIT state.
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] entering GATHER state from
4.
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] Storing new sequence id for
ring 5f0
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] entering COMMIT state.
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] entering GATHER state from
13.
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] Storing new sequence id for
ring 5f4
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] entering COMMIT state.
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] entering RECOVERY state.
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] position [0] member
10.10.10.164:
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] previous ring seq 1512 rep
10.10.10.164
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] aru 25 high delivered 25
received flag 1
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] position [1] member
10.10.10.165:
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] previous ring seq 1512 rep
10.10.10.164
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] aru 25 high delivered 25
received flag 1
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] position [2] member
10.10.10.166:
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] previous ring seq 1512 rep
10.10.10.166
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] aru c high delivered c
received flag 1
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] position [3] member
10.10.10.167:
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] previous ring seq 1512 rep
10.10.10.164
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] aru 25 high delivered 25
received flag 1
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] Did not need to originate
any messages in recovery.
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] CLM CONFIGURATION CHANGE
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] New Configuration:
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.164)
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.165)
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.167)
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] Members Left:
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] Members Joined:
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] CLM CONFIGURATION CHANGE
*Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] New Configuration:
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.164)
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.165)
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.166)
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.167)
*Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] Members Left:
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] Members Joined:
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ]        r(0)
ip(10.10.10.166)
Jan 14 11:51:47 yzbstb01 openais[15365]: [SYNC ] This node is within the
primary component and will provide service.
Jan 14 11:51:47 yzbstb01 openais[15365]: [TOTEM] entering OPERATIONAL state.
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] got nodejoin message
10.10.10.165
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] got nodejoin message
10.10.10.166
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] got nodejoin message
10.10.10.167
Jan 14 11:51:47 yzbstb01 openais[15365]: [CLM  ] got nodejoin message
10.10.10.164
Jan 14 11:51:47 yzbstb01 openais[15365]: [CPG  ] got joinlist message from
node 3
Jan 14 11:51:47 yzbstb01 openais[15365]: [CPG  ] got joinlist message from
node 1
Jan 14 11:51:47 yzbstb01 openais[15365]: [CPG  ] got joinlist message from
node 2
Jan 14 11:51:47 yzbstb01 openais[15365]: [CPG  ] got joinlist message from
node 4
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110114/a7306c44/attachment.htm>

From swhiteho at redhat.com  Fri Jan 14 10:26:40 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Fri, 14 Jan 2011 10:26:40 +0000
Subject: [Linux-cluster] nolock and dlm nodes in the same cluster
In-Reply-To: <AANLkTinUxoe18imu8FUB0wBhBr+bxnsnxBkP7_o79Yit@mail.gmail.com>
References: <AANLkTinUxoe18imu8FUB0wBhBr+bxnsnxBkP7_o79Yit@mail.gmail.com>
Message-ID: <1295000800.2451.5.camel@dolmen>

Hi,

On Thu, 2011-01-13 at 22:39 +0100, Sven Karlsson wrote:
> G'day,
> 
> We have a GFS2 cluster on a fibre-SAN with three machines, of which
> one machine is used for remote backups.
> 
> The cluster contains a lot of small files, and the backup operation
> takes about a day to complete. When investigating, we found that the
> major performance bottleneck was the file locking operations. We
> stopped the cluster and mounted the backup-node with the lock_nolock
> option, and now backups were blazing.
> 
> After careful consideration of the nolock-warning in the documentation
> (i.e. corruption and kernel panics may happen), I wonder if that is
> still the case if spectator mode is used?
> 
Spectator mode is bascially the same as a mount of a read only block
device. It is intended to allow nodes which will never write to the
block device to mount the file system read only without the requirement
for a journal.

Although there is nothing to stop you using it in conjunction with
nolock it was not intended to be used that way and it doesn't make sense
to do that.

> Or is there some other options that are available? The files will not
> be modified by the other nodes during this time, so there is no actual
> need for file-level locking... but perhaps the DLM is also handling
> meta-data and other locking that is necessary and it is therefore not
> possible to use nolock?
> 
> /Sven
> 
The dlm does not handle metadata, it only deals with the locking. Is the
issue really the dlm, or the fact that there were other nodes mounted
during the backup process?

There is no harm in unmounting the cluster filesystem on all nodes and
then mounting it on exactly one node with lock_nolock to back it up. The
only issue is that you have to be very careful in the commands that you
issue in order to be certain that it has not been left accidentally
mounted on one of the cluster nodes.

I suspect though, that it is not the dlm itself, but the overheads of
passing the locks from the other nodes that is the issue here,

Steve.

> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From sunhux at gmail.com  Fri Jan 14 16:22:07 2011
From: sunhux at gmail.com (sunhux G)
Date: Sat, 15 Jan 2011 00:22:07 +0800
Subject: [Linux-cluster] clamav upgrade issue
Message-ID: <AANLkTi=Z8XJkkn_f03mmSes0kbsURN8PzZzkve1ZV_GA@mail.gmail.com>

I have an old version (ie Ver 0.91.2-1.el4) currently running on my
RHES 4.x


First, I tried installing two releases' rpm of clamav but got the errors
below:

# rpm -i ./clamd-0.96.5-1.el4.rf.i386.rpm
error: Failed dependencies:
        clamav = 0.96.5-1.el4.rf is needed by clamd-0.96.5-1.el4.rf.i386
        libclamav.so.6 is needed by clamd-0.96.5-1.el4.rf.i386
        libclamav.so.6(CLAMAV_PRIVATE) is needed by
clamd-0.96.5-1.el4.rf.i386
        libclamav.so.6(CLAMAV_PUBLIC) is needed by
clamd-0.96.5-1.el4.rf.i386
#
# rpm -i ./clamd-0.96.4-1.el4.rf.i386.rpm
error: Failed dependencies:
        clamav = 0.96.4-1.el4.rf is needed by clamd-0.96.4-1.el4.rf.i386
        libclamav.so.6 is needed by clamd-0.96.4-1.el4.rf.i386
        libclamav.so.6(CLAMAV_PRIVATE) is needed by
clamd-0.96.4-1.el4.rf.i386
        libclamav.so.6(CLAMAV_PUBLIC) is needed by clamd-0.96.4-1.el4.rf



Second, I got another version of rpm from another site but this time it gave
a different problem.  This same problem persisted despite that I've
uninstalled
(ie "rpm -e ... ") :

# rpm -i ./clamav-0.96.1-1.el4.rf.i386.rpm
        file /etc/freshclam.conf from install of clamav-0.96.1-1.el4.rf
conflicts with file from package clamav-0.91.2-1.el4.rf
        file /usr/bin/clamscan from install of clamav-0.96.1-1.el4.rf
conflicts with file from package clamav-0.91.2-1.el4.rf
        file /usr/bin/freshclam from install of clamav-0.96.1-1.el4.rf
conflicts with file from package clamav-0.91.2-1.el4.rf
        file /usr/bin/sigtool from install of clamav-0.96.1-1.el4.rf
conflicts with file from package clamav-0.91.2-1.el4.rf
        file /usr/share/man/man1/clamscan.1.gz from install of
clamav-0.96.1-1.el4.rf conflicts with file from package
clamav-0.91.2-1.el4.rf
        file /usr/share/man/man1/freshclam.1.gz from install of
clamav-0.96.1-1.el4.rf conflicts with file from package
clamav-0.91.2-1.el4.rf
        file /usr/share/man/man1/sigtool.1.gz from install of
clamav-0.96.1-1.el4.rf conflicts with file from package
clamav-0.91.2-1.el4.rf
        file /usr/share/man/man5/freshclam.conf.5.gz from install of
clamav-0.96.1-1.el4.rf conflicts with file from package
clamav-0.91.2-1.el4.rf


# rpm -e clamd-0.91.2-1.el4.rf
# rpm -qa | grep clam    <== this verified the above old rpm is deinstalled

When I repeated "rpm -i ./clamav-0.96.1-1.el4.rf.i386.rpm" above, it still
gave
the same "conflicts" errors.  I also tried renaming the above 8 conflicting
files & reinstall but it still gave the same error

So what do I do next?


Thanks
U
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110115/c62c02e5/attachment.htm>

From esanchezvela.redhatcluster at gmail.com  Sat Jan 15 23:21:38 2011
From: esanchezvela.redhatcluster at gmail.com (Enrique Sanchez)
Date: Sat, 15 Jan 2011 18:21:38 -0500
Subject: [Linux-cluster] Testing cluster failover
In-Reply-To: <AANLkTikG_9n2qR1TRGEo7ZYmbhim5YKK2wzG33=dMyZV@mail.gmail.com>
References: <AANLkTikG_9n2qR1TRGEo7ZYmbhim5YKK2wzG33=dMyZV@mail.gmail.com>
Message-ID: <AANLkTimo9hveoEHs1i4Gr0Ez=EXU_a4hVY=8ah6uTXR3@mail.gmail.com>

On Thu, Jan 13, 2011 at 11:39 PM, KC LO <kclo2000 at gmail.com> wrote:
> Dear Friends,
>
> Thanks for your advise on my new cluster setup.? I am going into failover
> testing of the cluster.
>
> My configuraiton involves 3 active member and 1 standby servers and running
> Redhat 5.5 with Cluster installed.
> In my active server, when I type "reboot", it can successfully relocate the
> service to the standby node.
> However, If I type "init 0" to simulate server failure, it can't relocate to
> the standby node.? Is it the correct behaviour?
> Anything that I should check.? Any advice?
>

there are differences in behavior between an abnormal cluster exit and
a normal one, reboot makes it abnormal while "init 0" first calls all
services running on your run level with an "stop" option, making it
normal.

You need to configure the cluster to force a take-over even on these situations.

regards,
enrique.

-- 
Enrique Sanchez Vela
------------------------------------------



From scooter at cgl.ucsf.edu  Sun Jan 16 00:46:30 2011
From: scooter at cgl.ucsf.edu (Scooter Morris)
Date: Sat, 15 Jan 2011 16:46:30 -0800
Subject: [Linux-cluster] Question about GFS2 and mmap
Message-ID: <4D323FE6.6020406@cgl.ucsf.edu>

We have a RedHat cluster (5.5 currently) with 3 nodes, and are sharing a 
number of gfs2 filesystems across all nodes.  One of the applications we 
run is a standard bioinformatics application called BLAST that searches 
large indexed files to find similar dna (or protein) sequences.  BLAST 
will typically mmap a fair amount of data into memory from the index 
files.  Normally, this significantly speeds up subsequent executions of 
BLAST.  This doesn't appear to work on gfs2, however, when I involve 
other nodes.  For example, if I run blast three times on a single node, 
the first execution is very slow, but subsequent executions are 
significantly quicker.  If I then run it on another node in the cluster 
(accessing the same data files over gfs2), the first execution is slow, 
and subsequent executions are quicker.  This makes sense.  The problem 
is that when I run it on multiple nodes, the speeds of subsequent runs 
on the same node are no quicker.  It almost seems as if gfs2 is flushing 
the in-memory copy (which is read only) immediately when the file is 
accessed on another node.  Is this the case?  If so, is there a reason 
for this, or is it a bug?  If it's a known bug, is there a workaround?

Any help would be appreciated!  This is a critical application for us.

Thanks in advance,

-- scooter



From bernardchew at gmail.com  Sun Jan 16 14:52:02 2011
From: bernardchew at gmail.com (Bernard Chew)
Date: Sun, 16 Jan 2011 22:52:02 +0800
Subject: [Linux-cluster] nolock and dlm nodes in the same cluster
In-Reply-To: <AANLkTinUxoe18imu8FUB0wBhBr+bxnsnxBkP7_o79Yit@mail.gmail.com>
References: <AANLkTinUxoe18imu8FUB0wBhBr+bxnsnxBkP7_o79Yit@mail.gmail.com>
Message-ID: <AANLkTim9rP-AwNBF_eWQgUW5QtumQZbEJnMwxhaFOR2v@mail.gmail.com>

Hi Sven,

Not sure if it helps; have you tried to export GFS2 as NFS read-only
and mount it in the backup node for backups?

Regards,
Bernard

> On Fri, Jan 14, 2011 at 5:39 AM, Sven Karlsson <karlesven at gmail.com> wrote:
> G'day,
>
> We have a GFS2 cluster on a fibre-SAN with three machines, of which
> one machine is used for remote backups.
>
> The cluster contains a lot of small files, and the backup operation
> takes about a day to complete. When investigating, we found that the
> major performance bottleneck was the file locking operations. We
> stopped the cluster and mounted the backup-node with the lock_nolock
> option, and now backups were blazing.
>
> After careful consideration of the nolock-warning in the documentation
> (i.e. corruption and kernel panics may happen), I wonder if that is
> still the case if spectator mode is used?
>
> Or is there some other options that are available? The files will not
> be modified by the other nodes during this time, so there is no actual
> need for file-level locking... but perhaps the DLM is also handling
> meta-data and other locking that is necessary and it is therefore not
> possible to use nolock?
>
> /Sven
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From swhiteho at redhat.com  Sun Jan 16 15:32:49 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Sun, 16 Jan 2011 15:32:49 +0000
Subject: [Linux-cluster] Question about GFS2 and mmap
In-Reply-To: <4D323FE6.6020406@cgl.ucsf.edu>
References: <4D323FE6.6020406@cgl.ucsf.edu>
Message-ID: <1295191969.2606.39.camel@dolmen>

Hi,

On Sat, 2011-01-15 at 16:46 -0800, Scooter Morris wrote:
> We have a RedHat cluster (5.5 currently) with 3 nodes, and are sharing a 
> number of gfs2 filesystems across all nodes.  One of the applications we 
> run is a standard bioinformatics application called BLAST that searches 
> large indexed files to find similar dna (or protein) sequences.  BLAST 
> will typically mmap a fair amount of data into memory from the index 
> files.  Normally, this significantly speeds up subsequent executions of 
> BLAST.  This doesn't appear to work on gfs2, however, when I involve 
> other nodes.  For example, if I run blast three times on a single node, 
> the first execution is very slow, but subsequent executions are 
> significantly quicker.  If I then run it on another node in the cluster 
> (accessing the same data files over gfs2), the first execution is slow, 
> and subsequent executions are quicker.  This makes sense.  The problem 
> is that when I run it on multiple nodes, the speeds of subsequent runs 
> on the same node are no quicker.  It almost seems as if gfs2 is flushing 
> the in-memory copy (which is read only) immediately when the file is 
> accessed on another node.  Is this the case?  If so, is there a reason 
> for this, or is it a bug?  If it's a known bug, is there a workaround?
> 
> Any help would be appreciated!  This is a critical application for us.
> 
> Thanks in advance,
> 
> -- scooter
> 
Are you sure that the noatime mount option has been used? I can't figure
out why that shouldn't work if the BLAST processes are really only
reading the files and not writing to them.

GFS2 is able to tell the difference between read and write accesses to
shared, writable mmap()ed files (unlike GFS which has to assume that all
accesses are write accesses). Some early versions of GFS2 did that too,
but anything recent (has ->page_mkwrite() in the source) and certainly
5.5 does, should be ok.

You can use the glock dump to see what mode the glock associated with
the mmap()ed inode is in. With RHEL6/Fedora/upstream you can use the
tracepoints to watch the state dynamically during the operations. I'm
afraid that isn't available on RHEL5. All you need to know is the inode
number of the file in question and then look for a type 2 glock with the
same number.

Let us know if that helps narrow down the issue. BLAST is something that
I'd like to see running well on GFS2,

Steve.

> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From jeff.sturm at eprize.com  Mon Jan 17 15:31:50 2011
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Mon, 17 Jan 2011 10:31:50 -0500
Subject: [Linux-cluster] nolock and dlm nodes in the same cluster
In-Reply-To: <1295000800.2451.5.camel@dolmen>
References: <AANLkTinUxoe18imu8FUB0wBhBr+bxnsnxBkP7_o79Yit@mail.gmail.com>
	<1295000800.2451.5.camel@dolmen>
Message-ID: <64D0546C5EBBD147B75DE133D798665F06A12A4E@hugo.eprize.local>

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of Steven Whitehouse
> 
> There is no harm in unmounting the cluster filesystem on all nodes and
then mounting
> it on exactly one node with lock_nolock to back it up. The only issue
is that you have to
> be very careful in the commands that you issue in order to be certain
that it has not
> been left accidentally mounted on one of the cluster nodes.

The obvious downside is that the GFS volume cannot be used at all during
the lock_nolock mount/backup.

Using a SAN that supports LUN snapshots, there is another way to get a
backup.  You can:

1. Freeze the GFS volume
2. Create a snapshot
3. Unfreeze GFS
4. Mount the snapshot with lock_nolock
5. Perform a conventional backup of the mounted snapshot

The first three steps should complete very quickly, minimizing downtime.
Steps 4 and 5 are performed on the snapshot only and do not disrupt the
original volume, nor does this result in increased dlm traffic.  The
only performance impact may be from increased block I/O from the SAN
during the backup.

-Jeff





From scooter at cgl.ucsf.edu  Mon Jan 17 19:06:53 2011
From: scooter at cgl.ucsf.edu (Scooter Morris)
Date: Mon, 17 Jan 2011 11:06:53 -0800
Subject: [Linux-cluster] Question about GFS2 and mmap
In-Reply-To: <1295191969.2606.39.camel@dolmen>
References: <4D323FE6.6020406@cgl.ucsf.edu> <1295191969.2606.39.camel@dolmen>
Message-ID: <4D34934D.4040106@cgl.ucsf.edu>

Steven,
     Thanks for getting back to me.  Yes, I've checked and noatime is 
definitely set.  While blast was running, I did a lockdump and the 
mmaped files had EX locks on them:

G:  s:EX n:2/5497229 f:q t:EX d:EX/0 l:0 a:0 r:3
  I: n:1055314/88699433 t:8 f:0x10 d:0x00000000 s:55237024/55237024

where inode 88699433 is one of the mapped files:

[root at crick blast]# ls -li /databases/mol/blast/db_current/nr.01.pin
88699433 -rw-r--r-- 1 rpcuser sacs 55237024 Jan 17 02:53 
/databases/mol/blast/db_current/nr.01.pin

so that explains the behavior.  What I don't understand is why they had 
EX locks.  I did an strace of the blast, and what I see when the files 
are mmaped is something like:

stat("/databases/mol/blast/db/nr.01.pin", {st_mode=S_IFREG|0644, 
st_size=55237024, ...}) = 0
open("/databases/mol/blast/db/nr.01.pin", O_RDONLY) = 8
mmap(NULL, 55237024, PROT_READ, MAP_SHARED, 8, 0) = 0x2b9ec1a14000

Where /databases/mol/blast is the gfs2 filesystem.  So, the files are 
not opened read/write, and the mmap'ed segment is not read/write.  It's 
not clear why gfs2 would create an exclusive glock for this file?  Does 
this make any sense to you?

-- scooter

On 01/16/2011 07:32 AM, Steven Whitehouse wrote:
> Hi,
>
> On Sat, 2011-01-15 at 16:46 -0800, Scooter Morris wrote:
>> We have a RedHat cluster (5.5 currently) with 3 nodes, and are sharing a
>> number of gfs2 filesystems across all nodes.  One of the applications we
>> run is a standard bioinformatics application called BLAST that searches
>> large indexed files to find similar dna (or protein) sequences.  BLAST
>> will typically mmap a fair amount of data into memory from the index
>> files.  Normally, this significantly speeds up subsequent executions of
>> BLAST.  This doesn't appear to work on gfs2, however, when I involve
>> other nodes.  For example, if I run blast three times on a single node,
>> the first execution is very slow, but subsequent executions are
>> significantly quicker.  If I then run it on another node in the cluster
>> (accessing the same data files over gfs2), the first execution is slow,
>> and subsequent executions are quicker.  This makes sense.  The problem
>> is that when I run it on multiple nodes, the speeds of subsequent runs
>> on the same node are no quicker.  It almost seems as if gfs2 is flushing
>> the in-memory copy (which is read only) immediately when the file is
>> accessed on another node.  Is this the case?  If so, is there a reason
>> for this, or is it a bug?  If it's a known bug, is there a workaround?
>>
>> Any help would be appreciated!  This is a critical application for us.
>>
>> Thanks in advance,
>>
>> -- scooter
>>
> Are you sure that the noatime mount option has been used? I can't figure
> out why that shouldn't work if the BLAST processes are really only
> reading the files and not writing to them.
>
> GFS2 is able to tell the difference between read and write accesses to
> shared, writable mmap()ed files (unlike GFS which has to assume that all
> accesses are write accesses). Some early versions of GFS2 did that too,
> but anything recent (has ->page_mkwrite() in the source) and certainly
> 5.5 does, should be ok.
>
> You can use the glock dump to see what mode the glock associated with
> the mmap()ed inode is in. With RHEL6/Fedora/upstream you can use the
> tracepoints to watch the state dynamically during the operations. I'm
> afraid that isn't available on RHEL5. All you need to know is the inode
> number of the file in question and then look for a type 2 glock with the
> same number.
>
> Let us know if that helps narrow down the issue. BLAST is something that
> I'd like to see running well on GFS2,
>
> Steve.
>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From jose.neto at liber4e.com  Tue Jan 18 11:05:11 2011
From: jose.neto at liber4e.com (jose nuno neto)
Date: Tue, 18 Jan 2011 11:05:11 -0000 (GMT)
Subject: [Linux-cluster] Qdisk and Reboot problem
Message-ID: <933a85aea100106c66316d92b5b64d92.squirrel@208.77.96.130>

I we have several redhat cluster running cman-2.0.115-34.el5 and yesterday
we faced and incident I cant figure it out:

We have 2node cluster+qdisk connected through iscsi with dual paths
we had some network issue and one of paths to the quorum disk was
unavailable. after that I see some qdisk eviction messages and system
reboot shortly after, followed by fencing from the 2nd node some seconds
later

That shouldn't be the correct behavior right? I have reboot=1 on qdisk
settings but it's supposed to work on heuristics downgrade if I understood
it right.
I tried to simulate on the system but cant get this behavior, on the test
systems nothing happens after qdisk unavailability witch is ok

Im posting some info below, some hints on how to investigate would help.
Disabling reboot should disable this anyway, yes?

cluster.conf:
<totem token="15000"/>
<quorumd max_error_cycles="10" tko_up="2" master_wait="2" allow_kill="0"
interval="2" label="OracleOne_Quorum" min_score="1" reboot="1" tko="20"
votes="1" log_level="7" status_file="/tmp/Quorumstatus">
<heuristic interval="5" program="/bin/ping `cat /etc/sysconfig/network  |
awk -F '=' '/GATEWAY/ {print $2}'` -c1 -t2" score="1" tko="50"/>
........
<fence_daemon clean_start="0" post_fail_delay="60" post_join_delay="3"/>
<cman expected_votes="3"/>


cman_tool status:
Version: 6.2.0
Config Version: 23
Cluster Name: OracleOne
Cluster Id: 47365
Cluster Member: Yes
Cluster Generation: 444
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
Quorum device votes: 1
Total votes: 3
Quorum: 2
Active subsystems: 9
Flags: Dirty
Ports Bound: 0 177
Node ID: 2
.....

cat /tmp/Quorumstatus
Time Stamp: Tue Jan 18 11:54:17 2011
Node ID: 2
Score: 1/1 (Minimum required = 1)
Current state: Running
Initializing Set: { }
Visible Set: { 1 2 }
Master Node ID: 1


/var/log/messages:
Jan 17 17:53:35 <kern.err> NODE2 kernel:  connection1:0: ping timeout of 5
secs expired, recv timeout 5, last rx 4554104323, last ping 4554109323,
now 4554114323
Jan 17 17:53:35 <kern.info> NODE2 kernel:  connection1:0: detected conn
error (1011)
Jan 17 17:53:36 <daemon.warn> NODE2 iscsid: Kernel reported iSCSI
connection 1:0 error (1011) state (3)
Jan 17 17:53:36 <daemon.warn> NODE2 iscsid: Kernel reported iSCSI
connection 1:0 error (1011) state (3)
Jan 17 17:53:37 <kern.err> NODE2 kernel:  connection2:0: ping timeout of 5
secs expired, recv timeout 5, last rx 4554106362, last ping 4554111362,
now 4554116362
Jan 17 17:53:37 <kern.info> NODE2 kernel:  connection2:0: detected conn
error (1011)
Jan 17 17:53:38 <daemon.warn> NODE2 iscsid: Kernel reported iSCSI
connection 2:0 error (1011) state (3)
Jan 17 17:53:38 <daemon.warn> NODE2 iscsid: Kernel reported iSCSI
connection 2:0 error (1011) state (3)
Jan 17 17:53:42 <kern.info> NODE2 kernel:  session2: session recovery
timed out after 5 secs
Jan 17 17:53:42 <kern.info> NODE2 kernel: sd 12:0:0:1: SCSI error: return
code = 0x000f0000
Jan 17 17:53:42 <kern.warn> NODE2 kernel: end_request: I/O error, dev
sdau, sector 8
Jan 17 17:53:42 <kern.warn> NODE2 kernel: device-mapper: multipath:
Failing path 66:224.
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: dm-116: remove map (uevent)
Jan 17 17:53:42 <daemon.warn> NODE2 multipathd: sdau: tur checker reports
path is down
Jan 17 17:53:42 <daemon.warn> NODE2 multipathd: sdau: tur checker reports
path is down
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: checker failed path
66:224 in map mpath_iSCSI_qdisk
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: checker failed path
66:224 in map mpath_iSCSI_qdisk
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: mpath_iSCSI_qdisk:
remaining active paths: 1
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: mpath_iSCSI_qdisk:
remaining active paths: 1
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: dm-75: add map (uevent)
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: dm-75: add map (uevent)
Jan 17 17:53:44 <local4.info> NODE2 openais[8952]: [CMAN ] lost contact
with quorum device
Jan 17 17:53:49 <daemon.warn> NODE2 qdiskd[8981]: <warning> qdiskd: read
(system call) has hung for 20 seconds
Jan 17 17:53:49 <daemon.warn> NODE2 qdiskd[8981]: <warning> qdiskd: read
(system call) has hung for 20 seconds
Jan 17 17:53:49 <daemon.warn> NODE2 qdiskd[8981]: <warning> In 20 more
seconds, we will be evicted
Jan 17 17:53:49 <daemon.warn> NODE2 qdiskd[8981]: <warning> In 20 more
seconds, we will be evicted
Jan 17 17:55:10 <kern.info> NODE2 kernel: md: stopping all md devices. (
Shutdown )
Jan 17 17:55:12 <kern.info> NODE2 kernel: bonding: bond2: link status down
for interface eth0, disabling it in 2000 ms.

2nd node /var/log/messages:
Jan 17 17:55:57 <kern.err> NODE1 kernel: dlm: closing connection to node 2
Jan 17 17:56:57 <daemon.info> NODE1 fenced[8982]: NODE2-cl not a cluster
member after 60 sec post_fail_delay
Jan 17 17:56:57 <daemon.info> NODE1 fenced[8982]: fencing node "NODE2-cl"

-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



From swhiteho at redhat.com  Tue Jan 18 11:41:53 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Tue, 18 Jan 2011 11:41:53 +0000
Subject: [Linux-cluster] Question about GFS2 and mmap
In-Reply-To: <4D34934D.4040106@cgl.ucsf.edu>
References: <4D323FE6.6020406@cgl.ucsf.edu>
	<1295191969.2606.39.camel@dolmen>  <4D34934D.4040106@cgl.ucsf.edu>
Message-ID: <1295350913.2533.52.camel@dolmen>

Hi,

On Mon, 2011-01-17 at 11:06 -0800, Scooter Morris wrote:
> Steven,
>      Thanks for getting back to me.  Yes, I've checked and noatime is 
> definitely set.  While blast was running, I did a lockdump and the 
> mmaped files had EX locks on them:
> 
> G:  s:EX n:2/5497229 f:q t:EX d:EX/0 l:0 a:0 r:3
>   I: n:1055314/88699433 t:8 f:0x10 d:0x00000000 s:55237024/55237024
> 
> where inode 88699433 is one of the mapped files:
> 
> [root at crick blast]# ls -li /databases/mol/blast/db_current/nr.01.pin
> 88699433 -rw-r--r-- 1 rpcuser sacs 55237024 Jan 17 02:53 
> /databases/mol/blast/db_current/nr.01.pin
> 
> so that explains the behavior.  What I don't understand is why they had 
> EX locks.  I did an strace of the blast, and what I see when the files 
> are mmaped is something like:
> 
> stat("/databases/mol/blast/db/nr.01.pin", {st_mode=S_IFREG|0644, 
> st_size=55237024, ...}) = 0
> open("/databases/mol/blast/db/nr.01.pin", O_RDONLY) = 8
> mmap(NULL, 55237024, PROT_READ, MAP_SHARED, 8, 0) = 0x2b9ec1a14000
> 
> Where /databases/mol/blast is the gfs2 filesystem.  So, the files are 
> not opened read/write, and the mmap'ed segment is not read/write.  It's 
> not clear why gfs2 would create an exclusive glock for this file?  Does 
> this make any sense to you?
> 
> -- scooter
> 
Well it depends on the history of the glock in question. Once a glock
has been cached in exclusive mode, it will never be dropped unless (a)
there is memory pressure and the glock is reclaimed or (b) another node
requests it.

So if there has been only a single node doing read only work on the
inode and previous to that something on that same node wrote to that
inode, then it will still be in exclusive mode. A read only request from
a different node should push the glock into shared mode (on both nodes)
and if that isn't happening correctly, then it sounds like something has
gone awry somewhere.

The other thing, is that mmap() itself does grab an exclusive lock on
the inode - it has to update atime if that is turned on (but it doesn't
take any locks if atime is turned off as in your case). Note that this
is only the initial call to mmap() though, and the locks taken during
page faults are either shared or exclusive according to the type of
fault (read or write).

If you could check the atime before the mmap and after it, that should
tell us if there is a problem with the noatime check here.

If that still doesn't work, I'll try and duplicate what you are doing
with a small test program and see if I can reproduce the problem,

Steve.


> On 01/16/2011 07:32 AM, Steven Whitehouse wrote:
> > Hi,
> >
> > On Sat, 2011-01-15 at 16:46 -0800, Scooter Morris wrote:
> >> We have a RedHat cluster (5.5 currently) with 3 nodes, and are sharing a
> >> number of gfs2 filesystems across all nodes.  One of the applications we
> >> run is a standard bioinformatics application called BLAST that searches
> >> large indexed files to find similar dna (or protein) sequences.  BLAST
> >> will typically mmap a fair amount of data into memory from the index
> >> files.  Normally, this significantly speeds up subsequent executions of
> >> BLAST.  This doesn't appear to work on gfs2, however, when I involve
> >> other nodes.  For example, if I run blast three times on a single node,
> >> the first execution is very slow, but subsequent executions are
> >> significantly quicker.  If I then run it on another node in the cluster
> >> (accessing the same data files over gfs2), the first execution is slow,
> >> and subsequent executions are quicker.  This makes sense.  The problem
> >> is that when I run it on multiple nodes, the speeds of subsequent runs
> >> on the same node are no quicker.  It almost seems as if gfs2 is flushing
> >> the in-memory copy (which is read only) immediately when the file is
> >> accessed on another node.  Is this the case?  If so, is there a reason
> >> for this, or is it a bug?  If it's a known bug, is there a workaround?
> >>
> >> Any help would be appreciated!  This is a critical application for us.
> >>
> >> Thanks in advance,
> >>
> >> -- scooter
> >>
> > Are you sure that the noatime mount option has been used? I can't figure
> > out why that shouldn't work if the BLAST processes are really only
> > reading the files and not writing to them.
> >
> > GFS2 is able to tell the difference between read and write accesses to
> > shared, writable mmap()ed files (unlike GFS which has to assume that all
> > accesses are write accesses). Some early versions of GFS2 did that too,
> > but anything recent (has ->page_mkwrite() in the source) and certainly
> > 5.5 does, should be ok.
> >
> > You can use the glock dump to see what mode the glock associated with
> > the mmap()ed inode is in. With RHEL6/Fedora/upstream you can use the
> > tracepoints to watch the state dynamically during the operations. I'm
> > afraid that isn't available on RHEL5. All you need to know is the inode
> > number of the file in question and then look for a type 2 glock with the
> > same number.
> >
> > Let us know if that helps narrow down the issue. BLAST is something that
> > I'd like to see running well on GFS2,
> >
> > Steve.
> >
> >> --
> >> Linux-cluster mailing list
> >> Linux-cluster at redhat.com
> >> https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From alvaro.fernandez at sivsa.com  Tue Jan 18 16:36:36 2011
From: alvaro.fernandez at sivsa.com (Alvaro Jose Fernandez)
Date: Tue, 18 Jan 2011 17:36:36 +0100
Subject: [Linux-cluster] Qdisk and Reboot problem
References: <933a85aea100106c66316d92b5b64d92.squirrel@208.77.96.130>
Message-ID: <607D6181D9919041BE792D70EF2AEC48015EDF2C@LIMENS.sivsa.int>

Hi

There was recently a thread about evictions due to multipath timeouts
with qdisk, it's in the " Re: [Linux-cluster] RHCS Multipath / Fence"
thread. The pointed RHN docs aids over suggested timings in
multipath.conf to try avoid these issues, perhaps it would help.

regards,


alvaro 

-----Mensaje original-----
De: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] En nombre de jose nuno neto
Enviado el: martes, 18 de enero de 2011 12:05
Para: linux-cluster at redhat.com
Asunto: [Linux-cluster] Qdisk and Reboot problem

I we have several redhat cluster running cman-2.0.115-34.el5 and
yesterday
we faced and incident I cant figure it out:

We have 2node cluster+qdisk connected through iscsi with dual paths
we had some network issue and one of paths to the quorum disk was
unavailable. after that I see some qdisk eviction messages and system
reboot shortly after, followed by fencing from the 2nd node some seconds
later

That shouldn't be the correct behavior right? I have reboot=1 on qdisk
settings but it's supposed to work on heuristics downgrade if I
understood
it right.
I tried to simulate on the system but cant get this behavior, on the
test
systems nothing happens after qdisk unavailability witch is ok

Im posting some info below, some hints on how to investigate would help.
Disabling reboot should disable this anyway, yes?

cluster.conf:
<totem token="15000"/>
<quorumd max_error_cycles="10" tko_up="2" master_wait="2" allow_kill="0"
interval="2" label="OracleOne_Quorum" min_score="1" reboot="1" tko="20"
votes="1" log_level="7" status_file="/tmp/Quorumstatus">
<heuristic interval="5" program="/bin/ping `cat /etc/sysconfig/network
|
awk -F '=' '/GATEWAY/ {print $2}'` -c1 -t2" score="1" tko="50"/>
........
<fence_daemon clean_start="0" post_fail_delay="60" post_join_delay="3"/>
<cman expected_votes="3"/>


cman_tool status:
Version: 6.2.0
Config Version: 23
Cluster Name: OracleOne
Cluster Id: 47365
Cluster Member: Yes
Cluster Generation: 444
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
Quorum device votes: 1
Total votes: 3
Quorum: 2
Active subsystems: 9
Flags: Dirty
Ports Bound: 0 177
Node ID: 2
.....

cat /tmp/Quorumstatus
Time Stamp: Tue Jan 18 11:54:17 2011
Node ID: 2
Score: 1/1 (Minimum required = 1)
Current state: Running
Initializing Set: { }
Visible Set: { 1 2 }
Master Node ID: 1


/var/log/messages:
Jan 17 17:53:35 <kern.err> NODE2 kernel:  connection1:0: ping timeout of
5
secs expired, recv timeout 5, last rx 4554104323, last ping 4554109323,
now 4554114323
Jan 17 17:53:35 <kern.info> NODE2 kernel:  connection1:0: detected conn
error (1011)
Jan 17 17:53:36 <daemon.warn> NODE2 iscsid: Kernel reported iSCSI
connection 1:0 error (1011) state (3)
Jan 17 17:53:36 <daemon.warn> NODE2 iscsid: Kernel reported iSCSI
connection 1:0 error (1011) state (3)
Jan 17 17:53:37 <kern.err> NODE2 kernel:  connection2:0: ping timeout of
5
secs expired, recv timeout 5, last rx 4554106362, last ping 4554111362,
now 4554116362
Jan 17 17:53:37 <kern.info> NODE2 kernel:  connection2:0: detected conn
error (1011)
Jan 17 17:53:38 <daemon.warn> NODE2 iscsid: Kernel reported iSCSI
connection 2:0 error (1011) state (3)
Jan 17 17:53:38 <daemon.warn> NODE2 iscsid: Kernel reported iSCSI
connection 2:0 error (1011) state (3)
Jan 17 17:53:42 <kern.info> NODE2 kernel:  session2: session recovery
timed out after 5 secs
Jan 17 17:53:42 <kern.info> NODE2 kernel: sd 12:0:0:1: SCSI error:
return
code = 0x000f0000
Jan 17 17:53:42 <kern.warn> NODE2 kernel: end_request: I/O error, dev
sdau, sector 8
Jan 17 17:53:42 <kern.warn> NODE2 kernel: device-mapper: multipath:
Failing path 66:224.
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: dm-116: remove map
(uevent)
Jan 17 17:53:42 <daemon.warn> NODE2 multipathd: sdau: tur checker
reports
path is down
Jan 17 17:53:42 <daemon.warn> NODE2 multipathd: sdau: tur checker
reports
path is down
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: checker failed path
66:224 in map mpath_iSCSI_qdisk
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: checker failed path
66:224 in map mpath_iSCSI_qdisk
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: mpath_iSCSI_qdisk:
remaining active paths: 1
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: mpath_iSCSI_qdisk:
remaining active paths: 1
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: dm-75: add map
(uevent)
Jan 17 17:53:42 <daemon.notice> NODE2 multipathd: dm-75: add map
(uevent)
Jan 17 17:53:44 <local4.info> NODE2 openais[8952]: [CMAN ] lost contact
with quorum device
Jan 17 17:53:49 <daemon.warn> NODE2 qdiskd[8981]: <warning> qdiskd: read
(system call) has hung for 20 seconds
Jan 17 17:53:49 <daemon.warn> NODE2 qdiskd[8981]: <warning> qdiskd: read
(system call) has hung for 20 seconds
Jan 17 17:53:49 <daemon.warn> NODE2 qdiskd[8981]: <warning> In 20 more
seconds, we will be evicted
Jan 17 17:53:49 <daemon.warn> NODE2 qdiskd[8981]: <warning> In 20 more
seconds, we will be evicted
Jan 17 17:55:10 <kern.info> NODE2 kernel: md: stopping all md devices. (
Shutdown )
Jan 17 17:55:12 <kern.info> NODE2 kernel: bonding: bond2: link status
down
for interface eth0, disabling it in 2000 ms.

2nd node /var/log/messages:
Jan 17 17:55:57 <kern.err> NODE1 kernel: dlm: closing connection to node
2
Jan 17 17:56:57 <daemon.info> NODE1 fenced[8982]: NODE2-cl not a cluster
member after 60 sec post_fail_delay
Jan 17 17:56:57 <daemon.info> NODE1 fenced[8982]: fencing node
"NODE2-cl"

-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From scooter at cgl.ucsf.edu  Tue Jan 18 18:59:51 2011
From: scooter at cgl.ucsf.edu (Scooter Morris)
Date: Tue, 18 Jan 2011 10:59:51 -0800
Subject: [Linux-cluster] Question about GFS2 and mmap
In-Reply-To: <1295350913.2533.52.camel@dolmen>
References: <4D323FE6.6020406@cgl.ucsf.edu>	<1295191969.2606.39.camel@dolmen>
	<4D34934D.4040106@cgl.ucsf.edu> <1295350913.2533.52.camel@dolmen>
Message-ID: <4D35E327.20007@cgl.ucsf.edu>

Hi Steven,

As near as I can tell, the access times aren't getting updated, but when 
I run BLAST on a second node, it also gets an EX lock on the inode.  On 
the other hand, if I run them simultaneously, they both get shared (SH) 
locks (!?!):

node1:
lsof:
blastp    21730    root  mem    REG  253,8   46265880 88699363 
/databases/mol/blast/db_current/nr.00.pin

lockdump:
G:  s:SH n:2/54971e3 f:q t:SH d:EX/0 l:0 a:0 r:3
  I: n:1055306/88699363 t:8 f:0x10 d:0x00000000 s:46265880/46265880

node2:
blastp  18052 root  mem    REG  253,0   46265880 88699363 
/databases/mol/blast/db_current/nr.00.pin
G:  s:SH n:2/54971e3 f:q t:SH d:EX/0 l:0 a:0 r:3
  I: n:1055306/88699363 t:8 f:0x10 d:0x00000000 s:46265880/46265880

It seems like the mmap proactively grabs an exclusive lock (perhaps in 
preparation to update the atime?) if it can.  The problem with this 
strategy, is that it appears as though when the lock gets demoted from 
an exclusive lock to a shared lock, any cached pages are flushed, 
resulting in a significant decrease in repeated performance.  The 
biggest issue from out perspective is what appears to be the flush of 
the cached pages since that a huge impact on the performance of this 
particular application.

Thanks again for your help on this!

-- scooter


On 01/18/2011 03:41 AM, Steven Whitehouse wrote:
> Hi,
>
> On Mon, 2011-01-17 at 11:06 -0800, Scooter Morris wrote:
>> Steven,
>>       Thanks for getting back to me.  Yes, I've checked and noatime is
>> definitely set.  While blast was running, I did a lockdump and the
>> mmaped files had EX locks on them:
>>
>> G:  s:EX n:2/5497229 f:q t:EX d:EX/0 l:0 a:0 r:3
>>    I: n:1055314/88699433 t:8 f:0x10 d:0x00000000 s:55237024/55237024
>>
>> where inode 88699433 is one of the mapped files:
>>
>> [root at crick blast]# ls -li /databases/mol/blast/db_current/nr.01.pin
>> 88699433 -rw-r--r-- 1 rpcuser sacs 55237024 Jan 17 02:53
>> /databases/mol/blast/db_current/nr.01.pin
>>
>> so that explains the behavior.  What I don't understand is why they had
>> EX locks.  I did an strace of the blast, and what I see when the files
>> are mmaped is something like:
>>
>> stat("/databases/mol/blast/db/nr.01.pin", {st_mode=S_IFREG|0644,
>> st_size=55237024, ...}) = 0
>> open("/databases/mol/blast/db/nr.01.pin", O_RDONLY) = 8
>> mmap(NULL, 55237024, PROT_READ, MAP_SHARED, 8, 0) = 0x2b9ec1a14000
>>
>> Where /databases/mol/blast is the gfs2 filesystem.  So, the files are
>> not opened read/write, and the mmap'ed segment is not read/write.  It's
>> not clear why gfs2 would create an exclusive glock for this file?  Does
>> this make any sense to you?
>>
>> -- scooter
>>
> Well it depends on the history of the glock in question. Once a glock
> has been cached in exclusive mode, it will never be dropped unless (a)
> there is memory pressure and the glock is reclaimed or (b) another node
> requests it.
>
> So if there has been only a single node doing read only work on the
> inode and previous to that something on that same node wrote to that
> inode, then it will still be in exclusive mode. A read only request from
> a different node should push the glock into shared mode (on both nodes)
> and if that isn't happening correctly, then it sounds like something has
> gone awry somewhere.
>
> The other thing, is that mmap() itself does grab an exclusive lock on
> the inode - it has to update atime if that is turned on (but it doesn't
> take any locks if atime is turned off as in your case). Note that this
> is only the initial call to mmap() though, and the locks taken during
> page faults are either shared or exclusive according to the type of
> fault (read or write).
>
> If you could check the atime before the mmap and after it, that should
> tell us if there is a problem with the noatime check here.
>
> If that still doesn't work, I'll try and duplicate what you are doing
> with a small test program and see if I can reproduce the problem,
>
> Steve.
>
>
>> On 01/16/2011 07:32 AM, Steven Whitehouse wrote:
>>> Hi,
>>>
>>> On Sat, 2011-01-15 at 16:46 -0800, Scooter Morris wrote:
>>>> We have a RedHat cluster (5.5 currently) with 3 nodes, and are sharing a
>>>> number of gfs2 filesystems across all nodes.  One of the applications we
>>>> run is a standard bioinformatics application called BLAST that searches
>>>> large indexed files to find similar dna (or protein) sequences.  BLAST
>>>> will typically mmap a fair amount of data into memory from the index
>>>> files.  Normally, this significantly speeds up subsequent executions of
>>>> BLAST.  This doesn't appear to work on gfs2, however, when I involve
>>>> other nodes.  For example, if I run blast three times on a single node,
>>>> the first execution is very slow, but subsequent executions are
>>>> significantly quicker.  If I then run it on another node in the cluster
>>>> (accessing the same data files over gfs2), the first execution is slow,
>>>> and subsequent executions are quicker.  This makes sense.  The problem
>>>> is that when I run it on multiple nodes, the speeds of subsequent runs
>>>> on the same node are no quicker.  It almost seems as if gfs2 is flushing
>>>> the in-memory copy (which is read only) immediately when the file is
>>>> accessed on another node.  Is this the case?  If so, is there a reason
>>>> for this, or is it a bug?  If it's a known bug, is there a workaround?
>>>>
>>>> Any help would be appreciated!  This is a critical application for us.
>>>>
>>>> Thanks in advance,
>>>>
>>>> -- scooter
>>>>
>>> Are you sure that the noatime mount option has been used? I can't figure
>>> out why that shouldn't work if the BLAST processes are really only
>>> reading the files and not writing to them.
>>>
>>> GFS2 is able to tell the difference between read and write accesses to
>>> shared, writable mmap()ed files (unlike GFS which has to assume that all
>>> accesses are write accesses). Some early versions of GFS2 did that too,
>>> but anything recent (has ->page_mkwrite() in the source) and certainly
>>> 5.5 does, should be ok.
>>>
>>> You can use the glock dump to see what mode the glock associated with
>>> the mmap()ed inode is in. With RHEL6/Fedora/upstream you can use the
>>> tracepoints to watch the state dynamically during the operations. I'm
>>> afraid that isn't available on RHEL5. All you need to know is the inode
>>> number of the file in question and then look for a type 2 glock with the
>>> same number.
>>>
>>> Let us know if that helps narrow down the issue. BLAST is something that
>>> I'd like to see running well on GFS2,
>>>
>>> Steve.
>>>
>>>> --
>>>> Linux-cluster mailing list
>>>> Linux-cluster at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From parvez.h.shaikh at gmail.com  Wed Jan 19 06:19:45 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Wed, 19 Jan 2011 11:49:45 +0530
Subject: [Linux-cluster] Questions related to cluster quorum and fencing
Message-ID: <AANLkTint9oB5n_4yaOuOmLkX1twx2h8wuHrDjCMcnXK+@mail.gmail.com>

Hi all,

*Quorum - *
The questions are bit theoretical, I have gone through documentation and man
pages and have understood that, a cluster is "quorate" if a cluster or its
partition has nodes, with votes equal to or more than "expected_votes" in
"cman" section of cluster.conf file (with no requirement mandating use of
quorum disk)

So how does cluster being quorate or non-quorate affects functioning of a
cluster or services? If cluster is non-quorate, does it indicate an alarming
situation and why?

If a cluster is composed of resource groups which including only IP resource
and script resource monitoring my application server listening on IP
resource(no shared disk or shared resource between cluster nodes), then is
cluster being "quorate" (or non quorate) important for services and/or
cluster?

*Fencing - *
Is fencing and cluster being quorate or non-quorate related? I tried one
experiment, wherein I removed "fencing" for cluster nodes and shutdown one
of the nodes in cluster. And I got message in /var/log/messages indicating
fencing failed for node, and service was not failed over from that node. So
is fencing mandatory even if there is no "shared" disk between two cluster
nodes?

Also is a cluster non-quorate in a time window when a node has failed and
has not been fenced successfully?

Yours gratefully
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110119/feb0ac60/attachment.htm>

From linux at alteeve.com  Wed Jan 19 12:16:05 2011
From: linux at alteeve.com (Digimer)
Date: Wed, 19 Jan 2011 07:16:05 -0500
Subject: [Linux-cluster] Questions related to cluster quorum and fencing
In-Reply-To: <AANLkTint9oB5n_4yaOuOmLkX1twx2h8wuHrDjCMcnXK+@mail.gmail.com>
References: <AANLkTint9oB5n_4yaOuOmLkX1twx2h8wuHrDjCMcnXK+@mail.gmail.com>
Message-ID: <4D36D605.10201@alteeve.com>

On 01/19/2011 01:19 AM, Parvez Shaikh wrote:
> Hi all,
> 
> *Quorum - *
> The questions are bit theoretical, I have gone through documentation and
> man pages and have understood that, a cluster is "quorate" if a cluster
> or its partition has nodes, with votes equal to or more than
> "expected_votes" in "cman" section of cluster.conf file (with no
> requirement mandating use of quorum disk)
> 
> So how does cluster being quorate or non-quorate affects functioning of
> a cluster or services? If cluster is non-quorate, does it indicate an
> alarming situation and why?
> 
> If a cluster is composed of resource groups which including only IP
> resource and script resource monitoring my application server listening
> on IP resource(no shared disk or shared resource between cluster nodes),
> then is cluster being "quorate" (or non quorate) important for services
> and/or cluster?

With the exception of 'two_nodes="1"', "expected_votes" should be the
total votes of all nodes. Quorum is achieved once 50%+1 votes coming
from the available nodes in a cluster. Without quorum, the cluster will
not work.

I've covered some of the theory here:

http://wiki.alteeve.com/index.php/Red_Hat_Cluster_Service_3_Tutorial#An_Overview_Before_We_Begin

> *Fencing - *
> Is fencing and cluster being quorate or non-quorate related? I tried one
> experiment, wherein I removed "fencing" for cluster nodes and shutdown
> one of the nodes in cluster. And I got message in /var/log/messages
> indicating fencing failed for node, and service was not failed over from
> that node. So is fencing mandatory even if there is no "shared" disk
> between two cluster nodes?
> 
> Also is a cluster non-quorate in a time window when a node has failed
> and has not been fenced successfully?
> 
> Yours gratefully

Without fencing, your cluster is unstable. Once a fence is needed but
fails, the entire cluster will block. Please check out the link above.
In the same section is a detailed example of why fencing is critical.

Cheers

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



From cos at aaaaa.org  Wed Jan 19 23:21:05 2011
From: cos at aaaaa.org (Ofer Inbar)
Date: Wed, 19 Jan 2011 18:21:05 -0500
Subject: [Linux-cluster] fence_vmware_ng fails: SHELL_TIMEOUT not defined
Message-ID: <20110119232105.GY864@mip.aaaaa.org>

I've got a cluster on CentOS 5.5, cman-2.0.115-34.el5_5.4, using VMWare.

VMWare fencing used to work.  At some point, it stopped, and I'm not
sure exactly when.  I noticed when one of the nodes developed a
problem and dropped out of the cluster and another node tried and
failed to fence it, and I had to reboot it manually to get it to
rejoin the cluster.

When trying to troubleshoot, I get this:
$ sudo fence_vmware_ng -a virtualcenter -l [loginname] -p [password] -o status -n [fencedevice-name]
Traceback (most recent call last):
  File "/sbin/fence_vmware_ng", line 304, in ?
    main()
  File "/sbin/fence_vmware_ng", line 301, in main
    fence_action(None, options, set_power_status, get_power_status)
  File "/usr/lib/fence/fencing.py", line 726, in fence_action
    status = get_power_fn(tn, options)
  File "/sbin/fence_vmware_ng", line 193, in get_power_status
    outlets=vmware_get_outlets_vi(conn,options,True)
  File "/sbin/fence_vmware_ng", line 145, in vmware_get_outlets_vi
    all_machines=vmware_run_command(options,True,("--operation status
    --vmname '%s'"%(options["-n"])),0)
  File "/sbin/fence_vmware_ng", line 124, in vmware_run_command
    (res_output,res_code)=pexpect.run(command,SHELL_TIMEOUT+LOGIN_TIMEOUT+additional_timeout,True)
NameError: global name 'SHELL_TIMEOUT' is not defined

Same thing if I run $ sudo fence_node [nodename]


Google gives me no hits on this message, and I'd never encountered it
before.  I'm trying to find documentation on where SHELL_TIMEOUT and
LOGIN_TIMEOUT are supposed to be defined, and how they're supposed to
be passed to fence_vmware_ng.  Anyone know what might've gone wrong,
and what the right fix is?
  -- Cos



From cos at aaaaa.org  Thu Jan 20 00:11:01 2011
From: cos at aaaaa.org (Ofer Inbar)
Date: Wed, 19 Jan 2011 19:11:01 -0500
Subject: [Linux-cluster] fence_agents package for 5.5?
Message-ID: <20110120001101.GZ864@mip.aaaaa.org>

At http://sources.redhat.com/cluster/wiki/VMware_FencingConfig I see this:

"If you are using RHEL 5.5/RHEL 6 just install fence_agents package and
 you are ready to use fence_vmware."

I'm running CentOS 5.5, however I see nothing in the 5.5 repos with
that name.  I tried a search on rpm.pbone.net and got a lot of hits
for fence-agents (with a dash, not an underscore - maybe the wiki
page has a typoe?) for various versions of fedora core, RHEL/CentOS 6,
and others, but none for 5.5

Is this package supposed to be available for 5.5?
Or should I be using a version of it built for some other redhat variant?
Or is the wiki wrong, and these agents aren't built for RHEL/CentOS 5 at all?
  -- Cos



From mgrac at redhat.com  Thu Jan 20 08:53:16 2011
From: mgrac at redhat.com (=?ISO-8859-1?Q?Marek_=27marx=27_Gr=E1c?=)
Date: Thu, 20 Jan 2011 09:53:16 +0100
Subject: [Linux-cluster] fence_agents package for 5.5?
In-Reply-To: <20110120001101.GZ864@mip.aaaaa.org>
References: <20110120001101.GZ864@mip.aaaaa.org>
Message-ID: <4D37F7FC.8030702@redhat.com>

Hi,

Ofer Inbar wrote:
> At http://sources.redhat.com/cluster/wiki/VMware_FencingConfig I see this:
>
> "If you are using RHEL 5.5/RHEL 6 just install fence_agents package and
>  you are ready to use fence_vmware."
>
> I'm running CentOS 5.5, however I see nothing in the 5.5 repos with
> that name.  I tried a search on rpm.pbone.net and got a lot of hits
> for fence-agents (with a dash, not an underscore - maybe the wiki
> page has a typoe?) for various versions of fedora core, RHEL/CentOS 6,
> and others, but none for 5.5
>
> Is this package supposed to be available for 5.5?
> Or should I be using a version of it built for some other redhat variant?
> Or is the wiki wrong, and these agents aren't built for RHEL/CentOS 5 at all?
>   -- Cos
>   
This package is available for RHEV-H (based on RHEL5.5) and is part of 
RHEL 5.6.

m,



From mgrac at redhat.com  Thu Jan 20 08:59:41 2011
From: mgrac at redhat.com (=?ISO-8859-1?Q?Marek_=27marx=27_Gr=E1c?=)
Date: Thu, 20 Jan 2011 09:59:41 +0100
Subject: [Linux-cluster] fence_vmware_ng fails: SHELL_TIMEOUT not defined
In-Reply-To: <20110119232105.GY864@mip.aaaaa.org>
References: <20110119232105.GY864@mip.aaaaa.org>
Message-ID: <4D37F97D.1050904@redhat.com>

Hi,

Ofer Inbar wrote:
> I've got a cluster on CentOS 5.5, cman-2.0.115-34.el5_5.4, using VMWare.
>
> VMWare fencing used to work.  At some point, it stopped, and I'm not
> sure exactly when.  I noticed when one of the nodes developed a
> problem and dropped out of the cluster and another node tried and
> failed to fence it, and I had to reboot it manually to get it to
> rejoin the cluster.
>
> When trying to troubleshoot, I get this:
> $ sudo fence_vmware_ng -a virtualcenter -l [loginname] -p [password] -o status -n [fencedevice-name]
> Traceback (most recent call last):
>   File "/sbin/fence_vmware_ng", line 304, in ?
>     main()
>   File "/sbin/fence_vmware_ng", line 301, in main
>     fence_action(None, options, set_power_status, get_power_status)
>   File "/usr/lib/fence/fencing.py", line 726, in fence_action
>     status = get_power_fn(tn, options)
>   File "/sbin/fence_vmware_ng", line 193, in get_power_status
>     outlets=vmware_get_outlets_vi(conn,options,True)
>   File "/sbin/fence_vmware_ng", line 145, in vmware_get_outlets_vi
>     all_machines=vmware_run_command(options,True,("--operation status
>     --vmname '%s'"%(options["-n"])),0)
>   File "/sbin/fence_vmware_ng", line 124, in vmware_run_command
>     (res_output,res_code)=pexpect.run(command,SHELL_TIMEOUT+LOGIN_TIMEOUT+additional_timeout,True)
> NameError: global name 'SHELL_TIMEOUT' is not defined
>
> Same thing if I run $ sudo fence_node [nodename]
>
>
> Google gives me no hits on this message, and I'd never encountered it
> before.  I'm trying to find documentation on where SHELL_TIMEOUT and
> LOGIN_TIMEOUT are supposed to be defined, and how they're supposed to
> be passed to fence_vmware_ng.  Anyone know what might've gone wrong,
> and what the right fix is?
>   
Bug was very likely fixed in cman-2.0.115-39. But you can change these 
constants with suitable values (few seconds should be enough).

m,



From member at linkedin.com  Mon Jan 24 07:22:33 2011
From: member at linkedin.com (Balaji Haridoss via LinkedIn)
Date: Mon, 24 Jan 2011 07:22:33 +0000 (UTC)
Subject: [Linux-cluster] Balaji Haridoss wants to stay in touch on LinkedIn
Message-ID: <1543035092.4517340.1295853753166.JavaMail.app@ela4-bed34.prod>

LinkedIn
------------Balaji Haridoss requested to add you as a connection on LinkedIn:
------------------------------------------

Marian,

I'd like to add you to my professional network on LinkedIn.

- Balaji Haridoss

Accept invitation from Balaji Haridoss
http://www.linkedin.com/e/-odgn7o-gjb1v77v-2c/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2573933511_2/1BpC5vrmRLoRZcjkkZt5YCpnlOt3RApnhMpmdzgmhxrSNBszYOnP4NdjcPejcTdj99bSB9bjh7qnxvbP8OcjwVdjsVcj8LrCBxbOYWrSlI/EML_comm_afe/

View invitation from Balaji Haridoss
http://www.linkedin.com/e/-odgn7o-gjb1v77v-2c/ulDuieLaAX544oVCOYcgj_GaXIys4TuLMXGmOx/blk/I2573933511_2/39vcj4RcPcVcPsRcAALqnpPbOYWrSlI/svi/ 
------------------------------------------

DID YOU KNOW you can showcase your professional knowledge on LinkedIn to receive job/consulting offers and enhance your professional reputation? Posting replies to questions on LinkedIn Answers puts you in front of the world's professional community.
http://www.linkedin.com/e/-odgn7o-gjb1v77v-2c/abq/inv-24/

 
-- 
(c) 2010, LinkedIn Corporation
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110124/76ff73ed/attachment.htm>

From linux at alteeve.com  Mon Jan 24 15:37:05 2011
From: linux at alteeve.com (Digimer)
Date: Mon, 24 Jan 2011 10:37:05 -0500
Subject: [Linux-cluster] A better understanding of multicast issues
Message-ID: <4D3D9CA1.7040707@alteeve.com>

Hi all,

  It seems to me that a very good number of clustering problems end up
being multicast and smart switch related. I know that IGMP snooping and
STP are often the cause, and PIM can help solve it. Despite
understanding this, though, I can't quite understand exactly *why* IGMP
snooping and STP break things.

  Reading up on them leads me to think that they should cleanly create
and handle multicast groups, but this obviously isn't the case. When a
switch restarts, shouldn't it send a request to clients asking to
resubscribe to multicast groups? When corosync starts, I expect it would
also send multicast joins.

  Sorry if the question is a little vague or odd. I'm trying to get my
head around the troubles when, on the surface, the docs seem to make the
process of creating/managing multicast quite simple and straight forward.

Thanks!

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org



From cos at aaaaa.org  Mon Jan 24 18:03:04 2011
From: cos at aaaaa.org (Ofer Inbar)
Date: Mon, 24 Jan 2011 13:03:04 -0500
Subject: [Linux-cluster] fence_vmware_ng fails: SHELL_TIMEOUT not defined
In-Reply-To: <4D37F97D.1050904@redhat.com>
References: <20110119232105.GY864@mip.aaaaa.org> <4D37F97D.1050904@redhat.com>
Message-ID: <20110124180304.GV13584@mip.aaaaa.org>

On Thu, Jan 20, 2011 at 09:59:41AM +0100,
Marek 'marx' Grc <mgrac at redhat.com> wrote:

> Ofer Inbar wrote:
> >I've got a cluster on CentOS 5.5, cman-2.0.115-34.el5_5.4, using VMWare.

> >$ sudo fence_vmware_ng -a virtualcenter -l [loginname] -p [password] -o 

> >NameError: global name 'SHELL_TIMEOUT' is not defined

> Bug was very likely fixed in cman-2.0.115-39. But you can change these 
> constants with suitable values (few seconds should be enough).

Thanks.

Q1: Where can I get that build (or later) of cman for CentOS 5?  I
can't find it in the CentOS or RHEL repos, or EPEL, or via
rpm.pbone.net or some other places I tried.

Q2: As for changing the constants: I don't see where they are set in
the first place, nor any description of what they're for, or what
values they're supposed to have.  Are you suggesting I directly edit
the fence_vmware_ng script and add definitions for these constants?
Won't that be destroyed by the next install or upgrade of the RPM?  Is
there someplace these constants are supposed to be defined?  And if
I'm hacking the script directly, and have no idea what these constants
are for, is there any reason I shouldn't just edit them out of the one
line where they're used, and replace them with a small integer?

  -- Cos



From cos at aaaaa.org  Mon Jan 24 18:33:47 2011
From: cos at aaaaa.org (Ofer Inbar)
Date: Mon, 24 Jan 2011 13:33:47 -0500
Subject: [Linux-cluster] fence_vmware_ng fails: SHELL_TIMEOUT not defined
In-Reply-To: <20110124180304.GV13584@mip.aaaaa.org>
References: <20110119232105.GY864@mip.aaaaa.org> <4D37F97D.1050904@redhat.com>
	<20110124180304.GV13584@mip.aaaaa.org>
Message-ID: <20110124183347.GX13584@mip.aaaaa.org>

> Q2: As for changing the constants: I don't see where they are set in
> the first place, nor any description of what they're for, or what
> values they're supposed to have.  Are you suggesting I directly edit
> the fence_vmware_ng script and add definitions for these constants?
> Won't that be destroyed by the next install or upgrade of the RPM?  Is
> there someplace these constants are supposed to be defined?  And if
> I'm hacking the script directly, and have no idea what these constants
> are for, is there any reason I shouldn't just edit them out of the one
> line where they're used, and replace them with a small integer?

I did hack the script to completely remove the references to
SHELL_TIMEOUT and LOGIN_TIMEOUT in that line, tried fencing again,
and got: NameError: global name 'POWER_TIMEOUT' is not defined
I could keep hacking away at constant after constant, but I feel like
this is the wrong approach and there's an underlying problem: these
constants are supposed to be set somewhere, by something, but I don't
know where or how.  For some reason, they used to get set, and now
they no longer are.

Does anyone know where the values for these are supposed to come from,
and how they're supposed to reach the fencing agent?  I might be on a
wild goose chase without realizing it, because there's no documentation.
  -- Cos



From yvette at dbtgroup.com  Mon Jan 24 19:16:58 2011
From: yvette at dbtgroup.com (yvette hirth)
Date: Mon, 24 Jan 2011 19:16:58 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
Message-ID: <4D3DD02A.3000303@dbtgroup.com>

hi all,

does anyone have any performance comparisons of gfs2 v. zfs?

our five-node cluster is working fine, the clustering software is great, 
but when accessing gfs2-based files, enumeration can be very slow...

thanks!
yvette



From cos at aaaaa.org  Mon Jan 24 19:21:07 2011
From: cos at aaaaa.org (Ofer Inbar)
Date: Mon, 24 Jan 2011 14:21:07 -0500
Subject: [Linux-cluster] fence_agents package for 5.5?
In-Reply-To: <4D37F7FC.8030702@redhat.com>
References: <20110120001101.GZ864@mip.aaaaa.org> <4D37F7FC.8030702@redhat.com>
Message-ID: <20110124192107.GY13584@mip.aaaaa.org>

On Thu, Jan 20, 2011 at 09:53:16AM +0100,
Marek 'marx' Grc <mgrac at redhat.com> wrote:
> Ofer Inbar wrote:
> >At http://sources.redhat.com/cluster/wiki/VMware_FencingConfig I see this:
> >
> >"If you are using RHEL 5.5/RHEL 6 just install fence_agents package and
> > you are ready to use fence_vmware."

> This package is available for RHEV-H (based on RHEL5.5) and is part
> of RHEL 5.6.

So...
 - It's called fence-agents, not fence_agents (minor typo)
 - It's not available for RHEL or CentOS 5.5
... the wiki is mistaken on both of these counts, yes?


I've tried finding a fence-agents RPM I can use, but all the newer
ones have big disruptive dependencies (such as needing a newer glibc),
and older ones no longer seem to exist on the mirrors I've found
pointers to.  I can find no evidence that fence-agents was ever built
and distributed for any version of CentOS 5 / RHEL 5.

Am I on the wrong track here, or is this a reasonable path towards
getting a working vmware fencing agent for CentOS 5.5?
  -- Cos



From gordan at bobich.net  Mon Jan 24 19:37:22 2011
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 24 Jan 2011 19:37:22 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D3DD02A.3000303@dbtgroup.com>
References: <4D3DD02A.3000303@dbtgroup.com>
Message-ID: <4D3DD4F2.4090600@bobich.net>

On 01/24/2011 07:16 PM, yvette hirth wrote:
> hi all,
>
> does anyone have any performance comparisons of gfs2 v. zfs?
>
> our five-node cluster is working fine, the clustering software is great,
> but when accessing gfs2-based files, enumeration can be very slow...

The comparison is a bit like comparing apples and oranges. GFS is a 
cluster file system, ZFS is a single-machine file system.

Gordan



From swhiteho at redhat.com  Mon Jan 24 19:48:55 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Mon, 24 Jan 2011 19:48:55 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D3DD02A.3000303@dbtgroup.com>
References: <4D3DD02A.3000303@dbtgroup.com>
Message-ID: <1295898535.4672.75.camel@dolmen>

Hi,

On Mon, 2011-01-24 at 19:16 +0000, yvette hirth wrote:
> hi all,
> 
> does anyone have any performance comparisons of gfs2 v. zfs?
> 
Not that I'm aware of

> our five-node cluster is working fine, the clustering software is great, 
> but when accessing gfs2-based files, enumeration can be very slow...
> 
What do you mean be "enumeration can be very slow" ? It might be
possible to slightly rearrange the I/O pattern in order to get greater
performance. Can you explain what you are trying to achieve?

Steve.

> thanks!
> yvette
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From yvette at dbtgroup.com  Mon Jan 24 19:51:19 2011
From: yvette at dbtgroup.com (yvette hirth)
Date: Mon, 24 Jan 2011 19:51:19 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D3DD4F2.4090600@bobich.net>
References: <4D3DD02A.3000303@dbtgroup.com> <4D3DD4F2.4090600@bobich.net>
Message-ID: <4D3DD837.6070309@dbtgroup.com>

Gordan Bobic wrote:
> On 01/24/2011 07:16 PM, yvette hirth wrote:
>> hi all,
>>
>> does anyone have any performance comparisons of gfs2 v. zfs?
>>
>> our five-node cluster is working fine, the clustering software is great,
>> but when accessing gfs2-based files, enumeration can be very slow...
> 
> The comparison is a bit like comparing apples and oranges. GFS is a 
> cluster file system, ZFS is a single-machine file system.
> 
> Gordan
> 
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
my apologies.  i heard zfs was cluster-aware; thanks for the info.

does anyone have any performance comparisons of gfs2 v. any other 
cluster-aware filesystems?

thanks again
yvette



From gordan at bobich.net  Mon Jan 24 20:01:51 2011
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 24 Jan 2011 20:01:51 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D3DD837.6070309@dbtgroup.com>
References: <4D3DD02A.3000303@dbtgroup.com> <4D3DD4F2.4090600@bobich.net>
	<4D3DD837.6070309@dbtgroup.com>
Message-ID: <4D3DDAAF.8090506@bobich.net>

On 01/24/2011 07:51 PM, yvette hirth wrote:
> Gordan Bobic wrote:
>> On 01/24/2011 07:16 PM, yvette hirth wrote:
>>> hi all,
>>>
>>> does anyone have any performance comparisons of gfs2 v. zfs?
>>>
>>> our five-node cluster is working fine, the clustering software is great,
>>> but when accessing gfs2-based files, enumeration can be very slow...
>>
>> The comparison is a bit like comparing apples and oranges. GFS is a
>> cluster file system, ZFS is a single-machine file system.
>>
>> Gordan
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
> my apologies. i heard zfs was cluster-aware; thanks for the info.
>
> does anyone have any performance comparisons of gfs2 v. any other
> cluster-aware filesystems?

You may want to google about GFS / GFS2 / OCFS2. That's pretty much all 
that is freely available as far as cluster file systems that live 
directly on top of block devices go.

The original OCFS (Oracle) and VMFS (VMware) work in a similar way but 
they are designed for few large files rather than many small files so 
they aren't suitable for generic use.

Symantec Veritas Cluster also comes with a similar cluster aware file 
system, but it's heavily licenced and last time I checked it didn't 
provide any compelling reasons to use it instead of GFS, GFS2 or OCFS2.

There are a few other things available that may be more suitable for 
what you want to do, but it's impossible to say without knowing more 
about your use-case. Depending on ecactly what you require you may find 
that SeznamFS, GlusterFS, Lustre or even HDFS (Hadoop) are more suitable 
for what you want to do.

Gordan



From s.wendy.cheng at gmail.com  Mon Jan 24 20:25:54 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Mon, 24 Jan 2011 12:25:54 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D3DDAAF.8090506@bobich.net>
References: <4D3DD02A.3000303@dbtgroup.com> <4D3DD4F2.4090600@bobich.net>
	<4D3DD837.6070309@dbtgroup.com> <4D3DDAAF.8090506@bobich.net>
Message-ID: <AANLkTik_B9vRhodzxffTwOKmryQtbqDMbUj_0S1j0BJ=@mail.gmail.com>

Sometime ago, the following was advertised:

"ZFS is not a native cluster, distributed, or parallel file system and
cannot provide concurrent access from multiple hosts as ZFS is a local
file system. Sun's Lustre distributed filesystem will adapt ZFS as
back-end storage for both data and metadata in version 3.0, which is
scheduled to be released in 2010."

You can google "Lustre" to see whether their plan (built Lustre on top
of ZFS) is panned out.

-- Wendy

On Mon, Jan 24, 2011 at 12:01 PM, Gordan Bobic <gordan at bobich.net> wrote:
> On 01/24/2011 07:51 PM, yvette hirth wrote:
>>
>> Gordan Bobic wrote:
>>>
>>> On 01/24/2011 07:16 PM, yvette hirth wrote:
>>>>
>>>> hi all,
>>>>
>>>> does anyone have any performance comparisons of gfs2 v. zfs?
>>>>
>>>> our five-node cluster is working fine, the clustering software is great,
>>>> but when accessing gfs2-based files, enumeration can be very slow...
>>>
>>> The comparison is a bit like comparing apples and oranges. GFS is a
>>> cluster file system, ZFS is a single-machine file system.
>>>
>>> Gordan
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>> my apologies. i heard zfs was cluster-aware; thanks for the info.
>>
>> does anyone have any performance comparisons of gfs2 v. any other
>> cluster-aware filesystems?
>
> You may want to google about GFS / GFS2 / OCFS2. That's pretty much all that
> is freely available as far as cluster file systems that live directly on top
> of block devices go.
>
> The original OCFS (Oracle) and VMFS (VMware) work in a similar way but they
> are designed for few large files rather than many small files so they aren't
> suitable for generic use.
>
> Symantec Veritas Cluster also comes with a similar cluster aware file
> system, but it's heavily licenced and last time I checked it didn't provide
> any compelling reasons to use it instead of GFS, GFS2 or OCFS2.
>
> There are a few other things available that may be more suitable for what
> you want to do, but it's impossible to say without knowing more about your
> use-case. Depending on ecactly what you require you may find that SeznamFS,
> GlusterFS, Lustre or even HDFS (Hadoop) are more suitable for what you want
> to do.
>
> Gordan
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From s.wendy.cheng at gmail.com  Mon Jan 24 20:46:05 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Mon, 24 Jan 2011 12:46:05 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <1295898535.4672.75.camel@dolmen>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
Message-ID: <AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>

On Mon, Jan 24, 2011 at 11:48 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:
>> our five-node cluster is working fine, the clustering software is great,
>> but when accessing gfs2-based files, enumeration can be very slow...
>>
> What do you mean be "enumeration can be very slow" ? It might be
> possible to slightly rearrange the I/O pattern in order to get greater
> performance. Can you explain what you are trying to achieve?
>

I would guess this "enumeration" means "walk"; e.g. doing backup. One
of the most-liked features that ZFS offers is snapshots. So I would
suggest telling GFS2 users/customers to use LVM snapshot AND making
sure GFS2 works well with Linux LVMm snapshot.

-- Wendy



From rossnick-lists at cybercat.ca  Mon Jan 24 20:27:41 2011
From: rossnick-lists at cybercat.ca (Nicolas Ross)
Date: Mon, 24 Jan 2011 15:27:41 -0500
Subject: [Linux-cluster] rc.sysinit, LVM in RHEL 6
Message-ID: <BA3BFA9547E6464D955767E39215DF14@versa>

Hi !

I am setting up my cluster and I have now some clustered file systems (gfs2)
under LVM.

Now I am doing some shutdown / reboot tests on divers node that are running
or not different services.

I noticed something at the verry begining of the startup :

Setting up Logical Volume Management:   connect() failed on local socket: No
such file or directory
  4 logical volume(s) in volume group VG now active
  Skipping clustered volume group VGa
  Skipping clustered volume group VGb
  Skipping clustered volume group VGc
               [FAILED]

So, I dug up in the rc.sysinit startup script and found at line 200 (on RHEL
6) this :

action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a
y --sysinit

That command (/sbin/lvm vgchange -a y --sysinit) return a non-zero value
when we have some clustered VG's, hence the fail in the startup.

That doesn't prevent any services from running, all is ok afterward, it's
purely esthetic...

Can other confirm this ? I did not find any bugs on this on the redhat bugs
site.



From rossnick-lists at cybercat.ca  Mon Jan 24 21:07:46 2011
From: rossnick-lists at cybercat.ca (Nicolas Ross)
Date: Mon, 24 Jan 2011 16:07:46 -0500
Subject: [Linux-cluster] gfs2 v. zfs?
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
Message-ID: <021EA18B28FE4C968F4538D7FB621E20@versa>

> I would guess this "enumeration" means "walk"; e.g. doing backup. One
> of the most-liked features that ZFS offers is snapshots. So I would
> suggest telling GFS2 users/customers to use LVM snapshot AND making
> sure GFS2 works well with Linux LVMm snapshot.

AFAIK, clustered volume group doesn't support LVM snapshots :

<quote>
LVM snapshots are not supported across the nodes in a cluster. You cannot 
create a snapshot volume in a clustered volume group.
</quote>

from

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Logical_Volume_Manager_Administration/index.html#snapshot_volumes



From rafagriman at gmail.com  Mon Jan 24 21:10:31 2011
From: rafagriman at gmail.com (Rafa =?utf-8?q?Grim=C3=A1n?=)
Date: Mon, 24 Jan 2011 22:10:31 +0100
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTik_B9vRhodzxffTwOKmryQtbqDMbUj_0S1j0BJ=@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com> <4D3DDAAF.8090506@bobich.net>
	<AANLkTik_B9vRhodzxffTwOKmryQtbqDMbUj_0S1j0BJ=@mail.gmail.com>
Message-ID: <201101242210.32073.rafagriman@gmail.com>

Hi :)

On Monday 24 January 2011 21:25 Wendy Cheng wrote
> Sometime ago, the following was advertised:
> 
> "ZFS is not a native cluster, distributed, or parallel file system and
> cannot provide concurrent access from multiple hosts as ZFS is a local
> file system. Sun's Lustre distributed filesystem will adapt ZFS as
> back-end storage for both data and metadata in version 3.0, which is
> scheduled to be released in 2010."
> 
> You can google "Lustre" to see whether their plan (built Lustre on top
> of ZFS) is panned out.


But Lustre isn't a clustered filesystem, it's a parallel filesystem. Similar to 
pNFS, PanFS, ... Comparing GFS to Lustre wouldn't be quite right.

   Rafa

-- 
"We cannot treat computers as Humans. Computers need love."

Happily using KDE 4.5.4 :)



From s.wendy.cheng at gmail.com  Mon Jan 24 21:37:31 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Mon, 24 Jan 2011 13:37:31 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <201101242210.32073.rafagriman@gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com> <4D3DDAAF.8090506@bobich.net>
	<AANLkTik_B9vRhodzxffTwOKmryQtbqDMbUj_0S1j0BJ=@mail.gmail.com>
	<201101242210.32073.rafagriman@gmail.com>
Message-ID: <AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>

I would love to get an education here. From usage model point of view,
what is the difference between a "parallel file system" and a "cluster
file system" ? i.e., when to use a parallel file system and when to
use a cluster file system ?

.. Wendy

On Mon, Jan 24, 2011 at 1:10 PM, Rafa Grim?n <rafagriman at gmail.com> wrote:
> Hi :)
>
> On Monday 24 January 2011 21:25 Wendy Cheng wrote
>> Sometime ago, the following was advertised:
>>
>> "ZFS is not a native cluster, distributed, or parallel file system and
>> cannot provide concurrent access from multiple hosts as ZFS is a local
>> file system. Sun's Lustre distributed filesystem will adapt ZFS as
>> back-end storage for both data and metadata in version 3.0, which is
>> scheduled to be released in 2010."
>>
>> You can google "Lustre" to see whether their plan (built Lustre on top
>> of ZFS) is panned out.
>
>
> But Lustre isn't a clustered filesystem, it's a parallel filesystem. Similar to
> pNFS, PanFS, ... Comparing GFS to Lustre wouldn't be quite right.
>
> ? Rafa
>
> --
> "We cannot treat computers as Humans. Computers need love."
>
> Happily using KDE 4.5.4 :)
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From s.wendy.cheng at gmail.com  Mon Jan 24 21:41:47 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Mon, 24 Jan 2011 13:41:47 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <021EA18B28FE4C968F4538D7FB621E20@versa>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
	<021EA18B28FE4C968F4538D7FB621E20@versa>
Message-ID: <AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>

Guess GFS2 is out as an "enterprise" file system ? W/out a workable
backup solution, it'll be seriously limited. I have been puzzled why
CLVM is slow to add this feature.

-- Wendy

On Mon, Jan 24, 2011 at 1:07 PM, Nicolas Ross
<rossnick-lists at cybercat.ca> wrote:
>> I would guess this "enumeration" means "walk"; e.g. doing backup. One
>> of the most-liked features that ZFS offers is snapshots. So I would
>> suggest telling GFS2 users/customers to use LVM snapshot AND making
>> sure GFS2 works well with Linux LVMm snapshot.
>
> AFAIK, clustered volume group doesn't support LVM snapshots :
>
> <quote>
> LVM snapshots are not supported across the nodes in a cluster. You cannot
> create a snapshot volume in a clustered volume group.
> </quote>
>
> from
>
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Logical_Volume_Manager_Administration/index.html#snapshot_volumes
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From jeff.sturm at eprize.com  Mon Jan 24 21:58:21 2011
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Mon, 24 Jan 2011 16:58:21 -0500
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<4D3DDAAF.8090506@bobich.net><AANLkTik_B9vRhodzxffTwOKmryQtbqDMbUj_0S1j0BJ=@mail.gmail.com><201101242210.32073.rafagriman@gmail.com>
	<AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
Message-ID: <64D0546C5EBBD147B75DE133D798665F06A12AEF@hugo.eprize.local>

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of Wendy Cheng
> Subject: Re: [Linux-cluster] gfs2 v. zfs?
> 
> I would love to get an education here. From usage model point of view,
what is the
> difference between a "parallel file system" and a "cluster file
system" ? i.e., when to
> use a parallel file system and when to use a cluster file system ?

Getting off-topic but I'd also like to hear who uses a parallel
distributed FS, and what problem space they work well in.

GFS works fine, but we have trouble scaling it up due solely to the
limits of central storage.  I'm sure there are esoteric storage designs
to get around this, but at some point I also suspect there's an easier
way...

-Jeff





From jeff.sturm at eprize.com  Mon Jan 24 22:01:50 2011
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Mon, 24 Jan 2011 17:01:50 -0500
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<1295898535.4672.75.camel@dolmen><AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com><021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
Message-ID: <64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of Wendy Cheng
> Subject: Re: [Linux-cluster] gfs2 v. zfs?
> 
> Guess GFS2 is out as an "enterprise" file system ? W/out a workable
backup solution,
> it'll be seriously limited. I have been puzzled why CLVM is slow to
add this feature.

But an "enterprise" SAN will surely have native snapshots.  In fact, for
production use we run GFS without CLVM.

-Jeff





From rafagriman at gmail.com  Mon Jan 24 22:20:23 2011
From: rafagriman at gmail.com (Rafa =?utf-8?q?Grim=C3=A1n?=)
Date: Mon, 24 Jan 2011 23:20:23 +0100
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<201101242210.32073.rafagriman@gmail.com>
	<AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
Message-ID: <201101242320.23958.rafagriman@gmail.com>

Hi :)


On Monday 24 January 2011 22:37 Wendy Cheng wrote
> I would love to get an education here. From usage model point of view,
> what is the difference between a "parallel file system" and a "cluster
> file system" ? i.e., when to use a parallel file system and when to
> use a cluster file system ?


Please don't top post :)

A parallel file system is a file system in which:

	- metadata and data servers are separated

	- a file's data is distributed/striped among the data servers
	  (each data server has its own storage)

Due to #2, a file is read or written in parallel so you get a higher bandwidth. 
Each data server/node serves a chunk of each file. This is something similar to 
a RAID 0 on many servers.

Metadata is stored on a metadata server so it doesn't "get in the way" ;) That 
is, the client node asks the metadata server where the file's chunks are. The 
metadata server sends the client a list of data nodes which contain the chunks 
and then the client talks directly to the data nodes without having to talk 
again with the metadata server. Obviously, this is over simplified ;)

Also, take into account that metadata is IOPS intensive while data is 
bandwidth/throughput intensive. If you separate them both ... you can tune 
each storage susbsytem to get the best performance for IOPS or bandwidth.

Parallel file systems are useful for high bandwidth/throughput systems (HPC).

In clustered file systems:

	- metadata and data servers aren't usually separated (in CXFS they are)

	- a file's data is not striped among the data servers since there is a
	  single storage array

Due to #2, a file is not read/written in parallel. 1 file is served by 1 data 
node/server. This means you can have 2 nodes serving 2 files at the same time, 
but each node serves 1 file, not chunks of the same file.

Clustered file systems are useful for active/active HA/loadbalancing 
configurations.

This is a very simplified explanation of both. For more in depth explanations 
check Google ;) Look for GPFS, PVFS, Lustre, PanFS (Panasas), CXFS, GFS, 
OCFSv2, ...

HTH

   Rafa


> On Mon, Jan 24, 2011 at 1:10 PM, Rafa Grim?n <rafagriman at gmail.com> wrote:
> > Hi :)
> > 
> > On Monday 24 January 2011 21:25 Wendy Cheng wrote
> > 
> >> Sometime ago, the following was advertised:
> >> 
> >> "ZFS is not a native cluster, distributed, or parallel file system and
> >> cannot provide concurrent access from multiple hosts as ZFS is a local
> >> file system. Sun's Lustre distributed filesystem will adapt ZFS as
> >> back-end storage for both data and metadata in version 3.0, which is
> >> scheduled to be released in 2010."
> >> 
> >> You can google "Lustre" to see whether their plan (built Lustre on top
> >> of ZFS) is panned out.
> > 
> > But Lustre isn't a clustered filesystem, it's a parallel filesystem.
> > Similar to pNFS, PanFS, ... Comparing GFS to Lustre wouldn't be quite
> > right.
> > 
> > ? Rafa


-- 
"We cannot treat computers as Humans. Computers need love."

Happily using KDE 4.5.4 :)



From rafagriman at gmail.com  Mon Jan 24 22:26:06 2011
From: rafagriman at gmail.com (Rafa =?utf-8?q?Grim=C3=A1n?=)
Date: Mon, 24 Jan 2011 23:26:06 +0100
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F06A12AEF@hugo.eprize.local>
References: <4D3DD02A.3000303@dbtgroup.com>
	<AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AEF@hugo.eprize.local>
Message-ID: <201101242326.06753.rafagriman@gmail.com>

On Monday 24 January 2011 22:58 Jeff Sturm wrote
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com
> 
> [mailto:linux-cluster-bounces at redhat.com]
> 
> > On Behalf Of Wendy Cheng
> > Subject: Re: [Linux-cluster] gfs2 v. zfs?
> > 
> > I would love to get an education here. From usage model point of view,
> > what is the
> > difference between a "parallel file system" and a "cluster file
> > system" ? i.e., when to
> > use a parallel file system and when to use a cluster file system ?
> 
> Getting off-topic but I'd also like to hear who uses a parallel
> distributed FS, and what problem space they work well in.


HPC where you need  very high bandwidth/throughput to disk (usually scratch 
filesystem).


> GFS works fine, but we have trouble scaling it up due solely to the
> limits of central storage.  I'm sure there are esoteric storage designs
> to get around this, but at some point I also suspect there's an easier
> way...


>From a HW point of view, DDN has very good storage arrays ... but you pay for 
it ;) They promise 6 GB/s and you get those 6 GB/s.

You can also use more storage arrays and use some volume manager to group them 
together. This is a bit tricky because it won't guarantee a load balance on 
the different storage arrays.

HTH

   Rafa

-- 
"We cannot treat computers as Humans. Computers need love."

Happily using KDE 4.5.4 :)



From cos at aaaaa.org  Mon Jan 24 22:59:24 2011
From: cos at aaaaa.org (Ofer Inbar)
Date: Mon, 24 Jan 2011 17:59:24 -0500
Subject: [Linux-cluster] fence_vmware_ng fails: Solution found
In-Reply-To: <4D37F97D.1050904@redhat.com>
References: <20110119232105.GY864@mip.aaaaa.org> <4D37F97D.1050904@redhat.com>
Message-ID: <20110124225924.GB13584@mip.aaaaa.org>

Gave up on fence_vmware_ng, because it turns out that the fence_vmware
supplied with cman-2.0.115-34 does work.  That was disguised by what
turned out to be a bug in VI Perl Toolkit.

Since I got no hits on Google when I looked for this, I'm putting this
here for the archive, and anyone else who happens to search...

| $sudo fence_vmware -a virtualcenter -l [username] -p [password] -o reboot -n [node]
| fence_vmware_helper returned Undefined subroutine &Opts::add_options called at /sbin/fence_vmware_helper line 82.

Find VMware/VIRuntime.pm and add this line:

| *** 13,18 ****
| --- 13,19 ----
|   
|   package VMware::VIRuntime;
|   
| + use VMware::VILib;
|   use VMware::VIM2Runtime;
|   use VMware::VIM2Stub;

Now it works.
  -- Cos



From s.wendy.cheng at gmail.com  Mon Jan 24 23:45:35 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Mon, 24 Jan 2011 15:45:35 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <201101242326.06753.rafagriman@gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AEF@hugo.eprize.local>
	<201101242326.06753.rafagriman@gmail.com>
Message-ID: <AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>

On Mon, Jan 24, 2011 at 2:26 PM, Rafa Grim?n <rafagriman at gmail.com> wrote:
> On Monday 24 January 2011 22:58 Jeff Sturm wrote
>> > -----Original Message-----
>> > From: linux-cluster-bounces at redhat.com
>>
>> [mailto:linux-cluster-bounces at redhat.com]
>>
>> > On Behalf Of Wendy Cheng
>> > Subject: Re: [Linux-cluster] gfs2 v. zfs?
>> >
>> > I would love to get an education here. From usage model point of view,
>> > what is the
>> > difference between a "parallel file system" and a "cluster file
>> > system" ? i.e., when to
>> > use a parallel file system and when to use a cluster file system ?
>>
>> Getting off-topic but I'd also like to hear who uses a parallel
>> distributed FS, and what problem space they work well in.
>
>
> HPC where you need ?very high bandwidth/throughput to disk (usually scratch
> filesystem).

You hit the right points (and thanks for previous explanation) !
However, from usage point of view, I think the line is blurry these
days (e.g. IBM's GPFS is said to be a cluster filesystem but have been
used well in HPC environment).

BTW, I never understand why top-post is evil ? Isn't it making some
emails hard to read ?

-- Wendy



From jcasale at activenetwerx.com  Tue Jan 25 01:06:46 2011
From: jcasale at activenetwerx.com (Joseph L. Casale)
Date: Tue, 25 Jan 2011 01:06:46 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AEF@hugo.eprize.local>
	<201101242326.06753.rafagriman@gmail.com>
	<AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
Message-ID: <CA5A491E9DEFBE4CB777DE97E21575E907628F31@prato.activenetwerx.local>

>BTW, I never understand why top-post is evil ? Isn't it making some
>emails hard to read ?

People on these lists take their precious time to help you almost always
with nothing in return, often allowing you (me, us, everyone who uses
them) the luxury of getting our job that we are paid for done. The very
least we can do is make it easy for them to perform this gesture.

Part of that involves creating a conversation they can go back and re-read
or follow along with and chime in when possible. In order to do this, you
need to trim replies to include what you are answering, and add your part.

Now if we all just put our comments in different places how would you
know what one person's statement was in reference to compared to another's?
If someone asked me for the output to several commands, wouldn't it
be logical to position this output after each command request as an example?

When you don't do this, people who are helping  you sometimes have to
go back, reread entire emails and try to make sense of gibberish, its time
consuming and exhausting to perform as a free favor.

In other words, why don't we top post or neglect to trim (Does the following
make sense?):

A.  Because it breaks the flow and reads backwards.

> Q.  Why is top posting considered harmful?

Hope that was informative:)
jlc



From s.wendy.cheng at gmail.com  Tue Jan 25 02:42:30 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Mon, 24 Jan 2011 18:42:30 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <CA5A491E9DEFBE4CB777DE97E21575E907628F31@prato.activenetwerx.local>
References: <4D3DD02A.3000303@dbtgroup.com>
	<AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AEF@hugo.eprize.local>
	<201101242326.06753.rafagriman@gmail.com>
	<AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
	<CA5A491E9DEFBE4CB777DE97E21575E907628F31@prato.activenetwerx.local>
Message-ID: <AANLkTikC0-0nPSUtMa-h3n7YDyOQbDX=DAH4S1HM=X-N@mail.gmail.com>

On Mon, Jan 24, 2011 at 5:06 PM, Joseph L. Casale
<jcasale at activenetwerx.com> wrote:
>> A. ?Because it breaks the flow and reads backwards.
>
>> Q. ?Why is top posting considered harmful?
>
> Hope that was informative:)
> jlc
>

I don't have any intention to start a flame and/or religion war.
However, I'm hoping people could relax a little bit about this "rule",
if it is a rule at all ... Check out:
http://en.wikipedia.org/wiki/Posting_style#Top-posting to see what it
says. You may find it interesting.

At the same time, I don't see comparing performance numbers between
parallel filesystem and cluster filesystem is a bad practice. After
all, I see users and IT shops comparing NFS and GFS numbers from time
to time (as a way to decide which one to use). The bottome line is "I
have a storage box and I want to access it from different machines,
which one is the best solution for me and lets get the capacity
estimated".

-- Wendy



From Chris.Jankowski at hp.com  Tue Jan 25 02:55:02 2011
From: Chris.Jankowski at hp.com (Jankowski, Chris)
Date: Tue, 25 Jan 2011 02:55:02 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
References: <4D3DD02A.3000303@dbtgroup.com>
	<1295898535.4672.75.camel@dolmen><AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com><021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
Message-ID: <036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>

>-----Original Message-----
>From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Sturm
>Sent: Tuesday, 25 January 2011 09:02
>To: linux clustering
>Subject: Re: [Linux-cluster] gfs2 v. zfs?

>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com]
>> On Behalf Of Wendy Cheng
>> Subject: Re: [Linux-cluster] gfs2 v. zfs?
>> 
>> Guess GFS2 is out as an "enterprise" file system ? W/out a workable backup solution,
>> it'll be seriously limited. I have been puzzled why CLVM is slow to add this feature.

>But an "enterprise" SAN will surely have native snapshots.  In fact, for production use we run GFS without CLVM.

>-Jeff

A few comments, which might contrast uses of GFS2 and XFS in enterprise class production environments:

1. 
SAN snapshot is not a panacea, as it is only crash consistent and only within a single LUN.
If you have your data or database spread over multiple LUNs each with its own filesystem,
then you are on your own.

2. 
Therefore, we still need at least OS level (filesystem level) consistent backup
if the application itself does not provide a hot backup mechanism, which very few do.
The consistent filesystem level backup requires freeze and thaw commands.
XFS offers them, GFS2 does not.

3.
GFS2 provides only tar(1) as a backup mechanism. 
Unfortunately, tar(1) does not cope efficiently with sparse files, 
which many applications create. 
As an exercise create a 10 TB sparse file with just one byte of non-null data at the end.
Then try to back it up to disk using tar(1). 
The tar image will be correctly created, but it will take many, many hours.
Dump(8) would do the job in a blink, but is not available for GFS2 filesystem.
However, XFS does have XFS specific dump(8) command and will backup sparse files
efficiently.

4.
GFS2 is very convenient to use, as by its nature is clusterised.
However, there is huge performance cost to pay for all this convenience.
This cost stems from serialization imposed by distributed lock manager.

5.
For these reason, for the HA applications running on one node at a time,
I found that XFS on top of LVM gives me the best mix of performance and functionality:
- high performance
- efficient backup of sparse files
- backup consistency through freeze/thaw
- zero downtime backup through use of LVM snapshots
- short failover times due to efficient XFS transaction logs

So, for this type of HA applications (failover HA) and environment, 
it makes perfect sense to use XFS in a cluster instead of GFS2.

Having said that, GFS2 can, in principle, be engineered to be much better
for failover HA applications.

It would require development of:
- GFS2 specific dump(8)
- GFS2 specific freeze and thaw commands
- CLVM wide snapshots
- more efficient DLM

It certainly is possible to do. Digital/Compaq/HP TruCluster Cluster File System (CFS) built on top of AdvFS had all of these features and much, much more by circa year 2000.

Regards,

Chris Jankowski






--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster



From jcasale at activenetwerx.com  Tue Jan 25 03:00:18 2011
From: jcasale at activenetwerx.com (Joseph L. Casale)
Date: Tue, 25 Jan 2011 03:00:18 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTikC0-0nPSUtMa-h3n7YDyOQbDX=DAH4S1HM=X-N@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<AANLkTi=9RZxUxUVV6ystJGUzD9ULU33ONYt2MTi0+mLy@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AEF@hugo.eprize.local>
	<201101242326.06753.rafagriman@gmail.com>
	<AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
	<CA5A491E9DEFBE4CB777DE97E21575E907628F31@prato.activenetwerx.local>
	<AANLkTikC0-0nPSUtMa-h3n7YDyOQbDX=DAH4S1HM=X-N@mail.gmail.com>
Message-ID: <CA5A491E9DEFBE4CB777DE97E21575E90762A101@prato.activenetwerx.local>

>I don't have any intention to start a flame and/or religion war.
>However, I'm hoping people could relax a little bit about this "rule",
>if it is a rule at all ... Check out:
>http://en.wikipedia.org/wiki/Posting_style#Top-posting to see what it
>says. You may find it interesting.

No war to start:) I had always top posted as a kid before I started using
usenet and mailing lists. I was corrected and saw it was logical, also the
ways where in place long before I came along but they do make sense.

A few times I gave up helping as following along in a thread was too hard.
Not worth my effort...

>At the same time, I don't see comparing performance numbers between
>parallel filesystem and cluster filesystem is a bad practice.

Sure, so long as you know the use case and scenarios etc. Often we see
people doing things like exporting an ext formatted block dev w/ an iscsi
target to multiple clients, so maybe the knee-jerk reaction with an odd
comparo like that is an attempt to "inform". If that _actual_ comparison
is of value to you, so be it...

>The bottome line is "I
>have a storage box and I want to access it from different machines,
>which one is the best solution for me and lets get the capacity
>estimated".

Most of us would compare a cluster to a cluster, or an apple to an apple:)
I suppose you could phrase the question: "I need to export data, every
option is possible, I can cluster, or not, give me #'s" but we digress...



From s.wendy.cheng at gmail.com  Tue Jan 25 04:37:05 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Mon, 24 Jan 2011 20:37:05 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
	<021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
Message-ID: <AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>

Comments in-line ...

On Mon, Jan 24, 2011 at 6:55 PM, Jankowski, Chris
<Chris.Jankowski at hp.com> wrote:
> A few comments, which might contrast uses of GFS2 and XFS in enterprise class production environments:
>
> 1.
> SAN snapshot is not a panacea, as it is only crash consistent and only within a single LUN.
> If you have your data or database spread over multiple LUNs each with its own filesystem,
> then you are on your own.

It depends on the SAN box. Some products have aggregate level
snapshots that can contain multiple LUNs.

However, the argument here is correct; that is, SAN snaphost is not a
panacea. Other than different SAN vendors may have different setup(s),
snapshot restore could require specific knowledge of the filesystem
involved (e.g. how the journal is replayed). So there are integration
and test  efforts required for the restore to work well.

>
> 2.
> Therefore, we still need at least OS level (filesystem level) consistent backup
> if the application itself does not provide a hot backup mechanism, which very few do.
> The consistent filesystem level backup requires freeze and thaw commands.
> XFS offers them, GFS2 does not.

I seem to see GFS2 having freeze/thaw patches in the past ? But for
backup to work well, it requires more than freeze/thaw.

>
> 3.
> GFS2 provides only tar(1) as a backup mechanism.
> Unfortunately, tar(1) does not cope efficiently with sparse files,
> which many applications create.
> As an exercise create a 10 TB sparse file with just one byte of non-null data at the end.
> Then try to back it up to disk using tar(1).
> The tar image will be correctly created, but it will take many, many hours.
> Dump(8) would do the job in a blink, but is not available for GFS2 filesystem.
> However, XFS does have XFS specific dump(8) command and will backup sparse files
> efficiently.
>
> 4.
> GFS2 is very convenient to use, as by its nature is clusterised.
> However, there is huge performance cost to pay for all this convenience.
> This cost stems from serialization imposed by distributed lock manager.
>
> 5.
> For these reason, for the HA applications running on one node at a time,
> I found that XFS on top of LVM gives me the best mix of performance and functionality:
> - high performance
> - efficient backup of sparse files
> - backup consistency through freeze/thaw
> - zero downtime backup through use of LVM snapshots
> - short failover times due to efficient XFS transaction logs
>
> So, for this type of HA applications (failover HA) and environment,
> it makes perfect sense to use XFS in a cluster instead of GFS2.
>
> Having said that, GFS2 can, in principle, be engineered to be much better
> for failover HA applications.
>
> It would require development of:
> - GFS2 specific dump(8)
> - GFS2 specific freeze and thaw commands
> - CLVM wide snapshots
> - more efficient DLM

You did a great summary here. By looking at the list, I would imagine
CLVM snapshoting is probably the easiest, technically and politically.
It's all up to GFS2 engineers to take the note.

>
> It certainly is possible to do. Digital/Compaq/HP TruCluster Cluster File System (CFS) built on top of AdvFS had all of these features and much, much more by circa year 2000.
>

Yep, I met a TruCluster developer 3 years ago. Based on his
description, I was impressed. Not sure HP is still marketing it
though.

Again, a great summary !

-- Wendy



From parvez.h.shaikh at gmail.com  Tue Jan 25 09:39:43 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Tue, 25 Jan 2011 15:09:43 +0530
Subject: [Linux-cluster] Running cluster tools using non-root user
Message-ID: <AANLkTinPmPZOe77J-1mxzFovL9zoVz1vGoGr2BcniVGK@mail.gmail.com>

Hi all

Is it possible to run cluster tools like clustat or clusvcadm etc. using
non-root user?

If yes, to which groups this user should belong to? Otherwise can this be
done using sudo(and sudoers) file.

As of now I get following error on clustat -

Could not connect to CMAN: Permission denied


Thanks,
Parvez
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110125/553fccc2/attachment.htm>

From swhiteho at redhat.com  Tue Jan 25 10:01:28 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Tue, 25 Jan 2011 10:01:28 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
	<021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
	<AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>
Message-ID: <1295949688.2598.19.camel@dolmen>

Hi,

On Mon, 2011-01-24 at 20:37 -0800, Wendy Cheng wrote:
> Comments in-line ...
> 
> On Mon, Jan 24, 2011 at 6:55 PM, Jankowski, Chris
> <Chris.Jankowski at hp.com> wrote:
> > A few comments, which might contrast uses of GFS2 and XFS in enterprise class production environments:
> >
> > 1.
> > SAN snapshot is not a panacea, as it is only crash consistent and only within a single LUN.
> > If you have your data or database spread over multiple LUNs each with its own filesystem,
> > then you are on your own.
> 
> It depends on the SAN box. Some products have aggregate level
> snapshots that can contain multiple LUNs.
> 
> However, the argument here is correct; that is, SAN snaphost is not a
> panacea. Other than different SAN vendors may have different setup(s),
> snapshot restore could require specific knowledge of the filesystem
> involved (e.g. how the journal is replayed). So there are integration
> and test  efforts required for the restore to work well.
> 
> >
> > 2.
> > Therefore, we still need at least OS level (filesystem level) consistent backup
> > if the application itself does not provide a hot backup mechanism, which very few do.
> > The consistent filesystem level backup requires freeze and thaw commands.
> > XFS offers them, GFS2 does not.
> 
> I seem to see GFS2 having freeze/thaw patches in the past ? But for
> backup to work well, it requires more than freeze/thaw.
> 
Yes, GFS2 has freeze/thaw just like many local filesystems. It uses the
same interface. The main difference is that with GFS2 the freeze will
freeze all nodes when the freeze is run from a single node. The thaw
must be run from the same node which did the freeze.

Currently due to lack of supported cluster snapshots, SAN backup is our
suggested solution. Obviously there are some caveats with that, as
mentioned above.

> >
> > 3.
> > GFS2 provides only tar(1) as a backup mechanism.
> > Unfortunately, tar(1) does not cope efficiently with sparse files,
> > which many applications create.
> > As an exercise create a 10 TB sparse file with just one byte of non-null data at the end.
> > Then try to back it up to disk using tar(1).
> > The tar image will be correctly created, but it will take many, many hours.
> > Dump(8) would do the job in a blink, but is not available for GFS2 filesystem.
> > However, XFS does have XFS specific dump(8) command and will backup sparse files
> > efficiently.
> >
You don't need dump in order to do this (since dump reads directly from
the block device itself, that would be problematic on GFS/GFS2 anyway).
All that is required is a backup too which support the FIEMAP ioctl. I
don't know if that has made it into tar yet, I suspect probably not.

> > 4.
> > GFS2 is very convenient to use, as by its nature is clusterised.
> > However, there is huge performance cost to pay for all this convenience.
> > This cost stems from serialization imposed by distributed lock manager.
> >
That depends largely on the I/O pattern, hence my original question. It
can sometimes be difficult to arrange for the I/O to follow a pattern
which allows GFS2 to work at full efficiency, but by doing so, it will
make a big difference to the performance and retains the advantage of
the unified name space.

> > 5.
> > For these reason, for the HA applications running on one node at a time,
> > I found that XFS on top of LVM gives me the best mix of performance and functionality:
> > - high performance
> > - efficient backup of sparse files
> > - backup consistency through freeze/thaw
> > - zero downtime backup through use of LVM snapshots
> > - short failover times due to efficient XFS transaction logs
> >
> > So, for this type of HA applications (failover HA) and environment,
> > it makes perfect sense to use XFS in a cluster instead of GFS2.
> >
> > Having said that, GFS2 can, in principle, be engineered to be much better
> > for failover HA applications.
> >
> > It would require development of:
> > - GFS2 specific dump(8)
I don't agree that we need GFS2 specific dump. It is something that, to
date, nobody has requested. I suspect that there is another solution to
achieving what you are after.

> > - GFS2 specific freeze and thaw commands
Already exists

> > - CLVM wide snapshots
That would be nice, but you need to ask the LVM team.

> > - more efficient DLM
Also nice, but I very much doubt that this has any effect on the case in
question. It is usually the disk I/O that causes the performance issues
rather than the DLM.

I hope that answers a few questions,

Steve.

> 
> You did a great summary here. By looking at the list, I would imagine
> CLVM snapshoting is probably the easiest, technically and politically.
> It's all up to GFS2 engineers to take the note.
> 
> >
> > It certainly is possible to do. Digital/Compaq/HP TruCluster Cluster File System (CFS) built on top of AdvFS had all of these features and much, much more by circa year 2000.
> >
> 
> Yep, I met a TruCluster developer 3 years ago. Based on his
> description, I was impressed. Not sure HP is still marketing it
> though.
> 
> Again, a great summary !
> 
> -- Wendy
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From gordan at bobich.net  Tue Jan 25 10:15:14 2011
From: gordan at bobich.net (Gordan Bobic)
Date: Tue, 25 Jan 2011 10:15:14 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
References: <4D3DD02A.3000303@dbtgroup.com>	<1295898535.4672.75.camel@dolmen><AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com><021EA18B28FE4C968F4538D7FB621E20@versa>	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
Message-ID: <4D3EA2B2.7050803@bobich.net>

Jankowski, Chris wrote:
[...]
> It would require development of:
> - GFS2 specific dump(8)
> - GFS2 specific freeze and thaw commands
> - CLVM wide snapshots
> - more efficient DLM
> 
> It certainly is possible to do. Digital/Compaq/HP TruCluster Cluster File System (CFS) built on top of AdvFS had all of these features and much, much more by circa year 2000.

Indeed, I, too, have been noticing that 10-15 years ago we had all sorts 
of technologies that have all but disappeared and haven't been replaced 
with anything nearly as advanced. Personally, I blame the rise of the 
fenomenon of people calling themselves "programmers" because they can 
write "Hello world!" in HTML. But I digress, and it's somewhat off-topic.

On topic - for consistent "snapshots" for backup purposes there is a 
cobbled together solution that can be used for this sort of thing, if 
you have the option of using DRBD. If you have a DRBD setup you can 
unmount the file system on a slave node and drop it out of replication 
completely (make it standalone). Mount the GFS file system with 
ro,lock=nolock, and back it up with whatever your prefered backup 
solution is. Then bounce DRBD completely on the slave to make it catch 
up with the master.

Gordan



From ana.perez at roche.com  Tue Jan 25 15:19:36 2011
From: ana.perez at roche.com (Perez, Ana)
Date: Tue, 25 Jan 2011 16:19:36 +0100
Subject: [Linux-cluster] Qdiskd: heuristics not working
Message-ID: <9E370B183F253D48A26A4D803EDF355A0250D2B326@RBAMSEM705.emea.roche.com>

Hi,

I'm working with RHCS version


cman.x86_64                      2.0.115-68.el5_6.1   installed

openais.x86_64                   0.80.6-28.el5        installed

rgmanager.x86_64                 2.0.52-9.el5         installed

I've a two node cluster with quorum device. I'm using heuristics to prevent fence death situation. My heuristic was working fine with a previous version of cman.
When I simulated a network partition. I saw the message below:

qdiskd[10657]: <info> Heuristic: '/usr/local/cmcluster/conf/admin/test_hb.sh' DOWN (1/1)

However with the new version of cman, with the same heuristics, if I simulate a network partition, qdiskd doesn't realize and just after totem token timeout I see the message below:

openais[3750]: [TOTEM] The token was lost in the OPERATIONAL state.

The nodes fence each other and my cluster ends powered off. Anyone suffering with quorum heuristics with the latest cman version?

Thanks
Best Regards,
Ana
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110125/40a1e755/attachment.htm>

From s.wendy.cheng at gmail.com  Tue Jan 25 17:16:32 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Tue, 25 Jan 2011 09:16:32 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <1295949688.2598.19.camel@dolmen>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
	<021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
	<AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>
	<1295949688.2598.19.camel@dolmen>
Message-ID: <AANLkTinBazhxqPLuz--hNCt4ChMLN3z2SkL-CuK-6jSS@mail.gmail.com>

On Tue, Jan 25, 2011 at 2:01 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:

>> On Mon, Jan 24, 2011 at 6:55 PM, Jankowski, Chris
>> <Chris.Jankowski at hp.com> wrote:
>> > A few comments, which might contrast uses of GFS2 and XFS in enterprise class production environments:
>> >
>> > 3.
>> > GFS2 provides only tar(1) as a backup mechanism.
>> > Unfortunately, tar(1) does not cope efficiently with sparse files,
>> > which many applications create.
>> > As an exercise create a 10 TB sparse file with just one byte of non-null data at the end.
>> > Then try to back it up to disk using tar(1).
>> > The tar image will be correctly created, but it will take many, many hours.
>> > Dump(8) would do the job in a blink, but is not available for GFS2 filesystem.
>> > However, XFS does have XFS specific dump(8) command and will backup sparse files
>> > efficiently.
>> >
> You don't need dump in order to do this (since dump reads directly from
> the block device itself, that would be problematic on GFS/GFS2 anyway).
> All that is required is a backup too which support the FIEMAP ioctl. I
> don't know if that has made it into tar yet, I suspect probably not.
>

If cluster snapshot is in the hand of another develop team (that may
not see it as a high priority), a GFS2 specific dump command could be
a good alternative. The bottom line here is GFS2 is lacking a sensible
(read as "easy to use") backup strategy that can significantly
jeopardize its deployment.

Of couse, this depends on .... someone has to be less stubborn and
willing to move GFS2's inode number away from its physical disk block
number. Cough !

-- Wendy



From rafagriman at gmail.com  Tue Jan 25 18:28:29 2011
From: rafagriman at gmail.com (Rafa =?utf-8?q?Grim=C3=A1n?=)
Date: Tue, 25 Jan 2011 19:28:29 +0100
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<201101242326.06753.rafagriman@gmail.com>
	<AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
Message-ID: <201101251928.29665.rafagriman@gmail.com>

Hi :)

On Tuesday 25 January 2011 00:45 Wendy Cheng wrote
> On Mon, Jan 24, 2011 at 2:26 PM, Rafa Grim?n <rafagriman at gmail.com> wrote:
> > On Monday 24 January 2011 22:58 Jeff Sturm wrote
> > 
> >> > -----Original Message-----
> >> > From: linux-cluster-bounces at redhat.com
> >> 
> >> [mailto:linux-cluster-bounces at redhat.com]
> >> 
> >> > On Behalf Of Wendy Cheng
> >> > Subject: Re: [Linux-cluster] gfs2 v. zfs?
> >> > 
> >> > I would love to get an education here. From usage model point of view,
> >> > what is the
> >> > difference between a "parallel file system" and a "cluster file
> >> > system" ? i.e., when to
> >> > use a parallel file system and when to use a cluster file system ?
> >> 
> >> Getting off-topic but I'd also like to hear who uses a parallel
> >> distributed FS, and what problem space they work well in.
> > 
> > HPC where you need ?very high bandwidth/throughput to disk (usually
> > scratch filesystem).
> 
> You hit the right points (and thanks for previous explanation) !
> However, from usage point of view, I think the line is blurry these
> days (e.g. IBM's GPFS is said to be a cluster filesystem but have been
> used well in HPC environment).


Yes that is true. It's a bit blurry because some file systems have features 
others have so "classifying" them is quite difficult.

   Rafa

-- 
"We cannot treat computers as Humans. Computers need love."

Happily using KDE 4.5.4 :)



From yvette at dbtgroup.com  Tue Jan 25 19:34:15 2011
From: yvette at dbtgroup.com (yvette hirth)
Date: Tue, 25 Jan 2011 19:34:15 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <201101251928.29665.rafagriman@gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>	<201101242326.06753.rafagriman@gmail.com>	<AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
	<201101251928.29665.rafagriman@gmail.com>
Message-ID: <4D3F25B7.3080306@dbtgroup.com>

Rafa Grim?n wrote:

> Yes that is true. It's a bit blurry because some file systems have features 
> others have so "classifying" them is quite difficult.

i'm amazed at the conversation that has taken place by me simply asking 
a question.

*Thank You* all for all of this info!

we've traced the response time slowdown to "number of subdirectories 
that need to be listed when their parent directory is enumerated".

btw, my usage of "enumeration" means, "list contents".  sorry for any 
confusion.

we've noticed that if we do:

ls -lhad /foo
ls -lhad /foo/stuff
ls -lhad /foo/stuff/moreStuff

response time is good, because , but

ls -lhad /foo/stuff/moreStuff/*

is where response time increases by a magnitude, because moreStuff has 
~260 directories.  enumerating moreStuff and other "directories with 
many subdirectories" appear to be the culprits.

for now, we'll be moving directories around, trying to reduce the number 
of nested levels, and number of elements in each level.

in human interaction there is a rule:  as the number of people 
interacting increases linearly, the number of interactions between the 
people increases exponentially.  is it true that as the number of nodes, 
"n", increases linearly, the amount of metadata being passed around / 
inspected during disk access increases geometrically?  does this "rule" 
apply?  or does metadata processing increase linearly as well, because 
the querying is all done by one node?

thanks again - what a group!
yvette



From steve.little at gmail.com  Tue Jan 25 20:01:00 2011
From: steve.little at gmail.com (Steve Little)
Date: Tue, 25 Jan 2011 20:01:00 +0000
Subject: [Linux-cluster] Deadlock detection in libdlm
Message-ID: <AANLkTik_18nO9U0JGerkRMbBViCi53YEoLaqD6ndNOH7@mail.gmail.com>

I've been trying to make use of deadlock detection in libdlm, but
without any luck so far. I'm hoping someone can tell me what I'm doing
wrong, or how to debug this further.

My test code looks like this:

#include <sys/types.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define _REENTRANT
#include <libdlm.h>

void lock(struct dlm_lksb *l, const char *name, int mode) {
  printf("[%d] Attempting to lock %s, mode %d\n",getpid(),name,mode);
  int status = dlm_lock_wait(LKM_NLMODE, l, LKF_EXPEDITE, name, strlen(name),
                            0, NULL, NULL, NULL);
  if(status != 0) abort();

  status = dlm_lock_wait(mode, l, LKF_CONVERT | LKF_CONVDEADLK, name,
strlen(name),
                            0, NULL, NULL, NULL);
  if(status == 0) status = l->sb_status;

  printf("[%d] Status was %d\n",getpid(),status);
}

int main(void) {

  pid_t pid = fork();

  if(pid == 0) { // child process
    if(dlm_pthread_init() != 0) abort();

    struct dlm_lksb l1,l2;
    memset(&l1,0,sizeof(l1));
    memset(&l2,0,sizeof(l2));

    lock(&l1,"A",LKM_PRMODE);

    lock(&l2,"B",LKM_EXMODE);

    dlm_unlock_wait(l1.sb_lkid,0,&l1);
    dlm_unlock_wait(l2.sb_lkid,0,&l2);
    return EXIT_SUCCESS;
  } else { // parent process
    if(dlm_pthread_init() != 0) abort();

    struct dlm_lksb l1,l2;
    memset(&l1,0,sizeof(l1));
    memset(&l2,0,sizeof(l2));

    lock(&l1,"B",LKM_PRMODE);

    sleep(5); // wait to ensure child has grabbed A
    lock(&l2,"A",LKM_EXMODE);

    dlm_unlock_wait(l2.sb_lkid,0,&l2);
    dlm_unlock_wait(l1.sb_lkid,0,&l1);
  }

  return EXIT_SUCCESS;
}

This should cause a classic deadlock: process 1 is waiting on resource
A, which is locked by process 2. Process 2 is waiting on resource B,
which is locked by process 1.

>From the manpage, I would expect this to be detected and resolved by
one of the lock requests being refused:

"Return values
      *snip*
       EDEADLOCK       The lock operation is causing a deadlock and has been
                       cancelled. If this was a conversion then the lock is
                       reverted to its previously granted state. If it was a
                       new lock then it has not been granted. (NB Only
                       conversion deadlocks are currently detected)"

But instead, the process hangs indefinitely, until I kill it:


$ ./a.out
[27986] Attempting to lock A, mode 3
[27985] Attempting to lock B, mode 3
[27986] Status was 0
[27986] Attempting to lock B, mode 5
[27985] Status was 0
[27985] Attempting to lock A, mode 5
<hangs here>

Here's the output of lockdump:

$ /sbin/dlm_tool lockdump default
id 01aa0005 gr PR rq IV pid 27986 master 2 "A"
id 034f0004 gr NL rq EX pid 27985 master 2 "A"
id 03630001 gr PR rq IV pid 27985 master 4 "B"
id 02070004 gr NL rq EX pid 27986 master 4 "B"

and lockdebug:

$ /sbin/dlm_tool lockdebug default

Resource ffff810c1f02c080 Name (len=1) "A"
Local Copy, Master is node 2
Granted Queue
01aa0005 PR Master:     03b80003
Conversion Queue
034f0004 NL (EX) Master:     02310005
Waiting Queue

Resource ffff810c1f02cc80 Name (len=1) "B"
Local Copy, Master is node 4
Granted Queue
03630001 PR Master:     030c0001
Conversion Queue
02070004 NL (EX) Master:     03530003
Waiting Queue

The machine I'm using is running RHEL5.



From jeff.sturm at eprize.com  Tue Jan 25 20:18:29 2011
From: jeff.sturm at eprize.com (Jeff Sturm)
Date: Tue, 25 Jan 2011 15:18:29 -0500
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D3F25B7.3080306@dbtgroup.com>
References: <4D3DD02A.3000303@dbtgroup.com>	<201101242326.06753.rafagriman@gmail.com>	<AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com><201101251928.29665.rafagriman@gmail.com>
	<4D3F25B7.3080306@dbtgroup.com>
Message-ID: <64D0546C5EBBD147B75DE133D798665F06A12B14@hugo.eprize.local>

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com]
> On Behalf Of yvette hirth
> Subject: Re: [Linux-cluster] gfs2 v. zfs?
> 
> ls -lhad /foo/stuff/moreStuff/*
> 
> is where response time increases by a magnitude, because moreStuff has
> ~260 directories.  enumerating moreStuff and other "directories with many
> subdirectories" appear to be the culprits.

Try "unalias ls" before you do that.  Does it make a difference?

On my systems (and probably most CentOS/Red Hat) I have:

	$ alias ls
	alias ls='ls --color=tty'

defined in /etc/profile.d/colorls.sh.  This results in an extra stat() system call per file in the output listing to determine the node type.

-Jeff




From teigland at redhat.com  Tue Jan 25 22:19:48 2011
From: teigland at redhat.com (David Teigland)
Date: Tue, 25 Jan 2011 17:19:48 -0500
Subject: [Linux-cluster] Deadlock detection in libdlm
In-Reply-To: <AANLkTik_18nO9U0JGerkRMbBViCi53YEoLaqD6ndNOH7@mail.gmail.com>
References: <AANLkTik_18nO9U0JGerkRMbBViCi53YEoLaqD6ndNOH7@mail.gmail.com>
Message-ID: <20110125221948.GB3703@redhat.com>

On Tue, Jan 25, 2011 at 08:01:00PM +0000, Steve Little wrote:
> I've been trying to make use of deadlock detection in libdlm, but
> without any luck so far. I'm hoping someone can tell me what I'm doing
> wrong, or how to debug this further.

The dlm detects *conversion* deadlocks on a single resource and returns
EDEADLK for them.

> This should cause a classic deadlock: process 1 is waiting on resource
> A, which is locked by process 2. Process 2 is waiting on resource B,
> which is locked by process 1.

A "classic" multi-resource deadlock is not detected.

Google came up with this nice description of the difference:
http://books.google.com/books?id=ydKIsgCiFVsC&pg=PA143&lpg=PA143&dq=conversion+deadlock&source=bl&ots=LSJEUQU3HI&sig=eo4UhF9sR474OvQ1Nbeid2iHTOI&hl=en&ei=6kk_TZOeNImycdLyidEB&sa=X&oi=book_result&ct=result&resnum=6&ved=0CDkQ6AEwBTgK#v=onepage&q=conversion%20deadlock&f=false

The dlm also does lock timeouts which could be used to approximate
deadlock detection/resolution.

I wrote a "toy" proof of concept for full deadlock detection once.  The
code still exists in dlm_controld, I'm not sure if the sufficient flags
exist in the API to enable and play with it any more (that's about all
it's good for.)


>        EDEADLOCK       The lock operation is causing a deadlock and has been
>                        cancelled. If this was a conversion then the lock is
>                        reverted to its previously granted state. If it was a
>                        new lock then it has not been granted. (NB Only
>                        conversion deadlocks are currently detected)"

It does note the limitation.

Dave



From s.wendy.cheng at gmail.com  Tue Jan 25 23:27:16 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Tue, 25 Jan 2011 15:27:16 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D3F25B7.3080306@dbtgroup.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<201101242326.06753.rafagriman@gmail.com>
	<AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
	<201101251928.29665.rafagriman@gmail.com>
	<4D3F25B7.3080306@dbtgroup.com>
Message-ID: <AANLkTimtKdTx=dF6275wmdOV+Sb3oFSO6DbEEdtm+qOP@mail.gmail.com>

On Tue, Jan 25, 2011 at 11:34 AM, yvette hirth <yvette at dbtgroup.com> wrote:
> Rafa Grim?n wrote:
>
>> Yes that is true. It's a bit blurry because some file systems have
>> features others have so "classifying" them is quite difficult.
>
> i'm amazed at the conversation that has taken place by me simply asking a
> question.
>
> *Thank You* all for all of this info!

We purposely diverted your question to backup, since it is easier to
have productive discussions (compared to directory list) :) ... In
general, any "walk" operation on GFS2 can become a pain for various
reasons. Directory listing is certainly one of them. It is an age old
problem. Other than the inherited issues from the horrible stat()
system call, it is also to do with the way GFS(1/2) likes to
"distribute" its block all over the device upon write contention. I
don't see how GFS2 can alleviate this pain w/out doing some sorts of
block reallocation.

I'll let other capable people to have another round of fun discussions
.... Maybe some creative ideas can get popped out as the result ...

-- Wendy

>
> we've traced the response time slowdown to "number of subdirectories that
> need to be listed when their parent directory is enumerated".
>
> btw, my usage of "enumeration" means, "list contents". ?sorry for any
> confusion.
>
> we've noticed that if we do:
>
> ls -lhad /foo
> ls -lhad /foo/stuff
> ls -lhad /foo/stuff/moreStuff
>
> response time is good, because , but
>
> ls -lhad /foo/stuff/moreStuff/*
>
> is where response time increases by a magnitude, because moreStuff has ~260
> directories. ?enumerating moreStuff and other "directories with many
> subdirectories" appear to be the culprits.
>
> for now, we'll be moving directories around, trying to reduce the number of
> nested levels, and number of elements in each level.
>
> in human interaction there is a rule: ?as the number of people interacting
> increases linearly, the number of interactions between the people increases
> exponentially. ?is it true that as the number of nodes, "n", increases
> linearly, the amount of metadata being passed around / inspected during disk
> access increases geometrically? ?does this "rule" apply? ?or does metadata
> processing increase linearly as well, because the querying is all done by
> one node?
>
> thanks again - what a group!
> yvette
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From andrew at beekhof.net  Wed Jan 26 09:52:39 2011
From: andrew at beekhof.net (Andrew Beekhof)
Date: Wed, 26 Jan 2011 10:52:39 +0100
Subject: [Linux-cluster] Running cluster tools using non-root user
In-Reply-To: <AANLkTinPmPZOe77J-1mxzFovL9zoVz1vGoGr2BcniVGK@mail.gmail.com>
References: <AANLkTinPmPZOe77J-1mxzFovL9zoVz1vGoGr2BcniVGK@mail.gmail.com>
Message-ID: <AANLkTin390eSrG=RCuocD-0VjBOJQ9DXaSnms26N8QjS@mail.gmail.com>

[Shameless plug]

The next version of Pacemaker (1.1.6) will have this feature :-)
The patches were merged form our devel branch about a week ago.

[/Shameless plug]

On Tue, Jan 25, 2011 at 10:39 AM, Parvez Shaikh
<parvez.h.shaikh at gmail.com> wrote:
> Hi all
>
> Is it possible to run cluster tools like clustat or clusvcadm etc. using
> non-root user?
>
> If yes, to which groups this user should belong to? Otherwise can this be
> done using sudo(and sudoers) file.
>
> As of now I get following error on clustat -
>
> Could not connect to CMAN: Permission denied
>
>
> Thanks,
> Parvez
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From swhiteho at redhat.com  Wed Jan 26 10:00:17 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 26 Jan 2011 10:00:17 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTimtKdTx=dF6275wmdOV+Sb3oFSO6DbEEdtm+qOP@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>
	<201101242326.06753.rafagriman@gmail.com>
	<AANLkTi=Uv+v+QHmPGfRRC4twSwp3a9XJ8d1_yVm5E9X4@mail.gmail.com>
	<201101251928.29665.rafagriman@gmail.com>
	<4D3F25B7.3080306@dbtgroup.com>
	<AANLkTimtKdTx=dF6275wmdOV+Sb3oFSO6DbEEdtm+qOP@mail.gmail.com>
Message-ID: <1296036017.2575.2.camel@dolmen>

Hi,

On Tue, 2011-01-25 at 15:27 -0800, Wendy Cheng wrote:
> On Tue, Jan 25, 2011 at 11:34 AM, yvette hirth <yvette at dbtgroup.com> wrote:
> > Rafa Grim?n wrote:
> >
> >> Yes that is true. It's a bit blurry because some file systems have
> >> features others have so "classifying" them is quite difficult.
> >
> > i'm amazed at the conversation that has taken place by me simply asking a
> > question.
> >
> > *Thank You* all for all of this info!
> 
> We purposely diverted your question to backup, since it is easier to
> have productive discussions (compared to directory list) :) ... In
> general, any "walk" operation on GFS2 can become a pain for various
> reasons. Directory listing is certainly one of them. It is an age old
> problem. Other than the inherited issues from the horrible stat()
> system call, it is also to do with the way GFS(1/2) likes to
> "distribute" its block all over the device upon write contention. I
> don't see how GFS2 can alleviate this pain w/out doing some sorts of
> block reallocation.
> 
> I'll let other capable people to have another round of fun discussions
> .... Maybe some creative ideas can get popped out as the result ...
> 
> -- Wendy
> 
Although the block allocation can be an issue, the larger issue is that
of caching and when/whether the cache is flushed due to a write
operation on another node. The combination of workloads which scan all
files and a write workload, updating the same file set when run from
different nodes can cause dramatic slowdowns.

The solution is to try and partition the workload in a way which makes
the best use of the cache and reduces the number of invalidations which
are done,

Steve.


> >
> > we've traced the response time slowdown to "number of subdirectories that
> > need to be listed when their parent directory is enumerated".
> >
> > btw, my usage of "enumeration" means, "list contents".  sorry for any
> > confusion.
> >
> > we've noticed that if we do:
> >
> > ls -lhad /foo
> > ls -lhad /foo/stuff
> > ls -lhad /foo/stuff/moreStuff
> >
> > response time is good, because , but
> >
> > ls -lhad /foo/stuff/moreStuff/*
> >
> > is where response time increases by a magnitude, because moreStuff has ~260
> > directories.  enumerating moreStuff and other "directories with many
> > subdirectories" appear to be the culprits.
> >
> > for now, we'll be moving directories around, trying to reduce the number of
> > nested levels, and number of elements in each level.
> >
> > in human interaction there is a rule:  as the number of people interacting
> > increases linearly, the number of interactions between the people increases
> > exponentially.  is it true that as the number of nodes, "n", increases
> > linearly, the amount of metadata being passed around / inspected during disk
> > access increases geometrically?  does this "rule" apply?  or does metadata
> > processing increase linearly as well, because the querying is all done by
> > one node?
> >
> > thanks again - what a group!
> > yvette
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From swhiteho at redhat.com  Wed Jan 26 10:19:27 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 26 Jan 2011 10:19:27 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <AANLkTinBazhxqPLuz--hNCt4ChMLN3z2SkL-CuK-6jSS@mail.gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
	<021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
	<AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>
	<1295949688.2598.19.camel@dolmen>
	<AANLkTinBazhxqPLuz--hNCt4ChMLN3z2SkL-CuK-6jSS@mail.gmail.com>
Message-ID: <1296037167.2575.21.camel@dolmen>

Hi,

On Tue, 2011-01-25 at 09:16 -0800, Wendy Cheng wrote:
> On Tue, Jan 25, 2011 at 2:01 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:
> 
> >> On Mon, Jan 24, 2011 at 6:55 PM, Jankowski, Chris
> >> <Chris.Jankowski at hp.com> wrote:
> >> > A few comments, which might contrast uses of GFS2 and XFS in enterprise class production environments:
> >> >
> >> > 3.
> >> > GFS2 provides only tar(1) as a backup mechanism.
> >> > Unfortunately, tar(1) does not cope efficiently with sparse files,
> >> > which many applications create.
> >> > As an exercise create a 10 TB sparse file with just one byte of non-null data at the end.
> >> > Then try to back it up to disk using tar(1).
> >> > The tar image will be correctly created, but it will take many, many hours.
> >> > Dump(8) would do the job in a blink, but is not available for GFS2 filesystem.
> >> > However, XFS does have XFS specific dump(8) command and will backup sparse files
> >> > efficiently.
> >> >
> > You don't need dump in order to do this (since dump reads directly from
> > the block device itself, that would be problematic on GFS/GFS2 anyway).
> > All that is required is a backup too which support the FIEMAP ioctl. I
> > don't know if that has made it into tar yet, I suspect probably not.
> >
> 
> If cluster snapshot is in the hand of another develop team (that may
> not see it as a high priority), a GFS2 specific dump command could be
> a good alternative. The bottom line here is GFS2 is lacking a sensible
> (read as "easy to use") backup strategy that can significantly
> jeopardize its deployment.
> 
> Of couse, this depends on .... someone has to be less stubborn and
> willing to move GFS2's inode number away from its physical disk block
> number. Cough !
> 
> -- Wendy
> 
I don't know of any reason why the inode number should be related to
back up. The reason why it was suggested that the inode number should be
independent of the physical block number was in order to allow
filesystem shrink without upsetting (for example) NFS which assumed that
its filehandles are valid "forever".

The problem with doing that is that it adds an extra layer of
indirection (and one which had not been written in gfs2 at the point in
time we took that decision). That extra layer of indirection means more
overhead on every lookup of the inode. It would also be a contention
point in a distributed filesystem, since it would be global state.

The dump command directly accesses the filesystem via the block device
which is a problem for GFS2, since there is no guarantee (and in general
it won't be) that the information read via this method will match the
actual content of the filesystem. Unlike ext2/3 etc., GFS2 caches its
metadata in per-inode address spaces which are kept coherent using
glocks. In ext2/3 etc., the metadata is cached in the block device
address space which is why dump can work with them.

With GFS2 the only way to ensure that the block device was consistent
would be to umount the filesystem on all nodes. In that case it is no
problem to simply copy the block device using dd, for example. So dump
is not required.

Ideally we want backup to be online (i.e. with the filesystem mounted),
and we also do not want it to disrupt the workload which the cluster was
designed for, so far as possible. So the best solution is to back up
files from the node which is most likely to be caching them. That also
means that the backup can proceed in parallel across the nodes, reducing
the time taken.

It does mean that a bit more thought has to go into it, since it may not
be immediately obvious what the working set of each node actually is.
Usually though, it is possible to make a reasonable approximation of it,

Steve.




From m.watts at eris.qinetiq.com  Wed Jan 26 14:12:54 2011
From: m.watts at eris.qinetiq.com (Mark Watts)
Date: Wed, 26 Jan 2011 14:12:54 +0000
Subject: [Linux-cluster] CentOS 5.5 RHCS (cluster.conf) apache resource
	issues
Message-ID: <4D402BE6.8050205@eris.qinetiq.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


I have a 2-node CentOS 5.5 cluster, with a single service comprising of
an IP address and apache resource.

Starting and stopping rgmanager works fine - the service starts and
stops with no issues.


# service rgmanager start
Jan 26 14:01:56 proxy01 clurgmgrd[5305]: <notice> Resource Group Manager
Starting
Jan 26 14:01:56 proxy01 clurgmgrd[5305]: <info> Loading Service Data
Jan 26 14:02:00 proxy01 clurgmgrd[5305]: <info> Initializing Services
Jan 26 14:02:01 proxy01 clurgmgrd: [5305]: <info> Stopping Service
apache:Public_Proxy_HTTPD
Jan 26 14:02:01 proxy01 clurgmgrd: [5305]: <err> Checking Existence Of
File /var/run/cluster/apache/apache:Public_Proxy_HTTPD.pid
[apache:Public_Proxy_HTTPD] > Failed - File Doesn't Exist
Jan 26 14:02:01 proxy01 clurgmgrd: [5305]: <info> Stopping Service
apache:Public_Proxy_HTTPD > Succeed
Jan 26 14:02:02 proxy01 clurgmgrd[5305]: <info> Services Initialized
Jan 26 14:02:02 proxy01 clurgmgrd[5305]: <info> State change: Local UP
Jan 26 14:02:07 proxy01 clurgmgrd[5305]: <notice> Starting stopped
service service:Public_Proxy
Jan 26 14:02:07 proxy01 clurgmgrd: [5305]: <info> Adding IPv4 address
192.168.0.1/24 to eth0
Jan 26 14:02:11 proxy01 clurgmgrd: [5305]: <info> Starting Service
apache:Public_Proxy_HTTPD
Jan 26 14:02:14 proxy01 clurgmgrd[5305]: <notice> Service
service:Public_Proxy started

# service rgmanager stop
Jan 26 14:02:20 proxy01 rgmanager: [5947]: <notice> Shutting down
Cluster Service Manager...
Jan 26 14:02:20 proxy01 clurgmgrd[5305]: <notice> Shutting down
Jan 26 14:02:20 proxy01 clurgmgrd[5305]: <notice> Shutting down
Jan 26 14:02:20 proxy01 clurgmgrd[5305]: <notice> Stopping service
service:Public_Proxy
Jan 26 14:02:21 proxy01 clurgmgrd: [5305]: <info> Stopping Service
apache:Public_Proxy_HTTPD
Jan 26 14:02:21 proxy01 clurgmgrd: [5305]: <info> Stopping Service
apache:Public_Proxy_HTTPD > Succeed
Jan 26 14:02:22 proxy01 clurgmgrd: [5305]: <info> Removing IPv4 address
192.168.0.1/24 from eth0
Jan 26 14:02:32 proxy01 clurgmgrd[5305]: <notice> Service
service:Public_Proxy is stopped
Jan 26 14:02:32 proxy01 clurgmgrd[5305]: <notice> Shutdown complete, exiting
Jan 26 14:02:32 proxy01 rgmanager: [5947]: <notice> Cluster Service
Manager is stopped.


Starting rgmanager then using clusvcadm to disable/enable/disable the
service fails on the last disable:

# service rgmanager start
Jan 26 13:53:26 proxy01 clurgmgrd[3157]: <notice> Resource Group Manager
Starting
Jan 26 13:53:26 proxy01 clurgmgrd[3157]: <info> Loading Service Data
Jan 26 13:53:31 proxy01 clurgmgrd[3157]: <info> Initializing Services
Jan 26 13:53:33 proxy01 clurgmgrd: [3157]: <info> Stopping Service
apache:Public_Proxy_HTTPD
Jan 26 13:53:33 proxy01 clurgmgrd: [3157]: <err> Checking Existence Of
File /var/run/cluster/apache/apache:Public_Proxy_HTTPD.pid
[apache:Public_Proxy_HTTPD] > Failed - File Doesn't Exist
Jan 26 13:53:33 proxy01 clurgmgrd: [3157]: <info> Stopping Service
apache:Public_Proxy_HTTPD > Succeed
Jan 26 13:53:33 proxy01 clurgmgrd[3157]: <info> Services Initialized
Jan 26 13:53:34 proxy01 clurgmgrd[3157]: <info> State change: Local UP
Jan 26 13:53:39 proxy01 clurgmgrd[3157]: <notice> Starting stopped
service service:Public_Proxy
Jan 26 13:53:39 proxy01 clurgmgrd[3157]: <info> State change:
ukmaboedprp02.mgmt UP
Jan 26 13:53:39 proxy01 clurgmgrd: [3157]: <info> Adding IPv4 address
192.168.0.1/24 to eth0
Jan 26 13:53:44 proxy01 clurgmgrd: [3157]: <info> Starting Service
apache:Public_Proxy_HTTPD
Jan 26 13:53:47 proxy01 clurgmgrd[3157]: <notice> Service
service:Public_Proxy started

# clusvcadm -d Public_Proxy
Jan 26 13:58:46 proxy01 clurgmgrd[3157]: <notice> Stopping service
service:Public_Proxy
Jan 26 13:58:46 proxy01 clurgmgrd: [3157]: <info> Stopping Service
apache:Public_Proxy_HTTPD
Jan 26 13:58:46 proxy01 clurgmgrd: [3157]: <info> Stopping Service
apache:Public_Proxy_HTTPD > Succeed
Jan 26 13:58:47 proxy01 clurgmgrd: [3157]: <info> Removing IPv4 address
192.168.0.1/24 from eth0
Jan 26 13:58:57 proxy01 clurgmgrd[3157]: <notice> Service
service:Public_Proxy is disabled

# clusvcadm -e Public_Proxy
Jan 26 13:59:04 proxy01 clurgmgrd[3157]: <notice> Starting disabled
service service:Public_Proxy
Jan 26 13:59:04 proxy01 clurgmgrd: [3157]: <info> Adding IPv4 address
192.168.0.1/24 to eth0
Jan 26 13:59:09 proxy01 clurgmgrd: [3157]: <info> Starting Service
apache:Public_Proxy_HTTPD
Jan 26 13:59:11 proxy01 clurgmgrd[3157]: <notice> Service
service:Public_Proxy started

# clusvcadm -d Public_Proxy
Jan 26 13:59:19 proxy01 clurgmgrd[3157]: <notice> Stopping service
service:Public_Proxy
Jan 26 13:59:20 proxy01 clurgmgrd: [3157]: <info> Stopping Service
apache:Public_Proxy_HTTPD
Jan 26 13:59:21 proxy01 clurgmgrd: [3157]: <err> Stopping Service
apache:Public_Proxy_HTTPD > Failed - Application Is Still Running
Jan 26 13:59:21 proxy01 clurgmgrd: [3157]: <err> Stopping Service
apache:Public_Proxy_HTTPD > Failed
Jan 26 13:59:21 proxy01 clurgmgrd[3157]: <notice> stop on apache
"Public_Proxy_HTTPD" returned 1 (generic error)
Jan 26 13:59:21 proxy01 clurgmgrd: [3157]: <info> Removing IPv4 address
192.168.0.1/24 from eth0
Jan 26 13:59:31 proxy01 clurgmgrd[3157]: <crit> #12: RG
service:Public_Proxy failed to stop; intervention required
Jan 26 13:59:31 proxy01 clurgmgrd[3157]: <notice> Service
service:Public_Proxy is failed


Anyone got any ideas why this is the case, and where I can look to work
out why I get a failure after an enable/disable cycle?

Regards,

Mark.

- -- 
Mark Watts BSc RHCE
Senior Systems Engineer, MSS Secure Managed Hosting
www.QinetiQ.com
QinetiQ - Delivering customer-focused solutions
GPG Key: http://www.linux-corner.info/mwatts.gpg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAk1AK+YACgkQBn4EFUVUIO1XXwCg6YdjpVkOicUllNerjfKtGilf
KawAoPtvfF43Jmrli8zyLCvEozYk+syh
=h5ra
-----END PGP SIGNATURE-----



From s.wendy.cheng at gmail.com  Wed Jan 26 16:16:24 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Wed, 26 Jan 2011 08:16:24 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <1296037167.2575.21.camel@dolmen>
References: <4D3DD02A.3000303@dbtgroup.com>
	<1295898535.4672.75.camel@dolmen>	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>	<021EA18B28FE4C968F4538D7FB621E20@versa>	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>	<AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>	<1295949688.2598.19.camel@dolmen>	<AANLkTinBazhxqPLuz--hNCt4ChMLN3z2SkL-CuK-6jSS@mail.gmail.com>
	<1296037167.2575.21.camel@dolmen>
Message-ID: <4D4048D8.6070304@gmail.com>

On 01/26/2011 02:19 AM, Steven Whitehouse wrote:
>
> I don't know of any reason why the inode number should be related to
> back up. The reason why it was suggested that the inode number should be
> independent of the physical block number was in order to allow
> filesystem shrink without upsetting (for example) NFS which assumed that
> its filehandles are valid "forever".

I'm not able to ping-pong too many emails on external mailing lists, at 
least during week days. However, a quick note on this ...

GFS2 fragments very soon and very badly ! Its blocks are all over the 
device due to the nature of how the resource group works. That slows 
down *every* thing, particularly for backup applications. A production 
deployment will encounter this issue very soon and they'll find the 
issue more than annoying.

Now, educate me (though I'll probably not read it until weekend) ... how 
will you defragment the FS with that inode number attaching to physical 
block number ? You can *not* move these inodes.

-- Wendy
> The problem with doing that is that it adds an extra layer of
> indirection (and one which had not been written in gfs2 at the point in
> time we took that decision). That extra layer of indirection means more
> overhead on every lookup of the inode. It would also be a contention
> point in a distributed filesystem, since it would be global state.
>
> The dump command directly accesses the filesystem via the block device
> which is a problem for GFS2, since there is no guarantee (and in general
> it won't be) that the information read via this method will match the
> actual content of the filesystem. Unlike ext2/3 etc., GFS2 caches its
> metadata in per-inode address spaces which are kept coherent using
> glocks. In ext2/3 etc., the metadata is cached in the block device
> address space which is why dump can work with them.
>
> With GFS2 the only way to ensure that the block device was consistent
> would be to umount the filesystem on all nodes. In that case it is no
> problem to simply copy the block device using dd, for example. So dump
> is not required.
>
> Ideally we want backup to be online (i.e. with the filesystem mounted),
> and we also do not want it to disrupt the workload which the cluster was
> designed for, so far as possible. So the best solution is to back up
> files from the node which is most likely to be caching them. That also
> means that the backup can proceed in parallel across the nodes, reducing
> the time taken.
>
> It does mean that a bit more thought has to go into it, since it may not
> be immediately obvious what the working set of each node actually is.
> Usually though, it is possible to make a reasonable approximation of it,
>
> Steve.
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From gordan at bobich.net  Wed Jan 26 16:36:03 2011
From: gordan at bobich.net (Gordan Bobic)
Date: Wed, 26 Jan 2011 16:36:03 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D4048D8.6070304@gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com>	<1295898535.4672.75.camel@dolmen>	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>	<021EA18B28FE4C968F4538D7FB621E20@versa>	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>	<AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>	<1295949688.2598.19.camel@dolmen>	<AANLkTinBazhxqPLuz--hNCt4ChMLN3z2SkL-CuK-6jSS@mail.gmail.com>	<1296037167.2575.21.camel@dolmen>
	<4D4048D8.6070304@gmail.com>
Message-ID: <4D404D73.8080306@bobich.net>

Wendy Cheng wrote:

> GFS2 fragments very soon and very badly ! Its blocks are all over the 
> device due to the nature of how the resource group works. That slows 
> down *every* thing, particularly for backup applications. A production 
> deployment will encounter this issue very soon and they'll find the 
> issue more than annoying.

While I don't disagree that it's a problem, most people will use GFS and 
similar FS-es on a SAN. A typical SAN will allocate sparse files for 
backing the block device. That means it'll be badly fragmented very 
quickly on the back end regardless of what the FS does.

Gordan



From swhiteho at redhat.com  Wed Jan 26 16:59:58 2011
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 26 Jan 2011 16:59:58 +0000
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D4048D8.6070304@gmail.com>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
	<021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
	<AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>
	<1295949688.2598.19.camel@dolmen>
	<AANLkTinBazhxqPLuz--hNCt4ChMLN3z2SkL-CuK-6jSS@mail.gmail.com>
	<1296037167.2575.21.camel@dolmen>  <4D4048D8.6070304@gmail.com>
Message-ID: <1296061198.2575.70.camel@dolmen>

Hi,

On Wed, 2011-01-26 at 08:16 -0800, Wendy Cheng wrote:
> On 01/26/2011 02:19 AM, Steven Whitehouse wrote:
> >
> > I don't know of any reason why the inode number should be related to
> > back up. The reason why it was suggested that the inode number should be
> > independent of the physical block number was in order to allow
> > filesystem shrink without upsetting (for example) NFS which assumed that
> > its filehandles are valid "forever".
> 
> I'm not able to ping-pong too many emails on external mailing lists, at 
> least during week days. However, a quick note on this ...
> 
> GFS2 fragments very soon and very badly ! Its blocks are all over the 
> device due to the nature of how the resource group works. That slows 
> down *every* thing, particularly for backup applications. A production 
> deployment will encounter this issue very soon and they'll find the 
> issue more than annoying.
> 
The fragmentation pattern of GFS2 depends upon the workload. If there
are lots of small files being created, and/or appended to and the fs is
nearly full then fragmentation will increase.

Each inode has a resource group that it will use by default. It only
changes resource group when either there are no blocks left in that
resource group, or if contention occurs wrt to another node. Provided
the number of resource groups is much larger than the number of nodes,
contention at the resource group level is largely avoided.

Given a filesystem which is say, no more than 80% full, and given files
which are reasonably large and written sequentially, the fragmentation
should be relatively low.

> Now, educate me (though I'll probably not read it until weekend) ... how 
> will you defragment the FS with that inode number attaching to physical 
> block number ? You can *not* move these inodes.
> 
> -- Wendy

It is true that we don't have a defragmentation solution at the moment.
We could potentially do that in an off-line manner, but so far we've not
really found that to be a big problem.

When there are performance issues, the caching issue where there is
contention on a single inode, is always a much greater issue overall.

Also, as an aside, if anybody wants to look at their GFS2 files to see
whether they are fragmented, then the filefrag tool from e2fsprogs will
tell you on a file by file basis. Often the best way to defragment a
file is simply to copy it, provided the filesystem is not already nearly
full. That also has the advantage of being able to be done online, and
with standard system tools.

Nevertheless, I agree that it would be nice to be able to move the
inodes around freely. I'm not sure that the cost of the required extra
layer of indirection would be worth it though, in terms of the benefits
gained.

I'm happy to be proved wrong, if someone can come up with a workable
solution,

Steve.


> > The problem with doing that is that it adds an extra layer of
> > indirection (and one which had not been written in gfs2 at the point in
> > time we took that decision). That extra layer of indirection means more
> > overhead on every lookup of the inode. It would also be a contention
> > point in a distributed filesystem, since it would be global state.
> >
> > The dump command directly accesses the filesystem via the block device
> > which is a problem for GFS2, since there is no guarantee (and in general
> > it won't be) that the information read via this method will match the
> > actual content of the filesystem. Unlike ext2/3 etc., GFS2 caches its
> > metadata in per-inode address spaces which are kept coherent using
> > glocks. In ext2/3 etc., the metadata is cached in the block device
> > address space which is why dump can work with them.
> >
> > With GFS2 the only way to ensure that the block device was consistent
> > would be to umount the filesystem on all nodes. In that case it is no
> > problem to simply copy the block device using dd, for example. So dump
> > is not required.
> >
> > Ideally we want backup to be online (i.e. with the filesystem mounted),
> > and we also do not want it to disrupt the workload which the cluster was
> > designed for, so far as possible. So the best solution is to back up
> > files from the node which is most likely to be caching them. That also
> > means that the backup can proceed in parallel across the nodes, reducing
> > the time taken.
> >
> > It does mean that a bit more thought has to go into it, since it may not
> > be immediately obvious what the working set of each node actually is.
> > Usually though, it is possible to make a reasonable approximation of it,
> >
> > Steve.
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster




From mgrac at redhat.com  Wed Jan 26 16:59:55 2011
From: mgrac at redhat.com (Marek Grac)
Date: Wed, 26 Jan 2011 17:59:55 +0100
Subject: [Linux-cluster] fence_agents package for 5.5?
In-Reply-To: <20110124192107.GY13584@mip.aaaaa.org>
References: <20110120001101.GZ864@mip.aaaaa.org> <4D37F7FC.8030702@redhat.com>
	<20110124192107.GY13584@mip.aaaaa.org>
Message-ID: <4D40530B.3000104@redhat.com>

On 01/24/2011 08:21 PM, Ofer Inbar wrote:
> On Thu, Jan 20, 2011 at 09:53:16AM +0100,
> Marek 'marx' Grc<mgrac at redhat.com>  wrote:
>> Ofer Inbar wrote:
>>> At http://sources.redhat.com/cluster/wiki/VMware_FencingConfig I see this:
>>>
>>> "If you are using RHEL 5.5/RHEL 6 just install fence_agents package and
>>> you are ready to use fence_vmware."
>> This package is available for RHEV-H (based on RHEL5.5) and is part
>> of RHEL 5.6.
> So...
>   - It's called fence-agents, not fence_agents (minor typo)
fixed
>   - It's not available for RHEL or CentOS 5.5
it it available in some channels for RHEL - it is not part of base or 
cluster channel; I do not know about CentOS
> ... the wiki is mistaken on both of these counts, yes?
>
>
> I've tried finding a fence-agents RPM I can use, but all the newer
> ones have big disruptive dependencies (such as needing a newer glibc),
> and older ones no longer seem to exist on the mirrors I've found
> pointers to.  I can find no evidence that fence-agents was ever built
> and distributed for any version of CentOS 5 / RHEL 5.
>
> Am I on the wrong track here, or is this a reasonable path towards
> getting a working vmware fencing agent for CentOS 5.5?
You can wait for CentOS 5.6 or you can use directly our git tree 
(cluster.git with branches RHEL5x) or fence-agents.git to have latest 
upstream version.

m,



From mgrac at redhat.com  Wed Jan 26 17:04:07 2011
From: mgrac at redhat.com (Marek Grac)
Date: Wed, 26 Jan 2011 18:04:07 +0100
Subject: [Linux-cluster] fence_vmware_ng fails: SHELL_TIMEOUT not defined
In-Reply-To: <20110124180304.GV13584@mip.aaaaa.org>
References: <20110119232105.GY864@mip.aaaaa.org> <4D37F97D.1050904@redhat.com>
	<20110124180304.GV13584@mip.aaaaa.org>
Message-ID: <4D405407.7000006@redhat.com>

On 01/24/2011 07:03 PM, Ofer Inbar wrote:
> On Thu, Jan 20, 2011 at 09:59:41AM +0100,
> Marek 'marx' Grc<mgrac at redhat.com>  wrote:
>
>> Ofer Inbar wrote:
>>> I've got a cluster on CentOS 5.5, cman-2.0.115-34.el5_5.4, using VMWare.
>>> $ sudo fence_vmware_ng -a virtualcenter -l [loginname] -p [password] -o
>>> NameError: global name 'SHELL_TIMEOUT' is not defined
>> Bug was very likely fixed in cman-2.0.115-39. But you can change these
>> constants with suitable values (few seconds should be enough).
> Thanks.
>
> Q1: Where can I get that build (or later) of cman for CentOS 5?  I
> can't find it in the CentOS or RHEL repos, or EPEL, or via
> rpm.pbone.net or some other places I tried.
All of our codebase is available in our git repositories which are open 
to public. They are on the cluster wiki page. You can also use your 
support contact to get fix to yours bug via other methods.
> Q2: As for changing the constants: I don't see where they are set in
> the first place, nor any description of what they're for, or what
> values they're supposed to have.  Are you suggesting I directly edit
> the fence_vmware_ng script and add definitions for these constants?
> Won't that be destroyed by the next install or upgrade of the RPM?  Is
> there someplace these constants are supposed to be defined?  And if
> I'm hacking the script directly, and have no idea what these constants
> are for, is there any reason I shouldn't just edit them out of the one
> line where they're used, and replace them with a small integer?
They are set in the fencing library (lib/fencing.py) but various fence 
agents needs different values. Those constants are described on  
http://sources.redhat.com/cluster/wiki/FenceTiming

m,



From m.watts at eris.qinetiq.com  Wed Jan 26 17:11:06 2011
From: m.watts at eris.qinetiq.com (Mark Watts)
Date: Wed, 26 Jan 2011 17:11:06 +0000
Subject: [Linux-cluster] fence_agents package for 5.5?
In-Reply-To: <4D40530B.3000104@redhat.com>
References: <20110120001101.GZ864@mip.aaaaa.org>
	<4D37F7FC.8030702@redhat.com>	<20110124192107.GY13584@mip.aaaaa.org>
	<4D40530B.3000104@redhat.com>
Message-ID: <4D4055AA.3000609@eris.qinetiq.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/26/2011 04:59 PM, Marek Grac wrote:
> On 01/24/2011 08:21 PM, Ofer Inbar wrote:
>> On Thu, Jan 20, 2011 at 09:53:16AM +0100,
>> Marek 'marx' Grc<mgrac at redhat.com>  wrote:
>>> Ofer Inbar wrote:
>>>> At http://sources.redhat.com/cluster/wiki/VMware_FencingConfig I see
>>>> this:
>>>>
>>>> "If you are using RHEL 5.5/RHEL 6 just install fence_agents package and
>>>> you are ready to use fence_vmware."
>>> This package is available for RHEV-H (based on RHEL5.5) and is part
>>> of RHEL 5.6.
>> So...
>>   - It's called fence-agents, not fence_agents (minor typo)
> fixed
>>   - It's not available for RHEL or CentOS 5.5
> it it available in some channels for RHEL - it is not part of base or
> cluster channel; I do not know about CentOS
>> ... the wiki is mistaken on both of these counts, yes?
>>
>>
>> I've tried finding a fence-agents RPM I can use, but all the newer
>> ones have big disruptive dependencies (such as needing a newer glibc),
>> and older ones no longer seem to exist on the mirrors I've found
>> pointers to.  I can find no evidence that fence-agents was ever built
>> and distributed for any version of CentOS 5 / RHEL 5.
>>
>> Am I on the wrong track here, or is this a reasonable path towards
>> getting a working vmware fencing agent for CentOS 5.5?
> You can wait for CentOS 5.6 or you can use directly our git tree
> (cluster.git with branches RHEL5x) or fence-agents.git to have latest
> upstream version.
> 
> m,
> 
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

Does this imply that using CentOS 5.5, one is unable to use fence_vmware
to fence an ESXi VM since fence_vmware is too old?

Mark.

- -- 
Mark Watts BSc RHCE
Senior Systems Engineer, MSS Secure Managed Hosting
www.QinetiQ.com
QinetiQ - Delivering customer-focused solutions
GPG Key: http://www.linux-corner.info/mwatts.gpg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAk1AVaoACgkQBn4EFUVUIO1n7wCg2WTLhQabSK3bBstrQPxPegOX
cY4AmwVQ+Bfzh6F+4hI6zMmouqTXN/45
=oYgO
-----END PGP SIGNATURE-----



From s.wendy.cheng at gmail.com  Wed Jan 26 17:39:29 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Wed, 26 Jan 2011 09:39:29 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <4D404D73.8080306@bobich.net>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
	<021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
	<AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>
	<1295949688.2598.19.camel@dolmen>
	<AANLkTinBazhxqPLuz--hNCt4ChMLN3z2SkL-CuK-6jSS@mail.gmail.com>
	<1296037167.2575.21.camel@dolmen> <4D4048D8.6070304@gmail.com>
	<4D404D73.8080306@bobich.net>
Message-ID: <AANLkTinOpx53=D-Dc_1d1FwSXv51Wtwsnw3ZrX7HKz3Q@mail.gmail.com>

On Wed, Jan 26, 2011 at 8:36 AM, Gordan Bobic <gordan at bobich.net> wrote:
> Wendy Cheng wrote:
>
>> GFS2 fragments very soon and very badly ! Its blocks are all over the
>> device due to the nature of how the resource group works. That slows down
>> *every* thing, particularly for backup applications. A production deployment
>> will encounter this issue very soon and they'll find the issue more than
>> annoying.
>
> While I don't disagree that it's a problem, most people will use GFS and
> similar FS-es on a SAN. A typical SAN will allocate sparse files for backing
> the block device. That means it'll be badly fragmented very quickly on the
> back end regardless of what the FS does.

I don't know how a "typical SAN" is defined ... but I could guess
which SAN box does this. Did you check their admin guide ? You may be
surprised by the turning knobs they offer.

-- Wendy

>
> Gordan
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From cos at aaaaa.org  Wed Jan 26 18:11:41 2011
From: cos at aaaaa.org (Ofer Inbar)
Date: Wed, 26 Jan 2011 13:11:41 -0500
Subject: [Linux-cluster] fence_vmware_ng fails: SHELL_TIMEOUT not defined
In-Reply-To: <4D405407.7000006@redhat.com>
References: <20110119232105.GY864@mip.aaaaa.org> <4D37F97D.1050904@redhat.com>
	<20110124180304.GV13584@mip.aaaaa.org>
	<4D405407.7000006@redhat.com>
Message-ID: <20110126181141.GP13584@mip.aaaaa.org>

Marek Grac <mgrac at redhat.com> wrote:
> They are set in the fencing library (lib/fencing.py) but various
> fence agents needs different values. Those constants are described
> on http://sources.redhat.com/cluster/wiki/FenceTiming

Problem is they're *not* set there, nor did I find any reference in
any documentation to the fact that they're supposed to be set there.
One thing I did when I first ran into this problem was:

# find / -type -f | xargs grep -l SHELL_TIMEOUT

... and similarly for POWER_TIMEOUT and LOGIN_TIMEOUT.

I found no useful hits, and none of these appear in
/usr/lib/fence/fencing.py.  That file is supplied by the cman
RPM, so I would've been reluctant to edit it anyway, because
I don't want to have to depend on running a locally patched cman,
and worry than any upgrade will break my patch.

However, fence_vmware_ng did not provide any guidance on setting
these, either.  So how is it ever supposed to work for anyone?

[As you can see from my last email on this thread, I gave up on
fence_vmware_ng when I found that fence_vmware works if I patch
the VI Perl Toolkit]
  -- Cos



From scooter at cgl.ucsf.edu  Wed Jan 26 18:35:38 2011
From: scooter at cgl.ucsf.edu (Scooter Morris)
Date: Wed, 26 Jan 2011 10:35:38 -0800
Subject: [Linux-cluster] Seeing lots of: [TOTEM] Retransmit List messages
Message-ID: <4D40697A.8080907@cgl.ucsf.edu>

We're seeing lots of these messages on a single node in our 3 node RHEL 
5.6 cluster (but never on the other nodes).  According to previous 
posts, this is indicative of lost packets and openais should recover 
just fine.  On the other hand, it's really disconcerting that we're 
losing packets (we've been fighting gfs2 glock issues also).  Is this a 
"normal" message, or should we really be concerned about losing 
packets?  We're not logging any errors on the private ethernet 
interfaces of any of the nodes, so I'm not really sure what might be 
going on.

-- scooter



From sdake at redhat.com  Wed Jan 26 20:49:24 2011
From: sdake at redhat.com (Steven Dake)
Date: Wed, 26 Jan 2011 13:49:24 -0700
Subject: [Linux-cluster] Seeing lots of: [TOTEM] Retransmit List messages
In-Reply-To: <4D40697A.8080907@cgl.ucsf.edu>
References: <4D40697A.8080907@cgl.ucsf.edu>
Message-ID: <4D4088D4.5020708@redhat.com>

On 01/26/2011 11:35 AM, Scooter Morris wrote:
> We're seeing lots of these messages on a single node in our 3 node RHEL
> 5.6 cluster (but never on the other nodes).  According to previous
> posts, this is indicative of lost packets and openais should recover
> just fine.  On the other hand, it's really disconcerting that we're
> losing packets (we've been fighting gfs2 glock issues also).  Is this a
> "normal" message, or should we really be concerned about losing
> packets?  We're not logging any errors on the private ethernet
> interfaces of any of the nodes, so I'm not really sure what might be
> going on.
> 

We implemented a workaround in 5.6 for a delayed multicast situation
that occurs on many switch vendor's hardware in openais-0.80.6-27.el5.
The workaround should solve the problem unless indeed there are lost
packets or there is a build error.  Which openais rpm are you using?

Regards
-steve

> -- scooter
> 
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From jfriesse at redhat.com  Thu Jan 27 07:47:10 2011
From: jfriesse at redhat.com (Jan Friesse)
Date: Thu, 27 Jan 2011 08:47:10 +0100
Subject: [Linux-cluster] fence_vmware_ng fails: SHELL_TIMEOUT not defined
In-Reply-To: <20110126181141.GP13584@mip.aaaaa.org>
References: <20110119232105.GY864@mip.aaaaa.org>
	<4D37F97D.1050904@redhat.com>	<20110124180304.GV13584@mip.aaaaa.org>	<4D405407.7000006@redhat.com>
	<20110126181141.GP13584@mip.aaaaa.org>
Message-ID: <4D4122FE.3010606@redhat.com>

Ofer,
you cannot find that values because upstream change patch
43872d098d9406fd09afd81e972188a728e1c7a3 removed them and replaced them 
by CLI options.

Take a look to http://sources.redhat.com/cluster/wiki/VMware_FencingConfig


2010-05-04: Fence_vmware in RHEL 5.5 is currently fence_vmware_ng (same 
syntax but named as fence_vmware) so fence_vmware_ng is ***no longer 
needed***

This *** no longer needed *** also means, you shouldn't use it because 
it's no longer developed by upstream. In other words, use fence_vmware.

It's very sad that vmware decides to change VI Perl API which causes 
problems.

Work *should* be in progress on creating vmware agent using directly 
vmware SOAP API, so even patching VI Perl shouldn't be needed in the 
near future.

Can you please send me which version of VI Perl Toolkit are you using?

Regards,
   Honza

Ofer Inbar napsal(a):
> Marek Grac <mgrac at redhat.com> wrote:
>> They are set in the fencing library (lib/fencing.py) but various
>> fence agents needs different values. Those constants are described
>> on http://sources.redhat.com/cluster/wiki/FenceTiming
> 
> Problem is they're *not* set there, nor did I find any reference in
> any documentation to the fact that they're supposed to be set there.
> One thing I did when I first ran into this problem was:
> 
> # find / -type -f | xargs grep -l SHELL_TIMEOUT
> 
> ... and similarly for POWER_TIMEOUT and LOGIN_TIMEOUT.
> 
> I found no useful hits, and none of these appear in
> /usr/lib/fence/fencing.py.  That file is supplied by the cman
> RPM, so I would've been reluctant to edit it anyway, because
> I don't want to have to depend on running a locally patched cman,
> and worry than any upgrade will break my patch.
> 
> However, fence_vmware_ng did not provide any guidance on setting
> these, either.  So how is it ever supposed to work for anyone?
> 
> [As you can see from my last email on this thread, I gave up on
> fence_vmware_ng when I found that fence_vmware works if I patch
> the VI Perl Toolkit]
>   -- Cos
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



From parvez.h.shaikh at gmail.com  Thu Jan 27 09:56:13 2011
From: parvez.h.shaikh at gmail.com (Parvez Shaikh)
Date: Thu, 27 Jan 2011 15:26:13 +0530
Subject: [Linux-cluster] Running cluster tools using non-root user
In-Reply-To: <AANLkTin390eSrG=RCuocD-0VjBOJQ9DXaSnms26N8QjS@mail.gmail.com>
References: <AANLkTinPmPZOe77J-1mxzFovL9zoVz1vGoGr2BcniVGK@mail.gmail.com>
	<AANLkTin390eSrG=RCuocD-0VjBOJQ9DXaSnms26N8QjS@mail.gmail.com>
Message-ID: <AANLkTikRoLgf-_A5PShjKvGzgzjm1JAA1XwE2_aKSC6A@mail.gmail.com>

I believe Pacemaker is not same as "RHCS" or do they share code?

If yes, in which version of RHCS would this feature would be available?

I require to enable service, disable service, and get status. I am using CLI
tools and any scripting trick can help me running clusvcadm and/or clustat.

su -c "clusvcadm...." require entering password, can this also be eliminated
using sudoers?

Thanks

On Wed, Jan 26, 2011 at 3:22 PM, Andrew Beekhof <andrew at beekhof.net> wrote:

> [Shameless plug]
>
> The next version of Pacemaker (1.1.6) will have this feature :-)
> The patches were merged form our devel branch about a week ago.
>
> [/Shameless plug]
>
> On Tue, Jan 25, 2011 at 10:39 AM, Parvez Shaikh
> <parvez.h.shaikh at gmail.com> wrote:
> > Hi all
> >
> > Is it possible to run cluster tools like clustat or clusvcadm etc. using
> > non-root user?
> >
> > If yes, to which groups this user should belong to? Otherwise can this be
> > done using sudo(and sudoers) file.
> >
> > As of now I get following error on clustat -
> >
> > Could not connect to CMAN: Permission denied
> >
> >
> > Thanks,
> > Parvez
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110127/9288ebc6/attachment.htm>

From edgar at edgar-matzinger.nl  Thu Jan 27 11:21:34 2011
From: edgar at edgar-matzinger.nl (Edgar Matzinger)
Date: Thu, 27 Jan 2011 12:21:34 +0100
Subject: [Linux-cluster] Running cluster tools using non-root user
In-Reply-To: <AANLkTikRoLgf-_A5PShjKvGzgzjm1JAA1XwE2_aKSC6A@mail.gmail.com>
References: <AANLkTinPmPZOe77J-1mxzFovL9zoVz1vGoGr2BcniVGK@mail.gmail.com>	<AANLkTin390eSrG=RCuocD-0VjBOJQ9DXaSnms26N8QjS@mail.gmail.com>
	<AANLkTikRoLgf-_A5PShjKvGzgzjm1JAA1XwE2_aKSC6A@mail.gmail.com>
Message-ID: <4D41553E.2020807@edgar-matzinger.nl>

Hi Parvez,
On 27/01/11 10:56, Parvez Shaikh wrote:

> 
> su -c "clusvcadm...." require entering password, can this also be
> eliminated using sudoers?

yes. Set up a list of commands (in sudoers) your users may execute and
assign that list to those users (again in sudoers). You can also create
a new unix group, add that group to the users in question and assign
that new group (again in sudoers) the command list created above. And as
always is Google your friend....

HTH, regards, Edgar.
-- 
Edgar Matzinger                               \\\|///
MailTo:edgar at edgar-matzinger.nl             \\  - -  //
                                             (  @ @  )
-------------------------------------------oOOo-(_)-oOOo------

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 553 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110127/4f079d2f/attachment.sig>

From andrew at beekhof.net  Thu Jan 27 11:35:12 2011
From: andrew at beekhof.net (Andrew Beekhof)
Date: Thu, 27 Jan 2011 12:35:12 +0100
Subject: [Linux-cluster] Running cluster tools using non-root user
In-Reply-To: <AANLkTikRoLgf-_A5PShjKvGzgzjm1JAA1XwE2_aKSC6A@mail.gmail.com>
References: <AANLkTinPmPZOe77J-1mxzFovL9zoVz1vGoGr2BcniVGK@mail.gmail.com>
	<AANLkTin390eSrG=RCuocD-0VjBOJQ9DXaSnms26N8QjS@mail.gmail.com>
	<AANLkTikRoLgf-_A5PShjKvGzgzjm1JAA1XwE2_aKSC6A@mail.gmail.com>
Message-ID: <AANLkTimjg0ADMaiT+V6A2=EVUnUciiQhmQkhnOg4o2N=@mail.gmail.com>

On Thu, Jan 27, 2011 at 10:56 AM, Parvez Shaikh
<parvez.h.shaikh at gmail.com> wrote:
> I believe Pacemaker is not same as "RHCS"

Correct. At least not yet anyway.
Thats why I called my reply a shameless plug since it was for a
competing project.

Pacemaker does ship in RHEL6 though.

> or do they share code?

A Pacemaker installation shares almost all the underlying
infrastructure of what you know as RHCS - it just replaces the
rgmanager part.

>
> If yes, in which version of RHCS would this feature would be available?

We can't comment on future releases sorry.

> I require to enable service, disable service, and get status. I am using CLI
> tools and any scripting trick can help me running clusvcadm and/or clustat.
>
> su -c "clusvcadm...." require entering password, can this also be eliminated
> using sudoers?
>
> Thanks
>
> On Wed, Jan 26, 2011 at 3:22 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
>>
>> [Shameless plug]
>>
>> The next version of Pacemaker (1.1.6) will have this feature :-)
>> The patches were merged form our devel branch about a week ago.
>>
>> [/Shameless plug]
>>
>> On Tue, Jan 25, 2011 at 10:39 AM, Parvez Shaikh
>> <parvez.h.shaikh at gmail.com> wrote:
>> > Hi all
>> >
>> > Is it possible to run cluster tools like clustat or clusvcadm etc. using
>> > non-root user?
>> >
>> > If yes, to which groups this user should belong to? Otherwise can this
>> > be
>> > done using sudo(and sudoers) file.
>> >
>> > As of now I get following error on clustat -
>> >
>> > Could not connect to CMAN: Permission denied
>> >
>> >
>> > Thanks,
>> > Parvez
>> >
>> > --
>> > Linux-cluster mailing list
>> > Linux-cluster at redhat.com
>> > https://www.redhat.com/mailman/listinfo/linux-cluster
>> >
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From s.wendy.cheng at gmail.com  Thu Jan 27 19:04:08 2011
From: s.wendy.cheng at gmail.com (Wendy Cheng)
Date: Thu, 27 Jan 2011 11:04:08 -0800
Subject: [Linux-cluster] gfs2 v. zfs?
In-Reply-To: <1296061198.2575.70.camel@dolmen>
References: <4D3DD02A.3000303@dbtgroup.com> <1295898535.4672.75.camel@dolmen>
	<AANLkTinVyGRi9g5spBSF0NVOJmDXsgpuPyJRkBv9jcnM@mail.gmail.com>
	<021EA18B28FE4C968F4538D7FB621E20@versa>
	<AANLkTinA0ao9tBAm1Uouffk2Aw8E8Rm+opBjWunLGvMT@mail.gmail.com>
	<64D0546C5EBBD147B75DE133D798665F06A12AF0@hugo.eprize.local>
	<036B68E61A28CA49AC2767596576CD596F5CFF829B@GVW1113EXC.americas.hpqcorp.net>
	<AANLkTi=DUc2ynf6MxVsikC3ONbqxUNxwPOift2e1sPsT@mail.gmail.com>
	<1295949688.2598.19.camel@dolmen>
	<AANLkTinBazhxqPLuz--hNCt4ChMLN3z2SkL-CuK-6jSS@mail.gmail.com>
	<1296037167.2575.21.camel@dolmen> <4D4048D8.6070304@gmail.com>
	<1296061198.2575.70.camel@dolmen>
Message-ID: <AANLkTi=7waJEKexctX=+tcDdkXROToFck4cfDd6cALhu@mail.gmail.com>

On Wed, Jan 26, 2011 at 8:59 AM, Steven Whitehouse <swhiteho at redhat.com> wrote:

> Nevertheless, I agree that it would be nice to be able to move the
> inodes around freely. I'm not sure that the cost of the required extra
> layer of indirection would be worth it though, in terms of the benefits
> gained.
>

If the cost is about possible performance hits ....say it is y%. Let's
take the difference between GFS2 (performance) numbers and other
filesystem's number that users love to compare .. assume it is x%.
Regardless GFS2 is better or worse, what really matters  .. is  ...
"does (x+y)% or (x-y)% make any difference ?" and "what will this y%
buy ?" . If I do a guess, I would say x is close to 20 and y is close
to 3. So does "23 vs 20" or "17 vs 20" make differences ?

On the other hand, what can this "y"  buy  ? ... an infrastructure to
shrink the filesystem (if users not on thin-provision SAN), better
backup strategy (snapshots have its catches), a straightforward
defragmentation tool, *AND* a possibility to group the scattered
inodes within a directory into a sensible (disk) layout such that ...
each time a directory read is issued (e.g. the "ls" cmd family), it
can give enough hints to the underline SAN to trigger its own
readahead engine. ... say you want to read inodes in a huge directory
but part of these inodes are out in other nodes with exclusive glocks
held. You can still read in the rest of these inodes and the reading
pattern may be good enough to trigger the readahead code within the
SAN. By the time these exclusive glocks start to sync their blocks,
these blocks are already in SAN's cache. Many rounds of disk reads
(from SAN point of view) can be avoided. At the same time, if these
to-be-write inodes are close to each other in a reasonable layout. it
helps SAN's writes as well.

Something to think ....

-- Wendy



From sunhux at gmail.com  Sun Jan 30 04:15:25 2011
From: sunhux at gmail.com (sunhux G)
Date: Sun, 30 Jan 2011 12:15:25 +0800
Subject: [Linux-cluster] Seeing lots of: [TOTEM] Retransmit List messages
In-Reply-To: <4D40697A.8080907@cgl.ucsf.edu>
References: <4D40697A.8080907@cgl.ucsf.edu>
Message-ID: <AANLkTinyc=Zv22-TDB-R8=6Qnvq_-+dPA6Pu_hDtCNo1@mail.gmail.com>

I'm trying to get yesterday's date in YYYYMMDD format.

Anyone know if RHES 4.x has GNU date that allows
yesterday=$(date -d "-1 day" '+%Y%m%d')



& if there's no GNU date, how do we go about achieving this?



Thanks

Sun
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110130/922637c2/attachment.htm>

From sunhux at gmail.com  Sun Jan 30 04:58:02 2011
From: sunhux at gmail.com (sunhux G)
Date: Sun, 30 Jan 2011 12:58:02 +0800
Subject: [Linux-cluster] unlocking locked accounts
Message-ID: <AANLkTik8waNkKkJMOGTskz9Yxg+ha4ArGPtOypxSe3cL@mail.gmail.com>

Besides  "/sbin/pam_tally --reset",
I thought I saw someone using another command to view
which accounts were locked & then another command to
unlock the accounts.

What are the commands for RHES 4.x & RHES 5.x ?


Btw, sorry for my earlier post where I requested for how
to obtain " yesterday's " date but forgot to change the
subject / title of my posting.


Thanks
Sun
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110130/a2db30aa/attachment.htm>

From hal at elizium.za.net  Sun Jan 30 05:13:58 2011
From: hal at elizium.za.net (Hugo Lombard)
Date: Sun, 30 Jan 2011 07:13:58 +0200
Subject: [Linux-cluster] GNU date on RHES 4.x
In-Reply-To: <AANLkTinyc=Zv22-TDB-R8=6Qnvq_-+dPA6Pu_hDtCNo1@mail.gmail.com>
References: <4D40697A.8080907@cgl.ucsf.edu>
	<AANLkTinyc=Zv22-TDB-R8=6Qnvq_-+dPA6Pu_hDtCNo1@mail.gmail.com>
Message-ID: <20110130051358.GJ16798@squishy.elizium.za.net>

On Sun, Jan 30, 2011 at 12:15:25PM +0800, sunhux G wrote:
>    I'm trying to get yesterday's date in YYYYMMDD format.
>    Anyone know if RHES 4.x has GNU date that allows
>    yesterday=$(date -d "-1 day" '+%Y%m%d')
> 

Does this help?

$ yesterday=$(date -d "-1 day" '+%Y%m%d')
$ echo $yesterday
20110129
$ date
Sun Jan 30 07:10:59 SAST 2011
$ cat /etc/issue
CentOS release 4.8 (Final)
Kernel \r on an \m

$ date --version
date (coreutils) 5.2.1
Written by David MacKenzie.

Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ which date
/bin/date
$ rpm -qf /bin/date
coreutils-5.2.1-36.el4.centos
$ 

-- 
Hugo Lombard



From omerfsen at gmail.com  Sun Jan 30 10:08:27 2011
From: omerfsen at gmail.com (Omer Faruk SEN)
Date: Sun, 30 Jan 2011 12:08:27 +0200
Subject: [Linux-cluster] Configuring a samba resource under RHCS
In-Reply-To: <AANLkTinjFJnE6d6RDpzDnKthBZcNUPQouV71G3ctyidz@mail.gmail.com>
References: <AANLkTinjFJnE6d6RDpzDnKthBZcNUPQouV71G3ctyidz@mail.gmail.com>
Message-ID: <AANLkTikwuvCo93Z94OXWRwNxcLi4_jmYtEY3hejH4wtu@mail.gmail.com>

maybe public readonly shares?

On Mon, Oct 18, 2010 at 2:46 PM, C. L. Martinez <carlopmart at gmail.com> wrote:
> Hi all,
>
> ?How can I configure different shared folders with samba under RHCS??
> Exists some resource agents?? I need to allow to access to sme Windows
> 7 And Windows 2008 R2 clients without AD authentication.
>
> Thanks.
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



From sdake at redhat.com  Mon Jan 31 21:54:00 2011
From: sdake at redhat.com (Steven Dake)
Date: Mon, 31 Jan 2011 14:54:00 -0700
Subject: [Linux-cluster] unlocking locked accounts
In-Reply-To: <AANLkTik8waNkKkJMOGTskz9Yxg+ha4ArGPtOypxSe3cL@mail.gmail.com>
References: <AANLkTik8waNkKkJMOGTskz9Yxg+ha4ArGPtOypxSe3cL@mail.gmail.com>
Message-ID: <4D472F78.7090601@redhat.com>

On 01/29/2011 09:58 PM, sunhux G wrote:
> 
> Besides  "/sbin/pam_tally --reset",
> I thought I saw someone using another command to view
> which accounts were locked & then another command to
> unlock the accounts.
> 
> What are the commands for RHES 4.x & RHES 5.x ?
> 
> 
> Btw, sorry for my earlier post where I requested for how
> to obtain " yesterday's " date but forgot to change the
> subject / title of my posting.
> 
> 
> Thanks
> Sun
> 
> 
wrong list

> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster