From fdinitto at redhat.com Sun Nov 1 15:34:29 2009 From: fdinitto at redhat.com (Fabio Massimo Di Nitto) Date: Sun, 01 Nov 2009 16:34:29 +0100 Subject: [Linux-cluster] mount.gfs2 hangs on cluster-3.0.3 In-Reply-To: <4AEBED1A.6040902@quah.ro> References: <4AEBED1A.6040902@quah.ro> Message-ID: <4AEDAA85.3040804@redhat.com> Dan Candea wrote: > hi all > > I really need some help. > > I have set up a cluster 3.0.3 with 2.6.31 kernel > All went well until I tried a gfs2 mount. The mount hangs without an error > gfs_control dump reports nothing: > > gfs_control dump > 1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 > /var/log/cluster/gfs_controld.log > 1256941054 gfs_controld 3.0.3 started > 1256941054 /cluster/gfs_controld/@plock_ownership is 1 > 1256941054 /cluster/gfs_controld/@plock_rate_limit is 0 > 1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 > /var/log/cluster/gfs_controld.log > 1256941054 group_mode 3 compat 0 > >
Can you please provide your cluster.conf and setup information please? If you add: [other bits of the config here] can you please tar /var/log/cluster/* and send it to us? Best would be to track everything in a bugzilla. Fabio
From dan at quah.ro Sun Nov 1 17:32:23 2009 From: dan at quah.ro (Dan Candea) Date: Sun, 01 Nov 2009 19:32:23 +0200 Subject: [Linux-cluster] mount.gfs2 hangs on cluster-3.0.3 In-Reply-To: <4AEDAA85.3040804@redhat.com> References: <4AEBED1A.6040902@quah.ro> <4AEDAA85.3040804@redhat.com> Message-ID: <4AEDC627.9020203@quah.ro> Fabio Massimo Di Nitto wrote: > Dan Candea wrote: >> hi all >> >> I really need some help. >> >> I have set up a cluster 3.0.3 with 2.6.31 kernel >> All went well until I tried a gfs2 mount. The mount hangs without an >> error >> gfs_control dump reports nothing: >> >> gfs_control dump >> 1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 >> /var/log/cluster/gfs_controld.log >> 1256941054 gfs_controld 3.0.3 started >> 1256941054 /cluster/gfs_controld/@plock_ownership is 1 >> 1256941054 /cluster/gfs_controld/@plock_rate_limit is 0 >> 1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 >> /var/log/cluster/gfs_controld.log >> 1256941054 group_mode 3 compat 0 >> >> > > Can you please provide your cluster.conf and setup information please? > > If you add: > > > > [other bits of the config here] > > > can you please tar /var/log/cluster/* and send it to us? > > Best would be to track everything in a bugzilla. > > Fabio > >
Hi. Thank you for your help. Attached is all the information requested. Should I open a bug? I assumed that I have an issue and not a bug in the software. -- Dan Cândea Does God Play Dice?
-------------- next part -------------- A non-text attachment was scrubbed... Name: cluster.tar Type: application/x-tar Size: 194560 bytes Desc: not available URL:
From agx at sigxcpu.org Sun Nov 1 17:42:18 2009 From: agx at sigxcpu.org (Guido =?iso-8859-1?Q?G=FCnther?=) Date: Sun, 1 Nov 2009 18:42:18 +0100 Subject: [Linux-cluster] ccs_config_validate in cluster 3.0.X In-Reply-To: <4AEBE4AE.1070502@redhat.com> References: <4AE81EAE.3040604@redhat.com> <20091030160119.GA21200@bogon.sigxcpu.org> <4AEBE4AE.1070502@redhat.com> Message-ID: <20091101174218.GA21399@bogon.sigxcpu.org> Hi Fabio, On Sat, Oct 31, 2009 at 08:18:06AM +0100, Fabio Massimo Di Nitto wrote: > Guido Günther wrote: > >On Wed, Oct 28, 2009 at 11:36:30AM +0100, Fabio M.
Di Nitto wrote: > >>Hi everybody, > >> > >>as briefly mentioned in 3.0.4 release note, a new system to validate the > >>configuration has been enabled in the code. > >> > >>What it does > >>------------ > >> > >>The general idea is to be able to perform as many sanity checks on the > >>configuration as possible. This check allows us to spot the most common > >>mistakes, such as typos or possibly invalid values, in cluster.conf. > >This is great. For what it's worth: I've pushed Cluster 3.0.4 into > >Debian experimental a couple of days ago. > >Cheers, > > -- Guido > > > > Hi Guido, > > thanks for pushing the packages to Debian. > > Please make sure to forward bugs related to this check so we can > address them quickly. Sure. Thanks. > Lon update the FAQ on our wiki to help debugging issues related to RelaxNG. > > It would be nice if you could do a package check around > (corosync/openais/cluster) and send us any local patch you have. I > have noticed at least corosync has one that is suitable for > upstream. Those were merged in corosync 1.1.0 so we're down to zero fortunately. > I didn?t have time to look at cluster. Only patch are the dlm headers so we can easily build against older kernel headers. If anything else pops up I'll forward it. Cheers, -- Guido From fdinitto at redhat.com Mon Nov 2 07:37:47 2009 From: fdinitto at redhat.com (Fabio Massimo Di Nitto) Date: Mon, 02 Nov 2009 08:37:47 +0100 Subject: [Linux-cluster] mount.gfs2 hangs on cluster-3.0.3 In-Reply-To: <4AEDC627.9020203@quah.ro> References: <4AEBED1A.6040902@quah.ro> <4AEDAA85.3040804@redhat.com> <4AEDC627.9020203@quah.ro> Message-ID: <4AEE8C4B.5020607@redhat.com> Dan Candea wrote: > Fabio Massimo Di Nitto wrote: >> Dan Candea wrote: >>> hi all >>> >>> I really need some help. >>> >>> I have set up a cluster 3.0.3 with 2.6.31 kernel >>> All went well until I tried a gfs2 mount. The mount hangs without an >>> error >>> gfs_control dump reports nothing: >>> >>> gfs_control dump >>> 1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 >>> /var/log/cluster/gfs_controld.log >>> 1256941054 gfs_controld 3.0.3 started >>> 1256941054 /cluster/gfs_controld/@plock_ownership is 1 >>> 1256941054 /cluster/gfs_controld/@plock_rate_limit is 0 >>> 1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 >>> /var/log/cluster/gfs_controld.log >>> 1256941054 group_mode 3 compat 0 >>> >>> >> >> Can you please provide your cluster.conf and setup information please? >> >> If you add: >> >> >> >> [other bits of the config here] >> >> >> can you please tar /var/log/cluster/* and send it to us? >> >> Best would be to track everything in a bugzilla. >> >> Fabio >> >> > > Hi. thank you for your help. > > attached are all the informations requested. > should I open a bug? I assumed that I have a issue and not a bug in the > software > It would be better if you could file a bug. Can you also attach the logs from all the other nodes? We need to get the picture of the whole cluster to understand what is going on. 
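For reference, a minimal way to gather those logs on each node could look like the following (the archive name is only an example, not a tool shipped with the cluster suite):

  # run on every node, then attach the resulting tarballs to the bug
  tar czf /tmp/cluster-logs-$(hostname -s).tar.gz /var/log/cluster/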
Fabio From dan at quah.ro Mon Nov 2 09:59:27 2009 From: dan at quah.ro (Dan Candea) Date: Mon, 2 Nov 2009 11:59:27 +0200 Subject: [Linux-cluster] mount.gfs2 hangs on cluster-3.0.3 In-Reply-To: <4AEE8C4B.5020607@redhat.com> References: <4AEBED1A.6040902@quah.ro> <4AEDC627.9020203@quah.ro> <4AEE8C4B.5020607@redhat.com> Message-ID: <200911021159.27387.dan@quah.ro> On Monday 02 November 2009 09:37:47 Fabio Massimo Di Nitto wrote: > Dan Candea wrote: > > Fabio Massimo Di Nitto wrote: > >> Dan Candea wrote: > >>> hi all > >>> > >>> I really need some help. > >>> > >>> I have set up a cluster 3.0.3 with 2.6.31 kernel > >>> All went well until I tried a gfs2 mount. The mount hangs without an > >>> error > >>> gfs_control dump reports nothing: > >>> > >>> gfs_control dump > >>> 1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 > >>> /var/log/cluster/gfs_controld.log > >>> 1256941054 gfs_controld 3.0.3 started > >>> 1256941054 /cluster/gfs_controld/@plock_ownership is 1 > >>> 1256941054 /cluster/gfs_controld/@plock_rate_limit is 0 > >>> 1256941054 logging mode 3 syslog f 160 p 6 logfile p 6 > >>> /var/log/cluster/gfs_controld.log > >>> 1256941054 group_mode 3 compat 0 > >> > >> Can you please provide your cluster.conf and setup information please? > >> > >> If you add: > >> > >> > >> > >> [other bits of the config here] > >> > >> > >> can you please tar /var/log/cluster/* and send it to us? > >> > >> Best would be to track everything in a bugzilla. > >> > >> Fabio > > > > Hi. thank you for your help. > > > > attached are all the informations requested. > > should I open a bug? I assumed that I have a issue and not a bug in the > > software > > It would be better if you could file a bug. Can you also attach the logs > from all the other nodes? > > We need to get the picture of the whole cluster to understand what is > going on. > > Fabio > Hi I've created the bug report, https://bugzilla.redhat.com/show_bug.cgi?id=532426 there are three archives, one for each node with logs and debug info -- Dan C?ndea Does God Play Dice? From swhiteho at redhat.com Mon Nov 2 11:42:49 2009 From: swhiteho at redhat.com (Steven Whitehouse) Date: Mon, 02 Nov 2009 11:42:49 +0000 Subject: [Linux-cluster] GFS2 processes getting stuck in WCHAN=dlm_posix_lock In-Reply-To: <4AEB765B.3010408@isye.gatech.edu> References: <4AEB765B.3010408@isye.gatech.edu> Message-ID: <1257162169.6052.746.camel@localhost.localdomain> Hi, On Fri, 2009-10-30 at 19:27 -0400, Allen Belletti wrote: > Hi All, > > As I've mentioned before, I'm running a two-node clustered mail server > on GFS2 (with RHEL 5.4) Nearly all of the time, everything works > great. However, going all the way back to GFS1 on RHEL 5.1 (I think it > was), I've had occasional locking problems that force a reboot of one or > both cluster nodes. Lately I've paid closer attention since it's been > happening more often. > > I'll notice the problem when the load average starts rising. It's > always tied to "stuck" processes, and I believe always tied to IMAP > clients (I'm running Dovecot.) It seems like a file belonging to user > "x" (in this case, "jforrest" will become locked in some way, such that > every IMAP process tied that user will get stuck on the same thing. > Over time, as the user keeps trying to read that file, more & more > processes accumulate. They're always in state "D" (uninterruptible > sleep), and always on "dlm_posix_lock" according to WCHAN. The only way > I'm able to get out of this state is to reboot. 
If I let it persist for > too long, I/O generally stops entirely. > > This certainly seems like it ought to have a definite solution, but I've > no idea what it is. I've tried a variety of things using "find" to > pinpoint a particular file, but everything belonging to the affected > user seems just fine. At least, I can read and copy all of the files, > and do a stat via ls -l. > > Is it possible that this is a bug, not within GFS at all, but within > Dovecot IMAP? > > Any thoughts would be appreciated. It's been getting worse lately and > thus no fun at all. > > Cheers, > Allen > Do you know if dovecot IMAP uses signals at all? That would be the first thing that I'd look at. The other thing to check is whether it makes use of F_GETLK and in particular the l_pid field? strace should be able to answer both of those questions (except the l_pid field of course, but the chances are it it calls F_GETLK and then sends a signal, its also using the l_pid field), Steve. From gianluca.cecchi at gmail.com Mon Nov 2 14:09:26 2009 From: gianluca.cecchi at gmail.com (Gianluca Cecchi) Date: Mon, 2 Nov 2009 15:09:26 +0100 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 Message-ID: <561c252c0911020609p151c1691t63d5fc079f5765d2@mail.gmail.com> Hello, sorry for the long e-mail in advance. trying to do on a test environment what in subject and I think it could be useful for others too, both in RH EL and in CentOS. I have configured two ip+fs services and HA-LVM Starting point is CentOS 5.3 updated at these components: cman-2.0.98-1.el5_3.1 openais-0.80.3-22.el5_3.4 rgmanager-2.0.46-1.el5.centos.3 luci-0.12.1-7.3.el5.centos.1 ricci-0.12.1-7.3.el5.centos.1 lvm2-2.02.40-6.el5 device-mapper-multipath-0.4.7-23.el5_3.4 Target would be: cman-2.0.115-1.el5_4.3 openais-0.80.6-8.el5_4.1 rgmanager-2.0.52-1.el5.centos.2 luci-0.12.2-6.el5.centos ricci-0.12.2-6.el5.centos lvm2-2.02.46-8.el5_4.1 device-mapper-multipath-0.4.7-30.el5_4.2 they are guests in Qemu-KVM environment and I have a backup of the starting situation, so that I can reply and change eventually order of operations. node1 is mork, node2 is mindy Attempt of approach: - services are on node2 (mindy) - shutdown ad restart node1 in single user mode - activate network and update node1 with: yum clean all yum update glibc\* yum update yum\* rpm\* python\* yum clean all yum update shutdown -r now and start in single user mode to check correct start and so on - init 3 for node1 and join to cluster QUESTION1: are there any incompatibilities in this first join of the cluster, based on the different components' versions? Would it be better in your opinion to make a shutdown of node2 and then have node1 start alone and take the services and then upgrade node2 and have the first contemporary two-nodes join with aligned versions of clusterware software? Now, following my approach, after the init 3 on node1 all was ok with cluster join, but I forgot to do a touch of the initrd file of the updated kernel, due to de-optimized check in HA-LVM service comparing timestamp of initrd of running kernel and lvm.conf So clurgmgrd complains having -rw-r--r-- 1 root root 16433 Nov 2 12:28 /etc/lvm/lvm.conf newer than initrd that is dated end of September..... 
(see below) Nov 2 12:41:00 mork kernel: DLM (built Sep 30 2009 12:53:28) installed Nov 2 12:41:00 mork kernel: GFS2 (built Sep 30 2009 12:54:10) installed Nov 2 12:41:00 mork kernel: Lock_DLM (built Sep 30 2009 12:54:16) installed Nov 2 12:41:00 mork ccsd[2290]: Starting ccsd 2.0.115: Nov 2 12:41:00 mork ccsd[2290]: Built: Oct 26 2009 22:01:34 Nov 2 12:41:00 mork ccsd[2290]: Copyright (C) Red Hat, Inc. 2004 All rights reserved. Nov 2 12:41:00 mork ccsd[2290]: cluster.conf (cluster name = clumm, version = 5) found. Nov 2 12:41:00 mork ccsd[2290]: Remote copy of cluster.conf is from quorate node. Nov 2 12:41:00 mork ccsd[2290]: Local version # : 5 Nov 2 12:41:00 mork ccsd[2290]: Remote version #: 5 Nov 2 12:41:00 mork ccsd[2290]: Remote copy of cluster.conf is from quorate node. Nov 2 12:41:00 mork ccsd[2290]: Local version # : 5 Nov 2 12:41:00 mork ccsd[2290]: Remote version #: 5 Nov 2 12:41:00 mork ccsd[2290]: Remote copy of cluster.conf is from quorate node. Nov 2 12:41:00 mork ccsd[2290]: Local version # : 5 Nov 2 12:41:00 mork ccsd[2290]: Remote version #: 5 Nov 2 12:41:00 mork ccsd[2290]: Remote copy of cluster.conf is from quorate node. Nov 2 12:41:00 mork ccsd[2290]: Local version # : 5 Nov 2 12:41:00 mork ccsd[2290]: Remote version #: 5 Nov 2 12:41:00 mork openais[2302]: [MAIN ] AIS Executive Service RELEASE 'subrev 1887 version 0.80.6' Nov 2 12:41:00 mork openais[2302]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors. Nov 2 12:41:00 mork openais[2302]: [MAIN ] Copyright (C) 2006 Red Hat, Inc. Nov 2 12:41:00 mork openais[2302]: [MAIN ] AIS Executive Service: started and ready to provide service. Nov 2 12:41:00 mork openais[2302]: [MAIN ] Using default multicast address of 239.192.12.183 Nov 2 12:41:00 mork openais[2302]: [TOTEM] Token Timeout (162000 ms) retransmit timeout (8019 ms) Nov 2 12:41:00 mork openais[2302]: [TOTEM] token hold (6405 ms) retransmits before loss (20 retrans) Nov 2 12:41:00 mork openais[2302]: [TOTEM] join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms) Nov 2 12:41:00 mork openais[2302]: [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs) Nov 2 12:41:00 mork openais[2302]: [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1500 s) Nov 2 12:41:00 mork openais[2302]: [TOTEM] send threads (0 threads) Nov 2 12:41:00 mork openais[2302]: [TOTEM] RRP token expired timeout (8019 ms) Nov 2 12:41:00 mork openais[2302]: [TOTEM] RRP token problem counter (2000 ms) Nov 2 12:41:00 mork openais[2302]: [TOTEM] RRP threshold (10 problem count) Nov 2 12:41:00 mork openais[2302]: [TOTEM] RRP mode set to none. Nov 2 12:41:00 mork openais[2302]: [TOTEM] heartbeat_failures_allowed (0) Nov 2 12:41:00 mork openais[2302]: [TOTEM] max_network_delay (50 ms) Nov 2 12:41:00 mork openais[2302]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0 Nov 2 12:41:00 mork openais[2302]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes). Nov 2 12:41:00 mork openais[2302]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes). Nov 2 12:41:00 mork openais[2302]: [TOTEM] The network interface [172.16.0.11] is now up. Nov 2 12:41:00 mork openais[2302]: [TOTEM] Created or loaded sequence id 336.172.16.0.11 for this ring. Nov 2 12:41:00 mork openais[2302]: [TOTEM] entering GATHER state from 15. 
Nov 2 12:41:00 mork openais[2302]: [CMAN ] CMAN 2.0.115 (built Oct 26 2009 22:01:42) started Nov 2 12:41:00 mork openais[2302]: [MAIN ] Service initialized 'openais CMAN membership service 2.01' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais extended virtual synchrony service' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais cluster membership service B.01.01' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais availability management framework B.01.01' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais checkpoint service B.01.01' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais event service B.01.01' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais distributed locking service B.01.01' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais message service B.01.01' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais configuration service' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais cluster closed process group service v1.01' Nov 2 12:41:00 mork openais[2302]: [SERV ] Service initialized 'openais cluster config database access v1.01' Nov 2 12:41:00 mork openais[2302]: [SYNC ] Not using a virtual synchrony filter. Nov 2 12:41:00 mork openais[2302]: [TOTEM] Creating commit token because I am the rep. Nov 2 12:41:00 mork openais[2302]: [TOTEM] Saving state aru 0 high seq received 0 Nov 2 12:41:00 mork openais[2302]: [TOTEM] Storing new sequence id for ring 154 Nov 2 12:41:00 mork openais[2302]: [TOTEM] entering COMMIT state. Nov 2 12:41:00 mork openais[2302]: [TOTEM] entering RECOVERY state. Nov 2 12:41:00 mork openais[2302]: [TOTEM] position [0] member 172.16.0.11: Nov 2 12:41:00 mork openais[2302]: [TOTEM] previous ring seq 336 rep 172.16.0.11 Nov 2 12:41:00 mork openais[2302]: [TOTEM] aru 0 high delivered 0 received flag 1 Nov 2 12:41:00 mork openais[2302]: [TOTEM] Did not need to originate any messages in recovery. Nov 2 12:41:00 mork openais[2302]: [TOTEM] Sending initial ORF token Nov 2 12:41:00 mork openais[2302]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:41:00 mork openais[2302]: [CLM ] New Configuration: Nov 2 12:41:00 mork openais[2302]: [CLM ] Members Left: Nov 2 12:41:00 mork openais[2302]: [CLM ] Members Joined: Nov 2 12:41:00 mork openais[2302]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:41:00 mork openais[2302]: [CLM ] New Configuration: Nov 2 12:41:00 mork openais[2302]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:41:00 mork openais[2302]: [CLM ] Members Left: Nov 2 12:41:00 mork openais[2302]: [CLM ] Members Joined: Nov 2 12:41:00 mork openais[2302]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:41:00 mork openais[2302]: [SYNC ] This node is within the primary component and will provide service. Nov 2 12:41:00 mork openais[2302]: [TOTEM] entering OPERATIONAL state. Nov 2 12:41:00 mork openais[2302]: [CLM ] got nodejoin message 172.16.0.11 Nov 2 12:41:00 mork openais[2302]: [TOTEM] entering GATHER state from 11. Nov 2 12:41:00 mork openais[2302]: [TOTEM] Creating commit token because I am the rep. Nov 2 12:41:00 mork openais[2302]: [TOTEM] Saving state aru a high seq received a Nov 2 12:41:00 mork openais[2302]: [TOTEM] Storing new sequence id for ring 158 Nov 2 12:41:00 mork openais[2302]: [TOTEM] entering COMMIT state. Nov 2 12:41:00 mork openais[2302]: [TOTEM] entering RECOVERY state. 
Nov 2 12:41:00 mork openais[2302]: [TOTEM] position [0] member 172.16.0.11: Nov 2 12:41:00 mork openais[2302]: [TOTEM] previous ring seq 340 rep 172.16.0.11 Nov 2 12:41:00 mork openais[2302]: [TOTEM] aru a high delivered a received flag 1 Nov 2 12:41:00 mork openais[2302]: [TOTEM] position [1] member 172.16.0.12: Nov 2 12:41:00 mork openais[2302]: [TOTEM] previous ring seq 340 rep 172.16.0.12 Nov 2 12:41:00 mork openais[2302]: [TOTEM] aru d high delivered d received flag 1 Nov 2 12:41:00 mork openais[2302]: [TOTEM] Did not need to originate any messages in recovery. Nov 2 12:41:00 mork openais[2302]: [TOTEM] Sending initial ORF token Nov 2 12:41:00 mork openais[2302]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:41:00 mork openais[2302]: [CLM ] New Configuration: Nov 2 12:41:00 mork openais[2302]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:41:00 mork openais[2302]: [CLM ] Members Left: Nov 2 12:41:00 mork openais[2302]: [CLM ] Members Joined: Nov 2 12:41:00 mork openais[2302]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:41:00 mork openais[2302]: [CLM ] New Configuration: Nov 2 12:41:00 mork openais[2302]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:41:00 mork openais[2302]: [CLM ] r(0) ip(172.16.0.12) Nov 2 12:41:00 mork openais[2302]: [CLM ] Members Left: Nov 2 12:41:00 mork openais[2302]: [CLM ] Members Joined: Nov 2 12:41:00 mork openais[2302]: [CLM ] r(0) ip(172.16.0.12) Nov 2 12:41:00 mork openais[2302]: [SYNC ] This node is within the primary component and will provide service. Nov 2 12:41:00 mork openais[2302]: [TOTEM] entering OPERATIONAL state. Nov 2 12:41:00 mork openais[2302]: [CMAN ] quorum regained, resuming activity Nov 2 12:41:00 mork openais[2302]: [CLM ] got nodejoin message 172.16.0.11 Nov 2 12:41:00 mork openais[2302]: [CLM ] got nodejoin message 172.16.0.12 Nov 2 12:41:00 mork openais[2302]: [CPG ] got joinlist message from node 2 Nov 2 12:41:01 mork ccsd[2290]: Initial status:: Quorate uorum Nov 2 12:41:01 mork qdiskd[2331]: Quorum Daemon Initializing Nov 2 12:41:02 mork qdiskd[2331]: Heuristic: 'ping -c1 -w1 192.168.122.1' UP Nov 2 12:41:12 mork modclusterd: startup succeeded Nov 2 12:41:12 mork kernel: dlm: Using TCP for communications Nov 2 12:41:12 mork kernel: dlm: connecting to 2 Nov 2 12:41:12 mork kernel: dlm: got connection from 2 Nov 2 12:41:12 mork clurgmgrd[2886]: Resource Group Manager Starting Nov 2 12:41:13 mork oddjobd: oddjobd startup succeeded Nov 2 12:41:13 mork saslauthd[3338]: detach_tty : master pid is: 3338 Nov 2 12:41:13 mork saslauthd[3338]: ipc_init : listening on socket: /var/run/saslauthd/mux Nov 2 12:41:14 mork ricci: startup succeeded Nov 2 12:41:14 mork clurgmgrd: [2886]: HA LVM: Improper setup detected Nov 2 12:41:14 mork clurgmgrd: [2886]: HA LVM: Improper setup detected Nov 2 12:41:14 mork clurgmgrd: [2886]: - initrd image needs to be newer than lvm.conf Nov 2 12:41:14 mork clurgmgrd: [2886]: - initrd image needs to be newer than lvm.conf Nov 2 12:41:14 mork clurgmgrd: [2886]: WARNING: An improper setup can cause data corruption! Nov 2 12:41:14 mork clurgmgrd: [2886]: WARNING: An improper setup can cause data corruption! 
Nov 2 12:41:14 mork clurgmgrd: [2886]: node2 owns vg_cl1/lv_cl1 unable to stop Nov 2 12:41:14 mork clurgmgrd: [2886]: node2 owns vg_cl2/lv_cl2 unable to stop Nov 2 12:41:14 mork clurgmgrd[2886]: stop on lvm "CL2" returned 1 (generic error) Nov 2 12:41:14 mork clurgmgrd[2886]: stop on lvm "CL1" returned 1 (generic error) Nov 2 12:41:31 mork qdiskd[2331]: Node 2 is the master Nov 2 12:42:21 mork qdiskd[2331]: Initial score 1/1 Nov 2 12:42:21 mork qdiskd[2331]: Initialization complete Nov 2 12:42:21 mork openais[2302]: [CMAN ] quorum device registered Nov 2 12:42:21 mork qdiskd[2331]: Score sufficient for master operation (1/1; required=1); upgrading Note that a clustat of both nodes gives correct results (in the sens of nodes taking part in the cluster and rgmanager active on both and quorum disk). At this point, after touching initrd file, I think to do a shutdown -r of mork again and see if all goes well. It seems so, as I get again: ... Nov 2 12:46:23 mork openais[2278]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:46:23 mork openais[2278]: [CLM ] New Configuration: Nov 2 12:46:23 mork openais[2278]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:46:23 mork openais[2278]: [CLM ] r(0) ip(172.16.0.12) Nov 2 12:46:23 mork openais[2278]: [CLM ] Members Left: Nov 2 12:46:23 mork openais[2278]: [CLM ] Members Joined: Nov 2 12:46:23 mork openais[2278]: [CLM ] r(0) ip(172.16.0.12) Nov 2 12:46:23 mork openais[2278]: [SYNC ] This node is within the primary component and will provide service. Nov 2 12:46:23 mork openais[2278]: [TOTEM] entering OPERATIONAL state. Nov 2 12:46:23 mork openais[2278]: [CMAN ] quorum regained, resuming activity Nov 2 12:46:23 mork openais[2278]: [CLM ] got nodejoin message 172.16.0.11 Nov 2 12:46:23 mork openais[2278]: [CLM ] got nodejoin message 172.16.0.12 Nov 2 12:46:23 mork openais[2278]: [CPG ] got joinlist message from node 2 Nov 2 12:46:24 mork ccsd[2267]: Initial status:: Quorate uorum Nov 2 12:46:25 mork qdiskd[2310]: Quorum Daemon Initializing Nov 2 12:46:26 mork qdiskd[2310]: Heuristic: 'ping -c1 -w1 192.168.122.1' UP ... Nov 2 12:46:35 mork modclusterd: startup succeeded Nov 2 12:46:35 mork kernel: dlm: Using TCP for communications Nov 2 12:46:35 mork kernel: dlm: connecting to 2 Nov 2 12:46:36 mork oddjobd: oddjobd startup succeeded Nov 2 12:46:36 mork saslauthd[2990]: detach_tty : master pid is: 2990 Nov 2 12:46:36 mork saslauthd[2990]: ipc_init : listening on socket: /var/run/saslauthd/mux Nov 2 12:46:36 mork ricci: startup succeeded Nov 2 12:46:55 mork qdiskd[2310]: Node 2 is the master Nov 2 12:47:45 mork qdiskd[2310]: Initial score 1/1 Nov 2 12:47:45 mork qdiskd[2310]: Initialization complete Nov 2 12:47:45 mork openais[2278]: [CMAN ] quorum device registered Nov 2 12:47:45 mork qdiskd[2310]: Score sufficient for master operation (1/1; required=1); upgrading but instead, on mindy I get this error and the node goes out of memory and I have to power off it.... Nov 2 12:47:54 mindy kernel: dlm: connect from non cluster node Donna if the problem with cluster is the cause or the effect of the problem.... In particular, these are messages on mindy, during the first join of the cluster and the reboot of mork: Nov 2 12:42:20 mindy openais[2465]: [TOTEM] entering GATHER state from 11. Nov 2 12:42:20 mindy openais[2465]: [TOTEM] Saving state aru d high seq received d Nov 2 12:42:20 mindy openais[2465]: [TOTEM] Storing new sequence id for ring 158 Nov 2 12:42:20 mindy openais[2465]: [TOTEM] entering COMMIT state. 
Nov 2 12:42:20 mindy openais[2465]: [TOTEM] entering RECOVERY state. Nov 2 12:42:20 mindy openais[2465]: [TOTEM] position [0] member 172.16.0.11: Nov 2 12:42:20 mindy openais[2465]: [TOTEM] previous ring seq 340 rep 172.16.0.11 Nov 2 12:42:20 mindy openais[2465]: [TOTEM] aru a high delivered a received flag 1 Nov 2 12:42:20 mindy openais[2465]: [TOTEM] position [1] member 172.16.0.12: Nov 2 12:42:20 mindy openais[2465]: [TOTEM] previous ring seq 340 rep 172.16.0.12 Nov 2 12:42:20 mindy openais[2465]: [TOTEM] aru d high delivered d received flag 1 Nov 2 12:42:20 mindy openais[2465]: [TOTEM] Did not need to originate any messages in recovery. Nov 2 12:42:20 mindy openais[2465]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:42:20 mindy openais[2465]: [CLM ] New Configuration: Nov 2 12:42:20 mindy openais[2465]: [CLM ] r(0) ip(172.16.0.12) Nov 2 12:42:20 mindy openais[2465]: [CLM ] Members Left: Nov 2 12:42:20 mindy openais[2465]: [CLM ] Members Joined: Nov 2 12:42:20 mindy openais[2465]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:42:20 mindy openais[2465]: [CLM ] New Configuration: Nov 2 12:42:20 mindy openais[2465]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:42:20 mindy openais[2465]: [CLM ] r(0) ip(172.16.0.12) Nov 2 12:42:20 mindy openais[2465]: [CLM ] Members Left: Nov 2 12:42:20 mindy openais[2465]: [CLM ] Members Joined: Nov 2 12:42:20 mindy openais[2465]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:42:20 mindy openais[2465]: [SYNC ] This node is within the primary component and will provide service. Nov 2 12:42:20 mindy openais[2465]: [TOTEM] entering OPERATIONAL state. Nov 2 12:42:20 mindy openais[2465]: [CLM ] got nodejoin message 172.16.0.11 Nov 2 12:42:20 mindy openais[2465]: [CLM ] got nodejoin message 172.16.0.12 Nov 2 12:42:20 mindy openais[2465]: [CPG ] got joinlist message from node 2 Nov 2 12:42:32 mindy kernel: dlm: connecting to 1 Nov 2 12:42:32 mindy kernel: dlm: got connection from 1 Nov 2 12:46:16 mindy clurgmgrd[3101]: Member 1 shutting down Nov 2 12:46:26 mindy qdiskd[2508]: Node 1 shutdown Nov 2 12:47:43 mindy openais[2465]: [TOTEM] entering GATHER state from 12. Nov 2 12:47:43 mindy openais[2465]: [TOTEM] Saving state aru 3e high seq received 3e Nov 2 12:47:43 mindy openais[2465]: [TOTEM] Storing new sequence id for ring 160 Nov 2 12:47:43 mindy openais[2465]: [TOTEM] entering COMMIT state. Nov 2 12:47:43 mindy openais[2465]: [TOTEM] entering RECOVERY state. Nov 2 12:47:43 mindy openais[2465]: [TOTEM] position [0] member 172.16.0.11: Nov 2 12:47:43 mindy openais[2465]: [TOTEM] previous ring seq 348 rep 172.16.0.11 Nov 2 12:47:43 mindy openais[2465]: [TOTEM] aru a high delivered a received flag 1 Nov 2 12:47:43 mindy openais[2465]: [TOTEM] position [1] member 172.16.0.12: Nov 2 12:47:43 mindy openais[2465]: [TOTEM] previous ring seq 344 rep 172.16.0.11 Nov 2 12:47:43 mindy openais[2465]: [TOTEM] aru 3e high delivered 3e received flag 1 Nov 2 12:47:43 mindy openais[2465]: [TOTEM] Did not need to originate any messages in recovery. 
Nov 2 12:47:43 mindy openais[2465]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:47:43 mindy openais[2465]: [CLM ] New Configuration: Nov 2 12:47:43 mindy kernel: dlm: closing connection to node 1 Nov 2 12:47:43 mindy openais[2465]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:47:43 mindy openais[2465]: [CLM ] r(0) ip(172.16.0.12) Nov 2 12:47:43 mindy openais[2465]: [CLM ] Members Left: Nov 2 12:47:43 mindy openais[2465]: [CLM ] Members Joined: Nov 2 12:47:43 mindy openais[2465]: [CLM ] CLM CONFIGURATION CHANGE Nov 2 12:47:43 mindy openais[2465]: [CLM ] New Configuration: Nov 2 12:47:43 mindy openais[2465]: [CLM ] r(0) ip(172.16.0.11) Nov 2 12:47:43 mindy openais[2465]: [CLM ] r(0) ip(172.16.0.12) Nov 2 12:47:43 mindy openais[2465]: [CLM ] Members Left: Nov 2 12:47:43 mindy openais[2465]: [CLM ] Members Joined: Nov 2 12:47:43 mindy openais[2465]: [SYNC ] This node is within the primary component and will provide service. Nov 2 12:47:43 mindy openais[2465]: [TOTEM] entering OPERATIONAL state. Nov 2 12:47:43 mindy openais[2465]: [CLM ] got nodejoin message 172.16.0.11 Nov 2 12:47:43 mindy openais[2465]: [CLM ] got nodejoin message 172.16.0.12 Nov 2 12:47:43 mindy openais[2465]: [CPG ] got joinlist message from node 2 Nov 2 12:47:54 mindy kernel: dlm: connect from non cluster node Nov 2 12:59:48 mindy kernel: dlm_send invoked oom-killer: gfp_mask=0xd0, order=1, oomkilladj=0 Nov 2 12:59:48 mindy kernel: Nov 2 12:59:48 mindy kernel: Call Trace: Nov 2 12:59:48 mindy kernel: [] out_of_memory+0x8e/0x2f5 Nov 2 12:59:48 mindy kernel: [] autoremove_wake_function+0x0/0x2e Nov 2 12:59:48 mindy kernel: [] __alloc_pages+0x245/0x2ce Nov 2 12:59:48 mindy kernel: [] __alloc_pages+0x65/0x2ce Nov 2 12:59:48 mindy kernel: [] cache_grow+0x137/0x395 Nov 2 12:59:48 mindy kernel: [] cache_alloc_refill+0x136/0x186 Nov 2 12:59:48 mindy kernel: [] kmem_cache_alloc+0x6c/0x76 Nov 2 12:59:48 mindy kernel: [] sk_alloc+0x2e/0xf3 Nov 2 12:59:48 mindy kernel: [] inet_create+0x137/0x267 Nov 2 12:59:49 mindy kernel: [] __sock_create+0x170/0x27c Nov 2 12:59:49 mindy kernel: [] :dlm:process_send_sockets+0x0/0x179 Nov 2 12:59:49 mindy kernel: [] :dlm:tcp_connect_to_sock+0x70/0x1de Nov 2 12:59:49 mindy kernel: [] thread_return+0x62/0xfe Nov 2 12:59:49 mindy kernel: [] :dlm:process_send_sockets+0x20/0x179 Nov 2 12:59:49 mindy kernel: [] :dlm:process_send_sockets+0x0/0x179 Nov 2 12:59:49 mindy kernel: [] run_workqueue+0x94/0xe4 Nov 2 12:59:49 mindy kernel: [] worker_thread+0x0/0x122 Nov 2 12:59:49 mindy kernel: [] keventd_create_kthread+0x0/0xc4 Nov 2 12:59:49 mindy kernel: [] worker_thread+0xf0/0x122 Nov 2 12:59:49 mindy kernel: [] default_wake_function+0x0/0xe Nov 2 12:59:49 mindy kernel: [] keventd_create_kthread+0x0/0xc4 Nov 2 12:59:49 mindy kernel: [] keventd_create_kthread+0x0/0xc4 Nov 2 12:59:49 mindy kernel: [] kthread+0xfe/0x132 Nov 2 12:59:49 mindy kernel: [] child_rip+0xa/0x11 Nov 2 12:59:49 mindy kernel: [] keventd_create_kthread+0x0/0xc4 Nov 2 12:59:49 mindy kernel: [] :ext3:ext3_journal_dirty_data+0x0/0x34 Nov 2 12:59:49 mindy kernel: [] kthread+0x0/0x132 Nov 2 12:59:49 mindy kernel: [] child_rip+0x0/0x11 Nov 2 12:59:49 mindy kernel: Both nodes are Qemu-KVM x86_64 guests, each one assigned 1Gb of ram and 2 cpus I can send copy of cluster.conf eventually Thanks in advance for your comments. Gianluca -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alain.richard at equation.fr Mon Nov 2 15:44:26 2009 From: alain.richard at equation.fr (Alain RICHARD) Date: Mon, 2 Nov 2009 16:44:26 +0100 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 Message-ID:
I have had a look at the current vm.sh script and have found out: a) use_virsh = 1 by default; b) if your resource has a path attribute, vm.sh automatically reverts to use_virsh=0, even if you hard-code use_virsh=1; c) there is no option to indicate the xml file that virsh uses to create the vm. It always tries "virsh create name" where name is the vm name.
The point a) is a little bit silly because if you have a RHEL 5.3 cluster that is using xm configuration files, your vms will no longer launch after the upgrade, because it tries to do a "virsh create name" instead of an "xm create name". It would probably have been cleaner to have use_virsh = 0 by default to keep compatibility.
The point b) adds compatibility for people who use the path attribute in order to store vm conf files in a place shared by all members of the cluster (a gfs2 or nfs directory, for example). It would have been clearer to document this feature, because it is a little bit magical to see a resource with use_virsh=1 in fact use xm and not virsh!
The point c) is very silly, because it restricts the configuration to be loaded from /etc/xen even for kvm! Also, this directory is not shared among the various members of the cluster, and the configuration file must have the same name as the vm name (we prefer to call it name.xml). Also, there is no problem using "virsh create /path/to/file.xml" under RHEL 5.4, and I have found out that the cluster 3.0 stable branch has a new vm.sh file using an xmlpath attribute to solve this problem. Why was this version not backported to RHEL 5.4? Is there any plan to do it?
Regards, -- Alain RICHARD EQUATION SA Tel : +33 477 79 48 00 Fax : +33 477 79 48 01 E-Liance, Opérateur des entreprises et collectivités, Liaisons Fibre optique, SDSL et ADSL
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From alain.richard at equation.fr Mon Nov 2 16:59:18 2009 From: alain.richard at equation.fr (Alain RICHARD) Date: Mon, 2 Nov 2009 17:59:18 +0100 Subject: [Linux-cluster] qdiskd master election and loss of quorum Message-ID:
I am currently using an n-node configuration with a qdiskd process to sustain the failure of n-1 nodes. The simplest case is a two-node one: ... ... ...
I am sometimes experiencing a loss of quorum on the other node when I gracefully shut down a node using the following: # service rgmanager stop # service gfs2 stop # service clvmd stop # service qdiskd stop # service cman stop
After looking more closely at the problem, I discovered that the node I shut down is the master qdisk node, so when I shut down qdiskd and cman on the first node, the second node experiences a loss of the qdisk vote (because the second node sees that the qdisk master is no longer available and starts the election of a new master) and, almost simultaneously, a loss of the first node's vote because it has left the cluster. The effect is that the second node experiences a loss of quorum for about 20 seconds, the time it takes to elect itself as qdisk master. The problem is that rgmanager sees the loss of quorum and shuts down all the virtual machines that are under its control! If I wait 20 seconds between the "service qdiskd stop" and "service cman stop", I don't get the problem because the second node gets the time to elect itself master.
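A minimal sketch of that workaround, based only on the observation above (the 30-second pause stands in for the observed re-election window; it is not a documented value):

  service rgmanager stop
  service gfs2 stop
  service clvmd stop
  service qdiskd stop
  sleep 30    # give the surviving node time to elect itself qdisk master
  service cman stop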
I was thinking qdiskd is supposed to be a process that maintains the quorum independently of the cman communication. Either I am making a mistake or misusing qdiskd, or there is something to change in the handling of qdiskd votes. One solution may be for a node that was not the qdiskd master, and that was issuing votes to cman, to keep issuing that vote until the new master election succeeds, instead of withdrawing its vote while the master re-election is in progress?
Regards, -- Alain RICHARD EQUATION SA Tel : +33 477 79 48 00 Fax : +33 477 79 48 01 E-Liance, Opérateur des entreprises et collectivités, Liaisons Fibre optique, SDSL et ADSL
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From teigland at redhat.com Mon Nov 2 17:11:27 2009 From: teigland at redhat.com (David Teigland) Date: Mon, 2 Nov 2009 11:11:27 -0600 Subject: [Linux-cluster] GFS2 processes getting stuck in WCHAN=dlm_posix_lock In-Reply-To: <4AEB765B.3010408@isye.gatech.edu> References: <4AEB765B.3010408@isye.gatech.edu> Message-ID: <20091102171127.GE613@redhat.com>
On Fri, Oct 30, 2009 at 07:27:23PM -0400, Allen Belletti wrote: > I'll notice the problem when the load average starts rising. It's > always tied to "stuck" processes, and I believe always tied to IMAP > clients (I'm running Dovecot.) It seems like a file belonging to user > "x" (in this case, "jforrest" will become locked in some way, such that > every IMAP process tied that user will get stuck on the same thing. > Over time, as the user keeps trying to read that file, more & more > processes accumulate. They're always in state "D" (uninterruptible > sleep), and always on "dlm_posix_lock" according to WCHAN. The only way > I'm able to get out of this state is to reboot. If I let it persist for > too long, I/O generally stops entirely.
Next time, try to collect all the following information as soon as you can after the first process gets stuck: - ps showing pid of stuck/"D" process(es) and WCHAN - which file they are stuck trying to lock (and the inode number of it, you may need to wait until after the reboot to use ls -li on the file to get the inode number) - group_tool dump plocks from all the nodes
I'm guessing that dovecot does some "unusual" combinations of locking, closing, renaming, unlinking files. Those combinations are especially prone to races and bugs that cause posix lock state to get off.
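A sketch of gathering that information with standard tools (the path, the PID and the filesystem name below are placeholders to be replaced with real values):

  # stuck processes in D state and the WCHAN they are sleeping in
  ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'

  # inode number of the file they are stuck locking, once identified
  ls -li /path/to/suspect/file

  # posix lock state, run on every node (<fsname> = the GFS2 filesystem name)
  group_tool dump plocks <fsname>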
Dave From teigland at redhat.com Mon Nov 2 17:25:43 2009 From: teigland at redhat.com (David Teigland) Date: Mon, 2 Nov 2009 11:25:43 -0600 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 In-Reply-To: <561c252c0911020609p151c1691t63d5fc079f5765d2@mail.gmail.com> References: <561c252c0911020609p151c1691t63d5fc079f5765d2@mail.gmail.com> Message-ID: <20091102172543.GF613@redhat.com> On Mon, Nov 02, 2009 at 03:09:26PM +0100, Gianluca Cecchi wrote: > Nov 2 12:47:54 mindy kernel: dlm: connect from non cluster node > Nov 2 12:59:48 mindy kernel: dlm_send invoked oom-killer: gfp_mask=0xd0, > order=1, oomkilladj=0 > Nov 2 12:59:48 mindy kernel: > Nov 2 12:59:48 mindy kernel: Call Trace: > Nov 2 12:59:48 mindy kernel: [] out_of_memory+0x8e/0x2f5 > Nov 2 12:59:48 mindy kernel: [] > autoremove_wake_function+0x0/0x2e > Nov 2 12:59:48 mindy kernel: [] > __alloc_pages+0x245/0x2ce > Nov 2 12:59:48 mindy kernel: [] __alloc_pages+0x65/0x2ce > Nov 2 12:59:48 mindy kernel: [] cache_grow+0x137/0x395 > Nov 2 12:59:48 mindy kernel: [] > cache_alloc_refill+0x136/0x186 > Nov 2 12:59:48 mindy kernel: [] > kmem_cache_alloc+0x6c/0x76 > Nov 2 12:59:48 mindy kernel: [] sk_alloc+0x2e/0xf3 > Nov 2 12:59:48 mindy kernel: [] inet_create+0x137/0x267 > Nov 2 12:59:49 mindy kernel: [] > __sock_create+0x170/0x27c > Nov 2 12:59:49 mindy kernel: [] > :dlm:process_send_sockets+0x0/0x179 > Nov 2 12:59:49 mindy kernel: [] > :dlm:tcp_connect_to_sock+0x70/0x1de > Nov 2 12:59:49 mindy kernel: [] thread_return+0x62/0xfe > Nov 2 12:59:49 mindy kernel: [] > :dlm:process_send_sockets+0x20/0x179 > Nov 2 12:59:49 mindy kernel: [] > :dlm:process_send_sockets+0x0/0x179 > Nov 2 12:59:49 mindy kernel: [] run_workqueue+0x94/0xe4 > Nov 2 12:59:49 mindy kernel: [] worker_thread+0x0/0x122 > Nov 2 12:59:49 mindy kernel: [] > keventd_create_kthread+0x0/0xc4 > Nov 2 12:59:49 mindy kernel: [] worker_thread+0xf0/0x122 > Nov 2 12:59:49 mindy kernel: [] > default_wake_function+0x0/0xe > Nov 2 12:59:49 mindy kernel: [] > keventd_create_kthread+0x0/0xc4 > Nov 2 12:59:49 mindy kernel: [] > keventd_create_kthread+0x0/0xc4 > Nov 2 12:59:49 mindy kernel: [] kthread+0xfe/0x132 > Nov 2 12:59:49 mindy kernel: [] child_rip+0xa/0x11 > Nov 2 12:59:49 mindy kernel: [] > keventd_create_kthread+0x0/0xc4 > Nov 2 12:59:49 mindy kernel: [] > :ext3:ext3_journal_dirty_data+0x0/0x34 > Nov 2 12:59:49 mindy kernel: [] kthread+0x0/0x132 > Nov 2 12:59:49 mindy kernel: [] child_rip+0x0/0x11 > Nov 2 12:59:49 mindy kernel: The out-of-memory should be fixed in 5.4: https://bugzilla.redhat.com/show_bug.cgi?id=508829 The fix for dlm_send spinning is not released yet: https://bugzilla.redhat.com/show_bug.cgi?id=521093 Dave From agx at sigxcpu.org Mon Nov 2 19:03:09 2009 From: agx at sigxcpu.org (Guido =?iso-8859-1?Q?G=FCnther?=) Date: Mon, 2 Nov 2009 20:03:09 +0100 Subject: [PATCH]: fix fence_vm run during build [was Re: [Linux-cluster] ccs_config_validate in cluster 3.0.X] In-Reply-To: <4AEBE4AE.1070502@redhat.com> References: <4AE81EAE.3040604@redhat.com> <20091030160119.GA21200@bogon.sigxcpu.org> <4AEBE4AE.1070502@redhat.com> Message-ID: <20091102190309.GA17692@bogon.sigxcpu.org> On Sat, Oct 31, 2009 at 08:18:06AM +0100, Fabio Massimo Di Nitto wrote: > Guido G?nther wrote: > >On Wed, Oct 28, 2009 at 11:36:30AM +0100, Fabio M. Di Nitto wrote: > >>Hi everybody, > >> > >>as briefly mentioned in 3.0.4 release note, a new system to validate the > >>configuration has been enabled in the code. 
> >> > >>What it does > >>------------ > >> > >>The general idea is to be able to perform as many sanity checks on the > >>configuration as possible. This check allows us to spot the most common > >>mistakes, such as typos or possibly invalid values, in cluster.conf. > >This is great. For what it's worth: I've pushed Cluster 3.0.4 into > >Debian experimental a couple of days ago. > >Cheers, > > -- Guido > > > > Hi Guido, > > thanks for pushing the packages to Debian. > > Please make sure to forward bugs related to this check so we can > address them quickly. > > Lon update the FAQ on our wiki to help debugging issues related to RelaxNG. > > It would be nice if you could do a package check around > (corosync/openais/cluster) and send us any local patch you have. I > have noticed at least corosync has one that is suitable for > upstream. > I didn?t have time to look at cluster. Attached patches fixed the build if you don't have liblogthread already installed. Needed to run fence_xvm. Cheers, -- Guido -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-add-LD_LIBRARY_PATH-so-we-find-liblogthread.patch Type: text/x-diff Size: 737 bytes Desc: not available URL: From lhh at redhat.com Mon Nov 2 19:05:38 2009 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 02 Nov 2009 14:05:38 -0500 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 In-Reply-To: References: Message-ID: <1257188738.2496.71.camel@localhost.localdomain> On Mon, 2009-11-02 at 16:44 +0100, Alain RICHARD wrote: > a) use_virsh = 1 per default > b) that if your resource have a path attribute, vm.sh automatically > revert to use_virsh=0, even if you hard code use_virsh=1 ! > c) there is no option to indicate the xml file that virsh use to > create the vm. It always tries "virsh create name" where name is the > vm name. > The point a) is a little bit silly because if you have a RHEL 5.3 > cluster that is using xm configuration files, your vm will no longer > launch after upgrade because it tries to do a "virsh create name" > instead of "xm create name". It would have been probably cleaner to > have "use_virsh = 0" per default to keep compatibility. Libvirt's Xen mode loads config files from /etc/xen. When you run 'virsh start foo' it will look for the config in /etc/xen which defines the virtual machine named 'foo'. So, effectively, either what you said above works for me or I don't understand your problem. Both 'xm' and 'virsh' modes work on Xen domains in 5.4 (assuming you don't have a path attribute set): http://pastebin.ca/1653460 Did you mix 5.3 libvirt with 5.4 rgmanager or something? If so, I'd simply keep using the vm.sh from rgmanager in 5.3. > The point b) will add compatibility to people that use the path > attribute in order to store vm conf files in a place shared by all > members of the cluster (gfs2 or nfs directory for example). It would > have been clearer to document this feature because it is a little bit > magical to see a resource with use_virsh=1 use in fact xm and not > virsh !!! You can't use "path" with virsh. You are correct, though - it should produce an error if someone explicitly sets use_virsh="1" with a path= also set. The two options can not be used together. On making "virsh" mode more usable... virsh supports 2 things: - Loading something by -name- from /etc/libvirt/qemu (or /etc/xen), and - Defining transient virtual machines from a -file- vm.sh when using virsh in 5.4 supports the former, but not the latter. 
Federico Simoncelli wrote a patch to allow the latter for STABLE3: http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=ea90559c936792e22576cac7a0bd0a2a50573426 > The point c) is very silly, because it restricts the configuration to > be loaded from /etc/xen even for kvm ! Correct. Well, /etc/libvirt/qemu for kvm, but... There's no way to define alternate locations for VM config files in rgmanager from RHEL 5.4. > Also their is no problem to use a "virsh create /path/to/file.xml" > under RHEL 5.4 and I have found out that the cluster 3.0 stable branch > have a new vm.sh file using an xmlpath attribute to solve this > problem. Why this version was not back ported to RHEL 5.4 ? Is there > any plan to do it ? No, but it sounds like there should be. -- Lon From lhh at redhat.com Mon Nov 2 19:06:39 2009 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 02 Nov 2009 14:06:39 -0500 Subject: [Linux-cluster] qdiskd master election and loss of quorum In-Reply-To: References: Message-ID: <1257188799.2496.72.camel@localhost.localdomain> On Mon, 2009-11-02 at 17:59 +0100, Alain RICHARD wrote: > I am currently using a n nodes configuration with a qdiskd process to > sustain a n-1 node failure. > > > The simplest case is a two node : > > > > > > Add 'quorum_dev_poll="42000' to the cman tag. -- Lon From allen at isye.gatech.edu Mon Nov 2 19:27:24 2009 From: allen at isye.gatech.edu (Allen Belletti) Date: Mon, 02 Nov 2009 14:27:24 -0500 Subject: [Linux-cluster] GFS2 processes getting stuck in WCHAN=dlm_posix_lock In-Reply-To: <1257162169.6052.746.camel@localhost.localdomain> References: <4AEB765B.3010408@isye.gatech.edu> <1257162169.6052.746.camel@localhost.localdomain> Message-ID: <4AEF329C.4020403@isye.gatech.edu> Hi Steve, On 11/02/2009 06:42 AM, Steven Whitehouse wrote: > Hi, > > On Fri, 2009-10-30 at 19:27 -0400, Allen Belletti wrote: > >> Hi All, >> >> As I've mentioned before, I'm running a two-node clustered mail server >> on GFS2 (with RHEL 5.4) Nearly all of the time, everything works >> great. However, going all the way back to GFS1 on RHEL 5.1 (I think it >> was), I've had occasional locking problems that force a reboot of one or >> both cluster nodes. Lately I've paid closer attention since it's been >> happening more often. >> >> I'll notice the problem when the load average starts rising. It's >> always tied to "stuck" processes, and I believe always tied to IMAP >> clients (I'm running Dovecot.) It seems like a file belonging to user >> "x" (in this case, "jforrest" will become locked in some way, such that >> every IMAP process tied that user will get stuck on the same thing. >> Over time, as the user keeps trying to read that file, more& more >> processes accumulate. They're always in state "D" (uninterruptible >> sleep), and always on "dlm_posix_lock" according to WCHAN. The only way >> I'm able to get out of this state is to reboot. If I let it persist for >> too long, I/O generally stops entirely. >> >> This certainly seems like it ought to have a definite solution, but I've >> no idea what it is. I've tried a variety of things using "find" to >> pinpoint a particular file, but everything belonging to the affected >> user seems just fine. At least, I can read and copy all of the files, >> and do a stat via ls -l. >> >> Is it possible that this is a bug, not within GFS at all, but within >> Dovecot IMAP? >> >> Any thoughts would be appreciated. It's been getting worse lately and >> thus no fun at all. >> >> Cheers, >> Allen >> >> > Do you know if dovecot IMAP uses signals at all? 
That would be the first > thing that I'd look at. The other thing to check is whether it makes use > of F_GETLK and in particular the l_pid field? strace should be able to > answer both of those questions (except the l_pid field of course, but > the chances are it it calls F_GETLK and then sends a signal, its also > using the l_pid field), > I've checked via both strace and grepping the source, and found no evidence of F_GETLK nor the l_pid field being referenced. Signals don't appear to play a significant role either; I've managed to snag an strace -f -p of a "healthy" imap session (ie, dlm_posix_lock briefly appearing in WCHAN but going away as expected) and I see no signals being used. By the way, I took advantage of a quiet period early Sunday morning and ran fsck.gfs2( version 3.0.4) on the two GFS2 filesystems. Both had a variety of errors although no evidence of major corruption. Since that completed I've seen no additional "stuck" locks but the sample period is far too short to tell. Sometimes things work for weeks without issue. Thanks for your suggestions! Allen -- Allen Belletti allen at isye.gatech.edu 404-894-6221 Phone Industrial and Systems Engineering 404-385-2988 Fax Georgia Institute of Technology From allen at isye.gatech.edu Mon Nov 2 20:02:43 2009 From: allen at isye.gatech.edu (Allen Belletti) Date: Mon, 02 Nov 2009 15:02:43 -0500 Subject: [Linux-cluster] GFS2 processes getting stuck in WCHAN=dlm_posix_lock In-Reply-To: <20091102171127.GE613@redhat.com> References: <4AEB765B.3010408@isye.gatech.edu> <20091102171127.GE613@redhat.com> Message-ID: <4AEF3AE3.3090401@isye.gatech.edu> Hi Dave, On 11/02/2009 12:11 PM, David Teigland wrote: > On Fri, Oct 30, 2009 at 07:27:23PM -0400, Allen Belletti wrote: > >> I'll notice the problem when the load average starts rising. It's >> always tied to "stuck" processes, and I believe always tied to IMAP >> clients (I'm running Dovecot.) It seems like a file belonging to user >> "x" (in this case, "jforrest" will become locked in some way, such that >> every IMAP process tied that user will get stuck on the same thing. >> Over time, as the user keeps trying to read that file, more& more >> processes accumulate. They're always in state "D" (uninterruptible >> sleep), and always on "dlm_posix_lock" according to WCHAN. The only way >> I'm able to get out of this state is to reboot. If I let it persist for >> too long, I/O generally stops entirely. >> > Next time, try to collect all the following information as soon as you can > after the first process gets stuck: > > - ps showing pid of stuck/"D" process(es) and WCHAN > - which file they are stuck trying to lock > (and the inode number of it, you may need to wait until after the > reboot to use ls -li on the file to get the inode number) > - group_tool dump plocks from all the nodes > > I'm guessing that dovecot does some "unusual" combinations of locking, > closing, renaming, unlinking files. Those combinations are especially > prone to races and bugs that cause posix lock state to get off. > I'll collect all of this as soon as I catch the problem in action again. Do you know how I might go about determine which file is involved? I can find the user because it's associated with the particular "imap" process, but haven't been able to figure out what's being locked. 
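One possible way to narrow that down with standard tools (only a suggestion, not something from the thread; the PID, mount point and inode number are placeholders):

  # posix locks known to the kernel; the field after the PID is major:minor:inode
  cat /proc/locks

  # files currently held open by a stuck imap process
  ls -l /proc/12345/fd

  # map an inode number back to a path on the GFS2 mount (can be slow)
  find /gfs2/mountpoint -inum 123456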
Thanks, Allen -- Allen Belletti allen at isye.gatech.edu 404-894-6221 Phone Industrial and Systems Engineering 404-385-2988 Fax Georgia Institute of Technology From gianluca.cecchi at gmail.com Mon Nov 2 21:29:15 2009 From: gianluca.cecchi at gmail.com (Gianluca Cecchi) Date: Mon, 2 Nov 2009 22:29:15 +0100 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 In-Reply-To: <20091102172543.GF613@redhat.com> References: <561c252c0911020609p151c1691t63d5fc079f5765d2@mail.gmail.com> <20091102172543.GF613@redhat.com> Message-ID: <561c252c0911021329k2c7f94cag79fd457c1d6baf87@mail.gmail.com> On Mon, Nov 2, 2009 at 6:25 PM, David Teigland wrote: > > The out-of-memory should be fixed in 5.4: > > https://bugzilla.redhat.com/show_bug.cgi?id=508829 > > The fix for dlm_send spinning is not released yet: > > https://bugzilla.redhat.com/show_bug.cgi?id=521093 > > Dave > > Thank you so much for the feedback. So I have to expect this freeze and possible downtime...... also if my real nmodes a safer method could be this one below for my two nodes + quorum disk cluster? 1) shutdown and restart in single user mode of the passive node So now the cluster is composed of only one node in 5.3 without loss of service, at the moment 2) start network and update the passive node (as in steps of the first mail) 3) reboot in single user mode of the just updated node, and test correct funcionality (without cluster) 4) shutdown again of the just updated node 5) shutdown of the active node --- NOW we have downtime (planned) 6) startup of the updated node, now in 5.4 (and with 508829 bug corrected) This node should form the cluster with 2 votes, itself and the quorum, correct? 7) IDEA: make a dummy update to the config on this new running node, only incrementing version number by one, so that after, when the other node comes up, it gets the config.... Does it make sense or no need/no problems for this when the second node will join? 8) power on in single user mode of the node still in 5.3 9) start network on it and update system as in steps 2) 10) reboot the just updated node and let it start in single user mode to test its functionality (without cluster enabled) 11) reboot again and let it normally join the cluster Expected result: correct join of the cluster, correct? 12) Test a relocation of the service ----- NOW another little downtime, but to be sure that in case of need we get relocation without problems I'm going to test this tomorrow (here half past ten pm now) after restore of initial situation with both in 5.3, so if there are any comments, they are welcome.. Thanks Gianluca -------------- next part -------------- An HTML attachment was scrubbed... URL: From allen at isye.gatech.edu Mon Nov 2 21:44:15 2009 From: allen at isye.gatech.edu (Allen Belletti) Date: Mon, 02 Nov 2009 16:44:15 -0500 Subject: [Linux-cluster] GFS2 processes getting stuck in WCHAN=dlm_posix_lock In-Reply-To: <1257162169.6052.746.camel@localhost.localdomain> References: <4AEB765B.3010408@isye.gatech.edu> <1257162169.6052.746.camel@localhost.localdomain> Message-ID: <4AEF52AF.40108@isye.gatech.edu> Hi Again, On 11/02/2009 06:42 AM, Steven Whitehouse wrote: > Hi, > > On Fri, 2009-10-30 at 19:27 -0400, Allen Belletti wrote: > >> Hi All, >> >> As I've mentioned before, I'm running a two-node clustered mail server >> on GFS2 (with RHEL 5.4) Nearly all of the time, everything works >> great. 
However, going all the way back to GFS1 on RHEL 5.1 (I think it >> was), I've had occasional locking problems that force a reboot of one or >> both cluster nodes. Lately I've paid closer attention since it's been >> happening more often. >> >> I'll notice the problem when the load average starts rising. It's >> always tied to "stuck" processes, and I believe always tied to IMAP >> clients (I'm running Dovecot.) It seems like a file belonging to user >> "x" (in this case, "jforrest" will become locked in some way, such that >> every IMAP process tied that user will get stuck on the same thing. >> Over time, as the user keeps trying to read that file, more& more >> processes accumulate. They're always in state "D" (uninterruptible >> sleep), and always on "dlm_posix_lock" according to WCHAN. The only way >> I'm able to get out of this state is to reboot. If I let it persist for >> too long, I/O generally stops entirely. >> >> This certainly seems like it ought to have a definite solution, but I've >> no idea what it is. I've tried a variety of things using "find" to >> pinpoint a particular file, but everything belonging to the affected >> user seems just fine. At least, I can read and copy all of the files, >> and do a stat via ls -l. >> >> Is it possible that this is a bug, not within GFS at all, but within >> Dovecot IMAP? >> >> Any thoughts would be appreciated. It's been getting worse lately and >> thus no fun at all. >> >> Cheers, >> Allen >> >> > Do you know if dovecot IMAP uses signals at all? That would be the first > thing that I'd look at. The other thing to check is whether it makes use > of F_GETLK and in particular the l_pid field? strace should be able to > answer both of those questions (except the l_pid field of course, but > the chances are it it calls F_GETLK and then sends a signal, its also > using the l_pid field), > > Steve. > I've been looking into how Dovecot IMAP works and I see now that no "locking" in the OS sense of the word is involved for Maildir access. Instead, one particular index per mail folder is "locked" by creating a .lock entry, performing the necessary operations, and then deleting the file. In the case of certain users with hundreds of folders and mail clients which scan all of them, this potentially results in hundreds of rapid create/delete operations. The relevant text from the Dovecot documentation is as follows: > Although maildir was designed to be lockless, Dovecot locks the > maildir while > doing modifications to it or while looking for new messages in it. This is > required because otherwise Dovecot might temporarily see mails incorrectly > deleted, which would cause trouble. Basically the problem is that if one > process modifies the maildir (eg. a rename() to change a message's flag), > another process in the middle of listing files at the same time could > skip a > file. The skipping happens because readdir() system call doesn't > guarantee that > all the files are returned if the directory is modified between the > calls to > it. This problem exists with all the commonly used filesystems. > > Because Dovecot uses its own non-standard locking ('dovecot-uidlist.lock' > dotlock file), other MUAs accessing the maildir don't support it. This > means > that if another MUA is updating messages' flags or expunging messages, > Dovecot > might temporarily lose some message. After the next sync when it finds it > again, an error message may be written to log and the message will > receive a > new UID. 
Does GFS2 have the limitation that's being described for readdir()? I would expect so, but perhaps the work necessary to ensure a consistent view between cluster node has the side effect of correcting this issue as well. In any case, the number of times that my users would actually encounter the issue being protected against might be so rare that I can safely disable the locking mechanism regardless. Any thoughts on this would be appreciated. Would this sequence of operations cause the WCHAN=dlm_posix_lock condition for brief periods of time in normal operation? Wish I could dig through the kernel & gfs2 code to figure this out for myself but it would crush my productivity at work :-) Cheers, Allen -- Allen Belletti allen at isye.gatech.edu 404-894-6221 Phone Industrial and Systems Engineering 404-385-2988 Fax Georgia Institute of Technology From gordan at bobich.net Tue Nov 3 00:26:05 2009 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 03 Nov 2009 00:26:05 +0000 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 In-Reply-To: <561c252c0911021329k2c7f94cag79fd457c1d6baf87@mail.gmail.com> References: <561c252c0911020609p151c1691t63d5fc079f5765d2@mail.gmail.com> <20091102172543.GF613@redhat.com> <561c252c0911021329k2c7f94cag79fd457c1d6baf87@mail.gmail.com> Message-ID: <4AEF789D.9090004@bobich.net> On 02/11/2009 21:29, Gianluca Cecchi wrote: > > On Mon, Nov 2, 2009 at 6:25 PM, David Teigland > wrote: > > > The out-of-memory should be fixed in 5.4: > > https://bugzilla.redhat.com/show_bug.cgi?id=508829 > > The fix for dlm_send spinning is not released yet: > > https://bugzilla.redhat.com/show_bug.cgi?id=521093 > > Dave > > > Thank you so much for the feedback. > So I have to expect this freeze and possible downtime...... also if my > real nmodes a safer method could be this one below for my two nodes + > quorum disk cluster? > 1) shutdown and restart in single user mode of the passive node > So now the cluster is composed of only one node in 5.3 without loss of > service, at the moment > 2) start network and update the passive node (as in steps of the first mail) > 3) reboot in single user mode of the just updated node, and test correct > funcionality (without cluster) > 4) shutdown again of the just updated node > 5) shutdown of the active node --- NOW we have downtime (planned) > 6) startup of the updated node, now in 5.4 (and with 508829 bug corrected) > This node should form the cluster with 2 votes, itself and the quorum, > correct? > > 7) IDEA: make a dummy update to the config on this new running node, > only incrementing version number by one, so that after, when the other > node comes up, it gets the config.... > Does it make sense or no need/no problems for this when the second node > will join? > > 8) power on in single user mode of the node still in 5.3 > 9) start network on it and update system as in steps 2) > 10) reboot the just updated node and let it start in single user mode to > test its functionality (without cluster enabled) > 11) reboot again and let it normally join the cluster > > Expected result: correct join of the cluster, correct? > > 12) Test a relocation of the service ----- NOW another little downtime, > but to be sure that in case of need we get relocation without problems > > I'm going to test this tomorrow (here half past ten pm now) after > restore of initial situation with both in 5.3, so if there are any > comments, they are welcome.. 
FWIW, I just updated one of my DRBD+GFS clusters from 5.3 (and early 5.3 at that) to 5.4 with a rolling re-start, and it "just worked". It's a 2-node cluster with a shared GFS root, and I updated it, rebuilt the initrd, rebooted one node, which came up and rejoined find, then rebooted the other. No service downtime. Gordan From alain.richard at equation.fr Tue Nov 3 08:52:23 2009 From: alain.richard at equation.fr (Alain RICHARD) Date: Tue, 3 Nov 2009 09:52:23 +0100 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 In-Reply-To: <1257188738.2496.71.camel@localhost.localdomain> References: <1257188738.2496.71.camel@localhost.localdomain> Message-ID: <91816B41-6F16-429D-B125-BB826F9441F0@equation.fr> Le 2 nov. 2009 ? 20:05, Lon Hohberger a ?crit : > On Mon, 2009-11-02 at 16:44 +0100, Alain RICHARD wrote: > >> a) use_virsh = 1 per default >> b) that if your resource have a path attribute, vm.sh automatically >> revert to use_virsh=0, even if you hard code use_virsh=1 ! >> c) there is no option to indicate the xml file that virsh use to >> create the vm. It always tries "virsh create name" where name is the >> vm name. > >> The point a) is a little bit silly because if you have a RHEL 5.3 >> cluster that is using xm configuration files, your vm will no longer >> launch after upgrade because it tries to do a "virsh create name" >> instead of "xm create name". It would have been probably cleaner to >> have "use_virsh = 0" per default to keep compatibility. > > Libvirt's Xen mode loads config files from /etc/xen. When you run > 'virsh start foo' it will look for the config in /etc/xen which > defines > the virtual machine named 'foo'. > > So, effectively, either what you said above works for me or I don't > understand your problem. Both 'xm' and 'virsh' modes work on Xen > domains in 5.4 (assuming you don't have a path attribute set): > > http://pastebin.ca/1653460 > > Did you mix 5.3 libvirt with 5.4 rgmanager or something? If so, I'd > simply keep using the vm.sh from rgmanager in 5.3. > I am sorry about that one : effectively virsh start xxx is able to read xen format config file and convert it on the fly in xml. It is not documentented anywhere, but I just tested that virsh start works. >> The point b) will add compatibility to people that use the path >> attribute in order to store vm conf files in a place shared by all >> members of the cluster (gfs2 or nfs directory for example). It would >> have been clearer to document this feature because it is a little bit >> magical to see a resource with use_virsh=1 use in fact xm and not >> virsh !!! > > You can't use "path" with virsh. You are correct, though - it should > produce an error if someone explicitly sets use_virsh="1" with a path= > also set. The two options can not be used together. > no currently if the path is defined, use_virsh is silently forced to "0" in the script. > On making "virsh" mode more usable... > > virsh supports 2 things: > - Loading something by -name- from /etc/libvirt/qemu (or /etc/xen), > and > - Defining transient virtual machines from a -file- > > vm.sh when using virsh in 5.4 supports the former, but not the latter. > Federico Simoncelli wrote a patch to allow the latter for STABLE3: > > http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=ea90559c936792e22576cac7a0bd0a2a50573426 > > yes, this is this one I hope to see backported under RHEL5.x. >> The point c) is very silly, because it restricts the configuration to >> be loaded from /etc/xen even for kvm ! > > Correct. 
Well, /etc/libvirt/qemu for kvm, but... There's no way to > define alternate locations for VM config files in rgmanager from RHEL > 5.4. > > >> Also their is no problem to use a "virsh create /path/to/file.xml" >> under RHEL 5.4 and I have found out that the cluster 3.0 stable >> branch >> have a new vm.sh file using an xmlpath attribute to solve this >> problem. Why this version was not back ported to RHEL 5.4 ? Is there >> any plan to do it ? > > No, but it sounds like there should be. > Should I open a case under bugzilla to ask for the backport ? Regards, -- Alain RICHARD EQUATION SA Tel : +33 477 79 48 00 Fax : +33 477 79 48 01 E-Liance, Op?rateur des entreprises et collectivit?s, Liaisons Fibre optique, SDSL et ADSL -------------- next part -------------- An HTML attachment was scrubbed... URL: From alain.richard at equation.fr Tue Nov 3 08:59:57 2009 From: alain.richard at equation.fr (Alain RICHARD) Date: Tue, 3 Nov 2009 09:59:57 +0100 Subject: [Linux-cluster] qdiskd master election and loss of quorum In-Reply-To: <1257188799.2496.72.camel@localhost.localdomain> References: <1257188799.2496.72.camel@localhost.localdomain> Message-ID: Le 2 nov. 2009 ? 20:06, Lon Hohberger a ?crit : > On Mon, 2009-11-02 at 17:59 +0100, Alain RICHARD wrote: >> I am currently using a n nodes configuration with a qdiskd process to >> sustain a n-1 node failure. >> >> >> The simplest case is a two node : >> >> >> >> >> >> > > Add 'quorum_dev_poll="42000' to the cman tag. > > -- Lon I just found out a note describing this parameter : http://kbase.redhat.com/faq/docs/DOC-2882 but It seams to deals with problems with multipath and cman interraction on one node. My problem seams to be a little bit different : I do stop qdisk and cman on node 1, and this triggers a temporary loss of quorum on the node 2. I will try this parameter to see if I reproduce the problem. Regards, PS: what is the default value for this parameter ? -- Alain RICHARD EQUATION SA Tel : +33 477 79 48 00 Fax : +33 477 79 48 01 E-Liance, Op?rateur des entreprises et collectivit?s, Liaisons Fibre optique, SDSL et ADSL -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniac.nl at gmail.com Tue Nov 3 09:05:37 2009 From: maniac.nl at gmail.com (Mark Janssen) Date: Tue, 3 Nov 2009 10:05:37 +0100 Subject: [Linux-cluster] Inconsistency between 'gfs_tool df' and 'df' Message-ID: <531e3e4c0911030105q3e9b9d02n346f6dea07dd6164@mail.gmail.com> Hello, One of my GFS1 filesystems was filling up, so I resized the logical volume and gfs_grow'd the filesystem. gfs_tool df will correctly show the new maximum and free size, however normal 'df' still reports the original size, and is reporting a very full filesystem. 
[root at system ~]# gfs_tool df /some/filesystem /some/filesystem: SB lock proto = "lock_dlm" SB lock table = "nfsclust:blablabla" SB ondisk format = 1309 SB multihost format = 1401 Block size = 4096 Journals = 2 Resource Groups = 3003 Mounted lock proto = "lock_dlm" Mounted lock table = "nfsclust:blablabla" Mounted host data = "jid=1:id=1310722:first=0" Journal number = 1 Lock module flags = 0 Local flocks = FALSE Local caching = FALSE Oopses OK = FALSE Type Total Used Free use % ------------------------------------------------------------------------ inodes 1739711 1739711 0 100% metadata 471881 338342 133539 72% data 223141556 193760217 29381339 87% [root at system ~]# df -h /some/filesystem Filesystem Size Used Avail Use% Mounted on /dev/mapper/CluVGgfs-lv--somefilesystem 810G 748G 63G 93% /some/filesystem [root at system ~]# lvs LV VG Attr LSize Origin Snap% Move Log Copy % Convert lv-somefilesystem CluVGgfs -wi-ao 860.00G Any idea on how I get '/bin/df' to show the correct 'new' size and utilization. Currently nagios is putting this filesystem in warning due to utilization, though it's not full (according to gfs_tool df). -- Mark Janssen -- maniac(at)maniac.nl -- pgp: 0x357D2178 | ,''`. | Unix / Linux Open-Source and Internet Consultant @ Snow.nl | : :' : | Maniac.nl MarkJanssen.nl NerdNet.nl Unix.nl | `. `' | Skype: markmjanssen ICQ: 129696007 irc: FooBar on undernet | `- | From fdinitto at redhat.com Tue Nov 3 09:28:10 2009 From: fdinitto at redhat.com (Fabio Massimo Di Nitto) Date: Tue, 03 Nov 2009 10:28:10 +0100 Subject: [PATCH]: fix fence_vm run during build [was Re: [Linux-cluster] ccs_config_validate in cluster 3.0.X] In-Reply-To: <20091102190309.GA17692@bogon.sigxcpu.org> References: <4AE81EAE.3040604@redhat.com> <20091030160119.GA21200@bogon.sigxcpu.org> <4AEBE4AE.1070502@redhat.com> <20091102190309.GA17692@bogon.sigxcpu.org> Message-ID: <4AEFF7AA.7060609@redhat.com> Hi Guido, Guido G?nther wrote: > > Attached patches fixed the build if you don't have liblogthread already > installed. Needed to run fence_xvm. > Cheers, > -- Guido > Thanks, your patch is actually cleaner than mine. I?ll apply it shortly. Fabio From fdinitto at redhat.com Tue Nov 3 09:33:33 2009 From: fdinitto at redhat.com (Fabio Massimo Di Nitto) Date: Tue, 03 Nov 2009 10:33:33 +0100 Subject: [PATCH]: fix fence_vm run during build [was Re: [Linux-cluster] ccs_config_validate in cluster 3.0.X] In-Reply-To: <20091102190309.GA17692@bogon.sigxcpu.org> References: <4AE81EAE.3040604@redhat.com> <20091030160119.GA21200@bogon.sigxcpu.org> <4AEBE4AE.1070502@redhat.com> <20091102190309.GA17692@bogon.sigxcpu.org> Message-ID: <4AEFF8ED.4000107@redhat.com> Guido G?nther wrote: > On Sat, Oct 31, 2009 at 08:18:06AM +0100, Fabio Massimo Di Nitto wrote: >> Guido G?nther wrote: >>> On Wed, Oct 28, 2009 at 11:36:30AM +0100, Fabio M. Di Nitto wrote: >>>> Hi everybody, >>>> >>>> as briefly mentioned in 3.0.4 release note, a new system to validate the >>>> configuration has been enabled in the code. >>>> >>>> What it does >>>> ------------ >>>> >>>> The general idea is to be able to perform as many sanity checks on the >>>> configuration as possible. This check allows us to spot the most common >>>> mistakes, such as typos or possibly invalid values, in cluster.conf. >>> This is great. For what it's worth: I've pushed Cluster 3.0.4 into >>> Debian experimental a couple of days ago. >>> Cheers, >>> -- Guido >>> >> Hi Guido, >> >> thanks for pushing the packages to Debian. 
>> >> Please make sure to forward bugs related to this check so we can >> address them quickly. >> >> Lon update the FAQ on our wiki to help debugging issues related to RelaxNG. >> >> It would be nice if you could do a package check around >> (corosync/openais/cluster) and send us any local patch you have. I >> have noticed at least corosync has one that is suitable for >> upstream. >> I didn?t have time to look at cluster. > > Attached patches fixed the build if you don't have liblogthread already > installed. Needed to run fence_xvm. > Cheers, > -- Guido Applied and pushed. thanks again! Fabio From sreeharsha.totakura at tcs.com Tue Nov 3 10:18:36 2009 From: sreeharsha.totakura at tcs.com (Sreeharsha Totakura) Date: Tue, 3 Nov 2009 15:48:36 +0530 Subject: [Linux-cluster] RHCS Message-ID: Hi, I am using RHCS 5.2 for achieving High Availability in MySQL database service. I have configured two nodes; one as a primary and the other as a backup. Both the servers have GFS on SAN. The cluster suite is configured properly and when a node shuts down, the service is being started in the other node with the configured virtual ip address for MySQL service. However, if we plug out the network link to the primary server the service is not being started in the other node; even after having 'monitor_link="1"' in cluster.conf for that virtual ip address. Our servers are connected by redundant SAN connections so we have 6 network interfaces on both the servers. I feel the nodes are still able to communicate through these network connections which are being used for SAN volumes. Is there anyway in which I can configure cluster manager to bind the virtual ip for MySQL service to an interface and monitor the link on that interface? Regards, Sree Harsha Totakura =====-----=====-----===== Notice: The information contained in this e-mail message and/or attachments to it may contain confidential or privileged information. If you are not the intended recipient, any dissemination, use, review, distribution, printing or copying of the information contained in this e-mail message and/or attachments to it are strictly prohibited. If you have received this communication in error, please notify us by reply e-mail or telephone and immediately and permanently delete the message and any attachments. Thank you From lhh at redhat.com Tue Nov 3 13:15:05 2009 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 03 Nov 2009 08:15:05 -0500 Subject: [Linux-cluster] qdiskd master election and loss of quorum In-Reply-To: References: <1257188799.2496.72.camel@localhost.localdomain> Message-ID: <1257254105.2483.4.camel@localhost> On Tue, 2009-11-03 at 09:59 +0100, Alain RICHARD wrote: > > I will try this parameter to see if I reproduce the problem. > > > Regards, > > > PS: what is the default value for this parameter ? Right now, 10000. As of a commit last week or so, it will be equal to the token timeout value: http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=b1ac9397a2e27c2862cf41df7ac523c9982b65cf (This should be in 3.0.5) Though it's a bit odd that stopping node 1 causes a loss of quorum on node2. 
:( -- Lon From lhh at redhat.com Tue Nov 3 13:20:04 2009 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 03 Nov 2009 08:20:04 -0500 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 In-Reply-To: <91816B41-6F16-429D-B125-BB826F9441F0@equation.fr> References: <1257188738.2496.71.camel@localhost.localdomain> <91816B41-6F16-429D-B125-BB826F9441F0@equation.fr> Message-ID: <1257254404.2483.9.camel@localhost> On Tue, 2009-11-03 at 09:52 +0100, Alain RICHARD wrote: > > no currently if the path is defined, use_virsh is silently forced to > "0" in the script. Yes. We probably should need to warn users if use_virsh="1" and path="..." is set; this would remove this confusion. https://bugzilla.redhat.com/show_bug.cgi?id=529926 > > > On making "virsh" mode more usable... > > > > virsh supports 2 things: > > - Loading something by -name- from /etc/libvirt/qemu (or /etc/xen), > > and > > - Defining transient virtual machines from a -file- > > > > vm.sh when using virsh in 5.4 supports the former, but not the > > latter. > > Federico Simoncelli wrote a patch to allow the latter for STABLE3: > > > > http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=ea90559c936792e22576cac7a0bd0a2a50573426 > > > > > > > > > yes, this is this one I hope to see backported under RHEL5.x. > Should I open a case under bugzilla to ask for the backport ? Yes. -- Lon From gordan at bobich.net Tue Nov 3 13:31:50 2009 From: gordan at bobich.net (Gordan Bobic) Date: Tue, 03 Nov 2009 13:31:50 +0000 Subject: [Linux-cluster] RHCS In-Reply-To: References: Message-ID: <4AF030C6.5010503@bobich.net> Sreeharsha Totakura wrote: > Hi, > > I am using RHCS 5.2 for achieving High Availability in MySQL database > service. I have configured two nodes; one as a primary and the other as a > backup. Both the servers have GFS on SAN. > > The cluster suite is configured properly and when a node shuts down, the > service is being started in the other node with the configured virtual ip > address for MySQL service. However, if we plug out the network link to the > primary server the service is not being started in the other node; even > after having 'monitor_link="1"' in cluster.conf for that virtual ip > address. > > Our servers are connected by redundant SAN connections so we have 6 network > interfaces on both the servers. I feel the nodes are still able to > communicate through these network connections which are being used for SAN > volumes. Is there anyway in which I can configure cluster manager to bind > the virtual ip for MySQL service to an interface and monitor the link on > that interface? The virtual IP will always be assigned to the interface with the static IP on the same subnet. i.e. if you have an interface eth0 on 10.1.1.1/24, and the floating IP is 10.1.1.2, it'll automatically go to the eth0 interface. Is this not what is happening in your setup? I think link monitoring should be on the static, rather than floating IP resource (that's what I do, and it works for me). 
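To illustrate what I mean (addresses invented), the ip resource effectively looks for the interface that already carries a static address in the same subnet and adds the floating address there as a secondary. Done by hand it would look something like:

   # eth0 already has the static 10.1.1.1/24 from ifcfg-eth0
   ip -o -f inet addr show              # see which NIC owns the 10.1.1.0/24 subnet
   ip addr add 10.1.1.2/24 dev eth0     # where the floating IP ends up on start
   ip addr del 10.1.1.2/24 dev eth0     # what happens on stop/relocation

As long as only one of your six interfaces sits in the service's subnet, the virtual IP can only ever land on that one.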
Gordan From jeff.sturm at eprize.com Tue Nov 3 15:07:47 2009 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Tue, 3 Nov 2009 10:07:47 -0500 Subject: [Linux-cluster] Inconsistency between 'gfs_tool df' and 'df' In-Reply-To: <531e3e4c0911030105q3e9b9d02n346f6dea07dd6164@mail.gmail.com> References: <531e3e4c0911030105q3e9b9d02n346f6dea07dd6164@mail.gmail.com> Message-ID: <64D0546C5EBBD147B75DE133D798665F03F3EED2@hugo.eprize.local> -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Mark Janssen Sent: Tuesday, November 03, 2009 4:06 AM To: linux-cluster at redhat.com Subject: [Linux-cluster] Inconsistency between 'gfs_tool df' and 'df' > gfs_tool df will correctly show the new maximum and free size, however > normal 'df' still reports the original size, and is reporting a very > full filesystem. Do you use statfs_fast? I've been able to refresh "df" output just by toggling it off, and on again, e.g.: gfs_tool /gfs statfs_fast 0 gfs_tool /gfs statfs_fast 1 -Jeff From lhh at redhat.com Tue Nov 3 17:56:18 2009 From: lhh at redhat.com (Lon Hohberger) Date: Tue, 03 Nov 2009 12:56:18 -0500 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 In-Reply-To: <1257254404.2483.9.camel@localhost> References: <1257188738.2496.71.camel@localhost.localdomain> <91816B41-6F16-429D-B125-BB826F9441F0@equation.fr> <1257254404.2483.9.camel@localhost> Message-ID: <1257270978.2483.17.camel@localhost> On Tue, 2009-11-03 at 08:20 -0500, Lon Hohberger wrote: > > > > yes, this is this one I hope to see backported under RHEL5.x. > > > Should I open a case under bugzilla to ask for the backport ? > > Yes. > Federico kicked be in to finishing this... Want to give it a shot? Effectively, it makes 'path' work with virsh - making virsh mode upgrade compatible with 'xm' mode. This means that we can really (actually) deprecate 'xm' support. Pay no attention to the fact that it uses a global var called 'xmlfile'; it doesn't do any config file parsing, so it still works with Xen and libvirt (xml) style config files: http://git.fedorahosted.org/git/?p=resource-agents.git;a=commit;h=1e42add4f2ec63d4c796d544118dfcccbd8a042b It just splits up a OCF_RESKEY_path on ':' and looks for a file in each directory named NAME or NAME.xml. The requirement is obviously that your config files need to be named after your VM. If this is a problem, you must use the full 'xmlfile' option. -- Lon From teigland at redhat.com Tue Nov 3 19:00:52 2009 From: teigland at redhat.com (David Teigland) Date: Tue, 3 Nov 2009 13:00:52 -0600 Subject: [Linux-cluster] GFS2 processes getting stuck in WCHAN=dlm_posix_lock In-Reply-To: <4AEF3AE3.3090401@isye.gatech.edu> References: <4AEB765B.3010408@isye.gatech.edu> <20091102171127.GE613@redhat.com> <4AEF3AE3.3090401@isye.gatech.edu> Message-ID: <20091103190052.GA22675@redhat.com> On Mon, Nov 02, 2009 at 03:02:43PM -0500, Allen Belletti wrote: > I'll collect all of this as soon as I catch the problem in action > again. Do you know how I might go about determine which file is > involved? I can find the user because it's associated with the > particular "imap" process, but haven't been able to figure out what's > being locked. No I don't, I thought finding the user had identified that, guess not. 
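Off the top of my head (untested), /proc/locks might help narrow it down next time it happens: it lists every posix lock, including blocked waiters, with the owning pid and the device:inode, so something along these lines could point at the file (the pid, inode and mount point below are placeholders):

   grep POSIX /proc/locks                 # stuck pids show up on the "->" waiter lines
   ls -l /proc/12345/fd                   # what one of the stuck imap processes has open
   find /your/gfs2/mount -inum 678901     # map an inode number from /proc/locks back to a path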
Dave From tfrumbacher at gmail.com Tue Nov 3 20:03:47 2009 From: tfrumbacher at gmail.com (Aaron Benner) Date: Tue, 3 Nov 2009 13:03:47 -0700 Subject: [Linux-cluster] clusvcadm -U returns "Temporary failure" on vm service Message-ID: <2AB261F0-F6D4-46DA-A7F0-771FB5197DD1@gmail.com> All, I have a problem that I can't find documentation on and has me baffled. I have a 3 node cluster running xen with multiple domU enabled as cluster services. The individual services are set to have a node affinity using resource groups (see cluster.conf below) and live migration is enabled. I had migrated two domU off of one of the cluster nodes in anticipation of a power-cycle and network reconfig. Before bringing up the node that had been reconfigured I froze (clusvcadm -Z ...) the domU in question so that when the newly reconfigured node came up they would not migrate back to their preferred host, or at least that's what I *THOUGHT* -Z would do. I booted up reconfigured node, and ignoring their frozen state the rgmanager on the rebooting node initiated a migration of the domUs. The migration finished and the virtuals resumed operation on the reconfigured host. The problem is now rgmanager is showing those resrouce groups as having state "migrating" (even though there are no migration processes still active) and clusvcadm -U ... returns the following: "Local machine unfreezing vm:SaturnE...Temporary failure; try again" I get this message on all of the cluster nodes. I'm not sure if this is coming from clusvcadm, vm.sh, or some other piece of the cluster puzzle. Is there any way to get rgmanager to realize that these resource groups are no longer migrating and as such can be unfrozen? Is that even my problem? Can I fix this with anything other than a complete power down of the cluster (disaster)? --AB From peter.tiggerdine at uq.edu.au Wed Nov 4 05:33:19 2009 From: peter.tiggerdine at uq.edu.au (Peter Tiggerdine) Date: Wed, 4 Nov 2009 15:33:19 +1000 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 In-Reply-To: <561c252c0911021329k2c7f94cag79fd457c1d6baf87@mail.gmail.com> References: <561c252c0911020609p151c1691t63d5fc079f5765d2@mail.gmail.com><20091102172543.GF613@redhat.com> <561c252c0911021329k2c7f94cag79fd457c1d6baf87@mail.gmail.com> Message-ID: One problem with the below workflow. 7. Your going to need to copy this over manually otherwise it will fail, I've fallen victim of this before. All cluster nodes need to start on the current revision of the file before you update it. I think this is a chicken and egg problem. One of this things I have configured on my clusters is that all clustered services start on it's own runlevel, in my case I have cluster services running on runlevel 3 but default boot to runlelvel 2. This allows a node to boot up and get network before racing into the cluster (ideal for wanting to find out why it got fenced and solving the problem). Everything else will work as I've just done this myself (except 5 nodes). Your downtime should be quite minimal. 
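For what it's worth, the runlevel trick is just chkconfig bookkeeping, roughly like the below (service names are from a stock RHEL/CentOS 5 cluster install, adjust to whatever you actually run):

   # /etc/inittab keeps id:2:initdefault: so a fenced node boots without the cluster
   chkconfig --level 2 cman off
   chkconfig --level 2 clvmd off
   chkconfig --level 2 rgmanager off
   chkconfig --level 3 cman on
   chkconfig --level 3 clvmd on
   chkconfig --level 3 rgmanager on
   # once you have worked out why the node was fenced: telinit 3 to join the cluster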
Regards, Peter Tiggerdine HPC & eResearch Specialist High Performance Computing Group Information Technology Services University of Queensland Phone: +61 7 3346 6634 Fax: +61 7 3346 6630 Email: peter.tiggerdine at uq.edu.au ________________________________ From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Gianluca Cecchi Sent: Tuesday, 3 November 2009 7:29 AM To: David Teigland Cc: linux-cluster at redhat.com Subject: Re: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 On Mon, Nov 2, 2009 at 6:25 PM, David Teigland wrote: The out-of-memory should be fixed in 5.4: https://bugzilla.redhat.com/show_bug.cgi?id=508829 The fix for dlm_send spinning is not released yet: https://bugzilla.redhat.com/show_bug.cgi?id=521093 Dave Thank you so much for the feedback. So I have to expect this freeze and possible downtime...... also if my real nmodes a safer method could be this one below for my two nodes + quorum disk cluster? 1) shutdown and restart in single user mode of the passive node So now the cluster is composed of only one node in 5.3 without loss of service, at the moment 2) start network and update the passive node (as in steps of the first mail) 3) reboot in single user mode of the just updated node, and test correct funcionality (without cluster) 4) shutdown again of the just updated node 5) shutdown of the active node --- NOW we have downtime (planned) 6) startup of the updated node, now in 5.4 (and with 508829 bug corrected) This node should form the cluster with 2 votes, itself and the quorum, correct? 7) IDEA: make a dummy update to the config on this new running node, only incrementing version number by one, so that after, when the other node comes up, it gets the config.... Does it make sense or no need/no problems for this when the second node will join? 8) power on in single user mode of the node still in 5.3 9) start network on it and update system as in steps 2) 10) reboot the just updated node and let it start in single user mode to test its functionality (without cluster enabled) 11) reboot again and let it normally join the cluster Expected result: correct join of the cluster, correct? 12) Test a relocation of the service ----- NOW another little downtime, but to be sure that in case of need we get relocation without problems I'm going to test this tomorrow (here half past ten pm now) after restore of initial situation with both in 5.3, so if there are any comments, they are welcome.. Thanks Gianluca -------------- next part -------------- An HTML attachment was scrubbed... URL: From colin.wang2 at nsn.com Wed Nov 4 07:40:34 2009 From: colin.wang2 at nsn.com (Wang2, Colin (NSN - CN/Cheng Du)) Date: Wed, 04 Nov 2009 15:40:34 +0800 Subject: [Linux-cluster] RHCS not fence 2nd node in 2 nodes cluster Message-ID: <1257320434.8353.28.camel@chn-cdrd-dhcp003182.china.nsn-net.net> Hi Gurus, I am working on setup 2 nodes cluster, and environment is, Hardware, IBM BladeCenter with 2 LS42( AMD Opteron Quad Code 2356 CPU, 16GB Memory). Storage, EMC CX3-20f Storage Switch: Brocade 4GB 20 ports switch in IBM bladecenter. Network Switch: Cisco Switch module in IBM Bladecenter. Software, Redhat EL 5.3 x86_64, 2.6.18-128.el5 Redhat Cluster Suite 5.3. This is 2 nodes cluster, and my problem is that, - When poweroff 1st node with command "halt -fp", 2nd node can fence 1st node and take over services. - When poweroff 2nd node with command "halt -fp", 1st node can't fence 2nd node and can't take over services. 
fence_tool dump contents, ----for successful test dump read: Success 1257305495 our_nodeid 2 our_name 198.18.9.34 1257305495 listen 4 member 5 groupd 7 1257305511 client 3: join default 1257305511 delay post_join 3s post_fail 0s 1257305511 clean start, skipping initial nodes 1257305511 setid default 65538 1257305511 start default 1 members 1 2 1257305511 do_recovery stop 0 start 1 finish 0 1257305511 first complete list empty warning 1257305511 finish default 1 1257305611 stop default 1257305611 start default 3 members 2 1257305611 do_recovery stop 1 start 3 finish 1 1257305611 add node 1 to list 1 1257305611 node "198.18.9.33" not a cman member, cn 1 1257305611 node "198.18.9.33" has not been fenced 1257305611 fencing node 198.18.9.33 1257305615 finish default 3 1257305658 client 3: dump ----For failed test dump read: Success 1257300282 our_nodeid 1 our_name 198.18.9.33 1257300282 listen 4 member 5 groupd 7 1257300297 client 3: join default 1257300297 delay post_join 3s post_fail 0s 1257300297 clean start, skipping initial nodes 1257300297 setid default 65538 1257300297 start default 1 members 1 2 1257300297 do_recovery stop 0 start 1 finish 0 1257300297 first complete list empty warning 1257300297 finish default 1 1257303721 stop default 1257303721 start default 3 members 1 1257303721 do_recovery stop 1 start 3 finish 1 1257303721 add node 2 to list 1 1257303721 averting fence of node 198.18.9.34 1257303721 finish default 3 1257303759 client 3: dump I think it was caused by "averting fence of node 198.18.9.34", but why it advert fence? Could you help me out? Thanks in advance. This cluster.conf for reference. BRs, Colin From maniac.nl at gmail.com Wed Nov 4 08:45:24 2009 From: maniac.nl at gmail.com (Mark Janssen) Date: Wed, 4 Nov 2009 09:45:24 +0100 Subject: [Linux-cluster] Inconsistency between 'gfs_tool df' and 'df' In-Reply-To: <64D0546C5EBBD147B75DE133D798665F03F3EED2@hugo.eprize.local> References: <531e3e4c0911030105q3e9b9d02n346f6dea07dd6164@mail.gmail.com> <64D0546C5EBBD147B75DE133D798665F03F3EED2@hugo.eprize.local> Message-ID: <531e3e4c0911040045i37d160k94964add9d6ca23e@mail.gmail.com> On Tue, Nov 3, 2009 at 4:07 PM, Jeff Sturm wrote: >> gfs_tool df will correctly show the new maximum and free size, however >> normal 'df' still reports the original size, and is reporting a very >> full filesystem. > > Do you use statfs_fast? ?I've been able to refresh "df" output just by toggling it off, and on again, e.g.: > > gfs_tool /gfs statfs_fast 0 > gfs_tool /gfs statfs_fast 1 Thanks... that did the trick. I'm seeing my extra gigabytes now. -- Mark Janssen -- maniac(at)maniac.nl -- pgp: 0x357D2178 | ,''`. | Unix / Linux Open-Source and Internet Consultant @ Snow.nl | : :' : | Maniac.nl MarkJanssen.nl NerdNet.nl Unix.nl | `. `' | Skype: markmjanssen ICQ: 129696007 irc: FooBar on undernet | `- | From jakov.sosic at srce.hr Wed Nov 4 11:15:19 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Wed, 4 Nov 2009 12:15:19 +0100 Subject: [Linux-cluster] RHCS In-Reply-To: <4AF030C6.5010503@bobich.net> References: <4AF030C6.5010503@bobich.net> Message-ID: <20091104121519.2c9fb5b2@pc-jsosic.srce.hr> On Tue, 03 Nov 2009 13:31:50 +0000 Gordan Bobic wrote: > i.e. if you have an interface eth0 on 10.1.1.1/24, and the floating > IP is 10.1.1.2, it'll automatically go to the eth0 interface. Is this > not what is happening in your setup? I think link monitoring should > be on the static, rather than floating IP resource (that's what I do, > and it works for me). 
So you add your static addresses to cluster.conf as resources too? -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From jakov.sosic at srce.hr Wed Nov 4 11:30:57 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Wed, 4 Nov 2009 12:30:57 +0100 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 In-Reply-To: <561c252c0911021329k2c7f94cag79fd457c1d6baf87@mail.gmail.com> References: <561c252c0911020609p151c1691t63d5fc079f5765d2@mail.gmail.com> <20091102172543.GF613@redhat.com> <561c252c0911021329k2c7f94cag79fd457c1d6baf87@mail.gmail.com> Message-ID: <20091104123057.78006dad@pc-jsosic.srce.hr> On Mon, 2 Nov 2009 22:29:15 +0100 Gianluca Cecchi wrote: > I'm going to test this tomorrow (here half past ten pm now) after > restore of initial situation with both in 5.3, so if there are any > comments, they are welcome.. Well I usually do rolling updates, (i relocate the services to other nodes, and update one node, then restart it and see if it joins cluster). I start with test cluster, follow the path with clusters with lower priority and finish with high priority clusters. If I find some problems somewhere down the path, I slow down and debug. So far, no problems at all. (5.1 -> 5.2, 5.2 -> 5.3, 5.3 -> 5.4). Only problem I encountered so far was when CentOS published part of cluster suite from 5.4 into 5.3 repositories... -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From gianluca.cecchi at gmail.com Wed Nov 4 11:57:54 2009 From: gianluca.cecchi at gmail.com (Gianluca Cecchi) Date: Wed, 4 Nov 2009 12:57:54 +0100 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 Message-ID: <561c252c0911040357xdd605a0k700502996814a48f@mail.gmail.com> On Wed, 4 Nov 2009 15:33:19 +1000 Peter Tiggerdine wrote: > 7. Your going to need to copy this over manually otherwise it > will fail, I've fallen victim of this before. All cluster nodes need to start on > the current revision of the file before you update it. I think this is a chicken > and egg problem. In the past I already encountered this situation. And in all cases, the starting node detects its version as not up2date and gets its new config from the other node. My scenario was: node 1 and node 2 up node 2 shutdown change node1 config (I mean here in term of services, probably not valid if inserting a qdiskd section when not available before, or possibly in other cases) power on node2 node 2 gets the new config and apply it (based on availability and correctness of definitions) So I don't think this is correct..... Any one commenting on this? Do you have the messages of the errors when you get this problem? On Wed, 4 Nov 2009 12:30:57 +0100 Jakov Sosic wrote: > Well I usually do rolling updates, (i relocate the services to other > nodes, and update one node, then restart it and see if it joins > cluster). OK. In fact I'm now working on a test cluster, just to get the correct workflow. But you are saying you did this also for 5.3 -> 5.4, while I experienced the oom problem that David documented too, with the entry in bugzilla...... So you joined a just updated 5.4 node to its previous cluster (composed by all 5.3 nodes) and you didn't get any problem at all? 
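Just for reference, the per-node sequence I'm putting into the test script looks more or less like this (package list trimmed, service and node names are placeholders):

   clusvcadm -r my_service -m other-node        # move services off the node first
   yum update cman openais rgmanager gfs2-utils lvm2-cluster
   shutdown -r now
   # after the reboot:
   clustat                                      # node rejoined? services where expected?
   cman_tool status
   cman_tool nodes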
Gianluca -------------- next part -------------- An HTML attachment was scrubbed... URL: From gordan at bobich.net Wed Nov 4 13:28:57 2009 From: gordan at bobich.net (Gordan Bobic) Date: Wed, 04 Nov 2009 13:28:57 +0000 Subject: [Linux-cluster] RHCS In-Reply-To: <20091104121519.2c9fb5b2@pc-jsosic.srce.hr> References: <4AF030C6.5010503@bobich.net> <20091104121519.2c9fb5b2@pc-jsosic.srce.hr> Message-ID: <4AF18199.2050705@bobich.net> Jakov Sosic wrote: > On Tue, 03 Nov 2009 13:31:50 +0000 > Gordan Bobic wrote: > >> i.e. if you have an interface eth0 on 10.1.1.1/24, and the floating >> IP is 10.1.1.2, it'll automatically go to the eth0 interface. Is this >> not what is happening in your setup? I think link monitoring should >> be on the static, rather than floating IP resource (that's what I do, >> and it works for me). > > So you add your static addresses to cluster.conf as resources too? No, the static IPs aren't cluster resources, they should be set the usual way using ifcfg- files or whatever your distro uses. Gordan From mark at nostromo.net Wed Nov 4 16:46:26 2009 From: mark at nostromo.net (Mark Horton) Date: Wed, 4 Nov 2009 14:46:26 -0200 Subject: [Linux-cluster] gfs2 and heartbeat cluster Message-ID: <988e28430911040846i5795d1ctc8507ddf90aff9a3@mail.gmail.com> We have an existing cluster that uses heartbeat. We'd like to add clustered storage and have been looking at gfs2. The issue for us seems to be that we can't directly integrate gfs2 into our current cluster because gfs2 uses a different cluster stack. Is it advisable to try to run both gfs2 along with a typical heartbeat cluster. It seems the problem would be the fencing. Any thoughts or advice welcome. Mark From extmaillist at linuxbox.cz Wed Nov 4 16:50:12 2009 From: extmaillist at linuxbox.cz (Nikola Ciprich) Date: Wed, 4 Nov 2009 17:50:12 +0100 Subject: [Linux-cluster] pacemaker + corosync + clvmd? Message-ID: <20091104165012.GA6157@nik-comp.linuxbox.cz> Hello, I'm trying to integrate clvmd to pacemaker+corosync cluster. Corosync service starts nicely starting pacemaker, but trying to start clvmd with corosync as cluster manager ends up with "Can't initialise cluster interface" error and "Unable to create lockspace for CLVM: Transport endpoint is not connected" appearing in syslog. strace shows that clvmd tries to open /dev/misc/dlm-control, but certainly nothing is connected to it. I wasn't successfull in running cman either (and I'd actually like to avoid that and use only corosync). Has anybody got this combination running? Could somebody advise me? thanks a lot in advance. with best regards nikola ciprich -- ------------------------------------- Nikola CIPRICH LinuxBox.cz, s.r.o. 28. rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis at linuxbox.cz ------------------------------------- From extmaillist at linuxbox.cz Wed Nov 4 16:54:49 2009 From: extmaillist at linuxbox.cz (Nikola Ciprich) Date: Wed, 4 Nov 2009 17:54:49 +0100 Subject: [Linux-cluster] pacemaker + corosync + clvmd? In-Reply-To: <20091104165012.GA6157@nik-comp.linuxbox.cz> References: <20091104165012.GA6157@nik-comp.linuxbox.cz> Message-ID: <20091104165449.GA6201@nik-comp.linuxbox.cz> ouch, I forgot to mention versions: I'm using corosync-1.1.2, pacemaker-1.0.6, lvm2-cluster-2.02.53 and running 2.6.31.5 kernel n. On Wed, Nov 04, 2009 at 05:50:12PM +0100, Nikola Ciprich wrote: > Hello, > I'm trying to integrate clvmd to pacemaker+corosync cluster. 
> Corosync service starts nicely starting pacemaker, but trying to > start clvmd with corosync as cluster manager ends up with > "Can't initialise cluster interface" error > and > "Unable to create lockspace for CLVM: Transport endpoint is not connected" > appearing in syslog. > strace shows that clvmd tries to open /dev/misc/dlm-control, > but certainly nothing is connected to it. I wasn't successfull in running > cman either (and I'd actually like to avoid that and use only corosync). > Has anybody got this combination running? Could somebody advise me? > thanks a lot in advance. > with best regards > nikola ciprich > > > -- > ------------------------------------- > Nikola CIPRICH > LinuxBox.cz, s.r.o. > 28. rijna 168, 709 01 Ostrava > > tel.: +420 596 603 142 > fax: +420 596 621 273 > mobil: +420 777 093 799 > > www.linuxbox.cz > > mobil servis: +420 737 238 656 > email servis: servis at linuxbox.cz > ------------------------------------- > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- ------------------------------------- Nikola CIPRICH LinuxBox.cz, s.r.o. 28. rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis at linuxbox.cz ------------------------------------- From sdake at redhat.com Wed Nov 4 17:00:48 2009 From: sdake at redhat.com (Steven Dake) Date: Wed, 04 Nov 2009 10:00:48 -0700 Subject: [Linux-cluster] pacemaker + corosync + clvmd? In-Reply-To: <20091104165449.GA6201@nik-comp.linuxbox.cz> References: <20091104165012.GA6157@nik-comp.linuxbox.cz> <20091104165449.GA6201@nik-comp.linuxbox.cz> Message-ID: <1257354048.2686.15.camel@localhost.localdomain> lvm2 depends on the cluster infrastrucure (the cluster package). It will alternatively (I believe compile time option) use the distirbuted lock manager in the openais package. Regards -steve On Wed, 2009-11-04 at 17:54 +0100, Nikola Ciprich wrote: > ouch, I forgot to mention versions: > I'm using > corosync-1.1.2, pacemaker-1.0.6, lvm2-cluster-2.02.53 and running > 2.6.31.5 kernel > n. > > > On Wed, Nov 04, 2009 at 05:50:12PM +0100, Nikola Ciprich wrote: > > Hello, > > I'm trying to integrate clvmd to pacemaker+corosync cluster. > > Corosync service starts nicely starting pacemaker, but trying to > > start clvmd with corosync as cluster manager ends up with > > "Can't initialise cluster interface" error > > and > > "Unable to create lockspace for CLVM: Transport endpoint is not connected" > > appearing in syslog. > > strace shows that clvmd tries to open /dev/misc/dlm-control, > > but certainly nothing is connected to it. I wasn't successfull in running > > cman either (and I'd actually like to avoid that and use only corosync). > > Has anybody got this combination running? Could somebody advise me? > > thanks a lot in advance. > > with best regards > > nikola ciprich > > > > > > -- > > ------------------------------------- > > Nikola CIPRICH > > LinuxBox.cz, s.r.o. > > 28. 
rijna 168, 709 01 Ostrava > > > > tel.: +420 596 603 142 > > fax: +420 596 621 273 > > mobil: +420 777 093 799 > > > > www.linuxbox.cz > > > > mobil servis: +420 737 238 656 > > email servis: servis at linuxbox.cz > > ------------------------------------- > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > From extmaillist at linuxbox.cz Wed Nov 4 19:58:58 2009 From: extmaillist at linuxbox.cz (Nikola Ciprich) Date: Wed, 4 Nov 2009 20:58:58 +0100 Subject: [Linux-cluster] pacemaker + corosync + clvmd? In-Reply-To: <1257354048.2686.15.camel@localhost.localdomain> References: <20091104165012.GA6157@nik-comp.linuxbox.cz> <20091104165449.GA6201@nik-comp.linuxbox.cz> <1257354048.2686.15.camel@localhost.localdomain> Message-ID: <20091104195858.GA4671@nik-comp.linuxbox.cz> Hi Steve, and thanks for Your reply! well, while compiling lvm, I can enable cman,corosync or openais locking support. I don't want to use cman, corosync doesn't work for me and to use openais, I'd surely had to have openais working, but it seems to me it also doesn't want to play well together with corosync. So You mean I should get openais working? I'd still prefer corosync, if this is somehow possible, but if it's not, I'll try to find out how to start openais along with corosync. regards nik On Wed, Nov 04, 2009 at 10:00:48AM -0700, Steven Dake wrote: > lvm2 depends on the cluster infrastrucure (the cluster package). It > will alternatively (I believe compile time option) use the distirbuted > lock manager in the openais package. > > Regards > -steve > > On Wed, 2009-11-04 at 17:54 +0100, Nikola Ciprich wrote: > > ouch, I forgot to mention versions: > > I'm using > > corosync-1.1.2, pacemaker-1.0.6, lvm2-cluster-2.02.53 and running > > 2.6.31.5 kernel > > n. > > > > > > On Wed, Nov 04, 2009 at 05:50:12PM +0100, Nikola Ciprich wrote: > > > Hello, > > > I'm trying to integrate clvmd to pacemaker+corosync cluster. > > > Corosync service starts nicely starting pacemaker, but trying to > > > start clvmd with corosync as cluster manager ends up with > > > "Can't initialise cluster interface" error > > > and > > > "Unable to create lockspace for CLVM: Transport endpoint is not connected" > > > appearing in syslog. > > > strace shows that clvmd tries to open /dev/misc/dlm-control, > > > but certainly nothing is connected to it. I wasn't successfull in running > > > cman either (and I'd actually like to avoid that and use only corosync). > > > Has anybody got this combination running? Could somebody advise me? > > > thanks a lot in advance. > > > with best regards > > > nikola ciprich > > > > > > > > > -- > > > ------------------------------------- > > > Nikola CIPRICH > > > LinuxBox.cz, s.r.o. > > > 28. rijna 168, 709 01 Ostrava > > > > > > tel.: +420 596 603 142 > > > fax: +420 596 621 273 > > > mobil: +420 777 093 799 > > > > > > www.linuxbox.cz > > > > > > mobil servis: +420 737 238 656 > > > email servis: servis at linuxbox.cz > > > ------------------------------------- > > > > > > -- > > > Linux-cluster mailing list > > > Linux-cluster at redhat.com > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- ------------------------------------- Nikola CIPRICH LinuxBox.cz, s.r.o. 28. 
rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis at linuxbox.cz ------------------------------------- From sdake at redhat.com Wed Nov 4 20:11:53 2009 From: sdake at redhat.com (Steven Dake) Date: Wed, 04 Nov 2009 13:11:53 -0700 Subject: [Linux-cluster] pacemaker + corosync + clvmd? In-Reply-To: <20091104195858.GA4671@nik-comp.linuxbox.cz> References: <20091104165012.GA6157@nik-comp.linuxbox.cz> <20091104165449.GA6201@nik-comp.linuxbox.cz> <1257354048.2686.15.camel@localhost.localdomain> <20091104195858.GA4671@nik-comp.linuxbox.cz> Message-ID: <1257365513.2674.1.camel@localhost.localdomain> On Wed, 2009-11-04 at 20:58 +0100, Nikola Ciprich wrote: > Hi Steve, > and thanks for Your reply! > well, while compiling lvm, I can enable cman,corosync or openais > locking support. I don't want to use cman, corosync doesn't work I'm not sure what corosync locking support is. Chrissie is the expert here. She is in EU timezone so might have to wait until later to get a full response. > for me and to use openais, I'd surely had to have openais working, > but it seems to me it also doesn't want to play well together with > corosync. So You mean I should get openais working? I'd still prefer > corosync, if this is somehow possible, but if it's not, I'll try to > find out how to start openais along with corosync. > regards > nik > the way you start openais 1.0.0 or later is to run /usr/sbin/aisexec (which starts corosync with the openais service engines). Regards -steve > > On Wed, Nov 04, 2009 at 10:00:48AM -0700, Steven Dake wrote: > > lvm2 depends on the cluster infrastrucure (the cluster package). It > > will alternatively (I believe compile time option) use the distirbuted > > lock manager in the openais package. > > > > Regards > > -steve > > > > On Wed, 2009-11-04 at 17:54 +0100, Nikola Ciprich wrote: > > > ouch, I forgot to mention versions: > > > I'm using > > > corosync-1.1.2, pacemaker-1.0.6, lvm2-cluster-2.02.53 and running > > > 2.6.31.5 kernel > > > n. > > > > > > > > > On Wed, Nov 04, 2009 at 05:50:12PM +0100, Nikola Ciprich wrote: > > > > Hello, > > > > I'm trying to integrate clvmd to pacemaker+corosync cluster. > > > > Corosync service starts nicely starting pacemaker, but trying to > > > > start clvmd with corosync as cluster manager ends up with > > > > "Can't initialise cluster interface" error > > > > and > > > > "Unable to create lockspace for CLVM: Transport endpoint is not connected" > > > > appearing in syslog. > > > > strace shows that clvmd tries to open /dev/misc/dlm-control, > > > > but certainly nothing is connected to it. I wasn't successfull in running > > > > cman either (and I'd actually like to avoid that and use only corosync). > > > > Has anybody got this combination running? Could somebody advise me? > > > > thanks a lot in advance. > > > > with best regards > > > > nikola ciprich > > > > > > > > > > > > -- > > > > ------------------------------------- > > > > Nikola CIPRICH > > > > LinuxBox.cz, s.r.o. > > > > 28. 
rijna 168, 709 01 Ostrava > > > > > > > > tel.: +420 596 603 142 > > > > fax: +420 596 621 273 > > > > mobil: +420 777 093 799 > > > > > > > > www.linuxbox.cz > > > > > > > > mobil servis: +420 737 238 656 > > > > email servis: servis at linuxbox.cz > > > > ------------------------------------- > > > > > > > > -- > > > > Linux-cluster mailing list > > > > Linux-cluster at redhat.com > > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > From gianluca.cecchi at gmail.com Thu Nov 5 01:17:52 2009 From: gianluca.cecchi at gmail.com (Gianluca Cecchi) Date: Thu, 5 Nov 2009 02:17:52 +0100 Subject: [Linux-cluster] ways used for auto-eviction clarifications Message-ID: <561c252c0911041717x74116648j9acedaf2ccfb4ec5@mail.gmail.com> Hello, can anyone summarize the possible events generating a self-eviction of a node for an rhcs cluster? Are these only executed via halt/reboot commands inside OS or also through connection to the self-fence-device? (For example in a HP iLO based fencing, through connection to its own iLO and running the defined command (off-reboot) ) Thanks in advance, Gianluca -------------- next part -------------- An HTML attachment was scrubbed... URL: From ccaulfie at redhat.com Thu Nov 5 08:24:27 2009 From: ccaulfie at redhat.com (Christine Caulfield) Date: Thu, 05 Nov 2009 08:24:27 +0000 Subject: [Linux-cluster] pacemaker + corosync + clvmd? In-Reply-To: <20091104165012.GA6157@nik-comp.linuxbox.cz> References: <20091104165012.GA6157@nik-comp.linuxbox.cz> Message-ID: <4AF28BBB.2020604@redhat.com> On 04/11/09 16:50, Nikola Ciprich wrote: > Hello, > I'm trying to integrate clvmd to pacemaker+corosync cluster. > Corosync service starts nicely starting pacemaker, but trying to > start clvmd with corosync as cluster manager ends up with > "Can't initialise cluster interface" error > and > "Unable to create lockspace for CLVM: Transport endpoint is not connected" > appearing in syslog. > strace shows that clvmd tries to open /dev/misc/dlm-control, > but certainly nothing is connected to it. I wasn't successfull in running > cman either (and I'd actually like to avoid that and use only corosync). > Has anybody got this combination running? Could somebody advise me? > thanks a lot in advance. Hiya It sounds like you haven't started the DLM. the 'corosync' module for clvmd uses corosync for communications and the kernel DLM for locking. If you don't want to use DLM, then you need to load the openais modules for corosync and choose openais as the module for clvmd. It will then use Lck as the lock manager. Chrissie From extmaillist at linuxbox.cz Thu Nov 5 08:35:33 2009 From: extmaillist at linuxbox.cz (Nikola Ciprich) Date: Thu, 5 Nov 2009 09:35:33 +0100 Subject: [Linux-cluster] pacemaker + corosync + clvmd? In-Reply-To: <4AF28BBB.2020604@redhat.com> References: <20091104165012.GA6157@nik-comp.linuxbox.cz> <4AF28BBB.2020604@redhat.com> Message-ID: <20091105083532.GA27302@develbox.linuxbox.cz> Hello Christine, well, I don't have problem with using DLM + corosync, but how do I do that? I haven't found any documentation, and just loading dlm module prior to starting corosync didn't help... Is there some special configuration needed for corosync to start dlm? thanks a lot for Your reply! cheers nik > > Hiya > > It sounds like you haven't started the DLM. 
the 'corosync' module for > clvmd uses corosync for communications and the kernel DLM for locking. > > If you don't want to use DLM, then you need to load the openais modules > for corosync and choose openais as the module for clvmd. It will then > use Lck as the lock manager. > > Chrissie > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- ------------------------------------- Nikola CIPRICH LinuxBox.cz, s.r.o. 28. rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis at linuxbox.cz ------------------------------------- From andrew at beekhof.net Thu Nov 5 08:40:54 2009 From: andrew at beekhof.net (Andrew Beekhof) Date: Thu, 5 Nov 2009 09:40:54 +0100 Subject: [Linux-cluster] pacemaker + corosync + clvmd? In-Reply-To: <20091105083532.GA27302@develbox.linuxbox.cz> References: <20091104165012.GA6157@nik-comp.linuxbox.cz> <4AF28BBB.2020604@redhat.com> <20091105083532.GA27302@develbox.linuxbox.cz> Message-ID: On Thu, Nov 5, 2009 at 9:35 AM, Nikola Ciprich wrote: > Hello Christine, > well, I don't have problem with using DLM + corosync, but how do I > do that? I haven't found any documentation, and just loading dlm > module prior to starting corosync didn't help... > Is there some special configuration needed for corosync > to start dlm? You need to add it as a resource. You can see a worked example of this in the "Clusters from Scratch - Apache on Fedora11" document on http://www.clusterlabs.org/wiki/Documentation > thanks a lot for Your reply! > cheers > nik > >> >> Hiya >> >> It sounds like you haven't started the DLM. the 'corosync' module for >> clvmd uses corosync for communications and the kernel DLM for locking. >> >> If you don't want to use DLM, then you need to load the openais modules >> for corosync and choose openais as the module for clvmd. It will then >> use Lck as the lock manager. >> >> Chrissie >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > > -- > ------------------------------------- > Nikola CIPRICH > LinuxBox.cz, s.r.o. > 28. rijna 168, 709 01 Ostrava > > tel.: ? +420 596 603 142 > fax: ? ?+420 596 621 273 > mobil: ?+420 777 093 799 > www.linuxbox.cz > > mobil servis: +420 737 238 656 > email servis: servis at linuxbox.cz > ------------------------------------- > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From andrew at beekhof.net Thu Nov 5 08:46:02 2009 From: andrew at beekhof.net (Andrew Beekhof) Date: Thu, 5 Nov 2009 09:46:02 +0100 Subject: [Linux-cluster] gfs2 and heartbeat cluster In-Reply-To: <988e28430911040846i5795d1ctc8507ddf90aff9a3@mail.gmail.com> References: <988e28430911040846i5795d1ctc8507ddf90aff9a3@mail.gmail.com> Message-ID: Why not run the heartbeat resource manager (pacemaker) on top of corosync? Best of both worlds :-) On Wed, Nov 4, 2009 at 5:46 PM, Mark Horton wrote: > We have an existing cluster that uses heartbeat. ?We'd like to add > clustered storage and have been looking at gfs2. ?The issue for us > seems to be that we can't directly integrate gfs2 into our current > cluster because gfs2 uses a different cluster stack. > > Is it advisable to try to run both gfs2 along with a typical heartbeat > cluster. ?It seems the problem would be the fencing. > > Any thoughts or advice welcome. 
> > Mark > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From alain.richard at equation.fr Thu Nov 5 09:12:11 2009 From: alain.richard at equation.fr (Alain RICHARD) Date: Thu, 5 Nov 2009 10:12:11 +0100 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 In-Reply-To: <1257270978.2483.17.camel@localhost> References: <1257188738.2496.71.camel@localhost.localdomain> <91816B41-6F16-429D-B125-BB826F9441F0@equation.fr> <1257254404.2483.9.camel@localhost> <1257270978.2483.17.camel@localhost> Message-ID: <26C393E2-EBD5-4C14-9A76-BE6E0BF713B3@equation.fr> Le 3 nov. 2009 ? 18:56, Lon Hohberger a ?crit : > Federico kicked be in to finishing this... Want to give it a shot? > Effectively, it makes 'path' work with virsh - making virsh mode > upgrade > compatible with 'xm' mode. This means that we can really (actually) > deprecate 'xm' support. > > Pay no attention to the fact that it uses a global var called > 'xmlfile'; > it doesn't do any config file parsing, so it still works with Xen and > libvirt (xml) style config files: > > http://git.fedorahosted.org/git/?p=resource-agents.git;a=commit;h=1e42add4f2ec63d4c796d544118dfcccbd8a042b > > It just splits up a OCF_RESKEY_path on ':' and looks for a file in > each > directory named NAME or NAME.xml. The requirement is obviously that > your config files need to be named after your VM. If this is a > problem, > you must use the full 'xmlfile' option. > > -- Lon > I am currently testing this version of vm.sh that is handling xmlfile and path differently : - if use_virth=0, use xm as before - if path and xmlfile, ignore path and issue a warning (you should use xmlfile) - if xmlfile only, use it - if path, search for a file name under path or name.xml under path and set xmfile to this file in the case of virsh, creation is handled using 'virsh create xmlfile' if xmlfile is not empty, or 'virsh create name' if there is no xmlfile/ path configured. The effect of this is that the config file must be : - an xml file, with or without .xml extension, if xmlfile or path attribute is set - else, a classic xen config file under /etc/xen In order to stay compatible with current rgmanager configuration, we must ensure that use_virsh is set to 0 for vm that use classical xen conf files and path directive, else the vm fails to lauch because virsh create is not able to handle xen config file and virsh start, that is able to handle xen conf files, is not able to get the file from an other location than /etc/xen. An other point is that if libvirtd is not running, the status returned by this vm.sh for a vm is always "indeterminate", so I have to launch it and to disable the default libvirt network because I really don't need it. 
The last problem so far is that clusvcadm -M always end-up with an error although the migration is working correctly : #clusvcadm -M vm:ns2 -m titan2 Trying to migrate vm:ns2 to titan2...Failure # clustat Cluster Status for titan-cluster @ Thu Nov 5 10:07:24 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ titan1 1 Online, rgmanager titan2 2 Online, Local, rgmanager qdisk1 0 Online, Quorum Disk Service Name Owner (Last) State ------- ---- ----- ------ ----- vm:ns2 titan2 started Looking at rgmanager log, I found this : Nov 5 10:05:19 titan1 clurgmgrd[11231]: Migrating vm:ns2 to titan2 Nov 5 10:05:19 titan1 clurgmgrd: [11231]: Using /data/vmconf/ ns2.xml Nov 5 10:05:19 titan1 clurgmgrd: [11231]: virsh migrate -- live ns2 xen:/// xenmigr://titan2/ Nov 5 10:05:27 titan1 clurgmgrd: [11231]: Migrate ns2 to titan2 failed: Nov 5 10:05:27 titan1 clurgmgrd: [11231]: error: Domain not found: xenUnifiedDomainLookupByName Nov 5 10:05:27 titan1 clurgmgrd[11231]: migrate on vm "ns2" returned 7 (unspecified) Nov 5 10:05:27 titan1 clurgmgrd[11231]: Migration of vm:ns2 to titan2 failed; return code 7 Nov 5 10:05:19 titan2 clurgmgrd[7484]: FW: Forwarding migrate request to 1 Nov 5 10:05:24 titan2 clurgmgrd: [7484]: Using /data/vmconf/ ns2.xml Nov 5 10:05:25 titan2 clurgmgrd[7484]: vm:ns2 is now running locally Nov 5 10:05:34 titan2 clurgmgrd: [7484]: Using /data/vmconf/ ns2.xml This error is generated by libvirtd : # virsh migrate --live ns2 xen:/// xenmigr://titan1/ error: Domain not found: xenUnifiedDomainLookupByName # echo $? 1 Is it normal ? My initial goal to use virsh instead of xm was for better error handling during migrations... Regards, -- Alain RICHARD EQUATION SA Tel : +33 477 79 48 00 Fax : +33 477 79 48 01 E-Liance, Op?rateur des entreprises et collectivit?s, Liaisons Fibre optique, SDSL et ADSL -------------- next part -------------- An HTML attachment was scrubbed... URL: From extmaillist at linuxbox.cz Thu Nov 5 09:17:44 2009 From: extmaillist at linuxbox.cz (Nikola Ciprich) Date: Thu, 5 Nov 2009 10:17:44 +0100 Subject: [Linux-cluster] pacemaker + corosync + clvmd? In-Reply-To: References: <20091104165012.GA6157@nik-comp.linuxbox.cz> <4AF28BBB.2020604@redhat.com> <20091105083532.GA27302@develbox.linuxbox.cz> Message-ID: <20091105091744.GB27302@develbox.linuxbox.cz> Thanks Andrew, that did the job! I haven't noticed there's controld as resource in pacemaker. now it works, so I can start testing if it's more reliable then old clvmd I've been using till now. Thanks to all of You for help once more have a nice day! nik On Thu, Nov 05, 2009 at 09:40:54AM +0100, Andrew Beekhof wrote: > On Thu, Nov 5, 2009 at 9:35 AM, Nikola Ciprich wrote: > > Hello Christine, > > well, I don't have problem with using DLM + corosync, but how do I > > do that? I haven't found any documentation, and just loading dlm > > module prior to starting corosync didn't help... > > Is there some special configuration needed for corosync > > to start dlm? > > You need to add it as a resource. > You can see a worked example of this in the "Clusters from Scratch - > Apache on Fedora11" document on > http://www.clusterlabs.org/wiki/Documentation > > > > > thanks a lot for Your reply! > > cheers > > nik > > > >> > >> Hiya > >> > >> It sounds like you haven't started the DLM. the 'corosync' module for > >> clvmd uses corosync for communications and the kernel DLM for locking. 
> >> > >> If you don't want to use DLM, then you need to load the openais modules > >> for corosync and choose openais as the module for clvmd. It will then > >> use Lck as the lock manager. > >> > >> Chrissie > >> > >> -- > >> Linux-cluster mailing list > >> Linux-cluster at redhat.com > >> https://www.redhat.com/mailman/listinfo/linux-cluster > >> > > > > -- > > ------------------------------------- > > Nikola CIPRICH > > LinuxBox.cz, s.r.o. > > 28. rijna 168, 709 01 Ostrava > > > > tel.: ? +420 596 603 142 > > fax: ? ?+420 596 621 273 > > mobil: ?+420 777 093 799 > > www.linuxbox.cz > > > > mobil servis: +420 737 238 656 > > email servis: servis at linuxbox.cz > > ------------------------------------- > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- ------------------------------------- Nikola CIPRICH LinuxBox.cz, s.r.o. 28. rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis at linuxbox.cz ------------------------------------- From kamat12 at gmail.com Thu Nov 5 09:24:08 2009 From: kamat12 at gmail.com (Prashant Kamat) Date: Thu, 5 Nov 2009 14:54:08 +0530 Subject: [Linux-cluster] Linux Cluster Message-ID: <60fbb01c0911050124y19da539jfa2e351bfa6bd51a@mail.gmail.com> How to configure Linux cluster on VMWare Workstation. Rgds Prashant Please consider the environment before printing this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gianluca.cecchi at gmail.com Thu Nov 5 09:38:34 2009 From: gianluca.cecchi at gmail.com (Gianluca Cecchi) Date: Thu, 5 Nov 2009 10:38:34 +0100 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 In-Reply-To: <561c252c0911040357xdd605a0k700502996814a48f@mail.gmail.com> References: <561c252c0911040357xdd605a0k700502996814a48f@mail.gmail.com> Message-ID: <561c252c0911050138x438036cbm4c058c3c162297f7@mail.gmail.com> On Wed, Nov 4, 2009 at 12:57 PM, Gianluca Cecchi wrote: > On Wed, 4 Nov 2009 15:33:19 +1000 Peter Tiggerdine wrote: > > 7. Your going to need to copy this over manually otherwise it > > will fail, I've fallen victim of this before. All cluster nodes need to start on > > > the current revision of the file before you update it. I think this is a chicken > > and egg problem. > > In the past I already encountered this situation. And in all cases, the starting node detects its version as not up2date and gets its new config from the other node. > > My scenario was: > node 1 and node 2 up > node 2 shutdown > change node1 config (I mean here in term of services, probably not valid if inserting a qdiskd section when not available before, or possibly in other cases) > > power on node2 > node 2 gets the new config and apply it (based on availability and correctness of definitions) > > So I don't think this is correct..... > Any one commenting on this? > Do you have the messages of the errors when you get this problem? > > On Wed, 4 Nov 2009 12:30:57 +0100 Jakov Sosic wrote: > > Well I usually do rolling updates, (i relocate the services to other > > nodes, and update one node, then restart it and see if it joins > > cluster). > > OK. In fact I'm now working on a test cluster, just to get the correct > workflow. 
> But you are saying you did this also for 5.3 -> 5.4, while I experienced > the oom problem that David documented too, with the entry in bugzilla...... > So you joined a just updated 5.4 node to its previous cluster (composed by > all 5.3 nodes) and you didn't get any problem at all? > > Gianluca > OK. All went well in my virtual environment. More, in step 7, I created a new ip service and updated my config into the first updated node, enabling it while the second node, still in 5.3, was down: This below the diff with the pre-5.4 < --- > 38,41d37 < < < < 46d41 < 62,64d56 < < < When the second node in step 11) joins the cluster, it indeed gets the updated config and all goes well. I also successfully relocated this new server from former node to the other one. No oom with this approach as written by David. thanks two other things: 1) I see these messages about quorum inside the first node, that didn't came during the previous days in 5.3 env Nov 5 08:00:14 mork clurgmgrd: [2692]: Getting status Nov 5 08:27:08 mork qdiskd[2206]: qdiskd: read (system call) has hung for 40 seconds Nov 5 08:27:08 mork qdiskd[2206]: In 40 more seconds, we will be evicted Nov 5 09:00:15 mork clurgmgrd: [2692]: Getting status Nov 5 09:00:15 mork clurgmgrd: [2692]: Getting status Nov 5 09:48:23 mork qdiskd[2206]: qdiskd: read (system call) has hung for 40 seconds Nov 5 09:48:23 mork qdiskd[2206]: In 40 more seconds, we will be evicted Nov 5 10:00:15 mork clurgmgrd: [2692]: Getting status Nov 5 10:00:15 mork clurgmgrd: [2692]: Getting status Any timings changed between releases? My relevant lines about timings in cluster.conf were in 5.3 and remained so in 5.4: (tko very big in heuristic because I was testing best and safer way to do on-the-fly changes to heuristic, due to network maintenance activity causing gw disappear for some time, not predictable by the net-guys...) I don't know if this message is deriving from a problem with latencies in my virtual env or not.... On the host side I don't see any message with dmesg command or in /var/log/messages..... 2) saw that a new kernel just released...... ;-( Hints about possible interferences with cluster infra? Gianluca -------------- next part -------------- An HTML attachment was scrubbed... URL: From jakov.sosic at srce.hr Thu Nov 5 10:18:03 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Thu, 5 Nov 2009 11:18:03 +0100 Subject: [Linux-cluster] RHCS In-Reply-To: <4AF18199.2050705@bobich.net> References: <4AF030C6.5010503@bobich.net> <20091104121519.2c9fb5b2@pc-jsosic.srce.hr> <4AF18199.2050705@bobich.net> Message-ID: <20091105111803.6d05acab@pc-jsosic.srce.hr> On Wed, 04 Nov 2009 13:28:57 +0000 Gordan Bobic wrote: > > So you add your static addresses to cluster.conf as resources too? > > No, the static IPs aren't cluster resources, they should be set the > usual way using ifcfg- files or whatever your distro uses. So how do you monitor link then? 
:) -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From jakov.sosic at srce.hr Thu Nov 5 10:21:36 2009 From: jakov.sosic at srce.hr (Jakov Sosic) Date: Thu, 5 Nov 2009 11:21:36 +0100 Subject: [Linux-cluster] RHCS In-Reply-To: <4AF18199.2050705@bobich.net> References: <4AF030C6.5010503@bobich.net> <20091104121519.2c9fb5b2@pc-jsosic.srce.hr> <4AF18199.2050705@bobich.net> Message-ID: <20091105112136.11d78a22@pc-jsosic.srce.hr> On Wed, 04 Nov 2009 13:28:57 +0000 Gordan Bobic wrote: > No, the static IPs aren't cluster resources, they should be set the > usual way using ifcfg- files or whatever your distro uses. I know, but I did not understand the following: > I think link monitoring should be on the static, rather than floating > IP resource (that's what I do, and it works for me). Could you explain it a bit? -- | Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D | ================================================================= | start fighting cancer -> http://www.worldcommunitygrid.org/ | From gianluca.cecchi at gmail.com Thu Nov 5 10:32:22 2009 From: gianluca.cecchi at gmail.com (Gianluca Cecchi) Date: Thu, 5 Nov 2009 11:32:22 +0100 Subject: [Linux-cluster] share experience migrating cluster suite from centos 5.3 to centos 5.4 In-Reply-To: <561c252c0911050138x438036cbm4c058c3c162297f7@mail.gmail.com> References: <561c252c0911040357xdd605a0k700502996814a48f@mail.gmail.com> <561c252c0911050138x438036cbm4c058c3c162297f7@mail.gmail.com> Message-ID: <561c252c0911050232i5093cad7o82936d19aecc8986@mail.gmail.com> On Thu, Nov 5, 2009 at 10:38 AM, Gianluca Cecchi wrote: > [snip] > two other things: > 1) I see these messages about quorum inside the first node, that didn't > came during the previous days in 5.3 env > Nov 5 08:00:14 mork clurgmgrd: [2692]: Getting status > Nov 5 08:27:08 mork qdiskd[2206]: qdiskd: read (system call) has > hung for 40 seconds > Nov 5 08:27:08 mork qdiskd[2206]: In 40 more seconds, we will be > evicted > Nov 5 09:00:15 mork clurgmgrd: [2692]: Getting status > Nov 5 09:00:15 mork clurgmgrd: [2692]: Getting status > Nov 5 09:48:23 mork qdiskd[2206]: qdiskd: read (system call) has > hung for 40 seconds > Nov 5 09:48:23 mork qdiskd[2206]: In 40 more seconds, we will be > evicted > Nov 5 10:00:15 mork clurgmgrd: [2692]: Getting status > Nov 5 10:00:15 mork clurgmgrd: [2692]: Getting status > > Any timings changed between releases? > My relevant lines about timings in cluster.conf were in 5.3 and remained so > in 5.4: > > > > > post_join_delay="20"/> > > log_facility="local4" log_level="7" tko="16" votes="1"> > > > > (tko very big in heuristic because I was testing best and safer way to do > on-the-fly changes to heuristic, due to network maintenance activity causing > gw disappear for some time, not predictable by the net-guys...) > > I don't know if this message is deriving from a problem with latencies in > my virtual env or not.... > On the host side I don't see any message with dmesg command or in > /var/log/messages..... > > 2) saw that a new kernel just released...... ;-( > Hints about possible interferences with cluster infra? > > Gianluca > > > Probably 1) is due to this bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=500450 that found its solution released in RHSA-2009-1341 advisory with cman-2.0.115-1.el5.x86_64.rpm. And coming from 2.0.98 this is reasonable. 
In my case tko=16 and interval=5, so that max time tolerance is about 80 seconds that is the 40+40 seconds I see inside the messages.... -------------- next part -------------- An HTML attachment was scrubbed... URL: From gordan at bobich.net Thu Nov 5 13:04:25 2009 From: gordan at bobich.net (Gordan Bobic) Date: Thu, 05 Nov 2009 13:04:25 +0000 Subject: [Linux-cluster] RHCS In-Reply-To: <20091105111803.6d05acab@pc-jsosic.srce.hr> References: <4AF030C6.5010503@bobich.net> <20091104121519.2c9fb5b2@pc-jsosic.srce.hr> <4AF18199.2050705@bobich.net> <20091105111803.6d05acab@pc-jsosic.srce.hr> Message-ID: <4AF2CD59.6040501@bobich.net> Jakov Sosic wrote: > On Wed, 04 Nov 2009 13:28:57 +0000 > Gordan Bobic wrote: > >>> So you add your static addresses to cluster.conf as resources too? >> No, the static IPs aren't cluster resources, they should be set the >> usual way using ifcfg- files or whatever your distro uses. > > So how do you monitor link then? :) You monitor it on one of the virtual IP resources that will be attached onto the same physical interface (i.e. on the same subnet). Gordan From gordan at bobich.net Thu Nov 5 13:05:13 2009 From: gordan at bobich.net (Gordan Bobic) Date: Thu, 05 Nov 2009 13:05:13 +0000 Subject: [Linux-cluster] RHCS In-Reply-To: <20091105112136.11d78a22@pc-jsosic.srce.hr> References: <4AF030C6.5010503@bobich.net> <20091104121519.2c9fb5b2@pc-jsosic.srce.hr> <4AF18199.2050705@bobich.net> <20091105112136.11d78a22@pc-jsosic.srce.hr> Message-ID: <4AF2CD89.80700@bobich.net> Jakov Sosic wrote: > On Wed, 04 Nov 2009 13:28:57 +0000 > Gordan Bobic wrote: > >> No, the static IPs aren't cluster resources, they should be set the >> usual way using ifcfg- files or whatever your distro uses. > > I know, but I did not understand the following: > >> I think link monitoring should be on the static, rather than floating >> IP resource (that's what I do, and it works for me). > > Could you explain it a bit? That was just me being wrong. Disregard. :) Gordan From fdinitto at redhat.com Thu Nov 5 14:15:17 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Thu, 05 Nov 2009 15:15:17 +0100 Subject: [Linux-cluster] Cluster Configuration Docs (STABLE3) Message-ID: <4AF2DDF5.901@redhat.com> Hi all, as previously mentioned in other emails to the list: http://fabbione.fedorapeople.org/CCS.pdf contains an updated schema on how the new configuration system works in details and compares to the old version in STABLE2/RHEL5. Please don?t hesitate to ask any questions. Fabio From linux at alteeve.com Thu Nov 5 14:17:44 2009 From: linux at alteeve.com (Madison Kelly) Date: Thu, 05 Nov 2009 09:17:44 -0500 Subject: [Linux-cluster] Linux Cluster In-Reply-To: <60fbb01c0911050124y19da539jfa2e351bfa6bd51a@mail.gmail.com> References: <60fbb01c0911050124y19da539jfa2e351bfa6bd51a@mail.gmail.com> Message-ID: <4AF2DE88.8050107@alteeve.com> Prashant Kamat wrote: > How to configure Linux cluster on VMWare Workstation. > > > Rgds > Prashant Hi Prashant, Clustering is a fairly complex topic, and your question is quite vague. If I can suggest two things; First; Have you decided what OS the cluster will be built on? What tools do you want to use? What problems are you trying to solve with a cluster? How familiar with various clustering technologies are you? Second; It's very difficult for someone to give you a "cluster recipe". You will have the best results asking specific questions to specific problems as you run into them. 
We've all struggled with clustering and we've all asked a lot of questions along the way. Be sure when asking questions that you provide as much detail as possible describing what you've already tried for the given, specific problem you are having. Best of luck. :) Madi
From nehemiasjahcob at gmail.com Thu Nov 5 14:19:42 2009 From: nehemiasjahcob at gmail.com (Nehemias Jahcob) Date: Thu, 5 Nov 2009 11:19:42 -0300 Subject: [Linux-cluster] Cluster Configuration Docs (STABLE3) In-Reply-To: <4AF2DDF5.901@redhat.com> References: <4AF2DDF5.901@redhat.com> Message-ID: <5f61ab380911050619g51607c75uaacbfac0593cfbe9@mail.gmail.com> Fabio "Forbidden You don't have permission to access /CCS.pdf on this server." Thanks.. NJ 2009/11/5 Fabio M. Di Nitto > Hi all, > > as previously mentioned in other emails to the list: > > http://fabbione.fedorapeople.org/CCS.pdf > > contains an updated schema on how the new configuration system works in > details and compares to the old version in STABLE2/RHEL5. > > Please don't hesitate to ask any questions. > > Fabio > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From gianluca.cecchi at gmail.com Thu Nov 5 14:28:07 2009 From: gianluca.cecchi at gmail.com (Gianluca Cecchi) Date: Thu, 5 Nov 2009 15:28:07 +0100 Subject: [Linux-cluster] qdiskd master election and loss of quorum Message-ID: <561c252c0911050628x6189153eob12e270da1a694a1@mail.gmail.com> On Tue, 03 Nov 2009 08:15:05 -0500 Lon Hohberger wrote: > Though it's a bit odd that stopping node 1 causes a loss of quorum on > node2. :( I'm experiencing the same behaviour with a cluster composed of two nodes running CentOS 5.4: openais-0.80.6-8.el5_4.1 cman-2.0.115-1.el5_4.3 rgmanager-2.0.52-1.el5.centos.2 Here are the relevant lines in cluster.conf, and below is the simulated scenario: [root at mork ~]# egrep "totem|quorum" /etc/cluster/cluster.conf The white paper referred to by Alain, apart from the multipath considerations he already wrote about, says only that quorum_dev_poll must be less than the totem token, and that quorum_dev_poll should be configured to be greater than the multipath failover value (but here we don't have multipath...)
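For illustration only, with invented placeholder values rather than this cluster's actual settings, the lines in question normally look something like:

    <totem token="100000"/>
    <cman expected_votes="3" quorum_dev_poll="80000"/>
    <quorumd interval="5" tko="16" votes="1" label="qdisk">
        <heuristic program="ping -c1 -w1 192.168.122.1" score="1" interval="2" tko="20"/>
    </quorumd>

Here quorum_dev_poll (in milliseconds) stays below the totem token, while interval and tko together set how long qdiskd waits before evicting a node (5 s x 16 = 80 s in this example).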
- mork is the second node and has no services active and its quorum is not master at this moment: logs on mork [root at mork ~]# tail -f /var/log/messages Nov 5 12:35:41 mork ricci: startup succeeded Nov 5 12:35:42 mork clurgmgrd: [2633]: node2 owns vg_cl1/lv_cl1 unable to stop Nov 5 12:35:42 mork clurgmgrd[2633]: stop on lvm "CL1" returned 1 (generic error) Nov 5 12:35:42 mork clurgmgrd: [2633]: node2 owns vg_cl2/lv_cl2 unable to stop Nov 5 12:35:42 mork clurgmgrd[2633]: stop on lvm "CL2" returned 1 (generic error) Nov 5 12:36:02 mork qdiskd[2214]: Node 2 is the master Nov 5 12:36:52 mork qdiskd[2214]: Initial score 1/1 Nov 5 12:36:52 mork qdiskd[2214]: Initialization complete Nov 5 12:36:52 mork openais[2185]: [CMAN ] quorum device registered Nov 5 12:36:52 mork qdiskd[2214]: Score sufficient for master operation (1/1; required=1); upgrading - shutdown of the other rnode (mindy) that has in charge three services (note that mindy shutdowns cleanly) logs on mork Nov 5 12:52:53 mork clurgmgrd[2633]: Member 2 shutting down Nov 5 12:52:57 mork qdiskd[2214]: Node 2 shutdown Nov 5 12:52:58 mork clurgmgrd[2633]: Starting stopped service service:MM1SRV Nov 5 12:52:58 mork clurgmgrd[2633]: Starting stopped service service:MM2SRV Nov 5 12:52:58 mork clurgmgrd[2633]: Starting stopped service service:MM3SRV Nov 5 12:52:58 mork clurgmgrd: [2633]: Activating vg_cl1/lv_cl1 Nov 5 12:52:58 mork clurgmgrd: [2633]: Making resilient : lvchange -ay vg_cl1/lv_cl1 Nov 5 12:52:59 mork clurgmgrd: [2633]: Activating vg_cl2/lv_cl2 Nov 5 12:52:59 mork clurgmgrd: [2633]: Resilient command: lvchange -ay vg_cl1/lv_cl1 --config devices{filter=["a|/dev/hda2|","a|/dev/hdb1|","a|/dev/sdb1|","a|/dev/sdc1|","r|.*|"]} Nov 5 12:52:59 mork clurgmgrd: [2633]: Making resilient : lvchange -ay vg_cl2/lv_cl2 Nov 5 12:52:59 mork clurgmgrd: [2633]: Resilient command: lvchange -ay vg_cl2/lv_cl2 --config devices{filter=["a|/dev/hda2|","a|/dev/hdb1|","a|/dev/sdb1|","a|/dev/sdc1|","r|.*|"]} Nov 5 12:52:59 mork kernel: kjournald starting. Commit interval 5 seconds Nov 5 12:52:59 mork kernel: EXT3 FS on dm-3, internal journal Nov 5 12:52:59 mork kernel: EXT3-fs: mounted filesystem with ordered data mode. Nov 5 12:52:59 mork kernel: kjournald starting. Commit interval 5 seconds Nov 5 12:52:59 mork kernel: EXT3 FS on dm-4, internal journal Nov 5 12:52:59 mork kernel: EXT3-fs: mounted filesystem with ordered data mode. 
Nov 5 12:53:15 mork clurgmgrd[2633]: #75: Failed changing service status Nov 5 12:53:30 mork clurgmgrd[2633]: #75: Failed changing service status Nov 5 12:53:30 mork clurgmgrd[2633]: Stopping service service:MM3SRV Nov 5 12:53:32 mork qdiskd[2214]: Assuming master role Nov 5 12:53:45 mork clurgmgrd[2633]: #52: Failed changing RG status Nov 5 12:53:45 mork clurgmgrd[2633]: #13: Service service:MM3SRV failed to stop cleanly - clustat run several times on mork during this phase (note the timeout messages) [root at mork ~]# clustat Timed out waiting for a response from Resource Group Manager Cluster Status for clumm @ Thu Nov 5 12:54:08 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ node1 1 Online, Local node2 2 Offline /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 0 Online, Quorum Disk [root at mork ~]# clustat Service states unavailable: Temporary failure; try again Cluster Status for clumm @ Thu Nov 5 12:54:14 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ node1 1 Online, Local node2 2 Offline /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 0 Online, Quorum Disk [root at mork ~]# clustat Service states unavailable: Temporary failure; try again Cluster Status for clumm @ Thu Nov 5 12:54:15 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ node1 1 Online, Local node2 2 Offline /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 0 Online, Quorum Disk [root at mork ~]# clustat Timed out waiting for a response from Resource Group Manager Cluster Status for clumm @ Thu Nov 5 12:54:46 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ node1 1 Online, Local node2 2 Offline /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 0 Online, Quorum Disk - service manager is running [root at mork ~]# service rgmanager status clurgmgrd (pid 2632) is running... 
- cman_tool command outputs [root at mork ~]# cman_tool services type level name id state fence 0 default 00010001 none [1] dlm 1 rgmanager 00020001 none [1] [root at mork ~]# cman_tool nodes Node Sts Inc Joined Name 0 M 0 2009-11-05 12:36:52 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 1 M 52 2009-11-05 12:35:30 node1 2 X 56 node2 [root at mork ~]# cman_tool status Version: 6.2.0 Config Version: 7 Cluster Name: clumm Cluster Id: 3243 Cluster Member: Yes Cluster Generation: 56 Membership state: Cluster-Member Nodes: 2 Expected votes: 3 Quorum device votes: 1 Total votes: 2 Quorum: 2 Active subsystems: 9 Flags: Dirty Ports Bound: 0 177 Node name: node1 Node ID: 1 Multicast addresses: 239.192.12.183 Node addresses: 172.16.0.11 - now clustat gives output but the services remain in starting and never go to "started" [root at mork ~]# clustat Cluster Status for clumm @ Thu Nov 5 12:55:16 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ node1 1 Online, Local, rgmanager node2 2 Offline /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 0 Online, Quorum Disk Service Name Owner (Last) State ------- ---- ----- ------ ----- service:MM1SRV node1 starting service:MM2SRV node1 starting service:MM3SRV node1 starting - latest entries in messages [root at mork ~]# tail -f /var/log/messages Nov 5 12:53:45 mork clurgmgrd[2633]: #13: Service service:MM3SRV failed to stop cleanly Nov 5 12:54:00 mork clurgmgrd[2633]: #75: Failed changing service status Nov 5 12:54:15 mork clurgmgrd[2633]: #57: Failed changing RG status Nov 5 12:54:15 mork clurgmgrd[2633]: Stopping service service:MM1SRV Nov 5 12:54:30 mork clurgmgrd[2633]: Stopping service service:MM2SRV Nov 5 12:54:30 mork clurgmgrd[2633]: #52: Failed changing RG status Nov 5 12:54:30 mork clurgmgrd[2633]: #13: Service service:MM1SRV failed to stop cleanly Nov 5 12:54:45 mork clurgmgrd[2633]: #52: Failed changing RG status Nov 5 12:54:45 mork clurgmgrd[2633]: #13: Service service:MM2SRV failed to stop cleanly Nov 5 12:55:00 mork clurgmgrd[2633]: #57: Failed changing RG status - new entries in messages [root at mork ~]# tail -f /var/log/messages Nov 5 12:54:30 mork clurgmgrd[2633]: #52: Failed changing RG status Nov 5 12:54:30 mork clurgmgrd[2633]: #13: Service service:MM1SRV failed to stop cleanly Nov 5 12:54:45 mork clurgmgrd[2633]: #52: Failed changing RG status Nov 5 12:54:45 mork clurgmgrd[2633]: #13: Service service:MM2SRV failed to stop cleanly Nov 5 12:55:00 mork clurgmgrd[2633]: #57: Failed changing RG status Nov 5 12:55:15 mork clurgmgrd[2633]: #57: Failed changing RG status Nov 5 12:55:41 mork openais[2185]: [TOTEM] The token was lost in the OPERATIONAL state. Nov 5 12:55:41 mork openais[2185]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes). Nov 5 12:55:41 mork openais[2185]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes). Nov 5 12:55:41 mork openais[2185]: [TOTEM] entering GATHER state from 2. Nov 5 12:55:46 mork openais[2185]: [TOTEM] entering GATHER state from 0. Nov 5 12:55:46 mork openais[2185]: [TOTEM] Creating commit token because I am the rep. Nov 5 12:55:46 mork openais[2185]: [TOTEM] Saving state aru 64 high seq received 64 Nov 5 12:55:46 mork openais[2185]: [TOTEM] Storing new sequence id for ring 3c Nov 5 12:55:46 mork openais[2185]: [TOTEM] entering COMMIT state. Nov 5 12:55:46 mork openais[2185]: [TOTEM] entering RECOVERY state. 
Nov 5 12:55:46 mork openais[2185]: [TOTEM] position [0] member 172.16.0.11: Nov 5 12:55:46 mork openais[2185]: [TOTEM] previous ring seq 56 rep 172.16.0.11 Nov 5 12:55:46 mork openais[2185]: [TOTEM] aru 64 high delivered 64 received flag 1 Nov 5 12:55:46 mork openais[2185]: [TOTEM] Did not need to originate any messages in recovery. Nov 5 12:55:46 mork openais[2185]: [TOTEM] Sending initial ORF token Nov 5 12:55:46 mork openais[2185]: [CLM ] CLM CONFIGURATION CHANGE Nov 5 12:55:46 mork openais[2185]: [CLM ] New Configuration: Nov 5 12:55:46 mork kernel: dlm: closing connection to node 2 Nov 5 12:55:46 mork openais[2185]: [CLM ] r(0) ip(172.16.0.11) Nov 5 12:55:46 mork openais[2185]: [CLM ] Members Left: Nov 5 12:55:46 mork openais[2185]: [CLM ] r(0) ip(172.16.0.12) Nov 5 12:55:46 mork openais[2185]: [CLM ] Members Joined: Nov 5 12:55:46 mork openais[2185]: [CLM ] CLM CONFIGURATION CHANGE Nov 5 12:55:46 mork openais[2185]: [CLM ] New Configuration: Nov 5 12:55:46 mork openais[2185]: [CLM ] r(0) ip(172.16.0.11) Nov 5 12:55:46 mork openais[2185]: [CLM ] Members Left: Nov 5 12:55:46 mork openais[2185]: [CLM ] Members Joined: Nov 5 12:55:46 mork openais[2185]: [SYNC ] This node is within the primary component and will provide service. Nov 5 12:55:46 mork openais[2185]: [TOTEM] entering OPERATIONAL state. Nov 5 12:55:46 mork openais[2185]: [CLM ] got nodejoin message 172.16.0.11 Nov 5 12:55:46 mork openais[2185]: [CPG ] got joinlist message from node 1 - services remain in "starting" [root at mork ~]# clustat Cluster Status for clumm @ Thu Nov 5 12:58:47 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ node1 1 Online, Local, rgmanager node2 2 Offline /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 0 Online, Quorum Disk Service Name Owner (Last) State ------- ---- ----- ------ ----- service:MM1SRV node1 starting service:MM2SRV node1 starting service:MM3SRV node1 starting - services MM1SRV and MM2SRV are ip+fs (/cl1 and /cl2 respectively): they are active so it seems all was done good but without passing to started form starting.... 
Also MM3SRV that is an ip only service has been started [root at mork ~]# df -k Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/VolGroup00-LogVol00 5808616 4045884 1462908 74% / /dev/hda1 101086 38786 57081 41% /boot tmpfs 447656 0 447656 0% /dev/shm /dev/mapper/vg_cl1-lv_cl1 4124352 1258064 2656780 33% /cl1 /dev/mapper/vg_cl2-lv_cl2 4124352 1563032 2351812 40% /cl2 [root at mork ~]# ip addr list 1: lo: mtu 16436 qdisc noqueue link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000 link/ether 54:52:00:6a:cb:ba brd ff:ff:ff:ff:ff:ff inet 192.168.122.101/24 brd 192.168.122.255 scope global eth0 inet 192.168.122.113/24 scope global secondary eth0 <--- MM3SRV ip inet 192.168.122.111/24 scope global secondary eth0 <--- MM1SRV ip inet 192.168.122.112/24 scope global secondary eth0 <--- MM2SRV ip inet6 fe80::5652:ff:fe6a:cbba/64 scope link valid_lft forever preferred_lft forever 3: eth1: mtu 1500 qdisc pfifo_fast qlen 1000 link/ether 54:52:00:00:0c:c5 brd ff:ff:ff:ff:ff:ff inet 172.16.0.11/12 brd 172.31.255.255 scope global eth1 inet6 fe80::5652:ff:fe00:cc5/64 scope link valid_lft forever preferred_lft forever 4: sit0: mtu 1480 qdisc noop link/sit 0.0.0.0 brd 0.0.0.0 [root at mork ~]# - I wait a couple of hours [root at mork ~]# clustat Cluster Status for clumm @ Thu Nov 5 15:22:23 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ node1 1 Online, Local, rgmanager node2 2 Offline /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 0 Online, Quorum Disk Service Name Owner (Last) State ------- ---- ----- ------ ----- service:MM1SRV node1 starting service:MM2SRV node1 starting service:MM3SRV node1 starting - resource groups are unlocked: [root at mork ~]# clusvcadm -S Resource groups unlocked - [root at mork ~]# clusvcadm -e MM3SRV Local machine trying to enable service:MM3SRV...Service is already running Note that the other node is still powered off - So to solve the situation I have to do a disable/enable sequence, having downtime (ip alias removed and file systems unmounted in my case): [root at mork ~]# clusvcadm -d MM3SRV Local machine disabling service:MM3SRV...Success [root at mork ~]# clustat Cluster Status for clumm @ Thu Nov 5 15:25:49 2009 Member Status: Quorate Member Name ID Status ------ ---- ---- ------ node1 1 Online, Local, rgmanager node2 2 Offline /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_scsi0-hd0 0 Online, Quorum Disk Service Name Owner (Last) State ------- ---- ----- ------ ----- service:MM1SRV node1 starting service:MM2SRV node1 starting service:MM3SRV (node1) disabled [root at mork ~]# clusvcadm -e MM3SRV Local machine trying to enable service:MM3SRV...Success service:MM3SRV is now running on node1 [root at mork ~]# clusvcadm -d MM1SRV Local machine disabling service:MM1SRV...Success [root at mork ~]# df -k Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/VolGroup00-LogVol00 5808616 4047656 1461136 74% / /dev/hda1 101086 38786 57081 41% /boot tmpfs 447656 0 447656 0% /dev/shm /dev/mapper/vg_cl2-lv_cl2 4124352 1563032 2351812 40% /cl2 [root at mork ~]# clusvcadm -e MM1SRV Local machine trying to enable service:MM1SRV...Success service:MM1SRV is now running on node1 [root at mork ~]# df -k Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/VolGroup00-LogVol00 5808616 4047664 1461128 74% / /dev/hda1 101086 38786 57081 41% /boot tmpfs 447656 0 447656 0% /dev/shm 
/dev/mapper/vg_cl2-lv_cl2 4124352 1563032 2351812 40% /cl2 /dev/mapper/vg_cl1-lv_cl1 4124352 1258064 2656780 33% /cl1 Gianluca -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdinitto at redhat.com Thu Nov 5 14:59:52 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Thu, 05 Nov 2009 15:59:52 +0100 Subject: [Linux-cluster] Cluster Configuration Docs (STABLE3) In-Reply-To: <5f61ab380911050619g51607c75uaacbfac0593cfbe9@mail.gmail.com> References: <4AF2DDF5.901@redhat.com> <5f61ab380911050619g51607c75uaacbfac0593cfbe9@mail.gmail.com> Message-ID: <4AF2E868.5080503@redhat.com> Nehemias Jahcob wrote: > > Fabio > > > "Forbidden > > You don't have permission to access /CCS.pdf on this server." > > Thanks.. > > NJ > Should be Ok now. Fabio From allen at isye.gatech.edu Thu Nov 5 19:36:27 2009 From: allen at isye.gatech.edu (Allen Belletti) Date: Thu, 05 Nov 2009 14:36:27 -0500 Subject: [Linux-cluster] GFS2 interesting death with error Message-ID: <4AF3293B.8050300@isye.gatech.edu> Saw an interesting and different GFS2 death this morning that I wanted to pass along in case anyone has insights. We have not seen any of the "hanging in dlm_posix_lock" since fsck'ing early Sunday morning. In any case I'm pretty confident that's being triggered by the creation & deletion of ".lock" files within Dovecot. This was something completely different and it left some potentially useful debug info in the logs. Things were running fine when the machine "post2" abruptly died. The following was found to have been enscribed upon its stone logs: Nov 5 10:56:28 post2 kernel: original: gfs2_rindex_hold+0x32/0x153 [gfs2] Nov 5 10:56:28 post2 kernel: pid : 27197 Nov 5 10:56:28 post2 kernel: lock type: 2 req lock state : 3 Nov 5 10:56:28 post2 kernel: new: gfs2_rindex_hold+0x32/0x153 [gfs2] Nov 5 10:56:28 post2 kernel: pid: 27197 Nov 5 10:56:28 post2 kernel: lock type: 2 req lock state : 3 Nov 5 10:56:28 post2 kernel: G: s:SH n:2/2053b f:s t:SH d:EX/0 l:0 a:0 r:4 Nov 5 10:56:28 post2 kernel: H: s:SH f:H e:0 p:27197 [procmail] gfs2_rindex_hold+0x32/0x153 [gfs2] Nov 5 10:56:28 post2 kernel: I: n:23/132411 t:8 f:0x00000010 Nov 5 10:56:28 post2 kernel: ----------- [cut here ] --------- [please bite here ] --------- Nov 5 10:56:32 post2 kernel: Kernel BUG at ...ir/build/BUILD/gfs2-kmod-1.92/_kmod_build_/glock.c:950 The fact that it died in procmail indicates that the failure occurred while writing mail to someone's Inbox. The system wasn't heavily loaded at the time -- the load averages were a little bit below 1.0 at the time of the crash. Also interesting is what happened next. The load average on post1 (the only other node) shot up over 100, as numerous processes were blocked. It spent several minutes with an administrative process using 100% of a CPU -- I believe it was dlm_recoverd though I'm not 100% certain. Then, just as the load average had come back down to 15-20 and functionality was returning, it abruptly hung. At this point I reset both cluster nodes and all was well. Anyway, if you've seen anything like this or have a clue as to the cause, I'd love to hear it. Looks like more lock-related glitchiness in our relatively lock intensive environment. 
Thanks, Allen -- Allen Belletti allen at isye.gatech.edu 404-894-6221 Phone Industrial and Systems Engineering 404-385-2988 Fax Georgia Institute of Technology From lhh at redhat.com Thu Nov 5 20:07:38 2009 From: lhh at redhat.com (Lon Hohberger) Date: Thu, 05 Nov 2009 15:07:38 -0500 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 In-Reply-To: <26C393E2-EBD5-4C14-9A76-BE6E0BF713B3@equation.fr> References: <1257188738.2496.71.camel@localhost.localdomain> <91816B41-6F16-429D-B125-BB826F9441F0@equation.fr> <1257254404.2483.9.camel@localhost> <1257270978.2483.17.camel@localhost> <26C393E2-EBD5-4C14-9A76-BE6E0BF713B3@equation.fr> Message-ID: <1257451658.3509.57.camel@localhost.localdomain> On Thu, 2009-11-05 at 10:12 +0100, Alain RICHARD wrote: > I am currently testing this version of vm.sh that is handling xmlfile > and path differently : Right. > - if use_virth=0, use xm as before > - if path and xmlfile, ignore path and issue a warning (you should use > xmlfile) > - if xmlfile only, use it > - if path, search for a file name under path or name.xml under path > and set xmfile to this file 1. If you set 'xmlfile' and 'path', the RA should produce a warning. 2. If you set 'path', it does a search when using 'virsh' for the right file and sets 'xmlfile' to it (since xmlfile handling was already written). > in the case of virsh, creation is handled using 'virsh create xmlfile' > if xmlfile is not empty, or 'virsh create name' if there is no > xmlfile/path configured. The 'xmlfile' patch always has worked this way; the 'path' patch just searches the path attribute for config files. > The effect of this is that the config file must be : > > > - an xml file, with or without .xml extension, if xmlfile or path > attribute is set > - else, a classic xen config file under /etc/xen > In order to stay compatible with current rgmanager configuration, we > must ensure that use_virsh is set to 0 for vm that use classical xen > conf files and path directive, else the vm fails to lauch because > virsh create is not able to handle xen config file and virsh start, > that is able to handle xen conf files, is not able to get the file > from an other location than /etc/xen. Right, good point. We can run the conf file through xmllint, and if it fails, assume it's 'xen', and revert to 'xm' during the search process. Libvirt description files are all XML. Furthermore, if you set 'use_virsh' to '1' explicitly, the resource agent needs to return an error if the config file does not pass this XML test. I do not want the agent to parse individual files in order to allow a user to have say /mnt/tmp/foo.xml with "bar" for the virtual machine. > An other point is that if libvirtd is not running, the status returned > by this vm.sh for a vm is always "indeterminate", so I have to launch > it and to disable the default libvirt network because I really don't > need it. Ok. > The last problem so far is that clusvcadm -M always end-up with an > error although the migration is working correctly : What version of libvirt do you have? -- Lon From andrew at ntsg.umt.edu Thu Nov 5 22:30:45 2009 From: andrew at ntsg.umt.edu (Andrew A. Neuschwander) Date: Thu, 05 Nov 2009 15:30:45 -0700 Subject: [Linux-cluster] GFS Journal Size Message-ID: <4AF35215.6060407@ntsg.umt.edu> Is there a good way to determine what size journals are needed on a gfs? Is there anyway to tell if a journal gets full? I have a vary large file system (20TB) shared by three hosts with 48GB RAM each. 
With the default journal size, applications would become unresponsive under heavy sequential writes. I increased the journal size to 1GB (from the default 128M) and this alleviated the problem. Searching hasn't turned up any discussions on journal size. Thanks, -Andrew -- Andrew A. Neuschwander, RHCE Manager, Systems Engineer Science Compute Services College of Forestry and Conservation The University of Montana http://www.ntsg.umt.edu andrew at ntsg.umt.edu - 406.243.6310 From tfrumbacher at gmail.com Thu Nov 5 23:11:32 2009 From: tfrumbacher at gmail.com (Aaron Benner) Date: Thu, 5 Nov 2009 16:11:32 -0700 Subject: [Linux-cluster] Re: clusvcadm -U returns "Temporary failure" on vm service In-Reply-To: <2AB261F0-F6D4-46DA-A7F0-771FB5197DD1@gmail.com> References: <2AB261F0-F6D4-46DA-A7F0-771FB5197DD1@gmail.com> Message-ID: <536265D4-6354-45C3-9270-73015E5F7BBA@gmail.com> I have dug a little deeper on this: in rg_state.c the function _svc_freeze( ... ) the switch checking for the service status only lists as valid RG_STATE_ {STOPPED,STARTED,DISABLED} Based on the output of clustat my services are in RG_STATE_MIGRATE Which means execution fails over to the default case which unlocks the group and returns RG_EAGAIN which is generating the "Temporary failure; try again" note below. What this means is that it is possible, given the scenario outlined below to have a service in the "migrating" state with the frozen flag set. Once this state is entered the rg can no longer be unfrozen because the unfreeze code expects it to eventually undergo a state change at which point you can unfreeze it. Problem is now that it's frozen it can't be stopped, disabled, etc. and so I can't force a state change. I saw reference to a patch to prevent migration of frozen groups, but either I'm not using that release of the code or it doesn't apply to the situation I outlined below. --AB On Nov 3, 2009, at 1:03 PM, Aaron Benner wrote: > All, > > I have a problem that I can't find documentation on and has me > baffled. > > I have a 3 node cluster running xen with multiple domU enabled as > cluster services. The individual services are set to have a node > affinity using resource groups (see cluster.conf below) and live > migration is enabled. > > I had migrated two domU off of one of the cluster nodes in > anticipation of a power-cycle and network reconfig. Before bringing > up the node that had been reconfigured I froze (clusvcadm -Z ...) > the domU in question so that when the newly reconfigured node came > up they would not migrate back to their preferred host, or at least > that's what I *THOUGHT* -Z would do. > > I booted up reconfigured node, and ignoring their frozen state the > rgmanager on the rebooting node initiated a migration of the domUs. > The migration finished and the virtuals resumed operation on the > reconfigured host. The problem is now rgmanager is showing those > resrouce groups as having state "migrating" (even though there are > no migration processes still active) and clusvcadm -U ... returns > the following: > > "Local machine unfreezing vm:SaturnE...Temporary failure; try again" > > I get this message on all of the cluster nodes. I'm not sure if > this is coming from clusvcadm, vm.sh, or some other piece of the > cluster puzzle. Is there any way to get rgmanager to realize that > these resource groups are no longer migrating and as such can be > unfrozen? Is that even my problem? Can I fix this with anything > other than a complete power down of the cluster (disaster)? 
> > --AB > > > post_join_delay="180"/> > > nodeid="1" votes="1"> > > > port="6"/> > > > > nodeid="2" votes="1"> > > > port="13"/> > > > > nodeid="3" votes="1"> > > > port="12"/> > > > > > > > > > > > > nofailback="0" ordered="0" restricted="0"> > > > nofailback="0" ordered="0" restricted="0"> > > > nofailback="0" ordered="0" restricted="0"> > > > > > exclusive="0" max_restarts="0" migrate="live" name="SaturnX" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="SaturnC" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="SaturnE" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="SaturnF" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="SaturnD" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="SaturnA" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="Orion1" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="Orion2" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="Orion3" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="SaturnB" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > exclusive="0" max_restarts="0" migrate="live" name="Pluto" path="/ > etc/xen" recovery="restart" restart_expire_time="0"/> > > > > From lhh at redhat.com Thu Nov 5 23:42:12 2009 From: lhh at redhat.com (Lon Hohberger) Date: Thu, 05 Nov 2009 18:42:12 -0500 Subject: [Linux-cluster] clusvcadm -U returns "Temporary failure" on vm service In-Reply-To: <2AB261F0-F6D4-46DA-A7F0-771FB5197DD1@gmail.com> References: <2AB261F0-F6D4-46DA-A7F0-771FB5197DD1@gmail.com> Message-ID: <1257464532.3509.84.camel@localhost.localdomain> On Tue, 2009-11-03 at 13:03 -0700, Aaron Benner wrote: > All, > > I have a problem that I can't find documentation on and has me baffled. > > I have a 3 node cluster running xen with multiple domU enabled as > cluster services. The individual services are set to have a node > affinity using resource groups (see cluster.conf below) and live > migration is enabled. > > I had migrated two domU off of one of the cluster nodes in > anticipation of a power-cycle and network reconfig. Before bringing > up the node that had been reconfigured I froze (clusvcadm -Z ...) the > domU in question so that when the newly reconfigured node came up they > would not migrate back to their preferred host, or at least that's > what I *THOUGHT* -Z would do. 
http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=6b751d048dd2068c5c62019b675d0d3699409a63 -- Lon From lhh at redhat.com Thu Nov 5 23:43:20 2009 From: lhh at redhat.com (Lon Hohberger) Date: Thu, 05 Nov 2009 18:43:20 -0500 Subject: [Linux-cluster] Re: clusvcadm -U returns "Temporary failure" on vm service In-Reply-To: <536265D4-6354-45C3-9270-73015E5F7BBA@gmail.com> References: <2AB261F0-F6D4-46DA-A7F0-771FB5197DD1@gmail.com> <536265D4-6354-45C3-9270-73015E5F7BBA@gmail.com> Message-ID: <1257464600.3509.85.camel@localhost.localdomain> On Thu, 2009-11-05 at 16:11 -0700, Aaron Benner wrote: > I saw reference to a patch to prevent migration of frozen groups, but > either I'm not using that release of the code or it doesn't apply to > the situation I outlined below. The patch completely prevents migration of frozen VMs. -- Lon From alain.richard at equation.fr Fri Nov 6 07:58:58 2009 From: alain.richard at equation.fr (Alain RICHARD) Date: Fri, 6 Nov 2009 08:58:58 +0100 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 In-Reply-To: <1257451658.3509.57.camel@localhost.localdomain> References: <1257188738.2496.71.camel@localhost.localdomain> <91816B41-6F16-429D-B125-BB826F9441F0@equation.fr> <1257254404.2483.9.camel@localhost> <1257270978.2483.17.camel@localhost> <26C393E2-EBD5-4C14-9A76-BE6E0BF713B3@equation.fr> <1257451658.3509.57.camel@localhost.localdomain> Message-ID: <60B22AD9-38DC-4CCB-9250-C0D3027A136D@equation.fr> Le 5 nov. 2009 ? 21:07, Lon Hohberger a ?crit : >> The last problem so far is that clusvcadm -M always end-up with an >> error although the migration is working correctly : > > What version of libvirt do you have? I am using the current Centos 5.4 packages : libvirt-0.6.3-20.1.el5_4 xen-3.0.3-94.el5_4.2 cman-2.0.115-1.el5_4.3 rgmanager-2.0.52-1.el5.centos.2 Regards, -- Alain RICHARD EQUATION SA Tel : +33 477 79 48 00 Fax : +33 477 79 48 01 E-Liance, Op?rateur des entreprises et collectivit?s, Liaisons Fibre optique, SDSL et ADSL -------------- next part -------------- An HTML attachment was scrubbed... URL: From swhiteho at redhat.com Fri Nov 6 09:45:58 2009 From: swhiteho at redhat.com (Steven Whitehouse) Date: Fri, 06 Nov 2009 09:45:58 +0000 Subject: [Linux-cluster] GFS Journal Size In-Reply-To: <4AF35215.6060407@ntsg.umt.edu> References: <4AF35215.6060407@ntsg.umt.edu> Message-ID: <1257500758.2722.14.camel@localhost.localdomain> Hi, On Thu, 2009-11-05 at 15:30 -0700, Andrew A. Neuschwander wrote: > Is there a good way to determine what size journals are needed on a gfs? Is there anyway to tell if > a journal gets full? I have a vary large file system (20TB) shared by three hosts with 48GB RAM > each. With the default journal size, applications would become unresponsive under heavy sequential > writes. I increased the journal size to 1GB (from the default 128M) and this alleviated the problem. > > Searching hasn't turned up any discussions on journal size. > > Thanks, > -Andrew It depends a lot upon the workload, and also upon the hardware, so its tricky to give any hard and fast answers. We don't currently have any easy way to tell if the journal is getting full, although with the tracepoints built into upstream/fedora kernels it should be possible to get this information indirectly. Unless you have journaled data mode turned on, then only metadata will be journaled, so that it is the amount of metadata being modified that determines how quickly the journal fills up. 
Streaming writes will create a fair amount of metadata (assuming the files are not preallocated) in the form of indirect blocks. The journaled blocks are pinned in memory until they are written to the journal. This means that with a larger journal, you can potentially take up a lot of memory which would otherwise be used for the running of applications and/or caching data. As a result its not a good idea to have a journal that is too large a percentage of physical memory. There are actually two limits to consider wrt to journal size. The first is the number of blocks which can be put in the journal before the journal is flushed, and the second (probably what you are coming up against) is the requirement that all the journaled blocks must be written back "in place" before a segment of the journal can be freed. It is also possible to adjust the first of these items with the sysfs incore_log_blocks setting. I should warn you though that this particular setting is rather a crude way to make adjustments and at some future point we intend to replace that with a better method. Does that answer your question? Steve. From swhiteho at redhat.com Fri Nov 6 09:52:10 2009 From: swhiteho at redhat.com (Steven Whitehouse) Date: Fri, 06 Nov 2009 09:52:10 +0000 Subject: [Linux-cluster] GFS2 interesting death with error In-Reply-To: <4AF3293B.8050300@isye.gatech.edu> References: <4AF3293B.8050300@isye.gatech.edu> Message-ID: <1257501130.2722.20.camel@localhost.localdomain> Hi, On Thu, 2009-11-05 at 14:36 -0500, Allen Belletti wrote: > Saw an interesting and different GFS2 death this morning that I wanted > to pass along in case anyone has insights. We have not seen any of the > "hanging in dlm_posix_lock" since fsck'ing early Sunday morning. In any > case I'm pretty confident that's being triggered by the creation & > deletion of ".lock" files within Dovecot. This was something completely > different and it left some potentially useful debug info in the logs. > > Things were running fine when the machine "post2" abruptly died. The > following was found to have been enscribed upon its stone logs: > > Nov 5 10:56:28 post2 kernel: original: gfs2_rindex_hold+0x32/0x153 [gfs2] > Nov 5 10:56:28 post2 kernel: pid : 27197 > Nov 5 10:56:28 post2 kernel: lock type: 2 req lock state : 3 > Nov 5 10:56:28 post2 kernel: new: gfs2_rindex_hold+0x32/0x153 [gfs2] > Nov 5 10:56:28 post2 kernel: pid: 27197 > Nov 5 10:56:28 post2 kernel: lock type: 2 req lock state : 3 > Nov 5 10:56:28 post2 kernel: G: s:SH n:2/2053b f:s t:SH d:EX/0 l:0 > a:0 r:4 > Nov 5 10:56:28 post2 kernel: H: s:SH f:H e:0 p:27197 [procmail] > gfs2_rindex_hold+0x32/0x153 [gfs2] > Nov 5 10:56:28 post2 kernel: I: n:23/132411 t:8 f:0x00000010 > Nov 5 10:56:28 post2 kernel: ----------- [cut here ] --------- [please > bite here ] --------- > Nov 5 10:56:32 post2 kernel: Kernel BUG at > ...ir/build/BUILD/gfs2-kmod-1.92/_kmod_build_/glock.c:950 > There should have been a stack trace following this message which is critical to tracking down the bug. What kernel version is this? Did you run gfs2_grow at any time? Steve. 
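Returning to the GFS journal-size question above: the journal size is normally chosen when the journals are created, either at mkfs time or when adding journals later. A rough sketch of the usual commands follows; the device name, cluster and filesystem names, sizes and mount point are placeholders only:

    # mkfs.gfs2 -p lock_dlm -t mycluster:data -j 3 -J 512 /dev/myvg/data
      (three journals of 512MB each; -J takes megabytes)
    # gfs2_jadd -j 1 -J 1024 /mnt/data
      (add one more journal, 1GB in size, to a mounted filesystem)
    # gfs2_tool gettune /mnt/data | grep incore_log_blocks
    # gfs2_tool settune /mnt/data incore_log_blocks 8192
      (the incore_log_blocks tunable mentioned earlier, if your gfs2_tool version exposes it)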
From swhiteho at redhat.com Fri Nov 6 15:15:45 2009 From: swhiteho at redhat.com (Steven Whitehouse) Date: Fri, 06 Nov 2009 15:15:45 +0000 Subject: [Linux-cluster] GFS2 interesting death with error In-Reply-To: <4AF3293B.8050300@isye.gatech.edu> References: <4AF3293B.8050300@isye.gatech.edu> Message-ID: <1257520545.6052.789.camel@localhost.localdomain> Hi, On Thu, 2009-11-05 at 14:36 -0500, Allen Belletti wrote: > Saw an interesting and different GFS2 death this morning that I wanted > to pass along in case anyone has insights. We have not seen any of the > "hanging in dlm_posix_lock" since fsck'ing early Sunday morning. In any > case I'm pretty confident that's being triggered by the creation & > deletion of ".lock" files within Dovecot. This was something completely > different and it left some potentially useful debug info in the logs. > I've made an educated guess as to what this might be. The attached patch should fix it, if my hunch is correct. If you have the back trace I mentioned in my previous email, we can confirm that this really is the cause, Steve. >From 89fc5489d25fc0a34a367b119448a037ed162c00 Mon Sep 17 00:00:00 2001 From: Steven Whitehouse Date: Fri, 6 Nov 2009 11:10:51 +0000 Subject: [PATCH 27/27] GFS2: Locking order fix in gfs2_check_blk_state In some cases we already have the rindex lock when we enter this function. Signed-off-by: Steven Whitehouse --- fs/gfs2/rgrp.c | 14 ++++++++++---- 1 files changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c index 8f1cfb0..0608f49 100644 --- a/fs/gfs2/rgrp.c +++ b/fs/gfs2/rgrp.c @@ -1710,11 +1710,16 @@ int gfs2_check_blk_type(struct gfs2_sbd *sdp, u64 no_addr, unsigned int type) { struct gfs2_rgrpd *rgd; struct gfs2_holder ri_gh, rgd_gh; + struct gfs2_inode *ip = GFS2_I(sdp->sd_rindex); + int ri_locked = 0; int error; - error = gfs2_rindex_hold(sdp, &ri_gh); - if (error) - goto fail; + if (!gfs2_glock_is_locked_by_me(ip->i_gl)) { + error = gfs2_rindex_hold(sdp, &ri_gh); + if (error) + goto fail; + ri_locked = 1; + } error = -EINVAL; rgd = gfs2_blk2rgrpd(sdp, no_addr); @@ -1730,7 +1735,8 @@ int gfs2_check_blk_type(struct gfs2_sbd *sdp, u64 no_addr, unsigned int type) gfs2_glock_dq_uninit(&rgd_gh); fail_rindex: - gfs2_glock_dq_uninit(&ri_gh); + if (ri_locked) + gfs2_glock_dq_uninit(&ri_gh); fail: return error; } -- 1.6.2.5 From andrew at ntsg.umt.edu Fri Nov 6 15:37:09 2009 From: andrew at ntsg.umt.edu (Andrew A. Neuschwander) Date: Fri, 06 Nov 2009 08:37:09 -0700 Subject: [Linux-cluster] GFS Journal Size In-Reply-To: <1257500758.2722.14.camel@localhost.localdomain> References: <4AF35215.6060407@ntsg.umt.edu> <1257500758.2722.14.camel@localhost.localdomain> Message-ID: <4AF442A5.1070902@ntsg.umt.edu> Steven Whitehouse wrote: > Hi, > > On Thu, 2009-11-05 at 15:30 -0700, Andrew A. Neuschwander wrote: >> Is there a good way to determine what size journals are needed on a gfs? Is there anyway to tell if >> a journal gets full? I have a vary large file system (20TB) shared by three hosts with 48GB RAM >> each. With the default journal size, applications would become unresponsive under heavy sequential >> writes. I increased the journal size to 1GB (from the default 128M) and this alleviated the problem. >> >> Searching hasn't turned up any discussions on journal size. >> >> Thanks, >> -Andrew > > It depends a lot upon the workload, and also upon the hardware, so its > tricky to give any hard and fast answers. 
We don't currently have any > easy way to tell if the journal is getting full, although with the > tracepoints built into upstream/fedora kernels it should be possible to > get this information indirectly. > > Unless you have journaled data mode turned on, then only metadata will > be journaled, so that it is the amount of metadata being modified that > determines how quickly the journal fills up. Streaming writes will > create a fair amount of metadata (assuming the files are not > preallocated) in the form of indirect blocks. > > The journaled blocks are pinned in memory until they are written to the > journal. This means that with a larger journal, you can potentially take > up a lot of memory which would otherwise be used for the running of > applications and/or caching data. As a result its not a good idea to > have a journal that is too large a percentage of physical memory. > > There are actually two limits to consider wrt to journal size. The first > is the number of blocks which can be put in the journal before the > journal is flushed, and the second (probably what you are coming up > against) is the requirement that all the journaled blocks must be > written back "in place" before a segment of the journal can be freed. It > is also possible to adjust the first of these items with the sysfs > incore_log_blocks setting. I should warn you though that this particular > setting is rather a crude way to make adjustments and at some future > point we intend to replace that with a better method. > > Does that answer your question? > > Steve. > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > Steve, This is very good and helpful information. Thanks. -Andrew -- From claudiorenatosantiago at gmail.com Sat Nov 7 02:11:39 2009 From: claudiorenatosantiago at gmail.com (=?ISO-8859-1?Q?Cl=E1udio_Santiago?=) Date: Sat, 7 Nov 2009 00:11:39 -0200 Subject: [Linux-cluster] Error mounting GFS2 metafs: Block device required Message-ID: I have encountered problem to execute some operations on gfs2 filesystem. # mount | grep gfs2 /dev/mapper/vg-test on /mnt/test type gfs2 (rw,relatime,hostdata=jid=0,quota=on) # gfs2_tool journals /mnt/test Error mounting GFS2 metafs: Block device required # gfs2_quota list -f /mnt/test Error mounting GFS2 metafs: Block device required I'm using cluster-3.0.4 (build from source) with kernel 2.6.31.5. Anybody have idea about what can be wrong? Thanks Claudio Santiago -------------- next part -------------- An HTML attachment was scrubbed... URL: From linux at alteeve.com Sat Nov 7 05:09:18 2009 From: linux at alteeve.com (Madison Kelly) Date: Sat, 07 Nov 2009 00:09:18 -0500 Subject: [Linux-cluster] All VMs are "blocked", terrible performance Message-ID: <4AF500FE.7010405@alteeve.com> Hi all, I've built up a handful of VMs on my 2-node cluster and all are showing as being in a blocked state. The performance is terrible, too. All VMs are currently on one node (another problem, maybe related? is keeping me from migrating any). My Setup: (each node) 2x Quad Core AMD Opteron 2347 HE 32GB RAM (16GB/CPU) Nodes have a DRBD partition running cluster-aware LVM for all domU VMs. Each VM has it's own logical volume. The DRBD has a dedicate gigabit link and DRBD is using 'Protocol C', as required. LVM is set to use 'locking_type=3'. 
Here's what I see: # xm list Name ID Mem(MiB) VCPUs State Time(s) Domain-0 0 32544 8 r----- 22400.8 auth01 10 1023 1 -b---- 3659.7 dev01 22 8191 1 -b---- 830.2 fw01 11 1023 1 -b---- 1046.9 res01 23 2047 1 -b---- 812.1 sql01 24 16383 1 -b---- 817.0 web01 20 2047 1 -b---- 1156.3 web02 21 1023 1 -b---- 931.1 When I ran that, all VMs were running yum update (all but two were fresh installs). Any idea what's causing this and/or why my performance is so bad? Each VM is taking minutes to install each updated RPM. In case it's related, when I tried to do a live migration of a VM from one node to the other, I got an error saying that the VM's partition couldn't be seen on the other node. However, '/proc/drbd' shows both nodes are sync'ed and in Primary/Primary mode. Also, both nodes have identical output from 'lvdisplay' (all LVs are 'active') and all LVs were created during the provision on the first node. Kinda stuck here, so any input will be greatly appreciated! Let me know if I can post anything else useful. Madi From kkovachev at varna.net Sun Nov 8 12:39:05 2009 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Sun, 8 Nov 2009 14:39:05 +0200 Subject: [Linux-cluster] Error mounting GFS2 metafs: Block device required In-Reply-To: References: Message-ID: <20091108123425.M95693@varna.net> On Sat, 7 Nov 2009 00:11:39 -0200, Cl?udio Santiago wrote > I have encountered problem to execute some operations on gfs2 filesystem. > > # mount | grep gfs2 > /dev/mapper/vg-test on /mnt/test type gfs2 (rw,relatime,hostdata=jid=0,quota=on) > > # gfs2_tool journals /mnt/test > Error mounting GFS2 metafs: Block device required > > # gfs2_quota list -f /mnt/test > Error mounting GFS2 metafs: Block device required > > I'm using cluster-3.0.4 (build from source) with kernel 2.6.31.5. > > Anybody have idea about what can be wrong? > I have the same problem with the tools from 3.0.4 and kernel 2.6.31.4, but the ones from cluster-2.03.10 are working fine ... weird > Thanks > > Claudio Santiago From carlopmart at gmail.com Mon Nov 9 10:11:18 2009 From: carlopmart at gmail.com (carlopmart) Date: Mon, 09 Nov 2009 11:11:18 +0100 Subject: [Linux-cluster] Where is fence scsi test script? Message-ID: <4AF7EAC6.8020005@gmail.com> Hi all, I am trying to build a cluster using three rhel5.4 nodes using fence_scsi as a fence agent and a solaris 10 iscsi target to serve disks as a shared storage to this cluster. I am reading scsi fence wiki doc (http://sources.redhat.com/cluster/wiki/SCSI_FencingConfig) and I have some doubts. a) Where is fence_scsi_test script to test SPC-3 compilance against solaris iscsi target?? b) Under rhel5.4, is possible to use two cluster node to use fence_scsi or not? c) using three nodes, what happens with fence_scsi if one node fails for along 24 hours, for example?? Thanks. -- CL Martinez carlopmart {at} gmail {d0t} com From pasik at iki.fi Mon Nov 9 10:41:38 2009 From: pasik at iki.fi (Pasi =?iso-8859-1?Q?K=E4rkk=E4inen?=) Date: Mon, 9 Nov 2009 12:41:38 +0200 Subject: [Linux-cluster] All VMs are "blocked", terrible performance In-Reply-To: <4AF500FE.7010405@alteeve.com> References: <4AF500FE.7010405@alteeve.com> Message-ID: <20091109104138.GE16033@reaktio.net> On Sat, Nov 07, 2009 at 12:09:18AM -0500, Madison Kelly wrote: > Hi all, > > I've built up a handful of VMs on my 2-node cluster and all are > showing as being in a blocked state. The performance is terrible, too. > All VMs are currently on one node (another problem, maybe related? is > keeping me from migrating any). 
> > My Setup: > (each node) > 2x Quad Core AMD Opteron 2347 HE > 32GB RAM (16GB/CPU) > > Nodes have a DRBD partition running cluster-aware LVM for all domU VMs. > Each VM has it's own logical volume. The DRBD has a dedicate gigabit > link and DRBD is using 'Protocol C', as required. LVM is set to use > 'locking_type=3'. > > Here's what I see: > > # xm list > Name ID Mem(MiB) VCPUs State Time(s) > Domain-0 0 32544 8 r----- 22400.8 > auth01 10 1023 1 -b---- 3659.7 > dev01 22 8191 1 -b---- 830.2 > fw01 11 1023 1 -b---- 1046.9 > res01 23 2047 1 -b---- 812.1 > sql01 24 16383 1 -b---- 817.0 > web01 20 2047 1 -b---- 1156.3 > web02 21 1023 1 -b---- 931.1 > > When I ran that, all VMs were running yum update (all but two were > fresh installs). > > Any idea what's causing this and/or why my performance is so bad? Each > VM is taking minutes to install each updated RPM. > > In case it's related, when I tried to do a live migration of a VM from > one node to the other, I got an error saying that the VM's partition > couldn't be seen on the other node. However, '/proc/drbd' shows both > nodes are sync'ed and in Primary/Primary mode. Also, both nodes have > identical output from 'lvdisplay' (all LVs are 'active') and all LVs > were created during the provision on the first node. > > Kinda stuck here, so any input will be greatly appreciated! Let me know > if I can post anything else useful. > Have you configured domain weights? With a busy host/dom0 it's requirement to make sure dom0 will always get enough cpu time to be able to process the important stuff (IO etc). - Give dom0 more weight than domUs - And you could also dedicate a single core only for dom0. (in grub.conf add for xen.gz: dom0_vcpus=1 dom0_vcpus_pin) and after that make sure domU vcpus are pinned to other pcpus than 0. cpus=1-x parameter in /etc/xen/ cfgfile. You can monitor with "xm vcpu-list". -- Pasi From jruemker at redhat.com Mon Nov 9 20:41:27 2009 From: jruemker at redhat.com (John Ruemker) Date: Mon, 09 Nov 2009 15:41:27 -0500 Subject: [Linux-cluster] RHCS not fence 2nd node in 2 nodes cluster In-Reply-To: <1257320434.8353.28.camel@chn-cdrd-dhcp003182.china.nsn-net.net> References: <1257320434.8353.28.camel@chn-cdrd-dhcp003182.china.nsn-net.net> Message-ID: <4AF87E77.2060302@redhat.com> On 11/04/2009 02:40 AM, Wang2, Colin (NSN - CN/Cheng Du) wrote: > 1257303721 averting fence of node 198.18.9.34 [...] > [...] > This is caused by the usage of IP addresses for node names. https://bugzilla.redhat.com/show_bug.cgi?id=504158 http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commit;h=d3557114c74a96710e0612fb1aed77513835ee90 For now you can work around this by using host names (that differ up until the first '.') for nodenames. -John From erickson.jon at gmail.com Tue Nov 10 14:14:00 2009 From: erickson.jon at gmail.com (Jon Erickson) Date: Tue, 10 Nov 2009 09:14:00 -0500 Subject: [Linux-cluster] What's the proper way to shut down my cluster? Message-ID: <6a90e4da0911100614r6ee0febeh1fa2f49e25d8115c@mail.gmail.com> So I was wondering what the proper way to shut down a cluster is? The CMAN faq says to use 'cman_tool leave remove', but the Redhat docs say to run 'service cman stop'. Thanks. 
Sources: http://sources.redhat.com/cluster/wiki/FAQ/CMAN http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.2/pdf/Cluster_Administration/Cluster_Administration.pdf -- Jon From crosa at redhat.com Tue Nov 10 14:28:50 2009 From: crosa at redhat.com (Cleber Rodrigues) Date: Tue, 10 Nov 2009 12:28:50 -0200 Subject: [Linux-cluster] What's the proper way to shut down my cluster? In-Reply-To: <6a90e4da0911100614r6ee0febeh1fa2f49e25d8115c@mail.gmail.com> References: <6a90e4da0911100614r6ee0febeh1fa2f49e25d8115c@mail.gmail.com> Message-ID: <1257863330.3780.22.camel@localhost> If this is RHEL5, then "#service cman stop" will attempt to shutdown all associated daemons (fenced, for instance). Maybe it's just me, but I find it quite unreliable... IMHO, the best options are: 1) Use the "leave cluster" functionality from luci 2) Do it manually, something like: - umount gfs - # service clvmd stop - # service rgmanager stop - # fence_tool leave ... - # cman_tool leave remove ... CR. On Tue, 2009-11-10 at 09:14 -0500, Jon Erickson wrote: > So I was wondering what the proper way to shut down a cluster is? The > CMAN faq says to use 'cman_tool leave remove', but the Redhat docs say > to run 'service cman stop'. > > Thanks. > > Sources: > http://sources.redhat.com/cluster/wiki/FAQ/CMAN > http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.2/pdf/Cluster_Administration/Cluster_Administration.pdf > -- Cleber Rodrigues Solutions Architect - Red Hat, Inc. Mobile: +55 61 9185.3454 From lhh at redhat.com Wed Nov 11 14:17:29 2009 From: lhh at redhat.com (Lon H. Hohberger) Date: Wed, 11 Nov 2009 09:17:29 -0500 Subject: [Linux-cluster] rgmanager vm.sh using virsh under RHEL5.4 In-Reply-To: <60B22AD9-38DC-4CCB-9250-C0D3027A136D@equation.fr> References: <1257188738.2496.71.camel@localhost.localdomain> <91816B41-6F16-429D-B125-BB826F9441F0@equation.fr> <1257254404.2483.9.camel@localhost> <1257270978.2483.17.camel@localhost> <26C393E2-EBD5-4C14-9A76-BE6E0BF713B3@equation.fr> <1257451658.3509.57.camel@localhost.localdomain> <60B22AD9-38DC-4CCB-9250-C0D3027A136D@equation.fr> Message-ID: <1257949049.2593.2.camel@localhost> On Fri, 2009-11-06 at 08:58 +0100, Alain RICHARD wrote: > > Le 5 nov. 2009 ? 21:07, Lon Hohberger a ?crit : > > > > The last problem so far is that clusvcadm -M always end-up with an > > > error although the migration is working correctly : > > > > What version of libvirt do you have? > > > I am using the current Centos 5.4 packages : > > > libvirt-0.6.3-20.1.el5_4 > xen-3.0.3-94.el5_4.2 > cman-2.0.115-1.el5_4.3 > rgmanager-2.0.52-1.el5.centos.2 http://git.fedorahosted.org/git/?p=resource-agents.git;a=commit;h=41f585cf2da887a77f0a811ae6a3594976358d6d ... should address what I mentioned in the previous mail. Basically the goal here is: - use virsh if at all possible - if use_virsh is set by a user, do not override it - set use_virsh according to hypervisor / config file format -- Lon From lhh at redhat.com Wed Nov 11 16:37:59 2009 From: lhh at redhat.com (Lon H. Hohberger) Date: Wed, 11 Nov 2009 11:37:59 -0500 Subject: [Linux-cluster] ways used for auto-eviction clarifications In-Reply-To: <561c252c0911041717x74116648j9acedaf2ccfb4ec5@mail.gmail.com> References: <561c252c0911041717x74116648j9acedaf2ccfb4ec5@mail.gmail.com> Message-ID: <1257957479.2593.17.camel@localhost> On Thu, 2009-11-05 at 02:17 +0100, Gianluca Cecchi wrote: > Hello, > can anyone summarize the possible events generating a self-eviction of > a node for an rhcs cluster? 
> Are these only executed via halt/reboot commands inside OS or also > through connection to the self-fence-device? linux-cluster does not generally have a notion of "self-fencing". Unless there's a confirmed "dead" by another cluster member, the node is considered alive. self_fence in the 'fs.sh' script is an exception, but not the way you might think... The node calls 'reboot -fn' if the umount command fails. However, this does /not/ obviate the requirement that another node successfully fence the newly-rebooted node prior to allowing recovery. In effect, the node is rebooted twice: once by itself, and once by another cluster member via iLO or whatever other power device you are using. -- Lon From lhh at redhat.com Wed Nov 11 16:39:47 2009 From: lhh at redhat.com (Lon H. Hohberger) Date: Wed, 11 Nov 2009 11:39:47 -0500 Subject: [Linux-cluster] What's the proper way to shut down my cluster? In-Reply-To: <1257863330.3780.22.camel@localhost> References: <6a90e4da0911100614r6ee0febeh1fa2f49e25d8115c@mail.gmail.com> <1257863330.3780.22.camel@localhost> Message-ID: <1257957587.2593.18.camel@localhost> On Tue, 2009-11-10 at 12:28 -0200, Cleber Rodrigues wrote: > If this is RHEL5, then "#service cman stop" will attempt to shutdown all > associated daemons (fenced, for instance). Maybe it's just me, but I > find it quite unreliable... > > IMHO, the best options are: > > 1) Use the "leave cluster" functionality from luci > 2) Do it manually, something like: > - umount gfs > - # service clvmd stop > - # service rgmanager stop > - # fence_tool leave ... > - # cman_tool leave remove ... Except... Stop rgmanager first. :) -- Lon From lhh at redhat.com Wed Nov 11 16:49:29 2009 From: lhh at redhat.com (Lon H. Hohberger) Date: Wed, 11 Nov 2009 11:49:29 -0500 Subject: [Linux-cluster] qdiskd master election and loss of quorum In-Reply-To: <561c252c0911050628x6189153eob12e270da1a694a1@mail.gmail.com> References: <561c252c0911050628x6189153eob12e270da1a694a1@mail.gmail.com> Message-ID: <1257958169.2593.26.camel@localhost> On Thu, 2009-11-05 at 15:28 +0100, Gianluca Cecchi wrote: > Nov 5 12:52:53 mork clurgmgrd[2633]: Member 2 shutting down > Nov 5 12:52:57 mork qdiskd[2214]: Node 2 shutdown > Nov 5 12:55:41 mork openais[2185]: [TOTEM] The token was lost in the > OPERATIONAL state. That's very interesting. It looks like the what happened to cause the state change failures was the huge lag time between when rgmanager sent its "good bye kiss" and the time openais noticed the node was offline. The timeout was large enough that rgmanager gave up. This isn't actually the quorum disk master election problem at all... It's also very strange. - rgmanager should have known this was unnecessary. The other node said it was going away. - cman probably should have caused a transition sooner, I think (??) -- Lon From lhh at redhat.com Wed Nov 11 17:06:31 2009 From: lhh at redhat.com (Lon H. Hohberger) Date: Wed, 11 Nov 2009 12:06:31 -0500 Subject: [Linux-cluster] qdiskd master election and loss of quorum In-Reply-To: <1257958169.2593.26.camel@localhost> References: <561c252c0911050628x6189153eob12e270da1a694a1@mail.gmail.com> <1257958169.2593.26.camel@localhost> Message-ID: <1257959191.2593.40.camel@localhost> On Wed, 2009-11-11 at 11:49 -0500, Lon H. 
Hohberger wrote: > On Thu, 2009-11-05 at 15:28 +0100, Gianluca Cecchi wrote: > > > Nov 5 12:52:53 mork clurgmgrd[2633]: Member 2 shutting down > > Nov 5 12:52:57 mork qdiskd[2214]: Node 2 shutdown > > > Nov 5 12:55:41 mork openais[2185]: [TOTEM] The token was lost in the > > OPERATIONAL state. > > That's very interesting. It looks like the what happened to cause the > state change failures was the huge lag time between when rgmanager sent > its "good bye kiss" and the time openais noticed the node was offline. > The timeout was large enough that rgmanager gave up. > > This isn't actually the quorum disk master election problem at all... > It's also very strange. > > - rgmanager should have known this was unnecessary. The other node said > it was going away. > - cman probably should have caused a transition sooner, I think (??) So... rgmanager treats a node which sends the 'EXITING' message as offline. It makes no sense why it would do this and subsequently fail to update the cluster state. case RG_EXITING: if (!member_online(msg_hdr->gh_arg1)) break; logt_print(LOG_NOTICE, "Member %d shutting down\n", msg_hdr->gh_arg1); member_set_state(msg_hdr->gh_arg1, 0); node_event_q(0, msg_hdr->gh_arg1, 0, 1); break; You said in your previous mail that mindy shut down cleanly -- so I'm really stumped... -- Lon From brem.belguebli at gmail.com Wed Nov 11 17:23:12 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Wed, 11 Nov 2009 18:23:12 +0100 Subject: [Linux-cluster] ways used for auto-eviction clarifications In-Reply-To: <1257957479.2593.17.camel@localhost> References: <561c252c0911041717x74116648j9acedaf2ccfb4ec5@mail.gmail.com> <1257957479.2593.17.camel@localhost> Message-ID: <29ae894c0911110923m48b81b9dl5cd87dd47e9e5fc@mail.gmail.com> With multisites clusters, STONITH as only fencing method won't work in case of network partition (intersite network global failure). Tools to build multisites clusters already exist (lvm mirror, drbd) or are going to be available (dm-replicator) but still the cluster stack forbids the setup due to the only STONITH support. Brem 2009/11/11 Lon H. Hohberger : > On Thu, 2009-11-05 at 02:17 +0100, Gianluca Cecchi wrote: >> Hello, >> can anyone summarize the possible events generating a self-eviction of >> a node for an rhcs cluster? >> Are these only executed via halt/reboot commands inside OS or also >> through connection to the self-fence-device? > > linux-cluster does not generally have a notion of "self-fencing". > > Unless there's a confirmed "dead" by another cluster member, the node is > considered alive. > > self_fence in the 'fs.sh' script is an exception, but not the way you > might think... > > The node calls 'reboot -fn' if the umount command fails. > > However, this does /not/ obviate the requirement that another node > successfully fence the newly-rebooted node prior to allowing recovery. > > In effect, the node is rebooted twice: once by itself, and once by > another cluster member via iLO or whatever other power device you are > using. > > -- Lon > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From uk-linux-cluster at dataway.ch Wed Nov 11 18:23:06 2009 From: uk-linux-cluster at dataway.ch (Anthony Uk) Date: Wed, 11 Nov 2009 19:23:06 +0100 Subject: [Linux-cluster] GFS2: quota file size not a multiple of struct gfs2_quota Message-ID: <4AFB010A.2040404@dataway.ch> Hello I am quite unable to get quotas on my GFS2 filesystem to work properly. 
I have tried to do gfs2_quota reset followed by gfs2_quota init, but the latter always comes back with: warning: quota file size not a multiple of struct gfs2_quota Warning: This filesystem doesn't seem to have the new quota list format or the quota list is corrupt. list, check and init operation performance will suffer due to this. It is recommended that you run the 'gfs2_quota reset' operation to reset the quota file. All current quota information will be lost and you will have to reassign all quota limits and warnings I see from the archives that a gentleman named Scooter had the same issue a while back, but have not found any mention of a fix. This takes place both with quota=on and quota=off and without any processes accessing the file system, and only one node having mounted the file system. I mounted the gfs2meta system and the quota file (after gfs2_quota init) is always 17668312 bytes in size. If I just do reset but no init, and then set gfs2_quota limit, things seem to work properly to start with (albeit without the current contents being reflected in the numbers) but sooner or later some ridiculous numbers appear, such as a user having 70 exabytes (I don't have the exact output any more). I might add that gfs2_fsck gives me lots of warnings of the type: Unlinked block found at block 952956 (0xe8a7c), left unchanged. I don't know whether that has anything to do with it. This is under Centos 5.4 x64, kernel 2.6.18-164.6.1.el5 and gfs2-utils.x86_64 0.1.62-1.el5 If there's anything I can do to help get this fixed do let me know. Kind regards Anthony Uk www.dataway.ch From adas at redhat.com Wed Nov 11 18:42:19 2009 From: adas at redhat.com (Abhijith Das) Date: Wed, 11 Nov 2009 13:42:19 -0500 (EST) Subject: [Linux-cluster] GFS2: quota file size not a multiple of struct gfs2_quota In-Reply-To: <4AFB010A.2040404@dataway.ch> Message-ID: <1007066200.1972201257964939567.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Hi Anthony, I just filed a RH bugzilla for this problem. https://bugzilla.redhat.com/show_bug.cgi?id=536902 Could you please also post your compressed quota file to this bugzilla using the steps highlighted in the bug description? Unfortunately, I haven't looked at this problem yet. I hope to pretty soon, though. Thanks! --Abhi ----- "Anthony Uk" wrote: > From: "Anthony Uk" > To: linux-cluster at redhat.com > Sent: Wednesday, November 11, 2009 12:23:06 PM GMT -06:00 US/Canada Central > Subject: [Linux-cluster] GFS2: quota file size not a multiple of struct gfs2_quota > > Hello > > I am quite unable to get quotas on my GFS2 filesystem to work > properly. > I have tried to do gfs2_quota reset followed by gfs2_quota init, but > the > latter always comes back with: > > warning: quota file size not a multiple of struct gfs2_quota > > Warning: This filesystem doesn't seem to have the new quota list > format > or the quota list is corrupt. list, check and init operation > performance > will suffer due to this. It is recommended that you run the > 'gfs2_quota > reset' operation to reset the quota file. All current quota > information > will be lost and you will have to reassign all quota limits and > warnings > > I see from the archives that a gentleman named Scooter had the same > issue a while back, but have not found any mention of a fix. > > This takes place both with quota=on and quota=off and without any > processes accessing the file system, and only one node having mounted > > the file system. 
I mounted the gfs2meta system and the quota file > (after > gfs2_quota init) is always 17668312 bytes in size. > > If I just do reset but no init, and then set gfs2_quota limit, things > > seem to work properly to start with (albeit without the current > contents > being reflected in the numbers) but sooner or later some ridiculous > numbers appear, such as a user having 70 exabytes (I don't have the > exact output any more). > > I might add that gfs2_fsck gives me lots of warnings of the type: > > Unlinked block found at block 952956 (0xe8a7c), left unchanged. > > I don't know whether that has anything to do with it. > > This is under Centos 5.4 x64, kernel 2.6.18-164.6.1.el5 and > gfs2-utils.x86_64 0.1.62-1.el5 > > If there's anything I can do to help get this fixed do let me know. > > Kind regards > > Anthony Uk > www.dataway.ch > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From tfrumbacher at gmail.com Wed Nov 11 23:33:58 2009 From: tfrumbacher at gmail.com (Aaron Benner) Date: Wed, 11 Nov 2009 16:33:58 -0700 Subject: [Linux-cluster] Re: clusvcadm -U returns "Temporary failure" on vm service In-Reply-To: <536265D4-6354-45C3-9270-73015E5F7BBA@gmail.com> References: <2AB261F0-F6D4-46DA-A7F0-771FB5197DD1@gmail.com> <536265D4-6354-45C3-9270-73015E5F7BBA@gmail.com> Message-ID: <12A83053-CF1B-4C80-B702-D69DF752F610@gmail.com> Final follow up here for the archives. I managed to clear the services in the locked state by: 1) remove the virtual machine service definition from the cluster configuration. 2) stop the virtual machine using 'xm shutdown domU'. 3) re-creae the virtual machine service definition which will appear in the stopped state frozen. 4) unfreeze the virtual machine service. 5) start the virtual machine service. This allowed me to clear the freeze without restarting the cluster entirely which was what I needed to accomplish. --AB On Nov 5, 2009, at 4:11 PM, Aaron Benner wrote: > I have dug a little deeper on this: > > in rg_state.c the function _svc_freeze( ... ) the switch checking for the service status only lists as valid RG_STATE_{STOPPED,STARTED,DISABLED} > > Based on the output of clustat my services are in RG_STATE_MIGRATE > > Which means execution fails over to the default case which unlocks the group and returns RG_EAGAIN which is generating the "Temporary failure; try again" note below. > > What this means is that it is possible, given the scenario outlined below to have a service in the "migrating" state with the frozen flag set. Once this state is entered the rg can no longer be unfrozen because the unfreeze code expects it to eventually undergo a state change at which point you can unfreeze it. Problem is now that it's frozen it can't be stopped, disabled, etc. and so I can't force a state change. > > I saw reference to a patch to prevent migration of frozen groups, but either I'm not using that release of the code or it doesn't apply to the situation I outlined below. > > --AB > > On Nov 3, 2009, at 1:03 PM, Aaron Benner wrote: > >> All, >> >> I have a problem that I can't find documentation on and has me baffled. >> >> I have a 3 node cluster running xen with multiple domU enabled as cluster services. The individual services are set to have a node affinity using resource groups (see cluster.conf below) and live migration is enabled. >> >> I had migrated two domU off of one of the cluster nodes in anticipation of a power-cycle and network reconfig. 
Before bringing up the node that had been reconfigured I froze (clusvcadm -Z ...) the domU in question so that when the newly reconfigured node came up they would not migrate back to their preferred host, or at least that's what I *THOUGHT* -Z would do. >> >> I booted up reconfigured node, and ignoring their frozen state the rgmanager on the rebooting node initiated a migration of the domUs. The migration finished and the virtuals resumed operation on the reconfigured host. The problem is now rgmanager is showing those resrouce groups as having state "migrating" (even though there are no migration processes still active) and clusvcadm -U ... returns the following: >> >> "Local machine unfreezing vm:SaturnE...Temporary failure; try again" >> >> I get this message on all of the cluster nodes. I'm not sure if this is coming from clusvcadm, vm.sh, or some other piece of the cluster puzzle. Is there any way to get rgmanager to realize that these resource groups are no longer migrating and as such can be unfrozen? Is that even my problem? Can I fix this with anything other than a complete power down of the cluster (disaster)? >> >> --AB >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> > From colin.wang2 at nsn.com Thu Nov 12 08:19:30 2009 From: colin.wang2 at nsn.com (Wang2, Colin (NSN - CN/Cheng Du)) Date: Thu, 12 Nov 2009 16:19:30 +0800 Subject: [Linux-cluster] RHCS not fence 2nd node in 2 nodes cluster In-Reply-To: <1257320434.8353.28.camel@chn-cdrd-dhcp003182.china.nsn-net.net> References: <1257320434.8353.28.camel@chn-cdrd-dhcp003182.china.nsn-net.net> Message-ID: <1258013970.14942.2.camel@chn-cdrd-dhcp003182.china.nsn-net.net> Hi All, FYI. The issue was resolved with help from Redhat support. If use ip but name of cluster node, it will not fence node that has lowest id. Resolution, Put one mapping in /etc/hosts. Use name but ip for cluster node. BRs, Colin -----Original Message----- From: ext Wang2, Colin (NSN - CN/Cheng Du) Reply-To: linux clustering To: linux-cluster at redhat.com Subject: [Linux-cluster] RHCS not fence 2nd node in 2 nodes cluster Date: Wed, 04 Nov 2009 15:40:34 +0800 Hi Gurus, I am working on setup 2 nodes cluster, and environment is, Hardware, IBM BladeCenter with 2 LS42( AMD Opteron Quad Code 2356 CPU, 16GB Memory). Storage, EMC CX3-20f Storage Switch: Brocade 4GB 20 ports switch in IBM bladecenter. Network Switch: Cisco Switch module in IBM Bladecenter. Software, Redhat EL 5.3 x86_64, 2.6.18-128.el5 Redhat Cluster Suite 5.3. This is 2 nodes cluster, and my problem is that, - When poweroff 1st node with command "halt -fp", 2nd node can fence 1st node and take over services. - When poweroff 2nd node with command "halt -fp", 1st node can't fence 2nd node and can't take over services. 
fence_tool dump contents, ----for successful test dump read: Success 1257305495 our_nodeid 2 our_name 198.18.9.34 1257305495 listen 4 member 5 groupd 7 1257305511 client 3: join default 1257305511 delay post_join 3s post_fail 0s 1257305511 clean start, skipping initial nodes 1257305511 setid default 65538 1257305511 start default 1 members 1 2 1257305511 do_recovery stop 0 start 1 finish 0 1257305511 first complete list empty warning 1257305511 finish default 1 1257305611 stop default 1257305611 start default 3 members 2 1257305611 do_recovery stop 1 start 3 finish 1 1257305611 add node 1 to list 1 1257305611 node "198.18.9.33" not a cman member, cn 1 1257305611 node "198.18.9.33" has not been fenced 1257305611 fencing node 198.18.9.33 1257305615 finish default 3 1257305658 client 3: dump ----For failed test dump read: Success 1257300282 our_nodeid 1 our_name 198.18.9.33 1257300282 listen 4 member 5 groupd 7 1257300297 client 3: join default 1257300297 delay post_join 3s post_fail 0s 1257300297 clean start, skipping initial nodes 1257300297 setid default 65538 1257300297 start default 1 members 1 2 1257300297 do_recovery stop 0 start 1 finish 0 1257300297 first complete list empty warning 1257300297 finish default 1 1257303721 stop default 1257303721 start default 3 members 1 1257303721 do_recovery stop 1 start 3 finish 1 1257303721 add node 2 to list 1 1257303721 averting fence of node 198.18.9.34 1257303721 finish default 3 1257303759 client 3: dump I think it was caused by "averting fence of node 198.18.9.34", but why it advert fence? Could you help me out? Thanks in advance. This cluster.conf for reference. BRs, Colin -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From omerfsen at gmail.com Thu Nov 12 15:38:24 2009 From: omerfsen at gmail.com (Omer Faruk Sen) Date: Thu, 12 Nov 2009 17:38:24 +0200 Subject: [Linux-cluster] HP DL100 and RHCS Message-ID: <75a268720911120738i76de19cfka74854c9f9492df4@mail.gmail.com> Hi, I want to make a simple test setup for to play with RHCS. Does RHCS that comes with RHEL 5.4 support LO100 that comes with HP DL100 models(for fencing)? I haven't used this hardware but I am making an investigation about it. By the way what do you use to test RHCS Does Vmware ESX 3.5 supported on RHCS (5.4) as a fencing device? From teigland at redhat.com Thu Nov 12 17:25:17 2009 From: teigland at redhat.com (David Teigland) Date: Thu, 12 Nov 2009 11:25:17 -0600 Subject: [Linux-cluster] RHCS not fence 2nd node in 2 nodes cluster In-Reply-To: <1258013970.14942.2.camel@chn-cdrd-dhcp003182.china.nsn-net.net> References: <1257320434.8353.28.camel@chn-cdrd-dhcp003182.china.nsn-net.net> <1258013970.14942.2.camel@chn-cdrd-dhcp003182.china.nsn-net.net> Message-ID: <20091112172516.GE20714@redhat.com> On Thu, Nov 12, 2009 at 04:19:30PM +0800, Wang2, Colin (NSN - CN/Cheng Du) wrote: > Hi All, > > FYI. > The issue was resolved with help from Redhat support. > If use ip but name of cluster node, it will not fence node that has > lowest id. > > Resolution, > Put one mapping in /etc/hosts. > Use name but ip for cluster node. 
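A minimal sketch of that workaround (the node names below are placeholders;
only the addresses come from the dump earlier in the thread):

   # /etc/hosts on both nodes -- give each node a resolvable name
   198.18.9.33   clnode1
   198.18.9.34   clnode2

   # cluster.conf then references the names rather than the raw IPs
   <clusternodes>
     <clusternode name="clnode1" nodeid="1" votes="1"/>
     <clusternode name="clnode2" nodeid="2" votes="1"/>
   </clusternodes>
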
Appears to be this, https://bugzilla.redhat.com/show_bug.cgi?id=504158 Dave From rjerrido at outsidaz.org Thu Nov 12 16:26:31 2009 From: rjerrido at outsidaz.org (Richard Jerrido) Date: Thu, 12 Nov 2009 11:26:31 -0500 Subject: [Linux-cluster] HP DL100 and RHCS In-Reply-To: <75a268720911120738i76de19cfka74854c9f9492df4@mail.gmail.com> References: <75a268720911120738i76de19cfka74854c9f9492df4@mail.gmail.com> Message-ID: One way of testing RHCS would be to use a physical host running a xen kernel hosting some number of DomU's. You would then use the fence_xvm fencing agent to allow the physical machine (Dom0) to fence the DomU's. I use this method on a RHEL5.4 whitebox system and it works fine. On Thu, Nov 12, 2009 at 10:38 AM, Omer Faruk Sen wrote: > Hi, > > I want to make a simple test setup for to play with RHCS. Does RHCS > that comes with RHEL 5.4 support LO100 that comes with HP DL100 > models(for fencing)? I haven't used this hardware but I am making an > investigation about it. > > By the way what do you use to test RHCS Does Vmware ESX 3.5 supported > on RHCS (5.4) ?as a fencing device? > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From lhh at redhat.com Thu Nov 12 17:44:28 2009 From: lhh at redhat.com (Lon H. Hohberger) Date: Thu, 12 Nov 2009 12:44:28 -0500 Subject: [Linux-cluster] Re: clusvcadm -U returns "Temporary failure" on vm service In-Reply-To: <12A83053-CF1B-4C80-B702-D69DF752F610@gmail.com> References: <2AB261F0-F6D4-46DA-A7F0-771FB5197DD1@gmail.com> <536265D4-6354-45C3-9270-73015E5F7BBA@gmail.com> <12A83053-CF1B-4C80-B702-D69DF752F610@gmail.com> Message-ID: <1258047868.2601.3.camel@localhost> On Wed, 2009-11-11 at 16:33 -0700, Aaron Benner wrote: > Final follow up here for the archives. I managed to clear the services in the locked state by: > > 1) remove the virtual machine service definition from the cluster configuration. > 2) stop the virtual machine using 'xm shutdown domU'. > 3) re-creae the virtual machine service definition which will appear in the stopped state frozen. > 4) unfreeze the virtual machine service. > 5) start the virtual machine service. > > This allowed me to clear the freeze without restarting the cluster entirely which was what I needed to accomplish. This problem won't exhibit itself in 3.0.5. The 'migrating' state is no longer needed. Either migration will work or it won't; no more half-migrations or wedged 'migrating' states. There are still a couple of bits to clean up around error checking when migrations fail; whether this makes 3.0.5 I can't say for sure. -- Lon From suuuper at messinalug.org Thu Nov 12 18:21:50 2009 From: suuuper at messinalug.org (Giovanni Mancuso) Date: Thu, 12 Nov 2009 19:21:50 +0100 Subject: [Linux-cluster] Time to check services in RHCS Message-ID: <4AFC523E.4040901@messinalug.org> Hi guys, i don't find where i can set the time to check if services is up and running in RedHat Cluster Suite. Can you help me? P.S. I use luci/ricci to configure the cluster. From alan.zg at gmail.com Thu Nov 12 21:30:02 2009 From: alan.zg at gmail.com (Alan A) Date: Thu, 12 Nov 2009 15:30:02 -0600 Subject: [Linux-cluster] GFS Errors - cant mount gfs shares Message-ID: This is the error I get trying to start gfs service. What does this mean? 
Nov 12 15:28:20 fenmrdev04 ntpd[3340]: kernel time sync enabled 0001 Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open error 12 a11 Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open error 12 surv34 Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open error 12 account61 Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open error 12 acct63 Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open error 12 gfs_web Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open error 12 cati_gfs Nov 12 15:28:27 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open error 12 gfs_cmdr -- Alan A. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.zg at gmail.com Thu Nov 12 22:22:17 2009 From: alan.zg at gmail.com (Alan A) Date: Thu, 12 Nov 2009 16:22:17 -0600 Subject: [Linux-cluster] Re: GFS Errors - cant mount gfs shares In-Reply-To: References: Message-ID: Here are the packages that caused the lockup: [root at fenmrdev02 ~]# rpm -qa | grep sg3 sg3_utils-libs-1.25-4.el5 sg3_utils-1.25-4.el5 sg3_utils-devel-1.25-4.el5 The big question is WHY? On Thu, Nov 12, 2009 at 3:30 PM, Alan A wrote: > This is the error I get trying to start gfs service. What does this mean? > > > Nov 12 15:28:20 fenmrdev04 ntpd[3340]: kernel time sync enabled 0001 > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > error 12 a11 > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > error 12 surv34 > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > error 12 account61 > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > error 12 acct63 > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > error 12 gfs_web > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > error 12 cati_gfs > Nov 12 15:28:27 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > error 12 gfs_cmdr > > -- > Alan A. > -- Alan A. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carlopmart at gmail.com Thu Nov 12 22:40:46 2009 From: carlopmart at gmail.com (carlopmart) Date: Thu, 12 Nov 2009 23:40:46 +0100 Subject: [Linux-cluster] What about mdadm resource script on rgmanager? Message-ID: <4AFC8EEE.7090802@gmail.com> Is it approved?? http://archives.free.net.ph/message/20090817.102217.ff430e36.en.html -- CL Martinez carlopmart {at} gmail {d0t} com From teigland at redhat.com Thu Nov 12 23:49:46 2009 From: teigland at redhat.com (David Teigland) Date: Thu, 12 Nov 2009 17:49:46 -0600 Subject: [Linux-cluster] Re: GFS Errors - cant mount gfs shares In-Reply-To: References: Message-ID: <20091112234945.GA28893@redhat.com> On Thu, Nov 12, 2009 at 04:22:17PM -0600, Alan A wrote: > Here are the packages that caused the lockup: > > [root at fenmrdev02 ~]# rpm -qa | grep sg3 > sg3_utils-libs-1.25-4.el5 > sg3_utils-1.25-4.el5 > sg3_utils-devel-1.25-4.el5 These packages are unrelated to the gfs_controld errors. 
> > Nov 12 15:28:20 fenmrdev04 ntpd[3340]: kernel time sync enabled 0001 > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > > error 12 a11 > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > > error 12 surv34 > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > > error 12 account61 > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > > error 12 acct63 > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > > error 12 gfs_web > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > > error 12 cati_gfs > > Nov 12 15:28:27 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt open > > error 12 gfs_cmdr These may or may not create problems. To figure out why they happened we'd need to see "group_tool dump gfs" from each of the nodes. Dave From brem.belguebli at gmail.com Thu Nov 12 23:37:45 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Fri, 13 Nov 2009 00:37:45 +0100 Subject: [Linux-cluster] What about mdadm resource script on rgmanager? In-Reply-To: <4AFC8EEE.7090802@gmail.com> References: <4AFC8EEE.7090802@gmail.com> Message-ID: <29ae894c0911121537l32a4ff8ex21cf78d0a78e3903@mail.gmail.com> Not "yet" I don't even know if they had a glance at it 2009/11/12 carlopmart : > Is it approved?? > http://archives.free.net.ph/message/20090817.102217.ff430e36.en.html > -- > CL Martinez > carlopmart {at} gmail {d0t} com > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From teigland at redhat.com Fri Nov 13 16:26:57 2009 From: teigland at redhat.com (David Teigland) Date: Fri, 13 Nov 2009 10:26:57 -0600 Subject: [Linux-cluster] Re: GFS Errors - cant mount gfs shares In-Reply-To: References: <20091112234945.GA28893@redhat.com> Message-ID: <20091113162657.GA16831@redhat.com> On Fri, Nov 13, 2009 at 09:13:17AM -0600, Alan A wrote: > On Thu, Nov 12, 2009 at 5:49 PM, David Teigland wrote: > > > On Thu, Nov 12, 2009 at 04:22:17PM -0600, Alan A wrote: > > > Here are the packages that caused the lockup: > > > > > > [root at fenmrdev02 ~]# rpm -qa | grep sg3 > > > sg3_utils-libs-1.25-4.el5 > > > sg3_utils-1.25-4.el5 > > > sg3_utils-devel-1.25-4.el5 > > > > These packages are unrelated to the gfs_controld errors. > > > > > > Nov 12 15:28:20 fenmrdev04 ntpd[3340]: kernel time sync enabled 0001 > > > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt > > open > > > > error 12 a11 > > > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt > > open > > > > error 12 surv34 > > > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt > > open > > > > error 12 account61 > > > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt > > open > > > > error 12 acct63 > > > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt > > open > > > > error 12 gfs_web > > > > Nov 12 15:28:26 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt > > open > > > > error 12 cati_gfs > > > > Nov 12 15:28:27 fenmrdev04 gfs_controld[2935]: retrieve_plocks: ckpt > > open > > > > error 12 gfs_cmdr > > > > These may or may not create problems. To figure out why they happened > > we'd need to see "group_tool dump gfs" from each of the nodes. > > > > Dave > > > > > Here is what I started with and where I am today. > > I had only one node out of three being able to mount GFS (clust has node > 2-3-4). 
The other nodes would tell me that /dev/mapper/gfsshare was not a > block device (node 2 and 4). I worked to see what changed and I found out > that November 5th update installed sg3_utils on two of the nodes that had > problem mounting GFS. I also found (I am not sure how this happened) that > one of the node 4 had service scsi_reserve running. As soon as I removed it, > a simple reboot allowed me to mount GFS on node4, but node 2 sill had the > same problem same errors. I tried looking if there is SCSI key reservation > active on one of the volumes, but no luck, no key was returned on any of the > GFS volumes. > > Today, something different..... > I am not sure what is going on but I can't mount GFS on all three nodes. I > was able to mount it on node2, but then I restarted node3 and everything > went to hell again. > > Here is the output from gfs_tool dump at the time when GFS was mounted: The retrieve_plocks errors are a harmless side effect of the failing mount syscalls, which are returning ENODEV. Are you using fence_scsi? I'm guessing not since you didn't have sg3_utils until now. As bizarre as it may sound, it seems that init.d/scsi_reserve may be applying scsi reservations on your devices, which you don't want of course, and which would explain the mount errors. I don't know how or why scsi_reserve is running, but you need to disable it (again assuming you're not using fence_scsi for your cluster.) Dave From lhh at redhat.com Fri Nov 13 18:15:35 2009 From: lhh at redhat.com (Lon H. Hohberger) Date: Fri, 13 Nov 2009 13:15:35 -0500 Subject: [Linux-cluster] What about mdadm resource script on rgmanager? In-Reply-To: <29ae894c0911121537l32a4ff8ex21cf78d0a78e3903@mail.gmail.com> References: <4AFC8EEE.7090802@gmail.com> <29ae894c0911121537l32a4ff8ex21cf78d0a78e3903@mail.gmail.com> Message-ID: <1258136135.2615.16.camel@localhost> On Fri, 2009-11-13 at 00:37 +0100, brem belguebli wrote: > Not "yet" > > I don't even know if they had a glance at it I looked at it ... but it got pushed out of my head later. I'm sorry. :( It should be possible to add, but I think 3.0.6 is as early as we'll see it. If I read it right, it functions a lot the LVM agent - assemble on one host at a time. This should be fine. Some minor changes will need to be made to fix tmpfile annoyances ( $(mktemp -f /tmp/foo.XXXXX) instead of /tmp/foo.$$ ) and we need to tag one of the attributes as 'primary' (probably raidconf) in the metadata, but otherwise it looks good. Marek should also look at it as well if he hasn't yet. -- Lon From gordan at bobich.net Sat Nov 14 02:11:42 2009 From: gordan at bobich.net (Gordan Bobic) Date: Sat, 14 Nov 2009 02:11:42 +0000 Subject: [Linux-cluster] openais[5817]: [TOTEM] The token was lost in the OPERATIONAL state. In-Reply-To: <5A3CA8FF800F1E418CC86A0FCC71A36B06D11095@PWR-XCH-03.pwrutc.com> References: <5A3CA8FF800F1E418CC86A0FCC71A36B06D11095@PWR-XCH-03.pwrutc.com> Message-ID: <4AFE11DE.70002@bobich.net> Swift, Jon S PWR wrote: > All, > I have a 2 node test cluster made up of Dell 1850's with only > virtual IP's as services supporting NFS on 3 GFS2 file systems using > RHEL5U4 64 bit. Both nodes of the cluster export/share all 3 file > systems all the time. When I create a NFS load that reduces the CPU > %idle to less than 75% (as shown by top or vmstat) I have problems with > my cluster crashing. Have you tried the same setup with GFS1 instead of GFS2? 
Gordan From brem.belguebli at gmail.com Sat Nov 14 02:42:32 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Sat, 14 Nov 2009 03:42:32 +0100 Subject: [Linux-cluster] What about mdadm resource script on rgmanager? In-Reply-To: <1258136135.2615.16.camel@localhost> References: <4AFC8EEE.7090802@gmail.com> <29ae894c0911121537l32a4ff8ex21cf78d0a78e3903@mail.gmail.com> <1258136135.2615.16.camel@localhost> Message-ID: <29ae894c0911131842g593c9b4bqe90a6aa7f88792e@mail.gmail.com> Hi, Good news to hear that you've looked at it. I have posted a corrected version a little while after the original post at https://www.redhat.com/archives/linux-cluster/2009-August/msg00179.html which contains indeed the primary set on raidconf. Brem 2009/11/13 Lon H. Hohberger : > On Fri, 2009-11-13 at 00:37 +0100, brem belguebli wrote: >> Not "yet" >> >> I don't even know if they had a glance at it > > I looked at it ... but it got pushed out of my head later. ?I'm > sorry. :( > > It should be possible to add, but I think 3.0.6 is as early as we'll see > it. > > If I read it right, it functions a lot the LVM agent - assemble on one > host at a time. ?This should be fine. > > Some minor changes will need to be made to fix tmpfile annoyances > ( $(mktemp -f /tmp/foo.XXXXX) instead of /tmp/foo.$$ ) and we need to > tag one of the attributes as 'primary' (probably raidconf) in the > metadata, but otherwise it looks good. > > Marek should also look at it as well if he hasn't yet. > > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From Jon.Swift at pwr.utc.com Sat Nov 14 04:19:27 2009 From: Jon.Swift at pwr.utc.com (Swift, Jon S PWR) Date: Fri, 13 Nov 2009 20:19:27 -0800 Subject: [Linux-cluster] openais[5817]: [TOTEM] The token was lost inthe OPERATIONAL state. In-Reply-To: <4AFE11DE.70002@bobich.net> References: <5A3CA8FF800F1E418CC86A0FCC71A36B06D11095@PWR-XCH-03.pwrutc.com> <4AFE11DE.70002@bobich.net> Message-ID: <5A3CA8FF800F1E418CC86A0FCC71A36B06D11099@PWR-XCH-03.pwrutc.com> Yes, Same problem. I do not think this is a GFS/GFS2 problem. I think it is an openais problem. Openais is what is making the decision to move the services and to fence the remote system. Any ideas? Jon -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Gordan Bobic Sent: Friday, November 13, 2009 6:12 PM To: linux clustering Subject: Re: [Linux-cluster] openais[5817]: [TOTEM] The token was lost inthe OPERATIONAL state. Swift, Jon S PWR wrote: > All, > I have a 2 node test cluster made up of Dell 1850's with only > virtual IP's as services supporting NFS on 3 GFS2 file systems using > RHEL5U4 64 bit. Both nodes of the cluster export/share all 3 file > systems all the time. When I create a NFS load that reduces the CPU > %idle to less than 75% (as shown by top or vmstat) I have problems > with my cluster crashing. Have you tried the same setup with GFS1 instead of GFS2? 
Gordan -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From kkovachev at varna.net Sat Nov 14 11:24:45 2009 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Sat, 14 Nov 2009 13:24:45 +0200 Subject: [Linux-cluster] Cluster death Message-ID: <20091114105155.M7450@varna.net> Hello, yesterday the cluster died after a short network outage i guess and Node2 (which is the only one with single NIC without bonding) have been fenced, but services not relocated and after rebooted was unable to mount the GFS2 shares and i had to reboot the entire cluster. It's a 4 node cluster with Node1 and Node2 booting from the network (from Stor1 and Stor2). From the logs i can see that both Stor1 and Stor2 have the same trace (below in csv format from the rsyslog database as nodes don't have local storage) which is probably why Node2 was unable to rejoin the cluster even it was fenced. "2009-11-13 16:41:12","Stor1","kernel","emerg","kernel:"," [773501.547726] ------------[ cut here ]------------" "2009-11-13 16:41:12","Stor1","kernel","critical","kernel:"," [773501.559005] kernel BUG at fs/inode.c:1323!" "2009-11-13 16:41:12","Stor1","kernel","emerg","kernel:"," [773501.559251] invalid opcode: 0000 [#1] SMP " "2009-11-13 16:41:12","Stor1","kernel","emerg","kernel:"," [773501.559516] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.559946] CPU 6 " "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.560173] Modules linked in: gfs2 dlm configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iscsi_trgt drbd e1000e [last unloaded: configfs]" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.560690] Pid: 7742, comm: dlm_send Not tainted 2.6.31.4 #13 X8STi" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.560910] RIP: 0010:[] [] iput+0x1b/0x65" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.561391] RSP: 0018:ffff8801a85fdc60 EFLAGS: 00010246" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.561625] RAX: 0000000000000000 RBX: ffff8801af7cb488 RCX: ffff8801af7cb440" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.562053] RDX: ffff8801b7e25480 RSI: ffffea0005e633c0 RDI: ffff8801af7cb488" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.562545] RBP: ffff8801a85fdc70 R08: ffff8801a85fdbe0 R09: 8000000000000000" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.562976] R10: 0000000000000000 R11: ffff8801a85fdbc0 R12: ffff8801af7cb440" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.563480] R13: ffff8801add24010 R14: ffff8801add24030 R15: ffff8801a8600000" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.563920] FS: 0000000000000000(0000) GS:ffff8800280e8000(0000) knlGS:0000000000000000" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.564401] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.564652] CR2: 00007f258933d9c8 CR3: 00000001bcd03000 CR4: 00000000000006e0" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.565082] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.565568] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," 
[773501.565992] Process dlm_send (pid: 7742, threadinfo ffff8801a85fc000, task ffff8801a8600000)" "2009-11-13 16:41:12","Stor1","kernel","emerg","kernel:"," [773501.566487] Stack:" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.566698] 0000000000000000 0000000000000000 ffff8801a85fdc90 ffffffff815dedc0" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.566932] <0> 0000000000000000 ffff8801add24000 ffff8801a85fddd0 ffffffffa00e2f0f" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.567418] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000" "2009-11-13 16:41:12","Stor1","kernel","emerg","kernel:"," [773501.568081] Call Trace:" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.568347] [] sock_release+0x5c/0x6c" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.568609] [] tcp_connect_to_sock+0x1f4/0x247 [dlm]" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.568834] [] ? __wake_up+0x43/0x50" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.569063] [] process_send_sockets+0x31/0x1a5 [dlm]" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.569334] [] ? tcp_sendpage+0x0/0x46e" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.569592] [] worker_thread+0x17d/0x222" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.569821] [] ? process_send_sockets+0x0/0x1a5 [dlm]" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.570044] [] ? autoremove_wake_function+0x0/0x38" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.570305] [] ? worker_thread+0x0/0x222" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.570551] [] kthread+0x8f/0x97" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.570769] [] child_rip+0xa/0x20" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.570989] [] ? resched_task+0x6a/0x6e" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.571236] [] ? kthread+0x0/0x97" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.571490] [] ? child_rip+0x0/0x20" "2009-11-13 16:41:12","Stor1","kernel","emerg","kernel:"," [773501.571721] Code: e8 28 96 00 00 eb df 48 83 c4 20 5b 41 5c c9 c3 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 50 48 83 bf 10 02 00 00 40 75 04 <0f> 0b eb fe 48 8d 7f 48 48 c7 c6 10 b9 d5 81 e8 42 0b 24 00 85 " "2009-11-13 16:41:12","Stor1","kernel","alert","kernel:"," [773501.572592] RIP [] iput+0x1b/0x65" "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.572824] RSP " "2009-11-13 16:41:12","Stor1","kernel","warning","kernel:"," [773501.573739] ---[ end trace 209b2e1c9e1ac145 ]---" "2009-11-13 16:41:12","Stor2","kernel","emerg","kernel:"," [775403.412503] ------------[ cut here ]------------" "2009-11-13 16:41:12","Stor2","kernel","critical","kernel:"," [775403.412887] kernel BUG at fs/inode.c:1323!" 
"2009-11-13 16:41:12","Stor2","kernel","emerg","kernel:"," [775403.413262] invalid opcode: 0000 [#1] SMP " "2009-11-13 16:41:12","Stor2","kernel","emerg","kernel:"," [775403.413387] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] CPU 3 " "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] Modules linked in: gfs2 dlm configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iscsi_trgt drbd igb [last unloaded: configfs]" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] Pid: 5478, comm: dlm_send Not tainted 2.6.31.4 #3 Unknow" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] RIP: 0010:[] [] iput+0x1b/0x65" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] RSP: 0018:ffff88012a49dc60 EFLAGS: 00010246" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] RAX: 0000000000000000 RBX: ffff88013210bcc8 RCX: ffff88013210bc80" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] RDX: ffff88013a5be800 RSI: ffffea00042f39c0 RDI: ffff88013210bcc8" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] RBP: ffff88012a49dc70 R08: ffff88012a49dbe0 R09: 8000000000000000" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] R10: ffff88012c1dbca8 R11: ffff88012a49dbc0 R12: ffff88013210bc80" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] R13: ffff88012c1671c0 R14: ffff88012c1671e0 R15: ffff88012a490f20" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] FS: 00007fb62c5086f0(0000) GS:ffff88002807c000(0000) knlGS:0000000000000000" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] CR2: 00007fb62c522e49 CR3: 0000000129caf000 CR4: 00000000000006e0" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000" "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.413387] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] Process dlm_send (pid: 5478, threadinfo ffff88012a49c000, task ffff88012a490f20)" "2009-11-13 16:41:12","Stor2","kernel","emerg","kernel:"," [775403.413387] Stack:" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] 0000000000000000 0000000000000000 ffff88012a49dc90 ffffffff815dedc0" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] <0> 0000000000000000 ffff88012c1671b0 ffff88012a49ddd0 ffffffffa00d9f0f" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] <0> ffff88012a490f58 ffff8800280917e0 0000000000000001 ffffc900116de088" "2009-11-13 16:41:12","Stor2","kernel","emerg","kernel:"," [775403.413387] Call Trace:" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] sock_release+0x5c/0x6c" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] tcp_connect_to_sock+0x1f4/0x247 [dlm]" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] ? 
__wake_up+0x43/0x50" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] process_send_sockets+0x31/0x1a5 [dlm]" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] ? tcp_sendpage+0x0/0x46e" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] worker_thread+0x17d/0x222" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] ? process_send_sockets+0x0/0x1a5 [dlm]" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] ? autoremove_wake_function+0x0/0x38" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] ? worker_thread+0x0/0x222" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] kthread+0x8f/0x97" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] child_rip+0xa/0x20" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] ? kthread+0x0/0x97" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] [] ? child_rip+0x0/0x20" "2009-11-13 16:41:12","Stor2","kernel","emerg","kernel:"," [775403.413387] Code: e8 28 96 00 00 eb df 48 83 c4 20 5b 41 5c c9 c3 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 50 48 83 bf 10 02 00 00 40 75 04 <0f> 0b eb fe 48 8d 7f 48 48 c7 c6 10 b9 d5 81 e8 42 0b 24 00 85 " "2009-11-13 16:41:12","Stor2","kernel","alert","kernel:"," [775403.413387] RIP [] iput+0x1b/0x65" "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.413387] RSP " "2009-11-13 16:41:12","Stor2","kernel","warning","kernel:"," [775403.453850] ---[ end trace 1a43987c53646c14 ]---" "2009-11-13 16:41:12","Stor1","local4","notice","corosync[7441]:"," [MAIN ] Completed service synchronization, ready to provide service." "2009-11-13 16:41:12","Stor2","local4","notice","corosync[5211]:"," [MAIN ] Completed service synchronization, ready to provide service." "2009-11-13 16:41:12","Node1","local4","notice","corosync[4710]:"," [MAIN ] Completed service synchronization, ready to provide service." "2009-11-13 16:41:12","Stor2","local4","info","fenced[5286]:"," fencing deferred to Node1" "2009-11-13 16:41:12","Stor1","local4","info","fenced[7529]:"," fencing deferred to Node1" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.685992] GFS2: fsid=Cluster1:Hosting.0: jid=2: Trying to acquire journal lock..." "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.553450] GFS2: fsid=Cluster1:Hosting.1: jid=2: Trying to acquire journal lock..." "2009-11-13 16:41:12","Node1","local4","info","fenced[4831]:"," fencing node Node2" "2009-11-13 16:41:12","Stor1","kernel","info","kernel:"," [773501.689702] GFS2: fsid=Cluster1:Services.2: jid=1: Trying to acquire journal lock..." "2009-11-13 16:41:12","Stor2","kernel","info","kernel:"," [775403.557265] GFS2: fsid=Cluster1:Services.3: jid=1: Trying to acquire journal lock..." "2009-11-13 16:41:12","Node1","kernel","info","kernel:"," [775171.695293] GFS2: fsid=Cluster1:Services.0: jid=1: Trying to acquire journal lock..." "2009-11-13 16:41:12","Node1","kernel","info","kernel:"," [775171.697581] GFS2: fsid=Cluster1:Mails.0: jid=1: Trying to acquire journal lock..." "2009-11-13 16:41:26","Node1","local4","error","fenced[4831]:"," fence Node2 success" "2009-11-13 16:41:32","Stor1","auth","info","sshd[7120]:"," Timeout, client not responding." 
I have just increased post_fail_delay from 0 to 5 in the hope of avoiding future problems, but maybe I should also build configfs into the kernel instead of as a module, to avoid the crash?

From mbmartinbadie at gmail.com Sun Nov 15 08:59:41 2009 From: mbmartinbadie at gmail.com (Martin Badie) Date: Sun, 15 Nov 2009 10:59:41 +0200 Subject: [Linux-cluster] clustat failed state Message-ID: Hi, I have tried to set up a simple HTTP cluster, but when I reboot a node manually the cluster service state goes to failed and manual intervention is required. Can someone enlighten me as to what is wrong with the following config? If I fence one node, services are taken over by the other node, but not on a reboot. Here is my config: -------------- next part -------------- An HTML attachment was scrubbed... URL:

From rmicmirregs at gmail.com Sun Nov 15 10:59:15 2009 From: rmicmirregs at gmail.com (Rafael =?ISO-8859-1?Q?Mic=F3?= Miranda) Date: Sun, 15 Nov 2009 11:59:15 +0100 Subject: [Linux-cluster] What about mdadm resource script on rgmanager? In-Reply-To: <1258136135.2615.16.camel@localhost> References: <4AFC8EEE.7090802@gmail.com> <29ae894c0911121537l32a4ff8ex21cf78d0a78e3903@mail.gmail.com> <1258136135.2615.16.camel@localhost> Message-ID: <1258282755.6570.3.camel@mecatol> Hi Lon, El vie, 13-11-2009 a las 13:15 -0500, Lon H. Hohberger escribió: > On Fri, 2009-11-13 at 00:37 +0100, brem belguebli wrote: > > Not "yet" > > > > I don't even know if they had a glance at it > > I looked at it ... but it got pushed out of my head later. I'm > sorry. :( > > It should be possible to add, but I think 3.0.6 is as early as we'll see > it. > > If I read it right, it functions a lot the LVM agent - assemble on one > host at a time. This should be fine. > > Some minor changes will need to be made to fix tmpfile annoyances > ( $(mktemp -f /tmp/foo.XXXXX) instead of /tmp/foo.$$ ) and we need to > tag one of the attributes as 'primary' (probably raidconf) in the > metadata, but otherwise it looks good. > > Marek should also look at it as well if he hasn't yet. > > -- Lon > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster I must ask: I provided an LVM resource script some months ago; will it be studied? I provided a new version at: https://www.redhat.com/archives/cluster-devel/2009-October/msg00012.html I'm pretty interested in it, and I would make any necessary changes to it if needed. Thanks in advance, Rafael -- Rafael Micó Miranda

From brem.belguebli at gmail.com Sun Nov 15 11:17:37 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Sun, 15 Nov 2009 12:17:37 +0100 Subject: [Linux-cluster] Fwd: CLVM exclusive mode In-Reply-To: <1258282924.6570.6.camel@mecatol> References: <4A7AA7DE02000027000536E3@lucius.provo.novell.com> <29ae894c0910011557r3a4a716dudd0f8787ba81803@mail.gmail.com> <29ae894c0910020022h3fb8e2er2e3569441ea6f30a@mail.gmail.com> <1254513882.6508.10.camel@mecatol> <29ae894c0910021408o366ce744m120b41f65b7e579f@mail.gmail.com> <1254518717.6501.10.camel@mecatol> <29ae894c0910021506l95d31c3x742e76eaf3a0f50d@mail.gmail.com> <1254647645.6501.0.camel@mecatol> <29ae894c0910040528t2098b281n5f848cfe83cf8b43@mail.gmail.com> <1258282924.6570.6.camel@mecatol> Message-ID: <29ae894c0911150317k48c6b732r665ee9ce42552f5e@mail.gmail.com> Hi Rafael, Yep, the fix will be put in LVM2-2.02-54. Concerning the md resource script, I was glad to learn that Lon has considered it. I hope your LVM exclusive one will be considered too, as I do rely on it. Brem 2009/11/15 Rafael Micó
Miranda : > Hi Brem, > > I just saw your email and answered Lon about taking a look at my LVM > resource script. > > Also, i just checked that the LVM exclusive flag seems to be fixed: > > https://bugzilla.redhat.com/show_bug.cgi?id=517900 > > So maybe all this stuff gets working some day. > > Cheers, > > Rafael > > El dom, 04-10-2009 a las 14:28 +0200, brem belguebli escribi?: >> Hi Rafael, >> >> Nada, but I won't put no pressure about that as things are being >> evolving with LVM mirror.... >> >> Brem >> >> >> >> 2009/10/4 Rafael Mic? Miranda : >> > Hi Brem, >> > >> > >> >> >> >> For your resource, you should send a direct email to Lon Hohberger and >> >> Fabio di Nitto ask them >> >> what they think of it and if they consider it to be incorporated with >> >> rgmanager. >> >> >> > >> > Did you have any success or reply or something about your mdadm resource >> > script? >> > >> > Cheers, >> > >> > Rafael >> > >> > -- >> > Rafael Mic? Miranda >> > >> > > -- > Rafael Mic? Miranda > > From carlopmart at gmail.com Mon Nov 16 08:16:43 2009 From: carlopmart at gmail.com (carlopmart) Date: Mon, 16 Nov 2009 09:16:43 +0100 Subject: [Linux-cluster] Minimal partition size for GFS2 filesystem Message-ID: <4B010A6B.8060500@gmail.com> Hi all, Which is the minimal partition size that needs GFS2?? Thanks. -- CL Martinez carlopmart {at} gmail {d0t} com From teigland at redhat.com Mon Nov 16 16:11:32 2009 From: teigland at redhat.com (David Teigland) Date: Mon, 16 Nov 2009 10:11:32 -0600 Subject: [Linux-cluster] openais[5817]: [TOTEM] The token was lost in the OPERATIONAL state. In-Reply-To: <5A3CA8FF800F1E418CC86A0FCC71A36B06D11095@PWR-XCH-03.pwrutc.com> References: <5A3CA8FF800F1E418CC86A0FCC71A36B06D11095@PWR-XCH-03.pwrutc.com> Message-ID: <20091116161132.GA14875@redhat.com> On Fri, Nov 13, 2009 at 03:38:04PM -0800, Swift, Jon S PWR wrote: > All, > I have a 2 node test cluster made up of Dell 1850's with only > virtual IP's as services supporting NFS on 3 GFS2 file systems using > RHEL5U4 64 bit. Both nodes of the cluster export/share all 3 file > systems all the time. When I create a NFS load that reduces the CPU > %idle to less than 75% (as shown by top or vmstat) I have problems with > my cluster crashing. I'm using iozone to generate the load from separate > NFS clients. nfsv3 or v4? v4 might do better. > The higher the load on the cluster the more often this > happens. Under a very heavy load it will fail within 5 minutes. But with > a light load, CPU %idle above 75% I see no problems. One system logs > messages like the following, the other one crashes. Most of this CPU > load is I/O wait time. The private network connecting my 2 node cluster > together is currently a cat5 cross over cable. I tried a 10/100/1000 hub > as well, but with it in I was logging collisions. The private network is > using IP's 192.168.15.1 (hostname ic-cnfs01) and 192.168.15.2 (hostname > ic-cnfs02). The storage is an EMC CX3-40, with PowerPath supporting the > logical volumes the GFS2 file systems are built on. > > How do I prevent this condition from happening? Thanks in > advance. > > > Nov 13 11:39:14 cnfs01 openais[5817]: [TOTEM] The token was lost in the > OPERATIONAL state. This is the standard message you get when the token doesn't arrive within the timeout period. You could try increasing the token timeout to say 30 seconds, (default is 10 seconds) > The cluster.conf file is below > > > > The openais.conf file is below openais.conf is ignored when using cman. 
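To illustrate the token-timeout suggestion above: with cman the totem token value goes into cluster.conf rather than openais.conf, and it is given in milliseconds. A minimal sketch (the cluster name and config_version here are placeholders, not taken from this thread):

  <cluster name="cnfs" config_version="3">
    <totem token="30000"/>
    <!-- existing clusternodes, fencedevices and rm sections unchanged -->
  </cluster>

The new value only takes effect once the updated configuration has been propagated to both nodes and cman has re-read it.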
Dave From carlopmart at gmail.com Tue Nov 17 11:32:24 2009 From: carlopmart at gmail.com (carlopmart) Date: Tue, 17 Nov 2009 12:32:24 +0100 Subject: [Linux-cluster] Re: Minimal partition size for GFS2 filesystem In-Reply-To: <4B010A6B.8060500@gmail.com> References: <4B010A6B.8060500@gmail.com> Message-ID: <4B0289C8.50409@gmail.com> carlopmart wrote: > Hi all, > > Which is the minimal partition size that needs GFS2?? > > Thanks. > Please, any hints? -- CL Martinez carlopmart {at} gmail {d0t} com From swhiteho at redhat.com Tue Nov 17 11:34:29 2009 From: swhiteho at redhat.com (Steven Whitehouse) Date: Tue, 17 Nov 2009 11:34:29 +0000 Subject: [Linux-cluster] Re: Minimal partition size for GFS2 filesystem In-Reply-To: <4B0289C8.50409@gmail.com> References: <4B010A6B.8060500@gmail.com> <4B0289C8.50409@gmail.com> Message-ID: <1258457669.6052.896.camel@localhost.localdomain> Hi, On Tue, 2009-11-17 at 12:32 +0100, carlopmart wrote: > carlopmart wrote: > > Hi all, > > > > Which is the minimal partition size that needs GFS2?? > > > > Thanks. > > > > Please, any hints? > I'm not sure I understand the question. Are you asking what is the minimum size of a GFS2 filesystem? That depends a lot on the journal sizes, but it wouldn't make sense to make it too small, Steve. From carlopmart at gmail.com Tue Nov 17 11:40:04 2009 From: carlopmart at gmail.com (carlopmart) Date: Tue, 17 Nov 2009 12:40:04 +0100 Subject: [Linux-cluster] Re: Minimal partition size for GFS2 filesystem In-Reply-To: <1258457669.6052.896.camel@localhost.localdomain> References: <4B010A6B.8060500@gmail.com> <4B0289C8.50409@gmail.com> <1258457669.6052.896.camel@localhost.localdomain> Message-ID: <4B028B94.2020500@gmail.com> Steven Whitehouse wrote: > Hi, > > On Tue, 2009-11-17 at 12:32 +0100, carlopmart wrote: >> carlopmart wrote: >>> Hi all, >>> >>> Which is the minimal partition size that needs GFS2?? >>> >>> Thanks. >>> >> Please, any hints? >> > > I'm not sure I understand the question. Are you asking what is the > minimum size of a GFS2 filesystem? That depends a lot on the journal > sizes, but it wouldn't make sense to make it too small, > > Steve. > > I need to create a GFS2 filesystem only for two nodes to store some configuration text files ... -- CL Martinez carlopmart {at} gmail {d0t} com From kkovachev at varna.net Tue Nov 17 11:42:11 2009 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Tue, 17 Nov 2009 13:42:11 +0200 Subject: [Linux-cluster] Re: Minimal partition size for GFS2 filesystem In-Reply-To: <4B028B94.2020500@gmail.com> References: <4B010A6B.8060500@gmail.com> <4B0289C8.50409@gmail.com> <1258457669.6052.896.camel@localhost.localdomain> <4B028B94.2020500@gmail.com> Message-ID: <20091117114143.M46261@varna.net> On Tue, 17 Nov 2009 12:40:04 +0100, carlopmart wrote > Steven Whitehouse wrote: > > Hi, > > > > On Tue, 2009-11-17 at 12:32 +0100, carlopmart wrote: > >> carlopmart wrote: > >>> Hi all, > >>> > >>> Which is the minimal partition size that needs GFS2?? > >>> > >>> Thanks. > >>> > >> Please, any hints? > >> > > > > I'm not sure I understand the question. Are you asking what is the > > minimum size of a GFS2 filesystem? That depends a lot on the journal > > sizes, but it wouldn't make sense to make it too small, > > > > Steve. > > > > > > I need to create a GFS2 filesystem only for two nodes to store some configuration > text files ... Why not just rsync some folder? 
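As a rough sketch of that suggestion (the directory and peer name below are invented for illustration, not taken from the thread), a periodic one-way copy is often enough for a handful of small configuration files:

  # on the node holding the master copy, e.g. from a cron entry
  rsync -az --delete /etc/myapp/conf/ node2:/etc/myapp/conf/

Whether this is good enough depends on how quickly the other node needs to see changes, which is the question raised next.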
> -- > CL Martinez > carlopmart {at} gmail {d0t} com > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From carlopmart at gmail.com Tue Nov 17 11:53:05 2009 From: carlopmart at gmail.com (carlopmart) Date: Tue, 17 Nov 2009 12:53:05 +0100 Subject: [Linux-cluster] Re: Minimal partition size for GFS2 filesystem In-Reply-To: <20091117114143.M46261@varna.net> References: <4B010A6B.8060500@gmail.com> <4B0289C8.50409@gmail.com> <1258457669.6052.896.camel@localhost.localdomain> <4B028B94.2020500@gmail.com> <20091117114143.M46261@varna.net> Message-ID: <4B028EA1.9040807@gmail.com> Kaloyan Kovachev wrote: > On Tue, 17 Nov 2009 12:40:04 +0100, carlopmart wrote >> Steven Whitehouse wrote: >>> Hi, >>> >>> On Tue, 2009-11-17 at 12:32 +0100, carlopmart wrote: >>>> carlopmart wrote: >>>>> Hi all, >>>>> >>>>> Which is the minimal partition size that needs GFS2?? >>>>> >>>>> Thanks. >>>>> >>>> Please, any hints? >>>> >>> I'm not sure I understand the question. Are you asking what is the >>> minimum size of a GFS2 filesystem? That depends a lot on the journal >>> sizes, but it wouldn't make sense to make it too small, >>> >>> Steve. >>> >>> >> I need to create a GFS2 filesystem only for two nodes to store some > configuration >> text files ... > > Why not just rsync some folder? > Maybe an option, but how can I sync configuration files in real time with rsync?? I don't see very clear ... -- CL Martinez carlopmart {at} gmail {d0t} com From brem.belguebli at gmail.com Tue Nov 17 11:53:34 2009 From: brem.belguebli at gmail.com (brem belguebli) Date: Tue, 17 Nov 2009 12:53:34 +0100 Subject: [Linux-cluster] Re: Minimal partition size for GFS2 filesystem In-Reply-To: <20091117114143.M46261@varna.net> References: <4B010A6B.8060500@gmail.com> <4B0289C8.50409@gmail.com> <1258457669.6052.896.camel@localhost.localdomain> <4B028B94.2020500@gmail.com> <20091117114143.M46261@varna.net> Message-ID: <29ae894c0911170353q1f23bbd9y5ce89ae30941aeb1@mail.gmail.com> I think the constraint is just like for regular filesystems. 1 GB should be right, shouldn't it ? 2009/11/17, Kaloyan Kovachev : > On Tue, 17 Nov 2009 12:40:04 +0100, carlopmart wrote > > Steven Whitehouse wrote: > > > Hi, > > > > > > On Tue, 2009-11-17 at 12:32 +0100, carlopmart wrote: > > >> carlopmart wrote: > > >>> Hi all, > > >>> > > >>> Which is the minimal partition size that needs GFS2?? > > >>> > > >>> Thanks. > > >>> > > >> Please, any hints? > > >> > > > > > > I'm not sure I understand the question. Are you asking what is the > > > minimum size of a GFS2 filesystem? That depends a lot on the journal > > > sizes, but it wouldn't make sense to make it too small, > > > > > > Steve. > > > > > > > > > > I need to create a GFS2 filesystem only for two nodes to store some > configuration > > text files ... > > Why not just rsync some folder? 
> > > -- > > CL Martinez > > carlopmart {at} gmail {d0t} com > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From swhiteho at redhat.com Tue Nov 17 12:07:00 2009 From: swhiteho at redhat.com (Steven Whitehouse) Date: Tue, 17 Nov 2009 12:07:00 +0000 Subject: [Linux-cluster] Re: Minimal partition size for GFS2 filesystem In-Reply-To: <29ae894c0911170353q1f23bbd9y5ce89ae30941aeb1@mail.gmail.com> References: <4B010A6B.8060500@gmail.com> <4B0289C8.50409@gmail.com> <1258457669.6052.896.camel@localhost.localdomain> <4B028B94.2020500@gmail.com> <20091117114143.M46261@varna.net> <29ae894c0911170353q1f23bbd9y5ce89ae30941aeb1@mail.gmail.com> Message-ID: <1258459620.6052.898.camel@localhost.localdomain> Hi, On Tue, 2009-11-17 at 12:53 +0100, brem belguebli wrote: > I think the constraint is just like for regular filesystems. > > 1 GB should be right, shouldn't it ? > Well there are journals of 128M each (default) so for two nodes, thats 256M, so with one or two smaller structures (they take only a few blocks each) 1G would seem a sensible minimum in this case, Steve. From kkovachev at varna.net Tue Nov 17 12:48:52 2009 From: kkovachev at varna.net (Kaloyan Kovachev) Date: Tue, 17 Nov 2009 14:48:52 +0200 Subject: [Linux-cluster] Re: Minimal partition size for GFS2 filesystem In-Reply-To: <1258459620.6052.898.camel@localhost.localdomain> References: <4B010A6B.8060500@gmail.com> <4B0289C8.50409@gmail.com> <1258457669.6052.896.camel@localhost.localdomain> <4B028B94.2020500@gmail.com> <20091117114143.M46261@varna.net> <29ae894c0911170353q1f23bbd9y5ce89ae30941aeb1@mail.gmail.com> <1258459620.6052.898.camel@localhost.localdomain> Message-ID: <20091117124052.M18458@varna.net> On Tue, 17 Nov 2009 12:07:00 +0000, Steven Whitehouse wrote > Hi, > > On Tue, 2009-11-17 at 12:53 +0100, brem belguebli wrote: > > I think the constraint is just like for regular filesystems. > > > > 1 GB should be right, shouldn't it ? > > > Well there are journals of 128M each (default) so for two nodes, thats > 256M, so with one or two smaller structures (they take only a few blocks > each) 1G would seem a sensible minimum in this case, > Yes 128M is by default but the minimum is 8M, so 8x2 jurnals = 16M + some space for resource groups = 19M fs is possible with ~1M for data ... just confirmed: dd if=/dev/zero of=/tmp/test_gfs bs=1M count=19 losetup /dev/loop0 /tmp/test_gfs mkfs.gfs2 /dev/loop0 -j 2 -J 8 -p lock_dlm -t a:b mount -o lockproto=lock_nolock /dev/loop0 /mnt/tmp/ df -h result /dev/loop0 19M 19M 840K 96% /mnt/tmp > Steve. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From ccaulfie at redhat.com Tue Nov 17 13:40:14 2009 From: ccaulfie at redhat.com (Christine Caulfield) Date: Tue, 17 Nov 2009 13:40:14 +0000 Subject: [Linux-cluster] Corosync & SELinux in Fedora 12 Message-ID: <4B02A7BE.3040907@redhat.com> Hi all, Fedora12 has a full policy in it for Red Hat Cluster Suite and corosync, so it should be quite possible to run clustering with SELinux in enforcing mode now. It has been fairly well tested but there still could be some areas left that need attention, please report a problem in the Red Hat bugzilla if you see any unwanted AVCs. 
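For anyone unsure where such denials show up, a generic way to check (assuming auditd from the audit package is running; these commands are not specific to the cluster packages) is:

  # list recent AVC denials
  ausearch -m avc -ts recent
  # or search the raw audit log
  grep "avc:.*denied" /var/log/audit/audit.log

getenforce will confirm whether the machine is actually running in enforcing mode.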
There is currently one known problem (and that's why I'm ccing the openais list too), and that is if you run corosync without cman you could get some AVCs. This problem is fixed in the selinux policy revision -43, but -41 is in Fedora GA so it might be a little while before it reaches the archives. In the meantime the problem is easily fixed with a single command: # chcon -t initrc_exec_t /etc/init.d/corosync Chrissie From kpodesta at redbrick.dcu.ie Tue Nov 17 15:01:40 2009 From: kpodesta at redbrick.dcu.ie (Karl Podesta) Date: Tue, 17 Nov 2009 15:01:40 +0000 Subject: [Linux-cluster] Quorum Disk on 2 nodes out of 4? Message-ID: <20091117150140.GA4174@minerva.redbrick.dcu.ie> Hi there, Is it possible to have a quorum disk, applicable only to 2 nodes out of a 4 node cluster? (i.e. with the other 2 nodes not connected to the shared quorum disk storage, or not affected by failover or service operation on the 2 nodes that are sharing a disk?) I have encountered the following scenario with a 4-node cluster: (using RHEL 4.8) ============= Nodes 1 & 2: Acting as database nodes, sharing a database service between them (there is a shared disk with an ext3 partition that is only mounted on one node at a time) Node 3: Standalone network service, with an IP address. No shared storage. Network service uses the database on nodes 1 & 2. Node 4: Standalone network service, with an IP address. No shared storage. Network service uses the database on nodes 1 & 2. ============= Failover domains are configured appropriately (one for Nodes 1 & 2, one for Node 3, one for Node 4). The owner of the cluster would like to introduce a shared quorum disk to the two DB nodes, to ensure that the DB service basically fails over between the DB nodes, in the case where shared disk access is lost (i.e. card or cable failure) on the active DB node. This architecture should really be two clusters, right? One separate cluster for DB, and another cluster (even if) for the network apps? I gather it has been configured this way from a desire to logically keep all of these services together in the one "system". Apologies if a similar question has been asked in the past, any inputs, thoughts, or pointers welcome. Karl -- Karl Podesta From fdinitto at redhat.com Wed Nov 18 05:32:25 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Wed, 18 Nov 2009 06:32:25 +0100 Subject: [Linux-cluster] Quorum Disk on 2 nodes out of 4? In-Reply-To: <20091117150140.GA4174@minerva.redbrick.dcu.ie> References: <20091117150140.GA4174@minerva.redbrick.dcu.ie> Message-ID: <4B0386E9.5070903@redhat.com> Karl Podesta wrote: > Hi there, > > Is it possible to have a quorum disk, applicable only to 2 nodes > out of a 4 node cluster? No. The prerequisite for qdisk to work is for all nodes in a cluster to have it running at the same time. > This architecture should really be two clusters, right? One separate > cluster for DB, and another cluster (even if) for the network apps? Yes and no. By splitting the 4 nodes cluster in 2x2nodes clusters will only help partially. The cluster with qdisk will be ok because qdisk will act as tie breaker for fencing/quorum, but the other 2 nodes cluster (for network services) will require other kind of attentions to avoid fencing races. > I gather it has been configured this way from a desire to logically > keep all of these services together in the one "system". > > Apologies if a similar question has been asked in the past, any inputs, > thoughts, or pointers welcome. 
Ideally you would find a way to plug the storage into the 2 nodes that do not have it now, and then run qdisk on top. At that point you can also benefit from "global" failover of the applications across all the nodes. Fabio From kpodesta at redbrick.dcu.ie Wed Nov 18 11:08:42 2009 From: kpodesta at redbrick.dcu.ie (Karl Podesta) Date: Wed, 18 Nov 2009 11:08:42 +0000 Subject: [Linux-cluster] Quorum Disk on 2 nodes out of 4? In-Reply-To: <4B0386E9.5070903@redhat.com> References: <20091117150140.GA4174@minerva.redbrick.dcu.ie> <4B0386E9.5070903@redhat.com> Message-ID: <20091118110841.GA7617@minerva.redbrick.dcu.ie> On Wed, Nov 18, 2009 at 06:32:25AM +0100, Fabio M. Di Nitto wrote: > > Apologies if a similar question has been asked in the past, any inputs, > > thoughts, or pointers welcome. > > Ideally you would find a way to plug the storage into the 2 nodes that > do not have it now, and then run qdisk on top. > > At that point you can also benefit from "global" failover of the > applications across all the nodes. > > Fabio Thanks for the reply and pointers, indeed the 4 nodes attached to storage with qdisk sounds best... I believe in the particular scenario above, 2 of the nodes don't have any HBA cards / attachment to storage. Maybe an IP tiebreaker would have to be introduced if storage connections could not be obtained and the cluster was to split into two. I wonder how common that type of quorum disk setup would be these days, I gather most would use GFS in this scenario with 4 nodes, eliminating the need for any specific failover of an ext3 disk mount etc., and merely failing over the services accross all cluster nodes instead. Karl -- Karl Podesta From allen at isye.gatech.edu Wed Nov 18 16:06:38 2009 From: allen at isye.gatech.edu (Allen Belletti) Date: Wed, 18 Nov 2009 11:06:38 -0500 Subject: [Linux-cluster] GFS2 panic on current release Message-ID: <4B041B8E.6000004@isye.gatech.edu> Hi All, A few weeks ago I discovered that I'd had an obsolete gfs2 kernel module loaded and removed it, thus bringing it up to the revision included in the current kernel. 
Was hoping that all was well, but then yesterday morning one of the nodes panicked as follows: original: gfs2_rename+0x19d/0x63b [gfs2] pid : 12810 lock type: 3 req lock state : 1 new: gfs2_rlist_alloc+0x5c/0x6a [gfs2] pid: 12810 lock type: 3 req lock state : 1 G: s:EX n:3/33d0327 f:y t:EX d:EX/0 l:0 a:5 r:4 H: s:EX f:H e:0 p:12810 [imap] gfs2_rename+0x19d/0x63b [gfs2] R: n:54330151 f:05 b:274/274 i:1121 ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at fs/gfs2/glock.c:1074 invalid opcode: 0000 [1] SMP last sysfs file: /devices/pci0000:00/0000:00:0a.0/0000:02:02.0/irq CPU 1 Modules linked in: nfs fscache nfs_acl lock_dlm gfs2 dlm configfs lockd sunrpc ipv6 xfrm_nalgo crypto_api ipt_LOG xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables 8021q dm_multipath scsi_dh video backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i2c_amd756 k8temp ide_cd i2c_core hwmon sg amd_rng cdrom k8_edac pcspkr tg3 floppy edac_mc e1000 dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 12810, comm: imap Not tainted 2.6.18-164.6.1.el5 #1 RIP: 0010:[] [] :gfs2:gfs2_glock_nq+0x231/0x273 RSP: 0018:ffff8101ba8d9868 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff8101ba8d9cb0 RCX: 0000000000000461 RDX: ffff8101ffe27a98 RSI: ffffffff80309c28 RDI: ffffffff80309c20 RBP: ffff8101860b1340 R08: ffffffff80309c28 R09: 000000000000003f R10: ffff8101ba8d9368 R11: 0000000000000000 R12: ffff8100e87ea590 R13: ffff8100e87ea590 R14: ffff8100ed24e000 R15: 0000000000000000 FS: 00002b18a78ac530(0000) GS:ffff810103901940(0000) knlGS:00000000acbfbb90 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002b70cf5cf000 CR3: 00000001b4d4a000 CR4: 00000000000006e0 Process imap (pid: 12810, threadinfo ffff8101ba8d8000, task ffff8101ffe277e0) Stack: ffff8101860b1340 0000000000000001 ffff8100b3e1b000 ffff8100b3e1a0e8 0000000000000000 ffffffff8862a74e 0000000000000038 ffff810184e88368 0000000000000001 ffffffff800caa0b 0000000000000005 ffff810184e88368 Call Trace: [] :gfs2:gfs2_glock_nq_m+0x2d/0xf4 [] __kzalloc+0x9/0x21 [] :gfs2:do_strip+0x175/0x349 [] :gfs2:recursive_scan+0xf2/0x175 [] :gfs2:trunc_dealloc+0x99/0xe7 [] :gfs2:do_strip+0x0/0x349 [] sched_exit+0xb4/0xb5 [] :gfs2:gfs2_delete_inode+0xdd/0x191 [] :gfs2:gfs2_delete_inode+0x46/0x191 [] :gfs2:gfs2_glock_schedule_for_reclaim+0x5d/0x9a [] :gfs2:gfs2_delete_inode+0x0/0x191 [] generic_delete_inode+0xc6/0x143 [] :gfs2:gfs2_inplace_reserve_i+0x63b/0x691 [] :gfs2:gfs2_dirent_find_space+0x0/0x41 [] :gfs2:gfs2_dirent_search+0x147/0x16e [] :gfs2:gfs2_rename+0x3be/0x63b [] :gfs2:gfs2_rename+0xff/0x63b [] :gfs2:gfs2_rename+0x145/0x63b [] :gfs2:gfs2_rename+0x16a/0x63b [] :gfs2:gfs2_rename+0x19d/0x63b [] :gfs2:gfs2_holder_uninit+0xd/0x1f [] :gfs2:gfs2_permission+0xaf/0xd4 [] :gfs2:gfs2_drevalidate+0x158/0x214 [] permission+0x81/0xc8 [] vfs_rename+0x2f4/0x471 [] sys_renameat+0x180/0x1eb [] audit_syscall_entry+0x180/0x1b3 [] tracesys+0xd5/0xe0 Code: 0f 0b 68 f8 27 64 88 c2 32 04 be 01 00 00 00 4c 89 ef e8 df RIP [] :gfs2:gfs2_glock_nq+0x231/0x273 RSP <0>Kernel panic - not syncing: Fatal exception Killed by signal 15. It seems possible that there would be some filesystem damage from running the old code and I'm going to fsck this weekend, but wanted to post this in case it revealed an obvious problem to anyone. 
The "invalid opcode: 0000" makes me think we ended up executing code that was actually data, but beyond that I'm clueless. Thanks, Allen -- Allen Belletti allen at isye.gatech.edu 404-894-6221 Phone Industrial and Systems Engineering 404-385-2988 Fax Georgia Institute of Technology From swhiteho at redhat.com Wed Nov 18 16:29:02 2009 From: swhiteho at redhat.com (Steven Whitehouse) Date: Wed, 18 Nov 2009 16:29:02 +0000 Subject: [Linux-cluster] GFS2 panic on current release In-Reply-To: <4B041B8E.6000004@isye.gatech.edu> References: <4B041B8E.6000004@isye.gatech.edu> Message-ID: <1258561742.2699.11.camel@localhost.localdomain> Hi, Can you open a bz for that one? Its not related to any issues on disk and it looks like a new bug to me. It is a fairly rare event though I think... You need a rename which is going to unlink a target inode, combined with the requirement to allocate a block in order to satisfy the space requirement for adding the new entry to the directory, combined with there being an "unlinked but not deallocated" inode sitting in the same resource group as was chosen for the allocation. If you unmount on all nodes, run fsck.gfs2 (the very latest one with the #500483 fix - thats very important) on the filesystem, then that will greatly reduce the chances of you hitting this again in the near future. The message refers to a bug trap for recursive locking which was added some time back to catch any cases like this before any corruption occurs, Steve. On Wed, 2009-11-18 at 11:06 -0500, Allen Belletti wrote: > Hi All, > > A few weeks ago I discovered that I'd had an obsolete gfs2 kernel module > loaded and removed it, thus bringing it up to the revision included in > the current kernel. Was hoping that all was well, but then yesterday > morning one of the nodes panicked as follows: > > original: gfs2_rename+0x19d/0x63b [gfs2] > pid : 12810 > lock type: 3 req lock state : 1 > new: gfs2_rlist_alloc+0x5c/0x6a [gfs2] > pid: 12810 > lock type: 3 req lock state : 1 > G: s:EX n:3/33d0327 f:y t:EX d:EX/0 l:0 a:5 r:4 > H: s:EX f:H e:0 p:12810 [imap] gfs2_rename+0x19d/0x63b [gfs2] > R: n:54330151 f:05 b:274/274 i:1121 > ----------- [cut here ] --------- [please bite here ] --------- > Kernel BUG at fs/gfs2/glock.c:1074 > invalid opcode: 0000 [1] SMP > last sysfs file: /devices/pci0000:00/0000:00:0a.0/0000:02:02.0/irq > CPU 1 > Modules linked in: nfs fscache nfs_acl lock_dlm gfs2 dlm configfs lockd > sunrpc ipv6 xfrm_nalgo crypto_api ipt_LOG xt_state ip_conntrack > nfnetlink xt_tcpudp iptable_filter ip_tables x_tables 8021q dm_multipath > scsi_dh video backlight sbs i2c_ec button battery asus_acpi > acpi_memhotplug ac parport_pc lp parport i2c_amd756 k8temp ide_cd > i2c_core hwmon sg amd_rng cdrom k8_edac pcspkr tg3 floppy edac_mc e1000 > dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero > dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptspi mptscsih > mptbase scsi_transport_spi sd_mod scsi_mod raid1 ext3 jbd uhci_hcd > ohci_hcd ehci_hcd > Pid: 12810, comm: imap Not tainted 2.6.18-164.6.1.el5 #1 > RIP: 0010:[] [] > :gfs2:gfs2_glock_nq+0x231/0x273 > RSP: 0018:ffff8101ba8d9868 EFLAGS: 00010292 > RAX: 0000000000000000 RBX: ffff8101ba8d9cb0 RCX: 0000000000000461 > RDX: ffff8101ffe27a98 RSI: ffffffff80309c28 RDI: ffffffff80309c20 > RBP: ffff8101860b1340 R08: ffffffff80309c28 R09: 000000000000003f > R10: ffff8101ba8d9368 R11: 0000000000000000 R12: ffff8100e87ea590 > R13: ffff8100e87ea590 R14: ffff8100ed24e000 R15: 0000000000000000 > FS: 00002b18a78ac530(0000) 
GS:ffff810103901940(0000) knlGS:00000000acbfbb90 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00002b70cf5cf000 CR3: 00000001b4d4a000 CR4: 00000000000006e0 > Process imap (pid: 12810, threadinfo ffff8101ba8d8000, task > ffff8101ffe277e0) > Stack: ffff8101860b1340 0000000000000001 ffff8100b3e1b000 ffff8100b3e1a0e8 > 0000000000000000 ffffffff8862a74e 0000000000000038 ffff810184e88368 > 0000000000000001 ffffffff800caa0b 0000000000000005 ffff810184e88368 > Call Trace: > [] :gfs2:gfs2_glock_nq_m+0x2d/0xf4 > [] __kzalloc+0x9/0x21 > [] :gfs2:do_strip+0x175/0x349 > [] :gfs2:recursive_scan+0xf2/0x175 > [] :gfs2:trunc_dealloc+0x99/0xe7 > [] :gfs2:do_strip+0x0/0x349 > [] sched_exit+0xb4/0xb5 > [] :gfs2:gfs2_delete_inode+0xdd/0x191 > [] :gfs2:gfs2_delete_inode+0x46/0x191 > [] :gfs2:gfs2_glock_schedule_for_reclaim+0x5d/0x9a > [] :gfs2:gfs2_delete_inode+0x0/0x191 > [] generic_delete_inode+0xc6/0x143 > [] :gfs2:gfs2_inplace_reserve_i+0x63b/0x691 > [] :gfs2:gfs2_dirent_find_space+0x0/0x41 > [] :gfs2:gfs2_dirent_search+0x147/0x16e > [] :gfs2:gfs2_rename+0x3be/0x63b > [] :gfs2:gfs2_rename+0xff/0x63b > [] :gfs2:gfs2_rename+0x145/0x63b > [] :gfs2:gfs2_rename+0x16a/0x63b > [] :gfs2:gfs2_rename+0x19d/0x63b > [] :gfs2:gfs2_holder_uninit+0xd/0x1f > [] :gfs2:gfs2_permission+0xaf/0xd4 > [] :gfs2:gfs2_drevalidate+0x158/0x214 > [] permission+0x81/0xc8 > [] vfs_rename+0x2f4/0x471 > [] sys_renameat+0x180/0x1eb > [] audit_syscall_entry+0x180/0x1b3 > [] tracesys+0xd5/0xe0 > > > Code: 0f 0b 68 f8 27 64 88 c2 32 04 be 01 00 00 00 4c 89 ef e8 df > RIP [] :gfs2:gfs2_glock_nq+0x231/0x273 > RSP > <0>Kernel panic - not syncing: Fatal exception > Killed by signal 15. > > It seems possible that there would be some filesystem damage from > running the old code and I'm going to fsck this weekend, but wanted to > post this in case it revealed an obvious problem to anyone. The > "invalid opcode: 0000" makes me think we ended up executing code that > was actually data, but beyond that I'm clueless. > > Thanks, > Allen > From xishipan at gmail.com Thu Nov 19 02:08:35 2009 From: xishipan at gmail.com (Xishi PAN) Date: Thu, 19 Nov 2009 10:08:35 +0800 Subject: [Linux-cluster] Question about Resource Start and Stop Ordering for MySQL Message-ID: Hi there, By checking through the start/stop ordering of service resource agent, may I ask why "mysql" is not in the list? Because in my configuration, "mysql" agent is used to start up the MySQL daemon and subsequent components' initialization depend on a functioning database, I patched /usr/share/cluster/service.sh as below to ensure the proper start/stop ordering. Could you comment? Thanks a lot. -- Stay Fabulous, Garfield -------------- next part -------------- An HTML attachment was scrubbed... URL: From lhh at redhat.com Thu Nov 19 19:16:01 2009 From: lhh at redhat.com (Lon Hohberger) Date: Thu, 19 Nov 2009 14:16:01 -0500 Subject: [Linux-cluster] Quorum Disk on 2 nodes out of 4? In-Reply-To: <20091118110841.GA7617@minerva.redbrick.dcu.ie> References: <20091117150140.GA4174@minerva.redbrick.dcu.ie> <4B0386E9.5070903@redhat.com> <20091118110841.GA7617@minerva.redbrick.dcu.ie> Message-ID: <1258658161.6132.1367.camel@localhost.localdomain> On Wed, 2009-11-18 at 11:08 +0000, Karl Podesta wrote: > On Wed, Nov 18, 2009 at 06:32:25AM +0100, Fabio M. Di Nitto wrote: > > > Apologies if a similar question has been asked in the past, any inputs, > > > thoughts, or pointers welcome. 
> > > > Ideally you would find a way to plug the storage into the 2 nodes that > > do not have it now, and then run qdisk on top. > > > > At that point you can also benefit from "global" failover of the > > applications across all the nodes. > > > > Fabio > > Thanks for the reply and pointers, indeed the 4 nodes attached to storage > with qdisk sounds best... I believe in the particular scenario above, > 2 of the nodes don't have any HBA cards / attachment to storage. Maybe > an IP tiebreaker would have to be introduced if storage connections could > not be obtained and the cluster was to split into two. > > I wonder how common that type of quorum disk setup would be these days, > I gather most would use GFS in this scenario with 4 nodes, eliminating > the need for any specific failover of an ext3 disk mount etc., and merely > failing over the services accross all cluster nodes instead. We don't have an IP tiebreaker in the traditional sense. I wrote a demo IP tiebreaker which works for 2 node clusters, but it does not work in 4 node clusters since there is no coordination about whether other nodes in a partition can "see" the tiebreaker in the demo application. You can use a tweaked version of Carl's weighted voting scheme to be able to sustain 2 node failures 1/2 the time in a 4 node cluster: node# 1 2 3 4 votes 1 3 5 4 Votes = 13 Quorum = 7 Any 1 node can fail: Nodes 1 2 3 = 9 votes Nodes 2 3 4 = 12 votes Nodes 1 3 4 = 10 votes Nodes 1 2 4 = 8 votes Half of the time, 2 nodes can fail (ex: if you were worried about a random partition between 2 racks): Nodes 2 3 = 8 votes Nodes 3 4 = 9 votes Nodes 2 4 = 7 votes Obviously in the other half of the possible failure permutations, 2 nodes failing would mean loss of quorum: Nodes 1 2 = 4 votes -> NO QUORUM Nodes 1 3 = 6 votes -> NO QUORUM Nodes 1 4 = 5 votes -> NO QUORUM If you do this, put your critical applications on nodes 1 and 2. In the event of a failure, nodes 3 and 4 can pick up the load without losing quorum. Well, in theory ;) -- Lon From agx at sigxcpu.org Thu Nov 19 12:58:01 2009 From: agx at sigxcpu.org (Guido =?iso-8859-1?Q?G=FCnther?=) Date: Thu, 19 Nov 2009 13:58:01 +0100 Subject: [Linux-cluster] [PATCH] fix schema for fence_virsh Message-ID: <20091119125757.GA29869@bogon.sigxcpu.org> Hi, fence_virsh needs a port element but it's not allowed per schema so validation fails. Attached patch adds the port element to the schame for 3.0.4. Chees, -- Guido -------------- next part -------------- A non-text attachment was scrubbed... Name: 0004-Add-missing-port-element-to-fence_virsh.patch Type: text/x-diff Size: 691 bytes Desc: not available URL: From agx at sigxcpu.org Thu Nov 19 16:25:08 2009 From: agx at sigxcpu.org (Guido =?iso-8859-1?Q?G=FCnther?=) Date: Thu, 19 Nov 2009 17:25:08 +0100 Subject: [Linux-cluster] Re: [PATCH] fix schema for fence_virsh In-Reply-To: <20091119125757.GA29869@bogon.sigxcpu.org> References: <20091119125757.GA29869@bogon.sigxcpu.org> Message-ID: <20091119162508.GA25564@bogon.sigxcpu.org> On Thu, Nov 19, 2009 at 01:58:00PM +0100, Guido G?nther wrote: > Hi, > fence_virsh needs a port element but it's not allowed per schema so > validation fails. Attached patch adds the port element to the schame for > 3.0.4. Forget about that one. I just noticed that there's a second definition containing the port element. Sorry for the noise. Cheers, -- Guido From fdinitto at redhat.com Fri Nov 20 08:22:56 2009 From: fdinitto at redhat.com (Fabio M. 
Di Nitto) Date: Fri, 20 Nov 2009 09:22:56 +0100 Subject: [Linux-cluster] [PATCH] fix schema for fence_virsh In-Reply-To: <20091119125757.GA29869@bogon.sigxcpu.org> References: <20091119125757.GA29869@bogon.sigxcpu.org> Message-ID: <4B0651E0.2070507@redhat.com> Guido G?nther wrote: > Hi, > fence_virsh needs a port element but it's not allowed per schema so > validation fails. Attached patch adds the port element to the schame for > 3.0.4. > Chees, > -- Guido Hi Guido, thanks for the patch, as you spotted yourself is not required. Changes to the RelaxNG schema are still "a bit" delicate. The fence-agents and resource-agents bits are automatically generated by using the metadata output of the *-agents themselves. So if you find a bug in that area, the correct fix is probably hidden down in the agent itself. For distribution purposes, I strongly discourage local patching of the schema. Just report the issue to us ASAP. Because the schema is new, we are very responsive to fix bugs related to it and we can provide the correct fix straight into git (that you can cherry pick safely since it will be part of the next release). There are also other bits that need to be updated when the RelaxNG schema change, such as the LDAP schema file. A bad, un-coordinate change, could result in a wrong generation of the LDAP schema file, breaking compatibility between distributions and eventually upgrades within the same distribution. So, bottom line, just be careful when/if you touch it. Cheers Fabio From fdinitto at redhat.com Fri Nov 20 09:57:01 2009 From: fdinitto at redhat.com (Fabio M. Di Nitto) Date: Fri, 20 Nov 2009 10:57:01 +0100 Subject: [Linux-cluster] Cluster 3.0.5 stable release Message-ID: <4B0667ED.3060600@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 The cluster team and its community are proud to announce the 3.0.5 stable release from the STABLE3 branch. This release contains a few major bug fixes. We strongly recommend people to update your clusters. In order to build the 3.0.5 release you will need: - - corosync 1.1.2 - - openais 1.1.0 - - linux kernel 2.6.31 The new source tarball can be downloaded here: ftp://sources.redhat.com/pub/cluster/releases/cluster-3.0.5.tar.gz https://fedorahosted.org/releases/c/l/cluster/cluster-3.0.5.tar.gz To report bugs or issues: https://bugzilla.redhat.com/ Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other sysadministrators or power users. Thanks/congratulations to all people that contributed to achieve this great milestone. Happy clustering, Fabio Under the hood (from 3.0.4): Abhijith Das (1): gfs2_convert: Fix rgrp conversion to allow re-converts Benjamin Marzinski (1): libgfs2: mount device for metafs Christine Caulfield (4): cman: Improve error message when the hostname resolves to 127.0.0.1 config: enable the CMAN_PIPE in ccs_config_validate cman: clarify and tidy cman_tool help text cman: improve error message if ccs_sync fails. David Teigland (2): dlm_controld: detect lowcomms protocol fenced: add debug message Fabio M. 
Di Nitto (8): oracledb ras: stop using obsoleted initlog build: fix fence_xvm invokation at man page build time build: use xvm build fix from Debian cman init: move unfencing operation down the line build: fix dlm_controld build config validation: export env vars correctly qdisk: fix possible has_holder value leak cman init: fix unfencing return code Ferenc Antal (1): resource-agents: Make ip.sh deal with ip address collision Jan Friesse (2): Fence agents: Fix traceback when using any SNMP agent fence: Fix fence_ipmilan read from unitialized memory Lon Hohberger (26): rgmanager: Fix bad assertion rgmanager: fix bug in virsh_migrate rgmanager: Initial commit of central proc + migration support rgmanager: Use RG_START_RECOVER after relo failure resource-agents: Fix smb.sh return code rgmanager: Fix error recovery with central_processing resource-agents: Fix error messages in apache.sh qdiskd: Make qdiskd stop crying wolf resource-agents: Report bad config from vm.sh resource-agents: More misc. vm.sh warnings resource-agents: Fix vxfs support resource-agents: Fix samba netbios name resource-agents: Add missing primary attribute to SAPDatabase rgmanager: Fix migrate-to-offline node cman: Make master-wins mode work config: Add master_wins and io_timeout to schemas resource-agents: Decrease message level for debug info fence: add fence agent -> rng generator fence-agents: Fix xvm metadata handling fence-agents: Tweak translator output config: Update cluster schema with new fence agent info fence-agents: fix fence_xvm metadata again config: Unbreak config schema due to bad xvm metadata config: Make rng2ldif handle 'ref' properly config: Schema updates config: Fix fencing attribute requirements Marek 'marx' Grac (5): fencing: New option --retry-on fencing: fence_bladecenter needs longer timeout fence: Broken device detection for DRAC3 ERA/O fencing: Invalid initialization of default value for retry-on option fencing: Unable to power on machine after applying patch Shane Bradley (1): resource-agents: Add missing primary attribute to SAPInstance Toure Dunnon (1): rgmanager: Fix clusvcadm error reporting cman/cman_tool/join.c | 2 +- cman/cman_tool/main.c | 12 +- cman/daemon/cman-preconfig.c | 2 +- cman/init.d/cman.in | 34 +- cman/man/cman_tool.8 | 3 + cman/man/qdisk.5 | 18 + cman/qdisk/disk.h | 3 +- cman/qdisk/iostate.c | 3 +- cman/qdisk/main.c | 19 +- cman/qdisk/scandisk.c | 10 +- config/plugins/ldap/99cluster.ldif | 184 ++- config/plugins/ldap/ldap-base.csv | 26 +- config/tools/ldap/rng2ldif/tree.c | 70 + config/tools/xml/ccs_config_validate.in | 10 +- config/tools/xml/cluster.rng.in | 1741 ++++++++++++++--------- fence/agents/bladecenter/fence_bladecenter.py | 2 +- fence/agents/drac/fence_drac.pl | 2 +- fence/agents/ilo/fence_ilo.py | 1 + fence/agents/ipmilan/ipmilan.c | 1 + fence/agents/lib/fence2rng.xsl | 20 + fence/agents/lib/fencing.py.py | 49 +- fence/agents/lib/fencing_snmp.py.py | 2 +- fence/agents/xvm/Makefile | 2 +- fence/agents/xvm/options.c | 11 +- fence/fenced/cpg.c | 1 + gfs2/convert/gfs2_convert.c | 3 + gfs2/libgfs2/misc.c | 2 +- group/dlm_controld/Makefile | 2 +- group/dlm_controld/action.c | 64 +- group/dlm_controld/config.c | 7 +- group/dlm_controld/dlm_daemon.h | 6 + group/dlm_controld/main.c | 16 +- rgmanager/src/daemons/restree.c | 5 +- rgmanager/src/daemons/rg_state.c | 27 +- rgmanager/src/daemons/service_op.c | 85 ++- rgmanager/src/daemons/slang_event.c | 59 + rgmanager/src/resources/SAPDatabase | 2 +- rgmanager/src/resources/SAPInstance | 2 +- 
rgmanager/src/resources/apache.sh | 6 +- rgmanager/src/resources/default_event_script.sl | 8 +- rgmanager/src/resources/fs.sh.in | 1 + rgmanager/src/resources/ip.sh | 14 + rgmanager/src/resources/oracledb.sh.in | 10 +- rgmanager/src/resources/samba.sh | 2 +- rgmanager/src/resources/smb.sh | 3 + rgmanager/src/resources/vm.sh | 38 +- rgmanager/src/utils/clusvcadm.c | 5 + 47 files changed, 1839 insertions(+), 756 deletions(-) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAksGZ+sACgkQhCzbekR3nhhCPACgn2v8v834hAvVJk/CAl2l+K1Z 2D4An1K/9OUi1VfaSxfuOy1BOAZlcqnT =BXDE -----END PGP SIGNATURE----- From agx at sigxcpu.org Tue Nov 24 16:54:40 2009 From: agx at sigxcpu.org (Guido =?iso-8859-1?Q?G=FCnther?=) Date: Tue, 24 Nov 2009 17:54:40 +0100 Subject: [Linux-cluster] [PATCH] fix schema for fence_virsh In-Reply-To: <4B0651E0.2070507@redhat.com> References: <20091119125757.GA29869@bogon.sigxcpu.org> <4B0651E0.2070507@redhat.com> Message-ID: <20091124165440.GA11948@bogon.sigxcpu.org> On Fri, Nov 20, 2009 at 09:22:56AM +0100, Fabio M. Di Nitto wrote: > There are also other bits that need to be updated when the RelaxNG > schema change, such as the LDAP schema file. A bad, un-coordinate > change, could result in a wrong generation of the LDAP schema file, > breaking compatibility between distributions and eventually upgrades > within the same distribution. > > So, bottom line, just be careful when/if you touch it. Good to know. Thanks! -- Guido From rvandolson at esri.com Wed Nov 25 17:30:17 2009 From: rvandolson at esri.com (Ray Van Dolson) Date: Wed, 25 Nov 2009 09:30:17 -0800 Subject: [Linux-cluster] LVM Cluster errors on bootup -- ok? Message-ID: <20091125173016.GA9417@esri.com> Noticed that on bootup of a cluster node usings cLVM and GFS2 I see the following: Scanning logical volumes connect() failed on local socket: Connection refused WARNING: Falling back to local file-based locking. Volume Groups with the clustered attribute will be inaccessible. Reading all physical volumes. This may take a while... Found volume group "VolGroup00" using metadata type lvm2 Activating logical volumes connect() failed on local socket: Connection refused WARNING: Falling back to local file-based locking. Volume Groups with the clustered attribute will be inaccessible. 2 logical volume(s) in volume group "VolGroup00" now active VolGroup00 is a vg on the local disk and is not shared. Later, cluster services start (including clvmd) and the clustered volumes and associated filesystems come up fine. I'm assuming the errors above are because clvmd isn't running. Can they be safely ignored? Any way to configure so my clustered volumes aren't scanned until after clvmd starts? I know I can edit filter settings in lvm.conf, but don't see any way to specify that some block devices should be skipped until clvmd is running. Thanks, Ray From rvandolson at esri.com Wed Nov 25 17:34:31 2009 From: rvandolson at esri.com (Ray Van Dolson) Date: Wed, 25 Nov 2009 09:34:31 -0800 Subject: [Linux-cluster] LVM Cluster errors on bootup -- ok? 
In-Reply-To: <20091125173016.GA9417@esri.com> References: <20091125173016.GA9417@esri.com> Message-ID: <20091125173431.GA9532@esri.com> On Wed, Nov 25, 2009 at 09:30:17AM -0800, Ray Van Dolson wrote: > Noticed that on bootup of a cluster node usings cLVM and GFS2 I see the > following: > > Scanning logical volumes > connect() failed on local socket: Connection refused > WARNING: Falling back to local file-based locking. > Volume Groups with the clustered attribute will be inaccessible. > Reading all physical volumes. This may take a while... > Found volume group "VolGroup00" using metadata type lvm2 > Activating logical volumes > connect() failed on local socket: Connection refused > WARNING: Falling back to local file-based locking. > Volume Groups with the clustered attribute will be inaccessible. > 2 logical volume(s) in volume group "VolGroup00" now active > > VolGroup00 is a vg on the local disk and is not shared. > > Later, cluster services start (including clvmd) and the clustered > volumes and associated filesystems come up fine. > > I'm assuming the errors above are because clvmd isn't running. Can > they be safely ignored? Any way to configure so my clustered volumes > aren't scanned until after clvmd starts? I know I can edit filter > settings in lvm.conf, but don't see any way to specify that some block > devices should be skipped until clvmd is running. > Hmm, thinking out loud here. I am using multipath, and I'm seriously doubting multipath is running when the errors above are spit out. My filter is as follows for lvm.conf: filter = [ "a|/dev/mapper/.*|", "a|/dev/hd[a-z].*|", "r|/dev/sd[a-z].*|" ] Which should be filtering out my FC attached devices in lieu of stuff under /dev/mapper. So I guess maybe the fact that I have locking_type 3 set, my local hard drive (non-clustered) is getting scanned and clustered locking is attempted with it which fails. So in the end it's probably OK that it falls back to local file locking and I should just ignore these "errors". Ray From rmicmirregs at gmail.com Wed Nov 25 19:41:09 2009 From: rmicmirregs at gmail.com (Rafael =?ISO-8859-1?Q?Mic=F3?= Miranda) Date: Wed, 25 Nov 2009 20:41:09 +0100 Subject: [Linux-cluster] Automated tool for cluster testing? Message-ID: <1259178069.6928.10.camel@mecatol> Hi all Is there any automated tool to test the functionality and availability of a cluster configuration? Which test would it made? I was thinking on this subjects: - Resource start, monitor and stop operations - Resource migration - Communications/network failures - Node failures - Service daemons failures - Fencing device failures - Qdisk failure - etc. If the answer is "no", which is the battery of test you would apply over your cluster before considering it "stable and ready" for production purposes? Thanks in advance, Rafael -- Rafael Mic? Miranda From jeff.sturm at eprize.com Thu Nov 26 00:44:28 2009 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Wed, 25 Nov 2009 19:44:28 -0500 Subject: [Linux-cluster] fenced spinning? Message-ID: <64D0546C5EBBD147B75DE133D798665F03F3F0FD@hugo.eprize.local> CentOS 5.2, 26-node cluster. Today I restarted one node. It left the cluster, rebooted and joined the cluster without incident. Everything is fine but... fenced has the CPU pegged. No useful log messages. 
strace says it is spinning on poll/recvfrom:

poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN, revents=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLNVAL}], 4, -1) = 2
recvfrom(5, 0x7fffb074ab40, 20, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN, revents=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLNVAL}], 4, -1) = 2
recvfrom(5, 0x7fffb074ab40, 20, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)

Anything else useful I can do to diagnose? What are the chances I can recover this node nicely without making things worse? Any help/ideas appreciated, Jeff -------------- next part -------------- An HTML attachment was scrubbed... URL:

From skjbalaji at gmail.com Thu Nov 26 04:53:49 2009 From: skjbalaji at gmail.com (Balaji S) Date: Thu, 26 Nov 2009 10:23:49 +0530 Subject: [Linux-cluster] (no subject) Message-ID: -- Thanks, Balaji S -------------- next part -------------- An HTML attachment was scrubbed... URL:

From andrew at beekhof.net Thu Nov 26 10:35:01 2009 From: andrew at beekhof.net (Andrew Beekhof) Date: Thu, 26 Nov 2009 11:35:01 +0100 Subject: [Linux-cluster] Automated tool for cluster testing? In-Reply-To: <1259178069.6928.10.camel@mecatol> References: <1259178069.6928.10.camel@mecatol> Message-ID: On Wed, Nov 25, 2009 at 8:41 PM, Rafael Micó Miranda wrote: > Hi all > > Is there any automated tool to test the functionality and availability > of a cluster configuration? Which test would it made? > > I was thinking on this subjects: > - Resource start, monitor and stop operations > - Resource migration > - Communications/network failures > - Node failures > - Service daemons failures > - Fencing device failures > - Qdisk failure > - etc. > > If the answer is "no", which is the battery of test you would apply over > your cluster before considering it "stable and ready" for production > purposes? Pacemaker comes with one that does almost all this, you could probably write a rgmanager module for it. http://hg.clusterlabs.org/pacemaker/stable-1.0/file/tip/cts Let me know if you'd like more info.

From pk at nodex.ru Thu Nov 26 13:36:10 2009 From: pk at nodex.ru (Pavel Kuzin) Date: Thu, 26 Nov 2009 16:36:10 +0300 Subject: [Linux-cluster] Can`t mount gfs Message-ID: <4B0E844A.3070303@nodex.ru> I'm trying to upgrade my old 1.04 installation to 2.03.11. I've installed the new versions of the software. I'm using a customised kernel, and the cluster came up. Creating a test filesystem is OK, but when I try to mount the new filesystem an error occurs. Can anybody tell me what I'm doing wrong? Thank you!
mount -t gfs -v /dev/sdb1 /mnt node3:/usr/src# mount -t gfs -v /dev/sdb1 /mnt /sbin/mount.gfs: mount /dev/sdb1 /mnt /sbin/mount.gfs: parse_opts: opts = "rw" /sbin/mount.gfs: clear flag 1 for "rw", flags = 0 /sbin/mount.gfs: parse_opts: flags = 0 /sbin/mount.gfs: parse_opts: extra = "" /sbin/mount.gfs: parse_opts: hostdata = "" /sbin/mount.gfs: parse_opts: lockproto = "" /sbin/mount.gfs: parse_opts: locktable = "" /sbin/mount.gfs: message to gfs_controld: asking to join mountgroup: /sbin/mount.gfs: write "join /mnt gfs lock_dlm TEST:NEW rw /dev/sdb1" /sbin/mount.gfs: message from gfs_controld: response to join request: /sbin/mount.gfs: lock_dlm_join: read "0" /sbin/mount.gfs: message from gfs_controld: mount options: /sbin/mount.gfs: lock_dlm_join: read "hostdata=jid=0:id=1441795:first=1" /sbin/mount.gfs: lock_dlm_join: hostdata: "hostdata=jid=0:id=1441795:first=1" /sbin/mount.gfs: lock_dlm_join: extra_plus: "hostdata=jid=0:id=1441795:first=1" /sbin/mount.gfs: mount(2) failed error -1 errno 19 /sbin/mount.gfs: lock_dlm_mount_result: write "mount_result /mnt gfs -1" /sbin/mount.gfs: message to gfs_controld: asking to leave mountgroup: /sbin/mount.gfs: lock_dlm_leave: write "leave /mnt gfs 19" /sbin/mount.gfs: message from gfs_controld: response to leave request: /sbin/mount.gfs: lock_dlm_leave: read "0" /sbin/mount.gfs: error mounting /dev/sdb1 on /mnt: No such device # uname -a Linux node3.cl.nodex.ru 2.6.30.9 #1 SMP Thu Nov 26 12:18:02 MSK 2009 i686 GNU/Linux node3:/usr/src# cman_tool status Version: 6.2.0 Config Version: 1 Cluster Name: TEST Cluster Id: 1198 Cluster Member: Yes Cluster Generation: 28 Membership state: Cluster-Member Nodes: 1 Expected votes: 1 Total votes: 1 Node votes: 1 Quorum: 1 Active subsystems: 8 Flags: 2node Dirty Ports Bound: 0 11 Node name: node3 Node ID: 3 Multicast addresses: 239.0.210.1 Node addresses: 10.210.10.12 node3:/usr/src# node3:/usr/src# mkfs -t gfs -p lock_dlm -t TEST:NEW -j 1 /dev/sdb1 This will destroy any data on /dev/sdb1. It appears to contain a LVM2_member raid. Are you sure you want to proceed? [y/n] y Device: /dev/sdb1 Blocksize: 4096 Filesystem Size: 24382128 Journals: 1 Resource Groups: 374 Locking Protocol: lock_dlm Lock Table: TEST:NEW Syncing... All Done node3:/usr/src# cat /etc/cluster/cluster.conf -- Pavel D. Kuzin pk at nodex.ru Nodex LTD. Saint-Petersburg, Russia From rmicmirregs at gmail.com Thu Nov 26 15:30:40 2009 From: rmicmirregs at gmail.com (Rafael =?ISO-8859-1?Q?Mic=F3?= Miranda) Date: Thu, 26 Nov 2009 16:30:40 +0100 Subject: [Linux-cluster] Automated tool for cluster testing? In-Reply-To: References: <1259178069.6928.10.camel@mecatol> Message-ID: <1259249440.7602.3.camel@mecatol> Hi Andrew, El jue, 26-11-2009 a las 11:35 +0100, Andrew Beekhof escribi?: > On Wed, Nov 25, 2009 at 8:41 PM, Rafael Mic? Miranda > wrote: > > Hi all > > > > Is there any automated tool to test the functionality and availability > > of a cluster configuration? Which test would it made? > > > > I was thinking on this subjects: > > - Resource start, monitor and stop operations > > - Resource migration > > - Communications/network failures > > - Node failures > > - Service daemons failures > > - Fencing device failures > > - Qdisk failure > > - etc. > > > > If the answer is "no", which is the battery of test you would apply over > > your cluster before considering it "stable and ready" for production > > purposes? > > Pacemaker comes with one that does almost all this, you could probably > write a rgmanager module for it. 
> http://hg.clusterlabs.org/pacemaker/stable-1.0/file/tip/cts > > Let me know if you'd like more info. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster I used CTS once in one of my old Linux-HA clusters, but writing a rgmanager module sounds difficult. In the other hand, if you can provide me a simple list of the different tests CTS applies it will help a lot just to check if my list of checks is as complete as it can. Thanks, Rafael -- Rafael Mic? Miranda From andrew at beekhof.net Thu Nov 26 18:35:57 2009 From: andrew at beekhof.net (Andrew Beekhof) Date: Thu, 26 Nov 2009 19:35:57 +0100 Subject: [Linux-cluster] Automated tool for cluster testing? In-Reply-To: <1259249440.7602.3.camel@mecatol> References: <1259178069.6928.10.camel@mecatol> <1259249440.7602.3.camel@mecatol> Message-ID: On Thu, Nov 26, 2009 at 4:30 PM, Rafael Mic? Miranda wrote: > Hi Andrew, > > El jue, 26-11-2009 a las 11:35 +0100, Andrew Beekhof escribi?: >> On Wed, Nov 25, 2009 at 8:41 PM, Rafael Mic? Miranda >> wrote: >> > Hi all >> > >> > Is there any automated tool to test the functionality and availability >> > of a cluster configuration? Which test would it made? >> > >> > I was thinking on this subjects: >> > - Resource start, monitor and stop operations >> > - Resource migration >> > - Communications/network failures >> > - Node failures >> > - Service daemons failures >> > - Fencing device failures >> > - Qdisk failure >> > - etc. >> > >> > If the answer is "no", which is the battery of test you would apply over >> > your cluster before considering it "stable and ready" for production >> > purposes? >> >> Pacemaker comes with one that does almost all this, you could probably >> write a rgmanager module for it. >> ? ?http://hg.clusterlabs.org/pacemaker/stable-1.0/file/tip/cts >> >> Let me know if you'd like more info. >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > I used CTS once in one of my old Linux-HA clusters, but writing a > rgmanager module sounds difficult. Yep, non-trivial. But possible. > In the other hand, if you can provide me a simple list of the different > tests CTS applies it will help a lot just to check if my list of checks > is as complete as it can. Look for the class keyword in CTStests.py From jeff.sturm at eprize.com Fri Nov 27 16:46:32 2009 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Fri, 27 Nov 2009 11:46:32 -0500 Subject: [Linux-cluster] fenced spinning? In-Reply-To: <64D0546C5EBBD147B75DE133D798665F03F3F0FD@hugo.eprize.local> References: <64D0546C5EBBD147B75DE133D798665F03F3F0FD@hugo.eprize.local> Message-ID: <64D0546C5EBBD147B75DE133D798665F03F3F0FF@hugo.eprize.local> Found the bug report for this: https://bugzilla.redhat.com/show_bug.cgi?id=444529 It has been fixed, but not in my version. I need to determine whether I can simply fence the affected nodes without compromising the cluster (since the fence daemon itself is affected). Since our production cluster is currently stable, I'll probably try this on a test cluster. Later we'll attempt a rolling upgrade of the cluster to get the bug fix. From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jeff Sturm Sent: Wednesday, November 25, 2009 7:44 PM To: linux-cluster at redhat.com Subject: [Linux-cluster] fenced spinning? CentOS 5.2, 26-node cluster. Today I restarted one node. 
It left the cluster, rebooted and joined the cluster without incident. Everything is fine but... fenced has the CPU pegged. No useful log messages. strace says it is spinning on poll/recvfrom: poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN, revents=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLNVAL}], 4, -1) = 2 recvfrom(5, 0x7fffb074ab40, 20, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) poll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN, revents=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLNVAL}], 4, -1) = 2 recvfrom(5, 0x7fffb074ab40, 20, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) Anything else useful I can do to diagnose? What are the chances I can recover this node nicely without making things worse? Any help/ideas appreciated, Jeff -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.gustafsson1 at se.ibm.com Fri Nov 27 21:11:57 2009 From: peter.gustafsson1 at se.ibm.com (Peter Gustafsson1) Date: Fri, 27 Nov 2009 22:11:57 +0100 Subject: [Linux-cluster] AUTO: Peter Gustafsson1 is prepared for DELETION (FREEZE) (returning 2009-08-24) Message-ID: I am out of the office until 2009-08-24. The mail box of Peter Gustafsson1 is being deactivated and will be deleted at 27.12.2009. To look up e-mail addresses of IBM employees for which you know the name, use this site : http://www.ibm.com/contact/employees/be/en/ To look up general telephone numbers and e-mail addresses to contact IBM, open this site and select your country : http://www.ibm.com/planetwide/ The Lotus Notes Administration Team Note: This is an automated response to your message "RE: [Linux-cluster] fenced spinning?" sent on 27/11/09 17:46:32. This is the only notification you will receive while this person is away. From redhat-linux-cluster at feraudet.com Sat Nov 28 16:05:09 2009 From: redhat-linux-cluster at feraudet.com (Cyril FERAUDET) Date: Sat, 28 Nov 2009 17:05:09 +0100 Subject: [Linux-cluster] Cman with two node on different subnet Message-ID: Hello, Is possible to have two node on two different subnet, without using multicast ? I have two dedicated servers directly on the internet on two different subnet without private network. In fact, both server are on the same switch but in different vlan, so I've no doubt about cluster performance. Thanks in advance, Cyril Feraudet From sdake at redhat.com Sat Nov 28 19:51:48 2009 From: sdake at redhat.com (Steven Dake) Date: Sat, 28 Nov 2009 12:51:48 -0700 Subject: [Linux-cluster] Cman with two node on different subnet In-Reply-To: References: Message-ID: <1259437908.4674.1.camel@localhost.localdomain> On Sat, 2009-11-28 at 17:05 +0100, Cyril FERAUDET wrote: > Hello, > > Is possible to have two node on two different subnet, without using multicast ? > > I have two dedicated servers directly on the internet on two different subnet without private network. > > In fact, both server are on the same switch but in different vlan, so I've no doubt about cluster performance. > > Thanks in advance, > > Cyril Feraudet > > -- Multicast or broadcast capabilities in your network are required. It is possible to operate between subnets however, if you setup a VLAN. The configuration for setting up VLANs depends on your switch vendor's config. 
Regards -steve > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From redhat-linux-cluster at feraudet.com Sat Nov 28 21:55:32 2009 From: redhat-linux-cluster at feraudet.com (Cyril FERAUDET) Date: Sat, 28 Nov 2009 22:55:32 +0100 Subject: [Linux-cluster] Cman with two node on different subnet In-Reply-To: <1259437908.4674.1.camel@localhost.localdomain> References: <1259437908.4674.1.camel@localhost.localdomain> Message-ID: <457BA656-A0A2-4DC4-B5F3-00C40B99F4D4@feraudet.com> Thank you for your message. I've no way to add vlan or configure broadcast cause it's two personal server hosted by one of many webhosting company. Do you know what happens if I use unicast IP from A, B or C class instead of multicast IP from D class ? Regards, Cyril On 28 nov. 2009, at 20:51, Steven Dake wrote: > On Sat, 2009-11-28 at 17:05 +0100, Cyril FERAUDET wrote: >> Hello, >> >> Is possible to have two node on two different subnet, without using multicast ? >> >> I have two dedicated servers directly on the internet on two different subnet without private network. >> >> In fact, both server are on the same switch but in different vlan, so I've no doubt about cluster performance. >> >> Thanks in advance, >> >> Cyril Feraudet >> >> -- > > Multicast or broadcast capabilities in your network are required. > > It is possible to operate between subnets however, if you setup a VLAN. > The configuration for setting up VLANs depends on your switch vendor's > config. > > Regards > -steve > > >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From sdake at redhat.com Sat Nov 28 23:58:53 2009 From: sdake at redhat.com (Steven Dake) Date: Sat, 28 Nov 2009 16:58:53 -0700 Subject: [Linux-cluster] Cman with two node on different subnet In-Reply-To: <457BA656-A0A2-4DC4-B5F3-00C40B99F4D4@feraudet.com> References: <1259437908.4674.1.camel@localhost.localdomain> <457BA656-A0A2-4DC4-B5F3-00C40B99F4D4@feraudet.com> Message-ID: <1259452733.2780.0.camel@localhost.localdomain> On Sat, 2009-11-28 at 22:55 +0100, Cyril FERAUDET wrote: > Thank you for your message. > > I've no way to add vlan or configure broadcast cause it's two personal server hosted by one of many webhosting company. > > Do you know what happens if I use unicast IP from A, B or C class instead of multicast IP from D class ? > There is no way to configure a unicast mode of operation. regards -steve > Regards, > > Cyril > > On 28 nov. 2009, at 20:51, Steven Dake wrote: > > > On Sat, 2009-11-28 at 17:05 +0100, Cyril FERAUDET wrote: > >> Hello, > >> > >> Is possible to have two node on two different subnet, without using multicast ? > >> > >> I have two dedicated servers directly on the internet on two different subnet without private network. > >> > >> In fact, both server are on the same switch but in different vlan, so I've no doubt about cluster performance. > >> > >> Thanks in advance, > >> > >> Cyril Feraudet > >> > >> -- > > > > Multicast or broadcast capabilities in your network are required. > > > > It is possible to operate between subnets however, if you setup a VLAN. > > The configuration for setting up VLANs depends on your switch vendor's > > config. 
> > > > Regards > > -steve > > > > > >> Linux-cluster mailing list > >> Linux-cluster at redhat.com > >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > From jeff.sturm at eprize.com Sun Nov 29 16:12:39 2009 From: jeff.sturm at eprize.com (Jeff Sturm) Date: Sun, 29 Nov 2009 11:12:39 -0500 Subject: [Linux-cluster] Cman with two node on different subnet In-Reply-To: <457BA656-A0A2-4DC4-B5F3-00C40B99F4D4@feraudet.com> References: <1259437908.4674.1.camel@localhost.localdomain> <457BA656-A0A2-4DC4-B5F3-00C40B99F4D4@feraudet.com> Message-ID: <64D0546C5EBBD147B75DE133D798665F03F3F10A@hugo.eprize.local> -----Original Message----- > From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Cyril FERAUDET > Sent: Saturday, November 28, 2009 4:56 PM > To: sdake at redhat.com; linux clustering > Subject: Re: [Linux-cluster] Cman with two node on different subnet > I've no way to add vlan or configure broadcast cause it's two personal server hosted by one of many webhosting company. For testing purposes, perhaps you could establish a VPN connection between the two hosts (e.g. with openvpn or openswan) and configure a private subnet over the VPN link. That way all multicast traffic would be tunneled over the VPN link. Beware that if the two hosts are not in close proximity and do not have a reliable, low-latency network connection, the cluster could perform poorly and/or behave erratically. In my experience, such clusters are only as reliable as the network connecting them. (That's to say we learned a lot about networks before we got ours to perform reliably well.) Jeff From k.c.f.maguire at gmail.com Mon Nov 30 14:44:30 2009 From: k.c.f.maguire at gmail.com (Kevin Maguire) Date: Mon, 30 Nov 2009 15:44:30 +0100 Subject: [Linux-cluster] temp network failure kills GFS cluster Message-ID: Hi I am using Scientific Linux 4.8 (32 bits), with these version of the various components: kernel-smp-2.6.9-89.0.3.EL GFS-6.1.19-1.el4 GFS-kernel-smp-2.6.9-85.2.1 ccs-1.0.12-1 cman-1.0.27-1.el4 cman-kernel-smp-2.6.9-56.7.4 dlm-kernel-smp-2.6.9-58.6.1 fence-1.32.67-1.el4 I have an 18 node cluster which is used for IO intensive computation. The IO intensive part is done on a FC-connected RAID connected to all nodes. We are using GFS for the clustered IO intensive filesystem, and no other resources at all. Each node has 2 network connections, a private cluster only connection and a public connection. All cluster communications, like dlm traffic, will go via the private network. We have the following problem: When, for whatever reason, a node loses its private network connection, several other nodes in the cluster quickly crash. This takes the cluster out of its quorate state, locking the filesystem, and basically means a bit of a job to clean up and users work is lost. Fencing (via fence_sanbox2 agent) is working OK, but this does not help if the whole cluster dies. The syslog output for an example node that dies is shown below. IMO it's a bug that the cluster software/kernel crashes, but aside from that, are there are GFS or CMAN or kernel timeouts or tuneables that i can change to allow smoother operations? For example this morning's outage was caused merely by me trying to change a defective cable, the "private" network would/should have been unavailable for no more than 30 seconds or so. 
A search of bugzilla.redhat.com found a few similar bugs, but most seemed fixed ages ago. Thanks in advance for any advice or suggestions, Kevin Nov 30 07:03:01 HPC_01 kernel: CMAN: node HPC_14 has been removed from the cluster : Missed too many heartbeats Nov 30 07:03:01 HPC_01 kernel: CMAN: Started transition, generation 44 Nov 30 07:03:02 HPC_01 kernel: CMAN: Finished transition, generation 44 Nov 30 07:03:02 HPC_01 fenced[5393]: fencing deferred to HPC_07 Nov 30 07:03:05 HPC_01 kernel: GFS: fsid=HPC_-cluster:lv_fastfs.0: jid=13: Trying to acquire journal lock... Nov 30 07:03:05 HPC_01 kernel: GFS: fsid=HPC_-cluster:lv_fastfs.0: jid=13: Busy Nov 30 07:03:48 HPC_01 kernel: CMAN: node HPC_10 has been removed from the cluster : No response to messages Nov 30 07:03:54 HPC_01 kernel: CMAN: node HPC_06 has been removed from the cluster : No response to messages Nov 30 07:04:01 HPC_01 kernel: CMAN: node HPC_17 has been removed from the cluster : No response to messages Nov 30 07:04:08 HPC_01 kernel: CMAN: node HPC_18 has been removed from the cluster : No response to messages Nov 30 07:04:15 HPC_01 kernel: CMAN: node HPC_02 has been removed from the cluster : No response to messages Nov 30 07:04:22 HPC_01 kernel: CMAN: node HPC_05 has been removed from the cluster : No response to messages Nov 30 07:04:29 HPC_01 kernel: CMAN: node HPC_11 has been removed from the cluster : No response to messages Nov 30 07:04:36 HPC_01 kernel: CMAN: node HPC_09 has been removed from the cluster : No response to messages Nov 30 07:04:43 HPC_01 kernel: CMAN: node HPC_12 has been removed from the cluster : No response to messages Nov 30 07:04:50 HPC_01 kernel: CMAN: node HPC_03 has been removed from the cluster : No response to messages Nov 30 07:04:57 HPC_01 kernel: CMAN: node HPC_01 has been removed from the cluster : No response to messages Nov 30 07:04:57 HPC_01 kernel: CMAN: killed by NODEDOWN message Nov 30 07:04:57 HPC_01 kernel: CMAN: we are leaving the cluster. No response to messages Nov 30 07:04:57 HPC_01 kernel: WARNING: dlm_emergency_shutdown Nov 30 07:04:58 HPC_01 kernel: WARNING: dlm_emergency_shutdown finished 2 Nov 30 07:04:58 HPC_01 kernel: SM: 00000003 sm_stop: SG still joined Nov 30 07:04:58 HPC_01 kernel: SM: 01000005 sm_stop: SG still joined Nov 30 07:04:58 HPC_01 kernel: SM: 02000009 sm_stop: SG still joined Nov 30 07:04:58 HPC_01 ccsd[5212]: Cluster manager shutdown. Attemping to reconnect... 
Nov 30 07:05:09 HPC_01 kernel: dlm: dlm_lock: no lockspace Nov 30 07:05:09 HPC_01 kernel: d 0 requests Nov 30 07:05:09 HPC_01 kernel: clvmd purge locks of departed nodes Nov 30 07:05:09 HPC_01 kernel: clvmd purged 1 locks Nov 30 07:05:09 HPC_01 kernel: clvmd update remastered resources Nov 30 07:05:09 HPC_01 kernel: clvmd updated 0 resources Nov 30 07:05:09 HPC_01 kernel: clvmd rebuild locks Nov 30 07:05:09 HPC_01 kernel: clvmd rebuilt 0 locks Nov 30 07:05:09 HPC_01 kernel: clvmd recover event 86 done Nov 30 07:05:09 HPC_01 kernel: clvmd move flags 0,0,1 ids 83,86,86 Nov 30 07:05:09 HPC_01 kernel: clvmd process held requests Nov 30 07:05:09 HPC_01 kernel: clvmd processed 0 requests Nov 30 07:05:09 HPC_01 kernel: clvmd resend marked requests Nov 30 07:05:09 HPC_01 kernel: clvmd resent 0 requests Nov 30 07:05:09 HPC_01 kernel: clvmd recover event 86 finished Nov 30 07:05:09 HPC_01 kernel: lv_fastfs mark waiting requests Nov 30 07:05:09 HPC_01 kernel: lv_fastfs marked 0 requests Nov 30 07:05:09 HPC_01 kernel: lv_fastfs purge locks of departed nodes Nov 30 07:05:09 HPC_01 kernel: lv_fastfs purged 31363 locks Nov 30 07:05:09 HPC_01 kernel: lv_fastfs update remastered resources Nov 30 07:05:09 HPC_01 kernel: lv_fastfs updated 1 resources Nov 30 07:05:09 HPC_01 kernel: lv_fastfs rebuild locks Nov 30 07:05:09 HPC_01 kernel: lv_fastfs rebuilt 1 locks Nov 30 07:05:09 HPC_01 kernel: lv_fastfs recover event 86 done Nov 30 07:05:09 HPC_01 kernel: lv_fastfs move flags 0,0,1 ids 84,86,86 Nov 30 07:05:09 HPC_01 kernel: lv_fastfs process held requests Nov 30 07:05:09 HPC_01 kernel: lv_fastfs processed 0 requests Nov 30 07:05:09 HPC_01 kernel: lv_fastfs resend marked requests Nov 30 07:05:09 HPC_01 kernel: lv_fastfs resent 0 requests Nov 30 07:05:09 HPC_01 kernel: lv_fastfs recover event 86 finished Nov 30 07:05:09 HPC_01 kernel: lv_fastfs (6290) req reply einval 4640242 fr 18 r 18 Nov 30 07:05:09 HPC_01 kernel: lv_fastfs send einval to 8 Nov 30 07:05:09 HPC_01 kernel: lv_fastfs send einval to 6 Nov 30 07:05:09 HPC_01 kernel: overy_done jid 24 msg 309 b Nov 30 07:05:09 HPC_01 kernel: 6267 recovery_done jid 25 msg 309 b Nov 30 07:05:09 HPC_01 kernel: 6267 recovery_done jid 26 msg 309 b Nov 30 07:05:09 HPC_01 kernel: 6267 recovery_done jid 27 msg 309 b Nov 30 07:05:09 HPC_01 kernel: 6267 recovery_done jid 28 msg 309 b Nov 30 07:05:09 HPC_01 kernel: 6267 recovery_done jid 29 msg 309 b Nov 30 07:05:09 HPC_01 kernel: 6267 recovery_done jid 30 msg 309 b Nov 30 07:05:09 HPC_01 kernel: 6267 recovery_done jid 31 msg 309 b Nov 30 07:05:09 HPC_01 kernel: 6267 others_may_mount b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 38 last_start 40 last_finish 38 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 2 type 2 event 40 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 40 done 1 Nov 30 07:05:09 HPC_01 kernel: 6283 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 40 last_start 42 last_finish 40 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 3 type 2 event 42 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 42 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 42 last_start 44 last_finish 42 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 4 type 2 event 44 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 44 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 44 last_start 46 last_finish 44 Nov 30 07:05:09 HPC_01 kernel: 6284 
pr_start count 5 type 2 event 46 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 46 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 46 last_start 48 last_finish 46 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 6 type 2 event 48 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 48 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 48 last_start 50 last_finish 48 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 7 type 2 event 50 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 50 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start last_stop 50 last_start 52 last_finish 50 Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start count 8 type 2 event 52 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start 52 done 1 Nov 30 07:05:09 HPC_01 kernel: 6283 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 52 last_start 54 last_finish 52 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 9 type 2 event 54 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 54 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 54 last_start 56 last_finish 54 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 10 type 2 event 56 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 56 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 56 last_start 58 last_finish 56 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 11 type 2 event 58 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 58 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 58 last_start 60 last_finish 58 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 12 type 2 event 60 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 60 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 60 last_start 62 last_finish 60 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 13 type 2 event 62 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 62 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 62 last_start 64 last_finish 62 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 14 type 2 event 64 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 64 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 64 last_start 66 last_finish 64 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 15 type 2 event 66 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 66 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 66 last_start 68 last_finish 66 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 16 type 2 event 68 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 68 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 68 last_start 70 last_finish 68 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 17 type 2 event 70 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 70 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 70 last_start 
72 last_finish 70 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 18 type 2 event 72 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 72 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 rereq 5,3f3a6c7c id 13be024a 5,0 Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start last_stop 72 last_start 73 last_finish 72 Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start count 17 type 1 event 73 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start cb jid 12 id 18 Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start 73 done 0 Nov 30 07:05:09 HPC_01 kernel: 6288 recovery_done jid 12 msg 308 91b Nov 30 07:05:09 HPC_01 kernel: 6288 recovery_done nodeid 18 flg 1b Nov 30 07:05:09 HPC_01 kernel: 6288 recovery_done start_done 73 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 73 last_start 77 last_finish 73 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 18 type 2 event 77 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6283 rereq 2,19 id 132c02a2 5,0 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 77 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start last_stop 77 last_start 78 last_finish 77 Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start count 17 type 3 event 78 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6283 pr_start 78 done 1 Nov 30 07:05:09 HPC_01 kernel: 6283 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6283 rereq 5,38445187 id d0100322 3,0 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 78 last_start 85 last_finish 78 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 18 type 2 event 85 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6283 rereq 2,19 id d0550264 5,0 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 85 done 1 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start last_stop 85 last_start 86 last_finish 85 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start count 17 type 1 event 86 flags a1b Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start cb jid 13 id 5 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_start 86 done 0 Nov 30 07:05:09 HPC_01 kernel: 6288 recovery_done jid 13 msg 308 91b Nov 30 07:05:09 HPC_01 kernel: 6288 recovery_done nodeid 5 flg 1b Nov 30 07:05:09 HPC_01 kernel: 6288 recovery_done start_done 86 Nov 30 07:05:09 HPC_01 kernel: 6284 pr_finish flags 81b Nov 30 07:05:09 HPC_01 kernel: Nov 30 07:05:09 HPC_01 kernel: lock_dlm: Assertion failed on line 440 of file /mnt/src/4/BUILD/gfs-kernel-2.6.9-85/smp/src/dlm/lock.c Nov 30 07:05:09 HPC_01 kernel: lock_dlm: assertion: "!error" Nov 30 07:05:09 HPC_01 kernel: lock_dlm: time = 1199646181 Nov 30 07:05:09 HPC_01 kernel: lv_fastfs: num=2,1a err=-22 cur=-1 req=3 lkf=10000 Nov 30 07:05:09 HPC_01 kernel: Nov 30 07:05:09 HPC_01 kernel: ------------[ cut here ]------------ Nov 30 07:05:09 HPC_01 kernel: kernel BUG at /mnt/src/4/BUILD/gfs-kernel-2.6.9-85/smp/src/dlm/lock.c:440! 
Nov 30 07:05:09 HPC_01 kernel: invalid operand: 0000 [#1] Nov 30 07:05:09 HPC_01 kernel: SMP Nov 30 07:05:09 HPC_01 kernel: Modules linked in: lock_dlm(U) gfs(U) lock_harness(U) sg lquota(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dlm(U) cman(U) nfsd exportfs md5 ipv6 parport_pc lp parport nfs lockd nfs_acl sunrpc dm_mirror dm_mod button battery ac ohci_hcd hw_random k8_edac edac_mc e1000 floppy ext3 jbd qla2300 qla2xxx scsi_transport_fc sata_sil libata sd_mod scsi_mod Nov 30 07:05:09 HPC_01 kernel: CPU: 1 Nov 30 07:05:09 HPC_01 kernel: EIP: 0060:[] Not tainted VLI Nov 30 07:05:09 HPC_01 kernel: EFLAGS: 00010246 (2.6.9-89.0.3.ELsmp) Nov 30 07:05:09 HPC_01 kernel: EIP is at do_dlm_lock+0x134/0x14e [lock_dlm] Nov 30 07:05:09 HPC_01 kernel: eax: 00000001 ebx: ffffffea ecx: c88e5da8 edx: f8ea3409 Nov 30 07:05:09 HPC_01 kernel: esi: f8e9e83f edi: f7dbda00 ebp: f6c7b280 esp: c88e5da4 Nov 30 07:05:09 HPC_01 kernel: ds: 007b es: 007b ss: 0068 Nov 30 07:05:09 HPC_01 kernel: Process rm (pid: 15538, threadinfo=c88e5000 task=f65780b0) Nov 30 07:05:09 HPC_01 kernel: Stack: f8ea3409 20202020 32202020 20202020 20202020 20202020 61312020 ffff0018 Nov 30 07:05:09 HPC_01 kernel: ffffffff f6c7b280 00000003 00000000 f6c7b280 f8e9e8cf 00000003 f8ea6dc0 Nov 30 07:05:09 HPC_01 kernel: f8e77000 f8fead96 00000008 00000001 f655ba78 f655ba5c f8e77000 f8fe09d2 Nov 30 07:05:09 HPC_01 kernel: Call Trace: Nov 30 07:05:09 HPC_01 kernel: [] lm_dlm_lock+0x49/0x52 [lock_dlm] Nov 30 07:05:09 HPC_01 kernel: [] gfs_lm_lock+0x35/0x4d [gfs] Nov 30 07:05:09 HPC_01 kernel: [] gfs_glock_xmote_th+0x130/0x172 [gfs] Nov 30 07:05:09 HPC_01 kernel: [] rq_promote+0xc8/0x147 [gfs] Nov 30 07:05:09 HPC_01 kernel: [] run_queue+0x91/0xc1 [gfs] Nov 30 07:05:09 HPC_01 kernel: [] gfs_glock_nq+0xcf/0x116 [gfs] Nov 30 07:05:09 HPC_01 kernel: [] gfs_glock_nq_init+0x13/0x26 [gfs] Nov 30 07:05:09 HPC_01 kernel: [] gfs_permission+0x0/0x61 [gfs] Nov 30 07:05:09 HPC_01 kernel: [] gfs_permission+0x3a/0x61 [gfs] Nov 30 07:05:09 HPC_01 kernel: [] gfs_permission+0x0/0x61 [gfs] Nov 30 07:05:09 HPC_01 kernel: [] permission+0x4a/0x6e Nov 30 07:05:09 HPC_01 kernel: [] __link_path_walk+0x14a/0xc25 Nov 30 07:05:09 HPC_01 kernel: [] do_page_fault+0x1ae/0x5c6 Nov 30 07:05:09 HPC_01 kernel: [] link_path_walk+0x36/0xa1 Nov 30 07:05:09 HPC_01 kernel: [] path_lookup+0x14b/0x17f Nov 30 07:05:09 HPC_01 kernel: [] sys_unlink+0x2c/0x132 Nov 30 07:05:09 HPC_01 kernel: [] unix_ioctl+0xd1/0xda Nov 30 07:05:09 HPC_01 kernel: [] sys_ioctl+0x227/0x269 Nov 30 07:05:09 HPC_01 kernel: [] sys_ioctl+0x25d/0x269 Nov 30 07:05:09 HPC_01 kernel: [] syscall_call+0x7/0xb Nov 30 07:05:09 HPC_01 kernel: Code: 26 50 0f bf 45 24 50 53 ff 75 08 ff 75 04 ff 75 0c ff 77 18 68 8f 35 ea f8 e8 03 4d 28 c7 83 c4 38 68 09 34 ea f8 e8 f6 4c 28 c7 <0f> 0b b8 01 56 33 ea f8 68 0b 34 ea f8 e8 91 44 28 c7 83 c4 20 Nov 30 07:05:09 HPC_01 kernel: <0>Fatal exception: panic in 5 seconds -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonardodg2084 at gmail.com Mon Nov 30 14:54:30 2009 From: leonardodg2084 at gmail.com (=?ISO-8859-1?Q?Leonardo_D=27Angelo_Gon=E7alves?=) Date: Mon, 30 Nov 2009 12:54:30 -0200 Subject: [Linux-cluster] GFS - Small files - Performance Message-ID: <3170ac020911300654g33fbd14fpa6361b358ba7cbb2@mail.gmail.com> Hi I have a GFS cluster on RHEL4.8 which one filesystem (10G) with various directories and sub-directories and small files about 5Kb. 
When I run the command "du-sh" in the directory it generates about 1500 IOPS on the disks, for GFS it takes time about 5 minutes and 2 second for ext3 filesyem. Could someone help me with this problem. follows below the output of gfs_tool Why for GFS it takes 5 minutes and ext3 2 seconds ? Is there any relation ? ilimit1 = 100 ilimit1_tries = 3 ilimit1_min = 1 ilimit2 = 500 ilimit2_tries = 10 ilimit2_min = 3 demote_secs = 300 incore_log_blocks = 1024 jindex_refresh_secs = 60 depend_secs = 60 scand_secs = 5 recoverd_secs = 60 logd_secs = 1 quotad_secs = 5 inoded_secs = 15 glock_purge = 0 quota_simul_sync = 64 quota_warn_period = 10 atime_quantum = 3600 quota_quantum = 60 quota_scale = 1.0000 (1, 1) quota_enforce = 1 quota_account = 1 new_files_jdata = 0 new_files_directio = 0 max_atomic_write = 4194304 max_readahead = 262144 lockdump_size = 131072 stall_secs = 600 complain_secs = 10 reclaim_limit = 5000 entries_per_readdir = 32 prefetch_secs = 10 statfs_slots = 64 max_mhc = 10000 greedy_default = 100 greedy_quantum = 25 greedy_max = 250 rgrp_try_threshold = 100 statfs_fast = 0 seq_readahead = 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From rpeterso at redhat.com Mon Nov 30 15:00:55 2009 From: rpeterso at redhat.com (Bob Peterson) Date: Mon, 30 Nov 2009 10:00:55 -0500 (EST) Subject: [Linux-cluster] Can`t mount gfs In-Reply-To: <4B0E844A.3070303@nodex.ru> Message-ID: <669981099.721971259593255944.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com> ----- "Pavel Kuzin" wrote: | I`m tring to upgrade my old 1.04 installation to 2.03.11. | I`m installed new versions of software. | I`m using customised kernel. | And cluster is came up. | Creating test filesystem is ok. | But when i trying to mount new filesystem error is occured. | | Can anybody tell what i`m doing wrong? | | Thank you! | /sbin/mount.gfs: mount(2) failed error -1 errno 19 Hi Pavel, Make sure the gfs kernel module is running. In other words, do: lsmod | grep gfs If it's not running, that's your problem. Make sure it is compiled properly for your custom kernel. Maybe you just need to do a depmod -a or something. Regards, Bob Peterson Red Hat File Systems From leonardodg2084 at gmail.com Mon Nov 30 17:03:56 2009 From: leonardodg2084 at gmail.com (=?ISO-8859-1?Q?Leonardo_D=27Angelo_Gon=E7alves?=) Date: Mon, 30 Nov 2009 15:03:56 -0200 Subject: [Linux-cluster] GFS - Preformance small files Message-ID: <3170ac020911300903v7b091d21t44ccc1dd85f61d69@mail.gmail.com> Hi I have a GFS cluster on RHEL4.8 which one filesystem (10G) with various directories and sub-directories and small files about 5Kb. When I run the command "du-sh" in the directory it generates about 1500 IOPS on the disks, for GFS it takes time about 5 minutes and 2 second for ext3 filesyem. Could someone help me with this problem. follows below the output of gfs_tool Why for GFS it takes 5 minutes and ext3 2 seconds ? Is there any relation ? 
ilimit1 = 100 ilimit1_tries = 3 ilimit1_min = 1 ilimit2 = 500 ilimit2_tries = 10 ilimit2_min = 3 demote_secs = 300 incore_log_blocks = 1024 jindex_refresh_secs = 60 depend_secs = 60 scand_secs = 5 recoverd_secs = 60 logd_secs = 1 quotad_secs = 5 inoded_secs = 15 glock_purge = 0 quota_simul_sync = 64 quota_warn_period = 10 atime_quantum = 3600 quota_quantum = 60 quota_scale = 1.0000 (1, 1) quota_enforce = 1 quota_account = 1 new_files_jdata = 0 new_files_directio = 0 max_atomic_write = 4194304 max_readahead = 262144 lockdump_size = 131072 stall_secs = 600 complain_secs = 10 reclaim_limit = 5000 entries_per_readdir = 32 prefetch_secs = 10 statfs_slots = 64 max_mhc = 10000 greedy_default = 100 greedy_quantum = 25 greedy_max = 250 rgrp_try_threshold = 100 statfs_fast = 0 seq_readahead = 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew at ntsg.umt.edu Mon Nov 30 17:34:17 2009 From: andrew at ntsg.umt.edu (Andrew A. Neuschwander) Date: Mon, 30 Nov 2009 10:34:17 -0700 Subject: [Linux-cluster] GFS - Preformance small files In-Reply-To: <3170ac020911300903v7b091d21t44ccc1dd85f61d69@mail.gmail.com> References: <3170ac020911300903v7b091d21t44ccc1dd85f61d69@mail.gmail.com> Message-ID: <4B140219.8010604@ntsg.umt.edu> Leonardo D'Angelo Gon?alves wrote: > Hi > > I have a GFS cluster on RHEL4.8 which one filesystem (10G) with various > directories and sub-directories and small files about 5Kb. When I run the > command "du-sh" in the directory it generates about 1500 IOPS on the disks, > for GFS it takes time about 5 minutes and 2 second for ext3 filesyem. Could > someone help me with this problem. follows below the output of gfs_tool > Why for GFS it takes 5 minutes and ext3 2 seconds ? Is there any relation ? > ilimit1 = 100 > ilimit1_tries = 3 > ilimit1_min = 1 > ilimit2 = 500 > ilimit2_tries = 10 > ilimit2_min = 3 > demote_secs = 300 > incore_log_blocks = 1024 > jindex_refresh_secs = 60 > depend_secs = 60 > scand_secs = 5 > recoverd_secs = 60 > logd_secs = 1 > quotad_secs = 5 > inoded_secs = 15 > glock_purge = 0 > quota_simul_sync = 64 > quota_warn_period = 10 > atime_quantum = 3600 > quota_quantum = 60 > quota_scale = 1.0000 (1, 1) > quota_enforce = 1 > quota_account = 1 > new_files_jdata = 0 > new_files_directio = 0 > max_atomic_write = 4194304 > max_readahead = 262144 > lockdump_size = 131072 > stall_secs = 600 > complain_secs = 10 > reclaim_limit = 5000 > entries_per_readdir = 32 > prefetch_secs = 10 > statfs_slots = 64 > max_mhc = 10000 > greedy_default = 100 > greedy_quantum = 25 > greedy_max = 250 > rgrp_try_threshold = 100 > statfs_fast = 0 > seq_readahead = 0 > > > > ------------------------------------------------------------------------ Leonardo I'm not sure if 4.8 supports it, but in 5.4, the plock_rate_limit option in cluster.conf has a terrible default which can causes this type of slow down. I have these statements in my block in my cluster.conf: Searching the web, there are a lot of recommendations to set plock_ownership to 1, but its benefit is more workload dependent than plock_rate_limit. -Andrew From sunrocs at gmail.com Mon Nov 9 09:31:37 2009 From: sunrocs at gmail.com (kunpeng sun) Date: Mon, 09 Nov 2009 09:31:37 -0000 Subject: [Linux-cluster] (no subject) Message-ID: <857a526a0911090131y422dcea3u545da1579914b71f@mail.gmail.com> hi ,everybody: I have some trouble in using two nodes cluster. 
I want to share the same LUN of an EMC storage array on two servers. The packages that I installed are as follows:

cd /mnt/iso/Server
rpm -Uvh perl-Net-Telnet-3.03-5.noarch.rpm
rpm -ivh openais-0.80.3-15.el5.i386.rpm
rpm -ivh perl-XML-LibXML-Common-0.13-8.2.2.i386.rpm
rpm -ivh perl-XML-NamespaceSupport-1.09-1.2.1.noarch.rpm
rpm -ivh perl-XML-SAX-0.14-5.noarch.rpm
rpm -ivh perl-XML-LibXML-1.58-5.i386.rpm )
rpm -ivh cman-2.0.84-2.el5.i386.rpm
rpm -ivh gfs2-utils-0.1.44-1.el5.i386.rpm
cd /mnt/iso/ClusterStorage/
rpm -ivh gfs-utils-0.1.17-1.el5.i386.rpm
rpm -ivh kmod-gfs-xen-0.1.23-5.el5.i686.rpm
rpm -ivh lvm2-cluster-2.02.32-4.el5.i386.rpm
cd /mnt/iso/Cluster
rpm -ivh system-config-cluster-1.0.52-1.1.noarch.rpm
rpm -ivh rgmanager-2.0.38-2.el5.i386.rpm

The content of my /etc/hosts is:

[root at media1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 media1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.250 media2
192.168.10.253 media1

The content of my /etc/cluster/cluster.conf:

But when I start cman on server media1, it fails to start and reports these errors:

[root at media1 ~]# /etc/init.d/cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... failed
cman not started: two_node set but there are more than 2 nodes
/usr/sbin/cman_tool: aisexec daemon didn't start
[??]

Is there anything wrong with my config file? Could you give me some suggestions? I just want to use the same LUN on two servers.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
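The "two_node set but there are more than 2 nodes" error above is raised when cman finds two_node="1" in cluster.conf but the <clusternodes> section defines more than two nodes; two_node="1" is also only valid together with expected_votes="1". The user's cluster.conf was scrubbed from the archive, so the following is only a minimal sketch of a two-node configuration that satisfies those checks, not his actual file: the cluster name, nodeids and config_version are placeholders, the node names are taken from the /etc/hosts shown above, and real fence devices would still have to be filled in.

<?xml version="1.0"?>
<!-- Minimal two-node sketch (illustrative only, not the poster's config).
     two_node="1" requires expected_votes="1" and exactly two clusternode
     entries; fencing is left empty here and must be configured for real use. -->
<cluster name="mycluster" config_version="1">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="media1" nodeid="1" votes="1">
      <fence/>
    </clusternode>
    <clusternode name="media2" nodeid="2" votes="1">
      <fence/>
    </clusternode>
  </clusternodes>
  <fencedevices/>
</cluster>

With exactly two <clusternode> entries (or with two_node/expected_votes removed when more nodes are listed), copying the corrected file to both nodes and restarting cman should get past this particular startup check.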