From lionel.hay at agriculture.gouv.fr Mon Oct 1 07:06:02 2007 From: lionel.hay at agriculture.gouv.fr (lionel.hay) Date: Mon, 01 Oct 2007 09:06:02 +0200 Subject: [Linux-cluster] DRBD cs:DiskLessClient inconsistant Message-ID: <47009C5A.6030706@agriculture.gouv.fr> Hi What does that mean cs:DiskLessClient ld:Inconsistent ? Here are my status SERVER 1: --------- drbd driver loaded OK; device status: version: 0.7.11 (api:77/proto:74) SVN Revision: 1807 build by root at svr-dep64, 2006-11-23 16:28:05 0: cs:Unconfigured 1: cs:Unconfigured 2: cs:DiskLessClient st:Primary/Secondary ld:Inconsistent ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 3: cs:Connected st:Primary/Secondary ld:Consistent ns:80203 nr:0 dw:45187 dr:54382 al:0 bm:52 lo:0 pe:0 ua:0 ap:0 4: cs:Unconfigured 5: cs:Connected st:Primary/Secondary ld:Consistent ns:15865321 nr:0 dw:15865313 dr:1285608 al:372493 bm:2 lo:0 pe:0 ua:0 ap:0 6: cs:Connected st:Primary/Secondary ld:Consistent ns:2890920 nr:0 dw:224620 dr:8650473 al:938 bm:1088 lo:0 pe:0 ua:0 ap:0 7: cs:Unconfigured SERVER 2: --------- drbd driver loaded OK; device status: version: 0.7.11 (api:77/proto:74) SVN Revision: 1807 build by root at svr-secours, 2006-11-17 15:16:30 0: cs:Unconfigured 1: cs:Unconfigured 2: cs:ServerForDLess st:Secondary/Primary ld:Consistent ns:0 nr:0 dw:35800 dr:0 al:0 bm:60 lo:0 pe:0 ua:0 ap:0 3: cs:Connected st:Secondary/Primary ld:Consistent ns:0 nr:80162 dw:80162 dr:0 al:0 bm:52 lo:0 pe:0 ua:0 ap:0 4: cs:Unconfigured 5: cs:Connected st:Secondary/Primary ld:Consistent ns:0 nr:15865224 dw:15865224 dr:0 al:0 bm:2 lo:0 pe:0 ua:0 ap:0 6: cs:Connected st:Secondary/Primary ld:Consistent ns:0 nr:2890240 dw:2890240 dr:0 al:0 bm:1088 lo:0 pe:0 ua:0 ap:0 7: cs:Unconfigured Thanks for your response Lionel HAY DDAF 64 0559021254 FRANCE From lhh at redhat.com Mon Oct 1 14:09:11 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 01 Oct 2007 10:09:11 -0400 Subject: [Linux-cluster] CS4 U5 / recommended quorumd values for a two nodes cluster In-Reply-To: <20070925124750.GA28897@jasmine.xos.nl> References: <46F8B1C5.7010007@bull.net> <1190709435.23919.11.camel@marc> <20070925124750.GA28897@jasmine.xos.nl> Message-ID: <1191247751.4477.1.camel@ayanami.boston.devel.redhat.com> On Tue, 2007-09-25 at 14:47 +0200, Jos Vos wrote: > On Tue, Sep 25, 2007 at 10:37:15AM +0200, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote: > > > Your problem lies here: expected_votes="3" > > [...] > > > You should calculate your votes like this: > > ( votes % 2 ) + 1 > > > > So you should use this: expected_votes="2" > > Oh... but the FAQ (# 18) explicitly says "nodes + 1" and gives "3" > as the example for a two-node cluster. And I tried it (accidently) > with expected_votes="2" and the result was that the nodes started > fencing each other in an endless loop. This was solve by setting > expected_votes="3". This is on RHEL5. b.t.w., not RHEL4. > two_node=0 expected_votes=3 qdisk=1 vote nodes=1 vote each -- Lon From lhh at redhat.com Mon Oct 1 14:11:30 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 01 Oct 2007 10:11:30 -0400 Subject: [Linux-cluster] help:can't execute Add a failover Domain In-Reply-To: <390793994.12818@ustc.edu.cn> References: <390793994.12818@ustc.edu.cn> Message-ID: <1191247890.4477.3.camel@ayanami.boston.devel.redhat.com> On Wed, 2007-09-26 at 16:06 +0800, lining at mail.ustc.edu.cn wrote: > I have a cluster with two nodes ,it has started. 
> On the conga platform ,when I choose add a failover domain , > it returned as followed: > > Network station Error > this network station occured an error when handle your request. > the error is : > error type: > AttributeError > error value: > getFdomNodes That's a bug. http://bugzilla.redhat.com -- Lon From lhh at redhat.com Mon Oct 1 14:12:47 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 01 Oct 2007 10:12:47 -0400 Subject: [Linux-cluster] Re: CS4 U5 / recommended quorumd values for a two nodes (contd.) In-Reply-To: <46FA5AA8.4010205@bull.net> References: <46FA5AA8.4010205@bull.net> Message-ID: <1191247967.4477.5.camel@ayanami.boston.devel.redhat.com> On Wed, 2007-09-26 at 15:12 +0200, Alain Moulle wrote: > until the cman of second node is started and then : > Sep 26 15:07:01 s_sys at bali0 ccsd[12224]: Cluster is quorate. Allowing connections. > > I have read and read again the FAQ page, especially the # you mention, but > don't understand why it does not work for me ... > Except if my quorum disk is not working ? > But command mkqisk returns : > #mkqdisk -L > mkqdisk v0.5.1 > /dev/sdk: > Magic: eb7a62c2 > Label: CS4QUORUMDISK > Created: Tue Sep 18 16:33:40 2007 > Host: node > but is it sufficient to know if Quorum disk is working correctly ? cat /tmp/qdisk_status On RHEL4.x, you need to chkconfig --add qdiskd On 4.4, qdiskd won't correctly wait for CMAN; this is fixed in the 4.5 version. -- Lon From lhh at redhat.com Mon Oct 1 14:13:06 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 01 Oct 2007 10:13:06 -0400 Subject: [Linux-cluster] rgmanager and qdisk in RHEL5 "behavior problems" In-Reply-To: References: Message-ID: <1191247986.4477.7.camel@ayanami.boston.devel.redhat.com> On Wed, 2007-09-26 at 18:50 +0200, thorsten.henrici at gfd.de wrote: > > Hello List, > has this fix > > http://www.redhat.com/archives/cluster-devel/2007-April/msg00064.html > > Rgmanager thinks qdisk is a node (with node ID 0), so it tries to send > VF information to node 0 - which doesn't exist, causing rgmanger to > not > work when qdisk is running :( This is a bug in 5.0; it's fixed in 5.1 beta. -- Lon From lhh at redhat.com Mon Oct 1 14:14:49 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 01 Oct 2007 10:14:49 -0400 Subject: [Linux-cluster] service can not be relocated In-Reply-To: <9fa3c2e50709270031n187b3403nf0666fdfa9bed4e9@mail.gmail.com> References: <9fa3c2e50709270031n187b3403nf0666fdfa9bed4e9@mail.gmail.com> Message-ID: <1191248089.4477.9.camel@ayanami.boston.devel.redhat.com> On Thu, 2007-09-27 at 15:31 +0800, Changer Van wrote: > Hi all, > > Httpd service can not be relocated when I performed the command as > follows: > > # clusvcadm -r httpd > Trying to relocate service:httpd...Failure > service:httpd is now running on node02 Hi, what release are you using? -- Lon From lhh at redhat.com Mon Oct 1 14:18:55 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 01 Oct 2007 10:18:55 -0400 Subject: [Linux-cluster] Different views In-Reply-To: <46FCCE79.3070804@cesca.es> References: <46FCCE79.3070804@cesca.es> Message-ID: <1191248335.4477.13.camel@ayanami.boston.devel.redhat.com> On Fri, 2007-09-28 at 11:50 +0200, Jordi Prats wrote: > Hi, > I'm getting a strange error: On one node I cannot see the other one, but > on the other I can see both online. Any one can help me with this? > > I'm getting a lot of problems setting up this version of RH cluster (the > one with openais). 
> > Thanks, > > Here I paste some status messages: Is fencing waiting for completion for inf19 on one of the other nodes? When a cluster forms a quorum, the nodes which are not part of the quorate partition need to be fenced. So, it's likely that the one of the nodes is trying to fence inf19. -- Lon From lhh at redhat.com Mon Oct 1 14:20:51 2007 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 01 Oct 2007 10:20:51 -0400 Subject: [Linux-cluster] service can not be relocated In-Reply-To: <1191248089.4477.9.camel@ayanami.boston.devel.redhat.com> References: <9fa3c2e50709270031n187b3403nf0666fdfa9bed4e9@mail.gmail.com> <1191248089.4477.9.camel@ayanami.boston.devel.redhat.com> Message-ID: <1191248451.4477.15.camel@ayanami.boston.devel.redhat.com> On Mon, 2007-10-01 at 10:14 -0400, Lon Hohberger wrote: > On Thu, 2007-09-27 at 15:31 +0800, Changer Van wrote: > > Hi all, > > > > Httpd service can not be relocated when I performed the command as > > follows: > > > > # clusvcadm -r httpd > > Trying to relocate service:httpd...Failure > > service:httpd is now running on node02 > > Hi, what release are you using? Right, and are there any logs on node01 indicating why it might not be started? -- Lon From jobot at wmdata.com Mon Oct 1 14:33:49 2007 From: jobot at wmdata.com (=?iso-8859-1?Q?Borgstr=F6m_Jonas?=) Date: Mon, 1 Oct 2007 16:33:49 +0200 Subject: [Linux-cluster] Possible cman init script race condition In-Reply-To: <20070928170309.GD7239@redhat.com> References: <20070928142730.GA7239@redhat.com> <20070928145818.GB7239@redhat.com> <20070928164547.GC7239@redhat.com> <20070928170309.GD7239@redhat.com> Message-ID: -----Original Message----- From: David Teigland [mailto:teigland at redhat.com] Sent: den 28 september 2007 19:03 To: Borgstr?m Jonas Cc: linux clustering Subject: Re: [Linux-cluster] Possible cman init script race condition > On Fri, Sep 28, 2007 at 11:45:47AM -0500, David Teigland wrote: > > On Fri, Sep 28, 2007 at 09:58:18AM -0500, David Teigland wrote: > > > On Fri, Sep 28, 2007 at 04:48:18PM +0200, Borgstr?m Jonas wrote: > > > > I must have misunderstood you or something, but didn't I already include > > > > that info in the message I sent a few days ago? > > > > > > > > http://permalink.gmane.org/gmane.linux.redhat.cluster/9999 > > > > > > > > (The archive inlines the "group_tool dump" output making it a bit hard > > > > to read, but hopefully your email client shows them as attachments). > > > > > > I missed that, I'll take a look, thanks. > > > > You've hit a known bug that's been fixed: > > https://bugzilla.redhat.com/show_bug.cgi?id=251966 > > > > We may have to move up the release of that fix since people are seeing the > > problem. Be careful when reading that bz because there's a lot of > > incorrect diagnosis that was recorded before we figured out what the real > > bug was. Here's the problem, it's very complex: > > > > 1. when the nodes start up, they each form a 1-node openais cluster > > independent of the other > > > > [This shouldn't really happen, but in reality we can't prevent it > > 100% of the time. We try to make it rare, and then deal with it > > sensibly on the rare occasion when it does happen. You've hit > > the "rare" occasion -- if you're actually seeing this regularly > > then we probably need to fix or adjust something at the openais > > level to make it less common.] > > I'd try to use some sleeps here, before running fence_tool join on either > node, as a work-around. We're trying to get both nodes merged together > before they do anything else. 
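As a rough sketch of that work-around (not the stock init script; the peer name "prod-db1" and the 60 second limit are made-up values), one node could wait for the other to show up as a cluster member before joining the fence domain:

# wait until the peer shows up as a member ("M") in cman_tool nodes,
# then join the fence domain; set PEER to the other node's name
PEER=prod-db1
for i in $(seq 1 60); do
    cman_tool nodes 2>/dev/null | grep " M " | grep -qw "$PEER" && break
    sleep 1
done
fence_tool join
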
Strangely enough adding a "sleep 30" line directly below the "echo "Starting cluster: "" line seems to make this problem go away every time. Note that this is before any daemon is started. It works, but I'm not sure why. > > Also, how often are you seeing the nodes not merge together right away? > If it's frequent, then we need to fix that. This happens every time on this hardware (2 Dell 1955 blades). I never got fenced to work correctly until I figured out that I need to add a sleep 30 to the cman init script. So I'm obviously very interested in seeing this fixed in a 5.0 errata or in 5.1 at the very latest. I can't really wait until 5.2 is out... And as I mentioned before, the really scary part is that I am able to mount gfs filesystems during this kind of cluster split. And if I one node is shot, the other node replays the gfs journal and makes the filesystem writable again without first fencing the shot/missing node. Here some "group_tool -v" output with a mounted filesystem: [root at prod-db2 pgsql]# group_tool -v type level name id state node id local_done fence 0 default 00010002 JOIN_START_WAIT 1 100020001 1 [1 2] dlm 1 clvmd 00020001 JOIN_START_WAIT 1 100020001 1 [1 2] dlm 1 pg_fs 00060001 JOIN_START_WAIT 1 100020001 1 [1 2] gfs 2 pg_fs 00050001 JOIN_START_WAIT 1 100020001 1 [1 2] Regards, Jonas From teigland at redhat.com Mon Oct 1 16:21:46 2007 From: teigland at redhat.com (David Teigland) Date: Mon, 1 Oct 2007 11:21:46 -0500 Subject: [Linux-cluster] Possible cman init script race condition In-Reply-To: References: <20070928142730.GA7239@redhat.com> <20070928145818.GB7239@redhat.com> <20070928164547.GC7239@redhat.com> <20070928170309.GD7239@redhat.com> Message-ID: <20071001162145.GC3937@redhat.com> > Strangely enough adding a "sleep 30" line directly below the "echo > "Starting cluster: "" line seems to make this problem go away every > time. Note that this is before any daemon is started. It works, but I'm > not sure why. Have you tried numbers less than 30? I forget if I've asked yet, but do you have the xend init script disabled? > > Also, how often are you seeing the nodes not merge together right > > away? If it's frequent, then we need to fix that. > > This happens every time on this hardware (2 Dell 1955 blades). I never > got fenced to work correctly until I figured out that I need to add a > sleep 30 to the cman init script. So I'm obviously very interested in > seeing this fixed in a 5.0 errata or in 5.1 at the very latest. I can't > really wait until 5.2 is out... Remember, there are two problems we're talking about here. The first is why openais doesn't merge together for many seconds when both nodes start up in parallel. This should be a rare occurance. The fact that you're seeing it every time implies there's an openais problem, or there could be a problem related to the networking between your nodes. We don't have any idea at this point. Maybe Steve Dake could help you more with this. Your sleep 30 workaround is a clue -- it forces openais to start 30 seconds apart on the two nodes. The second problem is how we deal with the eventual merging of the two clusters. After we fix the first problem, you will probably never see this second problem again. > And as I mentioned before, the really scary part is that I am able to > mount gfs filesystems during this kind of cluster split. And if I one > node is shot, the other node replays the gfs journal and makes the > filesystem writable again without first fencing the shot/missing node. 
I would need to see the logs from the exact scenario you're talking about here to determine if this is a new problem or an effect of the other one. Dave From linux-cluster at veggiechinese.net Mon Oct 1 19:04:02 2007 From: linux-cluster at veggiechinese.net (William Yardley) Date: Mon, 1 Oct 2007 12:04:02 -0700 Subject: [Linux-cluster] simplest usage case for shared storage pool Message-ID: <20071001190402.GA9288@mitch.veggiechinese.net> I have a Dell MD-3000 hooked up (via SAS) to 2 (will be 4 in actual usage) Dell 2900 series servers. I would like to know the _bare_minimum_ set of stuff I need to configure to share a filesystem between the two devices, keeping in mind the following criteria: * The devices will mount the filesystem ro - there will be only a single node mounting the filesystesm rw * The root filesystem or other system files will not be mounting the array - just a filesystem with shared data. * The applications that will be accessing the data will be Apache (read-only), as well as (on the rw host) rsync or some other standard utility to keep things up to date. So given that, is fencing each host and having a full cluster setup really necessary? Given that gfs (and the application(s) accessing the image do POSIX compliant file locking, isn't there some simpler way to accomplish this? Assuming it is necessary, I should be able to use IPMI on the individual hosts to fence the devices, correct? Obviously NFS would be a bit simpler, but even with a NetApp or other filer appliance, and the latest versions of NFS, I'm a bit concerned that the performance wouldn't be as good. w From stefano.bossi at mediaset.it Mon Oct 1 21:06:02 2007 From: stefano.bossi at mediaset.it (Stefano Bossi) Date: Mon, 01 Oct 2007 23:06:02 +0200 Subject: [Linux-cluster] dlm_controld problem starting Message-ID: <4701613A.5040301@mediaset.it> Hi guys, I'm trying to compile a GFS2 system from source but I found some trouble. I'm using 2.6.23-rc7 #1 SMP PREEMPT Mon Oct 1 14:14:36 CEST 2007 x86_64 AMD Opteron(tm) Processor 285 AuthenticAMD GNU/Linux on the node where I found the trouble. This is a 4 nodes cluster: [root at SAN-node1 mnt]# cman_tool nodes Node Sts Inc Joined Name 1 M 416952 2007-10-01 19:31:46 SAN-node1 2 M 417044 2007-10-01 19:31:46 SAN-node2 3 M 417056 2007-10-01 22:37:34 SAN-node3 4 M 417044 2007-10-01 19:31:46 SAN-node4 [root at SAN-node1 mnt]# cman_tool status Version: 6.0.1 Config Version: 10 Cluster Name: alpha_cluster Cluster Id: 50356 Cluster Member: Yes Cluster Generation: 417056 Membership state: Cluster-Member Nodes: 4 Expected votes: 4 Total votes: 4 Quorum: 3 Active subsystems: 8 Flags: Ports Bound: 0 11 177 Node name: SAN-node1 Node ID: 1 Multicast addresses: 239.0.1.100 Node addresses: 10.102.41.74 as you can see the cluster is quorate and the other three node are correcly working (they are Fedora core 7 and no compilation from scratch !) The node I'm trying to rebuild is the SAN-node3 and the error is: Oct 1 22:38:00 SAN-node3 dlm_controld[12928]: No /sys/kernel/config/dlm, is the dlm loaded? Oct 1 22:38:00 SAN-node3 dlm_controld[12928]: No /sys/kernel/config/dlm, is the dlm loaded? Oct 1 22:38:00 SAN-node3 dlm_controld[12928]: No /sys/kernel/config/dlm, is the dlm loaded? Oct 1 22:38:00 SAN-node3 dlm_controld[12928]: No /sys/kernel/config/dlm, is the dlm loaded? of course I recompiled the kernel too. Where I can point my attention to find out the problem? Has some more experienced then me some useful suggestion? 
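A minimal sketch of the two checks that error message points at, neither specific to this cluster:

# is the dlm kernel module loaded?
lsmod | grep -w dlm || modprobe dlm

# is configfs mounted where dlm_controld expects it?
grep -q configfs /proc/mounts || mount -t configfs none /sys/kernel/config

# with both in place, this directory should exist
ls -d /sys/kernel/config/dlm
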
Thanks, Stefano Le informazioni trasmesse sono destinate esclusivamente alla persona o alla societ? in indirizzo e sono da intendersi confidenziali e riservate. Ogni trasmissione, inoltro, diffusione o altro uso di queste informazioni a persone o societ? differenti dal destinatario ? proibita. Se ricevete questa comunicazione per errore, contattate il mittente e cancellate le informazioni da ogni computer. The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer. From teigland at redhat.com Mon Oct 1 21:01:45 2007 From: teigland at redhat.com (David Teigland) Date: Mon, 1 Oct 2007 16:01:45 -0500 Subject: [Linux-cluster] dlm_controld problem starting In-Reply-To: <4701613A.5040301@mediaset.it> References: <4701613A.5040301@mediaset.it> Message-ID: <20071001210145.GE3937@redhat.com> On Mon, Oct 01, 2007 at 11:06:02PM +0200, Stefano Bossi wrote: > Oct 1 22:38:00 SAN-node3 dlm_controld[12928]: No > /sys/kernel/config/dlm, is the dlm loaded? > Oct 1 22:38:00 SAN-node3 dlm_controld[12928]: No > /sys/kernel/config/dlm, is the dlm loaded? > Oct 1 22:38:00 SAN-node3 dlm_controld[12928]: No > /sys/kernel/config/dlm, is the dlm loaded? > Oct 1 22:38:00 SAN-node3 dlm_controld[12928]: No > /sys/kernel/config/dlm, is the dlm loaded? Either you've not loaded the dlm kernel module, or you've not mounted configfs, mount -t configfs none /sys/kernel/config. Dave From maciej.bogucki at artegence.com Tue Oct 2 09:46:36 2007 From: maciej.bogucki at artegence.com (Maciej Bogucki) Date: Tue, 02 Oct 2007 11:46:36 +0200 Subject: [Linux-cluster] DRBD cs:DiskLessClient inconsistant In-Reply-To: <47009C5A.6030706@agriculture.gouv.fr> References: <47009C5A.6030706@agriculture.gouv.fr> Message-ID: <4702137C.2080208@artegence.com> lionel.hay napisa?(a): > Hi > > What does that mean cs:DiskLessClient ld:Inconsistent ? Hello, Check Your log(and dmesg). Probably your storage failed on IO request and DRBD deatched one. Best Regarads Maciej Bogucki From jobot at wmdata.com Tue Oct 2 15:51:40 2007 From: jobot at wmdata.com (=?iso-8859-1?Q?Borgstr=F6m_Jonas?=) Date: Tue, 2 Oct 2007 17:51:40 +0200 Subject: [Linux-cluster] Possible cman init script race condition In-Reply-To: <20071001162145.GC3937@redhat.com> References: <20070928142730.GA7239@redhat.com> <20070928145818.GB7239@redhat.com> <20070928164547.GC7239@redhat.com> <20070928170309.GD7239@redhat.com> <20071001162145.GC3937@redhat.com> Message-ID: -----Original Message----- From: David Teigland [mailto:teigland at redhat.com] Sent: den 1 oktober 2007 18:22 To: Borgstr?m Jonas Cc: linux clustering Subject: Re: [Linux-cluster] Possible cman init script race condition > > > Strangely enough adding a "sleep 30" line directly below the "echo > > "Starting cluster: "" line seems to make this problem go away every > > time. Note that this is before any daemon is started. It works, but I'm > > not sure why. > > Have you tried numbers less than 30? I forget if I've asked yet, but do > you have the xend init script disabled? I did try "sleep 15" but that was not enough. Maybe the HBA/lun initialization that's taking too long or something. And no, xen is not installed on these servers. 
> > > > > Also, how often are you seeing the nodes not merge together right > > > away? If it's frequent, then we need to fix that. > > > > This happens every time on this hardware (2 Dell 1955 blades). I never > > got fenced to work correctly until I figured out that I need to add a > > sleep 30 to the cman init script. So I'm obviously very interested in > > seeing this fixed in a 5.0 errata or in 5.1 at the very latest. I can't > > really wait until 5.2 is out... > > Remember, there are two problems we're talking about here. The first is > why openais doesn't merge together for many seconds when both nodes start > up in parallel. This should be a rare occurance. The fact that you're > seeing it every time implies there's an openais problem, or there could be > a problem related to the networking between your nodes. We don't have any > idea at this point. Maybe Steve Dake could help you more with this. Your > sleep 30 workaround is a clue -- it forces openais to start 30 seconds > apart on the two nodes. No, I think the cman daemons are started at pretty much the same time on both nodes. At least if I reboot both machines at the same time. "sleep 30" gives the kernel and the programs started before "cman" an extra 30 seconds to do their stuff before the bulk of the cman init script is executed. Another workaround is to run "chkconfig cman off" and start it from /etc/rc.d/rc.local. That also works, and does not require and "sleep". This probably works since rc.local is the very last thing executed by the boot-up process and that is probably at least 30 seconds later. > > The second problem is how we deal with the eventual merging of the two > clusters. After we fix the first problem, you will probably never see > this second problem again. > > > > And as I mentioned before, the really scary part is that I am able to > > mount gfs filesystems during this kind of cluster split. And if I one > > node is shot, the other node replays the gfs journal and makes the > > filesystem writable again without first fencing the shot/missing node. > > I would need to see the logs from the exact scenario you're talking about > here to determine if this is a new problem or an effect of the other one. Ok, here's some log outpt: Scenario: A gfs filesystem is mounted on two nodes in a "split cluster" cluster.conf: http://jonas.borgstrom.se/gfs/cluster.conf Node: prod-db1: group_tool -v: http://jonas.borgstrom.se/gfs/prod_db1_group_tool_v.txt group_tool dump: http://jonas.borgstrom.se/gfs/prod_db1_group_tool_dump.txt Node: prod-db2: group_tool -v: http://jonas.borgstrom.se/gfs/prod_db2_group_tool_v.txt group_tool dump: http://jonas.borgstrom.se/gfs/prod_db2_group_tool_dump.txt Node prod-db1 is now shot and prod-db2 happily replays the gfs journal without first fencing the failed node: Node: prod-db2: group_tool -v: http://jonas.borgstrom.se/gfs/prod_db2_group_tool_v_after_prod_db1_is_shot.txt group_tool dump: http://jonas.borgstrom.se/gfs/prod_db2_group_tool_dump_after_prod_db1_is_shot.txt /var/log/messages: http://jonas.borgstrom.se/gfs/prod_db2_messages_after_prod_db1_is_shot.txt So gfs is till mounted and writable on prod-db2 even though prod-db1 was never fenced. Expected behavior: prod-db1 should be fenced before the gfs journal is replayed. (Which happens if I add "sleep 30" to /etc/rc.d/init.d/cman). 
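A minimal sketch of that rc.local variant; which other services have to move along with cman (clvmd, gfs, rgmanager) is an assumption based on the usual init ordering, not something stated in this thread:

# take the cluster stack out of the normal runlevel ordering...
chkconfig cman off
chkconfig clvmd off
chkconfig gfs off
chkconfig rgmanager off

# ...and start it last, from /etc/rc.d/rc.local
cat >> /etc/rc.d/rc.local <<'EOF'
service cman start
service clvmd start
service gfs start
service rgmanager start
EOF
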
Regards, Jonas From raycharles_man at yahoo.com Tue Oct 2 16:08:25 2007 From: raycharles_man at yahoo.com (Ray Charles) Date: Tue, 2 Oct 2007 09:08:25 -0700 (PDT) Subject: [Linux-cluster] partition tables not sync'd Message-ID: <694881.1462.qm@web32112.mail.mud.yahoo.com> Hi, I have a one year old cluster(2 nodes) in production that is GFS 6.1 attached to an iscsi san. Its currently in an active passive arrangement for tomcat/apache. While the active node was busy servicing web visitors, I used the other node to add a lun from the san; then a physical volume and then a logical volume to the existing vol.group; then I put the file system on the new partition. I also have it mounted on the non-active node. The active node will not recognize the new partition schema. I've rebooted it but it still doesn't want to see the new partition the other node has created. What should I do? -tia ____________________________________________________________________________________ Don't let your dream ride pass you by. Make it a reality with Yahoo! Autos. http://autos.yahoo.com/index.html From rhurst at bidmc.harvard.edu Tue Oct 2 16:13:43 2007 From: rhurst at bidmc.harvard.edu (Robert Hurst) Date: Tue, 02 Oct 2007 12:13:43 -0400 Subject: [Linux-cluster] partition tables not sync'd In-Reply-To: <694881.1462.qm@web32112.mail.mud.yahoo.com> References: <694881.1462.qm@web32112.mail.mud.yahoo.com> Message-ID: <1191341623.19225.22.camel@xw9300.bidmc.harvard.edu> Silly question: you also masked that lun on the active node, right? You could also remove the /etc/lvm/.cache and lvmdiskscan / vgscan --mknodes On Tue, 2007-10-02 at 09:08 -0700, Ray Charles wrote: > Hi, > > I have a one year old cluster(2 nodes) in production > that is GFS 6.1 attached to an iscsi san. Its > currently in an active passive arrangement for > tomcat/apache. > > While the active node was busy servicing web visitors, > I used the other node to add a lun from the san; then > a physical volume and then a logical volume to the > existing vol.group; then I put the file system on the > new partition. I also have it mounted on the > non-active node. > > The active node will not recognize the new partition > schema. I've rebooted it but it still doesn't want to > see the new partition the other node has created. > > What should I do? > > > -tia > > > > > > ____________________________________________________________________________________ > Don't let your dream ride pass you by. Make it a reality with Yahoo! Autos. > http://autos.yahoo.com/index.html > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- An HTML attachment was scrubbed... URL: From raycharles_man at yahoo.com Tue Oct 2 16:26:20 2007 From: raycharles_man at yahoo.com (Ray Charles) Date: Tue, 2 Oct 2007 09:26:20 -0700 (PDT) Subject: [Linux-cluster] partition tables not sync'd In-Reply-To: <1191341623.19225.22.camel@xw9300.bidmc.harvard.edu> Message-ID: <126009.89759.qm@web32101.mail.mud.yahoo.com> Well, in my original attempt i did a rescan-scsi_bus, vgscan and lvscan but the new partition didn't show up. I think you're on to something that i also want to try and that's the /etc/lvm/.cache. Any precautions? -tia --- Robert Hurst wrote: > Silly question: you also masked that lun on the > active node, right? 
> You could also remove the /etc/lvm/.cache and > lvmdiskscan / vgscan > --mknodes > > On Tue, 2007-10-02 at 09:08 -0700, Ray Charles > wrote: > > > Hi, > > > > I have a one year old cluster(2 nodes) in > production > > that is GFS 6.1 attached to an iscsi san. Its > > currently in an active passive arrangement for > > tomcat/apache. > > > > While the active node was busy servicing web > visitors, > > I used the other node to add a lun from the san; > then > > a physical volume and then a logical volume to the > > existing vol.group; then I put the file system on > the > > new partition. I also have it mounted on the > > non-active node. > > > > The active node will not recognize the new > partition > > schema. I've rebooted it but it still doesn't want > to > > see the new partition the other node has created. > > > > What should I do? > > > > > > -tia > > > > > > > > > > > > > ____________________________________________________________________________________ > > Don't let your dream ride pass you by. Make it a > reality with Yahoo! Autos. > > http://autos.yahoo.com/index.html > > > > > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > > https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster ____________________________________________________________________________________ Looking for a deal? Find great prices on flights and hotels with Yahoo! FareChase. http://farechase.yahoo.com/ From teigland at redhat.com Tue Oct 2 16:25:14 2007 From: teigland at redhat.com (David Teigland) Date: Tue, 2 Oct 2007 11:25:14 -0500 Subject: [Linux-cluster] Possible cman init script race condition In-Reply-To: References: <20070928142730.GA7239@redhat.com> <20070928145818.GB7239@redhat.com> <20070928164547.GC7239@redhat.com> <20070928170309.GD7239@redhat.com> <20071001162145.GC3937@redhat.com> Message-ID: <20071002162514.GA30975@redhat.com> On Tue, Oct 02, 2007 at 05:51:40PM +0200, Borgstr?m Jonas wrote: > No, I think the cman daemons are started at pretty much the same time on > both nodes. At least if I reboot both machines at the same time. "sleep > 30" gives the kernel and the programs started before "cman" an extra 30 > seconds to do their stuff before the bulk of the cman init script is > executed. > > Another workaround is to run "chkconfig cman off" and start it from > /etc/rc.d/rc.local. That also works, and does not require and "sleep". > This probably works since rc.local is the very last thing executed by > the boot-up process and that is probably at least 30 seconds later. I've finally chatted with Steve Dake about this, and he's quite certain that this is a result of openais bugs in the RHEL5.0 release -- fixed in the upcoming 5.1. It might be easiest to use your workarounds until 5.1. > Ok, here's some log outpt: > > Scenario: A gfs filesystem is mounted on two nodes in a "split cluster" Thanks a lot, I'll take a look. Dave From rhurst at bidmc.harvard.edu Tue Oct 2 17:02:40 2007 From: rhurst at bidmc.harvard.edu (Robert Hurst) Date: Tue, 02 Oct 2007 13:02:40 -0400 Subject: [Linux-cluster] partition tables not sync'd In-Reply-To: <126009.89759.qm@web32101.mail.mud.yahoo.com> References: <126009.89759.qm@web32101.mail.mud.yahoo.com> Message-ID: <1191344560.19225.37.camel@xw9300.bidmc.harvard.edu> I erase that cache before invoking EMC PowerPath, because I have had issues with changes not making it to that multipathing I/O solution. 
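Roughly, the sequence being suggested is (sketch only; the VG name is illustrative, and any multipath layer may need its own refresh first):

rm -f /etc/lvm/.cache     # drop LVM's cached device list
lvmdiskscan               # rescan block devices for PV labels
vgscan --mknodes          # re-read VG metadata and recreate /dev nodes
lvscan                    # the new LV should now be visible
vgchange -ay sharedvg     # activate it if needed (VG name is an assumption)
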
I have experienced no issues doing it adhoc, either. rescan-scsi bus doesn't necessarily work with the fiber channel cards... I have to send an equivalent scsi request command directly to the fc driver (in our case, we are using Emulex lpfc). An example script we use: #!/bin/bash # # rescan Emulex Fiber Channel card for new SCSI devices # ACTION=`basename $0` powermt display dev=all | grep emcpower > /tmp/$ACTION.old echo 0 > /sys/class/fc_host/host0/issue_lip echo "- - -" > /sys/class/scsi_host/host0/scan echo 1 > /sys/class/fc_host/host1/issue_lip echo "- - -" > /sys/class/scsi_host/host1/scan powermt config powermt display dev=all | grep emcpower > /tmp/$ACTION.new echo "Differences before & after" echo "==========================" diff /tmp/$ACTION.old /tmp/$ACTION.new On Tue, 2007-10-02 at 09:26 -0700, Ray Charles wrote: > Well, in my original attempt i did a rescan-scsi_bus, > vgscan and lvscan but the new partition didn't show > up. > > I think you're on to something that i also want to try > and that's the /etc/lvm/.cache. > > Any precautions? > > -tia -------------- next part -------------- An HTML attachment was scrubbed... URL: From raycharles_man at yahoo.com Tue Oct 2 17:47:41 2007 From: raycharles_man at yahoo.com (Ray Charles) Date: Tue, 2 Oct 2007 10:47:41 -0700 (PDT) Subject: [Linux-cluster] partition tables not sync'd In-Reply-To: <1191344560.19225.37.camel@xw9300.bidmc.harvard.edu> Message-ID: <807026.16460.qm@web32108.mail.mud.yahoo.com> mea culpa- Problem caused by an ID 10T controlling the box. The ID 10T didn't provision the active server to be a host for the new lun. Sorry for the bother. The ID 10T has been given a slap and a shake. -Thanks everyone --- Robert Hurst wrote: > I erase that cache before invoking EMC PowerPath, > because I have had > issues with changes not making it to that > multipathing I/O solution. I > have experienced no issues doing it adhoc, either. > > rescan-scsi bus doesn't necessarily work with the > fiber channel cards... > I have to send an equivalent scsi request command > directly to the fc > driver (in our case, we are using Emulex lpfc). An > example script we > use: > > #!/bin/bash > # > # rescan Emulex Fiber Channel card for new SCSI > devices > # > > ACTION=`basename $0` > > powermt display dev=all | grep emcpower > > /tmp/$ACTION.old > > echo 0 > /sys/class/fc_host/host0/issue_lip > echo "- - -" > /sys/class/scsi_host/host0/scan > > echo 1 > /sys/class/fc_host/host1/issue_lip > echo "- - -" > /sys/class/scsi_host/host1/scan > > powermt config > > powermt display dev=all | grep emcpower > > /tmp/$ACTION.new > > echo "Differences before & after" > echo "==========================" > diff /tmp/$ACTION.old /tmp/$ACTION.new > > > On Tue, 2007-10-02 at 09:26 -0700, Ray Charles > wrote: > > > Well, in my original attempt i did a > rescan-scsi_bus, > > vgscan and lvscan but the new partition didn't > show > > up. > > > > I think you're on to something that i also want to > try > > and that's the /etc/lvm/.cache. > > > > Any precautions? > > > > -tia > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster ____________________________________________________________________________________ Fussy? Opinionated? Impossible to please? Perfect. Join Yahoo!'s user panel and lay it on us. 
http://surveylink.yahoo.com/gmrs/yahoo_panel_invite.asp?a=7 From teigland at redhat.com Tue Oct 2 19:03:05 2007 From: teigland at redhat.com (David Teigland) Date: Tue, 2 Oct 2007 14:03:05 -0500 Subject: [Linux-cluster] Possible cman init script race condition In-Reply-To: References: <20070928142730.GA7239@redhat.com> <20070928145818.GB7239@redhat.com> <20070928164547.GC7239@redhat.com> <20070928170309.GD7239@redhat.com> <20071001162145.GC3937@redhat.com> Message-ID: <20071002190305.GA4378@redhat.com> On Tue, Oct 02, 2007 at 05:51:40PM +0200, Borgstr?m Jonas wrote: > > > And as I mentioned before, the really scary part is that I am able to > > > mount gfs filesystems during this kind of cluster split. And if I one > > > node is shot, the other node replays the gfs journal and makes the > > > filesystem writable again without first fencing the shot/missing node. > > > > I would need to see the logs from the exact scenario you're talking about > > here to determine if this is a new problem or an effect of the other one. > > Ok, here's some log outpt: > > Scenario: A gfs filesystem is mounted on two nodes in a "split cluster" ... > So gfs is till mounted and writable on prod-db2 even though prod-db1 was > never fenced. Yes, you're correct. I've looked at the logs, and it's a side effect of the other bug where cman should disallow the merger of the two clusters. So, in summary, you've identified three different problems, each one is an effect of the one before it: 1. unidentified openais bug(s) in RHEL5.0 cause the two nodes to initially form independent clusters -- fixed in 5.1 2. bz 251966 is triggered by (1) -- fixed in 5.2 (maybe earlier) 3. groupd/fenced don't fence the failed node; this is triggered by (2). once (2) is fixed this won't happen Dave From changerv at gmail.com Wed Oct 3 03:33:05 2007 From: changerv at gmail.com (Changer Van) Date: Wed, 3 Oct 2007 11:33:05 +0800 Subject: [Linux-cluster] service can not be relocated Message-ID: <9fa3c2e50710022033u466ea0dawfea6bdcf8b278920@mail.gmail.com> ------------------------------ Message: 8 Date: Mon, 01 Oct 2007 10:20:51 -0400 From: Lon Hohberger Subject: Re: [Linux-cluster] service can not be relocated To: linux clustering Message-ID: <1191248451.4477.15.camel at ayanami.boston.devel.redhat.com> Content-Type: text/plain On Mon, 2007-10-01 at 10:14 -0400, Lon Hohberger wrote: > On Thu, 2007-09-27 at 15:31 +0800, Changer Van wrote: > > Hi all, > > > > Httpd service can not be relocated when I performed the command as > > follows: > > > > # clusvcadm -r httpd > > Trying to relocate service:httpd...Failure > > service:httpd is now running on node02 > > Hi, what release are you using? RHEL 5 (2.6.18-8el5) > Right, and are there any logs on node01 indicating why it might not be > started? No, there aren't. But service httpd was relocated to node01 while cluster member was specified like 'clusvcadm -r httpd -m node01'. Now the service was on node01. I did a test as follows: I unplugged network cable of node01 for a while then plugged in again. Service cman was terminated on node02 suddenly, and it could not stop on node02. logs on node02: node02 openais[2813]: [CLM ] CLM CONFIGURATION CHANGE node02 openais[2813]: [CLM ] New Configuration: node02 openais[2813]: [CLM ] r(0) ip(192.168.0.221) node02 openais[2813]: [CLM ] Members Left: node02 openais[2813]: [CLM ] Members Joined: node02 openais[2813]: [SYNC ] This node is within the primary component and will provide service. 
node02 openais[2813]: [CLM ] CLM CONFIGURATION CHANGE node02 openais[2813]: [CLM ] New Configuration: node02 openais[2813]: [CLM ] r(0) ip(192.168.0.219) node02 openais[2813]: [CLM ] r(0) ip(192.168.0.221) node02 openais[2813]: [CLM ] Members Left: node02 openais[2813]: [CLM ] Members Joined: node02 openais[2813]: [CLM ] r(0) ip(192.168.0.219) node02 openais[2813]: [SYNC ] This node is within the primary component and will provide service. node02 openais[2813]: [TOTEM] entering OPERATIONAL state. node02 openais[2813]: [MAIN ] Killing node node01 because it has rejoined the cluster without cman_tool join node02 openais[2813]: [CMAN ] cman killed by node 2 for reason 3 node02 dlm_controld[2843]: groupd is down, exiting node02 kernel: dlm: closing connection to node 1 node02 gfs_controld[2849]: groupd_dispatch error -1 errno 11 node02 gfs_controld[2849]: groupd connection died node02 gfs_controld[2849]: cluster is down, exiting node02 ccsd[2807]: Unable to connect to cluster infrastructure after 30 seconds. node02 ccsd[2807]: Unable to connect to cluster infrastructure after 60 seconds. node02 ccsd[2807]: Unable to connect to cluster infrastructure after 90 seconds. Any help would be greatly appreciated. -- Regards, Changer -------------- next part -------------- An HTML attachment was scrubbed... URL: From bernard.chew at muvee.com Wed Oct 3 07:14:25 2007 From: bernard.chew at muvee.com (Bernard Chew) Date: Wed, 3 Oct 2007 15:14:25 +0800 Subject: [Linux-cluster] fence_xvmd in RHEL5 with a virtual domU cluster Message-ID: <229C73600EB0E54DA818AB599482BCE901C0AB46@shadowfax.sg.muvee.net> Hi, I follow the steps below to use fence_xvmd in RHEL5; (1) configure dom0 like a 1-node cluster (2) Add "" to cluster.conf in dom0 as a child of the "" tag. (3) dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4096 count=1 (4) scp /etc/cluster/fence_xvm.key root virtual_node_1:/etc/cluster (5) scp /etc/cluster/fence_xvm.key root virtual_node_2:/etc/cluster (6) Start cman on dom0 - this should start fence_xvmd for you However, I encounter the following errors while running "fence_xvmd -fddddddddd" on the dom0 node and "fence_xvm -H -o null" on one of the guests; Hash mismatch: PKT = 94598bef00f4bc3198032800b714bca59581404f8ca2e9c8ea8bb1119840e83c00000000 00000000000000000000000000000000000000000000000000000000 EXP = 85168eb74638ff044c31a4749dba9cc0b9c66e319398dcc8cd97ee4cf1e3936800000000 00000000000000000000000000000000000000000000000000000000 Key mismatch; dropping packet Any idea why? Regards, Bernard Chew From ben.yarwood at juno.co.uk Wed Oct 3 11:53:28 2007 From: ben.yarwood at juno.co.uk (Ben Yarwood) Date: Wed, 3 Oct 2007 12:53:28 +0100 Subject: [Linux-cluster] fencing using rps-10 Message-ID: <00ea01c805b4$039333d0$0ab99b70$@yarwood@juno.co.uk> I think the documentation for using the rps10 fence device is incorrect (http://sources.redhat.com/cluster/doc/cluster_schema_rhel5.html), or else there is a bug in the fence agent in rhel5: The doc does not mention that you must specify an "option" attribute or else the agent returns an error. Eg. 
will work but without the "option" attribute you get the error: failed: operation must be 'on', 'off', or 'reboot' Thanks Ben From Jeremyc at tasconline.com Wed Oct 3 14:36:39 2007 From: Jeremyc at tasconline.com (Jeremy Carroll) Date: Wed, 3 Oct 2007 09:36:39 -0500 Subject: [Linux-cluster] VMWare Fencing / RHCS 4 In-Reply-To: <00ea01c805b4$039333d0$0ab99b70$@yarwood@juno.co.uk> References: <00ea01c805b4$039333d0$0ab99b70$@yarwood@juno.co.uk> Message-ID: Does anybody here know of a fencing module that would work with VMWare ESX Server 3? We utilize VMWare for our cluster infrastructure and would like to put fencing in place to power down virtual machines. Thanks! From kanderso at redhat.com Wed Oct 3 14:45:06 2007 From: kanderso at redhat.com (Kevin Anderson) Date: Wed, 03 Oct 2007 09:45:06 -0500 Subject: [Linux-cluster] VMWare Fencing / RHCS 4 In-Reply-To: References: <00ea01c805b4$039333d0$0ab99b70$@yarwood@juno.co.uk> Message-ID: <1191422706.2718.24.camel@dhcp80-204.msp.redhat.com> On Wed, 2007-10-03 at 09:36 -0500, Jeremy Carroll wrote: > Does anybody here know of a fencing module that would work with VMWare > ESX Server 3? We utilize VMWare for our cluster infrastructure and would > like to put fencing in place to power down virtual machines. > Our desire is to use the fence_xvm/fence_xvmd agent for all virtual machine management in the clusters. The problem with fencing virtual machines is knowing on which physical machine the virtual instance is executing. With the ability to failover/restart/migrate virtual instances, fence_xvmd maintains that status and tracks the movement. This issue is that fence_xvmd uses libvirt interfaces to do this for xen and other virtual engines. However, libvirt does not have APIs to control VMWare instances due to VMWare not providing/documenting their control points. Given the lack of documentation, it will be problematic to integrate that capability into the open source products. So, put pressure on VMWare from a customer standpoint to open up their interfaces. Thanks Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbrassow at redhat.com Wed Oct 3 14:57:17 2007 From: jbrassow at redhat.com (Jonathan Brassow) Date: Wed, 3 Oct 2007 09:57:17 -0500 Subject: [Linux-cluster] Re: Some ideas on changes to the lvm.sh agent (or new agents). In-Reply-To: <1190996054.5802.30.camel@localhost> References: <1190996054.5802.30.camel@localhost> Message-ID: <3984516A-D158-487B-BC6D-AAD5850746CD@redhat.com> Great stuff! Much of what you are describing I've thought about in the past, but just haven't had the cycles to work on. You can see in the script itself, the comments at the top mention the desire to operate on the VG level. You can also see a couple vg_* functions that simply return error right now, but were intended to be filled in. Comments in-line. On Sep 28, 2007, at 11:14 AM, Simone Gotti wrote: > Hi, > > Trying to use a non cluster vg in redhat cluster I noticed that > lvm.sh, > to avoid metadata corruption, is forcing the need of only one lv > per vg. > > I was thinking that other clusters don't have this limitation as they > let you just use a vg only on one node at a time (and also on one > service group at a time). > > To test if this was possible with lvm2 I made little changes to lvm.sh > (just variables renames, use of vgchange instead of lvchange for tag > adding) and using the same changes needed to /etc/lvm/lvm.conf > (volume_list = [ "rootvgname", "@my_hostname" ]) looks like this idea > was working. 
> > I can activate the vg and all of its volume only on the node with > the vg > tagged with its hostname and the start on the other nodes is refused. > > Now, will this idea be accepted? If so these are a list of possible > needed changes and other ideas: > > *) Make also unique="1" or > better primary="1" and remove the parameter "name" as only one service > can use a vg. Sounds reasonable. Be careful when using those parameters though, they often result in cryptic error messages that are tough to follow. I do checks in lvm.sh where possible to be able to give the user more information on what went wrong. > > *) What vg_status should do? > a) Monitor all the LVs > or > b) Check only the VG and use ANOTHER resource agent for every lv > used by > the cluster? So I can create/remove/modify lvs on that vg that aren't > under rgmanager control without any error reported by the status > functions of the lvm.sh agent. > Also other clusters distinguish between vg and lv and they have 2 > different agents for them. This is were things get difficult. It would be ok to modify lvs on that vg as long as it's on the same machine that has ownership. Tags should prevent otherwise, so should be ok. User would have to be careful (or barriers would have to prevent) users from assigning different LVs in the same VG to different services. Otherwise, if a service fails (application level) and must be moved to a different machine, we would have to find a way to move all services associated with the VG to the next machine. I think there are ways to mandate this (that service A stick with service B), but we would have to have a way to enforce it. > Creating two new agents will also leave the actual lvm.sh without > changes and keep backward compatibility for who is already using it. > > Something like this (lets call lvm_vg and lvm_lv respectively the > agents > for the vg and the lv): > > > > >
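For reference, the tag-based activation described above boils down to something like this on whichever node should own the VG (a sketch reusing the vgforcluster name from the example, not the actual lvm.sh or lvm_vg agent code):

# /etc/lvm/lvm.conf on every node, as quoted above:
#   volume_list = [ "rootvgname", "@my_hostname" ]

# start: claim the VG for this node and activate all of its LVs
vgchange --addtag "$(uname -n)" vgforcluster
vgchange -ay vgforcluster

# stop: deactivate and release it so another node can take over
vgchange -an vgforcluster
vgchange --deltag "$(uname -n)" vgforcluster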