From neale at sinenomine.net Wed Aug 20 04:45:12 2014
From: neale at sinenomine.net (Neale Ferguson)
Date: Wed, 20 Aug 2014 04:45:12 +0000
Subject: [Linux-cluster] clvmd not terminating
Message-ID: <6A0A9B9D-E287-468C-B784-3F0A8DAB4E23@sinenomine.net>

We have a sporadic situation where we are attempting to shut down/restart both nodes of a two-node cluster. One shuts down completely, but the other sometimes hangs with:

[root at aude2mq036nabzi ~]# service cman stop
Stopping cluster:
   Leaving fence domain... found dlm lockspace /sys/kernel/dlm/clvmd
   fence_tool: cannot leave due to active systems
                                                           [FAILED]

When the other node is brought back up it has problems with clvmd:

# pvscan
  connect() failed on local socket: Connection refused
  Internal cluster locking initialisation failed.
  WARNING: Falling back to local file-based locking.
  Volume Groups with the clustered attribute will be inaccessible.

Sometimes it works fine, but very occasionally we get the above situation. I've encountered the fence message before, usually when the fence devices were incorrectly configured, but in that case it would always fail because of this. Before I get too far into investigation mode, I wondered whether the above symptoms ring any bells for anyone.

Neale

From ricks at alldigital.com Wed Aug 20 05:04:56 2014
From: ricks at alldigital.com (ricks)
Date: Tue, 19 Aug 2014 22:04:56 -0700
Subject: [Linux-cluster] clvmd not terminating
Message-ID: 

Just issued. Should take 10-20 minutes to go through.

Sent from my Verizon Wireless 4G LTE smartphone
From ricks at alldigital.com Wed Aug 20 05:07:16 2014
From: ricks at alldigital.com (ricks)
Date: Tue, 19 Aug 2014 22:07:16 -0700
Subject: [Linux-cluster] clvmd not terminating
Message-ID: 

Please ignore my last post. Ruddy phone slid a new message in.

Sent from my Verizon Wireless 4G LTE smartphone
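The hang Neale describes (fence_tool refusing to leave while the clvmd lockspace is active) can be checked for before stopping cman. A minimal sketch, assuming only that lockspaces appear as entries under /sys/kernel/dlm, as the error message above shows; the helper function name is ours, not part of any cluster package:

```shell
# List active dlm lockspaces in a given sysfs directory.
# On a cluster node that directory is /sys/kernel/dlm; "service cman stop"
# can only leave the fence domain cleanly once it is empty, i.e. once
# clvmd (and any gfs2 mounts) have been stopped first.
check_lockspaces() {
    dir=$1
    names=$(ls -A "$dir" 2>/dev/null)
    if [ -n "$names" ]; then
        echo "active lockspaces: $names"
    else
        echo "no active lockspaces"
    fi
}

check_lockspaces /sys/kernel/dlm
```

On the hanging node, stopping clvmd before cman (service clvmd stop, then service cman stop) should clear the clvmd lockspace that fence_tool is complaining about.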
From wferi at niif.hu Fri Aug 22 00:37:44 2014
From: wferi at niif.hu (Ferenc Wagner)
Date: Fri, 22 Aug 2014 02:37:44 +0200
Subject: [Linux-cluster] on exiting maintenance mode
Message-ID: <8761hlickn.fsf@lant.ki.iif.hu>

Hi,

While my Pacemaker cluster was in maintenance mode, resources were moved (by hand) between the nodes as I rebooted each node in turn. In the end the crm status output became perfectly empty, as the reboot of a given node removed from the output the resources which were located on the rebooted node at the time of entering maintenance mode. I expected a full resource discovery on exiting maintenance mode, but it probably did not happen, as the cluster started up resources already running on other nodes, which is generally forbidden.
Given that all resources were running (though possibly migrated during the maintenance), what would have been the correct way of bringing the cluster out of maintenance mode? This should have required no resource actions at all. Would a cleanup of all resources have helped? Or is there a better way?
-- 
Thanks,
Feri.

From vasil.val at gmail.com Tue Aug 26 06:56:29 2014
From: vasil.val at gmail.com (Vasil Valchev)
Date: Tue, 26 Aug 2014 09:56:29 +0300
Subject: [Linux-cluster] totem token & post_fail_delay question
Message-ID: 

Hello,

I have a cluster that sometimes has intermittent network issues on the heartbeat network. Unfortunately, improving the network is not an option, so I am looking for a way to tolerate longer interruptions.

Previously it seemed to me that the post_fail_delay option was suitable, but after some research it might not be what I am looking for.

If I am correct, when a member leaves (due to token timeout) the cluster will wait the post_fail_delay before fencing. If the member rejoins before that, will it still be fenced, because it has previous state?

From a recent fencing on this cluster there is a strange message:

Aug 24 06:20:45 node2 openais[29048]: [MAIN ] Not killing node node1cl despite it rejoining the cluster with existing state, it has a lower node ID

What does this mean?

And lastly, is increasing the totem token timeout the way to go?

Thanks,
Vasil Valchev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From andrew at beekhof.net Tue Aug 26 07:40:50 2014 From: andrew at beekhof.net (Andrew Beekhof) Date: Tue, 26 Aug 2014 17:40:50 +1000 Subject: [Linux-cluster] on exiting maintenance mode In-Reply-To: <8761hlickn.fsf@lant.ki.iif.hu> References: <8761hlickn.fsf@lant.ki.iif.hu> Message-ID: <67506B71-8594-4C16-82C1-F94779F59826@beekhof.net> On 22 Aug 2014, at 10:37 am, Ferenc Wagner wrote: > Hi, > > While my Pacemaker cluster was in maintenance mode, resources were moved > (by hand) between the nodes as I rebooted each node in turn. In the end > the crm status output became perfectly empty, as the reboot of a given > node removed from the output the resources which were located on the > rebooted node at the time of entering maintenance mode. I expected full > resource discovery on exiting maintenance mode, Version and logs? The discovery usually happens at the point the cluster is started on a node. Maintenance mode just prevents the cluster from doing anything about it. > but it probably did not > happen, as the cluster started up resources already running on other > nodes, which is generally forbidden. Given that all resources were > running (though possibly migrated during the maintenance), what would > have been the correct way of bringing the cluster out of maintenance > mode? This should have required no resource actions at all. Would > cleanup of all resources have helped? Or is there a better way? > -- > Thanks, > Feri. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 841 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 

From emi2fast at gmail.com Tue Aug 26 08:11:39 2014
From: emi2fast at gmail.com (emmanuel segura)
Date: Tue, 26 Aug 2014 10:11:39 +0200
Subject: [Linux-cluster] totem token & post_fail_delay question
In-Reply-To: 
References: 
Message-ID: 

From man fenced:

    Post-fail delay is the number of seconds the daemon will wait before
    fencing any victims after a domain member fails.

It's used to delay the fence action.

2014-08-26 8:56 GMT+02:00 Vasil Valchev :
> Hello,
>
> I have a cluster that sometimes has intermittent network issues on the
> heartbeat network.
> Unfortunately improving the network is not an option, so I am looking for a
> way to tolerate longer interruptions.
>
> Previously it seemed to me the post_fail_delay option is suitable, but after
> some research it might not be what I am looking for.
>
> If I am correct, when a member leaves (due to token timeout) the cluster
> will wait the post_fail_delay before fencing. If the member rejoins before
> that, it will still be fenced, because it has previous state?
> From a recent fencing on this cluster there is a strange message:
>
> Aug 24 06:20:45 node2 openais[29048]: [MAIN ] Not killing node node1cl
> despite it rejoining the cluster with existing state, it has a lower node ID
>
> What does this mean?
>
> And lastly is increasing the totem token timeout the way to go?
>
> Thanks,
> Vasil Valchev
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-- 
esta es mi vida e me la vivo hasta que dios quiera

From ccaulfie at redhat.com Tue Aug 26 08:23:14 2014
From: ccaulfie at redhat.com (Christine Caulfield)
Date: Tue, 26 Aug 2014 09:23:14 +0100
Subject: [Linux-cluster] totem token & post_fail_delay question
In-Reply-To: 
References: 
Message-ID: <53FC43F2.2010003@redhat.com>

On 26/08/14 07:56, Vasil Valchev wrote:
> Hello,
>
> I have a cluster that sometimes has intermittent network issues on the
> heartbeat network.
> Unfortunately improving the network is not an option, so I am looking
> for a way to tolerate longer interruptions.
>
> Previously it seemed to me the post_fail_delay option is suitable, but
> after some research it might not be what I am looking for.
>
> If I am correct, when a member leaves (due to token timeout) the cluster
> will wait the post_fail_delay before fencing. If the member rejoins
> before that, it will still be fenced, because it has previous state?
> From a recent fencing on this cluster there is a strange message:
>
> Aug 24 06:20:45 node2 openais[29048]: [MAIN ] Not killing node node1cl
> despite it rejoining the cluster with existing state, it has a lower node ID
>
> What does this mean?
>

It's an attempt by cman to sort out which node to kill in the situation where a node rejoins too quickly. If both nodes try to send a 'kill' message then both nodes would leave the cluster, leaving you with no active nodes. So cman (and fencing) prioritise the node with the lowest nodeID in an attempt at a tie-break.
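The tie-break rule described above can be sketched as follows; this is only an illustration of the observed behaviour, not cman source code:

```shell
# Given the two node IDs involved in a kill/kill race after a quick
# rejoin, the node with the HIGHER ID is the one that gets killed and
# the lower ID survives -- matching "Not killing node ... it has a
# lower node ID" on one side and "Killing node ... has higher node ID"
# on the other. Prints the ID of the node that would be killed.
tie_break() {
    a=$1
    b=$2
    if [ "$a" -lt "$b" ]; then
        echo "$b"    # b has the higher ID, so b is killed
    else
        echo "$a"
    fi
}

tie_break 1 2
```

So in the log above, node1cl was spared only because its node ID was lower than node2's.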
> if there is no option for improving the network situation then, yes, increasing token timeout is probably your best option. Chrissie From emi2fast at gmail.com Tue Aug 26 10:08:48 2014 From: emi2fast at gmail.com (emmanuel segura) Date: Tue, 26 Aug 2014 12:08:48 +0200 Subject: [Linux-cluster] totem token & post_fail_delay question In-Reply-To: <53FC43F2.2010003@redhat.com> References: <53FC43F2.2010003@redhat.com> Message-ID: i think, you are talking about: Post-join delay is the number of seconds the daemon will wait before fencing any victims after a node joins the domain. 2014-08-26 10:23 GMT+02:00 Christine Caulfield : > On 26/08/14 07:56, Vasil Valchev wrote: >> >> Hello, >> >> I have a cluster that sometimes has intermittent network issues on the >> heartbeat network. >> Unfortunately improving the network is not an option, so I am looking >> for a way to tolerate longer interruptions. >> >> Previously it seemed to me the post_fail_delay option is suitable, but >> after some research it might not be what I am looking for. >> >> If I am correct, when a member leaves (due to token timeout) the cluster >> will wait the post_fail_delay before fencing. If the member rejoins >> before that, it will still be fenced, because it has previous state? >> From a recent fencing on this cluster there is a strange message: >> >> Aug 24 06:20:45 node2 openais[29048]: [MAIN ] Not killing node node1cl >> despite it rejoining the cluster with existing state, it has a lower node >> ID >> >> What does this mean? >> > > It's an attempt by cman to sort out which node to kill in the situation > where a node rejoins too quickly. If both nodes try to send a 'kill' message > then then both nodes would leave the cluster leaving you with no active > nodes. So cman (and fencing) prioritise the node with the lowest nodeID in > an attempt at a tie-break. 
you should see a corresponding message on the > other node: > "Killing node %s because it has rejoined the cluster with existing state and > has higher node ID" > > > >> And lastly is increasing the totem token timeout the way to go? >> > > if there is no option for improving the network situation then, yes, > increasing token timeout is probably your best option. > > Chrissie > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- esta es mi vida e me la vivo hasta que dios quiera From wferi at niif.hu Tue Aug 26 17:40:19 2014 From: wferi at niif.hu (Ferenc Wagner) Date: Tue, 26 Aug 2014 19:40:19 +0200 Subject: [Linux-cluster] on exiting maintenance mode In-Reply-To: <67506B71-8594-4C16-82C1-F94779F59826@beekhof.net> (Andrew Beekhof's message of "Tue, 26 Aug 2014 17:40:50 +1000") References: <8761hlickn.fsf@lant.ki.iif.hu> <67506B71-8594-4C16-82C1-F94779F59826@beekhof.net> Message-ID: <87iolf9mkc.fsf@lant.ki.iif.hu> Andrew Beekhof writes: > On 22 Aug 2014, at 10:37 am, Ferenc Wagner wrote: > >> While my Pacemaker cluster was in maintenance mode, resources were moved >> (by hand) between the nodes as I rebooted each node in turn. In the end >> the crm status output became perfectly empty, as the reboot of a given >> node removed from the output the resources which were located on the >> rebooted node at the time of entering maintenance mode. I expected full >> resource discovery on exiting maintenance mode, > > Version and logs? (The more interesting part comes later, please skip to the theoretical part if you're short on time. :) I left those out, as I don't expect the actual behavior to be a bug. But I experienced this with Pacemaker version 1.1.7. I know it's old and it suffers from crmd segfault on entering maintenance mode (cf. http://thread.gmane.org/gmane.linux.highavailability.user/39121), but works well generally so I did not get to upgrade it yet. 
Now that I mentioned the crmd segfault: I noted that it died on the DC when I entered maintenance mode:

crmd: [7452]: info: te_rsc_command: Initiating action 64: cancel vm-tmvp_monitor_60000 on n01 (local)
crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel.
crmd: [7452]: ERROR: get_lrm_resource: Could not add resource vm-tmvp to LRM
crmd: [7452]: ERROR: do_lrm_invoke: Invalid resource definition
crmd: [7452]: WARN: do_lrm_invoke: bad input
crmd: [7452]: WARN: do_lrm_invoke: bad input
crmd: [7452]: WARN: do_lrm_invoke: bad input
crmd: [7452]: WARN: do_lrm_invoke: bad input
crmd: [7452]: info: te_rsc_command: Initiating action 86: cancel vm-wfweb_monitor_60000 on n01 (local)
crmd: [7452]: ERROR: lrm_add_rsc(870): failed to send a addrsc message to lrmd via ch_cmd channel.
crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel.
corosync[6966]: [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x1dc6ea0, async-conn=0x1dc6ea0) left
pacemakerd: [7443]: WARN: Managed crmd process 7452 killed by signal 11 [SIGSEGV - Segmentation violation].
pacemakerd: [7443]: notice: pcmk_child_exit: Child process crmd terminated with signal 11 (pid=7452, rc=0)

However, it got restarted seamlessly, without the node being fenced, so I did not even notice this until now. Should this have resulted in the node being fenced?

But back to the issue at hand. The Pacemaker shutdown seemed normal, apart from a bunch of messages like:

crmd: [13794]: ERROR: verify_stopped: Resource vm-web5 was active at shutdown. You may ignore this error if it is unmanaged.

appearing twice, and warnings like:

cib: [7447]: WARN: send_ipc_message: IPC Channel to 13794 is not connected
cib: [7447]: WARN: send_via_callback_channel: Delivery of reply to client 13794/bf6f43a2-70db-40ac-a902-eabc3c12e20d failed
cib: [7447]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed
corosync[6966]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)

On reboot, corosync complained until some of the Pacemaker components started:

corosync[8461]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
corosync[8461]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)

Pacemaker then probed the resources on the local node (all were inactive):

lrmd: [8946]: info: rsc:stonith-n01 probe[5] (pid 9081)
lrmd: [8946]: info: rsc:dlm:0 probe[6] (pid 9082)
[...]
lrmd: [8946]: info: operation monitor[112] on vm-fir for client 8949: pid 12015 exited with return code 7
crmd: [8949]: info: process_lrm_event: LRM operation vm-fir_monitor_0 (call=112, rc=7, cib-update=130, confirmed=true) not running
attrd: [8947]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
attrd: [8947]: notice: attrd_perform_update: Sent update 4: probe_complete=true

Then I cleaned up some resources running on other nodes, which resulted in those showing up in the crm status output, producing log lines like e.g.:

crmd: [8949]: WARN: status_from_rc: Action 4 (vm-web5_monitor_0) on n02 failed (target: 7 vs. rc: 0): Error

Finally, I exited maintenance mode, and Pacemaker started every resource I did not clean up beforehand, concurrently with their already running instances:

pengine: [8948]: notice: LogActions: Start vm-web9#011(n03)

I can provide more logs if this behavior is indeed unexpected, but it looks more like I miss the exact concept of maintenance mode.
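For reference, the manual re-sync that was missing in the sequence above can be expressed as a short command sequence. This is a sketch using the crmsh/Pacemaker CLI of that era, with a placeholder resource name; adjust to your configuration:

```shell
# Force the cluster to re-discover where resources actually run
# before (or instead of) leaving maintenance mode:
crm_resource --reprobe                        # re-run probes on every node
crm resource cleanup <resource>               # or clean up one resource at a time
crm configure property maintenance-mode=false
```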
> The discovery usually happens at the point the cluster is started on a node. A local discovery did happen, but it could not find anything, as the cluster was started by the init scripts, well before any resource could have been moved to the freshly rebooted node (manually, to free the next node for rebooting). > Maintenance mode just prevents the cluster from doing anything about it. Fine. So I should have restarted Pacemaker on each node before leaving maintenance mode, right? Or is there a better way? (Unfortunately, I could not manage the rolling reboot through Pacemaker, as some DLM/cLVM freeze made the cluster inoperable in its normal way.) >> but it probably did not happen, as the cluster started up resources >> already running on other nodes, which is generally forbidden. Given >> that all resources were running (though possibly migrated during the >> maintenance), what would have been the correct way of bringing the >> cluster out of maintenance mode? This should have required no >> resource actions at all. Would cleanup of all resources have helped? >> Or is there a better way? You say in the above thread that resource definitions can be changed: http://thread.gmane.org/gmane.linux.highavailability.user/39121/focus=39437 Let me quote from there (starting with the words of Ulrich Windl): >>>> I think it's a common misconception that you can modify cluster >>>> resources while in maintenance mode: >>> >>> No, you _should_ be able to. If that's not the case, its a bug. >> >> So the end of maintenance mode starts with a "re-probe"? > > No, but it doesn't need to. > The policy engine already knows if the resource definitions changed > and the recurring monitor ops will find out if any are not running. My experiences show that you may not *move around* resources while in maintenance mode. That would indeed require a cluster-wide re-probe, which does not seem to happen (unless forced some way). 
Probably there was some misunderstanding in the above discussion, I guess Ulrich meant moving resources when he wrote "modifying cluster resources". Does this make sense? -- Thanks, Feri. From wferi at niif.hu Tue Aug 26 20:42:07 2014 From: wferi at niif.hu (Ferenc Wagner) Date: Tue, 26 Aug 2014 22:42:07 +0200 Subject: [Linux-cluster] locating a starting resource Message-ID: <871ts39e5c.fsf@lant.ki.iif.hu> Hi, crm_resource --locate finds the hosting node of a running (successfully started) resource just fine. Is there a way to similarly find out the location of a resource *being* started, ie. whose resource agent is already running the start action, but that action is not finished yet? -- Thanks, Feri. From andrew at beekhof.net Tue Aug 26 23:46:18 2014 From: andrew at beekhof.net (Andrew Beekhof) Date: Wed, 27 Aug 2014 09:46:18 +1000 Subject: [Linux-cluster] locating a starting resource In-Reply-To: <871ts39e5c.fsf@lant.ki.iif.hu> References: <871ts39e5c.fsf@lant.ki.iif.hu> Message-ID: On 27 Aug 2014, at 6:42 am, Ferenc Wagner wrote: > Hi, > > crm_resource --locate finds the hosting node of a running (successfully > started) resource just fine. Is there a way to similarly find out the > location of a resource *being* started, ie. whose resource agent is > already running the start action, but that action is not finished yet? You need to set record-pending=true in the op_defaults section. For some reason this is not yet documented :-/ With this in place, crm_resource will find the correct location > -- > Thanks, > Feri. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From andrew at beekhof.net Wed Aug 27 04:54:33 2014 From: andrew at beekhof.net (Andrew Beekhof) Date: Wed, 27 Aug 2014 14:54:33 +1000 Subject: [Linux-cluster] on exiting maintenance mode In-Reply-To: <87iolf9mkc.fsf@lant.ki.iif.hu> References: <8761hlickn.fsf@lant.ki.iif.hu> <67506B71-8594-4C16-82C1-F94779F59826@beekhof.net> <87iolf9mkc.fsf@lant.ki.iif.hu> Message-ID: On 27 Aug 2014, at 3:40 am, Ferenc Wagner wrote: > Andrew Beekhof writes: > >> On 22 Aug 2014, at 10:37 am, Ferenc Wagner wrote: >> >>> While my Pacemaker cluster was in maintenance mode, resources were moved >>> (by hand) between the nodes as I rebooted each node in turn. In the end >>> the crm status output became perfectly empty, as the reboot of a given >>> node removed from the output the resources which were located on the >>> rebooted node at the time of entering maintenance mode. I expected full >>> resource discovery on exiting maintenance mode, >> >> Version and logs? > > (The more interesting part comes later, please skip to the theoretical > part if you're short on time. :) > > I left those out, as I don't expect the actual behavior to be a bug. > But I experienced this with Pacemaker version 1.1.7. I know it's old No kidding :) > and it suffers from crmd segfault on entering maintenance mode (cf. > http://thread.gmane.org/gmane.linux.highavailability.user/39121), but > works well generally so I did not get to upgrade it yet. Now that I > mentioned the crmd segfault: I noted that it died on the DC when I > entered maintenance mode: > > crmd: [7452]: info: te_rsc_command: Initiating action 64: cancel vm-tmvp_monitor_60000 on n01 (local) > crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel. That looks like the lrmd died. 
> crmd: [7452]: ERROR: get_lrm_resource: Could not add resource vm-tmvp to LRM > crmd: [7452]: ERROR: do_lrm_invoke: Invalid resource definition > crmd: [7452]: WARN: do_lrm_invoke: bad input > crmd: [7452]: WARN: do_lrm_invoke: bad input > crmd: [7452]: WARN: do_lrm_invoke: bad input > crmd: [7452]: WARN: do_lrm_invoke: bad input > crmd: [7452]: info: te_rsc_command: Initiating action 86: cancel vm-wfweb_monitor_60000 on n01 (local) > crmd: [7452]: ERROR: lrm_add_rsc(870): failed to send a addrsc message to lrmd via ch_cmd channel. > crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel. > corosync[6966]: [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x1dc6ea0, async-conn=0x1dc6ea0) left > pacemakerd: [7443]: WARN: Managed crmd process 7452 killed by signal 11 [SIGSEGV - Segmentation violation]. Which created a condition in the crmd that it couldn't handle so it crashed too. > pacemakerd: [7443]: notice: pcmk_child_exit: Child process crmd terminated with signal 11 (pid=7452, rc=0) > > However, it got restarted seamlessly, without the node being fenced, so > I did not even notice this until now. Should this have resulted in the > node being fenced? Depends how fast the node can respawn. > > But back to the issue at hand. The Pacemaker shutdown seemed normal, > apart from the bunch of messages like: > > crmd: [13794]: ERROR: verify_stopped: Resource vm-web5 was active at shutdown. You may ignore this error if it is unmanaged. In maintenance mode, everything is unmanaged. So that would be expected. 
> > appearing twice and warnings like: > > cib: [7447]: WARN: send_ipc_message: IPC Channel to 13794 is not connected > cib: [7447]: WARN: send_via_callback_channel: Delivery of reply to client 13794/bf6f43a2-70db-40ac-a902-eabc3c12e20d failed > cib: [7447]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed > corosync[6966]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) > > On reboot, corosync complained until the some Pacemaker components > started: > > corosync[8461]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) > corosync[8461]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) > > Pacemaker then probed the resources on the local node (all was inactive): > > lrmd: [8946]: info: rsc:stonith-n01 probe[5] (pid 9081) > lrmd: [8946]: info: rsc:dlm:0 probe[6] (pid 9082) > [...] > lrmd: [8946]: info: operation monitor[112] on vm-fir for client 8949: pid 12015 exited with return code 7 > crmd: [8949]: info: process_lrm_event: LRM operation vm-fir_monitor_0 (call=112, rc=7, cib-update=130, confirmed=true) not running > attrd: [8947]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true) > attrd: [8947]: notice: attrd_perform_update: Sent update 4: probe_complete=true > > Then I cleaned up some resources running on other nodes, which resulted > in those showing up in the crm status output providing log lines like eg.: > > crmd: [8949]: WARN: status_from_rc: Action 4 (vm-web5_monitor_0) on n02 failed (target: 7 vs. 
rc: 0): Error
>
> Finally, I exited maintenance mode, and Pacemaker started every resource
> I did not clean up beforehand, concurrently with their already running
> instances:
>
> pengine: [8948]: notice: LogActions: Start vm-web9#011(n03)
>
> I can provide more logs if this behavior is indeed unexpected, but it
> looks more like I miss the exact concept of maintenance mode.
>
>> The discovery usually happens at the point the cluster is started on a node.
>
> A local discovery did happen, but it could not find anything, as the
> cluster was started by the init scripts, well before any resource could
> have been moved to the freshly rebooted node (manually, to free the next
> node for rebooting).

That's your problem then: you've started resources outside of the control of the cluster. Two options: recurring monitor actions with role=Stopped would have caught this, or you can run crm_resource --cleanup after you've moved resources around.

>
>> Maintenance mode just prevents the cluster from doing anything about it.
>
> Fine. So I should have restarted Pacemaker on each node before leaving
> maintenance mode, right? Or is there a better way?

See above

> (Unfortunately, I
> could not manage the rolling reboot through Pacemaker, as some DLM/cLVM
> freeze made the cluster inoperable in its normal way.)
>
>>> but it probably did not happen, as the cluster started up resources
>>> already running on other nodes, which is generally forbidden. Given
>>> that all resources were running (though possibly migrated during the
>>> maintenance), what would have been the correct way of bringing the
>>> cluster out of maintenance mode? This should have required no
>>> resource actions at all. Would cleanup of all resources have helped?
>>> Or is there a better way?
>
> You say in the above thread that resource definitions can be changed:
> http://thread.gmane.org/gmane.linux.highavailability.user/39121/focus=39437
> Let me quote from there (starting with the words of Ulrich Windl):
>
>>>>> I think it's a common misconception that you can modify cluster
>>>>> resources while in maintenance mode:
>>>>
>>>> No, you _should_ be able to. If that's not the case, it's a bug.
>>>
>>> So the end of maintenance mode starts with a "re-probe"?
>>
>> No, but it doesn't need to.
>> The policy engine already knows if the resource definitions changed
>> and the recurring monitor ops will find out if any are not running.
>
> My experiences show that you may not *move around* resources while in
> maintenance mode.

Correct

> That would indeed require a cluster-wide re-probe,
> which does not seem to happen (unless forced some way). Probably there
> was some misunderstanding in the above discussion, I guess Ulrich meant
> moving resources when he wrote "modifying cluster resources". Does this
> make sense?

No, I'm reasonably sure he meant changing their definitions in the cib. Or at least that's what I thought he meant at the time.

> --
> Thanks,
> Feri.
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From wferi at niif.hu Wed Aug 27 17:09:45 2014 From: wferi at niif.hu (Ferenc Wagner) Date: Wed, 27 Aug 2014 19:09:45 +0200 Subject: [Linux-cluster] on exiting maintenance mode In-Reply-To: (Andrew Beekhof's message of "Wed, 27 Aug 2014 14:54:33 +1000") References: <8761hlickn.fsf@lant.ki.iif.hu> <67506B71-8594-4C16-82C1-F94779F59826@beekhof.net> <87iolf9mkc.fsf@lant.ki.iif.hu> Message-ID: <87iold7tba.fsf@lant.ki.iif.hu> Andrew Beekhof writes: > On 27 Aug 2014, at 3:40 am, Ferenc Wagner wrote: > >> Andrew Beekhof writes: >> >>> On 22 Aug 2014, at 10:37 am, Ferenc Wagner wrote: >>> >>>> While my Pacemaker cluster was in maintenance mode, resources were moved >>>> (by hand) between the nodes as I rebooted each node in turn. In the end >>>> the crm status output became perfectly empty, as the reboot of a given >>>> node removed from the output the resources which were located on the >>>> rebooted node at the time of entering maintenance mode. I expected full >>>> resource discovery on exiting maintenance mode, >> >> I experienced this with Pacemaker version 1.1.7. I know it's old >> and it suffers from crmd segfault on entering maintenance mode (cf. >> http://thread.gmane.org/gmane.linux.highavailability.user/39121), but >> works well generally so I did not get to upgrade it yet. Now that I >> mentioned the crmd segfault: I noted that it died on the DC when I >> entered maintenance mode: >> >> crmd: [7452]: info: te_rsc_command: Initiating action 64: cancel vm-tmvp_monitor_60000 on n01 (local) >> crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel. > > That looks like the lrmd died. It did not die, at least not fully. 
After entering maintenance mode crmd asked lrmd to cancel the recurring monitor ops for all resources: 08:40:18 crmd: [7452]: info: do_te_invoke: Processing graph 20578 (ref=pe_calc-dc-1408516818-30681) derived from /var/lib/pengine/pe-input-848.bz2 08:40:18 crmd: [7452]: info: te_rsc_command: Initiating action 17: cancel dlm:0_monitor_120000 on n04 08:40:18 crmd: [7452]: info: te_rsc_command: Initiating action 84: cancel dlm:0_cancel_120000 on n01 (local) 08:40:18 lrmd: [7449]: info: cancel_op: operation monitor[194] on dlm:0 for client 7452, its parameters: [...] cancelled 08:40:18 crmd: [7452]: info: te_rsc_command: Initiating action 50: cancel dlm:2_monitor_120000 on n02 The stream of monitor op cancellation messages ended with: 08:40:18 crmd: [7452]: info: te_rsc_command: Initiating action 71: cancel vm-mdssq_monitor_60000 on n01 (local) 08:40:18 lrmd: [7449]: info: cancel_op: operation monitor[329] on vm-mdssq for client 7452, its parameters: [...] cancelled 08:40:18 crmd: [7452]: info: process_lrm_event: LRM operation vm-mdssq_monitor_60000 (call=329, status=1, cib-update=0, confirmed=true) Cancelled 08:40:18 crmd: [7452]: notice: run_graph: ==== Transition 20578 (Complete=87, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-848.bz2): Complete 08:40:18 crmd: [7452]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ] 08:40:18 pengine: [7451]: notice: process_pe_message: Transition 20578: PEngine Input stored in: /var/lib/pengine/pe-input-848.bz2 08:41:28 crmd: [7452]: WARN: action_timer_callback: Timer popped (timeout=10000, abort_level=0, complete=true) 08:41:28 crmd: [7452]: WARN: action_timer_callback: Ignoring timeout while not in transition [these two lines repeated several times] 08:41:28 crmd: [7452]: WARN: action_timer_callback: Timer popped (timeout=10000, abort_level=0, complete=true) 08:41:28 crmd: [7452]: WARN: 
action_timer_callback: Ignoring timeout while not in transition 08:41:38 crmd: [7452]: WARN: action_timer_callback: Timer popped (timeout=20000, abort_level=0, complete=true) 08:41:38 crmd: [7452]: WARN: action_timer_callback: Ignoring timeout while not in transition 08:48:05 cib: [7447]: info: cib_stats: Processed 159 operations (23207.00us average, 0% utilization) in the last 10min 08:55:18 crmd: [7452]: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped (900000ms) 08:55:18 crmd: [7452]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ] 08:55:18 crmd: [7452]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED 08:55:19 pengine: [7451]: notice: stage6: Delaying fencing operations until there are resources to manage 08:55:19 crmd: [7452]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] 08:55:19 crmd: [7452]: info: do_te_invoke: Processing graph 20579 (ref=pe_calc-dc-1408517718-30802) derived from /var/lib/pengine/pe-input-849.bz2 08:55:19 crmd: [7452]: info: te_rsc_command: Initiating action 17: cancel dlm:0_monitor_120000 on n04 08:55:19 crmd: [7452]: info: te_rsc_command: Initiating action 84: cancel dlm:0_cancel_120000 on n01 (local) 08:55:19 crmd: [7452]: info: cancel_op: No pending op found for dlm:0:194 08:55:19 lrmd: [7449]: info: on_msg_cancel_op: no operation with id 194 Interestingly, monitor[194], lastly mentioned by lrmd, was the very first cancelled operation. 08:55:19 crmd: [7452]: info: te_rsc_command: Initiating action 50: cancel dlm:2_monitor_120000 on n02 08:55:19 crmd: [7452]: info: te_rsc_command: Initiating action 83: cancel vm-cedar_monitor_60000 on n01 (local) 08:55:19 crmd: [7452]: ERROR: lrm_get_rsc(673): failed to receive a reply message of getrsc. 
08:55:19 crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel. 08:55:19 crmd: [7452]: ERROR: lrm_add_rsc(870): failed to send a addrsc message to lrmd via ch_cmd channel. 08:55:19 crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel. 08:55:19 crmd: [7452]: ERROR: get_lrm_resource: Could not add resource vm-cedar to LRM 08:55:19 crmd: [7452]: ERROR: do_lrm_invoke: Invalid resource definition 08:55:19 crmd: [7452]: WARN: do_lrm_invoke: bad input 08:55:19 crmd: [7452]: WARN: do_lrm_invoke: bad input 08:55:19 crmd: [7452]: WARN: do_lrm_invoke: bad input 08:55:19 crmd: [7452]: WARN: do_lrm_invoke: bad input 08:55:19 crmd: [7452]: ERROR: log_data_element: Output truncated: available=727, needed=1374 08:55:19 crmd: [7452]: WARN: do_lrm_invoke: bad input 08:55:19 crmd: [7452]: WARN: do_lrm_invoke: bad input 08:55:19 crmd: [7452]: WARN: do_lrm_invoke: bad input Blocks of messages like the above repeat a couple of times for other resources, then crmd kicks the bucket and gets restarted: 08:55:19 corosync[6966]: [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x1dc6ea0, async-conn=0x1dc6ea0) left 08:55:19 pacemakerd: [7443]: WARN: Managed crmd process 7452 killed by signal 11 [SIGSEGV - Segmentation violation]. 
08:55:19 pacemakerd: [7443]: notice: pcmk_child_exit: Child process crmd terminated with signal 11 (pid=7452, rc=0) 08:55:19 pacemakerd: [7443]: notice: pcmk_child_exit: Respawning failed child process: crmd 08:55:19 pacemakerd: [7443]: info: start_child: Forked child 13794 for process crmd 08:55:19 corosync[6966]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) 08:55:19 crmd: [13794]: info: Invoked: /usr/lib/pacemaker/crmd Anyway, no further logs from lrmd after this point until hours later I rebooted the machine: 14:37:06 pacemakerd: [7443]: notice: stop_child: Stopping lrmd: Sent -15 to process 7449 14:37:06 lrmd: [7449]: info: lrmd is shutting down 14:37:06 pacemakerd: [7443]: info: pcmk_child_exit: Child process lrmd exited (pid=7449, rc=0) So lrmd was alive all the time. > Which created a condition in the crmd that it couldn't handle so it > crashed too. Maybe their connection got severed somehow. >> However, it got restarted seamlessly, without the node being fenced, so >> I did not even notice this until now. Should this have resulted in the >> node being fenced? > > Depends how fast the node can respawn. You mean how fast crmd can respawn? How much time does it have to respawn to avoid being fenced? >> crmd: [13794]: ERROR: verify_stopped: Resource vm-web5 was active at shutdown. You may ignore this error if it is unmanaged. > > In maintenance mode, everything is unmanaged. So that would be expected. Is maintenance mode the same as unmanaging all resources? I think the latter does not cancel the monitor operations here... >>> The discovery usually happens at the point the cluster is started on >>> a node. >> >> A local discovery did happen, but it could not find anything, as the >> cluster was started by the init scripts, well before any resource could >> have been moved to the freshly rebooted node (manually, to free the next >> node for rebooting). 
> > That's your problem then, you've started resources outside of the > control of the cluster. Some of them, yes, and moved the rest between the nodes. All this circumventing the cluster. > Two options... recurring monitor actions with role=Stopped would have > caught this Even in maintenance mode? Wouldn't they have been cancelled just like the ordinary recurring monitor actions? I guess adding them would run a recurring monitor operation for every resource on every node, only with different expectations, right? > or you can run crm_resource --cleanup after you've moved resources around. I actually ran some crm resource cleanups for a couple of resources, and those really were not started on exiting maintenance mode. >>> Maintenance mode just prevents the cluster from doing anything about it. >> >> Fine. So I should have restarted Pacemaker on each node before leaving >> maintenance mode, right? Or is there a better way? > See above So crm_resource -r whatever -C is the way, for each resource separately. Is there no way to do this for all resources at once? >> You say in the above thread that resource definitions can be changed: >> http://thread.gmane.org/gmane.linux.highavailability.user/39121/focus=39437 >> Let me quote from there (starting with the words of Ulrich Windl): >> >>>>>> I think it's a common misconception that you can modify cluster >>>>>> resources while in maintenance mode: >>>>> >>>>> No, you _should_ be able to. If that's not the case, it's a bug. >>>> >>>> So the end of maintenance mode starts with a "re-probe"? >>> >>> No, but it doesn't need to. >>> The policy engine already knows if the resource definitions changed >>> and the recurring monitor ops will find out if any are not running. >> >> My experiences show that you may not *move around* resources while in >> maintenance mode. > > Correct > >> That would indeed require a cluster-wide re-probe, which does not >> seem to happen (unless forced some way). 
Probably there was some >> misunderstanding in the above discussion, I guess Ulrich meant moving >> resources when he wrote "modifying cluster resources". Does this >> make sense? > > No, I'm reasonably sure he meant changing their definitions in the cib. > Or at least that's what I thought he meant at the time. Nobody could blame you for that, because that's what it means. But then he inquired about a "re-probe", which better fits the problem of changing the status of resources, not their definition. Actually, I was so firmly stuck in this mindset that at first I wanted to ask you to reconsider; your response felt so much out of place. That's all history for now... After all this, I suggest clarifying this issue in the fine manual. I've read it a couple of times, and still got the wrong impression. -- Regards, Feri. From wferi at niif.hu Wed Aug 27 18:56:32 2014 From: wferi at niif.hu (Ferenc Wagner) Date: Wed, 27 Aug 2014 20:56:32 +0200 Subject: [Linux-cluster] locating a starting resource In-Reply-To: (Andrew Beekhof's message of "Wed, 27 Aug 2014 09:46:18 +1000") References: <871ts39e5c.fsf@lant.ki.iif.hu> Message-ID: <877g1t7odb.fsf@lant.ki.iif.hu> Andrew Beekhof writes: > On 27 Aug 2014, at 6:42 am, Ferenc Wagner wrote: > >> crm_resource --locate finds the hosting node of a running (successfully >> started) resource just fine. Is there a way to similarly find out the >> location of a resource *being* started, ie. whose resource agent is >> already running the start action, but that action is not finished yet? > > You need to set record-pending=true in the op_defaults section. > For some reason this is not yet documented :-/ > > With this in place, crm_resource will find the correct location I set it in a single start operation, and it works as advertised, thanks! At first I was surprised to see "Started" in the crm status output while the resource was only starting, but the added order constraint worked as expected, ie. 
the dependent resource started only after the start action finished successfully. This begs a bonus question: how do I tell apart starting resources with record-pending=true and started resources? crm_resource --locate does not help either. -- Thanks, Feri. From andrew at beekhof.net Wed Aug 27 22:57:26 2014 From: andrew at beekhof.net (Andrew Beekhof) Date: Thu, 28 Aug 2014 08:57:26 +1000 Subject: [Linux-cluster] on exiting maintenance mode In-Reply-To: <87iold7tba.fsf@lant.ki.iif.hu> References: <8761hlickn.fsf@lant.ki.iif.hu> <67506B71-8594-4C16-82C1-F94779F59826@beekhof.net> <87iolf9mkc.fsf@lant.ki.iif.hu> <87iold7tba.fsf@lant.ki.iif.hu> Message-ID: <749665C4-F970-4C43-9228-BCFD2EE1B442@beekhof.net> On 28 Aug 2014, at 3:09 am, Ferenc Wagner wrote: > Andrew Beekhof writes: > >> On 27 Aug 2014, at 3:40 am, Ferenc Wagner wrote: > >>> However, it got restarted seamlessly, without the node being fenced, so >>> I did not even notice this until now. Should this have resulted in the >>> node being fenced? >> >> Depends how fast the node can respawn. > > You mean how fast crmd can respawn? How much time does it have to > respawn to avoid being fenced? Until a new node can be elected DC, invoke the policy engine and start fencing. > >>> crmd: [13794]: ERROR: verify_stopped: Resource vm-web5 was active at shutdown. You may ignore this error if it is unmanaged. >> >> In maintenance mode, everything is unmanaged. So that would be expected. > > Is maintenance mode the same as unmanaging all resources? I think the > latter does not cancel the monitor operations here... Right. One cancels monitor operations too. > >>>> The discovery usually happens at the point the cluster is started on >>>> a node. >>> >>> A local discovery did happen, but it could not find anything, as the >>> cluster was started by the init scripts, well before any resource could >>> have been moved to the freshly rebooted node (manually, to free the next >>> node for rebooting). 
>> >> That's your problem then, you've started resources outside of the >> control of the cluster. > Some of them, yes, and moved the rest between the nodes. All this > circumventing the cluster. >> Two options... recurring monitor actions with role=Stopped would have >> caught this > Even in maintenance mode? Wouldn't they have been cancelled just like > the ordinary recurring monitor actions? Good point. Perhaps they wouldn't. > > I guess adding them would run a recurring monitor operation for every > resource on every node, only with different expectations, right? > >> or you can run crm_resource --cleanup after you've moved resources around. > > I actually ran some crm resource cleanups for a couple of resources, and > those really were not started on exiting maintenance mode. > >>>> Maintenance mode just prevents the cluster from doing anything about it. >>> >>> Fine. So I should have restarted Pacemaker on each node before leaving >>> maintenance mode, right? Or is there a better way? >> >> See above > > So crm_resource -r whatever -C is the way, for each resource separately. > Is there no way to do this for all resources at once? I think you can just drop the -r > >>> You say in the above thread that resource definitions can be changed: >>> http://thread.gmane.org/gmane.linux.highavailability.user/39121/focus=39437 >>> Let me quote from there (starting with the words of Ulrich Windl): >>> >>>>>>> I think it's a common misconception that you can modify cluster >>>>>>> resources while in maintenance mode: >>>>>> >>>>>> No, you _should_ be able to. If that's not the case, it's a bug. >>>>> >>>>> So the end of maintenance mode starts with a "re-probe"? >>>> >>>> No, but it doesn't need to. >>>> The policy engine already knows if the resource definitions changed >>>> and the recurring monitor ops will find out if any are not running. >>> >>> My experiences show that you may not *move around* resources while in >>> maintenance mode. 
>> >> Correct >> >>> That would indeed require a cluster-wide re-probe, which does not >>> seem to happen (unless forced some way). Probably there was some >>> misunderstanding in the above discussion, I guess Ulrich meant moving >>> resources when he wrote "modifying cluster resources". Does this >>> make sense? >> >> No, I'm reasonably sure he meant changing their definitions in the cib. >> Or at least that's what I thought he meant at the time. > > Nobody could blame you for that, because that's what it means. But then > he inquired about a "re-probe", which better fits the problem of changing > the status of resources, not their definition. Actually, I was so > firmly stuck in this mindset that at first I wanted to ask you to > reconsider; your response felt so much out of place. That's all > history for now... > > After all this, I suggest clarifying this issue in the fine manual. > I've read it a couple of times, and still got the wrong impression. Which specific section do you suggest? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From andrew at beekhof.net Wed Aug 27 22:58:23 2014 From: andrew at beekhof.net (Andrew Beekhof) Date: Thu, 28 Aug 2014 08:58:23 +1000 Subject: [Linux-cluster] locating a starting resource In-Reply-To: <877g1t7odb.fsf@lant.ki.iif.hu> References: <871ts39e5c.fsf@lant.ki.iif.hu> <877g1t7odb.fsf@lant.ki.iif.hu> Message-ID: <9D0DD413-AB20-4F25-ADF5-02D8471EAA18@beekhof.net> On 28 Aug 2014, at 4:56 am, Ferenc Wagner wrote: > Andrew Beekhof writes: > >> On 27 Aug 2014, at 6:42 am, Ferenc Wagner wrote: >> >>> crm_resource --locate finds the hosting node of a running (successfully >>> started) resource just fine. Is there a way to similarly find out the >>> location of a resource *being* started, ie. 
whose resource agent is >>> already running the start action, but that action is not finished yet? >> >> You need to set record-pending=true in the op_defaults section. >> For some reason this is not yet documented :-/ >> >> With this in place, crm_resource will find the correct location > > I set it in a single start operation, and it works as advertised, > thanks! At first I was surprised to see "Started" in the crm status > output while the resource was only starting, but the added order > constraint worked as expected, ie. the dependent resource started only > after the start action finished successfully. This begs a bonus > question: how do I tell apart starting resources with record-pending=true > and started resources? I'm reasonably sure we don't expose that via crm_resource. Seems like a reasonable thing to do though. crm_mon /might/ show pending though. > crm_resource --locate does not help either. > -- > Thanks, > Feri. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From neale at sinenomine.net Thu Aug 28 19:11:24 2014 From: neale at sinenomine.net (Neale Ferguson) Date: Thu, 28 Aug 2014 19:11:24 +0000 Subject: [Linux-cluster] Delaying fencing during shutdown Message-ID: <0E22A6F6-977A-4E58-A0ED-9D596D6B1A20@sinenomine.net> Hi, In a two-node cluster I shut down one of the nodes; the other node notices the shutdown, but on rare occasions it will then fence the node that is shutting down. Is this a situation where setting post_fail_delay would be useful, or where setting the totem timeout to something higher than its default would help? 
Neale From wferi at niif.hu Thu Aug 28 23:00:24 2014 From: wferi at niif.hu (Ferenc Wagner) Date: Fri, 29 Aug 2014 01:00:24 +0200 Subject: [Linux-cluster] locating a starting resource In-Reply-To: <9D0DD413-AB20-4F25-ADF5-02D8471EAA18@beekhof.net> (Andrew Beekhof's message of "Thu, 28 Aug 2014 08:58:23 +1000") References: <871ts39e5c.fsf@lant.ki.iif.hu> <877g1t7odb.fsf@lant.ki.iif.hu> <9D0DD413-AB20-4F25-ADF5-02D8471EAA18@beekhof.net> Message-ID: <87sikgb4on.fsf@lant.ki.iif.hu> Andrew Beekhof writes: > On 28 Aug 2014, at 4:56 am, Ferenc Wagner wrote: > >> Andrew Beekhof writes: >> >>> On 27 Aug 2014, at 6:42 am, Ferenc Wagner wrote: >>> >>>> crm_resource --locate finds the hosting node of a running (successfully >>>> started) resource just fine. Is there a way to similarly find out the >>>> location of a resource *being* started, ie. whose resource agent is >>>> already running the start action, but that action is not finished yet? >>> >>> You need to set record-pending=true in the op_defaults section. >>> For some reason this is not yet documented :-/ >>> >>> With this in place, crm_resource will find the correct location >> >> I set it in a single start operation, and it works as advertised, >> thanks! At first I was surprised to see "Started" in the crm status >> output while the resource was only starting, but the added order >> constraint worked as expected, ie. the dependent resource started only >> after the start action finished successfully. This begs a bonus >> question: how do I tell apart starting resources with record-pending=true >> and started resources? > > I'm reasonably sure we don't expose that via crm_resource. > Seems like a reasonable thing to do though. > > crm_mon /might/ show pending though. Version 1.1.7 does not. Looks like call-id="-1" marks the pending operations of an lrm_rsc_op element, so pulling this info out of the CIB is not too complicated. -- Regards, Feri. 
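[Editor's note: not part of the thread, but to sketch the idea above: assuming pending operations really do appear in the CIB status section as lrm_rsc_op entries with call-id="-1" (as observed with Pacemaker 1.1.7), a few lines of XML parsing can pull them out of a `cibadmin --query` dump. The status fragment and the resource/node names below are invented for illustration.]

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of the <status> section of a CIB dump, as might be
# obtained with `cibadmin --query`. Names are made up for illustration.
CIB_STATUS = """
<status>
  <node_state uname="lant">
    <lrm>
      <lrm_resources>
        <lrm_resource id="vm-elm">
          <lrm_rsc_op id="vm-elm_start_0" operation="start" call-id="-1" rc-code="14"/>
        </lrm_resource>
        <lrm_resource id="vm-oak">
          <lrm_rsc_op id="vm-oak_monitor_60000" operation="monitor" call-id="212" rc-code="0"/>
        </lrm_resource>
      </lrm_resources>
    </lrm>
  </node_state>
</status>
"""

def pending_ops(status_xml):
    """Return (node, resource, operation) triples for operations that are
    recorded but not yet confirmed, i.e. those with call-id="-1"."""
    root = ET.fromstring(status_xml)
    result = []
    for node in root.iter("node_state"):
        for rsc in node.iter("lrm_resource"):
            for op in rsc.iter("lrm_rsc_op"):
                if op.get("call-id") == "-1":
                    result.append((node.get("uname"), rsc.get("id"), op.get("operation")))
    return result

print(pending_ops(CIB_STATUS))  # -> [('lant', 'vm-elm', 'start')]
```

With record-pending enabled, the starting resource shows up here while confirmed operations (positive call-id) do not.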
From wferi at niif.hu Fri Aug 29 00:54:57 2014 From: wferi at niif.hu (Ferenc Wagner) Date: Fri, 29 Aug 2014 02:54:57 +0200 Subject: [Linux-cluster] on exiting maintenance mode In-Reply-To: <749665C4-F970-4C43-9228-BCFD2EE1B442@beekhof.net> (Andrew Beekhof's message of "Thu, 28 Aug 2014 08:57:26 +1000") References: <8761hlickn.fsf@lant.ki.iif.hu> <67506B71-8594-4C16-82C1-F94779F59826@beekhof.net> <87iolf9mkc.fsf@lant.ki.iif.hu> <87iold7tba.fsf@lant.ki.iif.hu> <749665C4-F970-4C43-9228-BCFD2EE1B442@beekhof.net> Message-ID: <87oav4azdq.fsf@lant.ki.iif.hu> Andrew Beekhof writes: > On 28 Aug 2014, at 3:09 am, Ferenc Wagner wrote: > >> So crm_resource -r whatever -C is the way, for each resource separately. >> Is there no way to do this for all resources at once? > > I think you can just drop the -r Unfortunately, that does not work under version 1.1.7: $ sudo crm_resource -C Error performing operation: The object/attribute does not exist >> Andrew Beekhof writes: >> >>> On 27 Aug 2014, at 3:40 am, Ferenc Wagner wrote: >>> >>>> My experiences show that you may not *move around* resources while in >>>> maintenance mode. >>> >>> Correct >>> >>>> That would indeed require a cluster-wide re-probe, which does not >>>> seem to happen (unless forced some way). >> >> After all this, I suggest to clarify this issue in the fine manual. >> I've read it a couple of times, and still got the wrong impression. > > Which specific section do you suggest? 5.7.1. Monitoring Resources for Failure Some points worth adding/emphasizing would be: 1. documentation of the role property (role=Master is mentioned later, but role=Stopped never) 2. In maintenance mode, monitor operations don't run 3. If management of a resource is switched off, its role=Started monitor operation continues running until failure, then the role=Stopped kicks in (I'm guessing here; also, what about the other nodes?) 4. 
When management is enabled again, no re-probe happens, the cluster expects the last state and location to be still valid 5. so don't even move unmanaged resources 6. unless you started a resource somewhere before starting the cluster on that node, or you cleaned up the resource 7. same is true for maintenance mode, but for all resources. I have to agree that most of this is evident once you know it. Unfortunately, it's also easy to get wrong while learning the ropes. For example, hastexo has some good information online: http://www.hastexo.com/resources/hints-and-kinks/maintenance-active-pacemaker-clusters But from the sentence "in maintenance mode, you can stop or restart cluster resources at will" I still miss the constraint of not moving the resource between the nodes. Also, setting enabled="false" works funny, it did not get rid of the monitor operation before I set the resource to managed, and deleting the setting or changing it to true did bring it back. I had to restart the resource to have monitor ops again. Why? -- Thanks, Feri. From wferi at niif.hu Fri Aug 29 00:57:50 2014 From: wferi at niif.hu (Ferenc Wagner) Date: Fri, 29 Aug 2014 02:57:50 +0200 Subject: [Linux-cluster] locating a starting resource In-Reply-To: <9D0DD413-AB20-4F25-ADF5-02D8471EAA18@beekhof.net> (Andrew Beekhof's message of "Thu, 28 Aug 2014 08:58:23 +1000") References: <871ts39e5c.fsf@lant.ki.iif.hu> <877g1t7odb.fsf@lant.ki.iif.hu> <9D0DD413-AB20-4F25-ADF5-02D8471EAA18@beekhof.net> Message-ID: <87ha0waz8x.fsf@lant.ki.iif.hu> Andrew Beekhof writes: > On 28 Aug 2014, at 4:56 am, Ferenc Wagner wrote: > >> Andrew Beekhof writes: >> >>> On 27 Aug 2014, at 6:42 am, Ferenc Wagner wrote: >>> >>>> crm_resource --locate finds the hosting node of a running (successfully >>>> started) resource just fine. Is there a way to similarly find out the >>>> location of a resource *being* started, ie. whose resource agent is >>>> already running the start action, but that action is not finished yet? 
>>> >>> You need to set record-pending=true in the op_defaults section. >>> For some reason this is not yet documented :-/ >>> >>> With this in place, crm_resource will find the correct location >> >> I set it in a single start operation, and it works as advertised, >> thanks! At first I was surprised to see "Started" in the crm status >> output while the resource was only starting, but the added order >> constraint worked as expected, ie. the dependent resource started only >> after the start action finished successfully. This begs a bonus >> question: how do I tell apart starting resources with record-pending=true >> and started resources? > > I'm reasonably sure we don't expose that via crm_resource. > Seems like a reasonable thing to do though. crm_resource -O outputs lines like this: [...] Started : vm-elm_start_0 (node=lant, call=-1, rc=14): pending which seems good enough for now. -- Thanks, Feri. From andrew at beekhof.net Fri Aug 29 02:31:36 2014 From: andrew at beekhof.net (Andrew Beekhof) Date: Fri, 29 Aug 2014 12:31:36 +1000 Subject: [Linux-cluster] locating a starting resource In-Reply-To: <87ha0waz8x.fsf@lant.ki.iif.hu> References: <871ts39e5c.fsf@lant.ki.iif.hu> <877g1t7odb.fsf@lant.ki.iif.hu> <9D0DD413-AB20-4F25-ADF5-02D8471EAA18@beekhof.net> <87ha0waz8x.fsf@lant.ki.iif.hu> Message-ID: <77CDE52A-401F-4851-ABFB-3A643F9913CD@beekhof.net> On 29 Aug 2014, at 10:57 am, Ferenc Wagner wrote: > Andrew Beekhof writes: > >> On 28 Aug 2014, at 4:56 am, Ferenc Wagner wrote: >> >>> Andrew Beekhof writes: >>> >>>> On 27 Aug 2014, at 6:42 am, Ferenc Wagner wrote: >>>> >>>>> crm_resource --locate finds the hosting node of a running (successfully >>>>> started) resource just fine. Is there a way to similarly find out the >>>>> location of a resource *being* started, ie. whose resource agent is >>>>> already running the start action, but that action is not finished yet? >>>> >>>> You need to set record-pending=true in the op_defaults section. 
>>>> For some reason this is not yet documented :-/ >>>> >>>> With this in place, crm_resource will find the correct location >>> >>> I set it in a single start operation, and it works as advertised, >>> thanks! At first I was surprised to see "Started" in the crm status >>> output while the resource was only starting, but the added order >>> constraint worked as expected, ie. the dependent resource started only >>> after the start action finished successfully. This begs a bonus >>> question: how do I tell apart starting resources with record-pending=true >>> and started resources? >> >> I'm reasonably sure we don't expose that via crm_resource. >> Seems like a reasonable thing to do though. > > crm_resource -O outputs lines like this: > [...] Started : vm-elm_start_0 (node=lant, call=-1, rc=14): pending > which seems good enough for now. > -- More recent versions also have: -j, --pending Display pending state if 'record-pending' is enabled -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From andrew at beekhof.net Fri Aug 29 02:32:50 2014 From: andrew at beekhof.net (Andrew Beekhof) Date: Fri, 29 Aug 2014 12:32:50 +1000 Subject: [Linux-cluster] on exiting maintenance mode In-Reply-To: <87oav4azdq.fsf@lant.ki.iif.hu> References: <8761hlickn.fsf@lant.ki.iif.hu> <67506B71-8594-4C16-82C1-F94779F59826@beekhof.net> <87iolf9mkc.fsf@lant.ki.iif.hu> <87iold7tba.fsf@lant.ki.iif.hu> <749665C4-F970-4C43-9228-BCFD2EE1B442@beekhof.net> <87oav4azdq.fsf@lant.ki.iif.hu> Message-ID: On 29 Aug 2014, at 10:54 am, Ferenc Wagner wrote: > Andrew Beekhof writes: > >> On 28 Aug 2014, at 3:09 am, Ferenc Wagner wrote: >> >>> So crm_resource -r whatever -C is the way, for each resource separately. >>> Is there no way to do this for all resources at once? 
>> >> I think you can just drop the -r > > Unfortunately, that does not work under version 1.1.7: You know what I'm going to say here right? > > $ sudo crm_resource -C > Error performing operation: The object/attribute does not exist > >>> Andrew Beekhof writes: >>> >>>> On 27 Aug 2014, at 3:40 am, Ferenc Wagner wrote: >>>> >>>>> My experiences show that you may not *move around* resources while in >>>>> maintenance mode. >>>> >>>> Correct >>>> >>>>> That would indeed require a cluster-wide re-probe, which does not >>>>> seem to happen (unless forced some way). >>> >>> After all this, I suggest to clarify this issue in the fine manual. >>> I've read it a couple of times, and still got the wrong impression. >> >> Which specific section do you suggest? > > 5.7.1. Monitoring Resources for Failure Ok, I'll endeavour to improve that section :) > > Some points worth adding/emphasizing would be: > 1. documentation of the role property (role=Master is mentioned later, > but role=Stopped never) > 2. In maintenance mode, monitor operations don't run > 3. If management of a resource is switched off, its role=Started monitor > operation continues running until failure, then the role=Stopped > kicks in (I'm guessing here; also, what about the other nodes?) > 4. When management is enabled again, no re-probe happens, the cluster > expects the last state and location to be still valid > 5. so don't even move unmanaged resources > 6. unless you started a resource somewhere before starting the cluster > on that node, or you cleaned up the resource > 7. same is true for maintenance mode, but for all resources. > > I have to agree that most of this is evident once you know it. > Unfortunately, it's also easy to get wrong while learning the ropes. 
> For example, hastexo has some good information online: > http://www.hastexo.com/resources/hints-and-kinks/maintenance-active-pacemaker-clusters > But from the sentence "in maintenance mode, you can stop or restart > cluster resources at will" I still miss the constraint of not moving the > resource between the nodes. Also, setting enabled="false" works funny, > it did not get rid of the monitor operation before I set the resource to > managed, and deleting the setting or changing it to true did bring it > back. I had to restart the resource to have monitor ops again. Why? > -- > Thanks, > Feri. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From manish631 at rediffmail.com Sat Aug 30 14:12:42 2014 From: manish631 at rediffmail.com (manish vaidya) Date: 30 Aug 2014 14:12:42 -0000 Subject: [Linux-cluster] Please help me on cluster error Message-ID: <20140830141242.4308.qmail@f5mail-224-126.rediffmail.com> I created a four-node cluster in a KVM environment, but I hit an error when creating a new PV with pvcreate /dev/sdb1: I got lock errors from node 2 and node 3. I also see strange cluster logs: Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 5e Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 5e 5f Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 5f 60 Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 61 Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 63 64 Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 69 6a Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 78 Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 84 85 Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit 
List: 9a 9b Please help me on this issue -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Sat Aug 30 14:53:08 2014 From: emi2fast at gmail.com (emmanuel segura) Date: Sat, 30 Aug 2014 16:53:08 +0200 Subject: [Linux-cluster] Please help me on cluster error In-Reply-To: <20140830141242.4308.qmail@f5mail-224-126.rediffmail.com> References: <20140830141242.4308.qmail@f5mail-224-126.rediffmail.com> Message-ID: are you using clvmd? if your answer is = yes, you need to be sure, you pv is visibile to your cluster nodes 2014-08-30 16:12 GMT+02:00 manish vaidya : > i created four node cluster in kvm enviorment But i faced error when > create new pv such as pvcreate /dev/sdb1 > got error , lock from node 2 & lock from node3 > > also strange cluster logs > > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 5e > > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 5e > 5f > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 5f > 60 > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 61 > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 63 > 64 > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 69 > 6a > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 78 > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 84 > 85 > Jun 10 14:46:24 node1 corosync[3266]: [TOTEM ] Retransmit List: 9a > 9b > > > Please help me on this issue > > > Get your own *FREE* website, *FREE* domain & *FREE* mobile app with > Company email. > *Know More >* > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... 
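[A concrete sketch of that visibility check — the node names and /dev/sdb1 path are taken from the report above; the ssh loop and exact paths are illustrative, not from the thread.]

```shell
# From one node, check the shared block device and LVM state everywhere;
# the PV must be visible on all nodes before clvmd can lock it cluster-wide.
for n in node1 node2 node3 node4; do
    echo "== $n =="
    ssh "$n" "ls -l /dev/sdb1; pvscan"
done

# clvmd also needs clustered locking enabled in lvm.conf (locking_type = 3)
# and the clvmd service running on every node:
grep '^[[:space:]]*locking_type' /etc/lvm/lvm.conf
service clvmd status
```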
From lists at alteeve.ca  Sat Aug 30 16:35:52 2014
From: lists at alteeve.ca (Digimer)
Date: Sat, 30 Aug 2014 12:35:52 -0400
Subject: [Linux-cluster] Please help me on cluster error
In-Reply-To: <20140830141242.4308.qmail@f5mail-224-126.rediffmail.com>
References: <20140830141242.4308.qmail@f5mail-224-126.rediffmail.com>
Message-ID: <5401FD68.8000407@alteeve.ca>

Can you share your cluster configuration please? This could be a network
problem; the messages below appear when the network between the nodes is
not fast enough, or has too much latency, so cluster traffic is
considered lost and re-requested.

If you don't have fencing working properly, and a network issue causes a
node to be declared lost, clustered LVM (and anything else that uses
cluster locking) will fail (by design).

If you share your configuration and more of your logs, it will help us
understand what is happening. Please also tell us what version of the
cluster software you're using.

digimer

On 30/08/14 10:12 AM, manish vaidya wrote:
> I created a four-node cluster in a KVM environment, but I hit an error
> when creating a new PV with pvcreate /dev/sdb1: a lock error from node2
> and a lock error from node3.
>
> There are also strange entries in the cluster logs:
>
> Jun 10 14:46:24 node1 corosync[3266]:   [TOTEM ] Retransmit List: 5e
> [...]
> Jun 10 14:46:24 node1 corosync[3266]:   [TOTEM ] Retransmit List: 9a 9b
>
> Please help me with this issue.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
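[If the network itself cannot be fixed, one common mitigation for the Retransmit List messages is relaxing the totem timing. A hedged example for /etc/corosync/corosync.conf: the option names are standard corosync totem parameters, but the values are illustrative only, not recommendations; under cman the equivalent settings go on the <totem> element in cluster.conf.]

```
totem {
    version: 2

    # Time (in ms) to wait for the token before declaring it lost.
    # Raising this tolerates higher network latency (default 1000).
    token: 10000

    # How many token losses are tolerated before a node is declared
    # dead and recovery (and fencing) begins.
    token_retransmits_before_loss_const: 10
}
```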