[Linux-cluster] on exiting maintenance mode
Andrew Beekhof
andrew at beekhof.net
Wed Aug 27 04:54:33 UTC 2014
On 27 Aug 2014, at 3:40 am, Ferenc Wagner <wferi at niif.hu> wrote:
> Andrew Beekhof <andrew at beekhof.net> writes:
>
>> On 22 Aug 2014, at 10:37 am, Ferenc Wagner <wferi at niif.hu> wrote:
>>
>>> While my Pacemaker cluster was in maintenance mode, resources were moved
>>> (by hand) between the nodes as I rebooted each node in turn. In the end
>>> the crm status output became perfectly empty, as the reboot of a given
>>> node removed from the output the resources which were located on the
>>> rebooted node at the time of entering maintenance mode. I expected full
>>> resource discovery on exiting maintenance mode,
>>
>> Version and logs?
>
> (The more interesting part comes later, please skip to the theoretical
> part if you're short on time. :)
>
> I left those out, as I don't expect the actual behavior to be a bug.
> But I experienced this with Pacemaker version 1.1.7. I know it's old
No kidding :)
> and it suffers from a crmd segfault on entering maintenance mode (cf.
> http://thread.gmane.org/gmane.linux.highavailability.user/39121), but
> generally works well, so I have not gotten around to upgrading it yet. Now that I
> mentioned the crmd segfault: I noted that it died on the DC when I
> entered maintenance mode:
>
> crmd: [7452]: info: te_rsc_command: Initiating action 64: cancel vm-tmvp_monitor_60000 on n01 (local)
> crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel.
That looks like the lrmd died.
> crmd: [7452]: ERROR: get_lrm_resource: Could not add resource vm-tmvp to LRM
> crmd: [7452]: ERROR: do_lrm_invoke: Invalid resource definition
> crmd: [7452]: WARN: do_lrm_invoke: bad input <create_request_adv origin="te_rsc_command" t="crmd" version="3.0.6" subt="request" reference="lrm_invoke-tengine-1408517719-30820" crm_task="lrm_invoke" crm_sys_to="lrmd" crm_sys_from="tengine" crm_host_to="n01" >
> crmd: [7452]: WARN: do_lrm_invoke: bad input <crm_xml >
> crmd: [7452]: WARN: do_lrm_invoke: bad input <rsc_op id="64" operation="cancel" operation_key="vm-tmvp_monitor_60000" on_node="n01" on_node_uuid="n01" transition-key="64:20579:0:1b0a6e79-af5a-41e4-8ced-299371e7922c" >
> crmd: [7452]: WARN: do_lrm_invoke: bad input <primitive id="vm-tmvp" long-id="vm-tmvp" class="ocf" provider="niif" type="TransientDomain" />
> crmd: [7452]: info: te_rsc_command: Initiating action 86: cancel vm-wfweb_monitor_60000 on n01 (local)
> crmd: [7452]: ERROR: lrm_add_rsc(870): failed to send a addrsc message to lrmd via ch_cmd channel.
> crmd: [7452]: ERROR: lrm_get_rsc(666): failed to send a getrsc message to lrmd via ch_cmd channel.
> corosync[6966]: [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x1dc6ea0, async-conn=0x1dc6ea0) left
> pacemakerd: [7443]: WARN: Managed crmd process 7452 killed by signal 11 [SIGSEGV - Segmentation violation].
Which created a condition in the crmd that it couldn't handle, so it crashed too.
> pacemakerd: [7443]: notice: pcmk_child_exit: Child process crmd terminated with signal 11 (pid=7452, rc=0)
>
> However, it got restarted seamlessly, without the node being fenced, so
> I did not even notice this until now. Should this have resulted in the
> node being fenced?
Depends how fast the node can respawn.
>
> But back to the issue at hand. The Pacemaker shutdown seemed normal,
> apart from a bunch of messages like:
>
> crmd: [13794]: ERROR: verify_stopped: Resource vm-web5 was active at shutdown. You may ignore this error if it is unmanaged.
In maintenance mode, everything is unmanaged. So that would be expected.
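For reference, maintenance mode is just a cluster property, so it can be toggled e.g. with the crm shell (a minimal sketch):

  # stop managing (and monitoring) all resources
  crm configure property maintenance-mode=true

  # resume management based on whatever state the cluster currently knows about
  crm configure property maintenance-mode=false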
>
> appearing twice and warnings like:
>
> cib: [7447]: WARN: send_ipc_message: IPC Channel to 13794 is not connected
> cib: [7447]: WARN: send_via_callback_channel: Delivery of reply to client 13794/bf6f43a2-70db-40ac-a902-eabc3c12e20d failed
> cib: [7447]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed
> corosync[6966]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
>
> On reboot, corosync complained until some Pacemaker components
> started:
>
> corosync[8461]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
> corosync[8461]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
>
> Pacemaker then probed the resources on the local node (all was inactive):
>
> lrmd: [8946]: info: rsc:stonith-n01 probe[5] (pid 9081)
> lrmd: [8946]: info: rsc:dlm:0 probe[6] (pid 9082)
> [...]
> lrmd: [8946]: info: operation monitor[112] on vm-fir for client 8949: pid 12015 exited with return code 7
> crmd: [8949]: info: process_lrm_event: LRM operation vm-fir_monitor_0 (call=112, rc=7, cib-update=130, confirmed=true) not running
> attrd: [8947]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
> attrd: [8947]: notice: attrd_perform_update: Sent update 4: probe_complete=true
>
> Then I cleaned up some resources running on other nodes, which made them
> show up in the crm status output again and produced log lines like e.g.:
>
> crmd: [8949]: WARN: status_from_rc: Action 4 (vm-web5_monitor_0) on n02 failed (target: 7 vs. rc: 0): Error
>
> Finally, I exited maintenance mode, and Pacemaker started every resource
> I did not clean up beforehand, concurrently with their already running
> instances:
>
> pengine: [8948]: notice: LogActions: Start vm-web9#011(n03)
>
> I can provide more logs if this behavior is indeed unexpected, but it
> looks more like I'm missing the exact concept of maintenance mode.
>
>> The discovery usually happens at the point the cluster is started on a node.
>
> A local discovery did happen, but it could not find anything, as the
> cluster was started by the init scripts, well before any resource could
> have been moved to the freshly rebooted node (manually, to free the next
> node for rebooting).
That's your problem then: you've started resources outside of the control of the cluster.
Two options... recurring monitor actions with role=Stopped would have caught this, or you can run crm_resource --cleanup after you've moved resources around (see the sketch below).
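Roughly like this with the crm shell (vm-tmvp is just one resource name taken from your logs, the "..." stands for its existing params/ops, and the intervals are arbitrary examples; the two monitors only need different intervals):

  # 1) a second recurring monitor with role=Stopped makes the cluster
  #    periodically verify that the resource is NOT active on nodes where
  #    it is supposed to be stopped, so manual starts/moves get noticed
  primitive vm-tmvp ocf:niif:TransientDomain \
        ... \
        op monitor interval=60s role=Started \
        op monitor interval=61s role=Stopped

  # 2) or, after moving things around by hand, drop the stale status and
  #    force a fresh probe of the resource on the nodes
  crm_resource --cleanup --resource vm-tmvp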
>
>> Maintenance mode just prevents the cluster from doing anything about it.
>
> Fine. So I should have restarted Pacemaker on each node before leaving
> maintenance mode, right? Or is there a better way?
See above
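Assuming 1.1.7 behaves like later releases here, you can also force a cluster-wide re-probe by hand before flipping maintenance-mode back, instead of restarting Pacemaker on every node; something along these lines:

  # re-discover the current state of all resources on all nodes
  crm_resource --reprobe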
> (Unfortunately, I
> could not manage the rolling reboot through Pacemaker, as a DLM/cLVM
> freeze made the cluster inoperable by its normal means.)
>
>>> but it probably did not happen, as the cluster started up resources
>>> already running on other nodes, which is generally forbidden. Given
>>> that all resources were running (though possibly migrated during the
>>> maintenance), what would have been the correct way of bringing the
>>> cluster out of maintenance mode? This should have required no
>>> resource actions at all. Would cleanup of all resources have helped?
>>> Or is there a better way?
>
> You say in the above thread that resource definitions can be changed:
> http://thread.gmane.org/gmane.linux.highavailability.user/39121/focus=39437
> Let me quote from there (starting with the words of Ulrich Windl):
>
>>>>> I think it's a common misconception that you can modify cluster
>>>>> resources while in maintenance mode:
>>>>
>>>> No, you _should_ be able to. If that's not the case, its a bug.
>>>
>>> So the end of maintenance mode starts with a "re-probe"?
>>
>> No, but it doesn't need to.
>> The policy engine already knows if the resource definitions changed
>> and the recurring monitor ops will find out if any are not running.
>
> My experiences show that you may not *move around* resources while in
> maintenance mode.
Correct
> That would indeed require a cluster-wide re-probe,
> which does not seem to happen (unless forced somehow). Probably there
> was some misunderstanding in the above discussion; I guess Ulrich meant
> moving resources when he wrote "modifying cluster resources". Does this
> make sense?
No, I'm reasonably sure he meant changing their definitions in the CIB.
Or at least that's what I thought he meant at the time.
> --
> Thanks,
> Feri.
>