[Linux-cluster] on exiting maintenance mode

Fri Aug 29 00:54:57 UTC 2014

Andrew Beekhof <andrew at beekhof.net> writes:

> On 28 Aug 2014, at 3:09 am, Ferenc Wagner <wferi at niif.hu> wrote:
>
>> So crm_resource -r whatever -C is the way, for each resource separately.
>> Is there no way to do this for all resources at once?
>
> I think you can just drop the -r

Unfortunately, that does not work under version 1.1.7:

$ sudo crm_resource -C
Error performing operation: The object/attribute does not exist
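
Until that works, a loop over the individual resources might serve as a
workaround.  This is only a sketch: it assumes that crm_resource --list-raw
prints one resource ID per line on this version, and cleanup_all is a
made-up helper name, not something Pacemaker ships.

```shell
# cleanup_all: hypothetical helper, not part of Pacemaker.
# Runs "crm_resource -r ID -C" once per resource, because a bare
# "crm_resource -C" is rejected by 1.1.7.
# Assumption: --list-raw prints one resource ID per line.
cleanup_all() {
    crm_resource --list-raw | while read -r rsc; do
        # skip empty lines in the listing
        [ -n "$rsc" ] || continue
        crm_resource -r "$rsc" -C
    done
}
```

After sourcing the definition in a root shell, calling cleanup_all with no
arguments would clean up each listed resource in turn.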

>> Andrew Beekhof <andrew at beekhof.net> writes:
>> 
>>> On 27 Aug 2014, at 3:40 am, Ferenc Wagner <wferi at niif.hu> wrote:
>>> 
>>>> My experience shows that you may not *move around* resources while in
>>>> maintenance mode.
>>> 
>>> Correct
>>> 
>>>> That would indeed require a cluster-wide re-probe, which does not
>>>> seem to happen (unless forced some way).
>> 
>> After all this, I suggest clarifying this issue in the fine manual.
>> I've read it a couple of times, and still got the wrong impression.
>
> Which specific section do you suggest?

5.7.1. Monitoring Resources for Failure

Some points worth adding or emphasizing:
1. Documentation of the role property (role=Master is mentioned later,
   but role=Stopped never).
2. In maintenance mode, monitor operations don't run.
3. If management of a resource is switched off, its role=Started monitor
   operation keeps running until it fails, then the role=Stopped monitor
   kicks in (I'm guessing here; also, what about the other nodes?).
4. When management is enabled again, no re-probe happens; the cluster
   expects the last known state and location to be still valid.
5. So don't even move unmanaged resources,
6. unless you started a resource somewhere before starting the cluster
   on that node, or you cleaned up the resource.
7. The same is true for maintenance mode, but for all resources.
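
Points 4 to 7 boil down to a simple discipline, which could be sketched
like this.  It assumes the maintenance-mode cluster property is toggled
with crm_attribute; maintenance_window is a made-up name, and the manual
work is whatever command you pass to it.

```shell
# maintenance_window: hypothetical wrapper, not part of Pacemaker.
# Usage: maintenance_window COMMAND [ARGS...]
maintenance_window() {
    # Enter maintenance mode: the cluster stops managing all resources.
    crm_attribute -t crm_config -n maintenance-mode -v true || return 1
    # Manual work.  It must leave every resource in the state and on the
    # node where the cluster last saw it, since no re-probe happens.
    # On failure, stay in maintenance mode so the admin can inspect.
    "$@" || return 1
    # Leave maintenance mode: the cluster resumes managing resources.
    crm_attribute -t crm_config -n maintenance-mode -v false
}
```

If a resource did end up elsewhere, it would need a cleanup before the
last step, as per point 6.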

I have to agree that most of this is evident once you know it.
Unfortunately, it's also easy to get wrong while learning the ropes.
For example, hastexo has some good information online:
http://www.hastexo.com/resources/hints-and-kinks/maintenance-active-pacemaker-clusters
But in the sentence "in maintenance mode, you can stop or restart
cluster resources at will" I still miss the constraint of not moving the
resource between the nodes.  Also, setting enabled="false" works oddly:
it did not get rid of the monitor operation until I set the resource to
managed, and deleting the setting or changing it to true did not bring it
back.  I had to restart the resource to have monitor ops again.  Why?
-- 
Thanks,
Feri.