[Linux-cluster] Re: About CS global shutdown

Patrick Caulfield pcaulfie at redhat.com
Fri Jul 21 07:04:52 UTC 2006


Alain Moulle wrote:
> Alain Moulle wrote:
> 
> 
>>>>> Hi
>>>>>
>>>>> I wonder what the best way is to safely stop CS 4 on a large number of
>>>>> HA pairs from a management node, with the idea of shutting down
>>>>> all nodes after that:
>>>>>
>>>>> Is it better to use :
>>>>> 1/
>>>>> pdsh "clusvcadm -l" launched on all nodes
>>>>> then
>>>>> pdsh "clushutdown" launched on all nodes
>>>>> then
>>>>> proceed to poweroff on all nodes.
>>>>>
>>>>> or
>>>>> 2/
>>>>> pdsh "service rgmanager stop" launched on all nodes
>>>>> pdsh "service fenced stop" launched on all nodes
>>>>> pdsh "service cman stop" launched on all nodes
>>>>> pdsh "service ccsd stop" launched on all nodes
>>>>>
>>>>> I think that 1/ is safer, because in 2/ some applications
>>>>> could begin to fail over just before the 2nd node of a pair
>>>>> receives the stop commands in its turn.
>>>>> 1/ avoids any pointless failover attempt.
>>>>>
>>>>> Am I right?
>>>>> Thanks
>>>>> Alain
>>>>>
> 
> 
>>> Hi Alain,
> 
>>> I'm kind of new to this (so I could be wrong), but here are my beliefs:
> 
>>> Option 1 only takes care of stopping the rgmanager services running on the
>>> cluster.  (Besides, clushutdown already does a clusvcadm -l for you).  It
>>> really doesn't inform the rest of the cluster that the node intends to
>>> go down.
> 
>>> Option 2 shouldn't really be necessary, because reboot or halt should
>>> execute the proper service shutdowns in the proper order.  However, it
>>> also doesn't inform the rest of the cluster that the node intends to
>>> go down.
> 
>>> You want to inform the rest of the cluster that they're going down;
>>> otherwise some of the nodes may start trying to fence the other nodes
>>> that are going down, etc.
>>> Perhaps you should consider something like:
> 
>>> pdsh "clushutdown" launched on all nodes
>>> pdsh "cman_tool leave remove" launched on all nodes
>>> pdsh "shutdown" or "halt" launched on all nodes
> 
>>> Regards,
> 
>>> Bob Peterson
>>> Red Hat Cluster Suite
> 
> Hi Bob
> Thanks for your clarification. Nevertheless, several questions:
> 
> 1/ It seems that clushutdown is not packaged in the CS4 update 2 rpms, whereas
> the man page is in the package... is it missing for a special reason?
> 
> 2/ What does cman_tool leave remove do exactly?
> And would it be a workaround for the remaining problem with
> "service cman stop", which fails 1 time in 5 (in which case
> lsmod lists a remaining user of the module, but without its name)?


Don't confuse the "cman_tool" commands with the shutdown scripts (I'm not sure
you are but just to be safe!).

"cman_tool leave remove" should be called /by/ the shutdown scripts after all
the other services have been shut down.

What it does is remove the node from the cluster and tell the remaining
nodes to adjust quorum, so that the rest of the cluster continues to work.
This allows you to shut down the remaining nodes in the same way without
any services hanging.
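
To make the ordering concrete, here is a rough per-node sketch of what I
mean. It is only an illustration, not the actual init scripts: the run()
and node_leave_sequence() helpers and the DRY_RUN variable are made up for
this example, and the service names are simply the ones from Alain's
option 2, with "cman_tool leave remove" standing in for the cman stop step.

```shell
#!/bin/sh
# Sketch only: stop the higher-level services first, then leave the cluster
# with "remove" so the remaining nodes adjust quorum.
# DRY_RUN, run() and node_leave_sequence() are hypothetical helpers for this
# example; with DRY_RUN=1 (the default) the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

node_leave_sequence() {
    # Stop at the first failing step instead of carrying on.
    run service rgmanager stop &&
    run service fenced stop &&
    run cman_tool leave remove &&
    run service ccsd stop
}

node_leave_sequence
```

Run as is, it just prints the four commands in order; set DRY_RUN=0 on a
test node only once you are happy with the sequence.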

What we really need, I suppose, is something equivalent to VMS's
CLUSTER_SHUTDOWN option that allows all nodes to be shut down in careful
synchronisation - but we don't have that.

-- 

patrick



