[Linux-cluster] node fenced by dlm_controld on a clean shutdown

Jan Friesse jfriesse at redhat.com
Wed Nov 21 16:19:02 UTC 2012


Jacek Konieczny wrote:
> On Wed, Nov 21, 2012 at 11:19:02AM +0100, Jan Friesse wrote:
>> Hi,
>> we've discussed this problem with dave, but I would like to get some
>> information:
>> - What distro are you using?
> 
> PLD Linux
> 
>> - Are the packages compiled by you or taken from the distro?
> 
> I am making packages for the distro as a part of my job.
> 
>> - What do you mean by "clean shutdown"? Is this something like service
>> dlm_control stop, or your own script?
> 
> systemd, using the corosync.service unit file provided with corosync
> sources (it is far from being 'systemd' native) and the dlm.service

Yes, far, far away. But there are good reasons for that...

> as comes with dlm sources (includes my patches).
> 
> Shutdown is started by '/sbin/halt' or '/sbin/reboot' using standard
> systemd procedure. I have added some rules to make sure Pacemaker is
> stopped before the rest, but dlm and corosync order is not affected.
> 

OK, cool. This is the information I was looking for.

> Systemd kills dlm_controld first and, as soon as it exits, initiates the
> stop of corosync. Adding an artificial delay between those two fixes my
> problem.
> 

The problem may be that, if dlm_controld refuses to exit, systemd will
kill it anyway (this is only a theory).
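
To make the theory concrete, here is a minimal Python sketch of the
"ask politely, then force" stop pattern I have in mind. It is only an
illustration: the names stop_with_timeout and spawn are made up for this
example, the child is a stand-in process rather than dlm_controld, and
whether systemd really ends up doing this to the dlm unit is exactly the
part that is unverified.

```python
import subprocess
import sys


def stop_with_timeout(proc, timeout_s):
    """Ask a child process to exit, then force-kill it if it does not.

    Returns "exited" if SIGTERM was enough, "killed" if SIGKILL was needed.
    """
    proc.terminate()  # SIGTERM: polite request to shut down
    try:
        proc.wait(timeout=timeout_s)
        return "exited"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: cannot be caught or ignored
        proc.wait()
        return "killed"


def spawn(ignore_sigterm):
    """Start a stand-in daemon (hypothetical; not dlm_controld itself)."""
    code = "import signal, time\n"
    if ignore_sigterm:
        # Simulate a daemon that refuses to exit on SIGTERM.
        code += "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    code += "print('ready', flush=True)\ntime.sleep(60)\n"
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()  # block until the child has set up its signals
    return proc


if __name__ == "__main__":
    print(stop_with_timeout(spawn(ignore_sigterm=False), 5.0))
    print(stop_with_timeout(spawn(ignore_sigterm=True), 0.5))
```

In the second case the process never gets a chance to finish its own
cleanup, which is the situation I suspect could look like an unclean
departure to the other nodes.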

> When calling shutdown scripts by hand or the old SysVinit way (through
> other shell scripts), the delay between the two jobs could be
> 'naturally' longer.
> 
> Unfortunately, I have been distracted recently by some other, higher
> priority, job, so I could not do more investigation in this matter
> (still on my TODO, though).
> 

Understood. You gave me enough information anyway, so thanks.

> Greets,
>         Jacek
> 

Regards,
  Honza



