[Linux-cluster] DLM won't (stay) running
jagauthier at gmail.com
Tue May 8 17:44:18 UTC 2018
On Tue, May 8, 2018 at 10:50 AM, David Teigland <teigland at redhat.com> wrote:
> On Tue, May 08, 2018 at 07:18:17AM -0400, Jason Gauthier wrote:
>> node 1084772368: alpha
>> node 1084772369: beta
>> primitive p_dlm_controld ocf:pacemaker:controld \
>> op monitor interval=60 timeout=60 \
>> meta target-role=Started args=-K
>> primitive p_gfs_controld ocf:pacemaker:controld \
>> params daemon=gfs_controld \
>> meta target-role=Started
>> primitive stonith_sbd stonith:external/sbd \
>> params pcmk_delay_max=30 sbd_device="/dev/sdb1"
>> group g_gfs2 p_dlm_controld p_gfs_controld
>> clone cl_gfs2 g_gfs2 \
>> meta interleave=true target-role=Started
>> property cib-bootstrap-options: \
>> have-watchdog=false \
>> dc-version=1.1.16-94ff4df \
>> cluster-infrastructure=corosync \
>> cluster-name=zeta \
>> last-lrm-refresh=1525523370 \
>> stonith-enabled=true \
>> When I bring the resources up, I get a quick blip in my logs.
>> May 8 07:13:58 beta dlm_controld: 253556 dlm_controld 4.0.7 started
>> May 8 07:14:00 beta kernel: [253558.641658] dlm: closing connection to node 1084772369
>> May 8 07:14:00 beta kernel: [253558.641764] dlm: closing connection to node 1084772368
> When you're starting the dlm through pacemaker, be sure that systemd is
> not also starting it. I don't think pacemaker is happy if dlm_controld
> is already started.
Thanks David. dlm is not enabled with systemd at all.
>> This is the same messaging I see when I run dlm manually and then stop
>> it. My challenge here is that I cannot find out what dlm is doing.
>> I've tried adding -K to /etc/default/dlm, but I don't think that file
>> is being respected. I would like to figure out how to increase the
>> verbose output of dlm_controld so I can see why it won't stay running
>> when it's launched through the cluster. I haven't been able to
>> figure out how to pass arguments directly to the daemon in the
>> primitive config, if it's even possible. Otherwise, I would try to
>> pass -K there.
> In /etc/dlm/dlm.conf put
> then you should see all the debug info in
I made this change on both of my nodes and tried to start the resource.
I just get the same two lines in messages, and no dlm_controld.log
file appears.
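
One thing that may be worth checking, offered as a sketch rather than a
confirmed fix: in the primitive quoted above, args=-K sits under meta, and
as far as I know the ocf:pacemaker:controld agent only reads its options
from instance parameters (params). Assuming the agent version installed
here exposes an args parameter ("crm ra info ocf:pacemaker:controld"
should list it), the primitive might look like this instead:

```
# Assumption: the installed controld agent supports an "args" parameter.
primitive p_dlm_controld ocf:pacemaker:controld \
        params args="-K" \
        op monitor interval=60 timeout=60 \
        meta target-role=Started
```

Setting args replaces whatever default the agent would otherwise pass to
dlm_controld, so it is worth checking that default before overriding it.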