[Linux-cluster] disabling DLM and GFS kernel modules
Chris Harms
chris at cmiware.com
Tue Sep 18 15:00:06 UTC 2007
The only other thing I can think of is that I started ntpd, and since it
had not been running there was likely a large time adjustment.
Sep 17 10:27:32 ntpd[1118]: synchronized to 206.222.28.90, stratum 2
Sep 17 15:53:38 ntpd[1118]: time reset +18217.299628 s
Sep 17 15:53:38 ntpd[1118]: kernel time sync enabled 0001
Sep 17 15:53:38 openais[4457]: [TOTEM] The token was lost in the
OPERATIONAL state.
Sep 17 15:53:38 dlm_controld[4480]: cluster is down, exiting
Sep 17 15:53:38 gfs_controld[4486]: cluster is down, exiting
Sep 17 15:53:38 fenced[4474]: cluster is down, exiting
Sep 17 15:53:38 kernel: dlm: closing connection to node 1
Sep 17 15:53:48 named[8732]: *** POKED TIMER ***
Sep 17 15:53:48 named[8733]: *** POKED TIMER ***
Sep 17 15:54:04 ccsd[4437]: Unable to connect to cluster infrastructure
after 30 seconds.
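An 18,000-second step like the one logged above is large enough to disrupt
openais's totem token timing. One way to avoid it is to let ntpd make one
large correction at startup and then only slew afterwards. A minimal sketch,
assuming a RHEL-style /etc/sysconfig/ntpd (the -g and -x flags are standard
ntpd options; the file path and OPTIONS variable are the usual Red Hat
convention, not something from this thread):

```shell
# /etc/sysconfig/ntpd -- hedged sketch, RHEL-style convention
# -g : permit one large clock correction at daemon startup
# -x : after that, slew the clock instead of stepping it, so
#      long-running daemons never see a sudden time jump
OPTIONS="-g -x -u ntp:ntp -p /var/run/ntpd.pid"
```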
David Teigland wrote:
> On Tue, Sep 18, 2007 at 09:34:45AM -0500, Chris Harms wrote:
>
>> It said something about an out of memory condition. This was logged
>> just prior to where it would have panicked:
>>
>> groupd[9639]: found uncontrolled kernel object rgmanager in /sys/kernel/dlm
>> groupd[9639]: local node must be reset to clear 1 uncontrolled instances
>> of gfs and/or dlm
>> openais[9625]: [CMAN ] cman killed by node 1 because we were killed by
>> cman_tool or other application
>> fenced[9647]: cman_init error 0 111
>> dlm_controld[9653]: cman_init error 0 111
>> gfs_controld[9659]: cman_init error 111
>>
>
> These messages mean that the userspace cluster software all exited for
> some unknown reason, leaving behind a dlm lockspace (in the kernel) from
> rgmanager. At this point, you needed to reboot the machine, but instead
> you restarted the userspace cluster software, which rightly complained
> that you hadn't rebooted the machine, and refused to operate.
>
> This probably doesn't help, though, because it doesn't tell us anything
> about the original problem(s) you had. The original problem(s) probably
> caused the cluster software to exit the first time, and was probably
> related to the runaway processes.
>
>
>
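The leftover lockspaces described above are visible in sysfs (one directory
per lockspace under /sys/kernel/dlm), so a restart procedure can check for
them before trying to bring the cluster software back up. A minimal sketch,
assuming sysfs is mounted at /sys; the wording of the messages is mine:

```shell
# Check for dlm lockspaces the kernel still holds from a dead
# cluster instance; if any exist, the node needs a reboot (or a
# fence) before the userspace cluster software can safely restart.
if [ -d /sys/kernel/dlm ] && [ -n "$(ls -A /sys/kernel/dlm 2>/dev/null)" ]; then
    RESULT="uncontrolled dlm lockspaces present: reboot this node first"
else
    RESULT="no leftover dlm lockspaces"
fi
echo "$RESULT"
```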
>> There were 2 runaway processes related to GFS / DLM before I tried to
>> shut it down. We had not encountered any issues like this until now.
>> The only changes to our setup were a superficial change to some cluster
>> services, and an upgrade of the DRBD kernel module.
>>
>> Kevin Anderson wrote:
>>
>>> On Mon, 2007-09-17 at 17:50 -0500, Chris Harms wrote:
>>>
>>>> Is there an easy way to disable GFS and related kernel modules if one
>>>> does not need GFS? We are running the 5.1 Beta 1 version of the cluster
>>>> and had a mysterious crash of the cluster suite. There were issues with
>>>> the GFS and dlm modules. The kernel panicked on shutdown.
>>>>
>>>>
>>>>
>>> Do you have any details on the panic?
>>>
>>> Kevin
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster at redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>