[Linux-cluster] dlm caused a kernel panic
teigland at redhat.com
Wed Dec 14 19:45:38 UTC 2005
On Wed, Dec 14, 2005 at 07:43:40AM -0800, Jeff Dinisco wrote:
> Is the slow output from df expected? Does it just take considerable
> time to read a gfs superblock?
Yes, it's expected; df locks every resource group in the fs to collect
usage information, so large fs's will take longer, and heavy writers on
other nodes will delay it further.
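
As a rough illustration of why the cost grows with fs size, here is a toy
model in Python. It is not GFS code: ResourceGroup and statfs are made-up
names, and a local threading.Lock stands in for the cluster-wide lock on
each resource group. The point is only that every resource group has to be
locked in turn, so a held lock anywhere stalls the whole walk.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ResourceGroup:
    """Hypothetical stand-in for one resource group and its usage counters."""
    total_blocks: int
    free_blocks: int
    # Local lock standing in for the cluster-wide lock (an assumption).
    lock: threading.Lock = field(default_factory=threading.Lock)


def statfs(resource_groups):
    """Sum usage over every resource group, locking each one while reading it."""
    total = free = 0
    for rg in resource_groups:
        # Blocks here for as long as a writer holds this group's lock.
        with rg.lock:
            total += rg.total_blocks
            free += rg.free_blocks
    return total, free


rgs = [ResourceGroup(total_blocks=1000, free_blocks=250) for _ in range(4)]
print(statfs(rgs))
```

The walk is linear in the number of resource groups, and each step can wait
on a writer, which matches the two delays described above.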
> In my scenario, is it likely that heavy lock load was caused by the
> combination df and a umount at the same time?
I'm not sure lock load is related to this particular case. After studying
your logs I think I know what the problem is; it's a situation where a dlm
message from an unmounting node is received after recovery for it is
completed on the remaining nodes. A quick and correct fix would be to
remove the assertion (or perhaps change it, I'll see.)
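
To make the ordering concrete, here is a toy Python model of the race. It is
not the dlm code: LockSpace, recover_departed and receive are hypothetical
names, and the membership check is only an assumption about what a changed
assertion might look like. It shows a message from a departed node arriving
after recovery and being dropped rather than tripping an assertion.

```python
class LockSpace:
    """Hypothetical model of one node's view of lockspace membership."""

    def __init__(self, members):
        self.members = set(members)
        self.dropped = []  # late messages that were ignored

    def recover_departed(self, nodeid):
        # Recovery for a node that unmounted: forget it as a member.
        self.members.discard(nodeid)

    def receive(self, nodeid, msg):
        # A strict version would do: assert nodeid in self.members
        # which fails when a message arrives after recovery has finished.
        if nodeid not in self.members:
            self.dropped.append((nodeid, msg))
            return False  # message ignored
        return True  # message would be processed normally


ls = LockSpace(members=[1, 2, 3])
ls.recover_departed(3)           # recovery for node 3 completes first
print(ls.receive(3, "unlock"))   # then node 3's message arrives late
print(ls.receive(2, "request"))  # current members are unaffected
```

Whether the real fix drops the message or simply removes the check is still
open, as noted above.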
> Were the gfs recover events in the log prior to the kernel panic
> normal, or is it possible that I attempted the umount too quickly after
Mounting and unmounting always involve dlm recovery, which is more prone to
bugs and corner cases, so avoiding unnecessary or rapidly repeating
mounting/unmounting is usually wise. You didn't do anything wrong,
though; it's simply a corner case we aren't handling properly.
> Would r/o mounts decrease lock load and the likelihood of this occurring