[Linux-cluster] GFS 6.0 crashing x86_64 machine

micah nerren mnerren at paracel.com
Mon Aug 9 20:57:00 UTC 2004


On Mon, 2004-08-09 at 08:12, Michael Conrad Tadpol Tilstra wrote:
> On Fri, Aug 06, 2004 at 04:31:55PM -0700, micah nerren wrote:
> > So it appears to be specifically related to lock_gulm.
> 
> hrms, so no pushing this off onto someone else. oh well. ;)
> 
> 
> > Anything else I should try?
> well, it still pretty much looks like a stack overflow.  And looking at
> the calling tree, there is not much left to take out of the stacks.  So
> I guess we'll have to try making the stack shorter.
> 
> So, another patch.  This still works on my intels, give it a go and
> let's see how it does on your opterons.
> 
> > I really appreciate all your help in debugging this!
> np.
> 

I tried the patch, but it still crashes with the same oops. However, I
tried something I hadn't tried before, which may shed some light on this:
I rebooted the system into UP (uniprocessor) mode, loaded the UP modules,
and mounted the file system. This time there was no oops. It still doesn't
work, but the machine lives; the mount process simply hangs. When I go to
another terminal and kill the mount process, this appears in the syslog:

lock_gulm: ERROR cm_login failed. -512
lock_gulm: ERROR Got a -512 trying to start the threads.
lock_gulm: fsid=hopkins:gfs01: Exiting gulm_mount with errors -512
GFS: can't mount proto = lock_gulm, table = hopkins:gfs01, hostdata =

Does that shed any light on things? The oops appears to be specific to
the combination of SMP and lock_gulm: in UP mode the mount still doesn't
work, but the machine does not oops.
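
For reference when reading the log above: 512 is the value of ERESTARTSYS
in the kernel's include/linux/errno.h, the internal code a sleeping call
returns when a signal interrupts it. If that is what lock_gulm is passing
up here, the -512 would most likely reflect the kill of the mount process
rather than the cause of the hang. Below is a minimal Python sketch for
decoding such codes. The names KERNEL_INTERNAL_ERRNO and
describe_kernel_error are hypothetical, and the table lists only a few of
the kernel-internal values.

```python
import errno
import os

# Kernel-internal codes from include/linux/errno.h. They are not meant to
# reach user space, so Python's errno module does not know them.
# (Hypothetical helper table; only a few values are listed.)
KERNEL_INTERNAL_ERRNO = {
    512: "ERESTARTSYS",
    513: "ERESTARTNOINTR",
    514: "ERESTARTNOHAND",
    515: "ENOIOCTLCMD",
}


def describe_kernel_error(code: int) -> str:
    """Return a symbolic name for a (usually negative) kernel return code."""
    n = abs(code)
    if n in KERNEL_INTERNAL_ERRNO:
        return KERNEL_INTERNAL_ERRNO[n]
    if n in errno.errorcode:
        # Ordinary errno value: give the symbol and the system's description.
        return f"{errno.errorcode[n]} ({os.strerror(n)})"
    return f"unknown error {n}"


if __name__ == "__main__":
    print(describe_kernel_error(-512))
```

Under that reading, the three lock_gulm lines describe the login being
interrupted by the signal, not a failure reported by the lock server.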




