[Linux-cluster] multipath/gfs lockout under heavy write

Tue Jan 25 09:12:06 UTC 2005

On Tue, Jan 25, 2005 at 01:41:54AM -0700, Marcelo Matus wrote:

> >In the past, GFS would immediately panic the machine when it saw i/o
> >errors.  Now it tries to shut down the bad fs instead.  After this happens
> >you should be able to unmount the offending fs, leave the cluster and
> >reboot the machine cleanly.
> 
> I have a question about your last comment. We did the following 
> experiment with GFS 6.0.2:
> 
> 1.- Set up a cluster using a single GFS server and gnbd device (lock_gulm 
> master and gnbd_export on the same node).
> 
> 2.- Fence out a node manually using fence_gnbd.
> 
> then we observed two cases:
> 
> 1.- If the fenced machine is not mounting the GFS/gnbd fs, but only
> importing it, then we can properly either reboot or restart the GFS
> services with no problem.
> 
> 2.- If the fenced machine is mounting the GFS/gnbd fs, but with no
> process using it, almost everything produces a kernel panic, even just
> unmounting the unused fs.  In fact the only thing that works, besides
> pushing the reset button, is 'reboot -f', which is almost the same.
> 
> So, when you say "In the past", do you refer to GFS 6.0.2?

I was actually referring to the code Lazar is using, which is the next, as
yet unreleased, version of GFS from the public CVS.  Your situation could
be explained similarly, like this:

- running fence_gnbd causes the node to get i/o errors if it tries to use
  gnbd

- if the node has GFS mounted, GFS will try to use gnbd

- when GFS 6.0.2 sees i/o errors it will panic

If you don't have GFS mounted, the last two steps don't exist and there's
no panic.
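
To make the chain above concrete, here is a rough sketch of it as a
decision function.  It is only a toy model of the reasoning in this
mail, not GFS code: the names (Outcome, outcome_after_fence,
panics_on_io_error) are made up for illustration, and the "shut down
the fs" branch just restates what the newer cvs code is described as
doing.

```python
from enum import Enum


class Outcome(Enum):
    NO_PANIC = "GFS never touches gnbd, so it sees no i/o errors"
    PANIC = "GFS sees i/o errors and panics the machine"
    FS_SHUTDOWN = "GFS sees i/o errors and shuts down the bad fs"


def outcome_after_fence(fenced: bool, gfs_mounted: bool,
                        panics_on_io_error: bool) -> Outcome:
    """Toy model of the three steps above (hypothetical helper)."""
    # Steps 1 and 2: fence_gnbd only produces i/o errors if something
    # uses gnbd, and a mounted GFS is what uses it.
    if not (fenced and gfs_mounted):
        return Outcome.NO_PANIC
    # Step 3: GFS 6.0.2 panics on i/o errors; the unreleased code
    # tries to shut down the fs instead.
    if panics_on_io_error:
        return Outcome.PANIC
    return Outcome.FS_SHUTDOWN


# Case 1 from the experiment: gnbd imported but GFS not mounted.
case_1 = outcome_after_fence(fenced=True, gfs_mounted=False,
                             panics_on_io_error=True)
# Case 2 from the experiment: GFS 6.0.2 mounted on the fenced node.
case_2 = outcome_after_fence(fenced=True, gfs_mounted=True,
                             panics_on_io_error=True)
# Same as case 2, but with the newer behaviour assumed.
case_3 = outcome_after_fence(fenced=True, gfs_mounted=True,
                             panics_on_io_error=False)
```

With the newer behaviour, case 2 is the one where you should be able
to unmount the fs, leave the cluster and reboot cleanly.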

-- 
Dave Teigland  <teigland at redhat.com>