[Linux-cluster] Spectator mount option

Tue Jun 24 19:54:48 UTC 2008

Ok, got it?!
So my understanding is as follows.

Apart from the spectator mount option not being advisable for production use,
the *features* are:

* the spectator node does not get a journal
* the spectator node cannot replay a journal and therefore cannot be involved
in fencing. Or, the other way round, it is not involved in fencing.
* the spectator node should not have votes. It can, but it shouldn't.
* the spectator node is less involved in locking. Does it hold no locks at
all, or does it only hold read locks? That is not clear to me. See below.
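
To make the votes point concrete for myself, here is a toy sketch of
vote-based quorum. It is not cman code: it assumes the usual simple-majority
rule (half of all configured votes, plus one), and the node names and helper
functions are made up for illustration.

```python
# Toy model of vote-based quorum; an illustration, not cman code.
def quorum_threshold(votes):
    """Votes needed for quorum, assuming a simple majority of all votes."""
    return sum(votes.values()) // 2 + 1


def is_quorate(votes, alive):
    """True if the nodes in `alive` together hold enough votes."""
    present = sum(v for node, v in votes.items() if node in alive)
    return present >= quorum_threshold(votes)


# Hypothetical cluster: three rw nodes and one spectator.
zero_vote = {"rw1": 1, "rw2": 1, "rw3": 1, "spec1": 0}
one_vote = {"rw1": 1, "rw2": 1, "rw3": 1, "spec1": 1}

print(quorum_threshold(zero_vote), is_quorate(zero_vote, {"rw1", "rw2"}))
print(quorum_threshold(one_vote), is_quorate(one_vote, {"rw1", "rw2"}))
```

With votes="0" on the spectator, two of the three rw nodes are enough for
quorum. If the spectator had a vote, the same two rw nodes alone would not be
quorate, so the spectator's presence would start to matter.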

On spectator and locking:
Everything is clear apart from the locking topic. Why do you think the
spectator node is less involved in locking? Doesn't it have to request a
read lock for any file it wants to read? And as it cannot (if so?!) hold the
lock itself, it has to ask for it. Isn't this, from the network point of
view, more traffic than if it would "master/cache" the lock?
The only advantage would be that no gfs_scand, or a less loaded one, would be
running?! I have to admit that this wouldn't be too bad.

If this is so (correct me if I'm wrong), the only use case I can see would be
old, less powerful nodes that only need ro access to the fs.
As a drawback, the symmetry of the cluster would go away and one would end up
with type a and type b nodes. I am not yet sure whether I like that or not, or
whether it is a very common use case.

But isn't another use case more likely: the read-only approach? That means
read-only access to the filesystem while everything else (fencing, journal
replay and locking) would be running as before. One could then be sure that
the ro nodes would only request ro locks but would hold them themselves and
could therefore respond more quickly, and perhaps some other tunings for
read-only access to the gfs would be possible.

Because normally the use case for read-only access would be (wouldn't it?)
that there are some nodes changing data (more or less frequently, but only in
a few cases continuously) and others reading/serving the data. Those read-only
nodes should be able to access the data very quickly and respond to requests
instantaneously, and therefore should be more powerful than the rw nodes.

Could this use case benefit from the spectator mount option? Or should this
use case not be built with the spectator mount option? Or wouldn't it be
better to reduce demote_secs, use glock_purging and, if need be, increase the
size of the hash tables, and use rw as the mount option?

Regards, Marc.

On Tuesday 24 June 2008 18:38:41 David Teigland wrote:
> On Tue, Jun 24, 2008 at 09:53:56AM +0200, Marc Grimme wrote:
> > Hello,
> > we are currently testing the spectator mount option for giving nodes
> > readonly access to a gfs filesystem.
> >
> > One thing we found out is that any node having mounted the filesystem
> > with spectator mount option cannot do recovery when a node in the cluster
> > fails. That means we need at least 2 rw-nodes. It's clear when I keep in
> > mind that the node has no rw-access to the journal and therefore cannot do
> > the journal replay. But it is not mentioned anywhere.
> >
> > Could you please explain the ideas and other "unnormal" behaviors coming
> > along with the spectator mount-options.
> >
> > And are there any advantages from it except the "having no journal"?
>
> It's not mentioned much because it's never crossed the grey line of being
> a promoted or "supportable" feature.  Among the reasons are:
>
> - The use case(s) or benefits have never been clearly understood or stated.
>   What exactly are the spectator features?  (see below)
>   When should you use the spectator features, and why?
>   Are the benefits great enough to justify all the work/testing?
>
> - None of the spectator features have been tested.  QE would need to
>   develop tests for them, run them, and we'd need to fix the problems
>   that fall out.
>
> "Spectator features" refers to more than the spectator mount option in
> gfs.  There are three non-standard configuration modes that could be used
> together (although they can be used independently, too):
>
> 1. The spectator mount option in gfs.  With this option, gfs will never
>    write to the fs.  It won't do journal recovery, and won't allow
>    remount rw.  The main benefit of this is that the node does not need
>    to be fenced if it fails, so the node can mount without joining the
>    fence domain.
>
>    You point out some of the thorny problems with this option (along with
>    the ro mount option).  What happens when the last rw node fails,
>    leaving only spectators who can't recover the journal, and other
>    similar scenarios?  gfs_controld has code to handle these cases,
>    but it would require serious testing/validation.
>
> 2. Quorum votes in cman.  It may make sense in some environments for a node
>    to not contribute to quorum, either positively or negatively, of course.
>    <clusternode name="foo" nodeid="1" votes="0"/>
>
> 3. Resource mastering in dlm.  Nodes can be configured to never master
>    any dlm resources, which means there's less disruption in the dlm when
>    they join/leave the lockspace.  See this bug for more details:
>    https://bugzilla.redhat.com/show_bug.cgi?id=206336
>
> We'd like to understand specific use cases people have where these things
> would provide real advantages.  We need to be able to advise people when,
> why and how to use these settings, and we need to be able to test them as
> they'd be used.
>
> Thanks,
> Dave
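
The recovery constraint discussed above (a spectator cannot replay a failed
node's journal, so at least one rw node has to survive) can be written down
as a small check. This is only a sketch of the rule as I understand it, not
gfs_controld logic; the function name, the node names and the two mount
modes modelled here are assumptions for illustration.

```python
# Sketch of the "who can replay the journal" rule; not gfs_controld code.
def journal_recoverable(mounts, failed):
    """mounts maps node name -> "rw" or "spectator" (hypothetical labels).

    Returns True if, after `failed` goes down, at least one surviving node
    has the filesystem mounted rw and could therefore replay the journal.
    """
    survivors = {node: mode for node, mode in mounts.items() if node != failed}
    return any(mode == "rw" for mode in survivors.values())


# Hypothetical clusters: one rw node versus two rw nodes, plus spectators.
single_rw = {"rw1": "rw", "spec1": "spectator", "spec2": "spectator"}
double_rw = {"rw1": "rw", "rw2": "rw", "spec1": "spectator"}

print(journal_recoverable(single_rw, "rw1"))
print(journal_recoverable(double_rw, "rw1"))
```

In the first cluster nobody is left to replay the journal of rw1, which is
the situation we ran into and the reason at least 2 rw nodes are needed.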

-- 
Gruss / Regards,

Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/               http://www.open-sharedroot.org/
