[Linux-cluster] Resource Groups

Wed Nov 30 15:19:32 UTC 2005

Lon

Thanks for the info, but I'm still a bit stuck and am no longer sure if it's
a cluster.conf problem or something else.

I am using an already-mounted GFS file system and just want the IP address
and NFS export to be movable around the cluster.  I presume that the cluster
won't care that a GFS file system is already mounted when it tries to mount
it?

Below is the resource manager extract from my cluster.conf file.  I found
another thread that had an example of this, so I used that as a basis along
with the information you provided.

        <rm>
                <resources>
                        <clusterfs fstype="gfs" name="wavfs"
                                   mountpoint="/mnt/encoded/audio/wav"
                                   device="/dev/mapper/shared_disk-wav"
                                   options=""/>
                        <nfsexport name="NFSexports"/>
                        <nfsclient name="read-write" target="*"
                                   options="rw,sync"/>
                </resources>

                <service name="wav" >
                        <ip address="10.0.1.50" monitor_link="yes"/>
                        <clusterfs ref="wavfs">
                                <nfsexport ref="NFSexports">
                                        <nfsclient ref="read-write"/>
                                </nfsexport>
                        </clusterfs>
                </service>
        </rm>

Unfortunately, when I tried to start the service, it failed.  Now when I try
to disable it (clusvcadm -d wav), I get the following reported in
/var/log/messages:

Nov 30 13:53:48 jrmedia-a clurgmgrd[2194]: <notice> Stopping service wav
Nov 30 13:53:48 jrmedia-a rpc.mountd: Caught signal 15, un-registering and exiting.
Nov 30 13:53:52 jrmedia-a kernel: nfsd: last server has exited
Nov 30 13:53:52 jrmedia-a kernel: nfsd: unexporting all filesystems
Nov 30 13:53:52 jrmedia-a kernel: RPC: failed to contact portmap (errno -5).
Nov 30 13:53:55 jrmedia-a clurgmgrd[2194]: <notice> stop on nfsexport "NFSexports" returned 1 (generic error)
Nov 30 13:53:55 jrmedia-a clurgmgrd[2194]: <crit> #12: RG wav failed to stop; intervention required
Nov 30 13:53:55 jrmedia-a clurgmgrd[2194]: <notice> Service wav is failed

Cheers
Ben 

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Lon Hohberger
Sent: 30 November 2005 14:17
To: linux clustering
Subject: Re: [Linux-cluster] Resource Groups

On Wed, 2005-11-30 at 10:46 +0000, Ben Yarwood wrote:
> I can't find any documentation on adding resource groups to the 
> cluster.conf file.
> Can anyone point me in the right direction please or give me an 
> example of an NFS service.
> 
> Regards
> Ben Yarwood

Hi Ben,

Give me a day or three and I will write a real howto for rgmanager.  It is
simple, but not very intuitive (if that makes any sense...).

Basically, a typical NFS service looks like this:

  <service ... >
    <fs ... >
      <nfsexport>  <!-- no attributes -->
        <nfsclient target="*" name="World" options="ro"/>
      </nfsexport>
    </fs>
    <ip ... />
  </service>
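
To make the skeleton above more concrete, here is a sketch with the elided
attributes filled in.  Every value (service name, device, mount point, IP
address) is a placeholder invented for illustration, and the name on
<nfsexport> is an assumption borrowed from the form used in Ben's config;
check the attribute names against the resource agents shipped with your
rgmanager version before relying on it.

```xml
<!-- Hypothetical example: every value below is a placeholder. -->
<service name="nfs_example">
  <fs name="export_fs" device="/dev/sdb1" mountpoint="/export" fstype="ext3">
    <!-- The name attribute here is an assumption; the skeleton shows none. -->
    <nfsexport name="export_fs_exports">
      <nfsclient name="World" target="*" options="ro"/>
    </nfsexport>
  </fs>
  <ip address="192.168.0.10" monitor_link="yes"/>
</service>
```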

The goal for this design is full active-active NFS - where you can have as
many NFS services as you want moving around in the cluster at any time
completely independent of one another.  Due to some kernel bugs, this does
not currently work correctly in all cases, unfortunately (hard ones...
believe me, we are working on them).

Tangent.  You can ignore this next part/example if you do not want to
experiment with largely untested resources...

Contrast this with the single-NFS-service resource (nfsserver) in the head
branch of CVS, which generally works (it even issues SM_NOTIFY on the correct
IPs...) but has the limitation of allowing only *one* NFS server in the
entire cluster (*ouch*).

Example nfsserver implementation:

  <service ... >
    <fs ... >
      <nfsserver ... >
        <nfsclient ... />
        <nfsclient ... />
        <ip ... />
      </nfsserver>
    </fs>
  </service>

Ok, on to some general hints...

There is a pre-arranged start/stop ordering for resource types with certain
children.  With a service, the order is:

start:
  fs         <!-- mount ext2, ext3, etc. -->
  clusterfs  <!-- mount gfs -->
  netfs      <!-- mount an outside NFS export -->
  ip         <!-- bring up an IP -->
  script     <!-- User-scripts -->

stop:
  ip         
  script     
  netfs      
  clusterfs  
  fs
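
As an illustration of the ordering listed above, consider a service whose
direct children are written in a deliberately scrambled order.  Assuming the
listed ordering applies, the position of each element in the file does not
matter; the comments show the order implied by the start and stop lists.
Names, paths, and addresses are hypothetical.

```xml
<!-- Hypothetical example: children listed out of order on purpose. -->
<service name="order_example">
  <!-- started 3rd -->
  <script name="app" file="/etc/init.d/example-app"/>
  <!-- started 2nd -->
  <ip address="192.168.0.20"/>
  <!-- started 1st -->
  <fs name="data" device="/dev/sdc1" mountpoint="/data" fstype="ext3"/>
</service>
<!-- Stop order per the list above: ip, then script, then fs. -->
```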

There is no guaranteed ordering within a resource type, even if that type
has a defined start/stop order (so five <fs> direct children of <service>
may start/stop in any order), and there are no ordering guarantees among
similar or different resource types that have no defined start/stop order.

Instead, if you need ordering apart from the <service> child guarantees, it
is better to nest resources as children.  Children of a resource are always
started before the next resource at the same level of the tree.  A common
example of this is a sub-mount point: mount /a, then mount /a/b and /a/c.

Wrong:

  <service ... >
    <fs mountpoint="/a" ... />
    <fs mountpoint="/a/b" ... />
    <fs mountpoint="/a/c" ... />
  </service>

Correct:

  <service ... >
    <fs mountpoint="/a" ... >
      <fs mountpoint="/a/b" ... />
      <fs mountpoint="/a/c" ... />
    </fs>
  </service>

Similarly, some user applications *require* that the IP address be up at the
time the service starts and torn down *after* the application has exited.

Wrong for this case (but correct for *most* applications!):

  <service ... >
    <fs mountpoint="/a" ... />
    <ip ... />
    <script ... />
  </service>

Correct:

  <service ... >
    <fs mountpoint="/a" ... />
    <ip ... >
      <script ... />
    </ip>
  </service>

This is kind of a "top-down" dependency for start, and a "bottom-up" for
stop.  In the above example, you may not stop the <ip> until the <script>
resource has successfully stopped, and you may not start the <script> until
after the <ip> has successfully started.
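
The same rule, annotated on the nested example.  The order numbers simply
restate the top-down/bottom-up description above combined with the <service>
child ordering; all attribute values are placeholders.

```xml
<!-- Hypothetical example: values are placeholders. -->
<service name="nested_example">
  <!-- start 1, stop 3 -->
  <fs name="a" device="/dev/sdd1" mountpoint="/a" fstype="ext3"/>
  <!-- start 2, stop 2 -->
  <ip address="192.168.0.30">
    <!-- start 3, stop 1: child starts after, and stops before, its parent -->
    <script name="app" file="/etc/init.d/example-app"/>
  </ip>
</service>
```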

-- Lon
