[Linux-cluster] How to set up NFS HA service
birger
birger at birger.sh
Tue Apr 19 13:08:18 UTC 2005
Debugging a cluster setup with this software would have been easier with
better error messages from the components, but I'm getting there...
I thought I'd just mount my gfs file systems outside the resource
manager's control to have them present all the time and just use the
resource manager to move over the IP address and do the NFS magic. That
turned out to be impossible: I couldn't get any exports to happen when I
defined them in cluster.conf without a surrounding <fs>. I could define
the exports in /etc/exports instead, but then I would have to keep those
files in sync across the nodes. So in the end I put all my gfs file
systems into cluster.conf.
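For reference, a stripped-down sketch of the kind of service stanza I
mean (the names, device paths, addresses and options here are made up,
and the exact resource tags depend on your rgmanager version):

```xml
<rm>
  <service name="nfssvc" autostart="1">
    <!-- the floating IP address the service moves between nodes -->
    <ip address="10.0.0.50" monitor_link="1"/>
    <!-- the gfs file system, with the exports nested inside it -->
    <fs name="gfs1" mountpoint="/mnt/gfs1" device="/dev/vg0/gfs1"
        fstype="gfs">
      <nfsexport name="gfs1-export">
        <nfsclient name="lan" target="10.0.0.0/24" options="rw,sync"/>
      </nfsexport>
    </fs>
  </service>
</rm>
```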
It almost works. The file systems get mounted, and they get exported.
But I see some error messages in the log file, and the exports take a
very long time to appear. Also, only 2 of the 3 defined exports seem to
show up.
I'm also a bit puzzled about why the file systems don't get unmounted
when I disable all services.
As for file locking:
I copied /etc/init.d/nfslock to /etc/init.d/nfslock-svc and made some
changes.
First, I added a little code so nfslock can read a variable
STATD_STATEDIR from its config file in /etc/sysconfig and pass it to
statd via the -p option. This change should probably be propagated into
upcoming Fedora releases, if someone who knows the process would take
care of it... I then changed nfslock-svc to read a different config file
(/etc/sysconfig/nfs-svc), to do 'service nfslock stop' at the top of its
start section, and 'service nfslock start' at the bottom of its stop
section.
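In case it helps anyone, here is a rough sketch of what the changed
start/stop logic in nfslock-svc boils down to. All paths, variable
names and flag letters are my own guesses, not official nfs-utils
interfaces -- in particular, check which flag your rpc.statd uses for
the state directory (newer versions spell it -P).

```shell
#!/bin/sh
# Sketch of the nfslock-svc changes described above (hypothetical names).

# nfslock-svc reads its own config instead of /etc/sysconfig/nfslock:
CONFIG=${CONFIG:-/etc/sysconfig/nfs-svc}

build_statd_args() {
    # Pull in the service-specific config, if it exists.
    [ -f "$CONFIG" ] && . "$CONFIG"
    args=""
    # New variable: keep statd's state on the clustered file system.
    [ -n "$STATD_STATEDIR" ] && args="$args -p $STATD_STATEDIR"
    # Present statd under the NFS service's name, not the node's hostname.
    [ -n "$STATD_HOSTNAME" ] && args="$args -n $STATD_HOSTNAME"
    echo "$args"
}

start() {
    # Stop the node-local statd before starting the service instance.
    service nfslock stop
    rpc.statd $(build_statd_args)
}

stop() {
    killall rpc.statd
    # Bring the node-local statd back once the service leaves this node.
    service nfslock start
}
```

The point of the start/stop symmetry is that exactly one statd runs on
the node at any time, under whichever identity currently makes sense.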
This enables me to have statd running as e.g. 'server1' on the cluster
node until it takes over the nfs service. At takeover, statd gets
restarted with statedir on a cluster file system (so it can take over
lock info belonging to the service) and with the name of the NFS service
IP address. Does this sound reasonable? I know I'll lose any locks the
cluster node itself may have held (as an NFS client) when it takes over
the nfs service, but I cannot see any reason why a cluster node should
hold nfs locks (or have nfs mounts, for that matter) except when doing
admin work.
I think I could fix it by copying /var/lib/nfs/statd/sm* into the
clustered file system right after the 'service nfslock stop' I put in.
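Something along these lines is what I have in mind -- the destination
path is hypothetical, and I'm assuming statd's usual sm/ and sm.bak/
layout under its state directory:

```shell
#!/bin/sh
# Sketch: carry the node's statd state over to the clustered state
# directory right after 'service nfslock stop'.

SRC=${SRC:-/var/lib/nfs/statd}
DST=${DST:-/mnt/gfs1/statd}

copy_statd_state() {
    # sm/ lists clients currently holding locks; sm.bak/ lists clients
    # that still need to be notified after a restart.
    mkdir -p "$DST/sm" "$DST/sm.bak"
    cp -a "$SRC/sm/." "$DST/sm/"
    cp -a "$SRC/sm.bak/." "$DST/sm.bak/"
}
```

That way the service's statd would send reboot notifications to the
node's old lock holders too, instead of silently dropping them.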
I have appended part of my messages file and my cluster.conf file. Any
help with my NFS export issues will be appreciated.
--
birger
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf
Type: text/xml
Size: 2950 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050419/bd2d3168/attachment.xml>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: messages
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050419/bd2d3168/attachment.ksh>