[Linux-cluster] Asking information about NFS /CS4 Cookbook

Fajar A. Nugraha fajar at telkom.net.id
Thu May 24 10:31:50 UTC 2007


Hi Wendy,

Wendy Cheng wrote:
>>> - can the client handle the change gracefully so that
>>> there will be no "stale nfs handle"?
>
> This is a tough problem to solve. Hopefully the bugzilla entry
> explains it well.
>
The bugzilla does say a lot. Thank you.
>>>
>>> On the server side:
>>> - When the exported filesystem is non-cluster (e.g. ext3), how does the
>>> server handle locking issues? If an nfs client is holding a lock on a
>>> file, can the server (in a managed NFS service) forcefully unmount the
>>> file system, considering that the nfs daemon runs in kernel space and
>>> thus can't be killed?
>
> We have a tentative patch set for this. It is usable but still under
> revision.
>
Please help me go through this summary from the bugzilla:

Before we complete the work, for NFS v2/v3, RHEL 4.4 has the following
restrictions:

==> Is this still valid for RHEL 4.5 and RHEL5?

B-1: Unless NFS client applications can tolerate ESTALE and/or EPERM errors,
     IO activities on the failover ip interface must be temporarily quiesced
     until the active-active failover transition completes. This is to avoid
     non-idempotent NFS operation failures on the new server. (Check out
     "Why NFS Sucks" by Olaf Kirch, published as "kirch-reprint.pdf" in the
     2006 OLS proceedings.)

==> What does this mean, exactly? For example, does it mean that I
should not use RHCS-managed NFS storage for web servers that are busy
all the time, because I'd likely get ESTALE/EPERM during a failover?
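
For what it's worth, the only client-side workaround I can imagine is
retrying on ESTALE. A rough sketch (paths and retry counts are made up):

    # retry a copy a few times when it fails with a stale NFS file
    # handle, to ride out the failover window
    for i in 1 2 3 4 5; do
        if cp /mnt/nfs/data.in /tmp/data.in 2>/tmp/cp.err; then
            break
        fi
        grep -qi "stale" /tmp/cp.err || break  # some other error, give up
        sleep 5                                # wait for failover to settle
    done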

B-2: Due to various possible base kernel bugs outside RHCS' control, there
     is a chance that a local filesystem (such as ext3) umount could fail.
     To ensure data integrity, RHCS will abort the failover. The admin can
     specify the self-fence (reboot the taken-over server) option to force
     the failover (via the cluster.conf file).

==> In short, it'd be better to use GFS, right?
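
By the way, I assume the self-fence option ends up as an attribute on
the fs resource in cluster.conf. Is it something like this (the
self_fence/force_unmount attribute names, device, and paths are my
guesses, not taken from the cookbook):

    <fs name="nfsdata" device="/dev/sdb1" mountpoint="/mnt/nfsdata"
        fstype="ext3" force_unmount="1" self_fence="1"/>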

B-3: If an nfs client invokes an NLM locking call, the affected nfs servers
     (both the taken-over and the take-over node) will enter a global
     90-second (tunable) locking grace period for every nfs service on
     the servers.

==> What does "locking grace" mean? Does it mean read-write access
allowed but no locks, or no acess at all?
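
Also, is the tunable here lockd's nlm_grace_period? I'm assuming it is
set through module options, e.g.:

    # /etc/modprobe.conf -- assuming lockd exposes nlm_grace_period
    options lockd nlm_grace_period=90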

B-4: If NFS-TCP is involved, failover should not be issued on the same pair
     of machines multiple times within a 30-minute period; for example,
     failing over from node A to B, then immediately failing back from B to
     A, would hang the connection. This is to avoid the TCP TIME_WAIT issue.

==> So what does this mean in the TCP vs. UDP world? Does it mean
NFS v3 over UDP is the preferred method?
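
I guess one could at least check for lingering nfs sockets before
failing back; something like:

    # look for nfs (port 2049) connections stuck in TIME_WAIT
    netstat -tan | grep ':2049 ' | grep TIME_WAIT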

>>> - Can the server gracefully handle new, failed-over nfs clients, since
>>> some nfs information is stored in /var/lib/nfs, which is on a local
>>> file system?
>
> This would need to be moved into the shared storage area (and done by
> RHCS).
>
Okay, this one makes sense.
That means there's an additional step involved here, not yet mentioned
in the nfs cookbook, right?
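
So I imagine something like this on each node before the service is
enabled (paths are just an example):

    # keep nfs state (rmtab etc.) on the shared file system so it
    # follows the service on failover
    mv /var/lib/nfs /var/lib/nfs.local
    ln -s /mnt/nfsdata/var/lib/nfs /var/lib/nfs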

Regards,

Fajar