Solaris NFS clients go wonky when server went FC4->FC5

Wed Sep 20 02:11:19 UTC 2006

I just ran into an unrelated issue that caught me by surprise, and it
might be worth looking into for this problem too: SELinux. Just for
grins and giggles, you might try turning it off for NFS to see what
happens (Desktop->System->Security...).
It might also be worth checking the iptables settings. These are
generally good things, so put them back if this suggestion doesn't help.
My surprise was related to SMB support, which may be similar enough.
I needed to adjust these features carefully. Good luck.
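
One rough way to see whether a firewall rule is in the way is to test,
from the client side, whether the server's NFS-related TCP ports accept
connections at all. The sketch below assumes NFS over TCP on the
standard ports (111 for the portmapper, 2049 for nfsd); the helper name
port_open is made up for illustration. A successful connect does not
rule out SELinux or iptables trouble, it only shows the port is
reachable.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name lookup failed.
        return False


# Example (not run here); "foo" stands in for the server name:
# for port in (111, 2049):
#     print(port, port_open("foo", port))
```
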
<><Randy

<><Randall Grimshaw
Room 203 Machinery Hall
Syracuse University
Syracuse, NY   13244
315-443-5779
rgrimsha at syr.edu
>>> will.partain at verilab.com 09/19/06 5:26 PM >>>
"Chris Mohler" <cr33dog at gmail.com> writes:

[previous recap is at the end]

> Probably not related, but I see those errors on my NFS clients from
> time to time.
>
> It's usually when:
>
> A - The server is using 90%+ of the CPU
> B - the network traffic is very high.

Chris, thanks for your reply.  That's definitely not it -- this is a
quiet-backwater network.  Further developments and facts:

* I upgraded the SPARC box (client) to Solaris 10 6/06 (i.e. the
  latest), in case the problem was old, crufty code there.  No change.

* It is something _I'm_ doing that is triggering the problem; I have a
  colleague who has been using a similar box for days without incident.

* I can make the problem happen always and instantly.  My test case
  happens to involve an NFS partition named /sysadm/.-ark-install-ALL

  I can 'ls /sysadm/.-ark-install-ALL' and it mounts and works fine.

  If I 'truss' the offending test case, it fails at the syscall...

   open64("/our/.-ark-deploy/arkbase/share/ark/arkcmd", O_RDONLY)

  I changed the mount from 'intr' to 'soft', so that I would get an
  error message other than just "server not responding".  (Useful
  trick, no?)

  In every case, I get...

   NFS <op> failed for server foo: error 5 (RPC: Timed out)

  ... where <op> is usually getattr, but can be something else.

  But running all the stuff like '/usr/bin/rpcinfo -t foo nfs' shows
  everything to be a picture of happiness.

  [The exact mount opts were:

   read/write/nosetuid/nodevices/nodev/timeo=600/retrans=2/proto=tcp/vers=3/soft/xattr]

* Once it goes ga-ga over one mount from the server, it is ga-ga about
  other mounts from the same server -- until it rights itself again.

* I _thought_ it might have something to do with running as root; but
  no deal -- I can bust it as me, too.
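
To make the mount options quoted above easier to reason about, here is
a small sketch that splits the slash-separated string into a
dictionary and converts timeo to seconds. It assumes the documented
Solaris mount_nfs meaning of timeo (tenths of a second); the function
name parse_opts is made up for illustration.

```python
# Option string exactly as reported in the message above.
MOUNT_OPTS = ("read/write/nosetuid/nodevices/nodev/timeo=600/"
              "retrans=2/proto=tcp/vers=3/soft/xattr")


def parse_opts(text):
    """Return {name: value}; bare flags map to True."""
    # "read/write" is one option that happens to contain the separator.
    text = text.replace("read/write", "rw")
    opts = {}
    for item in text.split("/"):
        name, sep, value = item.partition("=")
        if value.isdigit():
            opts[name] = int(value)
        else:
            opts[name] = value if sep else True
    return opts


opts = parse_opts(MOUNT_OPTS)
timeout_s = opts["timeo"] / 10.0  # timeo is in tenths of a second
print("soft mount:", opts.get("soft", False))
print("per-request timeout: %.0f s, retrans: %d"
      % (timeout_s, opts["retrans"]))
```

With these options a request waits a full minute before it is counted
as timed out, which may be why the failure shows up as a long hang
rather than a quick error.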

I count this as slight progress :-(  Any other ideas?

Will

== recap ============================================================

For a long time (years), I have had sparc-solaris8 NFS clients
(well-patched) talking to a RH/Fedora NFS server, recently FC4
(x86_64).  The mount options, dished out through autofs, were
(probably sub-optimally):

  -rw,nosuid,nodev,sync,retry=5,rsize=16384,wsize=16384,intr

These were lightly-used clients; it worked; everybody happy.

I yum-upgraded the server to FC5 (current kernel, nfs-utils).  It
works... most of the time, but the clients now often-but-not-always
wander off into...

  NFS server foo not responding still trying
  NFS server foo ok
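
To get a feel for how often the clients wander off, the message pair
above can be tallied from a log. This is only a sketch: it matches the
two message texts as quoted here, and the function name summarize is
made up for illustration. The sample input is just the two lines from
this recap.

```python
import re
from collections import defaultdict

# Message texts as quoted above; the server name is captured.
STALL = re.compile(r"NFS server (\S+) not responding")
OK = re.compile(r"NFS server (\S+) ok")


def summarize(lines):
    """Map server -> (stall count, True if the last stall recovered)."""
    stalls = defaultdict(int)
    stalled_now = {}
    for line in lines:
        m = STALL.search(line)
        if m:
            server = m.group(1)
            # Count transitions only; the message may repeat while stalled.
            if not stalled_now.get(server):
                stalls[server] += 1
            stalled_now[server] = True
            continue
        m = OK.search(line)
        if m:
            stalled_now[m.group(1)] = False
    return {s: (n, not stalled_now[s]) for s, n in stalls.items()}


sample = [
    "NFS server foo not responding still trying",
    "NFS server foo ok",
]
print(summarize(sample))
```

On a Solaris client the same function could be fed the lines of the
system log (commonly /var/adm/messages) instead of the sample list.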

-- 
fedora-list mailing list
fedora-list at redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list