df hangs -- nfs related problem

Patrick O'Callaghan pocallaghan at gmail.com
Wed Jan 7 15:16:32 UTC 2009


On Wed, Jan 7, 2009 at 9:33 PM, Rick Stevens <ricks at nerd.com> wrote:
> Aldo Foot wrote:
>>
>> I have an NFS client on which the df command hangs. I have to manually
>> kill it to get back to the prompt.
>>
>> The client had an NFS export mounted, but the export is no longer
>> available to the client, and the client is still trying to access it.
>> Is there a way to terminate the nfs mount on the client?
>> There are no messages in the log files pointing to this.
>>
>> I know it's an NFS problem because strace stops when it tries
>> to access /mnt/nfs.
>>
>> **command output trimmed for clarity.
>>
>> % strace df -h
>> statfs64("/proc/fs/nfsd", 84, {f_type=0x6e667364, f_bsize=4096, f_blocks=0,
>> f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0},
>> f_namelen=255, f_frsize=4096}) = 0
>> statfs64("/mnt/nfs", 84,
>>
>> The mtab shows the export
>>    nfsclient> cat /etc/mtab
>>    nfsserver:/stor1 /mnt/nfs nfs rw,addr=xxx.xxx.xxx.xxx 0 0
>>
>> The mount command also shows the relevant line.
>>   nfsclient> mount -l
>>   nfsserver:/stor1 on /mnt/nfs type nfs (rw,addr=xxx.xxx.xxx.xxx)
>>
>> Umount says the NFS export is busy, but there is nothing there.
>>   nfsclient> umount nfsserver:/stor1
>>   umount: /mnt/nfs: device is busy
>>   umount: /mnt/nfs: device is busy
>>
>> Cannot see what processes (if any) are accessing nfsserver:/stor1 because
>> the lsof command hangs just like df.
>>
>> Umount does not find anything either.
>>   nfsclient> umount /stor1
>>   umount: /stor1: not found
>>
>> A cron job created a number of df processes over several days that I had
>> to kill manually on the nfsclient.
>>
>> Does anyone have a tip about this? A web search did not yield much.
>
> Try "lsof | grep nfs" to see if anything has the mountpoint open.  If
> not, try "umount -f /mnt/nfs" as the root user to try a forced umount.
>
> Also check to see if the mount command (or /etc/fstab entry) has "hard"
> specified (that's the default as well).  Unless you're really certain
> about the stability of the network and of the NFS server, I'd recommend
> you specify "soft" in the mount command (see "man 5 nfs" for details).

That would depend on which is worse, potentially losing data or having
a client machine hang because the server is (perhaps temporarily)
unavailable. It depends totally on the specific application scenario.
To quote nfs(5):

<quote>
A  so-called  "soft"  timeout can cause silent data corruption in
certain cases. As such, use the soft option only when client
responsiveness is more important than data integrity.  Using NFS over
TCP or increasing the value of the retrans option  may  mitigate some
of the risks of using the soft option.
</quote>

IOW there is no "right" answer to this.
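
For checking which mode a client is actually using, here is a minimal
sketch that reads the mount table as plain text, so it never touches the
(possibly dead) mount point the way df or lsof would. The function name
nfs_mount_modes is made up for illustration, and the sketch assumes the
usual "device mountpoint fstype options" column layout of /proc/mounts
or /etc/mtab. A mount with no explicit "soft" option is reported as
hard, since hard is the default.

```shell
# Hypothetical helper: print each NFS mount point and whether it is
# mounted hard or soft. The optional argument is the mount table to read
# (defaults to /proc/mounts); the file is only parsed, never stat'ed.
nfs_mount_modes() {
    awk '$3 ~ /^nfs4?$/ {
        mode = "hard"                    # hard is the default per nfs(5)
        n = split($4, opts, ",")         # options are comma-separated
        for (i = 1; i <= n; i++)
            if (opts[i] == "soft") mode = "soft"
        print $2, mode
    }' "${1:-/proc/mounts}"
}
```

Running "nfs_mount_modes" with no argument reads /proc/mounts; passing
/etc/mtab reads that file instead. Either way it only tells you what the
client was asked to do, not which choice is right for your data.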

poc



