[Linux-cluster] bug: nfsclient.sh

Thu Feb 24 15:00:42 UTC 2011

If multiple NFS services are defined, a race condition exists between 
parallel invocations of /usr/share/cluster/nfsclient.sh.

exportfs in add/remove mode reads the existing exports in from the kernel 
(or etab/xtab), applies the command and then writes a _complete_ 
export list back to the kernel, not just an incremental change.

Unfortunately exportfs takes no account of other running copies of 
itself, and the kernel exports or /var/lib/nfs/[e|x]tab can change while 
it has them open without it noticing. Similarly, it may write back to the 
same files without checking for locks set by other copies of itself 
(there aren't any to check for in any case).

This can result in differing copies of the exports being written back to 
the kernel. As an example, if 5 exports are added at once using 5 
simultaneous exportfs commands, there's a good chance only 3 of the 5 
will make it into the kernel, with the others having been written by one 
process and then overwritten by another.
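
To make the overwrite pattern concrete, here is a small deterministic 
simulation. It is only a sketch: a temporary file stands in for the 
export table, the function names and the /export/... paths are made up 
for illustration, and no real exportfs is called.

```shell
#!/bin/sh
# Simulation of the read-modify-write overwrite described above.
# A plain file stands in for the export table (assumption for illustration).

# add_export TABLE SNAPSHOT ENTRY
# Writes back a complete table built from a (possibly stale) snapshot
# plus one new entry, the way a full rewrite would.
add_export() {
    table=$1
    snapshot=$2
    entry=$3
    printf '%s\n%s\n' "$snapshot" "$entry" > "$table"
}

simulate_lost_update() {
    table=$(mktemp) || return 1
    printf '%s\n' "/export/base" > "$table"

    # Both "processes" read the table before either one writes.
    snap_a=$(cat "$table")
    snap_b=$(cat "$table")

    # Each writes back its own complete copy; the second write wins.
    add_export "$table" "$snap_a" "/export/a"
    add_export "$table" "$snap_b" "/export/b"

    cat "$table"
    rm -f "$table"
}

# simulate_lost_update prints:
#   /export/base
#   /export/b
```

Two entries were added but only /export/b survives, because the second 
writer started from a snapshot that did not yet contain /export/a.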

This primarily manifests at startup/failover and typically results in 
10-20% of filesystem export commands failing and the associated service 
automatically restarting. (We have 84 separate NFS services. That's a 
LOT of exportfs parallelisation, giving a higher collision rate for 
reading/writing the e/xtab files.)

The fix is easy: flock calls around every exportfs invocation to ensure 
only one copy is ever running at once.
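
A minimal sketch of that kind of wrapper, using flock(1) from util-linux, 
is shown here. It is not the attached script: the function name 
with_exportfs_lock, the lock file path and the client/path in the usage 
comments are assumptions chosen for illustration.

```shell
#!/bin/sh
# with_exportfs_lock LOCKFILE CMD [ARGS...]
# Runs CMD while holding an exclusive lock on LOCKFILE, so that callers
# sharing the same LOCKFILE run one at a time.
# (Function name and lock file location are hypothetical.)
with_exportfs_lock() {
    lockfile=$1
    shift
    (
        # Block until the exclusive lock on fd 9 is acquired.
        flock -x 9 || exit 1
        "$@"
    ) 9>"$lockfile"
}

# Possible use inside an agent script (not executed here):
#   with_exportfs_lock /var/lock/exportfs.lock \
#       exportfs -o rw client.example.com:/export/data
#   with_exportfs_lock /var/lock/exportfs.lock \
#       exportfs -u client.example.com:/export/data
```

The lock is released when the subshell exits and fd 9 is closed, so a 
failed exportfs does not leave a stale lock behind. The wrapper returns 
the exit status of the wrapped command.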

See the attached modified version of nfsclient.sh

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: nfsclient.sh-WITH-FLOCK
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20110224/29c951f0/attachment.ksh>