[Linux-cluster] Data Loss / Files and Folders "2-Node_GFS-Cluster"

Wendy Cheng s.wendy.cheng at gmail.com
Fri Oct 31 02:02:00 UTC 2008


Jason Ralph wrote:
> Hello List,
>
> We currently have in production a two node cluster with a shared SAS 
> storage device.  Both nodes are running RHEL5 AP and are connected 
> directly to the storage device via SAS.  We also have configured a 
> high availability NFS service directory that is being exported out and 
> is mounted on multiple other linux servers. 
>
> The problem that I am seeing is:
> Files and folders that are using the GFS filesystem and live on the 
> storage device are mysteriously getting lost.  My first thought was 
> that maybe one of our many users had deleted them. So I have revoked 
> the users' privileges and it is still happening.  My other thought was 
> that an rsync script may have overwritten these files or deleted them.  I 
> have stopped all scripting and crons and it has happened again.
>
> Can someone help me with a command or a log to view that would show me 
> where any of these folders may have gone?  Or has anyone else ever run 
> into this type of data loss using a similar setup?
>


I don't have (and didn't have) much involvement with RHEL5 GFS, so I may
not know enough to respond. However ...

Before RHEL 5.1 and/or community version 2.6.22 kernels, NFS locks (taken
via flock, fcntl, etc. on the client side) are not propagated down to the
filesystem layer. They only reach the Linux VFS layer, which is local to
one particular server, so a lock granted by one NFS server is not visible
to the other. If your file access needs to be synchronized by either flock
or POSIX fcntl locks *between multiple hosts (NFS servers)*, data loss
could occur. Newer versions of RHEL and kernels 2.6.22 and later should
have the fixes.
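
For reference, the kind of client-side locking I mean looks roughly like
the sketch below. This is only an illustration, not code from your setup:
the function name locked_increment and the counter file are made up. It
does a read-modify-write under an exclusive POSIX (fcntl) lock. On an
affected kernel, two clients running this against two different NFS
servers could both be granted the lock at the same time, because each
server only checks its own local VFS lock table.

```python
import fcntl
import os
import tempfile


def locked_increment(path):
    """Increment a counter file under an exclusive POSIX (fcntl) lock.

    Hypothetical helper for illustration only. The lock is advisory:
    it only protects against other processes that also take it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until the exclusive lock on the whole file is granted.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        data = os.read(fd, 64)
        value = int(data or b"0") + 1
        # Rewrite the file from the start with the new value.
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, str(value).encode())
        os.fsync(fd)
        return value
    finally:
        # Release the lock before closing the descriptor.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)


if __name__ == "__main__":
    # Local demo in a temporary directory; on a cluster the path would
    # be on the NFS mount backed by GFS.
    with tempfile.TemporaryDirectory() as tmp:
        counter = os.path.join(tmp, "counter")
        for _ in range(3):
            print(locked_increment(counter))
```

If your applications (or rsync wrappers) depend on this sort of lock to
serialize updates across clients that mount from different cluster nodes,
that is where I would look first.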

There was an old write-up in section 4.1 of
"http://people.redhat.com/wcheng/Project/nfs.htm" about this issue.


-- Wendy




