[Linux-cluster] Data Loss / Files and Folders "2-Node_GFS-Cluster"

Wendy Cheng s.wendy.cheng at gmail.com
Thu Oct 30 18:49:37 UTC 2008


Jason Ralph wrote:
> Hello List,
>
> We currently have in production a two node cluster with a shared SAS 
> storage device.  Both nodes are running RHEL5 AP and are connected 
> directly to the storage device via SAS.  We also have configured a 
> high availability NFS service directory that is being exported out and 
> is mounted on multiple other linux servers. 
>
> The problem that I am seeing is:
> Files and folders that use the GFS filesystem and live on the 
> storage device are mysteriously getting lost.  My first thought was 
> that maybe one of our many users had deleted them. So I revoked 
> the users' privileges, and it is still happening.  My other thought was 
> that an rsync script may have overwritten or deleted these files.  I 
> have stopped all scripting and crons, and it has happened again.
>
> Can someone help me with a command or a log to view that would show me 
> where any of these folders may have gone?  Or has anyone else ever run 
> into this type of data loss using a similar setup?
>

I don't have (or rather "didn't" have) much involvement with RHEL5 GFS, 
so I may not know enough to respond. However, users should be aware of 
the following.

Before RHEL 5.1 and the community 2.6.22 kernel, NFS locks (i.e. 
flock, POSIX locks, etc.) are not propagated to the filesystem layer. They 
only reach the Linux VFS layer, which is local to one particular server. 
If your file access needs to be synchronized via either flock or POSIX 
locks *between multiple hosts (i.e. NFS servers)*, data loss could occur. 
Newer versions of RHEL and kernels 2.6.22 and above should have the code 
to pass these locks down to the filesystem.
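
To make the kind of synchronization I mean concrete, here is a minimal 
sketch of a client doing a read-modify-write under an exclusive POSIX 
lock. The function name and the counter-file layout are hypothetical, 
made up for illustration; this is not taken from any particular 
application.

```python
import fcntl
import os


def locked_increment(path):
    """Increment an integer stored in `path` while holding an exclusive
    POSIX (fcntl-style) lock on the whole file. Hypothetical helper."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until the exclusive lock is granted. On an NFS mount the
        # request is forwarded to the NFS server the client is talking to.
        fcntl.lockf(fd, fcntl.LOCK_EX)

        raw = os.read(fd, 64).decode().strip()
        value = (int(raw) if raw else 0) + 1

        # Rewrite the file in place with the new value.
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, str(value).encode())
        os.fsync(fd)
        return value
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)
```

If the lock only reaches the VFS layer of one server, two clients that 
go through different NFS servers could each be granted LOCK_EX on the 
same file at the same time, and one update would silently overwrite the 
other.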

There was an old write-up in section 4.1 of 
"http://people.redhat.com/wcheng/Project/nfs.htm" about this issue.

-- Wendy



