[Linux-cluster] Ext3/ext4 in a clustered environment

Nicolas Ross rossnick-lists at cybercat.ca
Fri Nov 4 18:05:34 UTC 2011


Hi !

We are currently running RHEL 6.1 with GFS2 filesystems on top of 
Fibre Channel storage for our cluster. All filesystems live in logical 
volumes managed by clvmd.

Our services are divided into directories, for example 
/GFSVolume1/Service1, /GFSVolume1/Service2, and so forth. Almost everything 
a service needs to run lives under its directory (Apache, PHP 
executables, website data, Java servers, etc.).

Some services have document directories that are huge, not so much in 
size (about 35 GB) as in number of files: around one million. One 
service even has 3 data directories with that many files each.

It works pretty well for now, but when it comes to data updates (via rsync) 
and backups (also via rsync), the node doing the rsync slows to a crawl: all 
16 logical cores sit at 100% system time, and the filesystem sometimes 
freezes for services on other nodes.

We recently changed how the rsync is done: we now run rsync -nv to see 
which files would be transferred, then transfer those files manually. But 
it's still sometimes too much for GFS2.
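
For what it's worth, a minimal sketch of that two-step approach, assuming 
rsync 3.x (the source path and list file name are illustrative):

  # Dry run: print only the relative names of files that would be transferred
  rsync -an --out-format='%n' /staging/documentsA/ \
      /GFSVolume1/Service1/documentsA/ > changed-files.txt

  # Then transfer just those files; --files-from reads paths relative to the source
  rsync -a --files-from=changed-files.txt /staging/documentsA/ \
      /GFSVolume1/Service1/documentsA/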

In our case, NFS is not an option: there are a lot of is_file() calls 
hitting this directory structure all the time, and the added latency of NFS 
is not viable.

So, I'm thinking of putting each of those directories into its own ext4 
filesystem of about 100 GB to speed up all of these processes. Wherever 
these huge directories are used, they are used by one service and one 
service alone. So I would define a filesystem resource in cluster.conf, 
something like:

  <fs device="/dev/VGx/documentsA" force_unmount="1" fstype="ext4"
      mountpoint="/GFSVolume1/Service1/documentsA" name="documentsA"
      options="noatime"/>

(Note that quota=off is a GFS2 mount option; ext4 has quotas off by 
default, so plain noatime should do.)

So, first: is this doable?

Second, is this risky? In the sense that, with force_unmount set, I assume 
no other node would mount that filesystem before it is unmounted by the 
stopping service. I know that umount can hang for various reasons, but 
that's unlikely here since this data is mostly read-only. In that case the 
service would be marked failed and would need to be restarted manually. 
What would be the consequences if the filesystem ended up mounted on two 
nodes at once?

One could add self_fence="1" to the fs line, so that if the unmount fails, 
the node fences itself to force the umount. But I'm not there yet.
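
For reference, that variant would just add the attribute to the same 
resource (same illustrative names as above):

  <fs device="/dev/VGx/documentsA" force_unmount="1" self_fence="1" fstype="ext4"
      mountpoint="/GFSVolume1/Service1/documentsA" name="documentsA"
      options="noatime"/>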

Third, I've been told that it's not recommended to mount a filesystem like 
this "on top" of another clustered filesystem. Why is that? I suppose I'd 
have to mount it under /mnt/something and symlink to that instead.
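
If so, something like this, done once (the /mnt path is illustrative; the 
<fs> mountpoint attribute would then point at /mnt/documentsA):

  # Mount the ext4 LV outside the GFS2 tree
  mkdir -p /mnt/documentsA
  mount -o noatime /dev/VGx/documentsA /mnt/documentsA

  # Point the service's expected path at it from the GFS2 volume
  ln -s /mnt/documentsA /GFSVolume1/Service1/documentsA

With rgmanager managing the mount, only the symlink would be created by 
hand; it lives on GFS2, so it is visible from every node.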

Thanks for any insights. 



