[Linux-cluster] Ext3/ext4 in a clustered environement

Nicolas Ross rossnick-lists at cybercat.ca
Mon Nov 7 21:00:27 UTC 2011


> I've seen a significant performance drop with ext3 (and other) filesystems
> with 10s to 100s of thousands of files per directory. Make sure that the
> "directory hash" option is enabled for ext3. With ~1M files per directory,
> I'd do some performance tests comparing rsync under ext3, ext4, and gfs
> before changing filesystems... while ext3/4 do perform better than gfs, the
> directory size may be such an overwhelming factor that the filesystem
> choice is irrelevant.

Don't get me wrong: there are millions of files in total, but no more than a
few hundred per directory. They are spread out based on the database id, split
two characters at a time, so a file named 1234567.jpg would end up in a
directory like 12/34/5/, or something similar.
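
For what it's worth, the splitting logic is roughly equivalent to the sketch
below (the root path, the number of levels and the exact chunking are only an
illustration, not our exact layout):

    from pathlib import Path

    def shard_path(root: Path, filename: str, levels: int = 3) -> Path:
        """Spread files over nested directories keyed on the leading digits
        of the database id, two characters per level."""
        stem = Path(filename).stem                           # "1234567"
        chunks = [stem[i:i + 2] for i in range(0, len(stem), 2)]
        return root.joinpath(*chunks[:levels], filename)

    print(shard_path(Path("/GFSVolume1/Service1/documents"), "1234567.jpg"))
    # /GFSVolume1/Service1/documents/12/34/56/1234567.jpg

That keeps any single directory down to at most a few hundred files.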

> Is this strictly a GFS issue, or an issue with rsync? Have you set up a
> similar environment under ext3/4 to test just the rsync part? Rsync is
> known for being a memory & resource hog, particularly at the initial
> stage of building the filesystem tree.
>
> I would strongly recommend benchmarking rsync on ext3/4 before making the
> switch.
>
> One option would be to do several 'rsync' operations (serially, not in
> parallel!), each operating on a subset of the filesystem, while continuing
> to use gfs.

Yes, it is GFS-specific. Our backup server is on ext3, and the rsync there
completes in a couple of hours without eating CPU at all (only memory) and
without bringing the server to its knees.

Splitting the rsync over subdirectories might be an option, but that would be
more of a band-aid... I'll try to avoid that.
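
If we did go that route, it would be along the lines of what was suggested
above: one rsync per top-level shard directory, run serially so a single
invocation never has to build the whole file list. A rough sketch (the source
tree, destination and rsync options are made up for the example):

    import subprocess
    from pathlib import Path

    SRC = Path("/GFSVolume1/Service1/documents")   # tree on the GFS volume
    DEST = "backuphost:/backups/documents"         # rsync destination

    # One rsync per first-level subdirectory (12/, 34/, ...), run one after
    # the other, so each run only scans a small slice of the tree.
    for subdir in sorted(p for p in SRC.iterdir() if p.is_dir()):
        subprocess.run(
            ["rsync", "-a", "--delete",
             f"{subdir}/",                         # trailing slash: copy contents
             f"{DEST}/{subdir.name}/"],
            check=True,
        )

But as I said, I'd rather address the GFS side than restructure the backup job.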

> => <fs device="/dev/VGx/documentsA" force_unmount="1" fstype="ext4"
> => mountpoint="/GFSVolume1/Service1/documentsA" name="documentsA"
> => options="noatime,quota=off"/>
> =>
> => So, first, is this doable ?
>
> Yes.
>
> We have been doing something very similar for the past ~2 years, except
> not mounting the ext3/4 partition under a GFS mountpoint.

I will be doing some experiments with that...

>
> =>
> => Second, is this risky? In the sense that with force_unmount true, I
> => assume that no other node would mount that filesystem before it is
> => unmounted on the stopping service. I know that for some reason umount
> => could hang, but it's not likely since this data is mostly read-only.
> => In that case the service
>
> We've experienced numerous cases where the filesystem hangs after a
> service migration due to a node (or service) failover. These hangs all
> seem to be related to quota or NFS issues, so this may not be an issue
> in your environment.

While we do not use NFS on top of the 3 most important directories, it will
be used on some of those volumes...

> => would be failed and need to be manually restarted. What would be the
> => consequence if the filesystem happens to be mounted on 2 nodes?
>
> Most likely, filesystem corruption.

Other responses led me to believe that if I let the cluster manage the
filesystem, and never mount it myself, it's much less likely to happen.

Thanks 



