[Linux-cluster] clvmd without GFS?

Matt Mitchell mmitchell at virtualproperties.com
Wed Oct 27 14:50:39 UTC 2004


David Teigland wrote:
> On Tue, Oct 26, 2004 at 04:52:12PM -0500, Matt Mitchell wrote:
> 
>>Just seeking some opinions here...
>>
>>I have observed some really poor performance in GFS when dealing with 
>>large numbers of small files.  It seems to be designed to scale well 
>>with respect to throughput, at the apparent expense of metadata and 
>>directory operations, which are really slow.  For example, in a 
>>directory with 100,000 4k files (roughly) a simple 'ls -l' with lock_dlm 
>>took over three hours to complete on our test setup with no contention 
>>(and only one machine had the disk mounted at all).  (Using Debian 
>>packages dated 16 September 2004.)
> 
> 
> Lots of small files can certainly expose some of the performance
> limitations of gfs.  "Hours" sounds very odd, though, so I ran a couple
> sanity tests on my own test hardware.
> 
> One node mounted with lock_dlm, the directory has 100,000 4k files,
> running "time ls -l | wc -l".
> 
> - dual P3 700 MHz, 256 MB, some old FC disks in a JBOD
>   5 min 30 sec
> 
> - P4 2.4 GHz, 512 MB, iscsi to a netapp
>   2 min 30 sec
> 
> Having more nodes mounted didn't change this.  (Four nodes of the first
> kind all running this at the same time averaged about 17 minutes each.)

My initial setup used a dinky SCSI-IDE RAID box that happened to 
have two interfaces.  Now that we have the fibre channel hardware 
in-house, I am recreating the setup on it to get some performance 
numbers on that hardware.

It seems like there is a lot of contention for the directory inodes 
(which is probably unavoidable), and that would likely be helped by 
segregating the files into smaller subdirectories.  Implementation-wise, 
is there a magic number or formula to follow when sizing these 
directories?  Does the number of journals make a difference?
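
To make the subdirectory idea concrete, I am picturing something like 
the sketch below, which spreads the flat directory across 256 buckets 
keyed on a hash of each file name.  The bucket count, hash, and paths 
here are arbitrary examples; picking that bucket count sensibly is 
really what the sizing question above is about.

    #!/bin/sh
    # Sketch: redistribute a flat directory of small files into 256
    # subdirectories named after the first two hex digits of an md5sum
    # of each file name.  Paths and bucket count are placeholders.
    SRC=/mnt/gfs/smallfiles
    DST=/mnt/gfs/hashed
    for f in "$SRC"/*; do
        name=$(basename "$f")
        bucket=$(printf '%s' "$name" | md5sum | cut -c1-2)
        mkdir -p "$DST/$bucket"
        mv "$f" "$DST/$bucket/"
    done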

-m



