[Linux-cluster] Directories with >100K files

Jeff Sturm jeff.sturm at eprize.com
Wed Jan 21 03:32:01 UTC 2009


> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of 
> nick at javacat.f2s.com
> Sent: Tuesday, January 20, 2009 5:19 AM
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] Directories with >100K files
> 
> We have a GFS filesystem mounted over iSCSI. When doing an 
> 'ls' on directories with several thousand files it takes 
> around 10 minutes to get a response back -

You don't say how many nodes you have, or anything about your
networking.

Some general pointers:

- A plain "ls" is probably much faster than any variant that fetches inode
metadata, e.g. "ls -l".  The latter performs a stat() on each
individual file, which in turn triggers locking activity of some sort.
This is known to be slow on GFS1.  (I've heard reports that GFS2 is/will
be better.)

- You want a fast, reliable low-latency network for your cluster.  Intel
GigE cards and a fast switch are a good bet.

- Unless your application needs access times or quota support, mounting
with "noquota,noatime" is a good idea.  Maybe also "nodiratime".
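
To make the first point above concrete, here is a minimal Python sketch
of the two access patterns.  It is only an illustration, not a GFS
benchmark: the function names are made up for this example, and the
directory to test is whatever path you pass in.  list_names() only reads
directory entries, roughly what a plain "ls" needs, while
list_with_stat() also issues one lstat() per entry, roughly what "ls -l"
needs.

```python
import os
import time


def list_names(path):
    """Return entry names only; no per-file stat call is issued."""
    return sorted(os.listdir(path))


def list_with_stat(path):
    """Return (name, size) pairs; issues one lstat() per entry."""
    entries = []
    for name in sorted(os.listdir(path)):
        info = os.lstat(os.path.join(path, name))
        entries.append((name, info.st_size))
    return entries


def timed(func, path):
    """Run func(path) and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = func(path)
    return result, time.monotonic() - start


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    names, t_names = timed(list_names, target)
    stats, t_stats = timed(list_with_stat, target)
    print("names only: %d entries in %.2fs" % (len(names), t_names))
    print("with lstat: %d entries in %.2fs" % (len(stats), t_stats))
```

Running it against one of the large directories should show how much of
the delay comes from the per-file stat calls rather than from reading
the directory itself.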

> Can anyone recommend any GFS tunables to help us out here ?

You could try bumping demote_secs up from its default of 5 minutes.
That'll cause locks to be held longer so they may not need to be
reacquired so often.  It won't help with the initial directory listing,
but should help on subsequent invocations.

In your case, with "ls" taking 8 minutes to run, some of the locks
acquired early in the command's execution have already been demoted by
the time it completes.
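
As a back-of-the-envelope illustration of that last point, the sketch
below estimates what share of the locks would already be past the idle
limit when the command finishes.  It rests on simplifying assumptions
that are not from GFS itself: each file's lock is touched exactly once,
files are visited at a uniform rate, and a lock becomes eligible for
demotion exactly demote_secs after its last use.  The function name and
the 600-second value are hypothetical.

```python
def fraction_demotable(run_secs, demote_secs):
    """Fraction of locks idle longer than demote_secs when the run ends.

    Assumes each lock is touched once, at a uniform rate over the run.
    """
    if run_secs <= 0:
        raise ValueError("run_secs must be positive")
    idle_too_long = max(0.0, run_secs - demote_secs)
    return idle_too_long / run_secs


if __name__ == "__main__":
    run = 8 * 60  # the 8-minute "ls" from this thread
    for demote in (300, 600):  # default vs. a hypothetical raised value
        share = 100 * fraction_demotable(run, demote)
        print("demote_secs=%d: %.1f%% of locks past the idle limit"
              % (demote, share))
```

Under those assumptions, the default of 300 seconds leaves the locks
from the first 3 minutes of an 8-minute run (37.5%) eligible for
demotion before a second "ls" can reuse them, while any value above the
run time leaves none.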

> Should we set statfs_fast to 1 ?

Probably good to set this, regardless.

> What about glock_purge ?

Setting glock_purge helps limit CPU time consumed by gfs_scand when a
large number of unused glocks are present.  See
http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_glock_trimming.R4
for details.  This may make your system run better, but I'm not sure it's
going to help with listing your giant directories.
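
For reference, here is a rough sketch of how the tunables discussed
above could be applied with "gfs_tool settune".  The helper names are
invented for this example, and the values 600 (demote_secs) and 50
(glock_purge, a percentage) are placeholders to show the form of the
command, not recommendations.  By default the script only prints the
commands.

```python
import subprocess


def settune_cmd(mount_point, tunable, value):
    """Build the argv for: gfs_tool settune <mount_point> <tunable> <value>."""
    return ["gfs_tool", "settune", mount_point, tunable, str(value)]


# Illustrative values only: 600 and 50 are assumptions, not recommendations.
TUNABLES = [
    ("demote_secs", 600),
    ("statfs_fast", 1),
    ("glock_purge", 50),
]


def apply_tunables(mount_point, tunables=TUNABLES, dry_run=True):
    """Print each command; also run it when dry_run is False."""
    cmds = [settune_cmd(mount_point, name, value) for name, value in tunables]
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)
    return cmds


if __name__ == "__main__":
    apply_tunables("/apps")  # /apps is the mount point from the fstab entry
```

As far as I know, settune values do not survive a remount, so they
would need to be reapplied on each node after the filesystem is mounted.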

> Here is the fstab entry for the GFS filesystem:
> /dev/vggfs/lvol00       /apps                   gfs     
> _netdev         1 2

Try adding "noatime,noquota" to the options field here, i.e.
"_netdev,noatime,noquota".

Jeff




